Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add the ability to abort a skcipher walk.

  Algorithms:
   - Fix XTS to actually do the stealing.
   - Add library helpers for AES and DES for single-block users.
   - Add library helpers for SHA256.
   - Add new DES key verification helper.
   - Add surrounding bits for ESSIV generator.
   - Add accelerations for aegis128.
   - Add test vectors for lzo-rle.

  Drivers:
   - Add i.MX8MQ support to caam.
   - Add gcm/ccm/cfb/ofb aes support in inside-secure.
   - Add ofb/cfb aes support in media-tek.
   - Add HiSilicon ZIP accelerator support.

  Others:
   - Fix potential race condition in padata.
   - Use unbound workqueues in padata"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (311 commits)
  crypto: caam - Cast to long first before pointer conversion
  crypto: ccree - enable CTS support in AES-XTS
  crypto: inside-secure - Probe transform record cache RAM sizes
  crypto: inside-secure - Base RD fetchcount on actual RD FIFO size
  crypto: inside-secure - Base CD fetchcount on actual CD FIFO size
  crypto: inside-secure - Enable extended algorithms on newer HW
  crypto: inside-secure: Corrected configuration of EIP96_TOKEN_CTRL
  crypto: inside-secure - Add EIP97/EIP197 and endianness detection
  padata: remove cpu_index from the parallel_queue
  padata: unbind parallel jobs from specific CPUs
  padata: use separate workqueues for parallel and serial work
  padata, pcrypt: take CPU hotplug lock internally in padata_alloc_possible
  crypto: pcrypt - remove padata cpumask notifier
  padata: make padata_do_parallel find alternate callback CPU
  workqueue: require CPU hotplug read exclusion for apply_workqueue_attrs
  workqueue: unconfine alloc/apply/free_workqueue_attrs()
  padata: allocate workqueue internally
  arm64: dts: imx8mq: Add CAAM node
  random: Use wait_event_freezable() in add_hwgenerator_randomness()
  crypto: ux500 - Fix COMPILE_TEST warnings
  ...
This commit is contained in:
Linus Torvalds 2019-09-18 12:11:14 -07:00
commit 8b53c76533
297 changed files with 14692 additions and 17484 deletions

View File

@ -0,0 +1,50 @@
What: /sys/kernel/debug/hisi_zip/<bdf>/comp_core[01]/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of compression cores related debug registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/decomp_core[0-5]/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of decompression cores related debug registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Compression/decompression core debug registers read clear
control. 1 means enable register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/current_qm
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One ZIP controller has one PF and multiple VFs, each function
has a QM. Select the QM which below qm refers to.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/qm_regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
Available for PF and VF in host. VF in guest currently only
has one debug register.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/current_q
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
show its debug registers in above qm_regs.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: QM debug registers(qm_regs) read clear control. 1 means enable
register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.

View File

@ -1,4 +1,5 @@
.. SPDX-License-Identifier: GPL-2.0
Crypto Engine
=============

View File

@ -12,7 +12,7 @@ Optional properties:
which disables using this rng to automatically fill the kernel's
entropy pool.
N.B. currently 'reg' must be four bytes wide and aligned
N.B. currently 'reg' must be at least four bytes wide and 32-bit aligned
Example:

View File

@ -16,10 +16,12 @@ overall control of how tasks are to be run::
#include <linux/padata.h>
struct padata_instance *padata_alloc(struct workqueue_struct *wq,
struct padata_instance *padata_alloc(const char *name,
const struct cpumask *pcpumask,
const struct cpumask *cbcpumask);
'name' simply identifies the instance.
The pcpumask describes which processors will be used to execute work
submitted to this instance in parallel. The cbcpumask defines which
processors are allowed to be used as the serialization callback processor.
@ -128,8 +130,7 @@ in that CPU mask or about a not running instance.
Each task submitted to padata_do_parallel() will, in turn, be passed to
exactly one call to the above-mentioned parallel() function, on one CPU, so
true parallelism is achieved by submitting multiple tasks. Despite the
fact that the workqueue is used to make these calls, parallel() is run with
true parallelism is achieved by submitting multiple tasks. parallel() runs with
software interrupts disabled and thus cannot sleep. The parallel()
function gets the padata_priv structure pointer as its lone parameter;
information about the actual work to be done is probably obtained by using
@ -148,7 +149,7 @@ fact with a call to::
At some point in the future, padata_do_serial() will trigger a call to the
serial() function in the padata_priv structure. That call will happen on
the CPU requested in the initial call to padata_do_parallel(); it, too, is
done through the workqueue, but with local software interrupts disabled.
run with local software interrupts disabled.
Note that this call may be deferred for a while since the padata code takes
pains to ensure that tasks are completed in the order in which they were
submitted.
@ -159,5 +160,4 @@ when a padata instance is no longer needed::
void padata_free(struct padata_instance *pinst);
This function will busy-wait while any remaining tasks are completed, so it
might be best not to call it while there is work outstanding. Shutting
down the workqueue, if necessary, should be done separately.
might be best not to call it while there is work outstanding.

View File

@ -7350,6 +7350,17 @@ S: Supported
F: drivers/scsi/hisi_sas/
F: Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
HISILICON QM AND ZIP Controller DRIVER
M: Zhou Wang <wangzhou1@hisilicon.com>
L: linux-crypto@vger.kernel.org
S: Maintained
F: drivers/crypto/hisilicon/qm.c
F: drivers/crypto/hisilicon/qm.h
F: drivers/crypto/hisilicon/sgl.c
F: drivers/crypto/hisilicon/sgl.h
F: drivers/crypto/hisilicon/zip/
F: Documentation/ABI/testing/debugfs-hisi-zip
HMM - Heterogeneous Memory Management
M: Jérôme Glisse <jglisse@redhat.com>
L: linux-mm@kvack.org
@ -7703,7 +7714,7 @@ F: drivers/crypto/nx/nx-aes*
F: drivers/crypto/nx/nx-sha*
F: drivers/crypto/nx/nx.*
F: drivers/crypto/nx/nx_csbcpb.h
F: drivers/crypto/nx/nx_debugfs.h
F: drivers/crypto/nx/nx_debugfs.c
IBM Power Linux RAID adapter
M: Brian King <brking@us.ibm.com>

View File

@ -82,8 +82,8 @@ config CRYPTO_AES_ARM_BS
tristate "Bit sliced AES using NEON instructions"
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_LIB_AES
select CRYPTO_SIMD
select CRYPTO_AES
help
Use a faster and more secure NEON based implementation of AES in CBC,
CTR and XTS modes

View File

@ -44,63 +44,73 @@
veor q0, q0, \key3
.endm
.macro enc_dround_3x, key1, key2
.macro enc_dround_4x, key1, key2
enc_round q0, \key1
enc_round q1, \key1
enc_round q2, \key1
enc_round q3, \key1
enc_round q0, \key2
enc_round q1, \key2
enc_round q2, \key2
enc_round q3, \key2
.endm
.macro dec_dround_3x, key1, key2
.macro dec_dround_4x, key1, key2
dec_round q0, \key1
dec_round q1, \key1
dec_round q2, \key1
dec_round q3, \key1
dec_round q0, \key2
dec_round q1, \key2
dec_round q2, \key2
dec_round q3, \key2
.endm
.macro enc_fround_3x, key1, key2, key3
.macro enc_fround_4x, key1, key2, key3
enc_round q0, \key1
enc_round q1, \key1
enc_round q2, \key1
enc_round q3, \key1
aese.8 q0, \key2
aese.8 q1, \key2
aese.8 q2, \key2
aese.8 q3, \key2
veor q0, q0, \key3
veor q1, q1, \key3
veor q2, q2, \key3
veor q3, q3, \key3
.endm
.macro dec_fround_3x, key1, key2, key3
.macro dec_fround_4x, key1, key2, key3
dec_round q0, \key1
dec_round q1, \key1
dec_round q2, \key1
dec_round q3, \key1
aesd.8 q0, \key2
aesd.8 q1, \key2
aesd.8 q2, \key2
aesd.8 q3, \key2
veor q0, q0, \key3
veor q1, q1, \key3
veor q2, q2, \key3
veor q3, q3, \key3
.endm
.macro do_block, dround, fround
cmp r3, #12 @ which key size?
vld1.8 {q10-q11}, [ip]!
vld1.32 {q10-q11}, [ip]!
\dround q8, q9
vld1.8 {q12-q13}, [ip]!
vld1.32 {q12-q13}, [ip]!
\dround q10, q11
vld1.8 {q10-q11}, [ip]!
vld1.32 {q10-q11}, [ip]!
\dround q12, q13
vld1.8 {q12-q13}, [ip]!
vld1.32 {q12-q13}, [ip]!
\dround q10, q11
blo 0f @ AES-128: 10 rounds
vld1.8 {q10-q11}, [ip]!
vld1.32 {q10-q11}, [ip]!
\dround q12, q13
beq 1f @ AES-192: 12 rounds
vld1.8 {q12-q13}, [ip]
vld1.32 {q12-q13}, [ip]
\dround q10, q11
0: \fround q12, q13, q14
bx lr
@ -114,8 +124,9 @@
* transforms. These should preserve all registers except q0 - q2 and ip
* Arguments:
* q0 : first in/output block
* q1 : second in/output block (_3x version only)
* q2 : third in/output block (_3x version only)
* q1 : second in/output block (_4x version only)
* q2 : third in/output block (_4x version only)
* q3 : fourth in/output block (_4x version only)
* q8 : first round key
* q9 : secound round key
* q14 : final round key
@ -136,44 +147,44 @@ aes_decrypt:
ENDPROC(aes_decrypt)
.align 6
aes_encrypt_3x:
aes_encrypt_4x:
add ip, r2, #32 @ 3rd round key
do_block enc_dround_3x, enc_fround_3x
ENDPROC(aes_encrypt_3x)
do_block enc_dround_4x, enc_fround_4x
ENDPROC(aes_encrypt_4x)
.align 6
aes_decrypt_3x:
aes_decrypt_4x:
add ip, r2, #32 @ 3rd round key
do_block dec_dround_3x, dec_fround_3x
ENDPROC(aes_decrypt_3x)
do_block dec_dround_4x, dec_fround_4x
ENDPROC(aes_decrypt_4x)
.macro prepare_key, rk, rounds
add ip, \rk, \rounds, lsl #4
vld1.8 {q8-q9}, [\rk] @ load first 2 round keys
vld1.8 {q14}, [ip] @ load last round key
vld1.32 {q8-q9}, [\rk] @ load first 2 round keys
vld1.32 {q14}, [ip] @ load last round key
.endm
/*
* aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds,
* int blocks)
* aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* aes_ecb_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds,
* int blocks)
*/
ENTRY(ce_aes_ecb_encrypt)
push {r4, lr}
ldr r4, [sp, #8]
prepare_key r2, r3
.Lecbencloop3x:
subs r4, r4, #3
.Lecbencloop4x:
subs r4, r4, #4
bmi .Lecbenc1x
vld1.8 {q0-q1}, [r1]!
vld1.8 {q2}, [r1]!
bl aes_encrypt_3x
vld1.8 {q2-q3}, [r1]!
bl aes_encrypt_4x
vst1.8 {q0-q1}, [r0]!
vst1.8 {q2}, [r0]!
b .Lecbencloop3x
vst1.8 {q2-q3}, [r0]!
b .Lecbencloop4x
.Lecbenc1x:
adds r4, r4, #3
adds r4, r4, #4
beq .Lecbencout
.Lecbencloop:
vld1.8 {q0}, [r1]!
@ -189,17 +200,17 @@ ENTRY(ce_aes_ecb_decrypt)
push {r4, lr}
ldr r4, [sp, #8]
prepare_key r2, r3
.Lecbdecloop3x:
subs r4, r4, #3
.Lecbdecloop4x:
subs r4, r4, #4
bmi .Lecbdec1x
vld1.8 {q0-q1}, [r1]!
vld1.8 {q2}, [r1]!
bl aes_decrypt_3x
vld1.8 {q2-q3}, [r1]!
bl aes_decrypt_4x
vst1.8 {q0-q1}, [r0]!
vst1.8 {q2}, [r0]!
b .Lecbdecloop3x
vst1.8 {q2-q3}, [r0]!
b .Lecbdecloop4x
.Lecbdec1x:
adds r4, r4, #3
adds r4, r4, #4
beq .Lecbdecout
.Lecbdecloop:
vld1.8 {q0}, [r1]!
@ -212,9 +223,9 @@ ENTRY(ce_aes_ecb_decrypt)
ENDPROC(ce_aes_ecb_decrypt)
/*
* aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds,
* int blocks, u8 iv[])
* aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* aes_cbc_decrypt(u8 out[], u8 const in[], u32 const rk[], int rounds,
* int blocks, u8 iv[])
*/
ENTRY(ce_aes_cbc_encrypt)
@ -236,88 +247,181 @@ ENDPROC(ce_aes_cbc_encrypt)
ENTRY(ce_aes_cbc_decrypt)
push {r4-r6, lr}
ldrd r4, r5, [sp, #16]
vld1.8 {q6}, [r5] @ keep iv in q6
vld1.8 {q15}, [r5] @ keep iv in q15
prepare_key r2, r3
.Lcbcdecloop3x:
subs r4, r4, #3
.Lcbcdecloop4x:
subs r4, r4, #4
bmi .Lcbcdec1x
vld1.8 {q0-q1}, [r1]!
vld1.8 {q2}, [r1]!
vmov q3, q0
vmov q4, q1
vmov q5, q2
bl aes_decrypt_3x
veor q0, q0, q6
veor q1, q1, q3
veor q2, q2, q4
vmov q6, q5
vld1.8 {q2-q3}, [r1]!
vmov q4, q0
vmov q5, q1
vmov q6, q2
vmov q7, q3
bl aes_decrypt_4x
veor q0, q0, q15
veor q1, q1, q4
veor q2, q2, q5
veor q3, q3, q6
vmov q15, q7
vst1.8 {q0-q1}, [r0]!
vst1.8 {q2}, [r0]!
b .Lcbcdecloop3x
vst1.8 {q2-q3}, [r0]!
b .Lcbcdecloop4x
.Lcbcdec1x:
adds r4, r4, #3
adds r4, r4, #4
beq .Lcbcdecout
vmov q15, q14 @ preserve last round key
vmov q6, q14 @ preserve last round key
.Lcbcdecloop:
vld1.8 {q0}, [r1]! @ get next ct block
veor q14, q15, q6 @ combine prev ct with last key
vmov q6, q0
vmov q15, q0
bl aes_decrypt
vst1.8 {q0}, [r0]!
subs r4, r4, #1
bne .Lcbcdecloop
.Lcbcdecout:
vst1.8 {q6}, [r5] @ keep iv in q6
vst1.8 {q15}, [r5] @ keep iv in q15
pop {r4-r6, pc}
ENDPROC(ce_aes_cbc_decrypt)
/*
* aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* ce_aes_cbc_cts_encrypt(u8 out[], u8 const in[], u32 const rk[],
* int rounds, int bytes, u8 const iv[])
* ce_aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[],
* int rounds, int bytes, u8 const iv[])
*/
ENTRY(ce_aes_cbc_cts_encrypt)
push {r4-r6, lr}
ldrd r4, r5, [sp, #16]
movw ip, :lower16:.Lcts_permute_table
movt ip, :upper16:.Lcts_permute_table
sub r4, r4, #16
add lr, ip, #32
add ip, ip, r4
sub lr, lr, r4
vld1.8 {q5}, [ip]
vld1.8 {q6}, [lr]
add ip, r1, r4
vld1.8 {q0}, [r1] @ overlapping loads
vld1.8 {q3}, [ip]
vld1.8 {q1}, [r5] @ get iv
prepare_key r2, r3
veor q0, q0, q1 @ xor with iv
bl aes_encrypt
vtbl.8 d4, {d0-d1}, d10
vtbl.8 d5, {d0-d1}, d11
vtbl.8 d2, {d6-d7}, d12
vtbl.8 d3, {d6-d7}, d13
veor q0, q0, q1
bl aes_encrypt
add r4, r0, r4
vst1.8 {q2}, [r4] @ overlapping stores
vst1.8 {q0}, [r0]
pop {r4-r6, pc}
ENDPROC(ce_aes_cbc_cts_encrypt)
ENTRY(ce_aes_cbc_cts_decrypt)
push {r4-r6, lr}
ldrd r4, r5, [sp, #16]
movw ip, :lower16:.Lcts_permute_table
movt ip, :upper16:.Lcts_permute_table
sub r4, r4, #16
add lr, ip, #32
add ip, ip, r4
sub lr, lr, r4
vld1.8 {q5}, [ip]
vld1.8 {q6}, [lr]
add ip, r1, r4
vld1.8 {q0}, [r1] @ overlapping loads
vld1.8 {q1}, [ip]
vld1.8 {q3}, [r5] @ get iv
prepare_key r2, r3
bl aes_decrypt
vtbl.8 d4, {d0-d1}, d10
vtbl.8 d5, {d0-d1}, d11
vtbx.8 d0, {d2-d3}, d12
vtbx.8 d1, {d2-d3}, d13
veor q1, q1, q2
bl aes_decrypt
veor q0, q0, q3 @ xor with iv
add r4, r0, r4
vst1.8 {q1}, [r4] @ overlapping stores
vst1.8 {q0}, [r0]
pop {r4-r6, pc}
ENDPROC(ce_aes_cbc_cts_decrypt)
/*
* aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], int rounds,
* int blocks, u8 ctr[])
*/
ENTRY(ce_aes_ctr_encrypt)
push {r4-r6, lr}
ldrd r4, r5, [sp, #16]
vld1.8 {q6}, [r5] @ load ctr
vld1.8 {q7}, [r5] @ load ctr
prepare_key r2, r3
vmov r6, s27 @ keep swabbed ctr in r6
vmov r6, s31 @ keep swabbed ctr in r6
rev r6, r6
cmn r6, r4 @ 32 bit overflow?
bcs .Lctrloop
.Lctrloop3x:
subs r4, r4, #3
.Lctrloop4x:
subs r4, r4, #4
bmi .Lctr1x
add r6, r6, #1
vmov q0, q6
vmov q1, q6
vmov q0, q7
vmov q1, q7
rev ip, r6
add r6, r6, #1
vmov q2, q6
vmov q2, q7
vmov s7, ip
rev ip, r6
add r6, r6, #1
vmov q3, q7
vmov s11, ip
vld1.8 {q3-q4}, [r1]!
vld1.8 {q5}, [r1]!
bl aes_encrypt_3x
veor q0, q0, q3
veor q1, q1, q4
veor q2, q2, q5
rev ip, r6
add r6, r6, #1
vmov s15, ip
vld1.8 {q4-q5}, [r1]!
vld1.8 {q6}, [r1]!
vld1.8 {q15}, [r1]!
bl aes_encrypt_4x
veor q0, q0, q4
veor q1, q1, q5
veor q2, q2, q6
veor q3, q3, q15
rev ip, r6
vst1.8 {q0-q1}, [r0]!
vst1.8 {q2}, [r0]!
vmov s27, ip
b .Lctrloop3x
vst1.8 {q2-q3}, [r0]!
vmov s31, ip
b .Lctrloop4x
.Lctr1x:
adds r4, r4, #3
adds r4, r4, #4
beq .Lctrout
.Lctrloop:
vmov q0, q6
vmov q0, q7
bl aes_encrypt
adds r6, r6, #1 @ increment BE ctr
rev ip, r6
vmov s27, ip
vmov s31, ip
bcs .Lctrcarry
.Lctrcarrydone:
@ -329,7 +433,7 @@ ENTRY(ce_aes_ctr_encrypt)
bne .Lctrloop
.Lctrout:
vst1.8 {q6}, [r5] @ return next CTR value
vst1.8 {q7}, [r5] @ return next CTR value
pop {r4-r6, pc}
.Lctrtailblock:
@ -337,7 +441,7 @@ ENTRY(ce_aes_ctr_encrypt)
b .Lctrout
.Lctrcarry:
.irp sreg, s26, s25, s24
.irp sreg, s30, s29, s28
vmov ip, \sreg @ load next word of ctr
rev ip, ip @ ... to handle the carry
adds ip, ip, #1
@ -349,10 +453,10 @@ ENTRY(ce_aes_ctr_encrypt)
ENDPROC(ce_aes_ctr_encrypt)
/*
* aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
* int blocks, u8 iv[], u8 const rk2[], int first)
* aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
* int blocks, u8 iv[], u8 const rk2[], int first)
* aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds,
* int bytes, u8 iv[], u32 const rk2[], int first)
* aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[], int rounds,
* int bytes, u8 iv[], u32 const rk2[], int first)
*/
.macro next_tweak, out, in, const, tmp
@ -363,13 +467,10 @@ ENDPROC(ce_aes_ctr_encrypt)
veor \out, \out, \tmp
.endm
.align 3
.Lxts_mul_x:
.quad 1, 0x87
ce_aes_xts_init:
vldr d14, .Lxts_mul_x
vldr d15, .Lxts_mul_x + 8
vmov.i32 d30, #0x87 @ compose tweak mask vector
vmovl.u32 q15, d30
vshr.u64 d30, d31, #7
ldrd r4, r5, [sp, #16] @ load args
ldr r6, [sp, #28]
@ -390,49 +491,86 @@ ENTRY(ce_aes_xts_encrypt)
bl ce_aes_xts_init @ run shared prologue
prepare_key r2, r3
vmov q3, q0
vmov q4, q0
teq r6, #0 @ start of a block?
bne .Lxtsenc3x
bne .Lxtsenc4x
.Lxtsencloop3x:
next_tweak q3, q3, q7, q6
.Lxtsenc3x:
subs r4, r4, #3
.Lxtsencloop4x:
next_tweak q4, q4, q15, q10
.Lxtsenc4x:
subs r4, r4, #64
bmi .Lxtsenc1x
vld1.8 {q0-q1}, [r1]! @ get 3 pt blocks
vld1.8 {q2}, [r1]!
next_tweak q4, q3, q7, q6
veor q0, q0, q3
next_tweak q5, q4, q7, q6
veor q1, q1, q4
veor q2, q2, q5
bl aes_encrypt_3x
veor q0, q0, q3
veor q1, q1, q4
veor q2, q2, q5
vst1.8 {q0-q1}, [r0]! @ write 3 ct blocks
vst1.8 {q2}, [r0]!
vmov q3, q5
vld1.8 {q0-q1}, [r1]! @ get 4 pt blocks
vld1.8 {q2-q3}, [r1]!
next_tweak q5, q4, q15, q10
veor q0, q0, q4
next_tweak q6, q5, q15, q10
veor q1, q1, q5
next_tweak q7, q6, q15, q10
veor q2, q2, q6
veor q3, q3, q7
bl aes_encrypt_4x
veor q0, q0, q4
veor q1, q1, q5
veor q2, q2, q6
veor q3, q3, q7
vst1.8 {q0-q1}, [r0]! @ write 4 ct blocks
vst1.8 {q2-q3}, [r0]!
vmov q4, q7
teq r4, #0
beq .Lxtsencout
b .Lxtsencloop3x
beq .Lxtsencret
b .Lxtsencloop4x
.Lxtsenc1x:
adds r4, r4, #3
adds r4, r4, #64
beq .Lxtsencout
subs r4, r4, #16
bmi .LxtsencctsNx
.Lxtsencloop:
vld1.8 {q0}, [r1]!
veor q0, q0, q3
.Lxtsencctsout:
veor q0, q0, q4
bl aes_encrypt
veor q0, q0, q3
vst1.8 {q0}, [r0]!
subs r4, r4, #1
veor q0, q0, q4
teq r4, #0
beq .Lxtsencout
next_tweak q3, q3, q7, q6
subs r4, r4, #16
next_tweak q4, q4, q15, q6
bmi .Lxtsenccts
vst1.8 {q0}, [r0]!
b .Lxtsencloop
.Lxtsencout:
vst1.8 {q3}, [r5]
vst1.8 {q0}, [r0]
.Lxtsencret:
vst1.8 {q4}, [r5]
pop {r4-r6, pc}
.LxtsencctsNx:
vmov q0, q3
sub r0, r0, #16
.Lxtsenccts:
movw ip, :lower16:.Lcts_permute_table
movt ip, :upper16:.Lcts_permute_table
add r1, r1, r4 @ rewind input pointer
add r4, r4, #16 @ # bytes in final block
add lr, ip, #32
add ip, ip, r4
sub lr, lr, r4
add r4, r0, r4 @ output address of final block
vld1.8 {q1}, [r1] @ load final partial block
vld1.8 {q2}, [ip]
vld1.8 {q3}, [lr]
vtbl.8 d4, {d0-d1}, d4
vtbl.8 d5, {d0-d1}, d5
vtbx.8 d0, {d2-d3}, d6
vtbx.8 d1, {d2-d3}, d7
vst1.8 {q2}, [r4] @ overlapping stores
mov r4, #0
b .Lxtsencctsout
ENDPROC(ce_aes_xts_encrypt)
@ -441,50 +579,90 @@ ENTRY(ce_aes_xts_decrypt)
bl ce_aes_xts_init @ run shared prologue
prepare_key r2, r3
vmov q3, q0
vmov q4, q0
/* subtract 16 bytes if we are doing CTS */
tst r4, #0xf
subne r4, r4, #0x10
teq r6, #0 @ start of a block?
bne .Lxtsdec3x
bne .Lxtsdec4x
.Lxtsdecloop3x:
next_tweak q3, q3, q7, q6
.Lxtsdec3x:
subs r4, r4, #3
.Lxtsdecloop4x:
next_tweak q4, q4, q15, q10
.Lxtsdec4x:
subs r4, r4, #64
bmi .Lxtsdec1x
vld1.8 {q0-q1}, [r1]! @ get 3 ct blocks
vld1.8 {q2}, [r1]!
next_tweak q4, q3, q7, q6
veor q0, q0, q3
next_tweak q5, q4, q7, q6
veor q1, q1, q4
veor q2, q2, q5
bl aes_decrypt_3x
veor q0, q0, q3
veor q1, q1, q4
veor q2, q2, q5
vst1.8 {q0-q1}, [r0]! @ write 3 pt blocks
vst1.8 {q2}, [r0]!
vmov q3, q5
vld1.8 {q0-q1}, [r1]! @ get 4 ct blocks
vld1.8 {q2-q3}, [r1]!
next_tweak q5, q4, q15, q10
veor q0, q0, q4
next_tweak q6, q5, q15, q10
veor q1, q1, q5
next_tweak q7, q6, q15, q10
veor q2, q2, q6
veor q3, q3, q7
bl aes_decrypt_4x
veor q0, q0, q4
veor q1, q1, q5
veor q2, q2, q6
veor q3, q3, q7
vst1.8 {q0-q1}, [r0]! @ write 4 pt blocks
vst1.8 {q2-q3}, [r0]!
vmov q4, q7
teq r4, #0
beq .Lxtsdecout
b .Lxtsdecloop3x
b .Lxtsdecloop4x
.Lxtsdec1x:
adds r4, r4, #3
adds r4, r4, #64
beq .Lxtsdecout
subs r4, r4, #16
.Lxtsdecloop:
vld1.8 {q0}, [r1]!
veor q0, q0, q3
add ip, r2, #32 @ 3rd round key
bmi .Lxtsdeccts
.Lxtsdecctsout:
veor q0, q0, q4
bl aes_decrypt
veor q0, q0, q3
veor q0, q0, q4
vst1.8 {q0}, [r0]!
subs r4, r4, #1
teq r4, #0
beq .Lxtsdecout
next_tweak q3, q3, q7, q6
subs r4, r4, #16
next_tweak q4, q4, q15, q6
b .Lxtsdecloop
.Lxtsdecout:
vst1.8 {q3}, [r5]
vst1.8 {q4}, [r5]
pop {r4-r6, pc}
.Lxtsdeccts:
movw ip, :lower16:.Lcts_permute_table
movt ip, :upper16:.Lcts_permute_table
add r1, r1, r4 @ rewind input pointer
add r4, r4, #16 @ # bytes in final block
add lr, ip, #32
add ip, ip, r4
sub lr, lr, r4
add r4, r0, r4 @ output address of final block
next_tweak q5, q4, q15, q6
vld1.8 {q1}, [r1] @ load final partial block
vld1.8 {q2}, [ip]
vld1.8 {q3}, [lr]
veor q0, q0, q5
bl aes_decrypt
veor q0, q0, q5
vtbl.8 d4, {d0-d1}, d4
vtbl.8 d5, {d0-d1}, d5
vtbx.8 d0, {d2-d3}, d6
vtbx.8 d1, {d2-d3}, d7
vst1.8 {q2}, [r4] @ overlapping stores
mov r4, #0
b .Lxtsdecctsout
ENDPROC(ce_aes_xts_decrypt)
/*
@ -505,8 +683,18 @@ ENDPROC(ce_aes_sub)
* operation on round key *src
*/
ENTRY(ce_aes_invert)
vld1.8 {q0}, [r1]
vld1.32 {q0}, [r1]
aesimc.8 q0, q0
vst1.8 {q0}, [r0]
vst1.32 {q0}, [r0]
bx lr
ENDPROC(ce_aes_invert)
.section ".rodata", "a"
.align 6
.Lcts_permute_table:
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7
.byte 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff

View File

@ -7,9 +7,13 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/unaligned.h>
#include <crypto/aes.h>
#include <crypto/ctr.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/cpufeature.h>
#include <linux/module.h>
#include <crypto/xts.h>
@ -22,25 +26,29 @@ MODULE_LICENSE("GPL v2");
asmlinkage u32 ce_aes_sub(u32 input);
asmlinkage void ce_aes_invert(void *dst, void *src);
asmlinkage void ce_aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],
asmlinkage void ce_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks);
asmlinkage void ce_aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[],
asmlinkage void ce_aes_ecb_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks);
asmlinkage void ce_aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[],
asmlinkage void ce_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void ce_aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
asmlinkage void ce_aes_cbc_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void ce_aes_cbc_cts_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 const iv[]);
asmlinkage void ce_aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 const iv[]);
asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 ctr[]);
asmlinkage void ce_aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[],
int rounds, int blocks, u8 iv[],
u8 const rk2[], int first);
asmlinkage void ce_aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[],
int rounds, int blocks, u8 iv[],
u8 const rk2[], int first);
asmlinkage void ce_aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int bytes, u8 iv[],
u32 const rk2[], int first);
asmlinkage void ce_aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int bytes, u8 iv[],
u32 const rk2[], int first);
struct aes_block {
u8 b[AES_BLOCK_SIZE];
@ -77,21 +85,17 @@ static int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
key_len != AES_KEYSIZE_256)
return -EINVAL;
memcpy(ctx->key_enc, in_key, key_len);
ctx->key_length = key_len;
for (i = 0; i < kwords; i++)
ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
kernel_neon_begin();
for (i = 0; i < sizeof(rcon); i++) {
u32 *rki = ctx->key_enc + (i * kwords);
u32 *rko = rki + kwords;
#ifndef CONFIG_CPU_BIG_ENDIAN
rko[0] = ror32(ce_aes_sub(rki[kwords - 1]), 8);
rko[0] = rko[0] ^ rki[0] ^ rcon[i];
#else
rko[0] = rol32(ce_aes_sub(rki[kwords - 1]), 8);
rko[0] = rko[0] ^ rki[0] ^ (rcon[i] << 24);
#endif
rko[1] = rko[0] ^ rki[1];
rko[2] = rko[1] ^ rki[2];
rko[3] = rko[2] ^ rki[3];
@ -178,15 +182,15 @@ static int ecb_encrypt(struct skcipher_request *req)
unsigned int blocks;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
ce_aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_enc, num_rounds(ctx), blocks);
ctx->key_enc, num_rounds(ctx), blocks);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
@ -198,58 +202,192 @@ static int ecb_decrypt(struct skcipher_request *req)
unsigned int blocks;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
ce_aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_dec, num_rounds(ctx), blocks);
ctx->key_dec, num_rounds(ctx), blocks);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static int cbc_encrypt_walk(struct skcipher_request *req,
struct skcipher_walk *walk)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
unsigned int blocks;
int err = 0;
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
ce_aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_enc, num_rounds(ctx), blocks,
walk->iv);
kernel_neon_end();
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
}
static int cbc_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
unsigned int blocks;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
return cbc_encrypt_walk(req, &walk);
}
kernel_neon_begin();
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
ce_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_enc, num_rounds(ctx), blocks,
walk.iv);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
static int cbc_decrypt_walk(struct skcipher_request *req,
struct skcipher_walk *walk)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
unsigned int blocks;
int err = 0;
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
ce_aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_dec, num_rounds(ctx), blocks,
walk->iv);
kernel_neon_end();
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static int cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
unsigned int blocks;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
return cbc_decrypt_walk(req, &walk);
}
static int cts_cbc_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
struct scatterlist *src = req->src, *dst = req->dst;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct skcipher_walk walk;
int err;
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq, skcipher_request_flags(req),
NULL, NULL);
if (req->cryptlen <= AES_BLOCK_SIZE) {
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
cbc_blocks = 1;
}
if (cbc_blocks > 0) {
skcipher_request_set_crypt(&subreq, req->src, req->dst,
cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false) ?:
cbc_encrypt_walk(&subreq, &walk);
if (err)
return err;
if (req->cryptlen == AES_BLOCK_SIZE)
return 0;
dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst,
subreq.cryptlen);
}
/* handle ciphertext stealing */
skcipher_request_set_crypt(&subreq, src, dst,
req->cryptlen - cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_neon_begin();
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
ce_aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_dec, num_rounds(ctx), blocks,
walk.iv);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
ce_aes_cbc_cts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, num_rounds(ctx), walk.nbytes,
walk.iv);
kernel_neon_end();
return err;
return skcipher_walk_done(&walk, 0);
}
static int cts_cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
struct scatterlist *src = req->src, *dst = req->dst;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct skcipher_walk walk;
int err;
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq, skcipher_request_flags(req),
NULL, NULL);
if (req->cryptlen <= AES_BLOCK_SIZE) {
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
cbc_blocks = 1;
}
if (cbc_blocks > 0) {
skcipher_request_set_crypt(&subreq, req->src, req->dst,
cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false) ?:
cbc_decrypt_walk(&subreq, &walk);
if (err)
return err;
if (req->cryptlen == AES_BLOCK_SIZE)
return 0;
dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst,
subreq.cryptlen);
}
/* handle ciphertext stealing */
skcipher_request_set_crypt(&subreq, src, dst,
req->cryptlen - cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_neon_begin();
ce_aes_cbc_cts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, num_rounds(ctx), walk.nbytes,
walk.iv);
kernel_neon_end();
return skcipher_walk_done(&walk, 0);
}
static int ctr_encrypt(struct skcipher_request *req)
@ -259,13 +397,14 @@ static int ctr_encrypt(struct skcipher_request *req)
struct skcipher_walk walk;
int err, blocks;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
ce_aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_enc, num_rounds(ctx), blocks,
ctx->key_enc, num_rounds(ctx), blocks,
walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
if (walk.nbytes) {
@ -279,36 +418,109 @@ static int ctr_encrypt(struct skcipher_request *req)
*/
blocks = -1;
ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc,
num_rounds(ctx), blocks, walk.iv);
kernel_neon_begin();
ce_aes_ctr_encrypt(tail, NULL, ctx->key_enc, num_rounds(ctx),
blocks, walk.iv);
kernel_neon_end();
crypto_xor_cpy(tdst, tsrc, tail, nbytes);
err = skcipher_walk_done(&walk, 0);
}
kernel_neon_end();
return err;
}
static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
unsigned long flags;
/*
* Temporarily disable interrupts to avoid races where
* cachelines are evicted when the CPU is interrupted
* to do something else.
*/
local_irq_save(flags);
aes_encrypt(ctx, dst, src);
local_irq_restore(flags);
}
static int ctr_encrypt_sync(struct skcipher_request *req)
{
if (!crypto_simd_usable())
return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
return ctr_encrypt(req);
}
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = num_rounds(&ctx->key1);
int tail = req->cryptlen % AES_BLOCK_SIZE;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct scatterlist *src, *dst;
struct skcipher_walk walk;
unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true);
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
err = skcipher_walk_virt(&walk, req, false);
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
int xts_blocks = DIV_ROUND_UP(req->cryptlen,
AES_BLOCK_SIZE) - 2;
skcipher_walk_abort(&walk);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
xts_blocks * AES_BLOCK_SIZE,
req->iv);
req = &subreq;
err = skcipher_walk_virt(&walk, req, false);
} else {
tail = 0;
}
for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) {
int nbytes = walk.nbytes;
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, nbytes, walk.iv,
ctx->key2.key_enc, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
if (err || likely(!tail))
return err;
dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen);
skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail,
req->iv);
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
kernel_neon_begin();
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key1.key_enc, rounds, blocks,
walk.iv, (u8 *)ctx->key2.key_enc, first);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, walk.nbytes, walk.iv,
ctx->key2.key_enc, first);
kernel_neon_end();
return err;
return skcipher_walk_done(&walk, 0);
}
static int xts_decrypt(struct skcipher_request *req)
@ -316,87 +528,165 @@ static int xts_decrypt(struct skcipher_request *req)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = num_rounds(&ctx->key1);
int tail = req->cryptlen % AES_BLOCK_SIZE;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct scatterlist *src, *dst;
struct skcipher_walk walk;
unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true);
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
err = skcipher_walk_virt(&walk, req, false);
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
int xts_blocks = DIV_ROUND_UP(req->cryptlen,
AES_BLOCK_SIZE) - 2;
skcipher_walk_abort(&walk);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
xts_blocks * AES_BLOCK_SIZE,
req->iv);
req = &subreq;
err = skcipher_walk_virt(&walk, req, false);
} else {
tail = 0;
}
for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) {
int nbytes = walk.nbytes;
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, nbytes, walk.iv,
ctx->key2.key_enc, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
if (err || likely(!tail))
return err;
dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen);
skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail,
req->iv);
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
kernel_neon_begin();
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key1.key_dec, rounds, blocks,
walk.iv, (u8 *)ctx->key2.key_enc, first);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, walk.nbytes, walk.iv,
ctx->key2.key_enc, first);
kernel_neon_end();
return err;
return skcipher_walk_done(&walk, 0);
}
static struct skcipher_alg aes_algs[] = { {
.base = {
.cra_name = "__ecb(aes)",
.cra_driver_name = "__ecb-aes-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.setkey = ce_aes_setkey,
.encrypt = ecb_encrypt,
.decrypt = ecb_decrypt,
.base.cra_name = "__ecb(aes)",
.base.cra_driver_name = "__ecb-aes-ce",
.base.cra_priority = 300,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.setkey = ce_aes_setkey,
.encrypt = ecb_encrypt,
.decrypt = ecb_decrypt,
}, {
.base = {
.cra_name = "__cbc(aes)",
.cra_driver_name = "__cbc-aes-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = ce_aes_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
.base.cra_name = "__cbc(aes)",
.base.cra_driver_name = "__cbc-aes-ce",
.base.cra_priority = 300,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = ce_aes_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base = {
.cra_name = "__ctr(aes)",
.cra_driver_name = "__ctr-aes-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.chunksize = AES_BLOCK_SIZE,
.setkey = ce_aes_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
.base.cra_name = "__cts(cbc(aes))",
.base.cra_driver_name = "__cts-cbc-aes-ce",
.base.cra_priority = 300,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = ce_aes_setkey,
.encrypt = cts_cbc_encrypt,
.decrypt = cts_cbc_decrypt,
}, {
.base = {
.cra_name = "__xts(aes)",
.cra_driver_name = "__xts-aes-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_xts_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = xts_set_key,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
.base.cra_name = "__ctr(aes)",
.base.cra_driver_name = "__ctr-aes-ce",
.base.cra_priority = 300,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.chunksize = AES_BLOCK_SIZE,
.setkey = ce_aes_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {
.base.cra_name = "ctr(aes)",
.base.cra_driver_name = "ctr-aes-ce-sync",
.base.cra_priority = 300 - 1,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.chunksize = AES_BLOCK_SIZE,
.setkey = ce_aes_setkey,
.encrypt = ctr_encrypt_sync,
.decrypt = ctr_encrypt_sync,
}, {
.base.cra_name = "__xts(aes)",
.base.cra_driver_name = "__xts-aes-ce",
.base.cra_priority = 300,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct crypto_aes_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = xts_set_key,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
} };
static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)];
@ -425,6 +715,9 @@ static int __init aes_init(void)
return err;
for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
continue;
algname = aes_algs[i].base.cra_name + 2;
drvname = aes_algs[i].base.cra_driver_name + 2;
basename = aes_algs[i].base.cra_driver_name;

View File

@ -219,43 +219,5 @@ ENDPROC(__aes_arm_encrypt)
.align 5
ENTRY(__aes_arm_decrypt)
do_crypt iround, crypto_it_tab, __aes_arm_inverse_sbox, 0
do_crypt iround, crypto_it_tab, crypto_aes_inv_sbox, 0
ENDPROC(__aes_arm_decrypt)
.section ".rodata", "a"
.align L1_CACHE_SHIFT
.type __aes_arm_inverse_sbox, %object
__aes_arm_inverse_sbox:
.byte 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
.byte 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
.byte 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
.byte 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
.byte 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
.byte 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
.byte 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
.byte 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
.byte 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
.byte 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
.byte 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
.byte 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
.byte 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
.byte 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
.byte 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
.byte 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
.byte 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
.byte 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
.byte 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
.byte 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
.byte 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
.byte 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
.byte 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
.byte 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
.byte 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
.byte 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
.byte 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
.byte 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
.byte 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
.byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
.byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
.byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
.size __aes_arm_inverse_sbox, . - __aes_arm_inverse_sbox

View File

@ -11,12 +11,9 @@
#include <linux/module.h>
asmlinkage void __aes_arm_encrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
EXPORT_SYMBOL(__aes_arm_encrypt);
asmlinkage void __aes_arm_decrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
EXPORT_SYMBOL(__aes_arm_decrypt);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void aes_arm_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
@ -24,7 +21,7 @@ static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
__aes_arm_encrypt(ctx->key_enc, rounds, in, out);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void aes_arm_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
@ -44,8 +41,8 @@ static struct crypto_alg aes_alg = {
.cra_cipher.cia_min_keysize = AES_MIN_KEY_SIZE,
.cra_cipher.cia_max_keysize = AES_MAX_KEY_SIZE,
.cra_cipher.cia_setkey = crypto_aes_set_key,
.cra_cipher.cia_encrypt = aes_encrypt,
.cra_cipher.cia_decrypt = aes_decrypt,
.cra_cipher.cia_encrypt = aes_arm_encrypt,
.cra_cipher.cia_decrypt = aes_arm_decrypt,
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
.cra_alignmask = 3,

View File

@ -887,19 +887,17 @@ ENDPROC(aesbs_ctr_encrypt)
veor \out, \out, \tmp
.endm
.align 4
.Lxts_mul_x:
.quad 1, 0x87
/*
* aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[])
* int blocks, u8 iv[], int reorder_last_tweak)
* aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[])
* int blocks, u8 iv[], int reorder_last_tweak)
*/
__xts_prepare8:
vld1.8 {q14}, [r7] // load iv
__ldr q15, .Lxts_mul_x // load tweak mask
vmov.i32 d30, #0x87 // compose tweak mask vector
vmovl.u32 q15, d30
vshr.u64 d30, d31, #7
vmov q12, q14
__adr ip, 0f
@ -946,17 +944,25 @@ __xts_prepare8:
vld1.8 {q7}, [r1]!
next_tweak q14, q12, q15, q13
veor q7, q7, q12
THUMB( itt le )
W(cmple) r8, #0
ble 1f
0: veor q7, q7, q12
vst1.8 {q12}, [r4, :128]
0: vst1.8 {q14}, [r7] // store next iv
vst1.8 {q14}, [r7] // store next iv
bx lr
1: vswp q12, q14
b 0b
ENDPROC(__xts_prepare8)
.macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
push {r4-r8, lr}
mov r5, sp // preserve sp
ldrd r6, r7, [sp, #24] // get blocks and iv args
ldr r8, [sp, #32] // reorder final tweak?
rsb r8, r8, #1
sub ip, sp, #128 // make room for 8x tweak
bic ip, ip, #0xf // align sp to 16 bytes
mov sp, ip

View File

@ -6,10 +6,13 @@
*/
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/aes.h>
#include <crypto/cbc.h>
#include <crypto/ctr.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <crypto/xts.h>
#include <linux/module.h>
@ -35,9 +38,9 @@ asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 ctr[], u8 final[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
int rounds, int blocks, u8 iv[], int);
asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
int rounds, int blocks, u8 iv[], int);
struct aesbs_ctx {
int rounds;
@ -51,9 +54,15 @@ struct aesbs_cbc_ctx {
struct aesbs_xts_ctx {
struct aesbs_ctx key;
struct crypto_cipher *cts_tfm;
struct crypto_cipher *tweak_tfm;
};
struct aesbs_ctr_ctx {
struct aesbs_ctx key; /* must be first member */
struct crypto_aes_ctx fallback;
};
static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
@ -61,7 +70,7 @@ static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;
err = crypto_aes_expand_key(&rk, in_key, key_len);
err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;
@ -83,9 +92,8 @@ static int __ecb_crypt(struct skcipher_request *req,
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
@ -93,12 +101,13 @@ static int __ecb_crypt(struct skcipher_request *req,
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
ctx->rounds, blocks);
kernel_neon_end();
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
@ -120,7 +129,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;
err = crypto_aes_expand_key(&rk, in_key, key_len);
err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;
@ -152,9 +161,8 @@ static int cbc_decrypt(struct skcipher_request *req)
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
@ -162,13 +170,14 @@ static int cbc_decrypt(struct skcipher_request *req)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
@ -189,6 +198,25 @@ static void cbc_exit(struct crypto_tfm *tfm)
crypto_free_cipher(ctx->enc_tfm);
}
static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = aes_expandkey(&ctx->fallback, in_key, key_len);
if (err)
return err;
ctx->key.rounds = 6 + key_len / 4;
kernel_neon_begin();
aesbs_convert_key(ctx->key.rk, ctx->fallback.key_enc, ctx->key.rounds);
kernel_neon_end();
return 0;
}
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@ -197,9 +225,8 @@ static int ctr_encrypt(struct skcipher_request *req)
u8 buf[AES_BLOCK_SIZE];
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
@ -210,8 +237,10 @@ static int ctr_encrypt(struct skcipher_request *req)
final = NULL;
}
kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final);
kernel_neon_end();
if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
@ -226,11 +255,33 @@ static int ctr_encrypt(struct skcipher_request *req)
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
}
static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
unsigned long flags;
/*
* Temporarily disable interrupts to avoid races where
* cachelines are evicted when the CPU is interrupted
* to do something else.
*/
local_irq_save(flags);
aes_encrypt(&ctx->fallback, dst, src);
local_irq_restore(flags);
}
static int ctr_encrypt_sync(struct skcipher_request *req)
{
if (!crypto_simd_usable())
return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
return ctr_encrypt(req);
}
static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
@ -242,6 +293,9 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
return err;
key_len /= 2;
err = crypto_cipher_setkey(ctx->cts_tfm, in_key, key_len);
if (err)
return err;
err = crypto_cipher_setkey(ctx->tweak_tfm, in_key + key_len, key_len);
if (err)
return err;
@ -253,7 +307,13 @@ static int xts_init(struct crypto_tfm *tfm)
{
struct aesbs_xts_ctx *ctx = crypto_tfm_ctx(tfm);
ctx->cts_tfm = crypto_alloc_cipher("aes", 0, 0);
if (IS_ERR(ctx->cts_tfm))
return PTR_ERR(ctx->cts_tfm);
ctx->tweak_tfm = crypto_alloc_cipher("aes", 0, 0);
if (IS_ERR(ctx->tweak_tfm))
crypto_free_cipher(ctx->cts_tfm);
return PTR_ERR_OR_ZERO(ctx->tweak_tfm);
}
@ -263,49 +323,89 @@ static void xts_exit(struct crypto_tfm *tfm)
struct aesbs_xts_ctx *ctx = crypto_tfm_ctx(tfm);
crypto_free_cipher(ctx->tweak_tfm);
crypto_free_cipher(ctx->cts_tfm);
}
static int __xts_crypt(struct skcipher_request *req,
static int __xts_crypt(struct skcipher_request *req, bool encrypt,
void (*fn)(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]))
int rounds, int blocks, u8 iv[], int))
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int tail = req->cryptlen % AES_BLOCK_SIZE;
struct skcipher_request subreq;
u8 buf[2 * AES_BLOCK_SIZE];
struct skcipher_walk walk;
int err;
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
if (unlikely(tail)) {
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
req->cryptlen - tail, req->iv);
req = &subreq;
}
err = skcipher_walk_virt(&walk, req, true);
if (err)
return err;
crypto_cipher_encrypt_one(ctx->tweak_tfm, walk.iv, walk.iv);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
int reorder_last_tweak = !encrypt && tail > 0;
if (walk.nbytes < walk.total)
if (walk.nbytes < walk.total) {
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
reorder_last_tweak = 0;
}
kernel_neon_begin();
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk,
ctx->key.rounds, blocks, walk.iv);
ctx->key.rounds, blocks, walk.iv, reorder_last_tweak);
kernel_neon_end();
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
kernel_neon_end();
return err;
if (err || likely(!tail))
return err;
/* handle ciphertext stealing */
scatterwalk_map_and_copy(buf, req->dst, req->cryptlen - AES_BLOCK_SIZE,
AES_BLOCK_SIZE, 0);
memcpy(buf + AES_BLOCK_SIZE, buf, tail);
scatterwalk_map_and_copy(buf, req->src, req->cryptlen, tail, 0);
crypto_xor(buf, req->iv, AES_BLOCK_SIZE);
if (encrypt)
crypto_cipher_encrypt_one(ctx->cts_tfm, buf, buf);
else
crypto_cipher_decrypt_one(ctx->cts_tfm, buf, buf);
crypto_xor(buf, req->iv, AES_BLOCK_SIZE);
scatterwalk_map_and_copy(buf, req->dst, req->cryptlen - AES_BLOCK_SIZE,
AES_BLOCK_SIZE + tail, 1);
return 0;
}
static int xts_encrypt(struct skcipher_request *req)
{
return __xts_crypt(req, aesbs_xts_encrypt);
return __xts_crypt(req, true, aesbs_xts_encrypt);
}
static int xts_decrypt(struct skcipher_request *req)
{
return __xts_crypt(req, aesbs_xts_decrypt);
return __xts_crypt(req, false, aesbs_xts_decrypt);
}
static struct skcipher_alg aes_algs[] = { {
@ -358,6 +458,22 @@ static struct skcipher_alg aes_algs[] = { {
.setkey = aesbs_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {
.base.cra_name = "ctr(aes)",
.base.cra_driver_name = "ctr-aes-neonbs-sync",
.base.cra_priority = 250 - 1,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.chunksize = AES_BLOCK_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_ctr_setkey_sync,
.encrypt = ctr_encrypt_sync,
.decrypt = ctr_encrypt_sync,
}, {
.base.cra_name = "__xts(aes)",
.base.cra_driver_name = "__xts-aes-neonbs",

View File

@ -9,6 +9,7 @@
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/unaligned.h>
#include <crypto/b128ops.h>
#include <crypto/cryptd.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
@ -17,7 +18,7 @@
#include <linux/crypto.h>
#include <linux/module.h>
MODULE_DESCRIPTION("GHASH secure hash using ARMv8 Crypto Extensions");
MODULE_DESCRIPTION("GHASH hash function using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("ghash");
@ -30,6 +31,8 @@ struct ghash_key {
u64 h2[2];
u64 h3[2];
u64 h4[2];
be128 k;
};
struct ghash_desc_ctx {
@ -62,6 +65,36 @@ static int ghash_init(struct shash_desc *desc)
return 0;
}
static void ghash_do_update(int blocks, u64 dg[], const char *src,
struct ghash_key *key, const char *head)
{
if (likely(crypto_simd_usable())) {
kernel_neon_begin();
pmull_ghash_update(blocks, dg, src, key, head);
kernel_neon_end();
} else {
be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
do {
const u8 *in = src;
if (head) {
in = head;
blocks++;
head = NULL;
} else {
src += GHASH_BLOCK_SIZE;
}
crypto_xor((u8 *)&dst, in, GHASH_BLOCK_SIZE);
gf128mul_lle(&dst, &key->k);
} while (--blocks);
dg[0] = be64_to_cpu(dst.b);
dg[1] = be64_to_cpu(dst.a);
}
}
static int ghash_update(struct shash_desc *desc, const u8 *src,
unsigned int len)
{
@ -85,10 +118,8 @@ static int ghash_update(struct shash_desc *desc, const u8 *src,
blocks = len / GHASH_BLOCK_SIZE;
len %= GHASH_BLOCK_SIZE;
kernel_neon_begin();
pmull_ghash_update(blocks, ctx->digest, src, key,
partial ? ctx->buf : NULL);
kernel_neon_end();
ghash_do_update(blocks, ctx->digest, src, key,
partial ? ctx->buf : NULL);
src += blocks * GHASH_BLOCK_SIZE;
partial = 0;
}
@ -106,9 +137,7 @@ static int ghash_final(struct shash_desc *desc, u8 *dst)
struct ghash_key *key = crypto_shash_ctx(desc->tfm);
memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
kernel_neon_begin();
pmull_ghash_update(1, ctx->digest, ctx->buf, key, NULL);
kernel_neon_end();
ghash_do_update(1, ctx->digest, ctx->buf, key, NULL);
}
put_unaligned_be64(ctx->digest[1], dst);
put_unaligned_be64(ctx->digest[0], dst + 8);
@ -132,24 +161,25 @@ static int ghash_setkey(struct crypto_shash *tfm,
const u8 *inkey, unsigned int keylen)
{
struct ghash_key *key = crypto_shash_ctx(tfm);
be128 h, k;
be128 h;
if (keylen != GHASH_BLOCK_SIZE) {
crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
memcpy(&k, inkey, GHASH_BLOCK_SIZE);
ghash_reflect(key->h, &k);
/* needed for the fallback */
memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
ghash_reflect(key->h, &key->k);
h = k;
gf128mul_lle(&h, &k);
h = key->k;
gf128mul_lle(&h, &key->k);
ghash_reflect(key->h2, &h);
gf128mul_lle(&h, &k);
gf128mul_lle(&h, &key->k);
ghash_reflect(key->h3, &h);
gf128mul_lle(&h, &k);
gf128mul_lle(&h, &key->k);
ghash_reflect(key->h4, &h);
return 0;
@ -162,15 +192,13 @@ static struct shash_alg ghash_alg = {
.final = ghash_final,
.setkey = ghash_setkey,
.descsize = sizeof(struct ghash_desc_ctx),
.base = {
.cra_name = "__ghash",
.cra_driver_name = "__driver-ghash-ce",
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = GHASH_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct ghash_key),
.cra_module = THIS_MODULE,
},
.base.cra_name = "ghash",
.base.cra_driver_name = "ghash-ce-sync",
.base.cra_priority = 300 - 1,
.base.cra_blocksize = GHASH_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct ghash_key),
.base.cra_module = THIS_MODULE,
};
static int ghash_async_init(struct ahash_request *req)
@ -285,9 +313,7 @@ static int ghash_async_init_tfm(struct crypto_tfm *tfm)
struct cryptd_ahash *cryptd_tfm;
struct ghash_async_ctx *ctx = crypto_tfm_ctx(tfm);
cryptd_tfm = cryptd_alloc_ahash("__driver-ghash-ce",
CRYPTO_ALG_INTERNAL,
CRYPTO_ALG_INTERNAL);
cryptd_tfm = cryptd_alloc_ahash("ghash-ce-sync", 0, 0);
if (IS_ERR(cryptd_tfm))
return PTR_ERR(cryptd_tfm);
ctx->cryptd_tfm = cryptd_tfm;

View File

@ -39,7 +39,7 @@ int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
}
EXPORT_SYMBOL(crypto_sha256_arm_update);
static int sha256_final(struct shash_desc *desc, u8 *out)
static int crypto_sha256_arm_final(struct shash_desc *desc, u8 *out)
{
sha256_base_do_finalize(desc,
(sha256_block_fn *)sha256_block_data_order);
@ -51,7 +51,7 @@ int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
{
sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order);
return sha256_final(desc, out);
return crypto_sha256_arm_final(desc, out);
}
EXPORT_SYMBOL(crypto_sha256_arm_finup);
@ -59,7 +59,7 @@ static struct shash_alg algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = crypto_sha256_arm_update,
.final = sha256_final,
.final = crypto_sha256_arm_final,
.finup = crypto_sha256_arm_finup,
.descsize = sizeof(struct sha256_state),
.base = {
@ -73,7 +73,7 @@ static struct shash_alg algs[] = { {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = crypto_sha256_arm_update,
.final = sha256_final,
.final = crypto_sha256_arm_final,
.finup = crypto_sha256_arm_finup,
.descsize = sizeof(struct sha256_state),
.base = {

View File

@ -25,8 +25,8 @@
asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
unsigned int num_blks);
static int sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
static int crypto_sha256_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
@ -42,8 +42,8 @@ static int sha256_update(struct shash_desc *desc, const u8 *data,
return 0;
}
static int sha256_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
static int crypto_sha256_neon_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable())
return crypto_sha256_arm_finup(desc, data, len, out);
@ -59,17 +59,17 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data,
return sha256_base_finish(desc, out);
}
static int sha256_final(struct shash_desc *desc, u8 *out)
static int crypto_sha256_neon_final(struct shash_desc *desc, u8 *out)
{
return sha256_finup(desc, NULL, 0, out);
return crypto_sha256_neon_finup(desc, NULL, 0, out);
}
struct shash_alg sha256_neon_algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = sha256_update,
.final = sha256_final,
.finup = sha256_finup,
.update = crypto_sha256_neon_update,
.final = crypto_sha256_neon_final,
.finup = crypto_sha256_neon_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha256",
@ -81,9 +81,9 @@ struct shash_alg sha256_neon_algs[] = { {
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = sha256_update,
.final = sha256_final,
.finup = sha256_finup,
.update = crypto_sha256_neon_update,
.final = crypto_sha256_neon_final,
.finup = crypto_sha256_neon_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha224",

View File

@ -17,7 +17,6 @@ generic-y += parport.h
generic-y += preempt.h
generic-y += seccomp.h
generic-y += serial.h
generic-y += simd.h
generic-y += trace_clock.h
generated-y += mach-types.h

View File

@ -751,6 +751,36 @@
status = "disabled";
};
crypto: crypto@30900000 {
compatible = "fsl,sec-v4.0";
#address-cells = <1>;
#size-cells = <1>;
reg = <0x30900000 0x40000>;
ranges = <0 0x30900000 0x40000>;
interrupts = <GIC_SPI 91 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MQ_CLK_AHB>,
<&clk IMX8MQ_CLK_IPG_ROOT>;
clock-names = "aclk", "ipg";
sec_jr0: jr@1000 {
compatible = "fsl,sec-v4.0-job-ring";
reg = <0x1000 0x1000>;
interrupts = <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH>;
};
sec_jr1: jr@2000 {
compatible = "fsl,sec-v4.0-job-ring";
reg = <0x2000 0x1000>;
interrupts = <GIC_SPI 106 IRQ_TYPE_LEVEL_HIGH>;
};
sec_jr2: jr@3000 {
compatible = "fsl,sec-v4.0-job-ring";
reg = <0x3000 0x1000>;
interrupts = <GIC_SPI 114 IRQ_TYPE_LEVEL_HIGH>;
};
};
dphy: dphy@30a00300 {
compatible = "fsl,imx8mq-mipi-dphy";
reg = <0x30a00300 0x100>;

View File

@ -58,8 +58,7 @@ config CRYPTO_GHASH_ARM64_CE
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_GF128MUL
select CRYPTO_AES
select CRYPTO_AES_ARM64
select CRYPTO_LIB_AES
config CRYPTO_CRCT10DIF_ARM64_CE
tristate "CRCT10DIF digest algorithm using PMULL instructions"
@ -74,15 +73,15 @@ config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_AES_ARM64
select CRYPTO_LIB_AES
config CRYPTO_AES_ARM64_CE_CCM
tristate "AES in CCM mode using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_AES_ARM64_CE
select CRYPTO_AES_ARM64
select CRYPTO_AEAD
select CRYPTO_LIB_AES
config CRYPTO_AES_ARM64_CE_BLK
tristate "AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions"
@ -97,7 +96,7 @@ config CRYPTO_AES_ARM64_NEON_BLK
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_AES_ARM64
select CRYPTO_AES
select CRYPTO_LIB_AES
select CRYPTO_SIMD
config CRYPTO_CHACHA20_NEON
@ -117,6 +116,7 @@ config CRYPTO_AES_ARM64_BS
select CRYPTO_BLKCIPHER
select CRYPTO_AES_ARM64_NEON_BLK
select CRYPTO_AES_ARM64
select CRYPTO_LIB_AES
select CRYPTO_SIMD
endif

View File

@ -43,8 +43,6 @@ asmlinkage void ce_aes_ccm_decrypt(u8 out[], u8 const in[], u32 cbytes,
asmlinkage void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u32 const rk[],
u32 rounds);
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
static int ccm_setkey(struct crypto_aead *tfm, const u8 *in_key,
unsigned int key_len)
{
@ -124,8 +122,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
}
while (abytes >= AES_BLOCK_SIZE) {
__aes_arm64_encrypt(key->key_enc, mac, mac,
num_rounds(key));
aes_encrypt(key, mac, mac);
crypto_xor(mac, in, AES_BLOCK_SIZE);
in += AES_BLOCK_SIZE;
@ -133,8 +130,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
}
if (abytes > 0) {
__aes_arm64_encrypt(key->key_enc, mac, mac,
num_rounds(key));
aes_encrypt(key, mac, mac);
crypto_xor(mac, in, abytes);
*macp = abytes;
}
@ -206,10 +202,8 @@ static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
bsize = nbytes;
crypto_inc(walk->iv, AES_BLOCK_SIZE);
__aes_arm64_encrypt(ctx->key_enc, buf, walk->iv,
num_rounds(ctx));
__aes_arm64_encrypt(ctx->key_enc, mac, mac,
num_rounds(ctx));
aes_encrypt(ctx, buf, walk->iv);
aes_encrypt(ctx, mac, mac);
if (enc)
crypto_xor(mac, src, bsize);
crypto_xor_cpy(dst, src, buf, bsize);
@ -224,8 +218,8 @@ static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
}
if (!err) {
__aes_arm64_encrypt(ctx->key_enc, buf, iv0, num_rounds(ctx));
__aes_arm64_encrypt(ctx->key_enc, mac, mac, num_rounds(ctx));
aes_encrypt(ctx, buf, iv0);
aes_encrypt(ctx, mac, mac);
crypto_xor(mac, buf, AES_BLOCK_SIZE);
}
return err;

View File

@ -20,9 +20,6 @@ MODULE_DESCRIPTION("Synchronous AES cipher using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
struct aes_block {
u8 b[AES_BLOCK_SIZE];
};
@ -51,7 +48,7 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
if (!crypto_simd_usable()) {
__aes_arm64_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
aes_encrypt(ctx, dst, src);
return;
}
@ -65,7 +62,7 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
if (!crypto_simd_usable()) {
__aes_arm64_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
aes_decrypt(ctx, dst, src);
return;
}

View File

@ -21,6 +21,9 @@
.macro xts_reload_mask, tmp
.endm
.macro xts_cts_skip_tw, reg, lbl
.endm
/* preload all round keys */
.macro load_round_keys, rounds, rk
cmp \rounds, #12

View File

@ -128,43 +128,5 @@ ENDPROC(__aes_arm64_encrypt)
.align 5
ENTRY(__aes_arm64_decrypt)
do_crypt iround, crypto_it_tab, __aes_arm64_inverse_sbox, 0
do_crypt iround, crypto_it_tab, crypto_aes_inv_sbox, 0
ENDPROC(__aes_arm64_decrypt)
.section ".rodata", "a"
.align L1_CACHE_SHIFT
.type __aes_arm64_inverse_sbox, %object
__aes_arm64_inverse_sbox:
.byte 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
.byte 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
.byte 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
.byte 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
.byte 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
.byte 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
.byte 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
.byte 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
.byte 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
.byte 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
.byte 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
.byte 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
.byte 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
.byte 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
.byte 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
.byte 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
.byte 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
.byte 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
.byte 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
.byte 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
.byte 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
.byte 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
.byte 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
.byte 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
.byte 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
.byte 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
.byte 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
.byte 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
.byte 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
.byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
.byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
.byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
.size __aes_arm64_inverse_sbox, . - __aes_arm64_inverse_sbox

View File

@ -10,12 +10,9 @@
#include <linux/module.h>
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
EXPORT_SYMBOL(__aes_arm64_encrypt);
asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
EXPORT_SYMBOL(__aes_arm64_decrypt);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void aes_arm64_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
@ -23,7 +20,7 @@ static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
__aes_arm64_encrypt(ctx->key_enc, out, in, rounds);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void aes_arm64_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int rounds = 6 + ctx->key_length / 4;
@ -43,8 +40,8 @@ static struct crypto_alg aes_alg = {
.cra_cipher.cia_min_keysize = AES_MIN_KEY_SIZE,
.cra_cipher.cia_max_keysize = AES_MAX_KEY_SIZE,
.cra_cipher.cia_setkey = crypto_aes_set_key,
.cra_cipher.cia_encrypt = aes_encrypt,
.cra_cipher.cia_decrypt = aes_decrypt
.cra_cipher.cia_encrypt = aes_arm64_encrypt,
.cra_cipher.cia_decrypt = aes_arm64_decrypt
};
static int __init aes_init(void)

View File

@ -1,50 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Fallback for sync aes(ctr) in contexts where kernel mode NEON
* is not allowed
*
* Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
*/
#include <crypto/aes.h>
#include <crypto/internal/skcipher.h>
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
static inline int aes_ctr_encrypt_fallback(struct crypto_aes_ctx *ctx,
struct skcipher_request *req)
{
struct skcipher_walk walk;
u8 buf[AES_BLOCK_SIZE];
int err;
err = skcipher_walk_virt(&walk, req, true);
while (walk.nbytes > 0) {
u8 *dst = walk.dst.virt.addr;
u8 *src = walk.src.virt.addr;
int nbytes = walk.nbytes;
int tail = 0;
if (nbytes < walk.total) {
nbytes = round_down(nbytes, AES_BLOCK_SIZE);
tail = walk.nbytes % AES_BLOCK_SIZE;
}
do {
int bsize = min(nbytes, AES_BLOCK_SIZE);
__aes_arm64_encrypt(ctx->key_enc, buf, walk.iv,
6 + ctx->key_length / 4);
crypto_xor_cpy(dst, src, buf, bsize);
crypto_inc(walk.iv, AES_BLOCK_SIZE);
dst += AES_BLOCK_SIZE;
src += AES_BLOCK_SIZE;
nbytes -= AES_BLOCK_SIZE;
} while (nbytes > 0);
err = skcipher_walk_done(&walk, tail);
}
return err;
}

View File

@ -9,6 +9,8 @@
#include <asm/hwcap.h>
#include <asm/simd.h>
#include <crypto/aes.h>
#include <crypto/ctr.h>
#include <crypto/sha.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
@ -18,12 +20,10 @@
#include <crypto/xts.h>
#include "aes-ce-setkey.h"
#include "aes-ctr-fallback.h"
#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce"
#define PRIO 300
#define aes_setkey ce_aes_setkey
#define aes_expandkey ce_aes_expandkey
#define aes_ecb_encrypt ce_aes_ecb_encrypt
#define aes_ecb_decrypt ce_aes_ecb_decrypt
@ -31,6 +31,8 @@
#define aes_cbc_decrypt ce_aes_cbc_decrypt
#define aes_cbc_cts_encrypt ce_aes_cbc_cts_encrypt
#define aes_cbc_cts_decrypt ce_aes_cbc_cts_decrypt
#define aes_essiv_cbc_encrypt ce_aes_essiv_cbc_encrypt
#define aes_essiv_cbc_decrypt ce_aes_essiv_cbc_decrypt
#define aes_ctr_encrypt ce_aes_ctr_encrypt
#define aes_xts_encrypt ce_aes_xts_encrypt
#define aes_xts_decrypt ce_aes_xts_decrypt
@ -39,27 +41,31 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else
#define MODE "neon"
#define PRIO 200
#define aes_setkey crypto_aes_set_key
#define aes_expandkey crypto_aes_expand_key
#define aes_ecb_encrypt neon_aes_ecb_encrypt
#define aes_ecb_decrypt neon_aes_ecb_decrypt
#define aes_cbc_encrypt neon_aes_cbc_encrypt
#define aes_cbc_decrypt neon_aes_cbc_decrypt
#define aes_cbc_cts_encrypt neon_aes_cbc_cts_encrypt
#define aes_cbc_cts_decrypt neon_aes_cbc_cts_decrypt
#define aes_essiv_cbc_encrypt neon_aes_essiv_cbc_encrypt
#define aes_essiv_cbc_decrypt neon_aes_essiv_cbc_decrypt
#define aes_ctr_encrypt neon_aes_ctr_encrypt
#define aes_xts_encrypt neon_aes_xts_encrypt
#define aes_xts_decrypt neon_aes_xts_decrypt
#define aes_mac_update neon_aes_mac_update
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 NEON");
#endif
#if defined(USE_V8_CRYPTO_EXTENSIONS) || !defined(CONFIG_CRYPTO_AES_ARM64_BS)
MODULE_ALIAS_CRYPTO("ecb(aes)");
MODULE_ALIAS_CRYPTO("cbc(aes)");
MODULE_ALIAS_CRYPTO("ctr(aes)");
MODULE_ALIAS_CRYPTO("xts(aes)");
#endif
MODULE_ALIAS_CRYPTO("cts(cbc(aes))");
MODULE_ALIAS_CRYPTO("essiv(cbc(aes),sha256)");
MODULE_ALIAS_CRYPTO("cmac(aes)");
MODULE_ALIAS_CRYPTO("xcbc(aes)");
MODULE_ALIAS_CRYPTO("cbcmac(aes)");
#endif
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@ -84,27 +90,34 @@ asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 ctr[]);
asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int blocks, u32 const rk2[], u8 iv[],
int rounds, int bytes, u32 const rk2[], u8 iv[],
int first);
asmlinkage void aes_xts_decrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int blocks, u32 const rk2[], u8 iv[],
int rounds, int bytes, u32 const rk2[], u8 iv[],
int first);
asmlinkage void aes_essiv_cbc_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int blocks, u8 iv[],
u32 const rk2[]);
asmlinkage void aes_essiv_cbc_decrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int blocks, u8 iv[],
u32 const rk2[]);
asmlinkage void aes_mac_update(u8 const in[], u32 const rk[], int rounds,
int blocks, u8 dg[], int enc_before,
int enc_after);
struct cts_cbc_req_ctx {
struct scatterlist sg_src[2];
struct scatterlist sg_dst[2];
struct skcipher_request subreq;
};
struct crypto_aes_xts_ctx {
struct crypto_aes_ctx key1;
struct crypto_aes_ctx __aligned(8) key2;
};
struct crypto_aes_essiv_cbc_ctx {
struct crypto_aes_ctx key1;
struct crypto_aes_ctx __aligned(8) key2;
struct crypto_shash *hash;
};
struct mac_tfm_ctx {
struct crypto_aes_ctx key;
u8 __aligned(8) consts[];
@ -118,11 +131,18 @@ struct mac_desc_ctx {
static int skcipher_aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
return aes_setkey(crypto_skcipher_tfm(tfm), in_key, key_len);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int ret;
ret = aes_expandkey(ctx, in_key, key_len);
if (ret)
crypto_skcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return ret;
}
static int xts_set_key(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
static int __maybe_unused xts_set_key(struct crypto_skcipher *tfm,
const u8 *in_key, unsigned int key_len)
{
struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int ret;
@ -142,7 +162,33 @@ static int xts_set_key(struct crypto_skcipher *tfm, const u8 *in_key,
return -EINVAL;
}
static int ecb_encrypt(struct skcipher_request *req)
static int __maybe_unused essiv_cbc_set_key(struct crypto_skcipher *tfm,
const u8 *in_key,
unsigned int key_len)
{
struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
SHASH_DESC_ON_STACK(desc, ctx->hash);
u8 digest[SHA256_DIGEST_SIZE];
int ret;
ret = aes_expandkey(&ctx->key1, in_key, key_len);
if (ret)
goto out;
desc->tfm = ctx->hash;
crypto_shash_digest(desc, in_key, key_len, digest);
ret = aes_expandkey(&ctx->key2, digest, sizeof(digest));
if (ret)
goto out;
return 0;
out:
crypto_skcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
static int __maybe_unused ecb_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
@ -162,7 +208,7 @@ static int ecb_encrypt(struct skcipher_request *req)
return err;
}
static int ecb_decrypt(struct skcipher_request *req)
static int __maybe_unused ecb_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
@ -182,63 +228,78 @@ static int ecb_decrypt(struct skcipher_request *req)
return err;
}
static int cbc_encrypt(struct skcipher_request *req)
static int cbc_encrypt_walk(struct skcipher_request *req,
struct skcipher_walk *walk)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk;
int err = 0, rounds = 6 + ctx->key_length / 4;
unsigned int blocks;
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, blocks, walk.iv);
aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_enc, rounds, blocks, walk->iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
}
static int cbc_decrypt(struct skcipher_request *req)
static int __maybe_unused cbc_encrypt(struct skcipher_request *req)
{
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
return cbc_encrypt_walk(req, &walk);
}
static int cbc_decrypt_walk(struct skcipher_request *req,
struct skcipher_walk *walk)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk;
int err = 0, rounds = 6 + ctx->key_length / 4;
unsigned int blocks;
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, rounds, blocks, walk.iv);
aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_dec, rounds, blocks, walk->iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
}
static int cts_cbc_init_tfm(struct crypto_skcipher *tfm)
static int __maybe_unused cbc_decrypt(struct skcipher_request *req)
{
crypto_skcipher_set_reqsize(tfm, sizeof(struct cts_cbc_req_ctx));
return 0;
struct skcipher_walk walk;
int err;
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
return cbc_decrypt_walk(req, &walk);
}
static int cts_cbc_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
struct cts_cbc_req_ctx *rctx = skcipher_request_ctx(req);
int err, rounds = 6 + ctx->key_length / 4;
int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
struct scatterlist *src = req->src, *dst = req->dst;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct skcipher_walk walk;
skcipher_request_set_tfm(&rctx->subreq, tfm);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq, skcipher_request_flags(req),
NULL, NULL);
if (req->cryptlen <= AES_BLOCK_SIZE) {
if (req->cryptlen < AES_BLOCK_SIZE)
@ -247,41 +308,30 @@ static int cts_cbc_encrypt(struct skcipher_request *req)
}
if (cbc_blocks > 0) {
unsigned int blocks;
skcipher_request_set_crypt(&rctx->subreq, req->src, req->dst,
skcipher_request_set_crypt(&subreq, req->src, req->dst,
cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &rctx->subreq, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, blocks, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk,
walk.nbytes % AES_BLOCK_SIZE);
}
err = skcipher_walk_virt(&walk, &subreq, false) ?:
cbc_encrypt_walk(&subreq, &walk);
if (err)
return err;
if (req->cryptlen == AES_BLOCK_SIZE)
return 0;
dst = src = scatterwalk_ffwd(rctx->sg_src, req->src,
rctx->subreq.cryptlen);
dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(rctx->sg_dst, req->dst,
rctx->subreq.cryptlen);
dst = scatterwalk_ffwd(sg_dst, req->dst,
subreq.cryptlen);
}
/* handle ciphertext stealing */
skcipher_request_set_crypt(&rctx->subreq, src, dst,
skcipher_request_set_crypt(&subreq, src, dst,
req->cryptlen - cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &rctx->subreq, false);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
@ -297,13 +347,16 @@ static int cts_cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
struct cts_cbc_req_ctx *rctx = skcipher_request_ctx(req);
int err, rounds = 6 + ctx->key_length / 4;
int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
struct scatterlist *src = req->src, *dst = req->dst;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct skcipher_walk walk;
skcipher_request_set_tfm(&rctx->subreq, tfm);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq, skcipher_request_flags(req),
NULL, NULL);
if (req->cryptlen <= AES_BLOCK_SIZE) {
if (req->cryptlen < AES_BLOCK_SIZE)
@ -312,41 +365,30 @@ static int cts_cbc_decrypt(struct skcipher_request *req)
}
if (cbc_blocks > 0) {
unsigned int blocks;
skcipher_request_set_crypt(&rctx->subreq, req->src, req->dst,
skcipher_request_set_crypt(&subreq, req->src, req->dst,
cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &rctx->subreq, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, rounds, blocks, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk,
walk.nbytes % AES_BLOCK_SIZE);
}
err = skcipher_walk_virt(&walk, &subreq, false) ?:
cbc_decrypt_walk(&subreq, &walk);
if (err)
return err;
if (req->cryptlen == AES_BLOCK_SIZE)
return 0;
dst = src = scatterwalk_ffwd(rctx->sg_src, req->src,
rctx->subreq.cryptlen);
dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(rctx->sg_dst, req->dst,
rctx->subreq.cryptlen);
dst = scatterwalk_ffwd(sg_dst, req->dst,
subreq.cryptlen);
}
/* handle ciphertext stealing */
skcipher_request_set_crypt(&rctx->subreq, src, dst,
skcipher_request_set_crypt(&subreq, src, dst,
req->cryptlen - cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &rctx->subreq, false);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
@ -358,6 +400,66 @@ static int cts_cbc_decrypt(struct skcipher_request *req)
return skcipher_walk_done(&walk, 0);
}
static int __maybe_unused essiv_cbc_init_tfm(struct crypto_skcipher *tfm)
{
struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
ctx->hash = crypto_alloc_shash("sha256", 0, 0);
return PTR_ERR_OR_ZERO(ctx->hash);
}
static void __maybe_unused essiv_cbc_exit_tfm(struct crypto_skcipher *tfm)
{
struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
crypto_free_shash(ctx->hash);
}
static int __maybe_unused essiv_cbc_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, rounds = 6 + ctx->key1.key_length / 4;
struct skcipher_walk walk;
unsigned int blocks;
err = skcipher_walk_virt(&walk, req, false);
blocks = walk.nbytes / AES_BLOCK_SIZE;
if (blocks) {
kernel_neon_begin();
aes_essiv_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, blocks,
req->iv, ctx->key2.key_enc);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err ?: cbc_encrypt_walk(req, &walk);
}
static int __maybe_unused essiv_cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, rounds = 6 + ctx->key1.key_length / 4;
struct skcipher_walk walk;
unsigned int blocks;
err = skcipher_walk_virt(&walk, req, false);
blocks = walk.nbytes / AES_BLOCK_SIZE;
if (blocks) {
kernel_neon_begin();
aes_essiv_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, blocks,
req->iv, ctx->key2.key_enc);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err ?: cbc_decrypt_walk(req, &walk);
}
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@ -397,62 +499,176 @@ static int ctr_encrypt(struct skcipher_request *req)
return err;
}
static int ctr_encrypt_sync(struct skcipher_request *req)
static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
unsigned long flags;
/*
* Temporarily disable interrupts to avoid races where
* cachelines are evicted when the CPU is interrupted
* to do something else.
*/
local_irq_save(flags);
aes_encrypt(ctx, dst, src);
local_irq_restore(flags);
}
static int __maybe_unused ctr_encrypt_sync(struct skcipher_request *req)
{
if (!crypto_simd_usable())
return aes_ctr_encrypt_fallback(ctx, req);
return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
return ctr_encrypt(req);
}
static int xts_encrypt(struct skcipher_request *req)
static int __maybe_unused xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key1.key_length / 4;
int tail = req->cryptlen % AES_BLOCK_SIZE;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct scatterlist *src, *dst;
struct skcipher_walk walk;
unsigned int blocks;
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
err = skcipher_walk_virt(&walk, req, false);
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
int xts_blocks = DIV_ROUND_UP(req->cryptlen,
AES_BLOCK_SIZE) - 2;
skcipher_walk_abort(&walk);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
xts_blocks * AES_BLOCK_SIZE,
req->iv);
req = &subreq;
err = skcipher_walk_virt(&walk, req, false);
} else {
tail = 0;
}
for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) {
int nbytes = walk.nbytes;
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, blocks,
ctx->key1.key_enc, rounds, nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
return err;
if (err || likely(!tail))
return err;
dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen);
skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_neon_begin();
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, walk.nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
return skcipher_walk_done(&walk, 0);
}
static int xts_decrypt(struct skcipher_request *req)
static int __maybe_unused xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key1.key_length / 4;
int tail = req->cryptlen % AES_BLOCK_SIZE;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct scatterlist *src, *dst;
struct skcipher_walk walk;
unsigned int blocks;
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
err = skcipher_walk_virt(&walk, req, false);
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
kernel_neon_begin();
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, blocks,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
int xts_blocks = DIV_ROUND_UP(req->cryptlen,
AES_BLOCK_SIZE) - 2;
skcipher_walk_abort(&walk);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
xts_blocks * AES_BLOCK_SIZE,
req->iv);
req = &subreq;
err = skcipher_walk_virt(&walk, req, false);
} else {
tail = 0;
}
return err;
for (first = 1; walk.nbytes >= AES_BLOCK_SIZE; first = 0) {
int nbytes = walk.nbytes;
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
if (err || likely(!tail))
return err;
dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen);
skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_neon_begin();
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, walk.nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
return skcipher_walk_done(&walk, 0);
}
static struct skcipher_alg aes_algs[] = { {
#if defined(USE_V8_CRYPTO_EXTENSIONS) || !defined(CONFIG_CRYPTO_AES_ARM64_BS)
.base = {
.cra_name = "__ecb(aes)",
.cra_driver_name = "__ecb-aes-" MODE,
@ -483,24 +699,6 @@ static struct skcipher_alg aes_algs[] = { {
.setkey = skcipher_aes_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base = {
.cra_name = "__cts(cbc(aes))",
.cra_driver_name = "__cts-cbc-aes-" MODE,
.cra_priority = PRIO,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = skcipher_aes_setkey,
.encrypt = cts_cbc_encrypt,
.decrypt = cts_cbc_decrypt,
.init = cts_cbc_init_tfm,
}, {
.base = {
.cra_name = "__ctr(aes)",
@ -547,9 +745,46 @@ static struct skcipher_alg aes_algs[] = { {
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = xts_set_key,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
}, {
#endif
.base = {
.cra_name = "__cts(cbc(aes))",
.cra_driver_name = "__cts-cbc-aes-" MODE,
.cra_priority = PRIO,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = skcipher_aes_setkey,
.encrypt = cts_cbc_encrypt,
.decrypt = cts_cbc_decrypt,
}, {
.base = {
.cra_name = "__essiv(cbc(aes),sha256)",
.cra_driver_name = "__essiv-cbc-aes-sha256-" MODE,
.cra_priority = PRIO + 1,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_essiv_cbc_ctx),
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = essiv_cbc_set_key,
.encrypt = essiv_cbc_encrypt,
.decrypt = essiv_cbc_decrypt,
.init = essiv_cbc_init_tfm,
.exit = essiv_cbc_exit_tfm,
} };
static int cbcmac_setkey(struct crypto_shash *tfm, const u8 *in_key,
@ -646,15 +881,14 @@ static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
kernel_neon_end();
} else {
if (enc_before)
__aes_arm64_encrypt(ctx->key_enc, dg, dg, rounds);
aes_encrypt(ctx, dg, dg);
while (blocks--) {
crypto_xor(dg, in, AES_BLOCK_SIZE);
in += AES_BLOCK_SIZE;
if (blocks || enc_after)
__aes_arm64_encrypt(ctx->key_enc, dg, dg,
rounds);
aes_encrypt(ctx, dg, dg);
}
}
}
@ -837,5 +1071,7 @@ module_cpu_feature_match(AES, aes_init);
module_init(aes_init);
EXPORT_SYMBOL(neon_aes_ecb_encrypt);
EXPORT_SYMBOL(neon_aes_cbc_encrypt);
EXPORT_SYMBOL(neon_aes_xts_encrypt);
EXPORT_SYMBOL(neon_aes_xts_decrypt);
#endif
module_exit(aes_exit);

View File

@ -118,8 +118,23 @@ AES_ENDPROC(aes_ecb_decrypt)
* int blocks, u8 iv[])
* aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[])
* aes_essiv_cbc_encrypt(u8 out[], u8 const in[], u32 const rk1[],
* int rounds, int blocks, u8 iv[],
* u32 const rk2[]);
* aes_essiv_cbc_decrypt(u8 out[], u8 const in[], u32 const rk1[],
* int rounds, int blocks, u8 iv[],
* u32 const rk2[]);
*/
AES_ENTRY(aes_essiv_cbc_encrypt)
ld1 {v4.16b}, [x5] /* get iv */
mov w8, #14 /* AES-256: 14 rounds */
enc_prepare w8, x6, x7
encrypt_block v4, w8, x6, x7, w9
enc_switch_key w3, x2, x6
b .Lcbcencloop4x
AES_ENTRY(aes_cbc_encrypt)
ld1 {v4.16b}, [x5] /* get iv */
enc_prepare w3, x2, x6
@ -153,13 +168,25 @@ AES_ENTRY(aes_cbc_encrypt)
st1 {v4.16b}, [x5] /* return iv */
ret
AES_ENDPROC(aes_cbc_encrypt)
AES_ENDPROC(aes_essiv_cbc_encrypt)
AES_ENTRY(aes_essiv_cbc_decrypt)
stp x29, x30, [sp, #-16]!
mov x29, sp
ld1 {cbciv.16b}, [x5] /* get iv */
mov w8, #14 /* AES-256: 14 rounds */
enc_prepare w8, x6, x7
encrypt_block cbciv, w8, x6, x7, w9
b .Lessivcbcdecstart
AES_ENTRY(aes_cbc_decrypt)
stp x29, x30, [sp, #-16]!
mov x29, sp
ld1 {cbciv.16b}, [x5] /* get iv */
.Lessivcbcdecstart:
dec_prepare w3, x2, x6
.LcbcdecloopNx:
@ -212,6 +239,7 @@ ST5( st1 {v4.16b}, [x0], #16 )
ldp x29, x30, [sp], #16
ret
AES_ENDPROC(aes_cbc_decrypt)
AES_ENDPROC(aes_essiv_cbc_decrypt)
/*
@ -265,12 +293,11 @@ AES_ENTRY(aes_cbc_cts_decrypt)
ld1 {v5.16b}, [x5] /* get iv */
dec_prepare w3, x2, x6
tbl v2.16b, {v1.16b}, v4.16b
decrypt_block v0, w3, x2, x6, w7
eor v2.16b, v2.16b, v0.16b
tbl v2.16b, {v0.16b}, v3.16b
eor v2.16b, v2.16b, v1.16b
tbx v0.16b, {v1.16b}, v4.16b
tbl v2.16b, {v2.16b}, v3.16b
decrypt_block v0, w3, x2, x6, w7
eor v0.16b, v0.16b, v5.16b /* xor with iv */
@ -386,10 +413,10 @@ AES_ENDPROC(aes_ctr_encrypt)
/*
* aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
* int bytes, u8 const rk2[], u8 iv[], int first)
* aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
* int blocks, u8 const rk2[], u8 iv[], int first)
* aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
* int blocks, u8 const rk2[], u8 iv[], int first)
* int bytes, u8 const rk2[], u8 iv[], int first)
*/
.macro next_tweak, out, in, tmp
@ -415,6 +442,7 @@ AES_ENTRY(aes_xts_encrypt)
cbz w7, .Lxtsencnotfirst
enc_prepare w3, x5, x8
xts_cts_skip_tw w7, .LxtsencNx
encrypt_block v4, w3, x5, x8, w7 /* first tweak */
enc_switch_key w3, x2, x8
b .LxtsencNx
@ -424,7 +452,7 @@ AES_ENTRY(aes_xts_encrypt)
.LxtsencloopNx:
next_tweak v4, v4, v8
.LxtsencNx:
subs w4, w4, #4
subs w4, w4, #64
bmi .Lxtsenc1x
ld1 {v0.16b-v3.16b}, [x1], #64 /* get 4 pt blocks */
next_tweak v5, v4, v8
@ -441,39 +469,74 @@ AES_ENTRY(aes_xts_encrypt)
eor v2.16b, v2.16b, v6.16b
st1 {v0.16b-v3.16b}, [x0], #64
mov v4.16b, v7.16b
cbz w4, .Lxtsencout
cbz w4, .Lxtsencret
xts_reload_mask v8
b .LxtsencloopNx
.Lxtsenc1x:
adds w4, w4, #4
adds w4, w4, #64
beq .Lxtsencout
subs w4, w4, #16
bmi .LxtsencctsNx
.Lxtsencloop:
ld1 {v1.16b}, [x1], #16
eor v0.16b, v1.16b, v4.16b
ld1 {v0.16b}, [x1], #16
.Lxtsencctsout:
eor v0.16b, v0.16b, v4.16b
encrypt_block v0, w3, x2, x8, w7
eor v0.16b, v0.16b, v4.16b
st1 {v0.16b}, [x0], #16
subs w4, w4, #1
beq .Lxtsencout
cbz w4, .Lxtsencout
subs w4, w4, #16
next_tweak v4, v4, v8
bmi .Lxtsenccts
st1 {v0.16b}, [x0], #16
b .Lxtsencloop
.Lxtsencout:
st1 {v0.16b}, [x0]
.Lxtsencret:
st1 {v4.16b}, [x6]
ldp x29, x30, [sp], #16
ret
AES_ENDPROC(aes_xts_encrypt)
.LxtsencctsNx:
mov v0.16b, v3.16b
sub x0, x0, #16
.Lxtsenccts:
adr_l x8, .Lcts_permute_table
add x1, x1, w4, sxtw /* rewind input pointer */
add w4, w4, #16 /* # bytes in final block */
add x9, x8, #32
add x8, x8, x4
sub x9, x9, x4
add x4, x0, x4 /* output address of final block */
ld1 {v1.16b}, [x1] /* load final block */
ld1 {v2.16b}, [x8]
ld1 {v3.16b}, [x9]
tbl v2.16b, {v0.16b}, v2.16b
tbx v0.16b, {v1.16b}, v3.16b
st1 {v2.16b}, [x4] /* overlapping stores */
mov w4, wzr
b .Lxtsencctsout
AES_ENDPROC(aes_xts_encrypt)
AES_ENTRY(aes_xts_decrypt)
stp x29, x30, [sp, #-16]!
mov x29, sp
/* subtract 16 bytes if we are doing CTS */
sub w8, w4, #0x10
tst w4, #0xf
csel w4, w4, w8, eq
ld1 {v4.16b}, [x6]
xts_load_mask v8
xts_cts_skip_tw w7, .Lxtsdecskiptw
cbz w7, .Lxtsdecnotfirst
enc_prepare w3, x5, x8
encrypt_block v4, w3, x5, x8, w7 /* first tweak */
.Lxtsdecskiptw:
dec_prepare w3, x2, x8
b .LxtsdecNx
@ -482,7 +545,7 @@ AES_ENTRY(aes_xts_decrypt)
.LxtsdecloopNx:
next_tweak v4, v4, v8
.LxtsdecNx:
subs w4, w4, #4
subs w4, w4, #64
bmi .Lxtsdec1x
ld1 {v0.16b-v3.16b}, [x1], #64 /* get 4 ct blocks */
next_tweak v5, v4, v8
@ -503,22 +566,52 @@ AES_ENTRY(aes_xts_decrypt)
xts_reload_mask v8
b .LxtsdecloopNx
.Lxtsdec1x:
adds w4, w4, #4
adds w4, w4, #64
beq .Lxtsdecout
subs w4, w4, #16
.Lxtsdecloop:
ld1 {v1.16b}, [x1], #16
eor v0.16b, v1.16b, v4.16b
ld1 {v0.16b}, [x1], #16
bmi .Lxtsdeccts
.Lxtsdecctsout:
eor v0.16b, v0.16b, v4.16b
decrypt_block v0, w3, x2, x8, w7
eor v0.16b, v0.16b, v4.16b
st1 {v0.16b}, [x0], #16
subs w4, w4, #1
beq .Lxtsdecout
cbz w4, .Lxtsdecout
subs w4, w4, #16
next_tweak v4, v4, v8
b .Lxtsdecloop
.Lxtsdecout:
st1 {v4.16b}, [x6]
ldp x29, x30, [sp], #16
ret
.Lxtsdeccts:
adr_l x8, .Lcts_permute_table
add x1, x1, w4, sxtw /* rewind input pointer */
add w4, w4, #16 /* # bytes in final block */
add x9, x8, #32
add x8, x8, x4
sub x9, x9, x4
add x4, x0, x4 /* output address of final block */
next_tweak v5, v4, v8
ld1 {v1.16b}, [x1] /* load final block */
ld1 {v2.16b}, [x8]
ld1 {v3.16b}, [x9]
eor v0.16b, v0.16b, v5.16b
decrypt_block v0, w3, x2, x8, w7
eor v0.16b, v0.16b, v5.16b
tbl v2.16b, {v0.16b}, v2.16b
tbx v0.16b, {v1.16b}, v3.16b
st1 {v2.16b}, [x4] /* overlapping stores */
mov w4, wzr
b .Lxtsdecctsout
AES_ENDPROC(aes_xts_decrypt)
/*

View File

@ -19,6 +19,11 @@
xts_load_mask \tmp
.endm
/* special case for the neon-bs driver calling into this one for CTS */
.macro xts_cts_skip_tw, reg, lbl
tbnz \reg, #1, \lbl
.endm
/* multiply by polynomial 'x' in GF(2^8) */
.macro mul_by_x, out, in, temp, const
sshr \temp, \in, #7
@ -49,7 +54,7 @@
/* do preload for encryption */
.macro enc_prepare, ignore0, ignore1, temp
prepare .LForward_Sbox, .LForward_ShiftRows, \temp
prepare crypto_aes_sbox, .LForward_ShiftRows, \temp
.endm
.macro enc_switch_key, ignore0, ignore1, temp
@ -58,7 +63,7 @@
/* do preload for decryption */
.macro dec_prepare, ignore0, ignore1, temp
prepare .LReverse_Sbox, .LReverse_ShiftRows, \temp
prepare crypto_aes_inv_sbox, .LReverse_ShiftRows, \temp
.endm
/* apply SubBytes transformation using the the preloaded Sbox */
@ -234,75 +239,7 @@
#include "aes-modes.S"
.section ".rodata", "a"
.align 6
.LForward_Sbox:
.byte 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5
.byte 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76
.byte 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0
.byte 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0
.byte 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc
.byte 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15
.byte 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a
.byte 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75
.byte 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0
.byte 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84
.byte 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b
.byte 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf
.byte 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85
.byte 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8
.byte 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5
.byte 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2
.byte 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17
.byte 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73
.byte 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88
.byte 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb
.byte 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c
.byte 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79
.byte 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9
.byte 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08
.byte 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6
.byte 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a
.byte 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e
.byte 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e
.byte 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94
.byte 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf
.byte 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68
.byte 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
.LReverse_Sbox:
.byte 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
.byte 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
.byte 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
.byte 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
.byte 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
.byte 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
.byte 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
.byte 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
.byte 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
.byte 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
.byte 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
.byte 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
.byte 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
.byte 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
.byte 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
.byte 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
.byte 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
.byte 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
.byte 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
.byte 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
.byte 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
.byte 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
.byte 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
.byte 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
.byte 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
.byte 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
.byte 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
.byte 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
.byte 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
.byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
.byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
.byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
.align 4
.LForward_ShiftRows:
.octa 0x0b06010c07020d08030e09040f0a0500

View File

@ -730,11 +730,6 @@ ENDPROC(aesbs_cbc_decrypt)
eor \out\().16b, \out\().16b, \tmp\().16b
.endm
.align 4
.Lxts_mul_x:
CPU_LE( .quad 1, 0x87 )
CPU_BE( .quad 0x87, 1 )
/*
* aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[])
@ -806,7 +801,9 @@ ENDPROC(__xts_crypt8)
mov x23, x4
mov x24, x5
0: ldr q30, .Lxts_mul_x
0: movi v30.2s, #0x1
movi v25.2s, #0x87
uzp1 v30.4s, v30.4s, v25.4s
ld1 {v25.16b}, [x24]
99: adr x7, \do8

View File

@ -8,13 +8,13 @@
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/aes.h>
#include <crypto/ctr.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <crypto/xts.h>
#include <linux/module.h>
#include "aes-ctr-fallback.h"
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@ -46,6 +46,12 @@ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks);
asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void neon_aes_xts_encrypt(u8 out[], u8 const in[],
u32 const rk1[], int rounds, int bytes,
u32 const rk2[], u8 iv[], int first);
asmlinkage void neon_aes_xts_decrypt(u8 out[], u8 const in[],
u32 const rk1[], int rounds, int bytes,
u32 const rk2[], u8 iv[], int first);
struct aesbs_ctx {
u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32];
@ -65,6 +71,7 @@ struct aesbs_ctr_ctx {
struct aesbs_xts_ctx {
struct aesbs_ctx key;
u32 twkey[AES_MAX_KEYLENGTH_U32];
struct crypto_aes_ctx cts;
};
static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
@ -74,7 +81,7 @@ static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;
err = crypto_aes_expand_key(&rk, in_key, key_len);
err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;
@ -133,7 +140,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;
err = crypto_aes_expand_key(&rk, in_key, key_len);
err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;
@ -205,7 +212,7 @@ static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = crypto_aes_expand_key(&ctx->fallback, in_key, key_len);
err = aes_expandkey(&ctx->fallback, in_key, key_len);
if (err)
return err;
@ -271,7 +278,11 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
return err;
key_len /= 2;
err = crypto_aes_expand_key(&rk, in_key + key_len, key_len);
err = aes_expandkey(&ctx->cts, in_key, key_len);
if (err)
return err;
err = aes_expandkey(&rk, in_key + key_len, key_len);
if (err)
return err;
@ -280,59 +291,142 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
return aesbs_setkey(tfm, in_key, key_len);
}
static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
unsigned long flags;
/*
* Temporarily disable interrupts to avoid races where
* cachelines are evicted when the CPU is interrupted
* to do something else.
*/
local_irq_save(flags);
aes_encrypt(&ctx->fallback, dst, src);
local_irq_restore(flags);
}
static int ctr_encrypt_sync(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
if (!crypto_simd_usable())
return aes_ctr_encrypt_fallback(&ctx->fallback, req);
return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
return ctr_encrypt(req);
}
static int __xts_crypt(struct skcipher_request *req,
static int __xts_crypt(struct skcipher_request *req, bool encrypt,
void (*fn)(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]))
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int tail = req->cryptlen % (8 * AES_BLOCK_SIZE);
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct scatterlist *src, *dst;
struct skcipher_walk walk;
int err;
int nbytes, err;
int first = 1;
u8 *out, *in;
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
/* ensure that the cts tail is covered by a single step */
if (unlikely(tail > 0 && tail < AES_BLOCK_SIZE)) {
int xts_blocks = DIV_ROUND_UP(req->cryptlen,
AES_BLOCK_SIZE) - 2;
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
xts_blocks * AES_BLOCK_SIZE,
req->iv);
req = &subreq;
} else {
tail = 0;
}
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
kernel_neon_begin();
neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, ctx->key.rounds, 1);
kernel_neon_end();
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
if (walk.nbytes < walk.total)
if (walk.nbytes < walk.total || walk.nbytes % AES_BLOCK_SIZE)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
out = walk.dst.virt.addr;
in = walk.src.virt.addr;
nbytes = walk.nbytes;
kernel_neon_begin();
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk,
ctx->key.rounds, blocks, walk.iv);
if (likely(blocks > 6)) { /* plain NEON is faster otherwise */
if (first)
neon_aes_ecb_encrypt(walk.iv, walk.iv,
ctx->twkey,
ctx->key.rounds, 1);
first = 0;
fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
out += blocks * AES_BLOCK_SIZE;
in += blocks * AES_BLOCK_SIZE;
nbytes -= blocks * AES_BLOCK_SIZE;
}
if (walk.nbytes == walk.total && nbytes > 0)
goto xts_tail;
kernel_neon_end();
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
skcipher_walk_done(&walk, nbytes);
}
return err;
if (err || likely(!tail))
return err;
/* handle ciphertext stealing */
dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen);
skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail,
req->iv);
err = skcipher_walk_virt(&walk, req, false);
if (err)
return err;
out = walk.dst.virt.addr;
in = walk.src.virt.addr;
nbytes = walk.nbytes;
kernel_neon_begin();
xts_tail:
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2);
kernel_neon_end();
return skcipher_walk_done(&walk, 0);
}
static int xts_encrypt(struct skcipher_request *req)
{
return __xts_crypt(req, aesbs_xts_encrypt);
return __xts_crypt(req, true, aesbs_xts_encrypt);
}
static int xts_decrypt(struct skcipher_request *req)
{
return __xts_crypt(req, aesbs_xts_decrypt);
return __xts_crypt(req, false, aesbs_xts_decrypt);
}
static struct skcipher_alg aes_algs[] = { {

View File

@ -70,8 +70,6 @@ asmlinkage void pmull_gcm_decrypt(int blocks, u64 dg[], u8 dst[],
asmlinkage void pmull_gcm_encrypt_block(u8 dst[], u8 const src[],
u32 const rk[], int rounds);
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
static int ghash_init(struct shash_desc *desc)
{
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
@ -309,14 +307,13 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *inkey,
u8 key[GHASH_BLOCK_SIZE];
int ret;
ret = crypto_aes_expand_key(&ctx->aes_key, inkey, keylen);
ret = aes_expandkey(&ctx->aes_key, inkey, keylen);
if (ret) {
tfm->base.crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
__aes_arm64_encrypt(ctx->aes_key.key_enc, key, (u8[AES_BLOCK_SIZE]){},
num_rounds(&ctx->aes_key));
aes_encrypt(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});
return __ghash_setkey(&ctx->ghash_key, key, sizeof(be128));
}
@ -467,7 +464,7 @@ static int gcm_encrypt(struct aead_request *req)
rk = ctx->aes_key.key_enc;
} while (walk.nbytes >= 2 * AES_BLOCK_SIZE);
} else {
__aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, nrounds);
aes_encrypt(&ctx->aes_key, tag, iv);
put_unaligned_be32(2, iv + GCM_IV_SIZE);
while (walk.nbytes >= (2 * AES_BLOCK_SIZE)) {
@ -478,8 +475,7 @@ static int gcm_encrypt(struct aead_request *req)
int remaining = blocks;
do {
__aes_arm64_encrypt(ctx->aes_key.key_enc,
ks, iv, nrounds);
aes_encrypt(&ctx->aes_key, ks, iv);
crypto_xor_cpy(dst, src, ks, AES_BLOCK_SIZE);
crypto_inc(iv, AES_BLOCK_SIZE);
@ -495,13 +491,10 @@ static int gcm_encrypt(struct aead_request *req)
walk.nbytes % (2 * AES_BLOCK_SIZE));
}
if (walk.nbytes) {
__aes_arm64_encrypt(ctx->aes_key.key_enc, ks, iv,
nrounds);
aes_encrypt(&ctx->aes_key, ks, iv);
if (walk.nbytes > AES_BLOCK_SIZE) {
crypto_inc(iv, AES_BLOCK_SIZE);
__aes_arm64_encrypt(ctx->aes_key.key_enc,
ks + AES_BLOCK_SIZE, iv,
nrounds);
aes_encrypt(&ctx->aes_key, ks + AES_BLOCK_SIZE, iv);
}
}
}
@ -605,7 +598,7 @@ static int gcm_decrypt(struct aead_request *req)
rk = ctx->aes_key.key_enc;
} while (walk.nbytes >= 2 * AES_BLOCK_SIZE);
} else {
__aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, nrounds);
aes_encrypt(&ctx->aes_key, tag, iv);
put_unaligned_be32(2, iv + GCM_IV_SIZE);
while (walk.nbytes >= (2 * AES_BLOCK_SIZE)) {
@ -618,8 +611,7 @@ static int gcm_decrypt(struct aead_request *req)
pmull_ghash_update_p64);
do {
__aes_arm64_encrypt(ctx->aes_key.key_enc,
buf, iv, nrounds);
aes_encrypt(&ctx->aes_key, buf, iv);
crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE);
crypto_inc(iv, AES_BLOCK_SIZE);
@ -637,11 +629,9 @@ static int gcm_decrypt(struct aead_request *req)
memcpy(iv2, iv, AES_BLOCK_SIZE);
crypto_inc(iv2, AES_BLOCK_SIZE);
__aes_arm64_encrypt(ctx->aes_key.key_enc, iv2,
iv2, nrounds);
aes_encrypt(&ctx->aes_key, iv2, iv2);
}
__aes_arm64_encrypt(ctx->aes_key.key_enc, iv, iv,
nrounds);
aes_encrypt(&ctx->aes_key, iv, iv);
}
}

View File

@ -30,15 +30,15 @@ EXPORT_SYMBOL(sha256_block_data_order);
asmlinkage void sha256_block_neon(u32 *digest, const void *data,
unsigned int num_blks);
static int sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
static int crypto_sha256_arm64_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order);
}
static int sha256_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
static int crypto_sha256_arm64_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (len)
sha256_base_do_update(desc, data, len,
@ -49,17 +49,17 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data,
return sha256_base_finish(desc, out);
}
static int sha256_final(struct shash_desc *desc, u8 *out)
static int crypto_sha256_arm64_final(struct shash_desc *desc, u8 *out)
{
return sha256_finup(desc, NULL, 0, out);
return crypto_sha256_arm64_finup(desc, NULL, 0, out);
}
static struct shash_alg algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = sha256_update,
.final = sha256_final,
.finup = sha256_finup,
.update = crypto_sha256_arm64_update,
.final = crypto_sha256_arm64_final,
.finup = crypto_sha256_arm64_finup,
.descsize = sizeof(struct sha256_state),
.base.cra_name = "sha256",
.base.cra_driver_name = "sha256-arm64",
@ -69,9 +69,9 @@ static struct shash_alg algs[] = { {
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = sha256_update,
.final = sha256_final,
.finup = sha256_finup,
.update = crypto_sha256_arm64_update,
.final = crypto_sha256_arm64_final,
.finup = crypto_sha256_arm64_finup,
.descsize = sizeof(struct sha256_state),
.base.cra_name = "sha224",
.base.cra_driver_name = "sha224-arm64",

View File

@ -11,4 +11,3 @@ generic-y += mcs_spinlock.h
generic-y += preempt.h
generic-y += vtime.h
generic-y += msi.h
generic-y += simd.h

View File

@ -108,7 +108,7 @@ static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
return 0;
}
static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
@ -119,7 +119,7 @@ static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
cpacf_km(sctx->fc, &sctx->key, out, in, AES_BLOCK_SIZE);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
@ -172,8 +172,8 @@ static struct crypto_alg aes_alg = {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
.cia_setkey = aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt,
.cia_encrypt = crypto_aes_encrypt,
.cia_decrypt = crypto_aes_decrypt,
}
}
};
@ -512,7 +512,7 @@ static int xts_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned long fc;
int err;
err = xts_check_key(tfm, in_key, key_len);
err = xts_fallback_setkey(tfm, in_key, key_len);
if (err)
return err;
@ -529,7 +529,7 @@ static int xts_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
/* Check if the function code is available */
xts_ctx->fc = (fc && cpacf_test_func(&km_functions, fc)) ? fc : 0;
if (!xts_ctx->fc)
return xts_fallback_setkey(tfm, in_key, key_len);
return 0;
/* Split the XTS key into the two subkeys */
key_len = key_len / 2;
@ -589,7 +589,7 @@ static int xts_aes_encrypt(struct blkcipher_desc *desc,
if (!nbytes)
return -EINVAL;
if (unlikely(!xts_ctx->fc))
if (unlikely(!xts_ctx->fc || (nbytes % XTS_BLOCK_SIZE) != 0))
return xts_fallback_encrypt(desc, dst, src, nbytes);
blkcipher_walk_init(&walk, dst, src, nbytes);
@ -606,7 +606,7 @@ static int xts_aes_decrypt(struct blkcipher_desc *desc,
if (!nbytes)
return -EINVAL;
if (unlikely(!xts_ctx->fc))
if (unlikely(!xts_ctx->fc || (nbytes % XTS_BLOCK_SIZE) != 0))
return xts_fallback_decrypt(desc, dst, src, nbytes);
blkcipher_walk_init(&walk, dst, src, nbytes);

View File

@ -16,7 +16,7 @@
#include <linux/fips.h>
#include <linux/mutex.h>
#include <crypto/algapi.h>
#include <crypto/des.h>
#include <crypto/internal/des.h>
#include <asm/cpacf.h>
#define DES3_KEY_SIZE (3 * DES_KEY_SIZE)
@ -35,27 +35,24 @@ static int des_setkey(struct crypto_tfm *tfm, const u8 *key,
unsigned int key_len)
{
struct s390_des_ctx *ctx = crypto_tfm_ctx(tfm);
u32 tmp[DES_EXPKEY_WORDS];
int err;
/* check for weak keys */
if (!des_ekey(tmp, key) &&
(tfm->crt_flags & CRYPTO_TFM_REQ_FORBID_WEAK_KEYS)) {
tfm->crt_flags |= CRYPTO_TFM_RES_WEAK_KEY;
return -EINVAL;
}
err = crypto_des_verify_key(tfm, key);
if (err)
return err;
memcpy(ctx->key, key, key_len);
return 0;
}
static void des_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void s390_des_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct s390_des_ctx *ctx = crypto_tfm_ctx(tfm);
cpacf_km(CPACF_KM_DEA, ctx->key, out, in, DES_BLOCK_SIZE);
}
static void des_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void s390_des_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct s390_des_ctx *ctx = crypto_tfm_ctx(tfm);
@ -76,8 +73,8 @@ static struct crypto_alg des_alg = {
.cia_min_keysize = DES_KEY_SIZE,
.cia_max_keysize = DES_KEY_SIZE,
.cia_setkey = des_setkey,
.cia_encrypt = des_encrypt,
.cia_decrypt = des_decrypt,
.cia_encrypt = s390_des_encrypt,
.cia_decrypt = s390_des_decrypt,
}
}
};
@ -227,8 +224,8 @@ static int des3_setkey(struct crypto_tfm *tfm, const u8 *key,
struct s390_des_ctx *ctx = crypto_tfm_ctx(tfm);
int err;
err = __des3_verify_key(&tfm->crt_flags, key);
if (unlikely(err))
err = crypto_des3_ede_verify_key(tfm, key);
if (err)
return err;
memcpy(ctx->key, key, key_len);

View File

@ -153,4 +153,4 @@ module_exit(ghash_mod_exit);
MODULE_ALIAS_CRYPTO("ghash");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("GHASH Message Digest Algorithm, s390 implementation");
MODULE_DESCRIPTION("GHASH hash function, s390 implementation");

View File

@ -17,7 +17,7 @@
#include "sha.h"
static int sha256_init(struct shash_desc *desc)
static int s390_sha256_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
@ -60,7 +60,7 @@ static int sha256_import(struct shash_desc *desc, const void *in)
static struct shash_alg sha256_alg = {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_init,
.init = s390_sha256_init,
.update = s390_sha_update,
.final = s390_sha_final,
.export = sha256_export,
@ -76,7 +76,7 @@ static struct shash_alg sha256_alg = {
}
};
static int sha224_init(struct shash_desc *desc)
static int s390_sha224_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
@ -96,7 +96,7 @@ static int sha224_init(struct shash_desc *desc)
static struct shash_alg sha224_alg = {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_init,
.init = s390_sha224_init,
.update = s390_sha_update,
.final = s390_sha_final,
.export = sha256_export,

View File

@ -7,9 +7,11 @@ purgatory-y := head.o purgatory.o string.o sha256.o mem.o
targets += $(purgatory-y) purgatory.lds purgatory purgatory.ro
PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
$(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE
$(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
$(call if_changed_rule,cc_o_c)
CFLAGS_sha256.o := -D__DISABLE_EXPORTS
$(obj)/mem.o: $(srctree)/arch/s390/lib/mem.S FORCE
$(call if_changed_rule,as_o_S)

View File

@ -8,8 +8,8 @@
*/
#include <linux/kexec.h>
#include <linux/sha256.h>
#include <linux/string.h>
#include <crypto/sha.h>
#include <asm/purgatory.h>
int verify_sha256_digest(void)

View File

@ -197,14 +197,14 @@ static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
return 0;
}
static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct crypto_sparc64_aes_ctx *ctx = crypto_tfm_ctx(tfm);
ctx->ops->encrypt(&ctx->key[0], (const u32 *) src, (u32 *) dst);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct crypto_sparc64_aes_ctx *ctx = crypto_tfm_ctx(tfm);
@ -396,8 +396,8 @@ static struct crypto_alg algs[] = { {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
.cia_setkey = aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt
.cia_encrypt = crypto_aes_encrypt,
.cia_decrypt = crypto_aes_decrypt
}
}
}, {

View File

@ -12,7 +12,7 @@
#include <linux/mm.h>
#include <linux/types.h>
#include <crypto/algapi.h>
#include <crypto/des.h>
#include <crypto/internal/des.h>
#include <asm/fpumacro.h>
#include <asm/pstate.h>
@ -45,19 +45,15 @@ static int des_set_key(struct crypto_tfm *tfm, const u8 *key,
unsigned int keylen)
{
struct des_sparc64_ctx *dctx = crypto_tfm_ctx(tfm);
u32 *flags = &tfm->crt_flags;
u32 tmp[DES_EXPKEY_WORDS];
int ret;
int err;
/* Even though we have special instructions for key expansion,
* we call des_ekey() so that we don't have to write our own
* we call des_verify_key() so that we don't have to write our own
* weak key detection code.
*/
ret = des_ekey(tmp, key);
if (unlikely(ret == 0) && (*flags & CRYPTO_TFM_REQ_FORBID_WEAK_KEYS)) {
*flags |= CRYPTO_TFM_RES_WEAK_KEY;
return -EINVAL;
}
err = crypto_des_verify_key(tfm, key);
if (err)
return err;
des_sparc64_key_expand((const u32 *) key, &dctx->encrypt_expkey[0]);
encrypt_to_decrypt(&dctx->decrypt_expkey[0], &dctx->encrypt_expkey[0]);
@ -68,7 +64,7 @@ static int des_set_key(struct crypto_tfm *tfm, const u8 *key,
extern void des_sparc64_crypt(const u64 *key, const u64 *input,
u64 *output);
static void des_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void sparc_des_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct des_sparc64_ctx *ctx = crypto_tfm_ctx(tfm);
const u64 *K = ctx->encrypt_expkey;
@ -76,7 +72,7 @@ static void des_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
des_sparc64_crypt(K, (const u64 *) src, (u64 *) dst);
}
static void des_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void sparc_des_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct des_sparc64_ctx *ctx = crypto_tfm_ctx(tfm);
const u64 *K = ctx->decrypt_expkey;
@ -202,14 +198,13 @@ static int des3_ede_set_key(struct crypto_tfm *tfm, const u8 *key,
unsigned int keylen)
{
struct des3_ede_sparc64_ctx *dctx = crypto_tfm_ctx(tfm);
u32 *flags = &tfm->crt_flags;
u64 k1[DES_EXPKEY_WORDS / 2];
u64 k2[DES_EXPKEY_WORDS / 2];
u64 k3[DES_EXPKEY_WORDS / 2];
int err;
err = __des3_verify_key(flags, key);
if (unlikely(err))
err = crypto_des3_ede_verify_key(tfm, key);
if (err)
return err;
des_sparc64_key_expand((const u32 *)key, k1);
@ -235,7 +230,7 @@ static int des3_ede_set_key(struct crypto_tfm *tfm, const u8 *key,
extern void des3_ede_sparc64_crypt(const u64 *key, const u64 *input,
u64 *output);
static void des3_ede_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void sparc_des3_ede_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct des3_ede_sparc64_ctx *ctx = crypto_tfm_ctx(tfm);
const u64 *K = ctx->encrypt_expkey;
@ -243,7 +238,7 @@ static void des3_ede_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
des3_ede_sparc64_crypt(K, (const u64 *) src, (u64 *) dst);
}
static void des3_ede_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void sparc_des3_ede_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct des3_ede_sparc64_ctx *ctx = crypto_tfm_ctx(tfm);
const u64 *K = ctx->decrypt_expkey;
@ -390,8 +385,8 @@ static struct crypto_alg algs[] = { {
.cia_min_keysize = DES_KEY_SIZE,
.cia_max_keysize = DES_KEY_SIZE,
.cia_setkey = des_set_key,
.cia_encrypt = des_encrypt,
.cia_decrypt = des_decrypt
.cia_encrypt = sparc_des_encrypt,
.cia_decrypt = sparc_des_decrypt
}
}
}, {
@ -447,8 +442,8 @@ static struct crypto_alg algs[] = { {
.cia_min_keysize = DES3_EDE_KEY_SIZE,
.cia_max_keysize = DES3_EDE_KEY_SIZE,
.cia_setkey = des3_ede_set_key,
.cia_encrypt = des3_ede_encrypt,
.cia_decrypt = des3_ede_decrypt
.cia_encrypt = sparc_des3_ede_encrypt,
.cia_decrypt = sparc_des3_ede_decrypt
}
}
}, {

View File

@ -14,11 +14,9 @@ sha256_ni_supported :=$(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,yes,no)
obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o
obj-$(CONFIG_CRYPTO_AES_586) += aes-i586.o
obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_586) += serpent-sse2-i586.o
obj-$(CONFIG_CRYPTO_AES_X86_64) += aes-x86_64.o
obj-$(CONFIG_CRYPTO_DES3_EDE_X86_64) += des3_ede-x86_64.o
obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o
obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
@ -38,14 +36,6 @@ obj-$(CONFIG_CRYPTO_CRCT10DIF_PCLMUL) += crct10dif-pclmul.o
obj-$(CONFIG_CRYPTO_POLY1305_X86_64) += poly1305-x86_64.o
obj-$(CONFIG_CRYPTO_AEGIS128_AESNI_SSE2) += aegis128-aesni.o
obj-$(CONFIG_CRYPTO_AEGIS128L_AESNI_SSE2) += aegis128l-aesni.o
obj-$(CONFIG_CRYPTO_AEGIS256_AESNI_SSE2) += aegis256-aesni.o
obj-$(CONFIG_CRYPTO_MORUS640_GLUE) += morus640_glue.o
obj-$(CONFIG_CRYPTO_MORUS1280_GLUE) += morus1280_glue.o
obj-$(CONFIG_CRYPTO_MORUS640_SSE2) += morus640-sse2.o
obj-$(CONFIG_CRYPTO_MORUS1280_SSE2) += morus1280-sse2.o
obj-$(CONFIG_CRYPTO_NHPOLY1305_SSE2) += nhpoly1305-sse2.o
obj-$(CONFIG_CRYPTO_NHPOLY1305_AVX2) += nhpoly1305-avx2.o
@ -64,15 +54,11 @@ endif
ifeq ($(avx2_supported),yes)
obj-$(CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64) += camellia-aesni-avx2.o
obj-$(CONFIG_CRYPTO_SERPENT_AVX2_X86_64) += serpent-avx2.o
obj-$(CONFIG_CRYPTO_MORUS1280_AVX2) += morus1280-avx2.o
endif
aes-i586-y := aes-i586-asm_32.o aes_glue.o
twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
serpent-sse2-i586-y := serpent-sse2-i586-asm_32.o serpent_sse2_glue.o
aes-x86_64-y := aes-x86_64-asm_64.o aes_glue.o
des3_ede-x86_64-y := des3_ede-asm_64.o des3_ede_glue.o
camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o
blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
@ -82,11 +68,6 @@ chacha-x86_64-y := chacha-ssse3-x86_64.o chacha_glue.o
serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o
aegis128-aesni-y := aegis128-aesni-asm.o aegis128-aesni-glue.o
aegis128l-aesni-y := aegis128l-aesni-asm.o aegis128l-aesni-glue.o
aegis256-aesni-y := aegis256-aesni-asm.o aegis256-aesni-glue.o
morus640-sse2-y := morus640-sse2-asm.o morus640-sse2-glue.o
morus1280-sse2-y := morus1280-sse2-asm.o morus1280-sse2-glue.o
nhpoly1305-sse2-y := nh-sse2-x86_64.o nhpoly1305-sse2-glue.o
@ -106,8 +87,6 @@ ifeq ($(avx2_supported),yes)
chacha-x86_64-y += chacha-avx2-x86_64.o
serpent-avx2-y := serpent-avx2-asm_64.o serpent_avx2_glue.o
morus1280-avx2-y := morus1280-avx2-asm.o morus1280-avx2-glue.o
nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
endif

View File

@ -1,823 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* AES-NI + SSE2 implementation of AEGIS-128L
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <linux/linkage.h>
#include <asm/frame.h>
#define STATE0 %xmm0
#define STATE1 %xmm1
#define STATE2 %xmm2
#define STATE3 %xmm3
#define STATE4 %xmm4
#define STATE5 %xmm5
#define STATE6 %xmm6
#define STATE7 %xmm7
#define MSG0 %xmm8
#define MSG1 %xmm9
#define T0 %xmm10
#define T1 %xmm11
#define T2 %xmm12
#define T3 %xmm13
#define STATEP %rdi
#define LEN %rsi
#define SRC %rdx
#define DST %rcx
.section .rodata.cst16.aegis128l_const, "aM", @progbits, 32
.align 16
.Laegis128l_const_0:
.byte 0x00, 0x01, 0x01, 0x02, 0x03, 0x05, 0x08, 0x0d
.byte 0x15, 0x22, 0x37, 0x59, 0x90, 0xe9, 0x79, 0x62
.Laegis128l_const_1:
.byte 0xdb, 0x3d, 0x18, 0x55, 0x6d, 0xc2, 0x2f, 0xf1
.byte 0x20, 0x11, 0x31, 0x42, 0x73, 0xb5, 0x28, 0xdd
.section .rodata.cst16.aegis128l_counter, "aM", @progbits, 16
.align 16
.Laegis128l_counter0:
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.Laegis128l_counter1:
.byte 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17
.byte 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
.text
/*
* __load_partial: internal ABI
* input:
* LEN - bytes
* SRC - src
* output:
* MSG0 - first message block
* MSG1 - second message block
* changed:
* T0
* %r8
* %r9
*/
__load_partial:
xor %r9d, %r9d
pxor MSG0, MSG0
pxor MSG1, MSG1
mov LEN, %r8
and $0x1, %r8
jz .Lld_partial_1
mov LEN, %r8
and $0x1E, %r8
add SRC, %r8
mov (%r8), %r9b
.Lld_partial_1:
mov LEN, %r8
and $0x2, %r8
jz .Lld_partial_2
mov LEN, %r8
and $0x1C, %r8
add SRC, %r8
shl $0x10, %r9
mov (%r8), %r9w
.Lld_partial_2:
mov LEN, %r8
and $0x4, %r8
jz .Lld_partial_4
mov LEN, %r8
and $0x18, %r8
add SRC, %r8
shl $32, %r9
mov (%r8), %r8d
xor %r8, %r9
.Lld_partial_4:
movq %r9, MSG0
mov LEN, %r8
and $0x8, %r8
jz .Lld_partial_8
mov LEN, %r8
and $0x10, %r8
add SRC, %r8
pslldq $8, MSG0
movq (%r8), T0
pxor T0, MSG0
.Lld_partial_8:
mov LEN, %r8
and $0x10, %r8
jz .Lld_partial_16
movdqa MSG0, MSG1
movdqu (SRC), MSG0
.Lld_partial_16:
ret
ENDPROC(__load_partial)
/*
* __store_partial: internal ABI
* input:
* LEN - bytes
* DST - dst
* output:
* T0 - first message block
* T1 - second message block
* changed:
* %r8
* %r9
* %r10
*/
__store_partial:
mov LEN, %r8
mov DST, %r9
cmp $16, %r8
jl .Lst_partial_16
movdqu T0, (%r9)
movdqa T1, T0
sub $16, %r8
add $16, %r9
.Lst_partial_16:
movq T0, %r10
cmp $8, %r8
jl .Lst_partial_8
mov %r10, (%r9)
psrldq $8, T0
movq T0, %r10
sub $8, %r8
add $8, %r9
.Lst_partial_8:
cmp $4, %r8
jl .Lst_partial_4
mov %r10d, (%r9)
shr $32, %r10
sub $4, %r8
add $4, %r9
.Lst_partial_4:
cmp $2, %r8
jl .Lst_partial_2
mov %r10w, (%r9)
shr $0x10, %r10
sub $2, %r8
add $2, %r9
.Lst_partial_2:
cmp $1, %r8
jl .Lst_partial_1
mov %r10b, (%r9)
.Lst_partial_1:
ret
ENDPROC(__store_partial)
.macro update
movdqa STATE7, T0
aesenc STATE0, STATE7
aesenc STATE1, STATE0
aesenc STATE2, STATE1
aesenc STATE3, STATE2
aesenc STATE4, STATE3
aesenc STATE5, STATE4
aesenc STATE6, STATE5
aesenc T0, STATE6
.endm
.macro update0
update
pxor MSG0, STATE7
pxor MSG1, STATE3
.endm
.macro update1
update
pxor MSG0, STATE6
pxor MSG1, STATE2
.endm
.macro update2
update
pxor MSG0, STATE5
pxor MSG1, STATE1
.endm
.macro update3
update
pxor MSG0, STATE4
pxor MSG1, STATE0
.endm
.macro update4
update
pxor MSG0, STATE3
pxor MSG1, STATE7
.endm
.macro update5
update
pxor MSG0, STATE2
pxor MSG1, STATE6
.endm
.macro update6
update
pxor MSG0, STATE1
pxor MSG1, STATE5
.endm
.macro update7
update
pxor MSG0, STATE0
pxor MSG1, STATE4
.endm
.macro state_load
movdqu 0x00(STATEP), STATE0
movdqu 0x10(STATEP), STATE1
movdqu 0x20(STATEP), STATE2
movdqu 0x30(STATEP), STATE3
movdqu 0x40(STATEP), STATE4
movdqu 0x50(STATEP), STATE5
movdqu 0x60(STATEP), STATE6
movdqu 0x70(STATEP), STATE7
.endm
.macro state_store s0 s1 s2 s3 s4 s5 s6 s7
movdqu \s7, 0x00(STATEP)
movdqu \s0, 0x10(STATEP)
movdqu \s1, 0x20(STATEP)
movdqu \s2, 0x30(STATEP)
movdqu \s3, 0x40(STATEP)
movdqu \s4, 0x50(STATEP)
movdqu \s5, 0x60(STATEP)
movdqu \s6, 0x70(STATEP)
.endm
.macro state_store0
state_store STATE0 STATE1 STATE2 STATE3 STATE4 STATE5 STATE6 STATE7
.endm
.macro state_store1
state_store STATE7 STATE0 STATE1 STATE2 STATE3 STATE4 STATE5 STATE6
.endm
.macro state_store2
state_store STATE6 STATE7 STATE0 STATE1 STATE2 STATE3 STATE4 STATE5
.endm
.macro state_store3
state_store STATE5 STATE6 STATE7 STATE0 STATE1 STATE2 STATE3 STATE4
.endm
.macro state_store4
state_store STATE4 STATE5 STATE6 STATE7 STATE0 STATE1 STATE2 STATE3
.endm
.macro state_store5
state_store STATE3 STATE4 STATE5 STATE6 STATE7 STATE0 STATE1 STATE2
.endm
.macro state_store6
state_store STATE2 STATE3 STATE4 STATE5 STATE6 STATE7 STATE0 STATE1
.endm
.macro state_store7
state_store STATE1 STATE2 STATE3 STATE4 STATE5 STATE6 STATE7 STATE0
.endm
/*
* void crypto_aegis128l_aesni_init(void *state, const void *key, const void *iv);
*/
ENTRY(crypto_aegis128l_aesni_init)
FRAME_BEGIN
/* load key: */
movdqa (%rsi), MSG1
movdqa MSG1, STATE0
movdqa MSG1, STATE4
movdqa MSG1, STATE5
movdqa MSG1, STATE6
movdqa MSG1, STATE7
/* load IV: */
movdqu (%rdx), MSG0
pxor MSG0, STATE0
pxor MSG0, STATE4
/* load the constants: */
movdqa .Laegis128l_const_0, STATE2
movdqa .Laegis128l_const_1, STATE1
movdqa STATE1, STATE3
pxor STATE2, STATE5
pxor STATE1, STATE6
pxor STATE2, STATE7
/* update 10 times with IV and KEY: */
update0
update1
update2
update3
update4
update5
update6
update7
update0
update1
state_store1
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_init)
.macro ad_block a i
movdq\a (\i * 0x20 + 0x00)(SRC), MSG0
movdq\a (\i * 0x20 + 0x10)(SRC), MSG1
update\i
sub $0x20, LEN
cmp $0x20, LEN
jl .Lad_out_\i
.endm
/*
* void crypto_aegis128l_aesni_ad(void *state, unsigned int length,
* const void *data);
*/
ENTRY(crypto_aegis128l_aesni_ad)
FRAME_BEGIN
cmp $0x20, LEN
jb .Lad_out
state_load
mov SRC, %r8
and $0xf, %r8
jnz .Lad_u_loop
.align 8
.Lad_a_loop:
ad_block a 0
ad_block a 1
ad_block a 2
ad_block a 3
ad_block a 4
ad_block a 5
ad_block a 6
ad_block a 7
add $0x100, SRC
jmp .Lad_a_loop
.align 8
.Lad_u_loop:
ad_block u 0
ad_block u 1
ad_block u 2
ad_block u 3
ad_block u 4
ad_block u 5
ad_block u 6
ad_block u 7
add $0x100, SRC
jmp .Lad_u_loop
.Lad_out_0:
state_store0
FRAME_END
ret
.Lad_out_1:
state_store1
FRAME_END
ret
.Lad_out_2:
state_store2
FRAME_END
ret
.Lad_out_3:
state_store3
FRAME_END
ret
.Lad_out_4:
state_store4
FRAME_END
ret
.Lad_out_5:
state_store5
FRAME_END
ret
.Lad_out_6:
state_store6
FRAME_END
ret
.Lad_out_7:
state_store7
FRAME_END
ret
.Lad_out:
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_ad)
.macro crypt m0 m1 s0 s1 s2 s3 s4 s5 s6 s7
pxor \s1, \m0
pxor \s6, \m0
movdqa \s2, T3
pand \s3, T3
pxor T3, \m0
pxor \s2, \m1
pxor \s5, \m1
movdqa \s6, T3
pand \s7, T3
pxor T3, \m1
.endm
.macro crypt0 m0 m1
crypt \m0 \m1 STATE0 STATE1 STATE2 STATE3 STATE4 STATE5 STATE6 STATE7
.endm
.macro crypt1 m0 m1
crypt \m0 \m1 STATE7 STATE0 STATE1 STATE2 STATE3 STATE4 STATE5 STATE6
.endm
.macro crypt2 m0 m1
crypt \m0 \m1 STATE6 STATE7 STATE0 STATE1 STATE2 STATE3 STATE4 STATE5
.endm
.macro crypt3 m0 m1
crypt \m0 \m1 STATE5 STATE6 STATE7 STATE0 STATE1 STATE2 STATE3 STATE4
.endm
.macro crypt4 m0 m1
crypt \m0 \m1 STATE4 STATE5 STATE6 STATE7 STATE0 STATE1 STATE2 STATE3
.endm
.macro crypt5 m0 m1
crypt \m0 \m1 STATE3 STATE4 STATE5 STATE6 STATE7 STATE0 STATE1 STATE2
.endm
.macro crypt6 m0 m1
crypt \m0 \m1 STATE2 STATE3 STATE4 STATE5 STATE6 STATE7 STATE0 STATE1
.endm
.macro crypt7 m0 m1
crypt \m0 \m1 STATE1 STATE2 STATE3 STATE4 STATE5 STATE6 STATE7 STATE0
.endm
.macro encrypt_block a i
movdq\a (\i * 0x20 + 0x00)(SRC), MSG0
movdq\a (\i * 0x20 + 0x10)(SRC), MSG1
movdqa MSG0, T0
movdqa MSG1, T1
crypt\i T0, T1
movdq\a T0, (\i * 0x20 + 0x00)(DST)
movdq\a T1, (\i * 0x20 + 0x10)(DST)
update\i
sub $0x20, LEN
cmp $0x20, LEN
jl .Lenc_out_\i
.endm
.macro decrypt_block a i
movdq\a (\i * 0x20 + 0x00)(SRC), MSG0
movdq\a (\i * 0x20 + 0x10)(SRC), MSG1
crypt\i MSG0, MSG1
movdq\a MSG0, (\i * 0x20 + 0x00)(DST)
movdq\a MSG1, (\i * 0x20 + 0x10)(DST)
update\i
sub $0x20, LEN
cmp $0x20, LEN
jl .Ldec_out_\i
.endm
/*
* void crypto_aegis128l_aesni_enc(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis128l_aesni_enc)
FRAME_BEGIN
cmp $0x20, LEN
jb .Lenc_out
state_load
mov SRC, %r8
or DST, %r8
and $0xf, %r8
jnz .Lenc_u_loop
.align 8
.Lenc_a_loop:
encrypt_block a 0
encrypt_block a 1
encrypt_block a 2
encrypt_block a 3
encrypt_block a 4
encrypt_block a 5
encrypt_block a 6
encrypt_block a 7
add $0x100, SRC
add $0x100, DST
jmp .Lenc_a_loop
.align 8
.Lenc_u_loop:
encrypt_block u 0
encrypt_block u 1
encrypt_block u 2
encrypt_block u 3
encrypt_block u 4
encrypt_block u 5
encrypt_block u 6
encrypt_block u 7
add $0x100, SRC
add $0x100, DST
jmp .Lenc_u_loop
.Lenc_out_0:
state_store0
FRAME_END
ret
.Lenc_out_1:
state_store1
FRAME_END
ret
.Lenc_out_2:
state_store2
FRAME_END
ret
.Lenc_out_3:
state_store3
FRAME_END
ret
.Lenc_out_4:
state_store4
FRAME_END
ret
.Lenc_out_5:
state_store5
FRAME_END
ret
.Lenc_out_6:
state_store6
FRAME_END
ret
.Lenc_out_7:
state_store7
FRAME_END
ret
.Lenc_out:
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_enc)
/*
* void crypto_aegis128l_aesni_enc_tail(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis128l_aesni_enc_tail)
FRAME_BEGIN
state_load
/* encrypt message: */
call __load_partial
movdqa MSG0, T0
movdqa MSG1, T1
crypt0 T0, T1
call __store_partial
update0
state_store0
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_enc_tail)
/*
* void crypto_aegis128l_aesni_dec(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis128l_aesni_dec)
FRAME_BEGIN
cmp $0x20, LEN
jb .Ldec_out
state_load
mov SRC, %r8
or DST, %r8
and $0xF, %r8
jnz .Ldec_u_loop
.align 8
.Ldec_a_loop:
decrypt_block a 0
decrypt_block a 1
decrypt_block a 2
decrypt_block a 3
decrypt_block a 4
decrypt_block a 5
decrypt_block a 6
decrypt_block a 7
add $0x100, SRC
add $0x100, DST
jmp .Ldec_a_loop
.align 8
.Ldec_u_loop:
decrypt_block u 0
decrypt_block u 1
decrypt_block u 2
decrypt_block u 3
decrypt_block u 4
decrypt_block u 5
decrypt_block u 6
decrypt_block u 7
add $0x100, SRC
add $0x100, DST
jmp .Ldec_u_loop
.Ldec_out_0:
state_store0
FRAME_END
ret
.Ldec_out_1:
state_store1
FRAME_END
ret
.Ldec_out_2:
state_store2
FRAME_END
ret
.Ldec_out_3:
state_store3
FRAME_END
ret
.Ldec_out_4:
state_store4
FRAME_END
ret
.Ldec_out_5:
state_store5
FRAME_END
ret
.Ldec_out_6:
state_store6
FRAME_END
ret
.Ldec_out_7:
state_store7
FRAME_END
ret
.Ldec_out:
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_dec)
/*
* void crypto_aegis128l_aesni_dec_tail(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis128l_aesni_dec_tail)
FRAME_BEGIN
state_load
/* decrypt message: */
call __load_partial
crypt0 MSG0, MSG1
movdqa MSG0, T0
movdqa MSG1, T1
call __store_partial
/* mask with byte count: */
movq LEN, T0
punpcklbw T0, T0
punpcklbw T0, T0
punpcklbw T0, T0
punpcklbw T0, T0
movdqa T0, T1
movdqa .Laegis128l_counter0, T2
movdqa .Laegis128l_counter1, T3
pcmpgtb T2, T0
pcmpgtb T3, T1
pand T0, MSG0
pand T1, MSG1
update0
state_store0
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_dec_tail)
/*
* void crypto_aegis128l_aesni_final(void *state, void *tag_xor,
* u64 assoclen, u64 cryptlen);
*/
ENTRY(crypto_aegis128l_aesni_final)
FRAME_BEGIN
state_load
/* prepare length block: */
movq %rdx, MSG0
movq %rcx, T0
pslldq $8, T0
pxor T0, MSG0
psllq $3, MSG0 /* multiply by 8 (to get bit count) */
pxor STATE2, MSG0
movdqa MSG0, MSG1
/* update state: */
update0
update1
update2
update3
update4
update5
update6
/* xor tag: */
movdqu (%rsi), T0
pxor STATE1, T0
pxor STATE2, T0
pxor STATE3, T0
pxor STATE4, T0
pxor STATE5, T0
pxor STATE6, T0
pxor STATE7, T0
movdqu T0, (%rsi)
FRAME_END
ret
ENDPROC(crypto_aegis128l_aesni_final)

View File

@ -1,293 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The AEGIS-128L Authenticated-Encryption Algorithm
* Glue for AES-NI + SSE2 implementation
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/module.h>
#include <asm/fpu/api.h>
#include <asm/cpu_device_id.h>
#define AEGIS128L_BLOCK_ALIGN 16
#define AEGIS128L_BLOCK_SIZE 32
#define AEGIS128L_NONCE_SIZE 16
#define AEGIS128L_STATE_BLOCKS 8
#define AEGIS128L_KEY_SIZE 16
#define AEGIS128L_MIN_AUTH_SIZE 8
#define AEGIS128L_MAX_AUTH_SIZE 16
asmlinkage void crypto_aegis128l_aesni_init(void *state, void *key, void *iv);
asmlinkage void crypto_aegis128l_aesni_ad(
void *state, unsigned int length, const void *data);
asmlinkage void crypto_aegis128l_aesni_enc(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis128l_aesni_dec(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis128l_aesni_enc_tail(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis128l_aesni_dec_tail(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis128l_aesni_final(
void *state, void *tag_xor, unsigned int cryptlen,
unsigned int assoclen);
struct aegis_block {
u8 bytes[AEGIS128L_BLOCK_SIZE] __aligned(AEGIS128L_BLOCK_ALIGN);
};
struct aegis_state {
struct aegis_block blocks[AEGIS128L_STATE_BLOCKS];
};
struct aegis_ctx {
struct aegis_block key;
};
struct aegis_crypt_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_blocks)(void *state, unsigned int length, const void *src,
void *dst);
void (*crypt_tail)(void *state, unsigned int length, const void *src,
void *dst);
};
static void crypto_aegis128l_aesni_process_ad(
struct aegis_state *state, struct scatterlist *sg_src,
unsigned int assoclen)
{
struct scatter_walk walk;
struct aegis_block buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= AEGIS128L_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = AEGIS128L_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
crypto_aegis128l_aesni_ad(state,
AEGIS128L_BLOCK_SIZE,
buf.bytes);
pos = 0;
left -= fill;
src += fill;
}
crypto_aegis128l_aesni_ad(state, left, src);
src += left & ~(AEGIS128L_BLOCK_SIZE - 1);
left &= AEGIS128L_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, AEGIS128L_BLOCK_SIZE - pos);
crypto_aegis128l_aesni_ad(state, AEGIS128L_BLOCK_SIZE, buf.bytes);
}
}
static void crypto_aegis128l_aesni_process_crypt(
struct aegis_state *state, struct skcipher_walk *walk,
const struct aegis_crypt_ops *ops)
{
while (walk->nbytes >= AEGIS128L_BLOCK_SIZE) {
ops->crypt_blocks(state, round_down(walk->nbytes,
AEGIS128L_BLOCK_SIZE),
walk->src.virt.addr, walk->dst.virt.addr);
skcipher_walk_done(walk, walk->nbytes % AEGIS128L_BLOCK_SIZE);
}
if (walk->nbytes) {
ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr,
walk->dst.virt.addr);
skcipher_walk_done(walk, 0);
}
}
static struct aegis_ctx *crypto_aegis128l_aesni_ctx(struct crypto_aead *aead)
{
u8 *ctx = crypto_aead_ctx(aead);
ctx = PTR_ALIGN(ctx, __alignof__(struct aegis_ctx));
return (void *)ctx;
}
static int crypto_aegis128l_aesni_setkey(struct crypto_aead *aead,
const u8 *key, unsigned int keylen)
{
struct aegis_ctx *ctx = crypto_aegis128l_aesni_ctx(aead);
if (keylen != AEGIS128L_KEY_SIZE) {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
memcpy(ctx->key.bytes, key, AEGIS128L_KEY_SIZE);
return 0;
}
static int crypto_aegis128l_aesni_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
if (authsize > AEGIS128L_MAX_AUTH_SIZE)
return -EINVAL;
if (authsize < AEGIS128L_MIN_AUTH_SIZE)
return -EINVAL;
return 0;
}
static void crypto_aegis128l_aesni_crypt(struct aead_request *req,
struct aegis_block *tag_xor,
unsigned int cryptlen,
const struct aegis_crypt_ops *ops)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_ctx *ctx = crypto_aegis128l_aesni_ctx(tfm);
struct skcipher_walk walk;
struct aegis_state state;
ops->skcipher_walk_init(&walk, req, true);
kernel_fpu_begin();
crypto_aegis128l_aesni_init(&state, ctx->key.bytes, req->iv);
crypto_aegis128l_aesni_process_ad(&state, req->src, req->assoclen);
crypto_aegis128l_aesni_process_crypt(&state, &walk, ops);
crypto_aegis128l_aesni_final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end();
}
static int crypto_aegis128l_aesni_encrypt(struct aead_request *req)
{
static const struct aegis_crypt_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_blocks = crypto_aegis128l_aesni_enc,
.crypt_tail = crypto_aegis128l_aesni_enc_tail,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_block tag = {};
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_aegis128l_aesni_crypt(req, &tag, cryptlen, &OPS);
scatterwalk_map_and_copy(tag.bytes, req->dst,
req->assoclen + cryptlen, authsize, 1);
return 0;
}
static int crypto_aegis128l_aesni_decrypt(struct aead_request *req)
{
static const struct aegis_block zeros = {};
static const struct aegis_crypt_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_blocks = crypto_aegis128l_aesni_dec,
.crypt_tail = crypto_aegis128l_aesni_dec_tail,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag.bytes, req->src,
req->assoclen + cryptlen, authsize, 0);
crypto_aegis128l_aesni_crypt(req, &tag, cryptlen, &OPS);
return crypto_memneq(tag.bytes, zeros.bytes, authsize) ? -EBADMSG : 0;
}
static int crypto_aegis128l_aesni_init_tfm(struct crypto_aead *aead)
{
return 0;
}
static void crypto_aegis128l_aesni_exit_tfm(struct crypto_aead *aead)
{
}
static struct aead_alg crypto_aegis128l_aesni_alg = {
.setkey = crypto_aegis128l_aesni_setkey,
.setauthsize = crypto_aegis128l_aesni_setauthsize,
.encrypt = crypto_aegis128l_aesni_encrypt,
.decrypt = crypto_aegis128l_aesni_decrypt,
.init = crypto_aegis128l_aesni_init_tfm,
.exit = crypto_aegis128l_aesni_exit_tfm,
.ivsize = AEGIS128L_NONCE_SIZE,
.maxauthsize = AEGIS128L_MAX_AUTH_SIZE,
.chunksize = AEGIS128L_BLOCK_SIZE,
.base = {
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aegis_ctx) +
__alignof__(struct aegis_ctx),
.cra_alignmask = 0,
.cra_priority = 400,
.cra_name = "__aegis128l",
.cra_driver_name = "__aegis128l-aesni",
.cra_module = THIS_MODULE,
}
};
static struct simd_aead_alg *simd_alg;
static int __init crypto_aegis128l_aesni_module_init(void)
{
if (!boot_cpu_has(X86_FEATURE_XMM2) ||
!boot_cpu_has(X86_FEATURE_AES) ||
!cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL))
return -ENODEV;
return simd_register_aeads_compat(&crypto_aegis128l_aesni_alg, 1,
&simd_alg);
}
static void __exit crypto_aegis128l_aesni_module_exit(void)
{
simd_unregister_aeads(&crypto_aegis128l_aesni_alg, 1, &simd_alg);
}
module_init(crypto_aegis128l_aesni_module_init);
module_exit(crypto_aegis128l_aesni_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("AEGIS-128L AEAD algorithm -- AESNI+SSE2 implementation");
MODULE_ALIAS_CRYPTO("aegis128l");
MODULE_ALIAS_CRYPTO("aegis128l-aesni");

View File

@ -1,700 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* AES-NI + SSE2 implementation of AEGIS-128L
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <linux/linkage.h>
#include <asm/frame.h>
#define STATE0 %xmm0
#define STATE1 %xmm1
#define STATE2 %xmm2
#define STATE3 %xmm3
#define STATE4 %xmm4
#define STATE5 %xmm5
#define MSG %xmm6
#define T0 %xmm7
#define T1 %xmm8
#define T2 %xmm9
#define T3 %xmm10
#define STATEP %rdi
#define LEN %rsi
#define SRC %rdx
#define DST %rcx
.section .rodata.cst16.aegis256_const, "aM", @progbits, 32
.align 16
.Laegis256_const_0:
.byte 0x00, 0x01, 0x01, 0x02, 0x03, 0x05, 0x08, 0x0d
.byte 0x15, 0x22, 0x37, 0x59, 0x90, 0xe9, 0x79, 0x62
.Laegis256_const_1:
.byte 0xdb, 0x3d, 0x18, 0x55, 0x6d, 0xc2, 0x2f, 0xf1
.byte 0x20, 0x11, 0x31, 0x42, 0x73, 0xb5, 0x28, 0xdd
.section .rodata.cst16.aegis256_counter, "aM", @progbits, 16
.align 16
.Laegis256_counter:
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.text
/*
* __load_partial: internal ABI
* input:
* LEN - bytes
* SRC - src
* output:
* MSG - message block
* changed:
* T0
* %r8
* %r9
*/
__load_partial:
xor %r9d, %r9d
pxor MSG, MSG
mov LEN, %r8
and $0x1, %r8
jz .Lld_partial_1
mov LEN, %r8
and $0x1E, %r8
add SRC, %r8
mov (%r8), %r9b
.Lld_partial_1:
mov LEN, %r8
and $0x2, %r8
jz .Lld_partial_2
mov LEN, %r8
and $0x1C, %r8
add SRC, %r8
shl $0x10, %r9
mov (%r8), %r9w
.Lld_partial_2:
mov LEN, %r8
and $0x4, %r8
jz .Lld_partial_4
mov LEN, %r8
and $0x18, %r8
add SRC, %r8
shl $32, %r9
mov (%r8), %r8d
xor %r8, %r9
.Lld_partial_4:
movq %r9, MSG
mov LEN, %r8
and $0x8, %r8
jz .Lld_partial_8
mov LEN, %r8
and $0x10, %r8
add SRC, %r8
pslldq $8, MSG
movq (%r8), T0
pxor T0, MSG
.Lld_partial_8:
ret
ENDPROC(__load_partial)
/*
* __store_partial: internal ABI
* input:
* LEN - bytes
* DST - dst
* output:
* T0 - message block
* changed:
* %r8
* %r9
* %r10
*/
__store_partial:
mov LEN, %r8
mov DST, %r9
movq T0, %r10
cmp $8, %r8
jl .Lst_partial_8
mov %r10, (%r9)
psrldq $8, T0
movq T0, %r10
sub $8, %r8
add $8, %r9
.Lst_partial_8:
cmp $4, %r8
jl .Lst_partial_4
mov %r10d, (%r9)
shr $32, %r10
sub $4, %r8
add $4, %r9
.Lst_partial_4:
cmp $2, %r8
jl .Lst_partial_2
mov %r10w, (%r9)
shr $0x10, %r10
sub $2, %r8
add $2, %r9
.Lst_partial_2:
cmp $1, %r8
jl .Lst_partial_1
mov %r10b, (%r9)
.Lst_partial_1:
ret
ENDPROC(__store_partial)
.macro update
movdqa STATE5, T0
aesenc STATE0, STATE5
aesenc STATE1, STATE0
aesenc STATE2, STATE1
aesenc STATE3, STATE2
aesenc STATE4, STATE3
aesenc T0, STATE4
.endm
.macro update0 m
update
pxor \m, STATE5
.endm
.macro update1 m
update
pxor \m, STATE4
.endm
.macro update2 m
update
pxor \m, STATE3
.endm
.macro update3 m
update
pxor \m, STATE2
.endm
.macro update4 m
update
pxor \m, STATE1
.endm
.macro update5 m
update
pxor \m, STATE0
.endm
.macro state_load
movdqu 0x00(STATEP), STATE0
movdqu 0x10(STATEP), STATE1
movdqu 0x20(STATEP), STATE2
movdqu 0x30(STATEP), STATE3
movdqu 0x40(STATEP), STATE4
movdqu 0x50(STATEP), STATE5
.endm
.macro state_store s0 s1 s2 s3 s4 s5
movdqu \s5, 0x00(STATEP)
movdqu \s0, 0x10(STATEP)
movdqu \s1, 0x20(STATEP)
movdqu \s2, 0x30(STATEP)
movdqu \s3, 0x40(STATEP)
movdqu \s4, 0x50(STATEP)
.endm
.macro state_store0
state_store STATE0 STATE1 STATE2 STATE3 STATE4 STATE5
.endm
.macro state_store1
state_store STATE5 STATE0 STATE1 STATE2 STATE3 STATE4
.endm
.macro state_store2
state_store STATE4 STATE5 STATE0 STATE1 STATE2 STATE3
.endm
.macro state_store3
state_store STATE3 STATE4 STATE5 STATE0 STATE1 STATE2
.endm
.macro state_store4
state_store STATE2 STATE3 STATE4 STATE5 STATE0 STATE1
.endm
.macro state_store5
state_store STATE1 STATE2 STATE3 STATE4 STATE5 STATE0
.endm
/*
* void crypto_aegis256_aesni_init(void *state, const void *key, const void *iv);
*/
ENTRY(crypto_aegis256_aesni_init)
FRAME_BEGIN
/* load key: */
movdqa 0x00(%rsi), MSG
movdqa 0x10(%rsi), T1
movdqa MSG, STATE4
movdqa T1, STATE5
/* load IV: */
movdqu 0x00(%rdx), T2
movdqu 0x10(%rdx), T3
pxor MSG, T2
pxor T1, T3
movdqa T2, STATE0
movdqa T3, STATE1
/* load the constants: */
movdqa .Laegis256_const_0, STATE3
movdqa .Laegis256_const_1, STATE2
pxor STATE3, STATE4
pxor STATE2, STATE5
/* update 10 times with IV and KEY: */
update0 MSG
update1 T1
update2 T2
update3 T3
update4 MSG
update5 T1
update0 T2
update1 T3
update2 MSG
update3 T1
update4 T2
update5 T3
update0 MSG
update1 T1
update2 T2
update3 T3
state_store3
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_init)
.macro ad_block a i
movdq\a (\i * 0x10)(SRC), MSG
update\i MSG
sub $0x10, LEN
cmp $0x10, LEN
jl .Lad_out_\i
.endm
/*
* void crypto_aegis256_aesni_ad(void *state, unsigned int length,
* const void *data);
*/
ENTRY(crypto_aegis256_aesni_ad)
FRAME_BEGIN
cmp $0x10, LEN
jb .Lad_out
state_load
mov SRC, %r8
and $0xf, %r8
jnz .Lad_u_loop
.align 8
.Lad_a_loop:
ad_block a 0
ad_block a 1
ad_block a 2
ad_block a 3
ad_block a 4
ad_block a 5
add $0x60, SRC
jmp .Lad_a_loop
.align 8
.Lad_u_loop:
ad_block u 0
ad_block u 1
ad_block u 2
ad_block u 3
ad_block u 4
ad_block u 5
add $0x60, SRC
jmp .Lad_u_loop
.Lad_out_0:
state_store0
FRAME_END
ret
.Lad_out_1:
state_store1
FRAME_END
ret
.Lad_out_2:
state_store2
FRAME_END
ret
.Lad_out_3:
state_store3
FRAME_END
ret
.Lad_out_4:
state_store4
FRAME_END
ret
.Lad_out_5:
state_store5
FRAME_END
ret
.Lad_out:
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_ad)
.macro crypt m s0 s1 s2 s3 s4 s5
pxor \s1, \m
pxor \s4, \m
pxor \s5, \m
movdqa \s2, T3
pand \s3, T3
pxor T3, \m
.endm
.macro crypt0 m
crypt \m STATE0 STATE1 STATE2 STATE3 STATE4 STATE5
.endm
.macro crypt1 m
crypt \m STATE5 STATE0 STATE1 STATE2 STATE3 STATE4
.endm
.macro crypt2 m
crypt \m STATE4 STATE5 STATE0 STATE1 STATE2 STATE3
.endm
.macro crypt3 m
crypt \m STATE3 STATE4 STATE5 STATE0 STATE1 STATE2
.endm
.macro crypt4 m
crypt \m STATE2 STATE3 STATE4 STATE5 STATE0 STATE1
.endm
.macro crypt5 m
crypt \m STATE1 STATE2 STATE3 STATE4 STATE5 STATE0
.endm
.macro encrypt_block a i
movdq\a (\i * 0x10)(SRC), MSG
movdqa MSG, T0
crypt\i T0
movdq\a T0, (\i * 0x10)(DST)
update\i MSG
sub $0x10, LEN
cmp $0x10, LEN
jl .Lenc_out_\i
.endm
.macro decrypt_block a i
movdq\a (\i * 0x10)(SRC), MSG
crypt\i MSG
movdq\a MSG, (\i * 0x10)(DST)
update\i MSG
sub $0x10, LEN
cmp $0x10, LEN
jl .Ldec_out_\i
.endm
/*
* void crypto_aegis256_aesni_enc(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis256_aesni_enc)
FRAME_BEGIN
cmp $0x10, LEN
jb .Lenc_out
state_load
mov SRC, %r8
or DST, %r8
and $0xf, %r8
jnz .Lenc_u_loop
.align 8
.Lenc_a_loop:
encrypt_block a 0
encrypt_block a 1
encrypt_block a 2
encrypt_block a 3
encrypt_block a 4
encrypt_block a 5
add $0x60, SRC
add $0x60, DST
jmp .Lenc_a_loop
.align 8
.Lenc_u_loop:
encrypt_block u 0
encrypt_block u 1
encrypt_block u 2
encrypt_block u 3
encrypt_block u 4
encrypt_block u 5
add $0x60, SRC
add $0x60, DST
jmp .Lenc_u_loop
.Lenc_out_0:
state_store0
FRAME_END
ret
.Lenc_out_1:
state_store1
FRAME_END
ret
.Lenc_out_2:
state_store2
FRAME_END
ret
.Lenc_out_3:
state_store3
FRAME_END
ret
.Lenc_out_4:
state_store4
FRAME_END
ret
.Lenc_out_5:
state_store5
FRAME_END
ret
.Lenc_out:
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_enc)
/*
* void crypto_aegis256_aesni_enc_tail(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis256_aesni_enc_tail)
FRAME_BEGIN
state_load
/* encrypt message: */
call __load_partial
movdqa MSG, T0
crypt0 T0
call __store_partial
update0 MSG
state_store0
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_enc_tail)
/*
* void crypto_aegis256_aesni_dec(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis256_aesni_dec)
FRAME_BEGIN
cmp $0x10, LEN
jb .Ldec_out
state_load
mov SRC, %r8
or DST, %r8
and $0xF, %r8
jnz .Ldec_u_loop
.align 8
.Ldec_a_loop:
decrypt_block a 0
decrypt_block a 1
decrypt_block a 2
decrypt_block a 3
decrypt_block a 4
decrypt_block a 5
add $0x60, SRC
add $0x60, DST
jmp .Ldec_a_loop
.align 8
.Ldec_u_loop:
decrypt_block u 0
decrypt_block u 1
decrypt_block u 2
decrypt_block u 3
decrypt_block u 4
decrypt_block u 5
add $0x60, SRC
add $0x60, DST
jmp .Ldec_u_loop
.Ldec_out_0:
state_store0
FRAME_END
ret
.Ldec_out_1:
state_store1
FRAME_END
ret
.Ldec_out_2:
state_store2
FRAME_END
ret
.Ldec_out_3:
state_store3
FRAME_END
ret
.Ldec_out_4:
state_store4
FRAME_END
ret
.Ldec_out_5:
state_store5
FRAME_END
ret
.Ldec_out:
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_dec)
/*
* void crypto_aegis256_aesni_dec_tail(void *state, unsigned int length,
* const void *src, void *dst);
*/
ENTRY(crypto_aegis256_aesni_dec_tail)
FRAME_BEGIN
state_load
/* decrypt message: */
call __load_partial
crypt0 MSG
movdqa MSG, T0
call __store_partial
/* mask with byte count: */
movq LEN, T0
punpcklbw T0, T0
punpcklbw T0, T0
punpcklbw T0, T0
punpcklbw T0, T0
movdqa .Laegis256_counter, T1
pcmpgtb T1, T0
pand T0, MSG
update0 MSG
state_store0
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_dec_tail)
/*
* void crypto_aegis256_aesni_final(void *state, void *tag_xor,
* u64 assoclen, u64 cryptlen);
*/
ENTRY(crypto_aegis256_aesni_final)
FRAME_BEGIN
state_load
/* prepare length block: */
movq %rdx, MSG
movq %rcx, T0
pslldq $8, T0
pxor T0, MSG
psllq $3, MSG /* multiply by 8 (to get bit count) */
pxor STATE3, MSG
/* update state: */
update0 MSG
update1 MSG
update2 MSG
update3 MSG
update4 MSG
update5 MSG
update0 MSG
/* xor tag: */
movdqu (%rsi), MSG
pxor STATE0, MSG
pxor STATE1, MSG
pxor STATE2, MSG
pxor STATE3, MSG
pxor STATE4, MSG
pxor STATE5, MSG
movdqu MSG, (%rsi)
FRAME_END
ret
ENDPROC(crypto_aegis256_aesni_final)

View File

@ -1,293 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The AEGIS-256 Authenticated-Encryption Algorithm
* Glue for AES-NI + SSE2 implementation
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/module.h>
#include <asm/fpu/api.h>
#include <asm/cpu_device_id.h>
#define AEGIS256_BLOCK_ALIGN 16
#define AEGIS256_BLOCK_SIZE 16
#define AEGIS256_NONCE_SIZE 32
#define AEGIS256_STATE_BLOCKS 6
#define AEGIS256_KEY_SIZE 32
#define AEGIS256_MIN_AUTH_SIZE 8
#define AEGIS256_MAX_AUTH_SIZE 16
asmlinkage void crypto_aegis256_aesni_init(void *state, void *key, void *iv);
asmlinkage void crypto_aegis256_aesni_ad(
void *state, unsigned int length, const void *data);
asmlinkage void crypto_aegis256_aesni_enc(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis256_aesni_dec(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis256_aesni_enc_tail(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis256_aesni_dec_tail(
void *state, unsigned int length, const void *src, void *dst);
asmlinkage void crypto_aegis256_aesni_final(
void *state, void *tag_xor, unsigned int cryptlen,
unsigned int assoclen);
struct aegis_block {
u8 bytes[AEGIS256_BLOCK_SIZE] __aligned(AEGIS256_BLOCK_ALIGN);
};
struct aegis_state {
struct aegis_block blocks[AEGIS256_STATE_BLOCKS];
};
struct aegis_ctx {
struct aegis_block key[AEGIS256_KEY_SIZE / AEGIS256_BLOCK_SIZE];
};
struct aegis_crypt_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_blocks)(void *state, unsigned int length, const void *src,
void *dst);
void (*crypt_tail)(void *state, unsigned int length, const void *src,
void *dst);
};
static void crypto_aegis256_aesni_process_ad(
struct aegis_state *state, struct scatterlist *sg_src,
unsigned int assoclen)
{
struct scatter_walk walk;
struct aegis_block buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= AEGIS256_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = AEGIS256_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
crypto_aegis256_aesni_ad(state,
AEGIS256_BLOCK_SIZE,
buf.bytes);
pos = 0;
left -= fill;
src += fill;
}
crypto_aegis256_aesni_ad(state, left, src);
src += left & ~(AEGIS256_BLOCK_SIZE - 1);
left &= AEGIS256_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, AEGIS256_BLOCK_SIZE - pos);
crypto_aegis256_aesni_ad(state, AEGIS256_BLOCK_SIZE, buf.bytes);
}
}
static void crypto_aegis256_aesni_process_crypt(
struct aegis_state *state, struct skcipher_walk *walk,
const struct aegis_crypt_ops *ops)
{
while (walk->nbytes >= AEGIS256_BLOCK_SIZE) {
ops->crypt_blocks(state,
round_down(walk->nbytes, AEGIS256_BLOCK_SIZE),
walk->src.virt.addr, walk->dst.virt.addr);
skcipher_walk_done(walk, walk->nbytes % AEGIS256_BLOCK_SIZE);
}
if (walk->nbytes) {
ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr,
walk->dst.virt.addr);
skcipher_walk_done(walk, 0);
}
}
static struct aegis_ctx *crypto_aegis256_aesni_ctx(struct crypto_aead *aead)
{
u8 *ctx = crypto_aead_ctx(aead);
ctx = PTR_ALIGN(ctx, __alignof__(struct aegis_ctx));
return (void *)ctx;
}
static int crypto_aegis256_aesni_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct aegis_ctx *ctx = crypto_aegis256_aesni_ctx(aead);
if (keylen != AEGIS256_KEY_SIZE) {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
memcpy(ctx->key, key, AEGIS256_KEY_SIZE);
return 0;
}
static int crypto_aegis256_aesni_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
if (authsize > AEGIS256_MAX_AUTH_SIZE)
return -EINVAL;
if (authsize < AEGIS256_MIN_AUTH_SIZE)
return -EINVAL;
return 0;
}
static void crypto_aegis256_aesni_crypt(struct aead_request *req,
struct aegis_block *tag_xor,
unsigned int cryptlen,
const struct aegis_crypt_ops *ops)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_ctx *ctx = crypto_aegis256_aesni_ctx(tfm);
struct skcipher_walk walk;
struct aegis_state state;
ops->skcipher_walk_init(&walk, req, true);
kernel_fpu_begin();
crypto_aegis256_aesni_init(&state, ctx->key, req->iv);
crypto_aegis256_aesni_process_ad(&state, req->src, req->assoclen);
crypto_aegis256_aesni_process_crypt(&state, &walk, ops);
crypto_aegis256_aesni_final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end();
}
static int crypto_aegis256_aesni_encrypt(struct aead_request *req)
{
static const struct aegis_crypt_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_blocks = crypto_aegis256_aesni_enc,
.crypt_tail = crypto_aegis256_aesni_enc_tail,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_block tag = {};
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_aegis256_aesni_crypt(req, &tag, cryptlen, &OPS);
scatterwalk_map_and_copy(tag.bytes, req->dst,
req->assoclen + cryptlen, authsize, 1);
return 0;
}
static int crypto_aegis256_aesni_decrypt(struct aead_request *req)
{
static const struct aegis_block zeros = {};
static const struct aegis_crypt_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_blocks = crypto_aegis256_aesni_dec,
.crypt_tail = crypto_aegis256_aesni_dec_tail,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag.bytes, req->src,
req->assoclen + cryptlen, authsize, 0);
crypto_aegis256_aesni_crypt(req, &tag, cryptlen, &OPS);
return crypto_memneq(tag.bytes, zeros.bytes, authsize) ? -EBADMSG : 0;
}
static int crypto_aegis256_aesni_init_tfm(struct crypto_aead *aead)
{
return 0;
}
static void crypto_aegis256_aesni_exit_tfm(struct crypto_aead *aead)
{
}
static struct aead_alg crypto_aegis256_aesni_alg = {
.setkey = crypto_aegis256_aesni_setkey,
.setauthsize = crypto_aegis256_aesni_setauthsize,
.encrypt = crypto_aegis256_aesni_encrypt,
.decrypt = crypto_aegis256_aesni_decrypt,
.init = crypto_aegis256_aesni_init_tfm,
.exit = crypto_aegis256_aesni_exit_tfm,
.ivsize = AEGIS256_NONCE_SIZE,
.maxauthsize = AEGIS256_MAX_AUTH_SIZE,
.chunksize = AEGIS256_BLOCK_SIZE,
.base = {
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aegis_ctx) +
__alignof__(struct aegis_ctx),
.cra_alignmask = 0,
.cra_priority = 400,
.cra_name = "__aegis256",
.cra_driver_name = "__aegis256-aesni",
.cra_module = THIS_MODULE,
}
};
static struct simd_aead_alg *simd_alg;
static int __init crypto_aegis256_aesni_module_init(void)
{
if (!boot_cpu_has(X86_FEATURE_XMM2) ||
!boot_cpu_has(X86_FEATURE_AES) ||
!cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL))
return -ENODEV;
return simd_register_aeads_compat(&crypto_aegis256_aesni_alg, 1,
&simd_alg);
}
static void __exit crypto_aegis256_aesni_module_exit(void)
{
simd_unregister_aeads(&crypto_aegis256_aesni_alg, 1, &simd_alg);
}
module_init(crypto_aegis256_aesni_module_init);
module_exit(crypto_aegis256_aesni_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("AEGIS-256 AEAD algorithm -- AESNI+SSE2 implementation");
MODULE_ALIAS_CRYPTO("aegis256");
MODULE_ALIAS_CRYPTO("aegis256-aesni");

View File

@ -1,362 +0,0 @@
// -------------------------------------------------------------------------
// Copyright (c) 2001, Dr Brian Gladman < >, Worcester, UK.
// All rights reserved.
//
// LICENSE TERMS
//
// The free distribution and use of this software in both source and binary
// form is allowed (with or without changes) provided that:
//
// 1. distributions of this source code include the above copyright
// notice, this list of conditions and the following disclaimer//
//
// 2. distributions in binary form include the above copyright
// notice, this list of conditions and the following disclaimer
// in the documentation and/or other associated materials//
//
// 3. the copyright holder's name is not used to endorse products
// built using this software without specific written permission.
//
//
// ALTERNATIVELY, provided that this notice is retained in full, this product
// may be distributed under the terms of the GNU General Public License (GPL),
// in which case the provisions of the GPL apply INSTEAD OF those given above.
//
// Copyright (c) 2004 Linus Torvalds <torvalds@osdl.org>
// Copyright (c) 2004 Red Hat, Inc., James Morris <jmorris@redhat.com>
// DISCLAIMER
//
// This software is provided 'as is' with no explicit or implied warranties
// in respect of its properties including, but not limited to, correctness
// and fitness for purpose.
// -------------------------------------------------------------------------
// Issue Date: 29/07/2002
.file "aes-i586-asm.S"
.text
#include <linux/linkage.h>
#include <asm/asm-offsets.h>
#define tlen 1024 // length of each of 4 'xor' arrays (256 32-bit words)
/* offsets to parameters with one register pushed onto stack */
#define ctx 8
#define out_blk 12
#define in_blk 16
/* offsets in crypto_aes_ctx structure */
#define klen (480)
#define ekey (0)
#define dkey (240)
// register mapping for encrypt and decrypt subroutines
#define r0 eax
#define r1 ebx
#define r2 ecx
#define r3 edx
#define r4 esi
#define r5 edi
#define eaxl al
#define eaxh ah
#define ebxl bl
#define ebxh bh
#define ecxl cl
#define ecxh ch
#define edxl dl
#define edxh dh
#define _h(reg) reg##h
#define h(reg) _h(reg)
#define _l(reg) reg##l
#define l(reg) _l(reg)
// This macro takes a 32-bit word representing a column and uses
// each of its four bytes to index into four tables of 256 32-bit
// words to obtain values that are then xored into the appropriate
// output registers r0, r1, r4 or r5.
// Parameters:
// table table base address
// %1 out_state[0]
// %2 out_state[1]
// %3 out_state[2]
// %4 out_state[3]
// idx input register for the round (destroyed)
// tmp scratch register for the round
// sched key schedule
#define do_col(table, a1,a2,a3,a4, idx, tmp) \
movzx %l(idx),%tmp; \
xor table(,%tmp,4),%a1; \
movzx %h(idx),%tmp; \
shr $16,%idx; \
xor table+tlen(,%tmp,4),%a2; \
movzx %l(idx),%tmp; \
movzx %h(idx),%idx; \
xor table+2*tlen(,%tmp,4),%a3; \
xor table+3*tlen(,%idx,4),%a4;
// initialise output registers from the key schedule
// NB1: original value of a3 is in idx on exit
// NB2: original values of a1,a2,a4 aren't used
#define do_fcol(table, a1,a2,a3,a4, idx, tmp, sched) \
mov 0 sched,%a1; \
movzx %l(idx),%tmp; \
mov 12 sched,%a2; \
xor table(,%tmp,4),%a1; \
mov 4 sched,%a4; \
movzx %h(idx),%tmp; \
shr $16,%idx; \
xor table+tlen(,%tmp,4),%a2; \
movzx %l(idx),%tmp; \
movzx %h(idx),%idx; \
xor table+3*tlen(,%idx,4),%a4; \
mov %a3,%idx; \
mov 8 sched,%a3; \
xor table+2*tlen(,%tmp,4),%a3;
// initialise output registers from the key schedule
// NB1: original value of a3 is in idx on exit
// NB2: original values of a1,a2,a4 aren't used
#define do_icol(table, a1,a2,a3,a4, idx, tmp, sched) \
mov 0 sched,%a1; \
movzx %l(idx),%tmp; \
mov 4 sched,%a2; \
xor table(,%tmp,4),%a1; \
mov 12 sched,%a4; \
movzx %h(idx),%tmp; \
shr $16,%idx; \
xor table+tlen(,%tmp,4),%a2; \
movzx %l(idx),%tmp; \
movzx %h(idx),%idx; \
xor table+3*tlen(,%idx,4),%a4; \
mov %a3,%idx; \
mov 8 sched,%a3; \
xor table+2*tlen(,%tmp,4),%a3;
// original Gladman had conditional saves to MMX regs.
#define save(a1, a2) \
mov %a2,4*a1(%esp)
#define restore(a1, a2) \
mov 4*a2(%esp),%a1
// These macros perform a forward encryption cycle. They are entered with
// the first previous round column values in r0,r1,r4,r5 and
// exit with the final values in the same registers, using stack
// for temporary storage.
// round column values
// on entry: r0,r1,r4,r5
// on exit: r2,r1,r4,r5
#define fwd_rnd1(arg, table) \
save (0,r1); \
save (1,r5); \
\
/* compute new column values */ \
do_fcol(table, r2,r5,r4,r1, r0,r3, arg); /* idx=r0 */ \
do_col (table, r4,r1,r2,r5, r0,r3); /* idx=r4 */ \
restore(r0,0); \
do_col (table, r1,r2,r5,r4, r0,r3); /* idx=r1 */ \
restore(r0,1); \
do_col (table, r5,r4,r1,r2, r0,r3); /* idx=r5 */
// round column values
// on entry: r2,r1,r4,r5
// on exit: r0,r1,r4,r5
#define fwd_rnd2(arg, table) \
save (0,r1); \
save (1,r5); \
\
/* compute new column values */ \
do_fcol(table, r0,r5,r4,r1, r2,r3, arg); /* idx=r2 */ \
do_col (table, r4,r1,r0,r5, r2,r3); /* idx=r4 */ \
restore(r2,0); \
do_col (table, r1,r0,r5,r4, r2,r3); /* idx=r1 */ \
restore(r2,1); \
do_col (table, r5,r4,r1,r0, r2,r3); /* idx=r5 */
// These macros performs an inverse encryption cycle. They are entered with
// the first previous round column values in r0,r1,r4,r5 and
// exit with the final values in the same registers, using stack
// for temporary storage
// round column values
// on entry: r0,r1,r4,r5
// on exit: r2,r1,r4,r5
#define inv_rnd1(arg, table) \
save (0,r1); \
save (1,r5); \
\
/* compute new column values */ \
do_icol(table, r2,r1,r4,r5, r0,r3, arg); /* idx=r0 */ \
do_col (table, r4,r5,r2,r1, r0,r3); /* idx=r4 */ \
restore(r0,0); \
do_col (table, r1,r4,r5,r2, r0,r3); /* idx=r1 */ \
restore(r0,1); \
do_col (table, r5,r2,r1,r4, r0,r3); /* idx=r5 */
// round column values
// on entry: r2,r1,r4,r5
// on exit: r0,r1,r4,r5
#define inv_rnd2(arg, table) \
save (0,r1); \
save (1,r5); \
\
/* compute new column values */ \
do_icol(table, r0,r1,r4,r5, r2,r3, arg); /* idx=r2 */ \
do_col (table, r4,r5,r0,r1, r2,r3); /* idx=r4 */ \
restore(r2,0); \
do_col (table, r1,r4,r5,r0, r2,r3); /* idx=r1 */ \
restore(r2,1); \
do_col (table, r5,r0,r1,r4, r2,r3); /* idx=r5 */
// AES (Rijndael) Encryption Subroutine
/* void aes_enc_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
.extern crypto_ft_tab
.extern crypto_fl_tab
ENTRY(aes_enc_blk)
push %ebp
mov ctx(%esp),%ebp
// CAUTION: the order and the values used in these assigns
// rely on the register mappings
1: push %ebx
mov in_blk+4(%esp),%r2
push %esi
mov klen(%ebp),%r3 // key size
push %edi
#if ekey != 0
lea ekey(%ebp),%ebp // key pointer
#endif
// input four columns and xor in first round key
mov (%r2),%r0
mov 4(%r2),%r1
mov 8(%r2),%r4
mov 12(%r2),%r5
xor (%ebp),%r0
xor 4(%ebp),%r1
xor 8(%ebp),%r4
xor 12(%ebp),%r5
sub $8,%esp // space for register saves on stack
add $16,%ebp // increment to next round key
cmp $24,%r3
jb 4f // 10 rounds for 128-bit key
lea 32(%ebp),%ebp
je 3f // 12 rounds for 192-bit key
lea 32(%ebp),%ebp
2: fwd_rnd1( -64(%ebp), crypto_ft_tab) // 14 rounds for 256-bit key
fwd_rnd2( -48(%ebp), crypto_ft_tab)
3: fwd_rnd1( -32(%ebp), crypto_ft_tab) // 12 rounds for 192-bit key
fwd_rnd2( -16(%ebp), crypto_ft_tab)
4: fwd_rnd1( (%ebp), crypto_ft_tab) // 10 rounds for 128-bit key
fwd_rnd2( +16(%ebp), crypto_ft_tab)
fwd_rnd1( +32(%ebp), crypto_ft_tab)
fwd_rnd2( +48(%ebp), crypto_ft_tab)
fwd_rnd1( +64(%ebp), crypto_ft_tab)
fwd_rnd2( +80(%ebp), crypto_ft_tab)
fwd_rnd1( +96(%ebp), crypto_ft_tab)
fwd_rnd2(+112(%ebp), crypto_ft_tab)
fwd_rnd1(+128(%ebp), crypto_ft_tab)
fwd_rnd2(+144(%ebp), crypto_fl_tab) // last round uses a different table
// move final values to the output array. CAUTION: the
// order of these assigns rely on the register mappings
add $8,%esp
mov out_blk+12(%esp),%ebp
mov %r5,12(%ebp)
pop %edi
mov %r4,8(%ebp)
pop %esi
mov %r1,4(%ebp)
pop %ebx
mov %r0,(%ebp)
pop %ebp
ret
ENDPROC(aes_enc_blk)
// AES (Rijndael) Decryption Subroutine
/* void aes_dec_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
.extern crypto_it_tab
.extern crypto_il_tab
ENTRY(aes_dec_blk)
push %ebp
mov ctx(%esp),%ebp
// CAUTION: the order and the values used in these assigns
// rely on the register mappings
1: push %ebx
mov in_blk+4(%esp),%r2
push %esi
mov klen(%ebp),%r3 // key size
push %edi
#if dkey != 0
lea dkey(%ebp),%ebp // key pointer
#endif
// input four columns and xor in first round key
mov (%r2),%r0
mov 4(%r2),%r1
mov 8(%r2),%r4
mov 12(%r2),%r5
xor (%ebp),%r0
xor 4(%ebp),%r1
xor 8(%ebp),%r4
xor 12(%ebp),%r5
sub $8,%esp // space for register saves on stack
add $16,%ebp // increment to next round key
cmp $24,%r3
jb 4f // 10 rounds for 128-bit key
lea 32(%ebp),%ebp
je 3f // 12 rounds for 192-bit key
lea 32(%ebp),%ebp
2: inv_rnd1( -64(%ebp), crypto_it_tab) // 14 rounds for 256-bit key
inv_rnd2( -48(%ebp), crypto_it_tab)
3: inv_rnd1( -32(%ebp), crypto_it_tab) // 12 rounds for 192-bit key
inv_rnd2( -16(%ebp), crypto_it_tab)
4: inv_rnd1( (%ebp), crypto_it_tab) // 10 rounds for 128-bit key
inv_rnd2( +16(%ebp), crypto_it_tab)
inv_rnd1( +32(%ebp), crypto_it_tab)
inv_rnd2( +48(%ebp), crypto_it_tab)
inv_rnd1( +64(%ebp), crypto_it_tab)
inv_rnd2( +80(%ebp), crypto_it_tab)
inv_rnd1( +96(%ebp), crypto_it_tab)
inv_rnd2(+112(%ebp), crypto_it_tab)
inv_rnd1(+128(%ebp), crypto_it_tab)
inv_rnd2(+144(%ebp), crypto_il_tab) // last round uses a different table
// move final values to the output array. CAUTION: the
// order of these assigns rely on the register mappings
add $8,%esp
mov out_blk+12(%esp),%ebp
mov %r5,12(%ebp)
pop %edi
mov %r4,8(%ebp)
pop %esi
mov %r1,4(%ebp)
pop %ebx
mov %r0,(%ebp)
pop %ebp
ret
ENDPROC(aes_dec_blk)

View File

@ -1,185 +0,0 @@
/* AES (Rijndael) implementation (FIPS PUB 197) for x86_64
*
* Copyright (C) 2005 Andreas Steinmetz, <ast@domdv.de>
*
* License:
* This code can be distributed under the terms of the GNU General Public
* License (GPL) Version 2 provided that the above header down to and
* including this sentence is retained in full.
*/
.extern crypto_ft_tab
.extern crypto_it_tab
.extern crypto_fl_tab
.extern crypto_il_tab
.text
#include <linux/linkage.h>
#include <asm/asm-offsets.h>
#define R1 %rax
#define R1E %eax
#define R1X %ax
#define R1H %ah
#define R1L %al
#define R2 %rbx
#define R2E %ebx
#define R2X %bx
#define R2H %bh
#define R2L %bl
#define R3 %rcx
#define R3E %ecx
#define R3X %cx
#define R3H %ch
#define R3L %cl
#define R4 %rdx
#define R4E %edx
#define R4X %dx
#define R4H %dh
#define R4L %dl
#define R5 %rsi
#define R5E %esi
#define R6 %rdi
#define R6E %edi
#define R7 %r9 /* don't use %rbp; it breaks stack traces */
#define R7E %r9d
#define R8 %r8
#define R10 %r10
#define R11 %r11
#define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
ENTRY(FUNC); \
movq r1,r2; \
leaq KEY+48(r8),r9; \
movq r10,r11; \
movl (r7),r5 ## E; \
movl 4(r7),r1 ## E; \
movl 8(r7),r6 ## E; \
movl 12(r7),r7 ## E; \
movl 480(r8),r10 ## E; \
xorl -48(r9),r5 ## E; \
xorl -44(r9),r1 ## E; \
xorl -40(r9),r6 ## E; \
xorl -36(r9),r7 ## E; \
cmpl $24,r10 ## E; \
jb B128; \
leaq 32(r9),r9; \
je B192; \
leaq 32(r9),r9;
#define epilogue(FUNC,r1,r2,r5,r6,r7,r8,r9) \
movq r1,r2; \
movl r5 ## E,(r9); \
movl r6 ## E,4(r9); \
movl r7 ## E,8(r9); \
movl r8 ## E,12(r9); \
ret; \
ENDPROC(FUNC);
#define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
movzbl r2 ## H,r5 ## E; \
movzbl r2 ## L,r6 ## E; \
movl TAB+1024(,r5,4),r5 ## E;\
movw r4 ## X,r2 ## X; \
movl TAB(,r6,4),r6 ## E; \
roll $16,r2 ## E; \
shrl $16,r4 ## E; \
movzbl r4 ## L,r7 ## E; \
movzbl r4 ## H,r4 ## E; \
xorl OFFSET(r8),ra ## E; \
xorl OFFSET+4(r8),rb ## E; \
xorl TAB+3072(,r4,4),r5 ## E;\
xorl TAB+2048(,r7,4),r6 ## E;\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r4 ## E; \
movl TAB+1024(,r4,4),r4 ## E;\
movw r3 ## X,r1 ## X; \
roll $16,r1 ## E; \
shrl $16,r3 ## E; \
xorl TAB(,r7,4),r5 ## E; \
movzbl r3 ## L,r7 ## E; \
movzbl r3 ## H,r3 ## E; \
xorl TAB+3072(,r3,4),r4 ## E;\
xorl TAB+2048(,r7,4),r5 ## E;\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r3 ## E; \
shrl $16,r1 ## E; \
xorl TAB+3072(,r3,4),r6 ## E;\
movl TAB+2048(,r7,4),r3 ## E;\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r1 ## E; \
xorl TAB+1024(,r1,4),r6 ## E;\
xorl TAB(,r7,4),r3 ## E; \
movzbl r2 ## H,r1 ## E; \
movzbl r2 ## L,r7 ## E; \
shrl $16,r2 ## E; \
xorl TAB+3072(,r1,4),r3 ## E;\
xorl TAB+2048(,r7,4),r4 ## E;\
movzbl r2 ## H,r1 ## E; \
movzbl r2 ## L,r2 ## E; \
xorl OFFSET+8(r8),rc ## E; \
xorl OFFSET+12(r8),rd ## E; \
xorl TAB+1024(,r1,4),r3 ## E;\
xorl TAB(,r2,4),r4 ## E;
#define move_regs(r1,r2,r3,r4) \
movl r3 ## E,r1 ## E; \
movl r4 ## E,r2 ## E;
#define entry(FUNC,KEY,B128,B192) \
prologue(FUNC,KEY,B128,B192,R2,R8,R1,R3,R4,R6,R10,R5,R11)
#define return(FUNC) epilogue(FUNC,R8,R2,R5,R6,R3,R4,R11)
#define encrypt_round(TAB,OFFSET) \
round(TAB,OFFSET,R1,R2,R3,R4,R5,R6,R7,R10,R5,R6,R3,R4) \
move_regs(R1,R2,R5,R6)
#define encrypt_final(TAB,OFFSET) \
round(TAB,OFFSET,R1,R2,R3,R4,R5,R6,R7,R10,R5,R6,R3,R4)
#define decrypt_round(TAB,OFFSET) \
round(TAB,OFFSET,R2,R1,R4,R3,R6,R5,R7,R10,R5,R6,R3,R4) \
move_regs(R1,R2,R5,R6)
#define decrypt_final(TAB,OFFSET) \
round(TAB,OFFSET,R2,R1,R4,R3,R6,R5,R7,R10,R5,R6,R3,R4)
/* void aes_enc_blk(stuct crypto_tfm *tfm, u8 *out, const u8 *in) */
entry(aes_enc_blk,0,.Le128,.Le192)
encrypt_round(crypto_ft_tab,-96)
encrypt_round(crypto_ft_tab,-80)
.Le192: encrypt_round(crypto_ft_tab,-64)
encrypt_round(crypto_ft_tab,-48)
.Le128: encrypt_round(crypto_ft_tab,-32)
encrypt_round(crypto_ft_tab,-16)
encrypt_round(crypto_ft_tab, 0)
encrypt_round(crypto_ft_tab, 16)
encrypt_round(crypto_ft_tab, 32)
encrypt_round(crypto_ft_tab, 48)
encrypt_round(crypto_ft_tab, 64)
encrypt_round(crypto_ft_tab, 80)
encrypt_round(crypto_ft_tab, 96)
encrypt_final(crypto_fl_tab,112)
return(aes_enc_blk)
/* void aes_dec_blk(struct crypto_tfm *tfm, u8 *out, const u8 *in) */
entry(aes_dec_blk,240,.Ld128,.Ld192)
decrypt_round(crypto_it_tab,-96)
decrypt_round(crypto_it_tab,-80)
.Ld192: decrypt_round(crypto_it_tab,-64)
decrypt_round(crypto_it_tab,-48)
.Ld128: decrypt_round(crypto_it_tab,-32)
decrypt_round(crypto_it_tab,-16)
decrypt_round(crypto_it_tab, 0)
decrypt_round(crypto_it_tab, 16)
decrypt_round(crypto_it_tab, 32)
decrypt_round(crypto_it_tab, 48)
decrypt_round(crypto_it_tab, 64)
decrypt_round(crypto_it_tab, 80)
decrypt_round(crypto_it_tab, 96)
decrypt_final(crypto_il_tab,112)
return(aes_dec_blk)

View File

@ -1,71 +1 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Glue Code for the asm optimized version of the AES Cipher Algorithm
*
*/
#include <linux/module.h>
#include <crypto/aes.h>
#include <asm/crypto/aes.h>
asmlinkage void aes_enc_blk(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
asmlinkage void aes_dec_blk(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
void crypto_aes_encrypt_x86(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
{
aes_enc_blk(ctx, dst, src);
}
EXPORT_SYMBOL_GPL(crypto_aes_encrypt_x86);
void crypto_aes_decrypt_x86(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
{
aes_dec_blk(ctx, dst, src);
}
EXPORT_SYMBOL_GPL(crypto_aes_decrypt_x86);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
aes_enc_blk(crypto_tfm_ctx(tfm), dst, src);
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
aes_dec_blk(crypto_tfm_ctx(tfm), dst, src);
}
static struct crypto_alg aes_alg = {
.cra_name = "aes",
.cra_driver_name = "aes-asm",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_module = THIS_MODULE,
.cra_u = {
.cipher = {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
.cia_setkey = crypto_aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt
}
}
};
static int __init aes_init(void)
{
return crypto_register_alg(&aes_alg);
}
static void __exit aes_fini(void)
{
crypto_unregister_alg(&aes_alg);
}
module_init(aes_init);
module_exit(aes_fini);
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm, asm optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS_CRYPTO("aes");
MODULE_ALIAS_CRYPTO("aes-asm");

View File

@ -26,7 +26,6 @@
#include <crypto/gcm.h>
#include <crypto/xts.h>
#include <asm/cpu_device_id.h>
#include <asm/crypto/aes.h>
#include <asm/simd.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
@ -329,7 +328,7 @@ static int aes_set_key_common(struct crypto_tfm *tfm, void *raw_ctx,
}
if (!crypto_simd_usable())
err = crypto_aes_expand_key(ctx, in_key, key_len);
err = aes_expandkey(ctx, in_key, key_len);
else {
kernel_fpu_begin();
err = aesni_set_key(ctx, in_key, key_len);
@ -345,26 +344,26 @@ static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
return aes_set_key_common(tfm, crypto_tfm_ctx(tfm), in_key, key_len);
}
static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void aesni_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
if (!crypto_simd_usable())
crypto_aes_encrypt_x86(ctx, dst, src);
else {
if (!crypto_simd_usable()) {
aes_encrypt(ctx, dst, src);
} else {
kernel_fpu_begin();
aesni_enc(ctx, dst, src);
kernel_fpu_end();
}
}
static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
static void aesni_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
if (!crypto_simd_usable())
crypto_aes_decrypt_x86(ctx, dst, src);
else {
if (!crypto_simd_usable()) {
aes_decrypt(ctx, dst, src);
} else {
kernel_fpu_begin();
aesni_dec(ctx, dst, src);
kernel_fpu_end();
@ -610,7 +609,8 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&aesni_enc_xts, req,
XTS_TWEAK_CAST(aesni_xts_tweak),
aes_ctx(ctx->raw_tweak_ctx),
aes_ctx(ctx->raw_crypt_ctx));
aes_ctx(ctx->raw_crypt_ctx),
false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -621,32 +621,28 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&aesni_dec_xts, req,
XTS_TWEAK_CAST(aesni_xts_tweak),
aes_ctx(ctx->raw_tweak_ctx),
aes_ctx(ctx->raw_crypt_ctx));
aes_ctx(ctx->raw_crypt_ctx),
true);
}
static int
rfc4106_set_hash_subkey(u8 *hash_subkey, const u8 *key, unsigned int key_len)
{
struct crypto_cipher *tfm;
struct crypto_aes_ctx ctx;
int ret;
tfm = crypto_alloc_cipher("aes", 0, 0);
if (IS_ERR(tfm))
return PTR_ERR(tfm);
ret = crypto_cipher_setkey(tfm, key, key_len);
ret = aes_expandkey(&ctx, key, key_len);
if (ret)
goto out_free_cipher;
return ret;
/* Clear the data in the hash sub key container to zero.*/
/* We want to cipher all zeros to create the hash sub key. */
memset(hash_subkey, 0, RFC4106_HASH_SUBKEY_SIZE);
crypto_cipher_encrypt_one(tfm, hash_subkey, hash_subkey);
aes_encrypt(&ctx, hash_subkey, hash_subkey);
out_free_cipher:
crypto_free_cipher(tfm);
return ret;
memzero_explicit(&ctx, sizeof(ctx));
return 0;
}
static int common_rfc4106_set_key(struct crypto_aead *aead, const u8 *key,
@ -919,8 +915,8 @@ static struct crypto_alg aesni_cipher_alg = {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
.cia_setkey = aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt
.cia_encrypt = aesni_encrypt,
.cia_decrypt = aesni_decrypt
}
}
};

View File

@ -182,7 +182,7 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&camellia_enc_xts, req,
XTS_TWEAK_CAST(camellia_enc_blk),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -192,7 +192,7 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&camellia_dec_xts, req,
XTS_TWEAK_CAST(camellia_enc_blk),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}
static struct skcipher_alg camellia_algs[] = {

View File

@ -208,7 +208,7 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&camellia_enc_xts, req,
XTS_TWEAK_CAST(camellia_enc_blk),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -218,7 +218,7 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&camellia_dec_xts, req,
XTS_TWEAK_CAST(camellia_enc_blk),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}
static struct skcipher_alg camellia_algs[] = {

View File

@ -201,7 +201,7 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&cast6_enc_xts, req,
XTS_TWEAK_CAST(__cast6_encrypt),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -211,7 +211,7 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&cast6_dec_xts, req,
XTS_TWEAK_CAST(__cast6_encrypt),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}
static struct skcipher_alg cast6_algs[] = {

View File

@ -19,8 +19,8 @@
#include <linux/types.h>
struct des3_ede_x86_ctx {
u32 enc_expkey[DES3_EDE_EXPKEY_WORDS];
u32 dec_expkey[DES3_EDE_EXPKEY_WORDS];
struct des3_ede_ctx enc;
struct des3_ede_ctx dec;
};
/* regular block cipher functions */
@ -34,7 +34,7 @@ asmlinkage void des3_ede_x86_64_crypt_blk_3way(const u32 *expkey, u8 *dst,
static inline void des3_ede_enc_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *enc_ctx = ctx->enc_expkey;
u32 *enc_ctx = ctx->enc.expkey;
des3_ede_x86_64_crypt_blk(enc_ctx, dst, src);
}
@ -42,7 +42,7 @@ static inline void des3_ede_enc_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
static inline void des3_ede_dec_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *dec_ctx = ctx->dec_expkey;
u32 *dec_ctx = ctx->dec.expkey;
des3_ede_x86_64_crypt_blk(dec_ctx, dst, src);
}
@ -50,7 +50,7 @@ static inline void des3_ede_dec_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
static inline void des3_ede_enc_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *enc_ctx = ctx->enc_expkey;
u32 *enc_ctx = ctx->enc.expkey;
des3_ede_x86_64_crypt_blk_3way(enc_ctx, dst, src);
}
@ -58,7 +58,7 @@ static inline void des3_ede_enc_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
static inline void des3_ede_dec_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *dec_ctx = ctx->dec_expkey;
u32 *dec_ctx = ctx->dec.expkey;
des3_ede_x86_64_crypt_blk_3way(dec_ctx, dst, src);
}
@ -122,7 +122,7 @@ static int ecb_encrypt(struct skcipher_request *req)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct des3_ede_x86_ctx *ctx = crypto_skcipher_ctx(tfm);
return ecb_crypt(req, ctx->enc_expkey);
return ecb_crypt(req, ctx->enc.expkey);
}
static int ecb_decrypt(struct skcipher_request *req)
@ -130,7 +130,7 @@ static int ecb_decrypt(struct skcipher_request *req)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct des3_ede_x86_ctx *ctx = crypto_skcipher_ctx(tfm);
return ecb_crypt(req, ctx->dec_expkey);
return ecb_crypt(req, ctx->dec.expkey);
}
static unsigned int __cbc_encrypt(struct des3_ede_x86_ctx *ctx,
@ -348,20 +348,28 @@ static int des3_ede_x86_setkey(struct crypto_tfm *tfm, const u8 *key,
u32 i, j, tmp;
int err;
/* Generate encryption context using generic implementation. */
err = __des3_ede_setkey(ctx->enc_expkey, &tfm->crt_flags, key, keylen);
if (err < 0)
err = des3_ede_expand_key(&ctx->enc, key, keylen);
if (err == -ENOKEY) {
if (crypto_tfm_get_flags(tfm) & CRYPTO_TFM_REQ_FORBID_WEAK_KEYS)
err = -EINVAL;
else
err = 0;
}
if (err) {
memset(ctx, 0, sizeof(*ctx));
return err;
}
/* Fix encryption context for this implementation and form decryption
* context. */
j = DES3_EDE_EXPKEY_WORDS - 2;
for (i = 0; i < DES3_EDE_EXPKEY_WORDS; i += 2, j -= 2) {
tmp = ror32(ctx->enc_expkey[i + 1], 4);
ctx->enc_expkey[i + 1] = tmp;
tmp = ror32(ctx->enc.expkey[i + 1], 4);
ctx->enc.expkey[i + 1] = tmp;
ctx->dec_expkey[j + 0] = ctx->enc_expkey[i + 0];
ctx->dec_expkey[j + 1] = tmp;
ctx->dec.expkey[j + 0] = ctx->enc.expkey[i + 0];
ctx->dec.expkey[j + 1] = tmp;
}
return 0;

View File

@ -357,6 +357,5 @@ module_init(ghash_pclmulqdqni_mod_init);
module_exit(ghash_pclmulqdqni_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("GHASH Message Digest Algorithm, "
"accelerated by PCLMULQDQ-NI");
MODULE_DESCRIPTION("GHASH hash function, accelerated by PCLMULQDQ-NI");
MODULE_ALIAS_CRYPTO("ghash");

View File

@ -14,6 +14,7 @@
#include <crypto/b128ops.h>
#include <crypto/gf128mul.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <crypto/xts.h>
#include <asm/crypto/glue_helper.h>
@ -259,17 +260,36 @@ done:
int glue_xts_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req,
common_glue_func_t tweak_fn, void *tweak_ctx,
void *crypt_ctx)
void *crypt_ctx, bool decrypt)
{
const bool cts = (req->cryptlen % XTS_BLOCK_SIZE);
const unsigned int bsize = 128 / 8;
struct skcipher_request subreq;
struct skcipher_walk walk;
bool fpu_enabled = false;
unsigned int nbytes;
unsigned int nbytes, tail;
int err;
if (req->cryptlen < XTS_BLOCK_SIZE)
return -EINVAL;
if (unlikely(cts)) {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
tail = req->cryptlen % XTS_BLOCK_SIZE + XTS_BLOCK_SIZE;
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
crypto_skcipher_get_flags(tfm),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
req->cryptlen - tail, req->iv);
req = &subreq;
}
err = skcipher_walk_virt(&walk, req, false);
nbytes = walk.nbytes;
if (!nbytes)
if (err)
return err;
/* set minimum length to bsize, for tweak_fn */
@ -287,6 +307,47 @@ int glue_xts_req_128bit(const struct common_glue_ctx *gctx,
nbytes = walk.nbytes;
}
if (unlikely(cts)) {
u8 *next_tweak, *final_tweak = req->iv;
struct scatterlist *src, *dst;
struct scatterlist s[2], d[2];
le128 b[2];
dst = src = scatterwalk_ffwd(s, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(d, req->dst, req->cryptlen);
if (decrypt) {
next_tweak = memcpy(b, req->iv, XTS_BLOCK_SIZE);
gf128mul_x_ble(b, b);
} else {
next_tweak = req->iv;
}
skcipher_request_set_crypt(&subreq, src, dst, XTS_BLOCK_SIZE,
next_tweak);
err = skcipher_walk_virt(&walk, req, false) ?:
skcipher_walk_done(&walk,
__glue_xts_req_128bit(gctx, crypt_ctx, &walk));
if (err)
goto out;
scatterwalk_map_and_copy(b, dst, 0, XTS_BLOCK_SIZE, 0);
memcpy(b + 1, b, tail - XTS_BLOCK_SIZE);
scatterwalk_map_and_copy(b, src, XTS_BLOCK_SIZE,
tail - XTS_BLOCK_SIZE, 0);
scatterwalk_map_and_copy(b, dst, 0, tail, 1);
skcipher_request_set_crypt(&subreq, dst, dst, XTS_BLOCK_SIZE,
final_tweak);
err = skcipher_walk_virt(&walk, req, false) ?:
skcipher_walk_done(&walk,
__glue_xts_req_128bit(gctx, crypt_ctx, &walk));
}
out:
glue_fpu_end(fpu_enabled);
return err;

View File

@ -1,619 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* AVX2 implementation of MORUS-1280
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <linux/linkage.h>
#include <asm/frame.h>
#define SHUFFLE_MASK(i0, i1, i2, i3) \
(i0 | (i1 << 2) | (i2 << 4) | (i3 << 6))
#define MASK1 SHUFFLE_MASK(3, 0, 1, 2)
#define MASK2 SHUFFLE_MASK(2, 3, 0, 1)
#define MASK3 SHUFFLE_MASK(1, 2, 3, 0)
#define STATE0 %ymm0
#define STATE0_LOW %xmm0
#define STATE1 %ymm1
#define STATE2 %ymm2
#define STATE3 %ymm3
#define STATE4 %ymm4
#define KEY %ymm5
#define MSG %ymm5
#define MSG_LOW %xmm5
#define T0 %ymm6
#define T0_LOW %xmm6
#define T1 %ymm7
.section .rodata.cst32.morus1280_const, "aM", @progbits, 32
.align 32
.Lmorus1280_const:
.byte 0x00, 0x01, 0x01, 0x02, 0x03, 0x05, 0x08, 0x0d
.byte 0x15, 0x22, 0x37, 0x59, 0x90, 0xe9, 0x79, 0x62
.byte 0xdb, 0x3d, 0x18, 0x55, 0x6d, 0xc2, 0x2f, 0xf1
.byte 0x20, 0x11, 0x31, 0x42, 0x73, 0xb5, 0x28, 0xdd
.section .rodata.cst32.morus1280_counter, "aM", @progbits, 32
.align 32
.Lmorus1280_counter:
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.byte 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17
.byte 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
.text
.macro morus1280_round s0, s1, s2, s3, s4, b, w
vpand \s1, \s2, T0
vpxor T0, \s0, \s0
vpxor \s3, \s0, \s0
vpsllq $\b, \s0, T0
vpsrlq $(64 - \b), \s0, \s0
vpxor T0, \s0, \s0
vpermq $\w, \s3, \s3
.endm
/*
* __morus1280_update: internal ABI
* input:
* STATE[0-4] - input state
* MSG - message block
* output:
* STATE[0-4] - output state
* changed:
* T0
*/
__morus1280_update:
morus1280_round STATE0, STATE1, STATE2, STATE3, STATE4, 13, MASK1
vpxor MSG, STATE1, STATE1
morus1280_round STATE1, STATE2, STATE3, STATE4, STATE0, 46, MASK2
vpxor MSG, STATE2, STATE2
morus1280_round STATE2, STATE3, STATE4, STATE0, STATE1, 38, MASK3
vpxor MSG, STATE3, STATE3
morus1280_round STATE3, STATE4, STATE0, STATE1, STATE2, 7, MASK2
vpxor MSG, STATE4, STATE4
morus1280_round STATE4, STATE0, STATE1, STATE2, STATE3, 4, MASK1
ret
ENDPROC(__morus1280_update)
/*
* __morus1280_update_zero: internal ABI
* input:
* STATE[0-4] - input state
* output:
* STATE[0-4] - output state
* changed:
* T0
*/
__morus1280_update_zero:
morus1280_round STATE0, STATE1, STATE2, STATE3, STATE4, 13, MASK1
morus1280_round STATE1, STATE2, STATE3, STATE4, STATE0, 46, MASK2
morus1280_round STATE2, STATE3, STATE4, STATE0, STATE1, 38, MASK3
morus1280_round STATE3, STATE4, STATE0, STATE1, STATE2, 7, MASK2
morus1280_round STATE4, STATE0, STATE1, STATE2, STATE3, 4, MASK1
ret
ENDPROC(__morus1280_update_zero)
/*
* __load_partial: internal ABI
* input:
* %rsi - src
* %rcx - bytes
* output:
* MSG - message block
* changed:
* %r8
* %r9
*/
__load_partial:
xor %r9d, %r9d
vpxor MSG, MSG, MSG
mov %rcx, %r8
and $0x1, %r8
jz .Lld_partial_1
mov %rcx, %r8
and $0x1E, %r8
add %rsi, %r8
mov (%r8), %r9b
.Lld_partial_1:
mov %rcx, %r8
and $0x2, %r8
jz .Lld_partial_2
mov %rcx, %r8
and $0x1C, %r8
add %rsi, %r8
shl $16, %r9
mov (%r8), %r9w
.Lld_partial_2:
mov %rcx, %r8
and $0x4, %r8
jz .Lld_partial_4
mov %rcx, %r8
and $0x18, %r8
add %rsi, %r8
shl $32, %r9
mov (%r8), %r8d
xor %r8, %r9
.Lld_partial_4:
movq %r9, MSG_LOW
mov %rcx, %r8
and $0x8, %r8
jz .Lld_partial_8
mov %rcx, %r8
and $0x10, %r8
add %rsi, %r8
pshufd $MASK2, MSG_LOW, MSG_LOW
pinsrq $0, (%r8), MSG_LOW
.Lld_partial_8:
mov %rcx, %r8
and $0x10, %r8
jz .Lld_partial_16
vpermq $MASK2, MSG, MSG
movdqu (%rsi), MSG_LOW
.Lld_partial_16:
ret
ENDPROC(__load_partial)
/*
* __store_partial: internal ABI
* input:
* %rdx - dst
* %rcx - bytes
* output:
* T0 - message block
* changed:
* %r8
* %r9
* %r10
*/
__store_partial:
mov %rcx, %r8
mov %rdx, %r9
cmp $16, %r8
jl .Lst_partial_16
movdqu T0_LOW, (%r9)
vpermq $MASK2, T0, T0
sub $16, %r8
add $16, %r9
.Lst_partial_16:
movq T0_LOW, %r10
cmp $8, %r8
jl .Lst_partial_8
mov %r10, (%r9)
pextrq $1, T0_LOW, %r10
sub $8, %r8
add $8, %r9
.Lst_partial_8:
cmp $4, %r8
jl .Lst_partial_4
mov %r10d, (%r9)
shr $32, %r10
sub $4, %r8
add $4, %r9
.Lst_partial_4:
cmp $2, %r8
jl .Lst_partial_2
mov %r10w, (%r9)
shr $16, %r10
sub $2, %r8
add $2, %r9
.Lst_partial_2:
cmp $1, %r8
jl .Lst_partial_1
mov %r10b, (%r9)
.Lst_partial_1:
ret
ENDPROC(__store_partial)
/*
* void crypto_morus1280_avx2_init(void *state, const void *key,
* const void *iv);
*/
ENTRY(crypto_morus1280_avx2_init)
FRAME_BEGIN
/* load IV: */
vpxor STATE0, STATE0, STATE0
movdqu (%rdx), STATE0_LOW
/* load key: */
vmovdqu (%rsi), KEY
vmovdqa KEY, STATE1
/* load all ones: */
vpcmpeqd STATE2, STATE2, STATE2
/* load all zeros: */
vpxor STATE3, STATE3, STATE3
/* load the constant: */
vmovdqa .Lmorus1280_const, STATE4
/* update 16 times with zero: */
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
/* xor-in the key again after updates: */
vpxor KEY, STATE1, STATE1
/* store the state: */
vmovdqu STATE0, (0 * 32)(%rdi)
vmovdqu STATE1, (1 * 32)(%rdi)
vmovdqu STATE2, (2 * 32)(%rdi)
vmovdqu STATE3, (3 * 32)(%rdi)
vmovdqu STATE4, (4 * 32)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_init)
/*
* void crypto_morus1280_avx2_ad(void *state, const void *data,
* unsigned int length);
*/
ENTRY(crypto_morus1280_avx2_ad)
FRAME_BEGIN
cmp $32, %rdx
jb .Lad_out
/* load the state: */
vmovdqu (0 * 32)(%rdi), STATE0
vmovdqu (1 * 32)(%rdi), STATE1
vmovdqu (2 * 32)(%rdi), STATE2
vmovdqu (3 * 32)(%rdi), STATE3
vmovdqu (4 * 32)(%rdi), STATE4
mov %rsi, %r8
and $0x1F, %r8
jnz .Lad_u_loop
.align 4
.Lad_a_loop:
vmovdqa (%rsi), MSG
call __morus1280_update
sub $32, %rdx
add $32, %rsi
cmp $32, %rdx
jge .Lad_a_loop
jmp .Lad_cont
.align 4
.Lad_u_loop:
vmovdqu (%rsi), MSG
call __morus1280_update
sub $32, %rdx
add $32, %rsi
cmp $32, %rdx
jge .Lad_u_loop
.Lad_cont:
/* store the state: */
vmovdqu STATE0, (0 * 32)(%rdi)
vmovdqu STATE1, (1 * 32)(%rdi)
vmovdqu STATE2, (2 * 32)(%rdi)
vmovdqu STATE3, (3 * 32)(%rdi)
vmovdqu STATE4, (4 * 32)(%rdi)
.Lad_out:
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_ad)
/*
* void crypto_morus1280_avx2_enc(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_avx2_enc)
FRAME_BEGIN
cmp $32, %rcx
jb .Lenc_out
/* load the state: */
vmovdqu (0 * 32)(%rdi), STATE0
vmovdqu (1 * 32)(%rdi), STATE1
vmovdqu (2 * 32)(%rdi), STATE2
vmovdqu (3 * 32)(%rdi), STATE3
vmovdqu (4 * 32)(%rdi), STATE4
mov %rsi, %r8
or %rdx, %r8
and $0x1F, %r8
jnz .Lenc_u_loop
.align 4
.Lenc_a_loop:
vmovdqa (%rsi), MSG
vmovdqa MSG, T0
vpxor STATE0, T0, T0
vpermq $MASK3, STATE1, T1
vpxor T1, T0, T0
vpand STATE2, STATE3, T1
vpxor T1, T0, T0
vmovdqa T0, (%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Lenc_a_loop
jmp .Lenc_cont
.align 4
.Lenc_u_loop:
vmovdqu (%rsi), MSG
vmovdqa MSG, T0
vpxor STATE0, T0, T0
vpermq $MASK3, STATE1, T1
vpxor T1, T0, T0
vpand STATE2, STATE3, T1
vpxor T1, T0, T0
vmovdqu T0, (%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Lenc_u_loop
.Lenc_cont:
/* store the state: */
vmovdqu STATE0, (0 * 32)(%rdi)
vmovdqu STATE1, (1 * 32)(%rdi)
vmovdqu STATE2, (2 * 32)(%rdi)
vmovdqu STATE3, (3 * 32)(%rdi)
vmovdqu STATE4, (4 * 32)(%rdi)
.Lenc_out:
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_enc)
/*
* void crypto_morus1280_avx2_enc_tail(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_avx2_enc_tail)
FRAME_BEGIN
/* load the state: */
vmovdqu (0 * 32)(%rdi), STATE0
vmovdqu (1 * 32)(%rdi), STATE1
vmovdqu (2 * 32)(%rdi), STATE2
vmovdqu (3 * 32)(%rdi), STATE3
vmovdqu (4 * 32)(%rdi), STATE4
/* encrypt message: */
call __load_partial
vmovdqa MSG, T0
vpxor STATE0, T0, T0
vpermq $MASK3, STATE1, T1
vpxor T1, T0, T0
vpand STATE2, STATE3, T1
vpxor T1, T0, T0
call __store_partial
call __morus1280_update
/* store the state: */
vmovdqu STATE0, (0 * 32)(%rdi)
vmovdqu STATE1, (1 * 32)(%rdi)
vmovdqu STATE2, (2 * 32)(%rdi)
vmovdqu STATE3, (3 * 32)(%rdi)
vmovdqu STATE4, (4 * 32)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_enc_tail)
/*
* void crypto_morus1280_avx2_dec(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_avx2_dec)
FRAME_BEGIN
cmp $32, %rcx
jb .Ldec_out
/* load the state: */
vmovdqu (0 * 32)(%rdi), STATE0
vmovdqu (1 * 32)(%rdi), STATE1
vmovdqu (2 * 32)(%rdi), STATE2
vmovdqu (3 * 32)(%rdi), STATE3
vmovdqu (4 * 32)(%rdi), STATE4
mov %rsi, %r8
or %rdx, %r8
and $0x1F, %r8
jnz .Ldec_u_loop
.align 4
.Ldec_a_loop:
vmovdqa (%rsi), MSG
vpxor STATE0, MSG, MSG
vpermq $MASK3, STATE1, T0
vpxor T0, MSG, MSG
vpand STATE2, STATE3, T0
vpxor T0, MSG, MSG
vmovdqa MSG, (%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Ldec_a_loop
jmp .Ldec_cont
.align 4
.Ldec_u_loop:
vmovdqu (%rsi), MSG
vpxor STATE0, MSG, MSG
vpermq $MASK3, STATE1, T0
vpxor T0, MSG, MSG
vpand STATE2, STATE3, T0
vpxor T0, MSG, MSG
vmovdqu MSG, (%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Ldec_u_loop
.Ldec_cont:
/* store the state: */
vmovdqu STATE0, (0 * 32)(%rdi)
vmovdqu STATE1, (1 * 32)(%rdi)
vmovdqu STATE2, (2 * 32)(%rdi)
vmovdqu STATE3, (3 * 32)(%rdi)
vmovdqu STATE4, (4 * 32)(%rdi)
.Ldec_out:
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_dec)
/*
* void crypto_morus1280_avx2_dec_tail(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_avx2_dec_tail)
FRAME_BEGIN
/* load the state: */
vmovdqu (0 * 32)(%rdi), STATE0
vmovdqu (1 * 32)(%rdi), STATE1
vmovdqu (2 * 32)(%rdi), STATE2
vmovdqu (3 * 32)(%rdi), STATE3
vmovdqu (4 * 32)(%rdi), STATE4
/* decrypt message: */
call __load_partial
vpxor STATE0, MSG, MSG
vpermq $MASK3, STATE1, T0
vpxor T0, MSG, MSG
vpand STATE2, STATE3, T0
vpxor T0, MSG, MSG
vmovdqa MSG, T0
call __store_partial
/* mask with byte count: */
movq %rcx, T0_LOW
vpbroadcastb T0_LOW, T0
vmovdqa .Lmorus1280_counter, T1
vpcmpgtb T1, T0, T0
vpand T0, MSG, MSG
call __morus1280_update
/* store the state: */
vmovdqu STATE0, (0 * 32)(%rdi)
vmovdqu STATE1, (1 * 32)(%rdi)
vmovdqu STATE2, (2 * 32)(%rdi)
vmovdqu STATE3, (3 * 32)(%rdi)
vmovdqu STATE4, (4 * 32)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_dec_tail)
/*
* void crypto_morus1280_avx2_final(void *state, void *tag_xor,
* u64 assoclen, u64 cryptlen);
*/
ENTRY(crypto_morus1280_avx2_final)
FRAME_BEGIN
/* load the state: */
vmovdqu (0 * 32)(%rdi), STATE0
vmovdqu (1 * 32)(%rdi), STATE1
vmovdqu (2 * 32)(%rdi), STATE2
vmovdqu (3 * 32)(%rdi), STATE3
vmovdqu (4 * 32)(%rdi), STATE4
/* xor state[0] into state[4]: */
vpxor STATE0, STATE4, STATE4
/* prepare length block: */
vpxor MSG, MSG, MSG
vpinsrq $0, %rdx, MSG_LOW, MSG_LOW
vpinsrq $1, %rcx, MSG_LOW, MSG_LOW
vpsllq $3, MSG, MSG /* multiply by 8 (to get bit count) */
/* update state: */
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
/* xor tag: */
vmovdqu (%rsi), MSG
vpxor STATE0, MSG, MSG
vpermq $MASK3, STATE1, T0
vpxor T0, MSG, MSG
vpand STATE2, STATE3, T0
vpxor T0, MSG, MSG
vmovdqu MSG, (%rsi)
FRAME_END
ret
ENDPROC(crypto_morus1280_avx2_final)

View File

@ -1,62 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-1280 Authenticated-Encryption Algorithm
* Glue for AVX2 implementation
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/morus1280_glue.h>
#include <linux/module.h>
#include <asm/fpu/api.h>
#include <asm/cpu_device_id.h>
asmlinkage void crypto_morus1280_avx2_init(void *state, const void *key,
const void *iv);
asmlinkage void crypto_morus1280_avx2_ad(void *state, const void *data,
unsigned int length);
asmlinkage void crypto_morus1280_avx2_enc(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_avx2_dec(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_avx2_enc_tail(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_avx2_dec_tail(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_avx2_final(void *state, void *tag_xor,
u64 assoclen, u64 cryptlen);
MORUS1280_DECLARE_ALG(avx2, "morus1280-avx2", 400);
static struct simd_aead_alg *simd_alg;
static int __init crypto_morus1280_avx2_module_init(void)
{
if (!boot_cpu_has(X86_FEATURE_AVX2) ||
!boot_cpu_has(X86_FEATURE_OSXSAVE) ||
!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL))
return -ENODEV;
return simd_register_aeads_compat(&crypto_morus1280_avx2_alg, 1,
&simd_alg);
}
static void __exit crypto_morus1280_avx2_module_exit(void)
{
simd_unregister_aeads(&crypto_morus1280_avx2_alg, 1, &simd_alg);
}
module_init(crypto_morus1280_avx2_module_init);
module_exit(crypto_morus1280_avx2_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-1280 AEAD algorithm -- AVX2 implementation");
MODULE_ALIAS_CRYPTO("morus1280");
MODULE_ALIAS_CRYPTO("morus1280-avx2");

View File

@ -1,893 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* SSE2 implementation of MORUS-1280
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <linux/linkage.h>
#include <asm/frame.h>
#define SHUFFLE_MASK(i0, i1, i2, i3) \
(i0 | (i1 << 2) | (i2 << 4) | (i3 << 6))
#define MASK2 SHUFFLE_MASK(2, 3, 0, 1)
#define STATE0_LO %xmm0
#define STATE0_HI %xmm1
#define STATE1_LO %xmm2
#define STATE1_HI %xmm3
#define STATE2_LO %xmm4
#define STATE2_HI %xmm5
#define STATE3_LO %xmm6
#define STATE3_HI %xmm7
#define STATE4_LO %xmm8
#define STATE4_HI %xmm9
#define KEY_LO %xmm10
#define KEY_HI %xmm11
#define MSG_LO %xmm10
#define MSG_HI %xmm11
#define T0_LO %xmm12
#define T0_HI %xmm13
#define T1_LO %xmm14
#define T1_HI %xmm15
.section .rodata.cst16.morus640_const, "aM", @progbits, 16
.align 16
.Lmorus640_const_0:
.byte 0x00, 0x01, 0x01, 0x02, 0x03, 0x05, 0x08, 0x0d
.byte 0x15, 0x22, 0x37, 0x59, 0x90, 0xe9, 0x79, 0x62
.Lmorus640_const_1:
.byte 0xdb, 0x3d, 0x18, 0x55, 0x6d, 0xc2, 0x2f, 0xf1
.byte 0x20, 0x11, 0x31, 0x42, 0x73, 0xb5, 0x28, 0xdd
.section .rodata.cst16.morus640_counter, "aM", @progbits, 16
.align 16
.Lmorus640_counter_0:
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.Lmorus640_counter_1:
.byte 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17
.byte 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
.text
.macro rol1 hi, lo
/*
* HI_1 | HI_0 || LO_1 | LO_0
* ==>
* HI_0 | HI_1 || LO_1 | LO_0
* ==>
* HI_0 | LO_1 || LO_0 | HI_1
*/
pshufd $MASK2, \hi, \hi
movdqa \hi, T0_LO
punpcklqdq \lo, T0_LO
punpckhqdq \hi, \lo
movdqa \lo, \hi
movdqa T0_LO, \lo
.endm
.macro rol2 hi, lo
movdqa \lo, T0_LO
movdqa \hi, \lo
movdqa T0_LO, \hi
.endm
.macro rol3 hi, lo
/*
* HI_1 | HI_0 || LO_1 | LO_0
* ==>
* HI_0 | HI_1 || LO_1 | LO_0
* ==>
* LO_0 | HI_1 || HI_0 | LO_1
*/
pshufd $MASK2, \hi, \hi
movdqa \lo, T0_LO
punpckhqdq \hi, T0_LO
punpcklqdq \lo, \hi
movdqa T0_LO, \lo
.endm
.macro morus1280_round s0_l, s0_h, s1_l, s1_h, s2_l, s2_h, s3_l, s3_h, s4_l, s4_h, b, w
movdqa \s1_l, T0_LO
pand \s2_l, T0_LO
pxor T0_LO, \s0_l
movdqa \s1_h, T0_LO
pand \s2_h, T0_LO
pxor T0_LO, \s0_h
pxor \s3_l, \s0_l
pxor \s3_h, \s0_h
movdqa \s0_l, T0_LO
psllq $\b, T0_LO
psrlq $(64 - \b), \s0_l
pxor T0_LO, \s0_l
movdqa \s0_h, T0_LO
psllq $\b, T0_LO
psrlq $(64 - \b), \s0_h
pxor T0_LO, \s0_h
\w \s3_h, \s3_l
.endm
/*
* __morus1280_update: internal ABI
* input:
* STATE[0-4] - input state
* MSG - message block
* output:
* STATE[0-4] - output state
* changed:
* T0
*/
__morus1280_update:
morus1280_round \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
13, rol1
pxor MSG_LO, STATE1_LO
pxor MSG_HI, STATE1_HI
morus1280_round \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
46, rol2
pxor MSG_LO, STATE2_LO
pxor MSG_HI, STATE2_HI
morus1280_round \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
38, rol3
pxor MSG_LO, STATE3_LO
pxor MSG_HI, STATE3_HI
morus1280_round \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
7, rol2
pxor MSG_LO, STATE4_LO
pxor MSG_HI, STATE4_HI
morus1280_round \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
4, rol1
ret
ENDPROC(__morus1280_update)
/*
* __morus1280_update_zero: internal ABI
* input:
* STATE[0-4] - input state
* output:
* STATE[0-4] - output state
* changed:
* T0
*/
__morus1280_update_zero:
morus1280_round \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
13, rol1
morus1280_round \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
46, rol2
morus1280_round \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
38, rol3
morus1280_round \
STATE3_LO, STATE3_HI, \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
7, rol2
morus1280_round \
STATE4_LO, STATE4_HI, \
STATE0_LO, STATE0_HI, \
STATE1_LO, STATE1_HI, \
STATE2_LO, STATE2_HI, \
STATE3_LO, STATE3_HI, \
4, rol1
ret
ENDPROC(__morus1280_update_zero)
/*
* __load_partial: internal ABI
* input:
* %rsi - src
* %rcx - bytes
* output:
* MSG - message block
* changed:
* %r8
* %r9
*/
__load_partial:
xor %r9d, %r9d
pxor MSG_LO, MSG_LO
pxor MSG_HI, MSG_HI
mov %rcx, %r8
and $0x1, %r8
jz .Lld_partial_1
mov %rcx, %r8
and $0x1E, %r8
add %rsi, %r8
mov (%r8), %r9b
.Lld_partial_1:
mov %rcx, %r8
and $0x2, %r8
jz .Lld_partial_2
mov %rcx, %r8
and $0x1C, %r8
add %rsi, %r8
shl $16, %r9
mov (%r8), %r9w
.Lld_partial_2:
mov %rcx, %r8
and $0x4, %r8
jz .Lld_partial_4
mov %rcx, %r8
and $0x18, %r8
add %rsi, %r8
shl $32, %r9
mov (%r8), %r8d
xor %r8, %r9
.Lld_partial_4:
movq %r9, MSG_LO
mov %rcx, %r8
and $0x8, %r8
jz .Lld_partial_8
mov %rcx, %r8
and $0x10, %r8
add %rsi, %r8
pslldq $8, MSG_LO
movq (%r8), T0_LO
pxor T0_LO, MSG_LO
.Lld_partial_8:
mov %rcx, %r8
and $0x10, %r8
jz .Lld_partial_16
movdqa MSG_LO, MSG_HI
movdqu (%rsi), MSG_LO
.Lld_partial_16:
ret
ENDPROC(__load_partial)
/*
* __store_partial: internal ABI
* input:
* %rdx - dst
* %rcx - bytes
* output:
* T0 - message block
* changed:
* %r8
* %r9
* %r10
*/
__store_partial:
mov %rcx, %r8
mov %rdx, %r9
cmp $16, %r8
jl .Lst_partial_16
movdqu T0_LO, (%r9)
movdqa T0_HI, T0_LO
sub $16, %r8
add $16, %r9
.Lst_partial_16:
movq T0_LO, %r10
cmp $8, %r8
jl .Lst_partial_8
mov %r10, (%r9)
psrldq $8, T0_LO
movq T0_LO, %r10
sub $8, %r8
add $8, %r9
.Lst_partial_8:
cmp $4, %r8
jl .Lst_partial_4
mov %r10d, (%r9)
shr $32, %r10
sub $4, %r8
add $4, %r9
.Lst_partial_4:
cmp $2, %r8
jl .Lst_partial_2
mov %r10w, (%r9)
shr $16, %r10
sub $2, %r8
add $2, %r9
.Lst_partial_2:
cmp $1, %r8
jl .Lst_partial_1
mov %r10b, (%r9)
.Lst_partial_1:
ret
ENDPROC(__store_partial)
/*
* void crypto_morus1280_sse2_init(void *state, const void *key,
* const void *iv);
*/
ENTRY(crypto_morus1280_sse2_init)
FRAME_BEGIN
/* load IV: */
pxor STATE0_HI, STATE0_HI
movdqu (%rdx), STATE0_LO
/* load key: */
movdqu 0(%rsi), KEY_LO
movdqu 16(%rsi), KEY_HI
movdqa KEY_LO, STATE1_LO
movdqa KEY_HI, STATE1_HI
/* load all ones: */
pcmpeqd STATE2_LO, STATE2_LO
pcmpeqd STATE2_HI, STATE2_HI
/* load all zeros: */
pxor STATE3_LO, STATE3_LO
pxor STATE3_HI, STATE3_HI
/* load the constant: */
movdqa .Lmorus640_const_0, STATE4_LO
movdqa .Lmorus640_const_1, STATE4_HI
/* update 16 times with zero: */
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
call __morus1280_update_zero
/* xor-in the key again after updates: */
pxor KEY_LO, STATE1_LO
pxor KEY_HI, STATE1_HI
/* store the state: */
movdqu STATE0_LO, (0 * 16)(%rdi)
movdqu STATE0_HI, (1 * 16)(%rdi)
movdqu STATE1_LO, (2 * 16)(%rdi)
movdqu STATE1_HI, (3 * 16)(%rdi)
movdqu STATE2_LO, (4 * 16)(%rdi)
movdqu STATE2_HI, (5 * 16)(%rdi)
movdqu STATE3_LO, (6 * 16)(%rdi)
movdqu STATE3_HI, (7 * 16)(%rdi)
movdqu STATE4_LO, (8 * 16)(%rdi)
movdqu STATE4_HI, (9 * 16)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_init)
/*
* void crypto_morus1280_sse2_ad(void *state, const void *data,
* unsigned int length);
*/
ENTRY(crypto_morus1280_sse2_ad)
FRAME_BEGIN
cmp $32, %rdx
jb .Lad_out
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0_LO
movdqu (1 * 16)(%rdi), STATE0_HI
movdqu (2 * 16)(%rdi), STATE1_LO
movdqu (3 * 16)(%rdi), STATE1_HI
movdqu (4 * 16)(%rdi), STATE2_LO
movdqu (5 * 16)(%rdi), STATE2_HI
movdqu (6 * 16)(%rdi), STATE3_LO
movdqu (7 * 16)(%rdi), STATE3_HI
movdqu (8 * 16)(%rdi), STATE4_LO
movdqu (9 * 16)(%rdi), STATE4_HI
mov %rsi, %r8
and $0xF, %r8
jnz .Lad_u_loop
.align 4
.Lad_a_loop:
movdqa 0(%rsi), MSG_LO
movdqa 16(%rsi), MSG_HI
call __morus1280_update
sub $32, %rdx
add $32, %rsi
cmp $32, %rdx
jge .Lad_a_loop
jmp .Lad_cont
.align 4
.Lad_u_loop:
movdqu 0(%rsi), MSG_LO
movdqu 16(%rsi), MSG_HI
call __morus1280_update
sub $32, %rdx
add $32, %rsi
cmp $32, %rdx
jge .Lad_u_loop
.Lad_cont:
/* store the state: */
movdqu STATE0_LO, (0 * 16)(%rdi)
movdqu STATE0_HI, (1 * 16)(%rdi)
movdqu STATE1_LO, (2 * 16)(%rdi)
movdqu STATE1_HI, (3 * 16)(%rdi)
movdqu STATE2_LO, (4 * 16)(%rdi)
movdqu STATE2_HI, (5 * 16)(%rdi)
movdqu STATE3_LO, (6 * 16)(%rdi)
movdqu STATE3_HI, (7 * 16)(%rdi)
movdqu STATE4_LO, (8 * 16)(%rdi)
movdqu STATE4_HI, (9 * 16)(%rdi)
.Lad_out:
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_ad)
/*
* void crypto_morus1280_sse2_enc(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_sse2_enc)
FRAME_BEGIN
cmp $32, %rcx
jb .Lenc_out
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0_LO
movdqu (1 * 16)(%rdi), STATE0_HI
movdqu (2 * 16)(%rdi), STATE1_LO
movdqu (3 * 16)(%rdi), STATE1_HI
movdqu (4 * 16)(%rdi), STATE2_LO
movdqu (5 * 16)(%rdi), STATE2_HI
movdqu (6 * 16)(%rdi), STATE3_LO
movdqu (7 * 16)(%rdi), STATE3_HI
movdqu (8 * 16)(%rdi), STATE4_LO
movdqu (9 * 16)(%rdi), STATE4_HI
mov %rsi, %r8
or %rdx, %r8
and $0xF, %r8
jnz .Lenc_u_loop
.align 4
.Lenc_a_loop:
movdqa 0(%rsi), MSG_LO
movdqa 16(%rsi), MSG_HI
movdqa STATE1_LO, T1_LO
movdqa STATE1_HI, T1_HI
rol3 T1_HI, T1_LO
movdqa MSG_LO, T0_LO
movdqa MSG_HI, T0_HI
pxor T1_LO, T0_LO
pxor T1_HI, T0_HI
pxor STATE0_LO, T0_LO
pxor STATE0_HI, T0_HI
movdqa STATE2_LO, T1_LO
movdqa STATE2_HI, T1_HI
pand STATE3_LO, T1_LO
pand STATE3_HI, T1_HI
pxor T1_LO, T0_LO
pxor T1_HI, T0_HI
movdqa T0_LO, 0(%rdx)
movdqa T0_HI, 16(%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Lenc_a_loop
jmp .Lenc_cont
.align 4
.Lenc_u_loop:
movdqu 0(%rsi), MSG_LO
movdqu 16(%rsi), MSG_HI
movdqa STATE1_LO, T1_LO
movdqa STATE1_HI, T1_HI
rol3 T1_HI, T1_LO
movdqa MSG_LO, T0_LO
movdqa MSG_HI, T0_HI
pxor T1_LO, T0_LO
pxor T1_HI, T0_HI
pxor STATE0_LO, T0_LO
pxor STATE0_HI, T0_HI
movdqa STATE2_LO, T1_LO
movdqa STATE2_HI, T1_HI
pand STATE3_LO, T1_LO
pand STATE3_HI, T1_HI
pxor T1_LO, T0_LO
pxor T1_HI, T0_HI
movdqu T0_LO, 0(%rdx)
movdqu T0_HI, 16(%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Lenc_u_loop
.Lenc_cont:
/* store the state: */
movdqu STATE0_LO, (0 * 16)(%rdi)
movdqu STATE0_HI, (1 * 16)(%rdi)
movdqu STATE1_LO, (2 * 16)(%rdi)
movdqu STATE1_HI, (3 * 16)(%rdi)
movdqu STATE2_LO, (4 * 16)(%rdi)
movdqu STATE2_HI, (5 * 16)(%rdi)
movdqu STATE3_LO, (6 * 16)(%rdi)
movdqu STATE3_HI, (7 * 16)(%rdi)
movdqu STATE4_LO, (8 * 16)(%rdi)
movdqu STATE4_HI, (9 * 16)(%rdi)
.Lenc_out:
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_enc)
/*
* void crypto_morus1280_sse2_enc_tail(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_sse2_enc_tail)
FRAME_BEGIN
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0_LO
movdqu (1 * 16)(%rdi), STATE0_HI
movdqu (2 * 16)(%rdi), STATE1_LO
movdqu (3 * 16)(%rdi), STATE1_HI
movdqu (4 * 16)(%rdi), STATE2_LO
movdqu (5 * 16)(%rdi), STATE2_HI
movdqu (6 * 16)(%rdi), STATE3_LO
movdqu (7 * 16)(%rdi), STATE3_HI
movdqu (8 * 16)(%rdi), STATE4_LO
movdqu (9 * 16)(%rdi), STATE4_HI
/* encrypt message: */
call __load_partial
movdqa STATE1_LO, T1_LO
movdqa STATE1_HI, T1_HI
rol3 T1_HI, T1_LO
movdqa MSG_LO, T0_LO
movdqa MSG_HI, T0_HI
pxor T1_LO, T0_LO
pxor T1_HI, T0_HI
pxor STATE0_LO, T0_LO
pxor STATE0_HI, T0_HI
movdqa STATE2_LO, T1_LO
movdqa STATE2_HI, T1_HI
pand STATE3_LO, T1_LO
pand STATE3_HI, T1_HI
pxor T1_LO, T0_LO
pxor T1_HI, T0_HI
call __store_partial
call __morus1280_update
/* store the state: */
movdqu STATE0_LO, (0 * 16)(%rdi)
movdqu STATE0_HI, (1 * 16)(%rdi)
movdqu STATE1_LO, (2 * 16)(%rdi)
movdqu STATE1_HI, (3 * 16)(%rdi)
movdqu STATE2_LO, (4 * 16)(%rdi)
movdqu STATE2_HI, (5 * 16)(%rdi)
movdqu STATE3_LO, (6 * 16)(%rdi)
movdqu STATE3_HI, (7 * 16)(%rdi)
movdqu STATE4_LO, (8 * 16)(%rdi)
movdqu STATE4_HI, (9 * 16)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_enc_tail)
/*
* void crypto_morus1280_sse2_dec(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_sse2_dec)
FRAME_BEGIN
cmp $32, %rcx
jb .Ldec_out
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0_LO
movdqu (1 * 16)(%rdi), STATE0_HI
movdqu (2 * 16)(%rdi), STATE1_LO
movdqu (3 * 16)(%rdi), STATE1_HI
movdqu (4 * 16)(%rdi), STATE2_LO
movdqu (5 * 16)(%rdi), STATE2_HI
movdqu (6 * 16)(%rdi), STATE3_LO
movdqu (7 * 16)(%rdi), STATE3_HI
movdqu (8 * 16)(%rdi), STATE4_LO
movdqu (9 * 16)(%rdi), STATE4_HI
mov %rsi, %r8
or %rdx, %r8
and $0xF, %r8
jnz .Ldec_u_loop
.align 4
.Ldec_a_loop:
movdqa 0(%rsi), MSG_LO
movdqa 16(%rsi), MSG_HI
pxor STATE0_LO, MSG_LO
pxor STATE0_HI, MSG_HI
movdqa STATE1_LO, T1_LO
movdqa STATE1_HI, T1_HI
rol3 T1_HI, T1_LO
pxor T1_LO, MSG_LO
pxor T1_HI, MSG_HI
movdqa STATE2_LO, T1_LO
movdqa STATE2_HI, T1_HI
pand STATE3_LO, T1_LO
pand STATE3_HI, T1_HI
pxor T1_LO, MSG_LO
pxor T1_HI, MSG_HI
movdqa MSG_LO, 0(%rdx)
movdqa MSG_HI, 16(%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Ldec_a_loop
jmp .Ldec_cont
.align 4
.Ldec_u_loop:
movdqu 0(%rsi), MSG_LO
movdqu 16(%rsi), MSG_HI
pxor STATE0_LO, MSG_LO
pxor STATE0_HI, MSG_HI
movdqa STATE1_LO, T1_LO
movdqa STATE1_HI, T1_HI
rol3 T1_HI, T1_LO
pxor T1_LO, MSG_LO
pxor T1_HI, MSG_HI
movdqa STATE2_LO, T1_LO
movdqa STATE2_HI, T1_HI
pand STATE3_LO, T1_LO
pand STATE3_HI, T1_HI
pxor T1_LO, MSG_LO
pxor T1_HI, MSG_HI
movdqu MSG_LO, 0(%rdx)
movdqu MSG_HI, 16(%rdx)
call __morus1280_update
sub $32, %rcx
add $32, %rsi
add $32, %rdx
cmp $32, %rcx
jge .Ldec_u_loop
.Ldec_cont:
/* store the state: */
movdqu STATE0_LO, (0 * 16)(%rdi)
movdqu STATE0_HI, (1 * 16)(%rdi)
movdqu STATE1_LO, (2 * 16)(%rdi)
movdqu STATE1_HI, (3 * 16)(%rdi)
movdqu STATE2_LO, (4 * 16)(%rdi)
movdqu STATE2_HI, (5 * 16)(%rdi)
movdqu STATE3_LO, (6 * 16)(%rdi)
movdqu STATE3_HI, (7 * 16)(%rdi)
movdqu STATE4_LO, (8 * 16)(%rdi)
movdqu STATE4_HI, (9 * 16)(%rdi)
.Ldec_out:
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_dec)
/*
* void crypto_morus1280_sse2_dec_tail(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus1280_sse2_dec_tail)
FRAME_BEGIN
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0_LO
movdqu (1 * 16)(%rdi), STATE0_HI
movdqu (2 * 16)(%rdi), STATE1_LO
movdqu (3 * 16)(%rdi), STATE1_HI
movdqu (4 * 16)(%rdi), STATE2_LO
movdqu (5 * 16)(%rdi), STATE2_HI
movdqu (6 * 16)(%rdi), STATE3_LO
movdqu (7 * 16)(%rdi), STATE3_HI
movdqu (8 * 16)(%rdi), STATE4_LO
movdqu (9 * 16)(%rdi), STATE4_HI
/* decrypt message: */
call __load_partial
pxor STATE0_LO, MSG_LO
pxor STATE0_HI, MSG_HI
movdqa STATE1_LO, T1_LO
movdqa STATE1_HI, T1_HI
rol3 T1_HI, T1_LO
pxor T1_LO, MSG_LO
pxor T1_HI, MSG_HI
movdqa STATE2_LO, T1_LO
movdqa STATE2_HI, T1_HI
pand STATE3_LO, T1_LO
pand STATE3_HI, T1_HI
pxor T1_LO, MSG_LO
pxor T1_HI, MSG_HI
movdqa MSG_LO, T0_LO
movdqa MSG_HI, T0_HI
call __store_partial
/* mask with byte count: */
movq %rcx, T0_LO
punpcklbw T0_LO, T0_LO
punpcklbw T0_LO, T0_LO
punpcklbw T0_LO, T0_LO
punpcklbw T0_LO, T0_LO
movdqa T0_LO, T0_HI
movdqa .Lmorus640_counter_0, T1_LO
movdqa .Lmorus640_counter_1, T1_HI
pcmpgtb T1_LO, T0_LO
pcmpgtb T1_HI, T0_HI
pand T0_LO, MSG_LO
pand T0_HI, MSG_HI
call __morus1280_update
/* store the state: */
movdqu STATE0_LO, (0 * 16)(%rdi)
movdqu STATE0_HI, (1 * 16)(%rdi)
movdqu STATE1_LO, (2 * 16)(%rdi)
movdqu STATE1_HI, (3 * 16)(%rdi)
movdqu STATE2_LO, (4 * 16)(%rdi)
movdqu STATE2_HI, (5 * 16)(%rdi)
movdqu STATE3_LO, (6 * 16)(%rdi)
movdqu STATE3_HI, (7 * 16)(%rdi)
movdqu STATE4_LO, (8 * 16)(%rdi)
movdqu STATE4_HI, (9 * 16)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_dec_tail)
/*
* void crypto_morus1280_sse2_final(void *state, void *tag_xor,
* u64 assoclen, u64 cryptlen);
*/
ENTRY(crypto_morus1280_sse2_final)
FRAME_BEGIN
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0_LO
movdqu (1 * 16)(%rdi), STATE0_HI
movdqu (2 * 16)(%rdi), STATE1_LO
movdqu (3 * 16)(%rdi), STATE1_HI
movdqu (4 * 16)(%rdi), STATE2_LO
movdqu (5 * 16)(%rdi), STATE2_HI
movdqu (6 * 16)(%rdi), STATE3_LO
movdqu (7 * 16)(%rdi), STATE3_HI
movdqu (8 * 16)(%rdi), STATE4_LO
movdqu (9 * 16)(%rdi), STATE4_HI
/* xor state[0] into state[4]: */
pxor STATE0_LO, STATE4_LO
pxor STATE0_HI, STATE4_HI
/* prepare length block: */
movq %rdx, MSG_LO
movq %rcx, T0_LO
pslldq $8, T0_LO
pxor T0_LO, MSG_LO
psllq $3, MSG_LO /* multiply by 8 (to get bit count) */
pxor MSG_HI, MSG_HI
/* update state: */
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
call __morus1280_update
/* xor tag: */
movdqu 0(%rsi), MSG_LO
movdqu 16(%rsi), MSG_HI
pxor STATE0_LO, MSG_LO
pxor STATE0_HI, MSG_HI
movdqa STATE1_LO, T0_LO
movdqa STATE1_HI, T0_HI
rol3 T0_HI, T0_LO
pxor T0_LO, MSG_LO
pxor T0_HI, MSG_HI
movdqa STATE2_LO, T0_LO
movdqa STATE2_HI, T0_HI
pand STATE3_LO, T0_LO
pand STATE3_HI, T0_HI
pxor T0_LO, MSG_LO
pxor T0_HI, MSG_HI
movdqu MSG_LO, 0(%rsi)
movdqu MSG_HI, 16(%rsi)
FRAME_END
ret
ENDPROC(crypto_morus1280_sse2_final)

View File

@ -1,61 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-1280 Authenticated-Encryption Algorithm
* Glue for SSE2 implementation
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/morus1280_glue.h>
#include <linux/module.h>
#include <asm/fpu/api.h>
#include <asm/cpu_device_id.h>
asmlinkage void crypto_morus1280_sse2_init(void *state, const void *key,
const void *iv);
asmlinkage void crypto_morus1280_sse2_ad(void *state, const void *data,
unsigned int length);
asmlinkage void crypto_morus1280_sse2_enc(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_sse2_dec(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_sse2_enc_tail(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_sse2_dec_tail(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus1280_sse2_final(void *state, void *tag_xor,
u64 assoclen, u64 cryptlen);
MORUS1280_DECLARE_ALG(sse2, "morus1280-sse2", 350);
static struct simd_aead_alg *simd_alg;
static int __init crypto_morus1280_sse2_module_init(void)
{
if (!boot_cpu_has(X86_FEATURE_XMM2) ||
!cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL))
return -ENODEV;
return simd_register_aeads_compat(&crypto_morus1280_sse2_alg, 1,
&simd_alg);
}
static void __exit crypto_morus1280_sse2_module_exit(void)
{
simd_unregister_aeads(&crypto_morus1280_sse2_alg, 1, &simd_alg);
}
module_init(crypto_morus1280_sse2_module_init);
module_exit(crypto_morus1280_sse2_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-1280 AEAD algorithm -- SSE2 implementation");
MODULE_ALIAS_CRYPTO("morus1280");
MODULE_ALIAS_CRYPTO("morus1280-sse2");

View File

@ -1,205 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-1280 Authenticated-Encryption Algorithm
* Common x86 SIMD glue skeleton
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/morus1280_glue.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/scatterlist.h>
#include <asm/fpu/api.h>
struct morus1280_state {
struct morus1280_block s[MORUS_STATE_BLOCKS];
};
struct morus1280_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_blocks)(void *state, const void *src, void *dst,
unsigned int length);
void (*crypt_tail)(void *state, const void *src, void *dst,
unsigned int length);
};
static void crypto_morus1280_glue_process_ad(
struct morus1280_state *state,
const struct morus1280_glue_ops *ops,
struct scatterlist *sg_src, unsigned int assoclen)
{
struct scatter_walk walk;
struct morus1280_block buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= MORUS1280_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = MORUS1280_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
ops->ad(state, buf.bytes, MORUS1280_BLOCK_SIZE);
pos = 0;
left -= fill;
src += fill;
}
ops->ad(state, src, left);
src += left & ~(MORUS1280_BLOCK_SIZE - 1);
left &= MORUS1280_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, MORUS1280_BLOCK_SIZE - pos);
ops->ad(state, buf.bytes, MORUS1280_BLOCK_SIZE);
}
}
static void crypto_morus1280_glue_process_crypt(struct morus1280_state *state,
struct morus1280_ops ops,
struct skcipher_walk *walk)
{
while (walk->nbytes >= MORUS1280_BLOCK_SIZE) {
ops.crypt_blocks(state, walk->src.virt.addr,
walk->dst.virt.addr,
round_down(walk->nbytes,
MORUS1280_BLOCK_SIZE));
skcipher_walk_done(walk, walk->nbytes % MORUS1280_BLOCK_SIZE);
}
if (walk->nbytes) {
ops.crypt_tail(state, walk->src.virt.addr, walk->dst.virt.addr,
walk->nbytes);
skcipher_walk_done(walk, 0);
}
}
int crypto_morus1280_glue_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct morus1280_ctx *ctx = crypto_aead_ctx(aead);
if (keylen == MORUS1280_BLOCK_SIZE) {
memcpy(ctx->key.bytes, key, MORUS1280_BLOCK_SIZE);
} else if (keylen == MORUS1280_BLOCK_SIZE / 2) {
memcpy(ctx->key.bytes, key, keylen);
memcpy(ctx->key.bytes + keylen, key, keylen);
} else {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
return 0;
}
EXPORT_SYMBOL_GPL(crypto_morus1280_glue_setkey);
int crypto_morus1280_glue_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
return (authsize <= MORUS_MAX_AUTH_SIZE) ? 0 : -EINVAL;
}
EXPORT_SYMBOL_GPL(crypto_morus1280_glue_setauthsize);
static void crypto_morus1280_glue_crypt(struct aead_request *req,
struct morus1280_ops ops,
unsigned int cryptlen,
struct morus1280_block *tag_xor)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus1280_ctx *ctx = crypto_aead_ctx(tfm);
struct morus1280_state state;
struct skcipher_walk walk;
ops.skcipher_walk_init(&walk, req, true);
kernel_fpu_begin();
ctx->ops->init(&state, &ctx->key, req->iv);
crypto_morus1280_glue_process_ad(&state, ctx->ops, req->src, req->assoclen);
crypto_morus1280_glue_process_crypt(&state, ops, &walk);
ctx->ops->final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end();
}
int crypto_morus1280_glue_encrypt(struct aead_request *req)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus1280_ctx *ctx = crypto_aead_ctx(tfm);
struct morus1280_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_blocks = ctx->ops->enc,
.crypt_tail = ctx->ops->enc_tail,
};
struct morus1280_block tag = {};
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_morus1280_glue_crypt(req, OPS, cryptlen, &tag);
scatterwalk_map_and_copy(tag.bytes, req->dst,
req->assoclen + cryptlen, authsize, 1);
return 0;
}
EXPORT_SYMBOL_GPL(crypto_morus1280_glue_encrypt);
int crypto_morus1280_glue_decrypt(struct aead_request *req)
{
static const u8 zeros[MORUS1280_BLOCK_SIZE] = {};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus1280_ctx *ctx = crypto_aead_ctx(tfm);
struct morus1280_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_blocks = ctx->ops->dec,
.crypt_tail = ctx->ops->dec_tail,
};
struct morus1280_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag.bytes, req->src,
req->assoclen + cryptlen, authsize, 0);
crypto_morus1280_glue_crypt(req, OPS, cryptlen, &tag);
return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
}
EXPORT_SYMBOL_GPL(crypto_morus1280_glue_decrypt);
void crypto_morus1280_glue_init_ops(struct crypto_aead *aead,
const struct morus1280_glue_ops *ops)
{
struct morus1280_ctx *ctx = crypto_aead_ctx(aead);
ctx->ops = ops;
}
EXPORT_SYMBOL_GPL(crypto_morus1280_glue_init_ops);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-1280 AEAD mode -- glue for x86 optimizations");

View File

@ -1,612 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* SSE2 implementation of MORUS-640
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <linux/linkage.h>
#include <asm/frame.h>
#define SHUFFLE_MASK(i0, i1, i2, i3) \
(i0 | (i1 << 2) | (i2 << 4) | (i3 << 6))
#define MASK1 SHUFFLE_MASK(3, 0, 1, 2)
#define MASK2 SHUFFLE_MASK(2, 3, 0, 1)
#define MASK3 SHUFFLE_MASK(1, 2, 3, 0)
#define STATE0 %xmm0
#define STATE1 %xmm1
#define STATE2 %xmm2
#define STATE3 %xmm3
#define STATE4 %xmm4
#define KEY %xmm5
#define MSG %xmm5
#define T0 %xmm6
#define T1 %xmm7
.section .rodata.cst16.morus640_const, "aM", @progbits, 32
.align 16
.Lmorus640_const_0:
.byte 0x00, 0x01, 0x01, 0x02, 0x03, 0x05, 0x08, 0x0d
.byte 0x15, 0x22, 0x37, 0x59, 0x90, 0xe9, 0x79, 0x62
.Lmorus640_const_1:
.byte 0xdb, 0x3d, 0x18, 0x55, 0x6d, 0xc2, 0x2f, 0xf1
.byte 0x20, 0x11, 0x31, 0x42, 0x73, 0xb5, 0x28, 0xdd
.section .rodata.cst16.morus640_counter, "aM", @progbits, 16
.align 16
.Lmorus640_counter:
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.text
.macro morus640_round s0, s1, s2, s3, s4, b, w
movdqa \s1, T0
pand \s2, T0
pxor T0, \s0
pxor \s3, \s0
movdqa \s0, T0
pslld $\b, T0
psrld $(32 - \b), \s0
pxor T0, \s0
pshufd $\w, \s3, \s3
.endm
/*
* __morus640_update: internal ABI
* input:
* STATE[0-4] - input state
* MSG - message block
* output:
* STATE[0-4] - output state
* changed:
* T0
*/
__morus640_update:
morus640_round STATE0, STATE1, STATE2, STATE3, STATE4, 5, MASK1
pxor MSG, STATE1
morus640_round STATE1, STATE2, STATE3, STATE4, STATE0, 31, MASK2
pxor MSG, STATE2
morus640_round STATE2, STATE3, STATE4, STATE0, STATE1, 7, MASK3
pxor MSG, STATE3
morus640_round STATE3, STATE4, STATE0, STATE1, STATE2, 22, MASK2
pxor MSG, STATE4
morus640_round STATE4, STATE0, STATE1, STATE2, STATE3, 13, MASK1
ret
ENDPROC(__morus640_update)
/*
* __morus640_update_zero: internal ABI
* input:
* STATE[0-4] - input state
* output:
* STATE[0-4] - output state
* changed:
* T0
*/
__morus640_update_zero:
morus640_round STATE0, STATE1, STATE2, STATE3, STATE4, 5, MASK1
morus640_round STATE1, STATE2, STATE3, STATE4, STATE0, 31, MASK2
morus640_round STATE2, STATE3, STATE4, STATE0, STATE1, 7, MASK3
morus640_round STATE3, STATE4, STATE0, STATE1, STATE2, 22, MASK2
morus640_round STATE4, STATE0, STATE1, STATE2, STATE3, 13, MASK1
ret
ENDPROC(__morus640_update_zero)
/*
* __load_partial: internal ABI
* input:
* %rsi - src
* %rcx - bytes
* output:
* MSG - message block
* changed:
* T0
* %r8
* %r9
*/
__load_partial:
xor %r9d, %r9d
pxor MSG, MSG
mov %rcx, %r8
and $0x1, %r8
jz .Lld_partial_1
mov %rcx, %r8
and $0x1E, %r8
add %rsi, %r8
mov (%r8), %r9b
.Lld_partial_1:
mov %rcx, %r8
and $0x2, %r8
jz .Lld_partial_2
mov %rcx, %r8
and $0x1C, %r8
add %rsi, %r8
shl $16, %r9
mov (%r8), %r9w
.Lld_partial_2:
mov %rcx, %r8
and $0x4, %r8
jz .Lld_partial_4
mov %rcx, %r8
and $0x18, %r8
add %rsi, %r8
shl $32, %r9
mov (%r8), %r8d
xor %r8, %r9
.Lld_partial_4:
movq %r9, MSG
mov %rcx, %r8
and $0x8, %r8
jz .Lld_partial_8
mov %rcx, %r8
and $0x10, %r8
add %rsi, %r8
pslldq $8, MSG
movq (%r8), T0
pxor T0, MSG
.Lld_partial_8:
ret
ENDPROC(__load_partial)
/*
* __store_partial: internal ABI
* input:
* %rdx - dst
* %rcx - bytes
* output:
* T0 - message block
* changed:
* %r8
* %r9
* %r10
*/
__store_partial:
mov %rcx, %r8
mov %rdx, %r9
movq T0, %r10
cmp $8, %r8
jl .Lst_partial_8
mov %r10, (%r9)
psrldq $8, T0
movq T0, %r10
sub $8, %r8
add $8, %r9
.Lst_partial_8:
cmp $4, %r8
jl .Lst_partial_4
mov %r10d, (%r9)
shr $32, %r10
sub $4, %r8
add $4, %r9
.Lst_partial_4:
cmp $2, %r8
jl .Lst_partial_2
mov %r10w, (%r9)
shr $16, %r10
sub $2, %r8
add $2, %r9
.Lst_partial_2:
cmp $1, %r8
jl .Lst_partial_1
mov %r10b, (%r9)
.Lst_partial_1:
ret
ENDPROC(__store_partial)
/*
* void crypto_morus640_sse2_init(void *state, const void *key, const void *iv);
*/
ENTRY(crypto_morus640_sse2_init)
FRAME_BEGIN
/* load IV: */
movdqu (%rdx), STATE0
/* load key: */
movdqu (%rsi), KEY
movdqa KEY, STATE1
/* load all ones: */
pcmpeqd STATE2, STATE2
/* load the constants: */
movdqa .Lmorus640_const_0, STATE3
movdqa .Lmorus640_const_1, STATE4
/* update 16 times with zero: */
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
call __morus640_update_zero
/* xor-in the key again after updates: */
pxor KEY, STATE1
/* store the state: */
movdqu STATE0, (0 * 16)(%rdi)
movdqu STATE1, (1 * 16)(%rdi)
movdqu STATE2, (2 * 16)(%rdi)
movdqu STATE3, (3 * 16)(%rdi)
movdqu STATE4, (4 * 16)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_init)
/*
* void crypto_morus640_sse2_ad(void *state, const void *data,
* unsigned int length);
*/
ENTRY(crypto_morus640_sse2_ad)
FRAME_BEGIN
cmp $16, %rdx
jb .Lad_out
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0
movdqu (1 * 16)(%rdi), STATE1
movdqu (2 * 16)(%rdi), STATE2
movdqu (3 * 16)(%rdi), STATE3
movdqu (4 * 16)(%rdi), STATE4
mov %rsi, %r8
and $0xF, %r8
jnz .Lad_u_loop
.align 4
.Lad_a_loop:
movdqa (%rsi), MSG
call __morus640_update
sub $16, %rdx
add $16, %rsi
cmp $16, %rdx
jge .Lad_a_loop
jmp .Lad_cont
.align 4
.Lad_u_loop:
movdqu (%rsi), MSG
call __morus640_update
sub $16, %rdx
add $16, %rsi
cmp $16, %rdx
jge .Lad_u_loop
.Lad_cont:
/* store the state: */
movdqu STATE0, (0 * 16)(%rdi)
movdqu STATE1, (1 * 16)(%rdi)
movdqu STATE2, (2 * 16)(%rdi)
movdqu STATE3, (3 * 16)(%rdi)
movdqu STATE4, (4 * 16)(%rdi)
.Lad_out:
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_ad)
/*
* void crypto_morus640_sse2_enc(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus640_sse2_enc)
FRAME_BEGIN
cmp $16, %rcx
jb .Lenc_out
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0
movdqu (1 * 16)(%rdi), STATE1
movdqu (2 * 16)(%rdi), STATE2
movdqu (3 * 16)(%rdi), STATE3
movdqu (4 * 16)(%rdi), STATE4
mov %rsi, %r8
or %rdx, %r8
and $0xF, %r8
jnz .Lenc_u_loop
.align 4
.Lenc_a_loop:
movdqa (%rsi), MSG
movdqa MSG, T0
pxor STATE0, T0
pshufd $MASK3, STATE1, T1
pxor T1, T0
movdqa STATE2, T1
pand STATE3, T1
pxor T1, T0
movdqa T0, (%rdx)
call __morus640_update
sub $16, %rcx
add $16, %rsi
add $16, %rdx
cmp $16, %rcx
jge .Lenc_a_loop
jmp .Lenc_cont
.align 4
.Lenc_u_loop:
movdqu (%rsi), MSG
movdqa MSG, T0
pxor STATE0, T0
pshufd $MASK3, STATE1, T1
pxor T1, T0
movdqa STATE2, T1
pand STATE3, T1
pxor T1, T0
movdqu T0, (%rdx)
call __morus640_update
sub $16, %rcx
add $16, %rsi
add $16, %rdx
cmp $16, %rcx
jge .Lenc_u_loop
.Lenc_cont:
/* store the state: */
movdqu STATE0, (0 * 16)(%rdi)
movdqu STATE1, (1 * 16)(%rdi)
movdqu STATE2, (2 * 16)(%rdi)
movdqu STATE3, (3 * 16)(%rdi)
movdqu STATE4, (4 * 16)(%rdi)
.Lenc_out:
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_enc)
/*
* void crypto_morus640_sse2_enc_tail(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus640_sse2_enc_tail)
FRAME_BEGIN
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0
movdqu (1 * 16)(%rdi), STATE1
movdqu (2 * 16)(%rdi), STATE2
movdqu (3 * 16)(%rdi), STATE3
movdqu (4 * 16)(%rdi), STATE4
/* encrypt message: */
call __load_partial
movdqa MSG, T0
pxor STATE0, T0
pshufd $MASK3, STATE1, T1
pxor T1, T0
movdqa STATE2, T1
pand STATE3, T1
pxor T1, T0
call __store_partial
call __morus640_update
/* store the state: */
movdqu STATE0, (0 * 16)(%rdi)
movdqu STATE1, (1 * 16)(%rdi)
movdqu STATE2, (2 * 16)(%rdi)
movdqu STATE3, (3 * 16)(%rdi)
movdqu STATE4, (4 * 16)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_enc_tail)
/*
* void crypto_morus640_sse2_dec(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus640_sse2_dec)
FRAME_BEGIN
cmp $16, %rcx
jb .Ldec_out
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0
movdqu (1 * 16)(%rdi), STATE1
movdqu (2 * 16)(%rdi), STATE2
movdqu (3 * 16)(%rdi), STATE3
movdqu (4 * 16)(%rdi), STATE4
mov %rsi, %r8
or %rdx, %r8
and $0xF, %r8
jnz .Ldec_u_loop
.align 4
.Ldec_a_loop:
movdqa (%rsi), MSG
pxor STATE0, MSG
pshufd $MASK3, STATE1, T0
pxor T0, MSG
movdqa STATE2, T0
pand STATE3, T0
pxor T0, MSG
movdqa MSG, (%rdx)
call __morus640_update
sub $16, %rcx
add $16, %rsi
add $16, %rdx
cmp $16, %rcx
jge .Ldec_a_loop
jmp .Ldec_cont
.align 4
.Ldec_u_loop:
movdqu (%rsi), MSG
pxor STATE0, MSG
pshufd $MASK3, STATE1, T0
pxor T0, MSG
movdqa STATE2, T0
pand STATE3, T0
pxor T0, MSG
movdqu MSG, (%rdx)
call __morus640_update
sub $16, %rcx
add $16, %rsi
add $16, %rdx
cmp $16, %rcx
jge .Ldec_u_loop
.Ldec_cont:
/* store the state: */
movdqu STATE0, (0 * 16)(%rdi)
movdqu STATE1, (1 * 16)(%rdi)
movdqu STATE2, (2 * 16)(%rdi)
movdqu STATE3, (3 * 16)(%rdi)
movdqu STATE4, (4 * 16)(%rdi)
.Ldec_out:
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_dec)
/*
* void crypto_morus640_sse2_dec_tail(void *state, const void *src, void *dst,
* unsigned int length);
*/
ENTRY(crypto_morus640_sse2_dec_tail)
FRAME_BEGIN
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0
movdqu (1 * 16)(%rdi), STATE1
movdqu (2 * 16)(%rdi), STATE2
movdqu (3 * 16)(%rdi), STATE3
movdqu (4 * 16)(%rdi), STATE4
/* decrypt message: */
call __load_partial
pxor STATE0, MSG
pshufd $MASK3, STATE1, T0
pxor T0, MSG
movdqa STATE2, T0
pand STATE3, T0
pxor T0, MSG
movdqa MSG, T0
call __store_partial
/* mask with byte count: */
movq %rcx, T0
punpcklbw T0, T0
punpcklbw T0, T0
punpcklbw T0, T0
punpcklbw T0, T0
movdqa .Lmorus640_counter, T1
pcmpgtb T1, T0
pand T0, MSG
call __morus640_update
/* store the state: */
movdqu STATE0, (0 * 16)(%rdi)
movdqu STATE1, (1 * 16)(%rdi)
movdqu STATE2, (2 * 16)(%rdi)
movdqu STATE3, (3 * 16)(%rdi)
movdqu STATE4, (4 * 16)(%rdi)
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_dec_tail)
/*
* void crypto_morus640_sse2_final(void *state, void *tag_xor,
* u64 assoclen, u64 cryptlen);
*/
ENTRY(crypto_morus640_sse2_final)
FRAME_BEGIN
/* load the state: */
movdqu (0 * 16)(%rdi), STATE0
movdqu (1 * 16)(%rdi), STATE1
movdqu (2 * 16)(%rdi), STATE2
movdqu (3 * 16)(%rdi), STATE3
movdqu (4 * 16)(%rdi), STATE4
/* xor state[0] into state[4]: */
pxor STATE0, STATE4
/* prepare length block: */
movq %rdx, MSG
movq %rcx, T0
pslldq $8, T0
pxor T0, MSG
psllq $3, MSG /* multiply by 8 (to get bit count) */
/* update state: */
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
call __morus640_update
/* xor tag: */
movdqu (%rsi), MSG
pxor STATE0, MSG
pshufd $MASK3, STATE1, T0
pxor T0, MSG
movdqa STATE2, T0
pand STATE3, T0
pxor T0, MSG
movdqu MSG, (%rsi)
FRAME_END
ret
ENDPROC(crypto_morus640_sse2_final)

View File

@ -1,61 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-640 Authenticated-Encryption Algorithm
* Glue for SSE2 implementation
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/morus640_glue.h>
#include <linux/module.h>
#include <asm/fpu/api.h>
#include <asm/cpu_device_id.h>
asmlinkage void crypto_morus640_sse2_init(void *state, const void *key,
const void *iv);
asmlinkage void crypto_morus640_sse2_ad(void *state, const void *data,
unsigned int length);
asmlinkage void crypto_morus640_sse2_enc(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus640_sse2_dec(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus640_sse2_enc_tail(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus640_sse2_dec_tail(void *state, const void *src,
void *dst, unsigned int length);
asmlinkage void crypto_morus640_sse2_final(void *state, void *tag_xor,
u64 assoclen, u64 cryptlen);
MORUS640_DECLARE_ALG(sse2, "morus640-sse2", 400);
static struct simd_aead_alg *simd_alg;
static int __init crypto_morus640_sse2_module_init(void)
{
if (!boot_cpu_has(X86_FEATURE_XMM2) ||
!cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL))
return -ENODEV;
return simd_register_aeads_compat(&crypto_morus640_sse2_alg, 1,
&simd_alg);
}
static void __exit crypto_morus640_sse2_module_exit(void)
{
simd_unregister_aeads(&crypto_morus640_sse2_alg, 1, &simd_alg);
}
module_init(crypto_morus640_sse2_module_init);
module_exit(crypto_morus640_sse2_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-640 AEAD algorithm -- SSE2 implementation");
MODULE_ALIAS_CRYPTO("morus640");
MODULE_ALIAS_CRYPTO("morus640-sse2");

View File

@ -1,200 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-640 Authenticated-Encryption Algorithm
* Common x86 SIMD glue skeleton
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/morus640_glue.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/scatterlist.h>
#include <asm/fpu/api.h>
struct morus640_state {
struct morus640_block s[MORUS_STATE_BLOCKS];
};
struct morus640_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_blocks)(void *state, const void *src, void *dst,
unsigned int length);
void (*crypt_tail)(void *state, const void *src, void *dst,
unsigned int length);
};
static void crypto_morus640_glue_process_ad(
struct morus640_state *state,
const struct morus640_glue_ops *ops,
struct scatterlist *sg_src, unsigned int assoclen)
{
struct scatter_walk walk;
struct morus640_block buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= MORUS640_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = MORUS640_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
ops->ad(state, buf.bytes, MORUS640_BLOCK_SIZE);
pos = 0;
left -= fill;
src += fill;
}
ops->ad(state, src, left);
src += left & ~(MORUS640_BLOCK_SIZE - 1);
left &= MORUS640_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, MORUS640_BLOCK_SIZE - pos);
ops->ad(state, buf.bytes, MORUS640_BLOCK_SIZE);
}
}
static void crypto_morus640_glue_process_crypt(struct morus640_state *state,
struct morus640_ops ops,
struct skcipher_walk *walk)
{
while (walk->nbytes >= MORUS640_BLOCK_SIZE) {
ops.crypt_blocks(state, walk->src.virt.addr,
walk->dst.virt.addr,
round_down(walk->nbytes, MORUS640_BLOCK_SIZE));
skcipher_walk_done(walk, walk->nbytes % MORUS640_BLOCK_SIZE);
}
if (walk->nbytes) {
ops.crypt_tail(state, walk->src.virt.addr, walk->dst.virt.addr,
walk->nbytes);
skcipher_walk_done(walk, 0);
}
}
int crypto_morus640_glue_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct morus640_ctx *ctx = crypto_aead_ctx(aead);
if (keylen != MORUS640_BLOCK_SIZE) {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
memcpy(ctx->key.bytes, key, MORUS640_BLOCK_SIZE);
return 0;
}
EXPORT_SYMBOL_GPL(crypto_morus640_glue_setkey);
int crypto_morus640_glue_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
return (authsize <= MORUS_MAX_AUTH_SIZE) ? 0 : -EINVAL;
}
EXPORT_SYMBOL_GPL(crypto_morus640_glue_setauthsize);
static void crypto_morus640_glue_crypt(struct aead_request *req,
struct morus640_ops ops,
unsigned int cryptlen,
struct morus640_block *tag_xor)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus640_ctx *ctx = crypto_aead_ctx(tfm);
struct morus640_state state;
struct skcipher_walk walk;
ops.skcipher_walk_init(&walk, req, true);
kernel_fpu_begin();
ctx->ops->init(&state, &ctx->key, req->iv);
crypto_morus640_glue_process_ad(&state, ctx->ops, req->src, req->assoclen);
crypto_morus640_glue_process_crypt(&state, ops, &walk);
ctx->ops->final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end();
}
int crypto_morus640_glue_encrypt(struct aead_request *req)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus640_ctx *ctx = crypto_aead_ctx(tfm);
struct morus640_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_blocks = ctx->ops->enc,
.crypt_tail = ctx->ops->enc_tail,
};
struct morus640_block tag = {};
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_morus640_glue_crypt(req, OPS, cryptlen, &tag);
scatterwalk_map_and_copy(tag.bytes, req->dst,
req->assoclen + cryptlen, authsize, 1);
return 0;
}
EXPORT_SYMBOL_GPL(crypto_morus640_glue_encrypt);
int crypto_morus640_glue_decrypt(struct aead_request *req)
{
static const u8 zeros[MORUS640_BLOCK_SIZE] = {};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus640_ctx *ctx = crypto_aead_ctx(tfm);
struct morus640_ops OPS = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_blocks = ctx->ops->dec,
.crypt_tail = ctx->ops->dec_tail,
};
struct morus640_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag.bytes, req->src,
req->assoclen + cryptlen, authsize, 0);
crypto_morus640_glue_crypt(req, OPS, cryptlen, &tag);
return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
}
EXPORT_SYMBOL_GPL(crypto_morus640_glue_decrypt);
void crypto_morus640_glue_init_ops(struct crypto_aead *aead,
const struct morus640_glue_ops *ops)
{
struct morus640_ctx *ctx = crypto_aead_ctx(aead);
ctx->ops = ops;
}
EXPORT_SYMBOL_GPL(crypto_morus640_glue_init_ops);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-640 AEAD mode -- glue for x86 optimizations");

View File

@ -167,7 +167,7 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&serpent_enc_xts, req,
XTS_TWEAK_CAST(__serpent_encrypt),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -177,7 +177,7 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&serpent_dec_xts, req,
XTS_TWEAK_CAST(__serpent_encrypt),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}
static struct skcipher_alg serpent_algs[] = {

View File

@ -207,7 +207,7 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&serpent_enc_xts, req,
XTS_TWEAK_CAST(__serpent_encrypt),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -217,7 +217,7 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&serpent_dec_xts, req,
XTS_TWEAK_CAST(__serpent_encrypt),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}
static struct skcipher_alg serpent_algs[] = {

View File

@ -45,8 +45,8 @@ asmlinkage void sha256_transform_ssse3(u32 *digest, const char *data,
u64 rounds);
typedef void (sha256_transform_fn)(u32 *digest, const char *data, u64 rounds);
static int sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len, sha256_transform_fn *sha256_xform)
static int _sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len, sha256_transform_fn *sha256_xform)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
@ -84,7 +84,7 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data,
static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sha256_update(desc, data, len, sha256_transform_ssse3);
return _sha256_update(desc, data, len, sha256_transform_ssse3);
}
static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data,
@ -151,7 +151,7 @@ asmlinkage void sha256_transform_avx(u32 *digest, const char *data,
static int sha256_avx_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sha256_update(desc, data, len, sha256_transform_avx);
return _sha256_update(desc, data, len, sha256_transform_avx);
}
static int sha256_avx_finup(struct shash_desc *desc, const u8 *data,
@ -233,7 +233,7 @@ asmlinkage void sha256_transform_rorx(u32 *digest, const char *data,
static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sha256_update(desc, data, len, sha256_transform_rorx);
return _sha256_update(desc, data, len, sha256_transform_rorx);
}
static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
@ -313,7 +313,7 @@ asmlinkage void sha256_ni_transform(u32 *digest, const char *data,
static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sha256_update(desc, data, len, sha256_ni_transform);
return _sha256_update(desc, data, len, sha256_ni_transform);
}
static int sha256_ni_finup(struct shash_desc *desc, const u8 *data,

View File

@ -210,7 +210,7 @@ static int xts_encrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&twofish_enc_xts, req,
XTS_TWEAK_CAST(twofish_enc_blk),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
@ -220,7 +220,7 @@ static int xts_decrypt(struct skcipher_request *req)
return glue_xts_req_128bit(&twofish_dec_xts, req,
XTS_TWEAK_CAST(twofish_enc_blk),
&ctx->tweak_ctx, &ctx->crypt_ctx);
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}
static struct skcipher_alg twofish_algs[] = {

View File

@ -1,12 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ASM_X86_AES_H
#define ASM_X86_AES_H
#include <linux/crypto.h>
#include <crypto/aes.h>
void crypto_aes_encrypt_x86(struct crypto_aes_ctx *ctx, u8 *dst,
const u8 *src);
void crypto_aes_decrypt_x86(struct crypto_aes_ctx *ctx, u8 *dst,
const u8 *src);
#endif

View File

@ -114,7 +114,7 @@ extern int glue_ctr_req_128bit(const struct common_glue_ctx *gctx,
extern int glue_xts_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req,
common_glue_func_t tweak_fn, void *tweak_ctx,
void *crypt_ctx);
void *crypt_ctx, bool decrypt);
extern void glue_xts_crypt_128bit_one(void *ctx, u128 *dst, const u128 *src,
le128 *iv, common_glue_func_t fn);

View File

@ -9,9 +9,11 @@ PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
$(obj)/string.o: $(srctree)/arch/x86/boot/compressed/string.c FORCE
$(call if_changed_rule,cc_o_c)
$(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE
$(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
$(call if_changed_rule,cc_o_c)
CFLAGS_sha256.o := -D__DISABLE_EXPORTS
LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib
targets += purgatory.ro

View File

@ -9,7 +9,7 @@
*/
#include <linux/bug.h>
#include <linux/sha256.h>
#include <crypto/sha.h>
#include <asm/purgatory.h>
#include "../boot/string.h"

View File

@ -306,19 +306,10 @@ config CRYPTO_AEGIS128
help
Support for the AEGIS-128 dedicated AEAD algorithm.
config CRYPTO_AEGIS128L
tristate "AEGIS-128L AEAD algorithm"
select CRYPTO_AEAD
select CRYPTO_AES # for AES S-box tables
help
Support for the AEGIS-128L dedicated AEAD algorithm.
config CRYPTO_AEGIS256
tristate "AEGIS-256 AEAD algorithm"
select CRYPTO_AEAD
select CRYPTO_AES # for AES S-box tables
help
Support for the AEGIS-256 dedicated AEAD algorithm.
config CRYPTO_AEGIS128_SIMD
bool "Support SIMD acceleration for AEGIS-128"
depends on CRYPTO_AEGIS128 && ((ARM || ARM64) && KERNEL_MODE_NEON)
default y
config CRYPTO_AEGIS128_AESNI_SSE2
tristate "AEGIS-128 AEAD algorithm (x86_64 AESNI+SSE2 implementation)"
@ -328,78 +319,6 @@ config CRYPTO_AEGIS128_AESNI_SSE2
help
AESNI+SSE2 implementation of the AEGIS-128 dedicated AEAD algorithm.
config CRYPTO_AEGIS128L_AESNI_SSE2
tristate "AEGIS-128L AEAD algorithm (x86_64 AESNI+SSE2 implementation)"
depends on X86 && 64BIT
select CRYPTO_AEAD
select CRYPTO_SIMD
help
AESNI+SSE2 implementation of the AEGIS-128L dedicated AEAD algorithm.
config CRYPTO_AEGIS256_AESNI_SSE2
tristate "AEGIS-256 AEAD algorithm (x86_64 AESNI+SSE2 implementation)"
depends on X86 && 64BIT
select CRYPTO_AEAD
select CRYPTO_SIMD
help
AESNI+SSE2 implementation of the AEGIS-256 dedicated AEAD algorithm.
config CRYPTO_MORUS640
tristate "MORUS-640 AEAD algorithm"
select CRYPTO_AEAD
help
Support for the MORUS-640 dedicated AEAD algorithm.
config CRYPTO_MORUS640_GLUE
tristate
depends on X86
select CRYPTO_AEAD
select CRYPTO_SIMD
help
Common glue for SIMD optimizations of the MORUS-640 dedicated AEAD
algorithm.
config CRYPTO_MORUS640_SSE2
tristate "MORUS-640 AEAD algorithm (x86_64 SSE2 implementation)"
depends on X86 && 64BIT
select CRYPTO_AEAD
select CRYPTO_MORUS640_GLUE
help
SSE2 implementation of the MORUS-640 dedicated AEAD algorithm.
config CRYPTO_MORUS1280
tristate "MORUS-1280 AEAD algorithm"
select CRYPTO_AEAD
help
Support for the MORUS-1280 dedicated AEAD algorithm.
config CRYPTO_MORUS1280_GLUE
tristate
depends on X86
select CRYPTO_AEAD
select CRYPTO_SIMD
help
Common glue for SIMD optimizations of the MORUS-1280 dedicated AEAD
algorithm.
config CRYPTO_MORUS1280_SSE2
tristate "MORUS-1280 AEAD algorithm (x86_64 SSE2 implementation)"
depends on X86 && 64BIT
select CRYPTO_AEAD
select CRYPTO_MORUS1280_GLUE
help
SSE2 optimizedimplementation of the MORUS-1280 dedicated AEAD
algorithm.
config CRYPTO_MORUS1280_AVX2
tristate "MORUS-1280 AEAD algorithm (x86_64 AVX2 implementation)"
depends on X86 && 64BIT
select CRYPTO_AEAD
select CRYPTO_MORUS1280_GLUE
help
AVX2 optimized implementation of the MORUS-1280 dedicated AEAD
algorithm.
config CRYPTO_SEQIV
tristate "Sequence Number IV Generator"
select CRYPTO_AEAD
@ -728,11 +647,12 @@ config CRYPTO_VPMSUM_TESTER
Unless you are testing these algorithms, you don't need this.
config CRYPTO_GHASH
tristate "GHASH digest algorithm"
tristate "GHASH hash function"
select CRYPTO_GF128MUL
select CRYPTO_HASH
help
GHASH is message digest algorithm for GCM (Galois/Counter Mode).
GHASH is the hash function used in GCM (Galois/Counter Mode).
It is not a general-purpose cryptographic hash function.
config CRYPTO_POLY1305
tristate "Poly1305 authenticator algorithm"
@ -929,9 +849,13 @@ config CRYPTO_SHA1_PPC_SPE
SHA-1 secure hash standard (DFIPS 180-4) implemented
using powerpc SPE SIMD instruction set.
config CRYPTO_LIB_SHA256
tristate
config CRYPTO_SHA256
tristate "SHA224 and SHA256 digest algorithm"
select CRYPTO_HASH
select CRYPTO_LIB_SHA256
help
SHA256 secure hash standard (DFIPS 180-2).
@ -1057,18 +981,22 @@ config CRYPTO_WP512
<http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html>
config CRYPTO_GHASH_CLMUL_NI_INTEL
tristate "GHASH digest algorithm (CLMUL-NI accelerated)"
tristate "GHASH hash function (CLMUL-NI accelerated)"
depends on X86 && 64BIT
select CRYPTO_CRYPTD
help
GHASH is message digest algorithm for GCM (Galois/Counter Mode).
The implementation is accelerated by CLMUL-NI of Intel.
This is the x86_64 CLMUL-NI accelerated implementation of
GHASH, the hash function used in GCM (Galois/Counter mode).
comment "Ciphers"
config CRYPTO_LIB_AES
tristate
config CRYPTO_AES
tristate "AES cipher algorithms"
select CRYPTO_ALGAPI
select CRYPTO_LIB_AES
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
@ -1089,6 +1017,7 @@ config CRYPTO_AES
config CRYPTO_AES_TI
tristate "Fixed time AES cipher"
select CRYPTO_ALGAPI
select CRYPTO_LIB_AES
help
This is a generic implementation of AES that attempts to eliminate
data dependent latencies as much as possible without affecting
@ -1104,56 +1033,11 @@ config CRYPTO_AES_TI
block. Interrupts are also disabled to avoid races where cachelines
are evicted when the CPU is interrupted to do something else.
config CRYPTO_AES_586
tristate "AES cipher algorithms (i586)"
depends on (X86 || UML_X86) && !64BIT
select CRYPTO_ALGAPI
select CRYPTO_AES
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
Rijndael appears to be consistently a very good performer in
both hardware and software across a wide range of computing
environments regardless of its use in feedback or non-feedback
modes. Its key setup time is excellent, and its key agility is
good. Rijndael's very low memory requirements make it very well
suited for restricted-space environments, in which it also
demonstrates excellent performance. Rijndael's operations are
among the easiest to defend against power and timing attacks.
The AES specifies three key sizes: 128, 192 and 256 bits
See <http://csrc.nist.gov/encryption/aes/> for more information.
config CRYPTO_AES_X86_64
tristate "AES cipher algorithms (x86_64)"
depends on (X86 || UML_X86) && 64BIT
select CRYPTO_ALGAPI
select CRYPTO_AES
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
Rijndael appears to be consistently a very good performer in
both hardware and software across a wide range of computing
environments regardless of its use in feedback or non-feedback
modes. Its key setup time is excellent, and its key agility is
good. Rijndael's very low memory requirements make it very well
suited for restricted-space environments, in which it also
demonstrates excellent performance. Rijndael's operations are
among the easiest to defend against power and timing attacks.
The AES specifies three key sizes: 128, 192 and 256 bits
See <http://csrc.nist.gov/encryption/aes/> for more information.
config CRYPTO_AES_NI_INTEL
tristate "AES cipher algorithms (AES-NI)"
depends on X86
select CRYPTO_AEAD
select CRYPTO_AES_X86_64 if 64BIT
select CRYPTO_AES_586 if !64BIT
select CRYPTO_LIB_AES
select CRYPTO_ALGAPI
select CRYPTO_BLKCIPHER
select CRYPTO_GLUE_HELPER_X86 if 64BIT
@ -1426,9 +1310,13 @@ config CRYPTO_CAST6_AVX_X86_64
This module provides the Cast6 cipher algorithm that processes
eight blocks parallel using the AVX instruction set.
config CRYPTO_LIB_DES
tristate
config CRYPTO_DES
tristate "DES and Triple DES EDE cipher algorithms"
select CRYPTO_ALGAPI
select CRYPTO_LIB_DES
help
DES cipher algorithm (FIPS 46-2), and Triple DES EDE (FIPS 46-3).
@ -1436,7 +1324,7 @@ config CRYPTO_DES_SPARC64
tristate "DES and Triple DES EDE cipher algorithms (SPARC64)"
depends on SPARC64
select CRYPTO_ALGAPI
select CRYPTO_DES
select CRYPTO_LIB_DES
help
DES cipher algorithm (FIPS 46-2), and Triple DES EDE (FIPS 46-3),
optimized using SPARC64 crypto opcodes.
@ -1445,7 +1333,7 @@ config CRYPTO_DES3_EDE_X86_64
tristate "Triple DES EDE cipher algorithm (x86-64)"
depends on X86 && 64BIT
select CRYPTO_BLKCIPHER
select CRYPTO_DES
select CRYPTO_LIB_DES
help
Triple DES EDE (FIPS 46-3) algorithm.

View File

@ -90,10 +90,26 @@ obj-$(CONFIG_CRYPTO_GCM) += gcm.o
obj-$(CONFIG_CRYPTO_CCM) += ccm.o
obj-$(CONFIG_CRYPTO_CHACHA20POLY1305) += chacha20poly1305.o
obj-$(CONFIG_CRYPTO_AEGIS128) += aegis128.o
obj-$(CONFIG_CRYPTO_AEGIS128L) += aegis128l.o
obj-$(CONFIG_CRYPTO_AEGIS256) += aegis256.o
obj-$(CONFIG_CRYPTO_MORUS640) += morus640.o
obj-$(CONFIG_CRYPTO_MORUS1280) += morus1280.o
aegis128-y := aegis128-core.o
ifeq ($(ARCH),arm)
CFLAGS_aegis128-neon-inner.o += -ffreestanding -march=armv7-a -mfloat-abi=softfp
CFLAGS_aegis128-neon-inner.o += -mfpu=crypto-neon-fp-armv8
aegis128-$(CONFIG_CRYPTO_AEGIS128_SIMD) += aegis128-neon.o aegis128-neon-inner.o
endif
ifeq ($(ARCH),arm64)
aegis128-cflags-y := -ffreestanding -mcpu=generic+crypto
aegis128-cflags-$(CONFIG_CC_IS_GCC) += -ffixed-q16 -ffixed-q17 -ffixed-q18 \
-ffixed-q19 -ffixed-q20 -ffixed-q21 \
-ffixed-q22 -ffixed-q23 -ffixed-q24 \
-ffixed-q25 -ffixed-q26 -ffixed-q27 \
-ffixed-q28 -ffixed-q29 -ffixed-q30 \
-ffixed-q31
CFLAGS_aegis128-neon-inner.o += $(aegis128-cflags-y)
CFLAGS_REMOVE_aegis128-neon-inner.o += -mgeneral-regs-only
aegis128-$(CONFIG_CRYPTO_AEGIS128_SIMD) += aegis128-neon.o aegis128-neon-inner.o
endif
obj-$(CONFIG_CRYPTO_PCRYPT) += pcrypt.o
obj-$(CONFIG_CRYPTO_CRYPTD) += cryptd.o
obj-$(CONFIG_CRYPTO_DES) += des_generic.o
@ -136,6 +152,8 @@ obj-$(CONFIG_CRYPTO_ANSI_CPRNG) += ansi_cprng.o
obj-$(CONFIG_CRYPTO_DRBG) += drbg.o
obj-$(CONFIG_CRYPTO_JITTERENTROPY) += jitterentropy_rng.o
CFLAGS_jitterentropy.o = -O0
KASAN_SANITIZE_jitterentropy.o = n
UBSAN_SANITIZE_jitterentropy.o = n
jitterentropy_rng-y := jitterentropy.o jitterentropy-kcapi.o
obj-$(CONFIG_CRYPTO_TEST) += tcrypt.o
obj-$(CONFIG_CRYPTO_GHASH) += ghash-generic.o

View File

@ -70,7 +70,8 @@ int crypto_aead_setauthsize(struct crypto_aead *tfm, unsigned int authsize)
{
int err;
if (authsize > crypto_aead_maxauthsize(tfm))
if ((!authsize && crypto_aead_maxauthsize(tfm)) ||
authsize > crypto_aead_maxauthsize(tfm))
return -EINVAL;
if (crypto_aead_alg(tfm)->setauthsize) {

View File

@ -10,6 +10,7 @@
#define _CRYPTO_AEGIS_H
#include <crypto/aes.h>
#include <linux/bitops.h>
#include <linux/types.h>
#define AEGIS_BLOCK_SIZE 16
@ -23,46 +24,32 @@ union aegis_block {
#define AEGIS_BLOCK_ALIGN (__alignof__(union aegis_block))
#define AEGIS_ALIGNED(p) IS_ALIGNED((uintptr_t)p, AEGIS_BLOCK_ALIGN)
static const union aegis_block crypto_aegis_const[2] = {
{ .words64 = {
cpu_to_le64(U64_C(0x0d08050302010100)),
cpu_to_le64(U64_C(0x6279e99059372215)),
} },
{ .words64 = {
cpu_to_le64(U64_C(0xf12fc26d55183ddb)),
cpu_to_le64(U64_C(0xdd28b57342311120)),
} },
};
static void crypto_aegis_block_xor(union aegis_block *dst,
const union aegis_block *src)
static __always_inline void crypto_aegis_block_xor(union aegis_block *dst,
const union aegis_block *src)
{
dst->words64[0] ^= src->words64[0];
dst->words64[1] ^= src->words64[1];
}
static void crypto_aegis_block_and(union aegis_block *dst,
const union aegis_block *src)
static __always_inline void crypto_aegis_block_and(union aegis_block *dst,
const union aegis_block *src)
{
dst->words64[0] &= src->words64[0];
dst->words64[1] &= src->words64[1];
}
static void crypto_aegis_aesenc(union aegis_block *dst,
const union aegis_block *src,
const union aegis_block *key)
static __always_inline void crypto_aegis_aesenc(union aegis_block *dst,
const union aegis_block *src,
const union aegis_block *key)
{
const u8 *s = src->bytes;
const u32 *t0 = crypto_ft_tab[0];
const u32 *t1 = crypto_ft_tab[1];
const u32 *t2 = crypto_ft_tab[2];
const u32 *t3 = crypto_ft_tab[3];
const u32 *t = crypto_ft_tab[0];
u32 d0, d1, d2, d3;
d0 = t0[s[ 0]] ^ t1[s[ 5]] ^ t2[s[10]] ^ t3[s[15]];
d1 = t0[s[ 4]] ^ t1[s[ 9]] ^ t2[s[14]] ^ t3[s[ 3]];
d2 = t0[s[ 8]] ^ t1[s[13]] ^ t2[s[ 2]] ^ t3[s[ 7]];
d3 = t0[s[12]] ^ t1[s[ 1]] ^ t2[s[ 6]] ^ t3[s[11]];
d0 = t[s[ 0]] ^ rol32(t[s[ 5]], 8) ^ rol32(t[s[10]], 16) ^ rol32(t[s[15]], 24);
d1 = t[s[ 4]] ^ rol32(t[s[ 9]], 8) ^ rol32(t[s[14]], 16) ^ rol32(t[s[ 3]], 24);
d2 = t[s[ 8]] ^ rol32(t[s[13]], 8) ^ rol32(t[s[ 2]], 16) ^ rol32(t[s[ 7]], 24);
d3 = t[s[12]] ^ rol32(t[s[ 1]], 8) ^ rol32(t[s[ 6]], 16) ^ rol32(t[s[11]], 24);
dst->words32[0] = cpu_to_le32(d0) ^ key->words32[0];
dst->words32[1] = cpu_to_le32(d1) ^ key->words32[1];

View File

@ -8,6 +8,7 @@
#include <crypto/algapi.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
@ -16,6 +17,8 @@
#include <linux/module.h>
#include <linux/scatterlist.h>
#include <asm/simd.h>
#include "aegis.h"
#define AEGIS128_NONCE_SIZE 16
@ -40,6 +43,35 @@ struct aegis128_ops {
const u8 *src, unsigned int size);
};
static bool have_simd;
static const union aegis_block crypto_aegis_const[2] = {
{ .words64 = {
cpu_to_le64(U64_C(0x0d08050302010100)),
cpu_to_le64(U64_C(0x6279e99059372215)),
} },
{ .words64 = {
cpu_to_le64(U64_C(0xf12fc26d55183ddb)),
cpu_to_le64(U64_C(0xdd28b57342311120)),
} },
};
static bool aegis128_do_simd(void)
{
#ifdef CONFIG_CRYPTO_AEGIS128_SIMD
if (have_simd)
return crypto_simd_usable();
#endif
return false;
}
bool crypto_aegis128_have_simd(void);
void crypto_aegis128_update_simd(struct aegis_state *state, const void *msg);
void crypto_aegis128_encrypt_chunk_simd(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size);
void crypto_aegis128_decrypt_chunk_simd(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size);
static void crypto_aegis128_update(struct aegis_state *state)
{
union aegis_block tmp;
@ -55,12 +87,22 @@ static void crypto_aegis128_update(struct aegis_state *state)
static void crypto_aegis128_update_a(struct aegis_state *state,
const union aegis_block *msg)
{
if (aegis128_do_simd()) {
crypto_aegis128_update_simd(state, msg);
return;
}
crypto_aegis128_update(state);
crypto_aegis_block_xor(&state->blocks[0], msg);
}
static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg)
{
if (aegis128_do_simd()) {
crypto_aegis128_update_simd(state, msg);
return;
}
crypto_aegis128_update(state);
crypto_xor(state->blocks[0].bytes, msg, AEGIS_BLOCK_SIZE);
}
@ -365,7 +407,7 @@ static void crypto_aegis128_crypt(struct aead_request *req,
static int crypto_aegis128_encrypt(struct aead_request *req)
{
static const struct aegis128_ops ops = {
const struct aegis128_ops *ops = &(struct aegis128_ops){
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_chunk = crypto_aegis128_encrypt_chunk,
};
@ -375,7 +417,12 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_aegis128_crypt(req, &tag, cryptlen, &ops);
if (aegis128_do_simd())
ops = &(struct aegis128_ops){
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_chunk = crypto_aegis128_encrypt_chunk_simd };
crypto_aegis128_crypt(req, &tag, cryptlen, ops);
scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
authsize, 1);
@ -384,7 +431,7 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
static int crypto_aegis128_decrypt(struct aead_request *req)
{
static const struct aegis128_ops ops = {
const struct aegis128_ops *ops = &(struct aegis128_ops){
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_chunk = crypto_aegis128_decrypt_chunk,
};
@ -398,27 +445,21 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
scatterwalk_map_and_copy(tag.bytes, req->src, req->assoclen + cryptlen,
authsize, 0);
crypto_aegis128_crypt(req, &tag, cryptlen, &ops);
if (aegis128_do_simd())
ops = &(struct aegis128_ops){
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_chunk = crypto_aegis128_decrypt_chunk_simd };
crypto_aegis128_crypt(req, &tag, cryptlen, ops);
return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
}
static int crypto_aegis128_init_tfm(struct crypto_aead *tfm)
{
return 0;
}
static void crypto_aegis128_exit_tfm(struct crypto_aead *tfm)
{
}
static struct aead_alg crypto_aegis128_alg = {
.setkey = crypto_aegis128_setkey,
.setauthsize = crypto_aegis128_setauthsize,
.encrypt = crypto_aegis128_encrypt,
.decrypt = crypto_aegis128_decrypt,
.init = crypto_aegis128_init_tfm,
.exit = crypto_aegis128_exit_tfm,
.ivsize = AEGIS128_NONCE_SIZE,
.maxauthsize = AEGIS128_MAX_AUTH_SIZE,
@ -440,6 +481,9 @@ static struct aead_alg crypto_aegis128_alg = {
static int __init crypto_aegis128_module_init(void)
{
if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD))
have_simd = crypto_aegis128_have_simd();
return crypto_register_aead(&crypto_aegis128_alg);
}

View File

@ -0,0 +1,212 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2019 Linaro, Ltd. <ard.biesheuvel@linaro.org>
*/
#ifdef CONFIG_ARM64
#include <asm/neon-intrinsics.h>
#define AES_ROUND "aese %0.16b, %1.16b \n\t aesmc %0.16b, %0.16b"
#else
#include <arm_neon.h>
#define AES_ROUND "aese.8 %q0, %q1 \n\t aesmc.8 %q0, %q0"
#endif
#define AEGIS_BLOCK_SIZE 16
#include <stddef.h>
extern int aegis128_have_aes_insn;
void *memcpy(void *dest, const void *src, size_t n);
void *memset(void *s, int c, size_t n);
struct aegis128_state {
uint8x16_t v[5];
};
extern const uint8_t crypto_aes_sbox[];
static struct aegis128_state aegis128_load_state_neon(const void *state)
{
return (struct aegis128_state){ {
vld1q_u8(state),
vld1q_u8(state + 16),
vld1q_u8(state + 32),
vld1q_u8(state + 48),
vld1q_u8(state + 64)
} };
}
static void aegis128_save_state_neon(struct aegis128_state st, void *state)
{
vst1q_u8(state, st.v[0]);
vst1q_u8(state + 16, st.v[1]);
vst1q_u8(state + 32, st.v[2]);
vst1q_u8(state + 48, st.v[3]);
vst1q_u8(state + 64, st.v[4]);
}
static inline __attribute__((always_inline))
uint8x16_t aegis_aes_round(uint8x16_t w)
{
uint8x16_t z = {};
#ifdef CONFIG_ARM64
if (!__builtin_expect(aegis128_have_aes_insn, 1)) {
static const uint8_t shift_rows[] = {
0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3,
0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb,
};
static const uint8_t ror32by8[] = {
0x1, 0x2, 0x3, 0x0, 0x5, 0x6, 0x7, 0x4,
0x9, 0xa, 0xb, 0x8, 0xd, 0xe, 0xf, 0xc,
};
uint8x16_t v;
// shift rows
w = vqtbl1q_u8(w, vld1q_u8(shift_rows));
// sub bytes
#ifndef CONFIG_CC_IS_GCC
v = vqtbl4q_u8(vld1q_u8_x4(crypto_aes_sbox), w);
v = vqtbx4q_u8(v, vld1q_u8_x4(crypto_aes_sbox + 0x40), w - 0x40);
v = vqtbx4q_u8(v, vld1q_u8_x4(crypto_aes_sbox + 0x80), w - 0x80);
v = vqtbx4q_u8(v, vld1q_u8_x4(crypto_aes_sbox + 0xc0), w - 0xc0);
#else
asm("tbl %0.16b, {v16.16b-v19.16b}, %1.16b" : "=w"(v) : "w"(w));
w -= 0x40;
asm("tbx %0.16b, {v20.16b-v23.16b}, %1.16b" : "+w"(v) : "w"(w));
w -= 0x40;
asm("tbx %0.16b, {v24.16b-v27.16b}, %1.16b" : "+w"(v) : "w"(w));
w -= 0x40;
asm("tbx %0.16b, {v28.16b-v31.16b}, %1.16b" : "+w"(v) : "w"(w));
#endif
// mix columns
w = (v << 1) ^ (uint8x16_t)(((int8x16_t)v >> 7) & 0x1b);
w ^= (uint8x16_t)vrev32q_u16((uint16x8_t)v);
w ^= vqtbl1q_u8(v ^ w, vld1q_u8(ror32by8));
return w;
}
#endif
/*
* We use inline asm here instead of the vaeseq_u8/vaesmcq_u8 intrinsics
* to force the compiler to issue the aese/aesmc instructions in pairs.
* This is much faster on many cores, where the instruction pair can
* execute in a single cycle.
*/
asm(AES_ROUND : "+w"(w) : "w"(z));
return w;
}
static inline __attribute__((always_inline))
struct aegis128_state aegis128_update_neon(struct aegis128_state st,
uint8x16_t m)
{
m ^= aegis_aes_round(st.v[4]);
st.v[4] ^= aegis_aes_round(st.v[3]);
st.v[3] ^= aegis_aes_round(st.v[2]);
st.v[2] ^= aegis_aes_round(st.v[1]);
st.v[1] ^= aegis_aes_round(st.v[0]);
st.v[0] ^= m;
return st;
}
static inline __attribute__((always_inline))
void preload_sbox(void)
{
if (!IS_ENABLED(CONFIG_ARM64) ||
!IS_ENABLED(CONFIG_CC_IS_GCC) ||
__builtin_expect(aegis128_have_aes_insn, 1))
return;
asm("ld1 {v16.16b-v19.16b}, [%0], #64 \n\t"
"ld1 {v20.16b-v23.16b}, [%0], #64 \n\t"
"ld1 {v24.16b-v27.16b}, [%0], #64 \n\t"
"ld1 {v28.16b-v31.16b}, [%0] \n\t"
:: "r"(crypto_aes_sbox));
}
void crypto_aegis128_update_neon(void *state, const void *msg)
{
struct aegis128_state st = aegis128_load_state_neon(state);
preload_sbox();
st = aegis128_update_neon(st, vld1q_u8(msg));
aegis128_save_state_neon(st, state);
}
void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
unsigned int size)
{
struct aegis128_state st = aegis128_load_state_neon(state);
uint8x16_t msg;
preload_sbox();
while (size >= AEGIS_BLOCK_SIZE) {
uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
msg = vld1q_u8(src);
st = aegis128_update_neon(st, msg);
vst1q_u8(dst, msg ^ s);
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
dst += AEGIS_BLOCK_SIZE;
}
if (size > 0) {
uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
uint8_t buf[AEGIS_BLOCK_SIZE] = {};
memcpy(buf, src, size);
msg = vld1q_u8(buf);
st = aegis128_update_neon(st, msg);
vst1q_u8(buf, msg ^ s);
memcpy(dst, buf, size);
}
aegis128_save_state_neon(st, state);
}
void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
unsigned int size)
{
struct aegis128_state st = aegis128_load_state_neon(state);
uint8x16_t msg;
preload_sbox();
while (size >= AEGIS_BLOCK_SIZE) {
msg = vld1q_u8(src) ^ st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
st = aegis128_update_neon(st, msg);
vst1q_u8(dst, msg);
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
dst += AEGIS_BLOCK_SIZE;
}
if (size > 0) {
uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
uint8_t buf[AEGIS_BLOCK_SIZE];
vst1q_u8(buf, s);
memcpy(buf, src, size);
msg = vld1q_u8(buf) ^ s;
vst1q_u8(buf, msg);
memcpy(dst, buf, size);
st = aegis128_update_neon(st, msg);
}
aegis128_save_state_neon(st, state);
}

49
crypto/aegis128-neon.c Normal file
View File

@ -0,0 +1,49 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2019 Linaro Ltd <ard.biesheuvel@linaro.org>
*/
#include <asm/cpufeature.h>
#include <asm/neon.h>
#include "aegis.h"
void crypto_aegis128_update_neon(void *state, const void *msg);
void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
unsigned int size);
void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
unsigned int size);
int aegis128_have_aes_insn __ro_after_init;
bool crypto_aegis128_have_simd(void)
{
if (cpu_have_feature(cpu_feature(AES))) {
aegis128_have_aes_insn = 1;
return true;
}
return IS_ENABLED(CONFIG_ARM64);
}
void crypto_aegis128_update_simd(union aegis_block *state, const void *msg)
{
kernel_neon_begin();
crypto_aegis128_update_neon(state, msg);
kernel_neon_end();
}
void crypto_aegis128_encrypt_chunk_simd(union aegis_block *state, u8 *dst,
const u8 *src, unsigned int size)
{
kernel_neon_begin();
crypto_aegis128_encrypt_chunk_neon(state, dst, src, size);
kernel_neon_end();
}
void crypto_aegis128_decrypt_chunk_simd(union aegis_block *state, u8 *dst,
const u8 *src, unsigned int size)
{
kernel_neon_begin();
crypto_aegis128_decrypt_chunk_neon(state, dst, src, size);
kernel_neon_end();
}

View File

@ -1,522 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The AEGIS-128L Authenticated-Encryption Algorithm
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/algapi.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/scatterlist.h>
#include "aegis.h"
#define AEGIS128L_CHUNK_BLOCKS 2
#define AEGIS128L_CHUNK_SIZE (AEGIS128L_CHUNK_BLOCKS * AEGIS_BLOCK_SIZE)
#define AEGIS128L_NONCE_SIZE 16
#define AEGIS128L_STATE_BLOCKS 8
#define AEGIS128L_KEY_SIZE 16
#define AEGIS128L_MIN_AUTH_SIZE 8
#define AEGIS128L_MAX_AUTH_SIZE 16
union aegis_chunk {
union aegis_block blocks[AEGIS128L_CHUNK_BLOCKS];
u8 bytes[AEGIS128L_CHUNK_SIZE];
};
struct aegis_state {
union aegis_block blocks[AEGIS128L_STATE_BLOCKS];
};
struct aegis_ctx {
union aegis_block key;
};
struct aegis128l_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_chunk)(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size);
};
static void crypto_aegis128l_update(struct aegis_state *state)
{
union aegis_block tmp;
unsigned int i;
tmp = state->blocks[AEGIS128L_STATE_BLOCKS - 1];
for (i = AEGIS128L_STATE_BLOCKS - 1; i > 0; i--)
crypto_aegis_aesenc(&state->blocks[i], &state->blocks[i - 1],
&state->blocks[i]);
crypto_aegis_aesenc(&state->blocks[0], &tmp, &state->blocks[0]);
}
static void crypto_aegis128l_update_a(struct aegis_state *state,
const union aegis_chunk *msg)
{
crypto_aegis128l_update(state);
crypto_aegis_block_xor(&state->blocks[0], &msg->blocks[0]);
crypto_aegis_block_xor(&state->blocks[4], &msg->blocks[1]);
}
static void crypto_aegis128l_update_u(struct aegis_state *state,
const void *msg)
{
crypto_aegis128l_update(state);
crypto_xor(state->blocks[0].bytes, msg + 0 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
crypto_xor(state->blocks[4].bytes, msg + 1 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
}
static void crypto_aegis128l_init(struct aegis_state *state,
const union aegis_block *key,
const u8 *iv)
{
union aegis_block key_iv;
union aegis_chunk chunk;
unsigned int i;
memcpy(chunk.blocks[0].bytes, iv, AEGIS_BLOCK_SIZE);
chunk.blocks[1] = *key;
key_iv = *key;
crypto_aegis_block_xor(&key_iv, &chunk.blocks[0]);
state->blocks[0] = key_iv;
state->blocks[1] = crypto_aegis_const[1];
state->blocks[2] = crypto_aegis_const[0];
state->blocks[3] = crypto_aegis_const[1];
state->blocks[4] = key_iv;
state->blocks[5] = *key;
state->blocks[6] = *key;
state->blocks[7] = *key;
crypto_aegis_block_xor(&state->blocks[5], &crypto_aegis_const[0]);
crypto_aegis_block_xor(&state->blocks[6], &crypto_aegis_const[1]);
crypto_aegis_block_xor(&state->blocks[7], &crypto_aegis_const[0]);
for (i = 0; i < 10; i++) {
crypto_aegis128l_update_a(state, &chunk);
}
}
static void crypto_aegis128l_ad(struct aegis_state *state,
const u8 *src, unsigned int size)
{
if (AEGIS_ALIGNED(src)) {
const union aegis_chunk *src_chunk =
(const union aegis_chunk *)src;
while (size >= AEGIS128L_CHUNK_SIZE) {
crypto_aegis128l_update_a(state, src_chunk);
size -= AEGIS128L_CHUNK_SIZE;
src_chunk += 1;
}
} else {
while (size >= AEGIS128L_CHUNK_SIZE) {
crypto_aegis128l_update_u(state, src);
size -= AEGIS128L_CHUNK_SIZE;
src += AEGIS128L_CHUNK_SIZE;
}
}
}
static void crypto_aegis128l_encrypt_chunk(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
union aegis_chunk tmp;
union aegis_block *tmp0 = &tmp.blocks[0];
union aegis_block *tmp1 = &tmp.blocks[1];
if (AEGIS_ALIGNED(src) && AEGIS_ALIGNED(dst)) {
while (size >= AEGIS128L_CHUNK_SIZE) {
union aegis_chunk *dst_blk =
(union aegis_chunk *)dst;
const union aegis_chunk *src_blk =
(const union aegis_chunk *)src;
*tmp0 = state->blocks[2];
crypto_aegis_block_and(tmp0, &state->blocks[3]);
crypto_aegis_block_xor(tmp0, &state->blocks[6]);
crypto_aegis_block_xor(tmp0, &state->blocks[1]);
crypto_aegis_block_xor(tmp0, &src_blk->blocks[0]);
*tmp1 = state->blocks[6];
crypto_aegis_block_and(tmp1, &state->blocks[7]);
crypto_aegis_block_xor(tmp1, &state->blocks[5]);
crypto_aegis_block_xor(tmp1, &state->blocks[2]);
crypto_aegis_block_xor(tmp1, &src_blk->blocks[1]);
crypto_aegis128l_update_a(state, src_blk);
*dst_blk = tmp;
size -= AEGIS128L_CHUNK_SIZE;
src += AEGIS128L_CHUNK_SIZE;
dst += AEGIS128L_CHUNK_SIZE;
}
} else {
while (size >= AEGIS128L_CHUNK_SIZE) {
*tmp0 = state->blocks[2];
crypto_aegis_block_and(tmp0, &state->blocks[3]);
crypto_aegis_block_xor(tmp0, &state->blocks[6]);
crypto_aegis_block_xor(tmp0, &state->blocks[1]);
crypto_xor(tmp0->bytes, src + 0 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
*tmp1 = state->blocks[6];
crypto_aegis_block_and(tmp1, &state->blocks[7]);
crypto_aegis_block_xor(tmp1, &state->blocks[5]);
crypto_aegis_block_xor(tmp1, &state->blocks[2]);
crypto_xor(tmp1->bytes, src + 1 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
crypto_aegis128l_update_u(state, src);
memcpy(dst, tmp.bytes, AEGIS128L_CHUNK_SIZE);
size -= AEGIS128L_CHUNK_SIZE;
src += AEGIS128L_CHUNK_SIZE;
dst += AEGIS128L_CHUNK_SIZE;
}
}
if (size > 0) {
union aegis_chunk msg = {};
memcpy(msg.bytes, src, size);
*tmp0 = state->blocks[2];
crypto_aegis_block_and(tmp0, &state->blocks[3]);
crypto_aegis_block_xor(tmp0, &state->blocks[6]);
crypto_aegis_block_xor(tmp0, &state->blocks[1]);
*tmp1 = state->blocks[6];
crypto_aegis_block_and(tmp1, &state->blocks[7]);
crypto_aegis_block_xor(tmp1, &state->blocks[5]);
crypto_aegis_block_xor(tmp1, &state->blocks[2]);
crypto_aegis128l_update_a(state, &msg);
crypto_aegis_block_xor(&msg.blocks[0], tmp0);
crypto_aegis_block_xor(&msg.blocks[1], tmp1);
memcpy(dst, msg.bytes, size);
}
}
static void crypto_aegis128l_decrypt_chunk(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
union aegis_chunk tmp;
union aegis_block *tmp0 = &tmp.blocks[0];
union aegis_block *tmp1 = &tmp.blocks[1];
if (AEGIS_ALIGNED(src) && AEGIS_ALIGNED(dst)) {
while (size >= AEGIS128L_CHUNK_SIZE) {
union aegis_chunk *dst_blk =
(union aegis_chunk *)dst;
const union aegis_chunk *src_blk =
(const union aegis_chunk *)src;
*tmp0 = state->blocks[2];
crypto_aegis_block_and(tmp0, &state->blocks[3]);
crypto_aegis_block_xor(tmp0, &state->blocks[6]);
crypto_aegis_block_xor(tmp0, &state->blocks[1]);
crypto_aegis_block_xor(tmp0, &src_blk->blocks[0]);
*tmp1 = state->blocks[6];
crypto_aegis_block_and(tmp1, &state->blocks[7]);
crypto_aegis_block_xor(tmp1, &state->blocks[5]);
crypto_aegis_block_xor(tmp1, &state->blocks[2]);
crypto_aegis_block_xor(tmp1, &src_blk->blocks[1]);
crypto_aegis128l_update_a(state, &tmp);
*dst_blk = tmp;
size -= AEGIS128L_CHUNK_SIZE;
src += AEGIS128L_CHUNK_SIZE;
dst += AEGIS128L_CHUNK_SIZE;
}
} else {
while (size >= AEGIS128L_CHUNK_SIZE) {
*tmp0 = state->blocks[2];
crypto_aegis_block_and(tmp0, &state->blocks[3]);
crypto_aegis_block_xor(tmp0, &state->blocks[6]);
crypto_aegis_block_xor(tmp0, &state->blocks[1]);
crypto_xor(tmp0->bytes, src + 0 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
*tmp1 = state->blocks[6];
crypto_aegis_block_and(tmp1, &state->blocks[7]);
crypto_aegis_block_xor(tmp1, &state->blocks[5]);
crypto_aegis_block_xor(tmp1, &state->blocks[2]);
crypto_xor(tmp1->bytes, src + 1 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
crypto_aegis128l_update_a(state, &tmp);
memcpy(dst, tmp.bytes, AEGIS128L_CHUNK_SIZE);
size -= AEGIS128L_CHUNK_SIZE;
src += AEGIS128L_CHUNK_SIZE;
dst += AEGIS128L_CHUNK_SIZE;
}
}
if (size > 0) {
union aegis_chunk msg = {};
memcpy(msg.bytes, src, size);
*tmp0 = state->blocks[2];
crypto_aegis_block_and(tmp0, &state->blocks[3]);
crypto_aegis_block_xor(tmp0, &state->blocks[6]);
crypto_aegis_block_xor(tmp0, &state->blocks[1]);
crypto_aegis_block_xor(&msg.blocks[0], tmp0);
*tmp1 = state->blocks[6];
crypto_aegis_block_and(tmp1, &state->blocks[7]);
crypto_aegis_block_xor(tmp1, &state->blocks[5]);
crypto_aegis_block_xor(tmp1, &state->blocks[2]);
crypto_aegis_block_xor(&msg.blocks[1], tmp1);
memset(msg.bytes + size, 0, AEGIS128L_CHUNK_SIZE - size);
crypto_aegis128l_update_a(state, &msg);
memcpy(dst, msg.bytes, size);
}
}
static void crypto_aegis128l_process_ad(struct aegis_state *state,
struct scatterlist *sg_src,
unsigned int assoclen)
{
struct scatter_walk walk;
union aegis_chunk buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= AEGIS128L_CHUNK_SIZE) {
if (pos > 0) {
unsigned int fill = AEGIS128L_CHUNK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
crypto_aegis128l_update_a(state, &buf);
pos = 0;
left -= fill;
src += fill;
}
crypto_aegis128l_ad(state, src, left);
src += left & ~(AEGIS128L_CHUNK_SIZE - 1);
left &= AEGIS128L_CHUNK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, AEGIS128L_CHUNK_SIZE - pos);
crypto_aegis128l_update_a(state, &buf);
}
}
static void crypto_aegis128l_process_crypt(struct aegis_state *state,
struct aead_request *req,
const struct aegis128l_ops *ops)
{
struct skcipher_walk walk;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) {
unsigned int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes = round_down(nbytes, walk.stride);
ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr,
nbytes);
skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
}
static void crypto_aegis128l_final(struct aegis_state *state,
union aegis_block *tag_xor,
u64 assoclen, u64 cryptlen)
{
u64 assocbits = assoclen * 8;
u64 cryptbits = cryptlen * 8;
union aegis_chunk tmp;
unsigned int i;
tmp.blocks[0].words64[0] = cpu_to_le64(assocbits);
tmp.blocks[0].words64[1] = cpu_to_le64(cryptbits);
crypto_aegis_block_xor(&tmp.blocks[0], &state->blocks[2]);
tmp.blocks[1] = tmp.blocks[0];
for (i = 0; i < 7; i++)
crypto_aegis128l_update_a(state, &tmp);
for (i = 0; i < 7; i++)
crypto_aegis_block_xor(tag_xor, &state->blocks[i]);
}
static int crypto_aegis128l_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct aegis_ctx *ctx = crypto_aead_ctx(aead);
if (keylen != AEGIS128L_KEY_SIZE) {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
memcpy(ctx->key.bytes, key, AEGIS128L_KEY_SIZE);
return 0;
}
static int crypto_aegis128l_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
if (authsize > AEGIS128L_MAX_AUTH_SIZE)
return -EINVAL;
if (authsize < AEGIS128L_MIN_AUTH_SIZE)
return -EINVAL;
return 0;
}
static void crypto_aegis128l_crypt(struct aead_request *req,
union aegis_block *tag_xor,
unsigned int cryptlen,
const struct aegis128l_ops *ops)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_ctx *ctx = crypto_aead_ctx(tfm);
struct aegis_state state;
crypto_aegis128l_init(&state, &ctx->key, req->iv);
crypto_aegis128l_process_ad(&state, req->src, req->assoclen);
crypto_aegis128l_process_crypt(&state, req, ops);
crypto_aegis128l_final(&state, tag_xor, req->assoclen, cryptlen);
}
static int crypto_aegis128l_encrypt(struct aead_request *req)
{
static const struct aegis128l_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_chunk = crypto_aegis128l_encrypt_chunk,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
union aegis_block tag = {};
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_aegis128l_crypt(req, &tag, cryptlen, &ops);
scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
authsize, 1);
return 0;
}
static int crypto_aegis128l_decrypt(struct aead_request *req)
{
static const struct aegis128l_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_chunk = crypto_aegis128l_decrypt_chunk,
};
static const u8 zeros[AEGIS128L_MAX_AUTH_SIZE] = {};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
union aegis_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag.bytes, req->src, req->assoclen + cryptlen,
authsize, 0);
crypto_aegis128l_crypt(req, &tag, cryptlen, &ops);
return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
}
static int crypto_aegis128l_init_tfm(struct crypto_aead *tfm)
{
return 0;
}
static void crypto_aegis128l_exit_tfm(struct crypto_aead *tfm)
{
}
static struct aead_alg crypto_aegis128l_alg = {
.setkey = crypto_aegis128l_setkey,
.setauthsize = crypto_aegis128l_setauthsize,
.encrypt = crypto_aegis128l_encrypt,
.decrypt = crypto_aegis128l_decrypt,
.init = crypto_aegis128l_init_tfm,
.exit = crypto_aegis128l_exit_tfm,
.ivsize = AEGIS128L_NONCE_SIZE,
.maxauthsize = AEGIS128L_MAX_AUTH_SIZE,
.chunksize = AEGIS128L_CHUNK_SIZE,
.base = {
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aegis_ctx),
.cra_alignmask = 0,
.cra_priority = 100,
.cra_name = "aegis128l",
.cra_driver_name = "aegis128l-generic",
.cra_module = THIS_MODULE,
}
};
static int __init crypto_aegis128l_module_init(void)
{
return crypto_register_aead(&crypto_aegis128l_alg);
}
static void __exit crypto_aegis128l_module_exit(void)
{
crypto_unregister_aead(&crypto_aegis128l_alg);
}
subsys_initcall(crypto_aegis128l_module_init);
module_exit(crypto_aegis128l_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("AEGIS-128L AEAD algorithm");
MODULE_ALIAS_CRYPTO("aegis128l");
MODULE_ALIAS_CRYPTO("aegis128l-generic");

View File

@ -1,473 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The AEGIS-256 Authenticated-Encryption Algorithm
*
* Copyright (c) 2017-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <crypto/algapi.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/scatterlist.h>
#include "aegis.h"
#define AEGIS256_NONCE_SIZE 32
#define AEGIS256_STATE_BLOCKS 6
#define AEGIS256_KEY_SIZE 32
#define AEGIS256_MIN_AUTH_SIZE 8
#define AEGIS256_MAX_AUTH_SIZE 16
struct aegis_state {
union aegis_block blocks[AEGIS256_STATE_BLOCKS];
};
struct aegis_ctx {
union aegis_block key[AEGIS256_KEY_SIZE / AEGIS_BLOCK_SIZE];
};
struct aegis256_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_chunk)(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size);
};
static void crypto_aegis256_update(struct aegis_state *state)
{
union aegis_block tmp;
unsigned int i;
tmp = state->blocks[AEGIS256_STATE_BLOCKS - 1];
for (i = AEGIS256_STATE_BLOCKS - 1; i > 0; i--)
crypto_aegis_aesenc(&state->blocks[i], &state->blocks[i - 1],
&state->blocks[i]);
crypto_aegis_aesenc(&state->blocks[0], &tmp, &state->blocks[0]);
}
static void crypto_aegis256_update_a(struct aegis_state *state,
const union aegis_block *msg)
{
crypto_aegis256_update(state);
crypto_aegis_block_xor(&state->blocks[0], msg);
}
static void crypto_aegis256_update_u(struct aegis_state *state, const void *msg)
{
crypto_aegis256_update(state);
crypto_xor(state->blocks[0].bytes, msg, AEGIS_BLOCK_SIZE);
}
static void crypto_aegis256_init(struct aegis_state *state,
const union aegis_block *key,
const u8 *iv)
{
union aegis_block key_iv[2];
unsigned int i;
key_iv[0] = key[0];
key_iv[1] = key[1];
crypto_xor(key_iv[0].bytes, iv + 0 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
crypto_xor(key_iv[1].bytes, iv + 1 * AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
state->blocks[0] = key_iv[0];
state->blocks[1] = key_iv[1];
state->blocks[2] = crypto_aegis_const[1];
state->blocks[3] = crypto_aegis_const[0];
state->blocks[4] = key[0];
state->blocks[5] = key[1];
crypto_aegis_block_xor(&state->blocks[4], &crypto_aegis_const[0]);
crypto_aegis_block_xor(&state->blocks[5], &crypto_aegis_const[1]);
for (i = 0; i < 4; i++) {
crypto_aegis256_update_a(state, &key[0]);
crypto_aegis256_update_a(state, &key[1]);
crypto_aegis256_update_a(state, &key_iv[0]);
crypto_aegis256_update_a(state, &key_iv[1]);
}
}
static void crypto_aegis256_ad(struct aegis_state *state,
const u8 *src, unsigned int size)
{
if (AEGIS_ALIGNED(src)) {
const union aegis_block *src_blk =
(const union aegis_block *)src;
while (size >= AEGIS_BLOCK_SIZE) {
crypto_aegis256_update_a(state, src_blk);
size -= AEGIS_BLOCK_SIZE;
src_blk++;
}
} else {
while (size >= AEGIS_BLOCK_SIZE) {
crypto_aegis256_update_u(state, src);
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
}
}
}
static void crypto_aegis256_encrypt_chunk(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
union aegis_block tmp;
if (AEGIS_ALIGNED(src) && AEGIS_ALIGNED(dst)) {
while (size >= AEGIS_BLOCK_SIZE) {
union aegis_block *dst_blk =
(union aegis_block *)dst;
const union aegis_block *src_blk =
(const union aegis_block *)src;
tmp = state->blocks[2];
crypto_aegis_block_and(&tmp, &state->blocks[3]);
crypto_aegis_block_xor(&tmp, &state->blocks[5]);
crypto_aegis_block_xor(&tmp, &state->blocks[4]);
crypto_aegis_block_xor(&tmp, &state->blocks[1]);
crypto_aegis_block_xor(&tmp, src_blk);
crypto_aegis256_update_a(state, src_blk);
*dst_blk = tmp;
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
dst += AEGIS_BLOCK_SIZE;
}
} else {
while (size >= AEGIS_BLOCK_SIZE) {
tmp = state->blocks[2];
crypto_aegis_block_and(&tmp, &state->blocks[3]);
crypto_aegis_block_xor(&tmp, &state->blocks[5]);
crypto_aegis_block_xor(&tmp, &state->blocks[4]);
crypto_aegis_block_xor(&tmp, &state->blocks[1]);
crypto_xor(tmp.bytes, src, AEGIS_BLOCK_SIZE);
crypto_aegis256_update_u(state, src);
memcpy(dst, tmp.bytes, AEGIS_BLOCK_SIZE);
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
dst += AEGIS_BLOCK_SIZE;
}
}
if (size > 0) {
union aegis_block msg = {};
memcpy(msg.bytes, src, size);
tmp = state->blocks[2];
crypto_aegis_block_and(&tmp, &state->blocks[3]);
crypto_aegis_block_xor(&tmp, &state->blocks[5]);
crypto_aegis_block_xor(&tmp, &state->blocks[4]);
crypto_aegis_block_xor(&tmp, &state->blocks[1]);
crypto_aegis256_update_a(state, &msg);
crypto_aegis_block_xor(&msg, &tmp);
memcpy(dst, msg.bytes, size);
}
}
static void crypto_aegis256_decrypt_chunk(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
union aegis_block tmp;
if (AEGIS_ALIGNED(src) && AEGIS_ALIGNED(dst)) {
while (size >= AEGIS_BLOCK_SIZE) {
union aegis_block *dst_blk =
(union aegis_block *)dst;
const union aegis_block *src_blk =
(const union aegis_block *)src;
tmp = state->blocks[2];
crypto_aegis_block_and(&tmp, &state->blocks[3]);
crypto_aegis_block_xor(&tmp, &state->blocks[5]);
crypto_aegis_block_xor(&tmp, &state->blocks[4]);
crypto_aegis_block_xor(&tmp, &state->blocks[1]);
crypto_aegis_block_xor(&tmp, src_blk);
crypto_aegis256_update_a(state, &tmp);
*dst_blk = tmp;
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
dst += AEGIS_BLOCK_SIZE;
}
} else {
while (size >= AEGIS_BLOCK_SIZE) {
tmp = state->blocks[2];
crypto_aegis_block_and(&tmp, &state->blocks[3]);
crypto_aegis_block_xor(&tmp, &state->blocks[5]);
crypto_aegis_block_xor(&tmp, &state->blocks[4]);
crypto_aegis_block_xor(&tmp, &state->blocks[1]);
crypto_xor(tmp.bytes, src, AEGIS_BLOCK_SIZE);
crypto_aegis256_update_a(state, &tmp);
memcpy(dst, tmp.bytes, AEGIS_BLOCK_SIZE);
size -= AEGIS_BLOCK_SIZE;
src += AEGIS_BLOCK_SIZE;
dst += AEGIS_BLOCK_SIZE;
}
}
if (size > 0) {
union aegis_block msg = {};
memcpy(msg.bytes, src, size);
tmp = state->blocks[2];
crypto_aegis_block_and(&tmp, &state->blocks[3]);
crypto_aegis_block_xor(&tmp, &state->blocks[5]);
crypto_aegis_block_xor(&tmp, &state->blocks[4]);
crypto_aegis_block_xor(&tmp, &state->blocks[1]);
crypto_aegis_block_xor(&msg, &tmp);
memset(msg.bytes + size, 0, AEGIS_BLOCK_SIZE - size);
crypto_aegis256_update_a(state, &msg);
memcpy(dst, msg.bytes, size);
}
}
static void crypto_aegis256_process_ad(struct aegis_state *state,
struct scatterlist *sg_src,
unsigned int assoclen)
{
struct scatter_walk walk;
union aegis_block buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= AEGIS_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = AEGIS_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
crypto_aegis256_update_a(state, &buf);
pos = 0;
left -= fill;
src += fill;
}
crypto_aegis256_ad(state, src, left);
src += left & ~(AEGIS_BLOCK_SIZE - 1);
left &= AEGIS_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, AEGIS_BLOCK_SIZE - pos);
crypto_aegis256_update_a(state, &buf);
}
}
static void crypto_aegis256_process_crypt(struct aegis_state *state,
struct aead_request *req,
const struct aegis256_ops *ops)
{
struct skcipher_walk walk;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) {
unsigned int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes = round_down(nbytes, walk.stride);
ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr,
nbytes);
skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
}
static void crypto_aegis256_final(struct aegis_state *state,
union aegis_block *tag_xor,
u64 assoclen, u64 cryptlen)
{
u64 assocbits = assoclen * 8;
u64 cryptbits = cryptlen * 8;
union aegis_block tmp;
unsigned int i;
tmp.words64[0] = cpu_to_le64(assocbits);
tmp.words64[1] = cpu_to_le64(cryptbits);
crypto_aegis_block_xor(&tmp, &state->blocks[3]);
for (i = 0; i < 7; i++)
crypto_aegis256_update_a(state, &tmp);
for (i = 0; i < AEGIS256_STATE_BLOCKS; i++)
crypto_aegis_block_xor(tag_xor, &state->blocks[i]);
}
static int crypto_aegis256_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct aegis_ctx *ctx = crypto_aead_ctx(aead);
if (keylen != AEGIS256_KEY_SIZE) {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
memcpy(ctx->key[0].bytes, key, AEGIS_BLOCK_SIZE);
memcpy(ctx->key[1].bytes, key + AEGIS_BLOCK_SIZE,
AEGIS_BLOCK_SIZE);
return 0;
}
static int crypto_aegis256_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
if (authsize > AEGIS256_MAX_AUTH_SIZE)
return -EINVAL;
if (authsize < AEGIS256_MIN_AUTH_SIZE)
return -EINVAL;
return 0;
}
static void crypto_aegis256_crypt(struct aead_request *req,
union aegis_block *tag_xor,
unsigned int cryptlen,
const struct aegis256_ops *ops)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aegis_ctx *ctx = crypto_aead_ctx(tfm);
struct aegis_state state;
crypto_aegis256_init(&state, ctx->key, req->iv);
crypto_aegis256_process_ad(&state, req->src, req->assoclen);
crypto_aegis256_process_crypt(&state, req, ops);
crypto_aegis256_final(&state, tag_xor, req->assoclen, cryptlen);
}
static int crypto_aegis256_encrypt(struct aead_request *req)
{
static const struct aegis256_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_chunk = crypto_aegis256_encrypt_chunk,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
union aegis_block tag = {};
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_aegis256_crypt(req, &tag, cryptlen, &ops);
scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
authsize, 1);
return 0;
}
static int crypto_aegis256_decrypt(struct aead_request *req)
{
static const struct aegis256_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_chunk = crypto_aegis256_decrypt_chunk,
};
static const u8 zeros[AEGIS256_MAX_AUTH_SIZE] = {};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
union aegis_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag.bytes, req->src, req->assoclen + cryptlen,
authsize, 0);
crypto_aegis256_crypt(req, &tag, cryptlen, &ops);
return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
}
static int crypto_aegis256_init_tfm(struct crypto_aead *tfm)
{
return 0;
}
static void crypto_aegis256_exit_tfm(struct crypto_aead *tfm)
{
}
static struct aead_alg crypto_aegis256_alg = {
.setkey = crypto_aegis256_setkey,
.setauthsize = crypto_aegis256_setauthsize,
.encrypt = crypto_aegis256_encrypt,
.decrypt = crypto_aegis256_decrypt,
.init = crypto_aegis256_init_tfm,
.exit = crypto_aegis256_exit_tfm,
.ivsize = AEGIS256_NONCE_SIZE,
.maxauthsize = AEGIS256_MAX_AUTH_SIZE,
.chunksize = AEGIS_BLOCK_SIZE,
.base = {
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aegis_ctx),
.cra_alignmask = 0,
.cra_priority = 100,
.cra_name = "aegis256",
.cra_driver_name = "aegis256-generic",
.cra_module = THIS_MODULE,
}
};
static int __init crypto_aegis256_module_init(void)
{
return crypto_register_aead(&crypto_aegis256_alg);
}
static void __exit crypto_aegis256_module_exit(void)
{
crypto_unregister_aead(&crypto_aegis256_alg);
}
subsys_initcall(crypto_aegis256_module_init);
module_exit(crypto_aegis256_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("AEGIS-256 AEAD algorithm");
MODULE_ALIAS_CRYPTO("aegis256");
MODULE_ALIAS_CRYPTO("aegis256-generic");

View File

@ -61,8 +61,6 @@ static inline u8 byte(const u32 x, const unsigned n)
return x >> (n << 3);
}
static const u32 rco_tab[10] = { 1, 2, 4, 8, 16, 32, 64, 128, 27, 54 };
/* cacheline-aligned to facilitate prefetching into cache */
__visible const u32 crypto_ft_tab[4][256] ____cacheline_aligned = {
{
@ -328,7 +326,7 @@ __visible const u32 crypto_ft_tab[4][256] ____cacheline_aligned = {
}
};
__visible const u32 crypto_fl_tab[4][256] ____cacheline_aligned = {
static const u32 crypto_fl_tab[4][256] ____cacheline_aligned = {
{
0x00000063, 0x0000007c, 0x00000077, 0x0000007b,
0x000000f2, 0x0000006b, 0x0000006f, 0x000000c5,
@ -856,7 +854,7 @@ __visible const u32 crypto_it_tab[4][256] ____cacheline_aligned = {
}
};
__visible const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
static const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
{
0x00000052, 0x00000009, 0x0000006a, 0x000000d5,
0x00000030, 0x00000036, 0x000000a5, 0x00000038,
@ -1121,158 +1119,7 @@ __visible const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
};
EXPORT_SYMBOL_GPL(crypto_ft_tab);
EXPORT_SYMBOL_GPL(crypto_fl_tab);
EXPORT_SYMBOL_GPL(crypto_it_tab);
EXPORT_SYMBOL_GPL(crypto_il_tab);
/* initialise the key schedule from the user supplied key */
#define star_x(x) (((x) & 0x7f7f7f7f) << 1) ^ ((((x) & 0x80808080) >> 7) * 0x1b)
#define imix_col(y, x) do { \
u = star_x(x); \
v = star_x(u); \
w = star_x(v); \
t = w ^ (x); \
(y) = u ^ v ^ w; \
(y) ^= ror32(u ^ t, 8) ^ \
ror32(v ^ t, 16) ^ \
ror32(t, 24); \
} while (0)
#define ls_box(x) \
crypto_fl_tab[0][byte(x, 0)] ^ \
crypto_fl_tab[1][byte(x, 1)] ^ \
crypto_fl_tab[2][byte(x, 2)] ^ \
crypto_fl_tab[3][byte(x, 3)]
#define loop4(i) do { \
t = ror32(t, 8); \
t = ls_box(t) ^ rco_tab[i]; \
t ^= ctx->key_enc[4 * i]; \
ctx->key_enc[4 * i + 4] = t; \
t ^= ctx->key_enc[4 * i + 1]; \
ctx->key_enc[4 * i + 5] = t; \
t ^= ctx->key_enc[4 * i + 2]; \
ctx->key_enc[4 * i + 6] = t; \
t ^= ctx->key_enc[4 * i + 3]; \
ctx->key_enc[4 * i + 7] = t; \
} while (0)
#define loop6(i) do { \
t = ror32(t, 8); \
t = ls_box(t) ^ rco_tab[i]; \
t ^= ctx->key_enc[6 * i]; \
ctx->key_enc[6 * i + 6] = t; \
t ^= ctx->key_enc[6 * i + 1]; \
ctx->key_enc[6 * i + 7] = t; \
t ^= ctx->key_enc[6 * i + 2]; \
ctx->key_enc[6 * i + 8] = t; \
t ^= ctx->key_enc[6 * i + 3]; \
ctx->key_enc[6 * i + 9] = t; \
t ^= ctx->key_enc[6 * i + 4]; \
ctx->key_enc[6 * i + 10] = t; \
t ^= ctx->key_enc[6 * i + 5]; \
ctx->key_enc[6 * i + 11] = t; \
} while (0)
#define loop8tophalf(i) do { \
t = ror32(t, 8); \
t = ls_box(t) ^ rco_tab[i]; \
t ^= ctx->key_enc[8 * i]; \
ctx->key_enc[8 * i + 8] = t; \
t ^= ctx->key_enc[8 * i + 1]; \
ctx->key_enc[8 * i + 9] = t; \
t ^= ctx->key_enc[8 * i + 2]; \
ctx->key_enc[8 * i + 10] = t; \
t ^= ctx->key_enc[8 * i + 3]; \
ctx->key_enc[8 * i + 11] = t; \
} while (0)
#define loop8(i) do { \
loop8tophalf(i); \
t = ctx->key_enc[8 * i + 4] ^ ls_box(t); \
ctx->key_enc[8 * i + 12] = t; \
t ^= ctx->key_enc[8 * i + 5]; \
ctx->key_enc[8 * i + 13] = t; \
t ^= ctx->key_enc[8 * i + 6]; \
ctx->key_enc[8 * i + 14] = t; \
t ^= ctx->key_enc[8 * i + 7]; \
ctx->key_enc[8 * i + 15] = t; \
} while (0)
/**
* crypto_aes_expand_key - Expands the AES key as described in FIPS-197
* @ctx: The location where the computed key will be stored.
* @in_key: The supplied key.
* @key_len: The length of the supplied key.
*
* Returns 0 on success. The function fails only if an invalid key size (or
* pointer) is supplied.
* The expanded key size is 240 bytes (max of 14 rounds with a unique 16 bytes
* key schedule plus a 16 bytes key which is used before the first round).
* The decryption key is prepared for the "Equivalent Inverse Cipher" as
* described in FIPS-197. The first slot (16 bytes) of each key (enc or dec) is
* for the initial combination, the second slot for the first round and so on.
*/
int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
unsigned int key_len)
{
u32 i, t, u, v, w, j;
if (key_len != AES_KEYSIZE_128 && key_len != AES_KEYSIZE_192 &&
key_len != AES_KEYSIZE_256)
return -EINVAL;
ctx->key_length = key_len;
ctx->key_enc[0] = get_unaligned_le32(in_key);
ctx->key_enc[1] = get_unaligned_le32(in_key + 4);
ctx->key_enc[2] = get_unaligned_le32(in_key + 8);
ctx->key_enc[3] = get_unaligned_le32(in_key + 12);
ctx->key_dec[key_len + 24] = ctx->key_enc[0];
ctx->key_dec[key_len + 25] = ctx->key_enc[1];
ctx->key_dec[key_len + 26] = ctx->key_enc[2];
ctx->key_dec[key_len + 27] = ctx->key_enc[3];
switch (key_len) {
case AES_KEYSIZE_128:
t = ctx->key_enc[3];
for (i = 0; i < 10; ++i)
loop4(i);
break;
case AES_KEYSIZE_192:
ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
t = ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
for (i = 0; i < 8; ++i)
loop6(i);
break;
case AES_KEYSIZE_256:
ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
ctx->key_enc[6] = get_unaligned_le32(in_key + 24);
t = ctx->key_enc[7] = get_unaligned_le32(in_key + 28);
for (i = 0; i < 6; ++i)
loop8(i);
loop8tophalf(i);
break;
}
ctx->key_dec[0] = ctx->key_enc[key_len + 24];
ctx->key_dec[1] = ctx->key_enc[key_len + 25];
ctx->key_dec[2] = ctx->key_enc[key_len + 26];
ctx->key_dec[3] = ctx->key_enc[key_len + 27];
for (i = 4; i < key_len + 24; ++i) {
j = key_len + 24 - (i & ~3) + (i & 3);
imix_col(ctx->key_dec[j], ctx->key_enc[i]);
}
return 0;
}
EXPORT_SYMBOL_GPL(crypto_aes_expand_key);
/**
* crypto_aes_set_key - Set the AES key.
@ -1281,7 +1128,7 @@ EXPORT_SYMBOL_GPL(crypto_aes_expand_key);
* @key_len: The size of the key.
*
* Returns 0 on success, on failure the %CRYPTO_TFM_RES_BAD_KEY_LEN flag in tfm
* is set. The function uses crypto_aes_expand_key() to expand the key.
* is set. The function uses aes_expand_key() to expand the key.
* &crypto_aes_ctx _must_ be the private data embedded in @tfm which is
* retrieved with crypto_tfm_ctx().
*/
@ -1292,7 +1139,7 @@ int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
u32 *flags = &tfm->crt_flags;
int ret;
ret = crypto_aes_expand_key(ctx, in_key, key_len);
ret = aes_expandkey(ctx, in_key, key_len);
if (!ret)
return 0;
@ -1332,7 +1179,7 @@ EXPORT_SYMBOL_GPL(crypto_aes_set_key);
f_rl(bo, bi, 3, k); \
} while (0)
static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
u32 b0[4], b1[4];
@ -1402,7 +1249,7 @@ static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
i_rl(bo, bi, 3, k); \
} while (0)
static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
u32 b0[4], b1[4];
@ -1454,8 +1301,8 @@ static struct crypto_alg aes_alg = {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
.cia_setkey = crypto_aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt
.cia_encrypt = crypto_aes_encrypt,
.cia_decrypt = crypto_aes_decrypt
}
}
};

View File

@ -8,271 +8,19 @@
#include <crypto/aes.h>
#include <linux/crypto.h>
#include <linux/module.h>
#include <asm/unaligned.h>
/*
* Emit the sbox as volatile const to prevent the compiler from doing
* constant folding on sbox references involving fixed indexes.
*/
static volatile const u8 __cacheline_aligned __aesti_sbox[] = {
0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,
0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a,
0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0,
0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b,
0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85,
0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5,
0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17,
0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88,
0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c,
0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9,
0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6,
0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e,
0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94,
0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,
0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
};
static volatile const u8 __cacheline_aligned __aesti_inv_sbox[] = {
0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,
0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2,
0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16,
0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda,
0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a,
0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02,
0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea,
0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85,
0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89,
0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20,
0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31,
0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d,
0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0,
0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26,
0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d,
};
static u32 mul_by_x(u32 w)
{
u32 x = w & 0x7f7f7f7f;
u32 y = w & 0x80808080;
/* multiply by polynomial 'x' (0b10) in GF(2^8) */
return (x << 1) ^ (y >> 7) * 0x1b;
}
static u32 mul_by_x2(u32 w)
{
u32 x = w & 0x3f3f3f3f;
u32 y = w & 0x80808080;
u32 z = w & 0x40404040;
/* multiply by polynomial 'x^2' (0b100) in GF(2^8) */
return (x << 2) ^ (y >> 7) * 0x36 ^ (z >> 6) * 0x1b;
}
static u32 mix_columns(u32 x)
{
/*
* Perform the following matrix multiplication in GF(2^8)
*
* | 0x2 0x3 0x1 0x1 | | x[0] |
* | 0x1 0x2 0x3 0x1 | | x[1] |
* | 0x1 0x1 0x2 0x3 | x | x[2] |
* | 0x3 0x1 0x1 0x2 | | x[3] |
*/
u32 y = mul_by_x(x) ^ ror32(x, 16);
return y ^ ror32(x ^ y, 8);
}
static u32 inv_mix_columns(u32 x)
{
/*
* Perform the following matrix multiplication in GF(2^8)
*
* | 0xe 0xb 0xd 0x9 | | x[0] |
* | 0x9 0xe 0xb 0xd | | x[1] |
* | 0xd 0x9 0xe 0xb | x | x[2] |
* | 0xb 0xd 0x9 0xe | | x[3] |
*
* which can conveniently be reduced to
*
* | 0x2 0x3 0x1 0x1 | | 0x5 0x0 0x4 0x0 | | x[0] |
* | 0x1 0x2 0x3 0x1 | | 0x0 0x5 0x0 0x4 | | x[1] |
* | 0x1 0x1 0x2 0x3 | x | 0x4 0x0 0x5 0x0 | x | x[2] |
* | 0x3 0x1 0x1 0x2 | | 0x0 0x4 0x0 0x5 | | x[3] |
*/
u32 y = mul_by_x2(x);
return mix_columns(x ^ y ^ ror32(y, 16));
}
static __always_inline u32 subshift(u32 in[], int pos)
{
return (__aesti_sbox[in[pos] & 0xff]) ^
(__aesti_sbox[(in[(pos + 1) % 4] >> 8) & 0xff] << 8) ^
(__aesti_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
(__aesti_sbox[(in[(pos + 3) % 4] >> 24) & 0xff] << 24);
}
static __always_inline u32 inv_subshift(u32 in[], int pos)
{
return (__aesti_inv_sbox[in[pos] & 0xff]) ^
(__aesti_inv_sbox[(in[(pos + 3) % 4] >> 8) & 0xff] << 8) ^
(__aesti_inv_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
(__aesti_inv_sbox[(in[(pos + 1) % 4] >> 24) & 0xff] << 24);
}
static u32 subw(u32 in)
{
return (__aesti_sbox[in & 0xff]) ^
(__aesti_sbox[(in >> 8) & 0xff] << 8) ^
(__aesti_sbox[(in >> 16) & 0xff] << 16) ^
(__aesti_sbox[(in >> 24) & 0xff] << 24);
}
static int aesti_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
unsigned int key_len)
{
u32 kwords = key_len / sizeof(u32);
u32 rc, i, j;
if (key_len != AES_KEYSIZE_128 &&
key_len != AES_KEYSIZE_192 &&
key_len != AES_KEYSIZE_256)
return -EINVAL;
ctx->key_length = key_len;
for (i = 0; i < kwords; i++)
ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
for (i = 0, rc = 1; i < 10; i++, rc = mul_by_x(rc)) {
u32 *rki = ctx->key_enc + (i * kwords);
u32 *rko = rki + kwords;
rko[0] = ror32(subw(rki[kwords - 1]), 8) ^ rc ^ rki[0];
rko[1] = rko[0] ^ rki[1];
rko[2] = rko[1] ^ rki[2];
rko[3] = rko[2] ^ rki[3];
if (key_len == 24) {
if (i >= 7)
break;
rko[4] = rko[3] ^ rki[4];
rko[5] = rko[4] ^ rki[5];
} else if (key_len == 32) {
if (i >= 6)
break;
rko[4] = subw(rko[3]) ^ rki[4];
rko[5] = rko[4] ^ rki[5];
rko[6] = rko[5] ^ rki[6];
rko[7] = rko[6] ^ rki[7];
}
}
/*
* Generate the decryption keys for the Equivalent Inverse Cipher.
* This involves reversing the order of the round keys, and applying
* the Inverse Mix Columns transformation to all but the first and
* the last one.
*/
ctx->key_dec[0] = ctx->key_enc[key_len + 24];
ctx->key_dec[1] = ctx->key_enc[key_len + 25];
ctx->key_dec[2] = ctx->key_enc[key_len + 26];
ctx->key_dec[3] = ctx->key_enc[key_len + 27];
for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
ctx->key_dec[i] = inv_mix_columns(ctx->key_enc[j]);
ctx->key_dec[i + 1] = inv_mix_columns(ctx->key_enc[j + 1]);
ctx->key_dec[i + 2] = inv_mix_columns(ctx->key_enc[j + 2]);
ctx->key_dec[i + 3] = inv_mix_columns(ctx->key_enc[j + 3]);
}
ctx->key_dec[i] = ctx->key_enc[0];
ctx->key_dec[i + 1] = ctx->key_enc[1];
ctx->key_dec[i + 2] = ctx->key_enc[2];
ctx->key_dec[i + 3] = ctx->key_enc[3];
return 0;
}
static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int err;
err = aesti_expand_key(ctx, in_key, key_len);
if (err)
return err;
/*
* In order to force the compiler to emit data independent Sbox lookups
* at the start of each block, xor the first round key with values at
* fixed indexes in the Sbox. This will need to be repeated each time
* the key is used, which will pull the entire Sbox into the D-cache
* before any data dependent Sbox lookups are performed.
*/
ctx->key_enc[0] ^= __aesti_sbox[ 0] ^ __aesti_sbox[128];
ctx->key_enc[1] ^= __aesti_sbox[32] ^ __aesti_sbox[160];
ctx->key_enc[2] ^= __aesti_sbox[64] ^ __aesti_sbox[192];
ctx->key_enc[3] ^= __aesti_sbox[96] ^ __aesti_sbox[224];
ctx->key_dec[0] ^= __aesti_inv_sbox[ 0] ^ __aesti_inv_sbox[128];
ctx->key_dec[1] ^= __aesti_inv_sbox[32] ^ __aesti_inv_sbox[160];
ctx->key_dec[2] ^= __aesti_inv_sbox[64] ^ __aesti_inv_sbox[192];
ctx->key_dec[3] ^= __aesti_inv_sbox[96] ^ __aesti_inv_sbox[224];
return 0;
return aes_expandkey(ctx, in_key, key_len);
}
static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
const u32 *rkp = ctx->key_enc + 4;
int rounds = 6 + ctx->key_length / 4;
u32 st0[4], st1[4];
unsigned long flags;
int round;
st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
st0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
/*
* Temporarily disable interrupts to avoid races where cachelines are
@ -280,30 +28,7 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
*/
local_irq_save(flags);
st0[0] ^= __aesti_sbox[ 0] ^ __aesti_sbox[128];
st0[1] ^= __aesti_sbox[32] ^ __aesti_sbox[160];
st0[2] ^= __aesti_sbox[64] ^ __aesti_sbox[192];
st0[3] ^= __aesti_sbox[96] ^ __aesti_sbox[224];
for (round = 0;; round += 2, rkp += 8) {
st1[0] = mix_columns(subshift(st0, 0)) ^ rkp[0];
st1[1] = mix_columns(subshift(st0, 1)) ^ rkp[1];
st1[2] = mix_columns(subshift(st0, 2)) ^ rkp[2];
st1[3] = mix_columns(subshift(st0, 3)) ^ rkp[3];
if (round == rounds - 2)
break;
st0[0] = mix_columns(subshift(st1, 0)) ^ rkp[4];
st0[1] = mix_columns(subshift(st1, 1)) ^ rkp[5];
st0[2] = mix_columns(subshift(st1, 2)) ^ rkp[6];
st0[3] = mix_columns(subshift(st1, 3)) ^ rkp[7];
}
put_unaligned_le32(subshift(st1, 0) ^ rkp[4], out);
put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
aes_encrypt(ctx, out, in);
local_irq_restore(flags);
}
@ -311,16 +36,7 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
const u32 *rkp = ctx->key_dec + 4;
int rounds = 6 + ctx->key_length / 4;
u32 st0[4], st1[4];
unsigned long flags;
int round;
st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
/*
* Temporarily disable interrupts to avoid races where cachelines are
@ -328,30 +44,7 @@ static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
*/
local_irq_save(flags);
st0[0] ^= __aesti_inv_sbox[ 0] ^ __aesti_inv_sbox[128];
st0[1] ^= __aesti_inv_sbox[32] ^ __aesti_inv_sbox[160];
st0[2] ^= __aesti_inv_sbox[64] ^ __aesti_inv_sbox[192];
st0[3] ^= __aesti_inv_sbox[96] ^ __aesti_inv_sbox[224];
for (round = 0;; round += 2, rkp += 8) {
st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
if (round == rounds - 2)
break;
st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
}
put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
aes_decrypt(ctx, out, in);
local_irq_restore(flags);
}

View File

@ -16,7 +16,7 @@
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/cryptd.h>
#include <linux/atomic.h>
#include <linux/refcount.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
@ -63,7 +63,7 @@ struct aead_instance_ctx {
};
struct cryptd_skcipher_ctx {
atomic_t refcnt;
refcount_t refcnt;
struct crypto_sync_skcipher *child;
};
@ -72,7 +72,7 @@ struct cryptd_skcipher_request_ctx {
};
struct cryptd_hash_ctx {
atomic_t refcnt;
refcount_t refcnt;
struct crypto_shash *child;
};
@ -82,7 +82,7 @@ struct cryptd_hash_request_ctx {
};
struct cryptd_aead_ctx {
atomic_t refcnt;
refcount_t refcnt;
struct crypto_aead *child;
};
@ -127,7 +127,7 @@ static int cryptd_enqueue_request(struct cryptd_queue *queue,
{
int cpu, err;
struct cryptd_cpu_queue *cpu_queue;
atomic_t *refcnt;
refcount_t *refcnt;
cpu = get_cpu();
cpu_queue = this_cpu_ptr(queue->cpu_queue);
@ -140,10 +140,10 @@ static int cryptd_enqueue_request(struct cryptd_queue *queue,
queue_work_on(cpu, cryptd_wq, &cpu_queue->work);
if (!atomic_read(refcnt))
if (!refcount_read(refcnt))
goto out_put_cpu;
atomic_inc(refcnt);
refcount_inc(refcnt);
out_put_cpu:
put_cpu();
@ -270,13 +270,13 @@ static void cryptd_skcipher_complete(struct skcipher_request *req, int err)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);
struct cryptd_skcipher_request_ctx *rctx = skcipher_request_ctx(req);
int refcnt = atomic_read(&ctx->refcnt);
int refcnt = refcount_read(&ctx->refcnt);
local_bh_disable();
rctx->complete(&req->base, err);
local_bh_enable();
if (err != -EINPROGRESS && refcnt && atomic_dec_and_test(&ctx->refcnt))
if (err != -EINPROGRESS && refcnt && refcount_dec_and_test(&ctx->refcnt))
crypto_free_skcipher(tfm);
}
@ -521,13 +521,13 @@ static void cryptd_hash_complete(struct ahash_request *req, int err)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct cryptd_hash_ctx *ctx = crypto_ahash_ctx(tfm);
struct cryptd_hash_request_ctx *rctx = ahash_request_ctx(req);
int refcnt = atomic_read(&ctx->refcnt);
int refcnt = refcount_read(&ctx->refcnt);
local_bh_disable();
rctx->complete(&req->base, err);
local_bh_enable();
if (err != -EINPROGRESS && refcnt && atomic_dec_and_test(&ctx->refcnt))
if (err != -EINPROGRESS && refcnt && refcount_dec_and_test(&ctx->refcnt))
crypto_free_ahash(tfm);
}
@ -772,13 +772,13 @@ static void cryptd_aead_crypt(struct aead_request *req,
out:
ctx = crypto_aead_ctx(tfm);
refcnt = atomic_read(&ctx->refcnt);
refcnt = refcount_read(&ctx->refcnt);
local_bh_disable();
compl(&req->base, err);
local_bh_enable();
if (err != -EINPROGRESS && refcnt && atomic_dec_and_test(&ctx->refcnt))
if (err != -EINPROGRESS && refcnt && refcount_dec_and_test(&ctx->refcnt))
crypto_free_aead(tfm);
}
@ -979,7 +979,7 @@ struct cryptd_skcipher *cryptd_alloc_skcipher(const char *alg_name,
}
ctx = crypto_skcipher_ctx(tfm);
atomic_set(&ctx->refcnt, 1);
refcount_set(&ctx->refcnt, 1);
return container_of(tfm, struct cryptd_skcipher, base);
}
@ -997,7 +997,7 @@ bool cryptd_skcipher_queued(struct cryptd_skcipher *tfm)
{
struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(&tfm->base);
return atomic_read(&ctx->refcnt) - 1;
return refcount_read(&ctx->refcnt) - 1;
}
EXPORT_SYMBOL_GPL(cryptd_skcipher_queued);
@ -1005,7 +1005,7 @@ void cryptd_free_skcipher(struct cryptd_skcipher *tfm)
{
struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(&tfm->base);
if (atomic_dec_and_test(&ctx->refcnt))
if (refcount_dec_and_test(&ctx->refcnt))
crypto_free_skcipher(&tfm->base);
}
EXPORT_SYMBOL_GPL(cryptd_free_skcipher);
@ -1029,7 +1029,7 @@ struct cryptd_ahash *cryptd_alloc_ahash(const char *alg_name,
}
ctx = crypto_ahash_ctx(tfm);
atomic_set(&ctx->refcnt, 1);
refcount_set(&ctx->refcnt, 1);
return __cryptd_ahash_cast(tfm);
}
@ -1054,7 +1054,7 @@ bool cryptd_ahash_queued(struct cryptd_ahash *tfm)
{
struct cryptd_hash_ctx *ctx = crypto_ahash_ctx(&tfm->base);
return atomic_read(&ctx->refcnt) - 1;
return refcount_read(&ctx->refcnt) - 1;
}
EXPORT_SYMBOL_GPL(cryptd_ahash_queued);
@ -1062,7 +1062,7 @@ void cryptd_free_ahash(struct cryptd_ahash *tfm)
{
struct cryptd_hash_ctx *ctx = crypto_ahash_ctx(&tfm->base);
if (atomic_dec_and_test(&ctx->refcnt))
if (refcount_dec_and_test(&ctx->refcnt))
crypto_free_ahash(&tfm->base);
}
EXPORT_SYMBOL_GPL(cryptd_free_ahash);
@ -1086,7 +1086,7 @@ struct cryptd_aead *cryptd_alloc_aead(const char *alg_name,
}
ctx = crypto_aead_ctx(tfm);
atomic_set(&ctx->refcnt, 1);
refcount_set(&ctx->refcnt, 1);
return __cryptd_aead_cast(tfm);
}
@ -1104,7 +1104,7 @@ bool cryptd_aead_queued(struct cryptd_aead *tfm)
{
struct cryptd_aead_ctx *ctx = crypto_aead_ctx(&tfm->base);
return atomic_read(&ctx->refcnt) - 1;
return refcount_read(&ctx->refcnt) - 1;
}
EXPORT_SYMBOL_GPL(cryptd_aead_queued);
@ -1112,7 +1112,7 @@ void cryptd_free_aead(struct cryptd_aead *tfm)
{
struct cryptd_aead_ctx *ctx = crypto_aead_ctx(&tfm->base);
if (atomic_dec_and_test(&ctx->refcnt))
if (refcount_dec_and_test(&ctx->refcnt))
crypto_free_aead(&tfm->base);
}
EXPORT_SYMBOL_GPL(cryptd_free_aead);

View File

@ -425,7 +425,7 @@ EXPORT_SYMBOL_GPL(crypto_engine_stop);
*/
struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
{
struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
struct sched_param param = { .sched_priority = MAX_RT_PRIO / 2 };
struct crypto_engine *engine;
if (!dev)

View File

@ -10,9 +10,10 @@
#include <linux/crypto.h>
#include <linux/cryptouser.h>
#include <linux/sched.h>
#include <net/netlink.h>
#include <linux/security.h>
#include <net/netlink.h>
#include <net/net_namespace.h>
#include <net/sock.h>
#include <crypto/internal/skcipher.h>
#include <crypto/internal/rng.h>
#include <crypto/akcipher.h>
@ -25,9 +26,6 @@
static DEFINE_MUTEX(crypto_cfg_mutex);
/* The crypto netlink socket */
struct sock *crypto_nlsk;
struct crypto_dump_info {
struct sk_buff *in_skb;
struct sk_buff *out_skb;
@ -186,6 +184,7 @@ out:
static int crypto_report(struct sk_buff *in_skb, struct nlmsghdr *in_nlh,
struct nlattr **attrs)
{
struct net *net = sock_net(in_skb->sk);
struct crypto_user_alg *p = nlmsg_data(in_nlh);
struct crypto_alg *alg;
struct sk_buff *skb;
@ -217,7 +216,7 @@ drop_alg:
if (err)
return err;
return nlmsg_unicast(crypto_nlsk, skb, NETLINK_CB(in_skb).portid);
return nlmsg_unicast(net->crypto_nlsk, skb, NETLINK_CB(in_skb).portid);
}
static int crypto_dump_report(struct sk_buff *skb, struct netlink_callback *cb)
@ -420,6 +419,7 @@ static const struct crypto_link {
static int crypto_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct net *net = sock_net(skb->sk);
struct nlattr *attrs[CRYPTOCFGA_MAX+1];
const struct crypto_link *link;
int type, err;
@ -450,7 +450,7 @@ static int crypto_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
.done = link->done,
.min_dump_alloc = min(dump_alloc, 65535UL),
};
err = netlink_dump_start(crypto_nlsk, skb, nlh, &c);
err = netlink_dump_start(net->crypto_nlsk, skb, nlh, &c);
}
return err;
@ -474,22 +474,35 @@ static void crypto_netlink_rcv(struct sk_buff *skb)
mutex_unlock(&crypto_cfg_mutex);
}
static int __init crypto_user_init(void)
static int __net_init crypto_netlink_init(struct net *net)
{
struct netlink_kernel_cfg cfg = {
.input = crypto_netlink_rcv,
};
crypto_nlsk = netlink_kernel_create(&init_net, NETLINK_CRYPTO, &cfg);
if (!crypto_nlsk)
return -ENOMEM;
net->crypto_nlsk = netlink_kernel_create(net, NETLINK_CRYPTO, &cfg);
return net->crypto_nlsk == NULL ? -ENOMEM : 0;
}
return 0;
static void __net_exit crypto_netlink_exit(struct net *net)
{
netlink_kernel_release(net->crypto_nlsk);
net->crypto_nlsk = NULL;
}
static struct pernet_operations crypto_netlink_net_ops = {
.init = crypto_netlink_init,
.exit = crypto_netlink_exit,
};
static int __init crypto_user_init(void)
{
return register_pernet_subsys(&crypto_netlink_net_ops);
}
static void __exit crypto_user_exit(void)
{
netlink_kernel_release(crypto_nlsk);
unregister_pernet_subsys(&crypto_netlink_net_ops);
}
module_init(crypto_user_init);

View File

@ -10,6 +10,7 @@
#include <linux/cryptouser.h>
#include <linux/sched.h>
#include <net/netlink.h>
#include <net/sock.h>
#include <crypto/internal/skcipher.h>
#include <crypto/internal/rng.h>
#include <crypto/akcipher.h>
@ -298,6 +299,7 @@ out:
int crypto_reportstat(struct sk_buff *in_skb, struct nlmsghdr *in_nlh,
struct nlattr **attrs)
{
struct net *net = sock_net(in_skb->sk);
struct crypto_user_alg *p = nlmsg_data(in_nlh);
struct crypto_alg *alg;
struct sk_buff *skb;
@ -329,7 +331,7 @@ drop_alg:
if (err)
return err;
return nlmsg_unicast(crypto_nlsk, skb, NETLINK_CB(in_skb).portid);
return nlmsg_unicast(net->crypto_nlsk, skb, NETLINK_CB(in_skb).portid);
}
MODULE_LICENSE("GPL");

File diff suppressed because it is too large Load Diff

View File

@ -11,10 +11,14 @@
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/sysctl.h>
#include <linux/notifier.h>
int fips_enabled;
EXPORT_SYMBOL_GPL(fips_enabled);
ATOMIC_NOTIFIER_HEAD(fips_fail_notif_chain);
EXPORT_SYMBOL_GPL(fips_fail_notif_chain);
/* Process kernel command-line parameter at boot time. fips=0 or fips=1 */
static int fips_enable(char *str)
{
@ -58,6 +62,13 @@ static void crypto_proc_fips_exit(void)
unregister_sysctl_table(crypto_sysctls);
}
void fips_fail_notify(void)
{
if (fips_enabled)
atomic_notifier_call_chain(&fips_fail_notif_chain, 0, NULL);
}
EXPORT_SYMBOL_GPL(fips_fail_notify);
static int __init fips_init(void)
{
crypto_proc_fips_init();

View File

@ -152,20 +152,7 @@ out:
static int crypto_gcm_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
switch (authsize) {
case 4:
case 8:
case 12:
case 13:
case 14:
case 15:
case 16:
break;
default:
return -EINVAL;
}
return 0;
return crypto_gcm_check_authsize(authsize);
}
static void crypto_gcm_init_common(struct aead_request *req)
@ -762,15 +749,11 @@ static int crypto_rfc4106_setauthsize(struct crypto_aead *parent,
unsigned int authsize)
{
struct crypto_rfc4106_ctx *ctx = crypto_aead_ctx(parent);
int err;
switch (authsize) {
case 8:
case 12:
case 16:
break;
default:
return -EINVAL;
}
err = crypto_rfc4106_check_authsize(authsize);
if (err)
return err;
return crypto_aead_setauthsize(ctx->child, authsize);
}
@ -818,8 +801,11 @@ static struct aead_request *crypto_rfc4106_crypt(struct aead_request *req)
static int crypto_rfc4106_encrypt(struct aead_request *req)
{
if (req->assoclen != 16 && req->assoclen != 20)
return -EINVAL;
int err;
err = crypto_ipsec_check_assoclen(req->assoclen);
if (err)
return err;
req = crypto_rfc4106_crypt(req);
@ -828,8 +814,11 @@ static int crypto_rfc4106_encrypt(struct aead_request *req)
static int crypto_rfc4106_decrypt(struct aead_request *req)
{
if (req->assoclen != 16 && req->assoclen != 20)
return -EINVAL;
int err;
err = crypto_ipsec_check_assoclen(req->assoclen);
if (err)
return err;
req = crypto_rfc4106_crypt(req);
@ -1045,12 +1034,14 @@ static int crypto_rfc4543_copy_src_to_dst(struct aead_request *req, bool enc)
static int crypto_rfc4543_encrypt(struct aead_request *req)
{
return crypto_rfc4543_crypt(req, true);
return crypto_ipsec_check_assoclen(req->assoclen) ?:
crypto_rfc4543_crypt(req, true);
}
static int crypto_rfc4543_decrypt(struct aead_request *req)
{
return crypto_rfc4543_crypt(req, false);
return crypto_ipsec_check_assoclen(req->assoclen) ?:
crypto_rfc4543_crypt(req, false);
}
static int crypto_rfc4543_init_tfm(struct crypto_aead *tfm)

View File

@ -1,12 +1,37 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* GHASH: digest algorithm for GCM (Galois/Counter Mode).
* GHASH: hash function for GCM (Galois/Counter Mode).
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
* Copyright (c) 2009 Intel Corp.
* Author: Huang Ying <ying.huang@intel.com>
*/
/*
* GHASH is a keyed hash function used in GCM authentication tag generation.
*
* The algorithm implementation is copied from gcm.c.
* The original GCM paper [1] presents GHASH as a function GHASH(H, A, C) which
* takes a 16-byte hash key H, additional authenticated data A, and a ciphertext
* C. It formats A and C into a single byte string X, interprets X as a
* polynomial over GF(2^128), and evaluates this polynomial at the point H.
*
* However, the NIST standard for GCM [2] presents GHASH as GHASH(H, X) where X
* is the already-formatted byte string containing both A and C.
*
* "ghash" in the Linux crypto API uses the 'X' (pre-formatted) convention,
* since the API supports only a single data stream per hash. Thus, the
* formatting of 'A' and 'C' is done in the "gcm" template, not in "ghash".
*
* The reason "ghash" is separate from "gcm" is to allow "gcm" to use an
* accelerated "ghash" when a standalone accelerated "gcm(aes)" is unavailable.
* It is generally inappropriate to use "ghash" for other purposes, since it is
* an "ε-almost-XOR-universal hash function", not a cryptographic hash function.
* It can only be used securely in crypto modes specially designed to use it.
*
* [1] The Galois/Counter Mode of Operation (GCM)
* (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.694.695&rep=rep1&type=pdf)
* [2] Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC
* (https://csrc.nist.gov/publications/detail/sp/800-38d/final)
*/
#include <crypto/algapi.h>
@ -156,6 +181,6 @@ subsys_initcall(ghash_mod_init);
module_exit(ghash_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("GHASH Message Digest Algorithm");
MODULE_DESCRIPTION("GHASH hash function");
MODULE_ALIAS_CRYPTO("ghash");
MODULE_ALIAS_CRYPTO("ghash-generic");

View File

@ -1,542 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-1280 Authenticated-Encryption Algorithm
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <asm/unaligned.h>
#include <crypto/algapi.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/morus_common.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/scatterlist.h>
#define MORUS1280_WORD_SIZE 8
#define MORUS1280_BLOCK_SIZE (MORUS_BLOCK_WORDS * MORUS1280_WORD_SIZE)
#define MORUS1280_BLOCK_ALIGN (__alignof__(__le64))
#define MORUS1280_ALIGNED(p) IS_ALIGNED((uintptr_t)p, MORUS1280_BLOCK_ALIGN)
struct morus1280_block {
u64 words[MORUS_BLOCK_WORDS];
};
union morus1280_block_in {
__le64 words[MORUS_BLOCK_WORDS];
u8 bytes[MORUS1280_BLOCK_SIZE];
};
struct morus1280_state {
struct morus1280_block s[MORUS_STATE_BLOCKS];
};
struct morus1280_ctx {
struct morus1280_block key;
};
struct morus1280_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_chunk)(struct morus1280_state *state,
u8 *dst, const u8 *src, unsigned int size);
};
static const struct morus1280_block crypto_morus1280_const[1] = {
{ .words = {
U64_C(0x0d08050302010100),
U64_C(0x6279e99059372215),
U64_C(0xf12fc26d55183ddb),
U64_C(0xdd28b57342311120),
} },
};
static void crypto_morus1280_round(struct morus1280_block *b0,
struct morus1280_block *b1,
struct morus1280_block *b2,
struct morus1280_block *b3,
struct morus1280_block *b4,
const struct morus1280_block *m,
unsigned int b, unsigned int w)
{
unsigned int i;
struct morus1280_block tmp;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
b0->words[i] ^= b1->words[i] & b2->words[i];
b0->words[i] ^= b3->words[i];
b0->words[i] ^= m->words[i];
b0->words[i] = rol64(b0->words[i], b);
}
tmp = *b3;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
b3->words[(i + w) % MORUS_BLOCK_WORDS] = tmp.words[i];
}
static void crypto_morus1280_update(struct morus1280_state *state,
const struct morus1280_block *m)
{
static const struct morus1280_block z = {};
struct morus1280_block *s = state->s;
crypto_morus1280_round(&s[0], &s[1], &s[2], &s[3], &s[4], &z, 13, 1);
crypto_morus1280_round(&s[1], &s[2], &s[3], &s[4], &s[0], m, 46, 2);
crypto_morus1280_round(&s[2], &s[3], &s[4], &s[0], &s[1], m, 38, 3);
crypto_morus1280_round(&s[3], &s[4], &s[0], &s[1], &s[2], m, 7, 2);
crypto_morus1280_round(&s[4], &s[0], &s[1], &s[2], &s[3], m, 4, 1);
}
static void crypto_morus1280_load_a(struct morus1280_block *dst, const u8 *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
dst->words[i] = le64_to_cpu(*(const __le64 *)src);
src += MORUS1280_WORD_SIZE;
}
}
static void crypto_morus1280_load_u(struct morus1280_block *dst, const u8 *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
dst->words[i] = get_unaligned_le64(src);
src += MORUS1280_WORD_SIZE;
}
}
static void crypto_morus1280_load(struct morus1280_block *dst, const u8 *src)
{
if (MORUS1280_ALIGNED(src))
crypto_morus1280_load_a(dst, src);
else
crypto_morus1280_load_u(dst, src);
}
static void crypto_morus1280_store_a(u8 *dst, const struct morus1280_block *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
*(__le64 *)dst = cpu_to_le64(src->words[i]);
dst += MORUS1280_WORD_SIZE;
}
}
static void crypto_morus1280_store_u(u8 *dst, const struct morus1280_block *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
put_unaligned_le64(src->words[i], dst);
dst += MORUS1280_WORD_SIZE;
}
}
static void crypto_morus1280_store(u8 *dst, const struct morus1280_block *src)
{
if (MORUS1280_ALIGNED(dst))
crypto_morus1280_store_a(dst, src);
else
crypto_morus1280_store_u(dst, src);
}
static void crypto_morus1280_ad(struct morus1280_state *state, const u8 *src,
unsigned int size)
{
struct morus1280_block m;
if (MORUS1280_ALIGNED(src)) {
while (size >= MORUS1280_BLOCK_SIZE) {
crypto_morus1280_load_a(&m, src);
crypto_morus1280_update(state, &m);
size -= MORUS1280_BLOCK_SIZE;
src += MORUS1280_BLOCK_SIZE;
}
} else {
while (size >= MORUS1280_BLOCK_SIZE) {
crypto_morus1280_load_u(&m, src);
crypto_morus1280_update(state, &m);
size -= MORUS1280_BLOCK_SIZE;
src += MORUS1280_BLOCK_SIZE;
}
}
}
static void crypto_morus1280_core(const struct morus1280_state *state,
struct morus1280_block *blk)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
blk->words[(i + 3) % MORUS_BLOCK_WORDS] ^= state->s[1].words[i];
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
blk->words[i] ^= state->s[0].words[i];
blk->words[i] ^= state->s[2].words[i] & state->s[3].words[i];
}
}
static void crypto_morus1280_encrypt_chunk(struct morus1280_state *state,
u8 *dst, const u8 *src,
unsigned int size)
{
struct morus1280_block c, m;
if (MORUS1280_ALIGNED(src) && MORUS1280_ALIGNED(dst)) {
while (size >= MORUS1280_BLOCK_SIZE) {
crypto_morus1280_load_a(&m, src);
c = m;
crypto_morus1280_core(state, &c);
crypto_morus1280_store_a(dst, &c);
crypto_morus1280_update(state, &m);
src += MORUS1280_BLOCK_SIZE;
dst += MORUS1280_BLOCK_SIZE;
size -= MORUS1280_BLOCK_SIZE;
}
} else {
while (size >= MORUS1280_BLOCK_SIZE) {
crypto_morus1280_load_u(&m, src);
c = m;
crypto_morus1280_core(state, &c);
crypto_morus1280_store_u(dst, &c);
crypto_morus1280_update(state, &m);
src += MORUS1280_BLOCK_SIZE;
dst += MORUS1280_BLOCK_SIZE;
size -= MORUS1280_BLOCK_SIZE;
}
}
if (size > 0) {
union morus1280_block_in tail;
memcpy(tail.bytes, src, size);
memset(tail.bytes + size, 0, MORUS1280_BLOCK_SIZE - size);
crypto_morus1280_load_a(&m, tail.bytes);
c = m;
crypto_morus1280_core(state, &c);
crypto_morus1280_store_a(tail.bytes, &c);
crypto_morus1280_update(state, &m);
memcpy(dst, tail.bytes, size);
}
}
static void crypto_morus1280_decrypt_chunk(struct morus1280_state *state,
u8 *dst, const u8 *src,
unsigned int size)
{
struct morus1280_block m;
if (MORUS1280_ALIGNED(src) && MORUS1280_ALIGNED(dst)) {
while (size >= MORUS1280_BLOCK_SIZE) {
crypto_morus1280_load_a(&m, src);
crypto_morus1280_core(state, &m);
crypto_morus1280_store_a(dst, &m);
crypto_morus1280_update(state, &m);
src += MORUS1280_BLOCK_SIZE;
dst += MORUS1280_BLOCK_SIZE;
size -= MORUS1280_BLOCK_SIZE;
}
} else {
while (size >= MORUS1280_BLOCK_SIZE) {
crypto_morus1280_load_u(&m, src);
crypto_morus1280_core(state, &m);
crypto_morus1280_store_u(dst, &m);
crypto_morus1280_update(state, &m);
src += MORUS1280_BLOCK_SIZE;
dst += MORUS1280_BLOCK_SIZE;
size -= MORUS1280_BLOCK_SIZE;
}
}
if (size > 0) {
union morus1280_block_in tail;
memcpy(tail.bytes, src, size);
memset(tail.bytes + size, 0, MORUS1280_BLOCK_SIZE - size);
crypto_morus1280_load_a(&m, tail.bytes);
crypto_morus1280_core(state, &m);
crypto_morus1280_store_a(tail.bytes, &m);
memset(tail.bytes + size, 0, MORUS1280_BLOCK_SIZE - size);
crypto_morus1280_load_a(&m, tail.bytes);
crypto_morus1280_update(state, &m);
memcpy(dst, tail.bytes, size);
}
}
static void crypto_morus1280_init(struct morus1280_state *state,
const struct morus1280_block *key,
const u8 *iv)
{
static const struct morus1280_block z = {};
union morus1280_block_in tmp;
unsigned int i;
memcpy(tmp.bytes, iv, MORUS_NONCE_SIZE);
memset(tmp.bytes + MORUS_NONCE_SIZE, 0,
MORUS1280_BLOCK_SIZE - MORUS_NONCE_SIZE);
crypto_morus1280_load(&state->s[0], tmp.bytes);
state->s[1] = *key;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
state->s[2].words[i] = U64_C(0xFFFFFFFFFFFFFFFF);
state->s[3] = z;
state->s[4] = crypto_morus1280_const[0];
for (i = 0; i < 16; i++)
crypto_morus1280_update(state, &z);
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
state->s[1].words[i] ^= key->words[i];
}
static void crypto_morus1280_process_ad(struct morus1280_state *state,
struct scatterlist *sg_src,
unsigned int assoclen)
{
struct scatter_walk walk;
struct morus1280_block m;
union morus1280_block_in buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= MORUS1280_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = MORUS1280_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
crypto_morus1280_load_a(&m, buf.bytes);
crypto_morus1280_update(state, &m);
pos = 0;
left -= fill;
src += fill;
}
crypto_morus1280_ad(state, src, left);
src += left & ~(MORUS1280_BLOCK_SIZE - 1);
left &= MORUS1280_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, MORUS1280_BLOCK_SIZE - pos);
crypto_morus1280_load_a(&m, buf.bytes);
crypto_morus1280_update(state, &m);
}
}
static void crypto_morus1280_process_crypt(struct morus1280_state *state,
struct aead_request *req,
const struct morus1280_ops *ops)
{
struct skcipher_walk walk;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) {
unsigned int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes = round_down(nbytes, walk.stride);
ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr,
nbytes);
skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
}
static void crypto_morus1280_final(struct morus1280_state *state,
struct morus1280_block *tag_xor,
u64 assoclen, u64 cryptlen)
{
struct morus1280_block tmp;
unsigned int i;
tmp.words[0] = assoclen * 8;
tmp.words[1] = cryptlen * 8;
tmp.words[2] = 0;
tmp.words[3] = 0;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
state->s[4].words[i] ^= state->s[0].words[i];
for (i = 0; i < 10; i++)
crypto_morus1280_update(state, &tmp);
crypto_morus1280_core(state, tag_xor);
}
static int crypto_morus1280_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct morus1280_ctx *ctx = crypto_aead_ctx(aead);
union morus1280_block_in tmp;
if (keylen == MORUS1280_BLOCK_SIZE)
crypto_morus1280_load(&ctx->key, key);
else if (keylen == MORUS1280_BLOCK_SIZE / 2) {
memcpy(tmp.bytes, key, keylen);
memcpy(tmp.bytes + keylen, key, keylen);
crypto_morus1280_load(&ctx->key, tmp.bytes);
} else {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
return 0;
}
static int crypto_morus1280_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
return (authsize <= MORUS_MAX_AUTH_SIZE) ? 0 : -EINVAL;
}
static void crypto_morus1280_crypt(struct aead_request *req,
struct morus1280_block *tag_xor,
unsigned int cryptlen,
const struct morus1280_ops *ops)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus1280_ctx *ctx = crypto_aead_ctx(tfm);
struct morus1280_state state;
crypto_morus1280_init(&state, &ctx->key, req->iv);
crypto_morus1280_process_ad(&state, req->src, req->assoclen);
crypto_morus1280_process_crypt(&state, req, ops);
crypto_morus1280_final(&state, tag_xor, req->assoclen, cryptlen);
}
static int crypto_morus1280_encrypt(struct aead_request *req)
{
static const struct morus1280_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_chunk = crypto_morus1280_encrypt_chunk,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus1280_block tag = {};
union morus1280_block_in tag_out;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_morus1280_crypt(req, &tag, cryptlen, &ops);
crypto_morus1280_store(tag_out.bytes, &tag);
scatterwalk_map_and_copy(tag_out.bytes, req->dst,
req->assoclen + cryptlen, authsize, 1);
return 0;
}
static int crypto_morus1280_decrypt(struct aead_request *req)
{
static const struct morus1280_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_chunk = crypto_morus1280_decrypt_chunk,
};
static const u8 zeros[MORUS1280_BLOCK_SIZE] = {};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
union morus1280_block_in tag_in;
struct morus1280_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag_in.bytes, req->src,
req->assoclen + cryptlen, authsize, 0);
crypto_morus1280_load(&tag, tag_in.bytes);
crypto_morus1280_crypt(req, &tag, cryptlen, &ops);
crypto_morus1280_store(tag_in.bytes, &tag);
return crypto_memneq(tag_in.bytes, zeros, authsize) ? -EBADMSG : 0;
}
static int crypto_morus1280_init_tfm(struct crypto_aead *tfm)
{
return 0;
}
static void crypto_morus1280_exit_tfm(struct crypto_aead *tfm)
{
}
static struct aead_alg crypto_morus1280_alg = {
.setkey = crypto_morus1280_setkey,
.setauthsize = crypto_morus1280_setauthsize,
.encrypt = crypto_morus1280_encrypt,
.decrypt = crypto_morus1280_decrypt,
.init = crypto_morus1280_init_tfm,
.exit = crypto_morus1280_exit_tfm,
.ivsize = MORUS_NONCE_SIZE,
.maxauthsize = MORUS_MAX_AUTH_SIZE,
.chunksize = MORUS1280_BLOCK_SIZE,
.base = {
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct morus1280_ctx),
.cra_alignmask = 0,
.cra_priority = 100,
.cra_name = "morus1280",
.cra_driver_name = "morus1280-generic",
.cra_module = THIS_MODULE,
}
};
static int __init crypto_morus1280_module_init(void)
{
return crypto_register_aead(&crypto_morus1280_alg);
}
static void __exit crypto_morus1280_module_exit(void)
{
crypto_unregister_aead(&crypto_morus1280_alg);
}
subsys_initcall(crypto_morus1280_module_init);
module_exit(crypto_morus1280_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-1280 AEAD algorithm");
MODULE_ALIAS_CRYPTO("morus1280");
MODULE_ALIAS_CRYPTO("morus1280-generic");

View File

@ -1,533 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* The MORUS-640 Authenticated-Encryption Algorithm
*
* Copyright (c) 2016-2018 Ondrej Mosnacek <omosnacek@gmail.com>
* Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
*/
#include <asm/unaligned.h>
#include <crypto/algapi.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/morus_common.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/scatterlist.h>
#define MORUS640_WORD_SIZE 4
#define MORUS640_BLOCK_SIZE (MORUS_BLOCK_WORDS * MORUS640_WORD_SIZE)
#define MORUS640_BLOCK_ALIGN (__alignof__(__le32))
#define MORUS640_ALIGNED(p) IS_ALIGNED((uintptr_t)p, MORUS640_BLOCK_ALIGN)
struct morus640_block {
u32 words[MORUS_BLOCK_WORDS];
};
union morus640_block_in {
__le32 words[MORUS_BLOCK_WORDS];
u8 bytes[MORUS640_BLOCK_SIZE];
};
struct morus640_state {
struct morus640_block s[MORUS_STATE_BLOCKS];
};
struct morus640_ctx {
struct morus640_block key;
};
struct morus640_ops {
int (*skcipher_walk_init)(struct skcipher_walk *walk,
struct aead_request *req, bool atomic);
void (*crypt_chunk)(struct morus640_state *state,
u8 *dst, const u8 *src, unsigned int size);
};
static const struct morus640_block crypto_morus640_const[2] = {
{ .words = {
U32_C(0x02010100),
U32_C(0x0d080503),
U32_C(0x59372215),
U32_C(0x6279e990),
} },
{ .words = {
U32_C(0x55183ddb),
U32_C(0xf12fc26d),
U32_C(0x42311120),
U32_C(0xdd28b573),
} },
};
static void crypto_morus640_round(struct morus640_block *b0,
struct morus640_block *b1,
struct morus640_block *b2,
struct morus640_block *b3,
struct morus640_block *b4,
const struct morus640_block *m,
unsigned int b, unsigned int w)
{
unsigned int i;
struct morus640_block tmp;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
b0->words[i] ^= b1->words[i] & b2->words[i];
b0->words[i] ^= b3->words[i];
b0->words[i] ^= m->words[i];
b0->words[i] = rol32(b0->words[i], b);
}
tmp = *b3;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
b3->words[(i + w) % MORUS_BLOCK_WORDS] = tmp.words[i];
}
static void crypto_morus640_update(struct morus640_state *state,
const struct morus640_block *m)
{
static const struct morus640_block z = {};
struct morus640_block *s = state->s;
crypto_morus640_round(&s[0], &s[1], &s[2], &s[3], &s[4], &z, 5, 1);
crypto_morus640_round(&s[1], &s[2], &s[3], &s[4], &s[0], m, 31, 2);
crypto_morus640_round(&s[2], &s[3], &s[4], &s[0], &s[1], m, 7, 3);
crypto_morus640_round(&s[3], &s[4], &s[0], &s[1], &s[2], m, 22, 2);
crypto_morus640_round(&s[4], &s[0], &s[1], &s[2], &s[3], m, 13, 1);
}
static void crypto_morus640_load_a(struct morus640_block *dst, const u8 *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
dst->words[i] = le32_to_cpu(*(const __le32 *)src);
src += MORUS640_WORD_SIZE;
}
}
static void crypto_morus640_load_u(struct morus640_block *dst, const u8 *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
dst->words[i] = get_unaligned_le32(src);
src += MORUS640_WORD_SIZE;
}
}
static void crypto_morus640_load(struct morus640_block *dst, const u8 *src)
{
if (MORUS640_ALIGNED(src))
crypto_morus640_load_a(dst, src);
else
crypto_morus640_load_u(dst, src);
}
static void crypto_morus640_store_a(u8 *dst, const struct morus640_block *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
*(__le32 *)dst = cpu_to_le32(src->words[i]);
dst += MORUS640_WORD_SIZE;
}
}
static void crypto_morus640_store_u(u8 *dst, const struct morus640_block *src)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
put_unaligned_le32(src->words[i], dst);
dst += MORUS640_WORD_SIZE;
}
}
static void crypto_morus640_store(u8 *dst, const struct morus640_block *src)
{
if (MORUS640_ALIGNED(dst))
crypto_morus640_store_a(dst, src);
else
crypto_morus640_store_u(dst, src);
}
static void crypto_morus640_ad(struct morus640_state *state, const u8 *src,
unsigned int size)
{
struct morus640_block m;
if (MORUS640_ALIGNED(src)) {
while (size >= MORUS640_BLOCK_SIZE) {
crypto_morus640_load_a(&m, src);
crypto_morus640_update(state, &m);
size -= MORUS640_BLOCK_SIZE;
src += MORUS640_BLOCK_SIZE;
}
} else {
while (size >= MORUS640_BLOCK_SIZE) {
crypto_morus640_load_u(&m, src);
crypto_morus640_update(state, &m);
size -= MORUS640_BLOCK_SIZE;
src += MORUS640_BLOCK_SIZE;
}
}
}
static void crypto_morus640_core(const struct morus640_state *state,
struct morus640_block *blk)
{
unsigned int i;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
blk->words[(i + 3) % MORUS_BLOCK_WORDS] ^= state->s[1].words[i];
for (i = 0; i < MORUS_BLOCK_WORDS; i++) {
blk->words[i] ^= state->s[0].words[i];
blk->words[i] ^= state->s[2].words[i] & state->s[3].words[i];
}
}
static void crypto_morus640_encrypt_chunk(struct morus640_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
struct morus640_block c, m;
if (MORUS640_ALIGNED(src) && MORUS640_ALIGNED(dst)) {
while (size >= MORUS640_BLOCK_SIZE) {
crypto_morus640_load_a(&m, src);
c = m;
crypto_morus640_core(state, &c);
crypto_morus640_store_a(dst, &c);
crypto_morus640_update(state, &m);
src += MORUS640_BLOCK_SIZE;
dst += MORUS640_BLOCK_SIZE;
size -= MORUS640_BLOCK_SIZE;
}
} else {
while (size >= MORUS640_BLOCK_SIZE) {
crypto_morus640_load_u(&m, src);
c = m;
crypto_morus640_core(state, &c);
crypto_morus640_store_u(dst, &c);
crypto_morus640_update(state, &m);
src += MORUS640_BLOCK_SIZE;
dst += MORUS640_BLOCK_SIZE;
size -= MORUS640_BLOCK_SIZE;
}
}
if (size > 0) {
union morus640_block_in tail;
memcpy(tail.bytes, src, size);
memset(tail.bytes + size, 0, MORUS640_BLOCK_SIZE - size);
crypto_morus640_load_a(&m, tail.bytes);
c = m;
crypto_morus640_core(state, &c);
crypto_morus640_store_a(tail.bytes, &c);
crypto_morus640_update(state, &m);
memcpy(dst, tail.bytes, size);
}
}
static void crypto_morus640_decrypt_chunk(struct morus640_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
struct morus640_block m;
if (MORUS640_ALIGNED(src) && MORUS640_ALIGNED(dst)) {
while (size >= MORUS640_BLOCK_SIZE) {
crypto_morus640_load_a(&m, src);
crypto_morus640_core(state, &m);
crypto_morus640_store_a(dst, &m);
crypto_morus640_update(state, &m);
src += MORUS640_BLOCK_SIZE;
dst += MORUS640_BLOCK_SIZE;
size -= MORUS640_BLOCK_SIZE;
}
} else {
while (size >= MORUS640_BLOCK_SIZE) {
crypto_morus640_load_u(&m, src);
crypto_morus640_core(state, &m);
crypto_morus640_store_u(dst, &m);
crypto_morus640_update(state, &m);
src += MORUS640_BLOCK_SIZE;
dst += MORUS640_BLOCK_SIZE;
size -= MORUS640_BLOCK_SIZE;
}
}
if (size > 0) {
union morus640_block_in tail;
memcpy(tail.bytes, src, size);
memset(tail.bytes + size, 0, MORUS640_BLOCK_SIZE - size);
crypto_morus640_load_a(&m, tail.bytes);
crypto_morus640_core(state, &m);
crypto_morus640_store_a(tail.bytes, &m);
memset(tail.bytes + size, 0, MORUS640_BLOCK_SIZE - size);
crypto_morus640_load_a(&m, tail.bytes);
crypto_morus640_update(state, &m);
memcpy(dst, tail.bytes, size);
}
}
static void crypto_morus640_init(struct morus640_state *state,
const struct morus640_block *key,
const u8 *iv)
{
static const struct morus640_block z = {};
unsigned int i;
crypto_morus640_load(&state->s[0], iv);
state->s[1] = *key;
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
state->s[2].words[i] = U32_C(0xFFFFFFFF);
state->s[3] = crypto_morus640_const[0];
state->s[4] = crypto_morus640_const[1];
for (i = 0; i < 16; i++)
crypto_morus640_update(state, &z);
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
state->s[1].words[i] ^= key->words[i];
}
static void crypto_morus640_process_ad(struct morus640_state *state,
struct scatterlist *sg_src,
unsigned int assoclen)
{
struct scatter_walk walk;
struct morus640_block m;
union morus640_block_in buf;
unsigned int pos = 0;
scatterwalk_start(&walk, sg_src);
while (assoclen != 0) {
unsigned int size = scatterwalk_clamp(&walk, assoclen);
unsigned int left = size;
void *mapped = scatterwalk_map(&walk);
const u8 *src = (const u8 *)mapped;
if (pos + size >= MORUS640_BLOCK_SIZE) {
if (pos > 0) {
unsigned int fill = MORUS640_BLOCK_SIZE - pos;
memcpy(buf.bytes + pos, src, fill);
crypto_morus640_load_a(&m, buf.bytes);
crypto_morus640_update(state, &m);
pos = 0;
left -= fill;
src += fill;
}
crypto_morus640_ad(state, src, left);
src += left & ~(MORUS640_BLOCK_SIZE - 1);
left &= MORUS640_BLOCK_SIZE - 1;
}
memcpy(buf.bytes + pos, src, left);
pos += left;
assoclen -= size;
scatterwalk_unmap(mapped);
scatterwalk_advance(&walk, size);
scatterwalk_done(&walk, 0, assoclen);
}
if (pos > 0) {
memset(buf.bytes + pos, 0, MORUS640_BLOCK_SIZE - pos);
crypto_morus640_load_a(&m, buf.bytes);
crypto_morus640_update(state, &m);
}
}
static void crypto_morus640_process_crypt(struct morus640_state *state,
struct aead_request *req,
const struct morus640_ops *ops)
{
struct skcipher_walk walk;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) {
unsigned int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes = round_down(nbytes, walk.stride);
ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr,
nbytes);
skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
}
static void crypto_morus640_final(struct morus640_state *state,
struct morus640_block *tag_xor,
u64 assoclen, u64 cryptlen)
{
struct morus640_block tmp;
unsigned int i;
tmp.words[0] = lower_32_bits(assoclen * 8);
tmp.words[1] = upper_32_bits(assoclen * 8);
tmp.words[2] = lower_32_bits(cryptlen * 8);
tmp.words[3] = upper_32_bits(cryptlen * 8);
for (i = 0; i < MORUS_BLOCK_WORDS; i++)
state->s[4].words[i] ^= state->s[0].words[i];
for (i = 0; i < 10; i++)
crypto_morus640_update(state, &tmp);
crypto_morus640_core(state, tag_xor);
}
static int crypto_morus640_setkey(struct crypto_aead *aead, const u8 *key,
unsigned int keylen)
{
struct morus640_ctx *ctx = crypto_aead_ctx(aead);
if (keylen != MORUS640_BLOCK_SIZE) {
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
crypto_morus640_load(&ctx->key, key);
return 0;
}
static int crypto_morus640_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
return (authsize <= MORUS_MAX_AUTH_SIZE) ? 0 : -EINVAL;
}
static void crypto_morus640_crypt(struct aead_request *req,
struct morus640_block *tag_xor,
unsigned int cryptlen,
const struct morus640_ops *ops)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus640_ctx *ctx = crypto_aead_ctx(tfm);
struct morus640_state state;
crypto_morus640_init(&state, &ctx->key, req->iv);
crypto_morus640_process_ad(&state, req->src, req->assoclen);
crypto_morus640_process_crypt(&state, req, ops);
crypto_morus640_final(&state, tag_xor, req->assoclen, cryptlen);
}
static int crypto_morus640_encrypt(struct aead_request *req)
{
static const struct morus640_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_encrypt,
.crypt_chunk = crypto_morus640_encrypt_chunk,
};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct morus640_block tag = {};
union morus640_block_in tag_out;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen;
crypto_morus640_crypt(req, &tag, cryptlen, &ops);
crypto_morus640_store(tag_out.bytes, &tag);
scatterwalk_map_and_copy(tag_out.bytes, req->dst,
req->assoclen + cryptlen, authsize, 1);
return 0;
}
static int crypto_morus640_decrypt(struct aead_request *req)
{
static const struct morus640_ops ops = {
.skcipher_walk_init = skcipher_walk_aead_decrypt,
.crypt_chunk = crypto_morus640_decrypt_chunk,
};
static const u8 zeros[MORUS640_BLOCK_SIZE] = {};
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
union morus640_block_in tag_in;
struct morus640_block tag;
unsigned int authsize = crypto_aead_authsize(tfm);
unsigned int cryptlen = req->cryptlen - authsize;
scatterwalk_map_and_copy(tag_in.bytes, req->src,
req->assoclen + cryptlen, authsize, 0);
crypto_morus640_load(&tag, tag_in.bytes);
crypto_morus640_crypt(req, &tag, cryptlen, &ops);
crypto_morus640_store(tag_in.bytes, &tag);
return crypto_memneq(tag_in.bytes, zeros, authsize) ? -EBADMSG : 0;
}
static int crypto_morus640_init_tfm(struct crypto_aead *tfm)
{
return 0;
}
static void crypto_morus640_exit_tfm(struct crypto_aead *tfm)
{
}
static struct aead_alg crypto_morus640_alg = {
.setkey = crypto_morus640_setkey,
.setauthsize = crypto_morus640_setauthsize,
.encrypt = crypto_morus640_encrypt,
.decrypt = crypto_morus640_decrypt,
.init = crypto_morus640_init_tfm,
.exit = crypto_morus640_exit_tfm,
.ivsize = MORUS_NONCE_SIZE,
.maxauthsize = MORUS_MAX_AUTH_SIZE,
.chunksize = MORUS640_BLOCK_SIZE,
.base = {
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct morus640_ctx),
.cra_alignmask = 0,
.cra_priority = 100,
.cra_name = "morus640",
.cra_driver_name = "morus640-generic",
.cra_module = THIS_MODULE,
}
};
static int __init crypto_morus640_module_init(void)
{
return crypto_register_aead(&crypto_morus640_alg);
}
static void __exit crypto_morus640_module_exit(void)
{
crypto_unregister_aead(&crypto_morus640_alg);
}
subsys_initcall(crypto_morus640_module_init);
module_exit(crypto_morus640_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ondrej Mosnacek <omosnacek@gmail.com>");
MODULE_DESCRIPTION("MORUS-640 AEAD algorithm");
MODULE_ALIAS_CRYPTO("morus640");
MODULE_ALIAS_CRYPTO("morus640-generic");

View File

@ -18,34 +18,8 @@
#include <linux/cpu.h>
#include <crypto/pcrypt.h>
struct padata_pcrypt {
struct padata_instance *pinst;
struct workqueue_struct *wq;
/*
* Cpumask for callback CPUs. It should be
* equal to serial cpumask of corresponding padata instance,
* so it is updated when padata notifies us about serial
* cpumask change.
*
* cb_cpumask is protected by RCU. This fact prevents us from
* using cpumask_var_t directly because the actual type of
* cpumsak_var_t depends on kernel configuration(particularly on
* CONFIG_CPUMASK_OFFSTACK macro). Depending on the configuration
* cpumask_var_t may be either a pointer to the struct cpumask
* or a variable allocated on the stack. Thus we can not safely use
* cpumask_var_t with RCU operations such as rcu_assign_pointer or
* rcu_dereference. So cpumask_var_t is wrapped with struct
* pcrypt_cpumask which makes possible to use it with RCU.
*/
struct pcrypt_cpumask {
cpumask_var_t mask;
} *cb_cpumask;
struct notifier_block nblock;
};
static struct padata_pcrypt pencrypt;
static struct padata_pcrypt pdecrypt;
static struct padata_instance *pencrypt;
static struct padata_instance *pdecrypt;
static struct kset *pcrypt_kset;
struct pcrypt_instance_ctx {
@ -58,35 +32,6 @@ struct pcrypt_aead_ctx {
unsigned int cb_cpu;
};
static int pcrypt_do_parallel(struct padata_priv *padata, unsigned int *cb_cpu,
struct padata_pcrypt *pcrypt)
{
unsigned int cpu_index, cpu, i;
struct pcrypt_cpumask *cpumask;
cpu = *cb_cpu;
rcu_read_lock_bh();
cpumask = rcu_dereference_bh(pcrypt->cb_cpumask);
if (cpumask_test_cpu(cpu, cpumask->mask))
goto out;
if (!cpumask_weight(cpumask->mask))
goto out;
cpu_index = cpu % cpumask_weight(cpumask->mask);
cpu = cpumask_first(cpumask->mask);
for (i = 0; i < cpu_index; i++)
cpu = cpumask_next(cpu, cpumask->mask);
*cb_cpu = cpu;
out:
rcu_read_unlock_bh();
return padata_do_parallel(pcrypt->pinst, padata, cpu);
}
static int pcrypt_aead_setkey(struct crypto_aead *parent,
const u8 *key, unsigned int keylen)
{
@ -158,7 +103,7 @@ static int pcrypt_aead_encrypt(struct aead_request *req)
req->cryptlen, req->iv);
aead_request_set_ad(creq, req->assoclen);
err = pcrypt_do_parallel(padata, &ctx->cb_cpu, &pencrypt);
err = padata_do_parallel(pencrypt, padata, &ctx->cb_cpu);
if (!err)
return -EINPROGRESS;
@ -200,7 +145,7 @@ static int pcrypt_aead_decrypt(struct aead_request *req)
req->cryptlen, req->iv);
aead_request_set_ad(creq, req->assoclen);
err = pcrypt_do_parallel(padata, &ctx->cb_cpu, &pdecrypt);
err = padata_do_parallel(pdecrypt, padata, &ctx->cb_cpu);
if (!err)
return -EINPROGRESS;
@ -347,36 +292,6 @@ static int pcrypt_create(struct crypto_template *tmpl, struct rtattr **tb)
return -EINVAL;
}
static int pcrypt_cpumask_change_notify(struct notifier_block *self,
unsigned long val, void *data)
{
struct padata_pcrypt *pcrypt;
struct pcrypt_cpumask *new_mask, *old_mask;
struct padata_cpumask *cpumask = (struct padata_cpumask *)data;
if (!(val & PADATA_CPU_SERIAL))
return 0;
pcrypt = container_of(self, struct padata_pcrypt, nblock);
new_mask = kmalloc(sizeof(*new_mask), GFP_KERNEL);
if (!new_mask)
return -ENOMEM;
if (!alloc_cpumask_var(&new_mask->mask, GFP_KERNEL)) {
kfree(new_mask);
return -ENOMEM;
}
old_mask = pcrypt->cb_cpumask;
cpumask_copy(new_mask->mask, cpumask->cbcpu);
rcu_assign_pointer(pcrypt->cb_cpumask, new_mask);
synchronize_rcu();
free_cpumask_var(old_mask->mask);
kfree(old_mask);
return 0;
}
static int pcrypt_sysfs_add(struct padata_instance *pinst, const char *name)
{
int ret;
@ -389,71 +304,25 @@ static int pcrypt_sysfs_add(struct padata_instance *pinst, const char *name)
return ret;
}
static int pcrypt_init_padata(struct padata_pcrypt *pcrypt,
const char *name)
static int pcrypt_init_padata(struct padata_instance **pinst, const char *name)
{
int ret = -ENOMEM;
struct pcrypt_cpumask *mask;
get_online_cpus();
*pinst = padata_alloc_possible(name);
if (!*pinst)
return ret;
pcrypt->wq = alloc_workqueue("%s", WQ_MEM_RECLAIM | WQ_CPU_INTENSIVE,
1, name);
if (!pcrypt->wq)
goto err;
pcrypt->pinst = padata_alloc_possible(pcrypt->wq);
if (!pcrypt->pinst)
goto err_destroy_workqueue;
mask = kmalloc(sizeof(*mask), GFP_KERNEL);
if (!mask)
goto err_free_padata;
if (!alloc_cpumask_var(&mask->mask, GFP_KERNEL)) {
kfree(mask);
goto err_free_padata;
}
cpumask_and(mask->mask, cpu_possible_mask, cpu_online_mask);
rcu_assign_pointer(pcrypt->cb_cpumask, mask);
pcrypt->nblock.notifier_call = pcrypt_cpumask_change_notify;
ret = padata_register_cpumask_notifier(pcrypt->pinst, &pcrypt->nblock);
ret = pcrypt_sysfs_add(*pinst, name);
if (ret)
goto err_free_cpumask;
ret = pcrypt_sysfs_add(pcrypt->pinst, name);
if (ret)
goto err_unregister_notifier;
put_online_cpus();
return ret;
err_unregister_notifier:
padata_unregister_cpumask_notifier(pcrypt->pinst, &pcrypt->nblock);
err_free_cpumask:
free_cpumask_var(mask->mask);
kfree(mask);
err_free_padata:
padata_free(pcrypt->pinst);
err_destroy_workqueue:
destroy_workqueue(pcrypt->wq);
err:
put_online_cpus();
padata_free(*pinst);
return ret;
}
static void pcrypt_fini_padata(struct padata_pcrypt *pcrypt)
static void pcrypt_fini_padata(struct padata_instance *pinst)
{
free_cpumask_var(pcrypt->cb_cpumask->mask);
kfree(pcrypt->cb_cpumask);
padata_stop(pcrypt->pinst);
padata_unregister_cpumask_notifier(pcrypt->pinst, &pcrypt->nblock);
destroy_workqueue(pcrypt->wq);
padata_free(pcrypt->pinst);
padata_stop(pinst);
padata_free(pinst);
}
static struct crypto_template pcrypt_tmpl = {
@ -478,13 +347,13 @@ static int __init pcrypt_init(void)
if (err)
goto err_deinit_pencrypt;
padata_start(pencrypt.pinst);
padata_start(pdecrypt.pinst);
padata_start(pencrypt);
padata_start(pdecrypt);
return crypto_register_template(&pcrypt_tmpl);
err_deinit_pencrypt:
pcrypt_fini_padata(&pencrypt);
pcrypt_fini_padata(pencrypt);
err_unreg_kset:
kset_unregister(pcrypt_kset);
err:
@ -493,8 +362,8 @@ err:
static void __exit pcrypt_exit(void)
{
pcrypt_fini_padata(&pencrypt);
pcrypt_fini_padata(&pdecrypt);
pcrypt_fini_padata(pencrypt);
pcrypt_fini_padata(pdecrypt);
kset_unregister(pcrypt_kset);
crypto_unregister_template(&pcrypt_tmpl);

View File

@ -1,11 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Cryptographic API.
*
* SHA-256, as specified in
* http://csrc.nist.gov/groups/STM/cavp/documents/shs/sha256-384-512.pdf
*
* SHA-256 code by Jean-Luc Cooke <jlcooke@certainkey.com>.
* Crypto API wrapper for the generic SHA256 code from lib/crypto/sha256.c
*
* Copyright (c) Jean-Luc Cooke <jlcooke@certainkey.com>
* Copyright (c) Andrew McDonald <andrew@mcdonald.org.uk>
@ -38,229 +33,44 @@ const u8 sha256_zero_message_hash[SHA256_DIGEST_SIZE] = {
};
EXPORT_SYMBOL_GPL(sha256_zero_message_hash);
static inline u32 Ch(u32 x, u32 y, u32 z)
static int crypto_sha256_init(struct shash_desc *desc)
{
return z ^ (x & (y ^ z));
return sha256_init(shash_desc_ctx(desc));
}
static inline u32 Maj(u32 x, u32 y, u32 z)
static int crypto_sha224_init(struct shash_desc *desc)
{
return (x & y) | (z & (x | y));
}
#define e0(x) (ror32(x, 2) ^ ror32(x,13) ^ ror32(x,22))
#define e1(x) (ror32(x, 6) ^ ror32(x,11) ^ ror32(x,25))
#define s0(x) (ror32(x, 7) ^ ror32(x,18) ^ (x >> 3))
#define s1(x) (ror32(x,17) ^ ror32(x,19) ^ (x >> 10))
static inline void LOAD_OP(int I, u32 *W, const u8 *input)
{
W[I] = get_unaligned_be32((__u32 *)input + I);
}
static inline void BLEND_OP(int I, u32 *W)
{
W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16];
}
static void sha256_transform(u32 *state, const u8 *input)
{
u32 a, b, c, d, e, f, g, h, t1, t2;
u32 W[64];
int i;
/* load the input */
for (i = 0; i < 16; i++)
LOAD_OP(i, W, input);
/* now blend */
for (i = 16; i < 64; i++)
BLEND_OP(i, W);
/* load the state into our registers */
a=state[0]; b=state[1]; c=state[2]; d=state[3];
e=state[4]; f=state[5]; g=state[6]; h=state[7];
/* now iterate */
t1 = h + e1(e) + Ch(e,f,g) + 0x428a2f98 + W[ 0];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0x71374491 + W[ 1];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0xb5c0fbcf + W[ 2];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0xe9b5dba5 + W[ 3];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0x3956c25b + W[ 4];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0x59f111f1 + W[ 5];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0x923f82a4 + W[ 6];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0xab1c5ed5 + W[ 7];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0xd807aa98 + W[ 8];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0x12835b01 + W[ 9];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0x243185be + W[10];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0x550c7dc3 + W[11];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0x72be5d74 + W[12];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0x80deb1fe + W[13];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0x9bdc06a7 + W[14];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0xc19bf174 + W[15];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0xe49b69c1 + W[16];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0xefbe4786 + W[17];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0x0fc19dc6 + W[18];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0x240ca1cc + W[19];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0x2de92c6f + W[20];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0x4a7484aa + W[21];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0x5cb0a9dc + W[22];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0x76f988da + W[23];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0x983e5152 + W[24];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0xa831c66d + W[25];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0xb00327c8 + W[26];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0xbf597fc7 + W[27];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0xc6e00bf3 + W[28];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0xd5a79147 + W[29];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0x06ca6351 + W[30];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0x14292967 + W[31];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0x27b70a85 + W[32];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0x2e1b2138 + W[33];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0x4d2c6dfc + W[34];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0x53380d13 + W[35];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0x650a7354 + W[36];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0x766a0abb + W[37];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0x81c2c92e + W[38];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0x92722c85 + W[39];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0xa2bfe8a1 + W[40];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0xa81a664b + W[41];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0xc24b8b70 + W[42];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0xc76c51a3 + W[43];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0xd192e819 + W[44];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0xd6990624 + W[45];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0xf40e3585 + W[46];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0x106aa070 + W[47];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0x19a4c116 + W[48];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0x1e376c08 + W[49];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0x2748774c + W[50];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0x34b0bcb5 + W[51];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0x391c0cb3 + W[52];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0x4ed8aa4a + W[53];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0x5b9cca4f + W[54];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0x682e6ff3 + W[55];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
t1 = h + e1(e) + Ch(e,f,g) + 0x748f82ee + W[56];
t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
t1 = g + e1(d) + Ch(d,e,f) + 0x78a5636f + W[57];
t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
t1 = f + e1(c) + Ch(c,d,e) + 0x84c87814 + W[58];
t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
t1 = e + e1(b) + Ch(b,c,d) + 0x8cc70208 + W[59];
t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
t1 = d + e1(a) + Ch(a,b,c) + 0x90befffa + W[60];
t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
t1 = c + e1(h) + Ch(h,a,b) + 0xa4506ceb + W[61];
t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
t1 = b + e1(g) + Ch(g,h,a) + 0xbef9a3f7 + W[62];
t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
t1 = a + e1(f) + Ch(f,g,h) + 0xc67178f2 + W[63];
t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
state[0] += a; state[1] += b; state[2] += c; state[3] += d;
state[4] += e; state[5] += f; state[6] += g; state[7] += h;
/* clear any sensitive info... */
a = b = c = d = e = f = g = h = t1 = t2 = 0;
memzero_explicit(W, 64 * sizeof(u32));
}
static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
int blocks)
{
while (blocks--) {
sha256_transform(sst->state, src);
src += SHA256_BLOCK_SIZE;
}
return sha224_init(shash_desc_ctx(desc));
}
int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
return sha256_update(shash_desc_ctx(desc), data, len);
}
EXPORT_SYMBOL(crypto_sha256_update);
static int sha256_final(struct shash_desc *desc, u8 *out)
static int crypto_sha256_final(struct shash_desc *desc, u8 *out)
{
sha256_base_do_finalize(desc, sha256_generic_block_fn);
return sha256_base_finish(desc, out);
if (crypto_shash_digestsize(desc->tfm) == SHA224_DIGEST_SIZE)
return sha224_final(shash_desc_ctx(desc), out);
else
return sha256_final(shash_desc_ctx(desc), out);
}
int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash)
{
sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
return sha256_final(desc, hash);
sha256_update(shash_desc_ctx(desc), data, len);
return crypto_sha256_final(desc, hash);
}
EXPORT_SYMBOL(crypto_sha256_finup);
static struct shash_alg sha256_algs[2] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.init = crypto_sha256_init,
.update = crypto_sha256_update,
.final = sha256_final,
.final = crypto_sha256_final,
.finup = crypto_sha256_finup,
.descsize = sizeof(struct sha256_state),
.base = {
@ -272,9 +82,9 @@ static struct shash_alg sha256_algs[2] = { {
}
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.init = crypto_sha224_init,
.update = crypto_sha256_update,
.final = sha256_final,
.final = crypto_sha256_final,
.finup = crypto_sha256_finup,
.descsize = sizeof(struct sha256_state),
.base = {

View File

@ -90,7 +90,7 @@ static inline u8 *skcipher_get_spot(u8 *start, unsigned int len)
return max(start, end_page);
}
static void skcipher_done_slow(struct skcipher_walk *walk, unsigned int bsize)
static int skcipher_done_slow(struct skcipher_walk *walk, unsigned int bsize)
{
u8 *addr;
@ -98,19 +98,21 @@ static void skcipher_done_slow(struct skcipher_walk *walk, unsigned int bsize)
addr = skcipher_get_spot(addr, bsize);
scatterwalk_copychunks(addr, &walk->out, bsize,
(walk->flags & SKCIPHER_WALK_PHYS) ? 2 : 1);
return 0;
}
int skcipher_walk_done(struct skcipher_walk *walk, int err)
{
unsigned int n; /* bytes processed */
bool more;
unsigned int n = walk->nbytes;
unsigned int nbytes = 0;
if (unlikely(err < 0))
if (!n)
goto finish;
n = walk->nbytes - err;
walk->total -= n;
more = (walk->total != 0);
if (likely(err >= 0)) {
n -= err;
nbytes = walk->total - n;
}
if (likely(!(walk->flags & (SKCIPHER_WALK_PHYS |
SKCIPHER_WALK_SLOW |
@ -126,7 +128,7 @@ unmap_src:
memcpy(walk->dst.virt.addr, walk->page, n);
skcipher_unmap_dst(walk);
} else if (unlikely(walk->flags & SKCIPHER_WALK_SLOW)) {
if (err) {
if (err > 0) {
/*
* Didn't process all bytes. Either the algorithm is
* broken, or this was the last step and it turned out
@ -134,27 +136,29 @@ unmap_src:
* the algorithm requires it.
*/
err = -EINVAL;
goto finish;
}
skcipher_done_slow(walk, n);
goto already_advanced;
nbytes = 0;
} else
n = skcipher_done_slow(walk, n);
}
if (err > 0)
err = 0;
walk->total = nbytes;
walk->nbytes = 0;
scatterwalk_advance(&walk->in, n);
scatterwalk_advance(&walk->out, n);
already_advanced:
scatterwalk_done(&walk->in, 0, more);
scatterwalk_done(&walk->out, 1, more);
scatterwalk_done(&walk->in, 0, nbytes);
scatterwalk_done(&walk->out, 1, nbytes);
if (more) {
if (nbytes) {
crypto_yield(walk->flags & SKCIPHER_WALK_SLEEP ?
CRYPTO_TFM_REQ_MAY_SLEEP : 0);
return skcipher_walk_next(walk);
}
err = 0;
finish:
walk->nbytes = 0;
finish:
/* Short-circuit for the common/fast path. */
if (!((unsigned long)walk->buffer | (unsigned long)walk->page))
goto out;

View File

@ -148,52 +148,6 @@ static const struct streebog_uint512 C[12] = {
} }
};
static const u8 Tau[64] = {
0, 8, 16, 24, 32, 40, 48, 56,
1, 9, 17, 25, 33, 41, 49, 57,
2, 10, 18, 26, 34, 42, 50, 58,
3, 11, 19, 27, 35, 43, 51, 59,
4, 12, 20, 28, 36, 44, 52, 60,
5, 13, 21, 29, 37, 45, 53, 61,
6, 14, 22, 30, 38, 46, 54, 62,
7, 15, 23, 31, 39, 47, 55, 63
};
static const u8 Pi[256] = {
252, 238, 221, 17, 207, 110, 49, 22,
251, 196, 250, 218, 35, 197, 4, 77,
233, 119, 240, 219, 147, 46, 153, 186,
23, 54, 241, 187, 20, 205, 95, 193,
249, 24, 101, 90, 226, 92, 239, 33,
129, 28, 60, 66, 139, 1, 142, 79,
5, 132, 2, 174, 227, 106, 143, 160,
6, 11, 237, 152, 127, 212, 211, 31,
235, 52, 44, 81, 234, 200, 72, 171,
242, 42, 104, 162, 253, 58, 206, 204,
181, 112, 14, 86, 8, 12, 118, 18,
191, 114, 19, 71, 156, 183, 93, 135,
21, 161, 150, 41, 16, 123, 154, 199,
243, 145, 120, 111, 157, 158, 178, 177,
50, 117, 25, 61, 255, 53, 138, 126,
109, 84, 198, 128, 195, 189, 13, 87,
223, 245, 36, 169, 62, 168, 67, 201,
215, 121, 214, 246, 124, 34, 185, 3,
224, 15, 236, 222, 122, 148, 176, 188,
220, 232, 40, 80, 78, 51, 10, 74,
167, 151, 96, 115, 30, 0, 98, 68,
26, 184, 56, 130, 100, 159, 38, 65,
173, 69, 70, 146, 39, 94, 85, 47,
140, 163, 165, 125, 105, 213, 149, 59,
7, 88, 179, 64, 134, 172, 29, 247,
48, 55, 107, 228, 136, 217, 231, 137,
225, 27, 131, 73, 76, 63, 248, 254,
141, 83, 170, 144, 202, 216, 133, 97,
32, 113, 103, 164, 45, 43, 9, 91,
203, 155, 37, 208, 190, 229, 108, 82,
89, 166, 116, 210, 230, 244, 180, 192,
209, 102, 175, 194, 57, 75, 99, 182
};
static const unsigned long long Ax[8][256] = {
{
0xd01f715b5c7ef8e6ULL, 0x16fa240980778325ULL, 0xa8a42e857ee049c8ULL,

View File

@ -2327,6 +2327,22 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
0, speed_template_32);
break;
case 220:
test_acipher_speed("essiv(cbc(aes),sha256)",
ENCRYPT, sec, NULL, 0,
speed_template_16_24_32);
test_acipher_speed("essiv(cbc(aes),sha256)",
DECRYPT, sec, NULL, 0,
speed_template_16_24_32);
break;
case 221:
test_aead_speed("aegis128", ENCRYPT, sec,
NULL, 0, 16, 8, speed_template_16);
test_aead_speed("aegis128", DECRYPT, sec,
NULL, 0, 16, 8, speed_template_16);
break;
case 300:
if (alg) {
test_hash_speed(alg, sec, generic_hash_speed_template);

View File

@ -3886,18 +3886,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.aead = __VECS(aegis128_tv_template)
}
}, {
.alg = "aegis128l",
.test = alg_test_aead,
.suite = {
.aead = __VECS(aegis128l_tv_template)
}
}, {
.alg = "aegis256",
.test = alg_test_aead,
.suite = {
.aead = __VECS(aegis256_tv_template)
}
}, {
.alg = "ansi_cprng",
.test = alg_test_cprng,
@ -4556,6 +4544,20 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.akcipher = __VECS(ecrdsa_tv_template)
}
}, {
.alg = "essiv(authenc(hmac(sha256),cbc(aes)),sha256)",
.test = alg_test_aead,
.fips_allowed = 1,
.suite = {
.aead = __VECS(essiv_hmac_sha256_aes_cbc_tv_temp)
}
}, {
.alg = "essiv(cbc(aes),sha256)",
.test = alg_test_skcipher,
.fips_allowed = 1,
.suite = {
.cipher = __VECS(essiv_aes_cbc_tv_template)
}
}, {
.alg = "gcm(aes)",
.generic_driver = "gcm_base(ctr(aes-generic),ghash-generic)",
@ -4740,6 +4742,16 @@ static const struct alg_test_desc alg_test_descs[] = {
.decomp = __VECS(lzo_decomp_tv_template)
}
}
}, {
.alg = "lzo-rle",
.test = alg_test_comp,
.fips_allowed = 1,
.suite = {
.comp = {
.comp = __VECS(lzorle_comp_tv_template),
.decomp = __VECS(lzorle_decomp_tv_template)
}
}
}, {
.alg = "md4",
.test = alg_test_hash,
@ -4758,18 +4770,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.hash = __VECS(michael_mic_tv_template)
}
}, {
.alg = "morus1280",
.test = alg_test_aead,
.suite = {
.aead = __VECS(morus1280_tv_template)
}
}, {
.alg = "morus640",
.test = alg_test_aead,
.suite = {
.aead = __VECS(morus640_tv_template)
}
}, {
.alg = "nhpoly1305",
.test = alg_test_hash,
@ -5240,9 +5240,11 @@ int alg_test(const char *driver, const char *alg, u32 type, u32 mask)
type, mask);
test_done:
if (rc && (fips_enabled || panic_on_fail))
if (rc && (fips_enabled || panic_on_fail)) {
fips_fail_notify();
panic("alg: self-tests for %s (%s) failed in %s mode!\n",
driver, alg, fips_enabled ? "fips" : "panic_on_fail");
}
if (fips_enabled && !rc)
pr_info("alg: self-tests for %s (%s) passed\n", driver, alg);

File diff suppressed because it is too large Load Diff

View File

@ -1,8 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/* XTS: as defined in IEEE1619/D16
* http://grouper.ieee.org/groups/1619/email/pdf00086.pdf
* (sector sizes which are not a multiple of 16 bytes are,
* however currently unsupported)
*
* Copyright (c) 2007 Rik Snel <rsnel@cube.dyndns.org>
*
@ -34,6 +32,8 @@ struct xts_instance_ctx {
struct rctx {
le128 t;
struct scatterlist *tail;
struct scatterlist sg[2];
struct skcipher_request subreq;
};
@ -84,10 +84,11 @@ static int setkey(struct crypto_skcipher *parent, const u8 *key,
* mutliple calls to the 'ecb(..)' instance, which usually would be slower than
* just doing the gf128mul_x_ble() calls again.
*/
static int xor_tweak(struct skcipher_request *req, bool second_pass)
static int xor_tweak(struct skcipher_request *req, bool second_pass, bool enc)
{
struct rctx *rctx = skcipher_request_ctx(req);
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const bool cts = (req->cryptlen % XTS_BLOCK_SIZE);
const int bs = XTS_BLOCK_SIZE;
struct skcipher_walk w;
le128 t = rctx->t;
@ -109,6 +110,20 @@ static int xor_tweak(struct skcipher_request *req, bool second_pass)
wdst = w.dst.virt.addr;
do {
if (unlikely(cts) &&
w.total - w.nbytes + avail < 2 * XTS_BLOCK_SIZE) {
if (!enc) {
if (second_pass)
rctx->t = t;
gf128mul_x_ble(&t, &t);
}
le128_xor(wdst, &t, wsrc);
if (enc && second_pass)
gf128mul_x_ble(&rctx->t, &t);
skcipher_walk_done(&w, avail - bs);
return 0;
}
le128_xor(wdst++, &t, wsrc++);
gf128mul_x_ble(&t, &t);
} while ((avail -= bs) >= bs);
@ -119,17 +134,71 @@ static int xor_tweak(struct skcipher_request *req, bool second_pass)
return err;
}
static int xor_tweak_pre(struct skcipher_request *req)
static int xor_tweak_pre(struct skcipher_request *req, bool enc)
{
return xor_tweak(req, false);
return xor_tweak(req, false, enc);
}
static int xor_tweak_post(struct skcipher_request *req)
static int xor_tweak_post(struct skcipher_request *req, bool enc)
{
return xor_tweak(req, true);
return xor_tweak(req, true, enc);
}
static void crypt_done(struct crypto_async_request *areq, int err)
static void cts_done(struct crypto_async_request *areq, int err)
{
struct skcipher_request *req = areq->data;
le128 b;
if (!err) {
struct rctx *rctx = skcipher_request_ctx(req);
scatterwalk_map_and_copy(&b, rctx->tail, 0, XTS_BLOCK_SIZE, 0);
le128_xor(&b, &rctx->t, &b);
scatterwalk_map_and_copy(&b, rctx->tail, 0, XTS_BLOCK_SIZE, 1);
}
skcipher_request_complete(req, err);
}
static int cts_final(struct skcipher_request *req,
int (*crypt)(struct skcipher_request *req))
{
struct priv *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
int offset = req->cryptlen & ~(XTS_BLOCK_SIZE - 1);
struct rctx *rctx = skcipher_request_ctx(req);
struct skcipher_request *subreq = &rctx->subreq;
int tail = req->cryptlen % XTS_BLOCK_SIZE;
le128 b[2];
int err;
rctx->tail = scatterwalk_ffwd(rctx->sg, req->dst,
offset - XTS_BLOCK_SIZE);
scatterwalk_map_and_copy(b, rctx->tail, 0, XTS_BLOCK_SIZE, 0);
memcpy(b + 1, b, tail);
scatterwalk_map_and_copy(b, req->src, offset, tail, 0);
le128_xor(b, &rctx->t, b);
scatterwalk_map_and_copy(b, rctx->tail, 0, XTS_BLOCK_SIZE + tail, 1);
skcipher_request_set_tfm(subreq, ctx->child);
skcipher_request_set_callback(subreq, req->base.flags, cts_done, req);
skcipher_request_set_crypt(subreq, rctx->tail, rctx->tail,
XTS_BLOCK_SIZE, NULL);
err = crypt(subreq);
if (err)
return err;
scatterwalk_map_and_copy(b, rctx->tail, 0, XTS_BLOCK_SIZE, 0);
le128_xor(b, &rctx->t, b);
scatterwalk_map_and_copy(b, rctx->tail, 0, XTS_BLOCK_SIZE, 1);
return 0;
}
static void encrypt_done(struct crypto_async_request *areq, int err)
{
struct skcipher_request *req = areq->data;
@ -137,47 +206,90 @@ static void crypt_done(struct crypto_async_request *areq, int err)
struct rctx *rctx = skcipher_request_ctx(req);
rctx->subreq.base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
err = xor_tweak_post(req);
err = xor_tweak_post(req, true);
if (!err && unlikely(req->cryptlen % XTS_BLOCK_SIZE)) {
err = cts_final(req, crypto_skcipher_encrypt);
if (err == -EINPROGRESS)
return;
}
}
skcipher_request_complete(req, err);
}
static void init_crypt(struct skcipher_request *req)
static void decrypt_done(struct crypto_async_request *areq, int err)
{
struct skcipher_request *req = areq->data;
if (!err) {
struct rctx *rctx = skcipher_request_ctx(req);
rctx->subreq.base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
err = xor_tweak_post(req, false);
if (!err && unlikely(req->cryptlen % XTS_BLOCK_SIZE)) {
err = cts_final(req, crypto_skcipher_decrypt);
if (err == -EINPROGRESS)
return;
}
}
skcipher_request_complete(req, err);
}
static int init_crypt(struct skcipher_request *req, crypto_completion_t compl)
{
struct priv *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
struct rctx *rctx = skcipher_request_ctx(req);
struct skcipher_request *subreq = &rctx->subreq;
if (req->cryptlen < XTS_BLOCK_SIZE)
return -EINVAL;
skcipher_request_set_tfm(subreq, ctx->child);
skcipher_request_set_callback(subreq, req->base.flags, crypt_done, req);
skcipher_request_set_callback(subreq, req->base.flags, compl, req);
skcipher_request_set_crypt(subreq, req->dst, req->dst,
req->cryptlen, NULL);
req->cryptlen & ~(XTS_BLOCK_SIZE - 1), NULL);
/* calculate first value of T */
crypto_cipher_encrypt_one(ctx->tweak, (u8 *)&rctx->t, req->iv);
return 0;
}
static int encrypt(struct skcipher_request *req)
{
struct rctx *rctx = skcipher_request_ctx(req);
struct skcipher_request *subreq = &rctx->subreq;
int err;
init_crypt(req);
return xor_tweak_pre(req) ?:
crypto_skcipher_encrypt(subreq) ?:
xor_tweak_post(req);
err = init_crypt(req, encrypt_done) ?:
xor_tweak_pre(req, true) ?:
crypto_skcipher_encrypt(subreq) ?:
xor_tweak_post(req, true);
if (err || likely((req->cryptlen % XTS_BLOCK_SIZE) == 0))
return err;
return cts_final(req, crypto_skcipher_encrypt);
}
static int decrypt(struct skcipher_request *req)
{
struct rctx *rctx = skcipher_request_ctx(req);
struct skcipher_request *subreq = &rctx->subreq;
int err;
init_crypt(req);
return xor_tweak_pre(req) ?:
crypto_skcipher_decrypt(subreq) ?:
xor_tweak_post(req);
err = init_crypt(req, decrypt_done) ?:
xor_tweak_pre(req, false) ?:
crypto_skcipher_decrypt(subreq) ?:
xor_tweak_post(req, false);
if (err || likely((req->cryptlen % XTS_BLOCK_SIZE) == 0))
return err;
return cts_final(req, crypto_skcipher_decrypt);
}
static int init_tfm(struct crypto_skcipher *tfm)

Some files were not shown because too many files have changed in this diff Show More