Commit Graph

310708 Commits

Author SHA1 Message Date
Kim Phillips a0ca6ca022 crypto: caam - one tasklet per job ring
there is no noticeable benefit for multiple cores to process one
job ring's output ring: in fact, we can benefit from cache effects
of having the back-half stay on the core that receives a particular
ring's interrupts, and further relax general contention and the
locking involved with reading outring_used, since tasklets run
atomically.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:07 +08:00
Kim Phillips 14a8e29cc2 crypto: caam - consolidate memory barriers from job ring en/dequeue
Memory barriers are implied by the i/o register write implementation
(at least on Power).  So we can remove the redundant wmb() in
caam_jr_enqueue, and, in dequeue(), hoist the h/w done notification
write up to before we need to increment the head of the ring, and
save an smp_mb.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:07 +08:00
Kim Phillips a8ea07c21d crypto: caam - only query h/w in job ring dequeue path
Code was needlessly checking the s/w job ring when there
would be nothing to process if the h/w's output completion
ring were empty anyway.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:07 +08:00
Kim Phillips 4bba1e9f41 crypto: caam - use non-irq versions of spinlocks for job rings
The enqueue lock isn't used in any interrupt context, and
the dequeue lock isn't used in the h/w interrupt context,
only in bh context.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Kim Phillips 1a076689cd crypto: caam - disable IRQ coalescing by default
It has been observed that in zero-loss benchmarks, when a
slow traffic rate is being tested, the IRQ timer coalescing
parameter was set too high, and the ethernet controller
would start dropping packets because the job ring back half
wouldn't be executed in time before the ethernet controller
would fill its buffers, thereby significantly reducing the
zero-loss performance figures.

Empirical testing has shown that the best zero-loss performance
is achieved when IRQ coalescing is set to minimum values and/or
turned off, since apparently the job ring driver already implements
an adequately-performing general-purpose IRQ mitigation strategy
in software.

Whilst we could go with minimal count (2-8) and timing settings
(192-256), we prefer just turning h/w coalescing altogether off
to minimize setkey latency (due to split key generation), and
for consistent cross-SoC performance (the SEC vs. core clock
ratio changes).

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Kim Phillips 281922a1d4 crypto: caam - add support for SEC v5.x RNG4
The SEC v4.x' RNGB h/w block self-initialized.  RNG4, available
on SEC versions 5 and beyond, is based on a different standard
that requires manual initialization.

Also update any new errors From the SEC v5.2 reference manual:
The SEC v5.2's RNG4 unit reuses some error IDs, thus the addition
of rng_err_id_list over the CHA-independent err_id_list.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Kim Phillips e13af18a3e crypto: caam - assign 40-bit masks on SEC v5.0 and above
SEC v4.x were only 36-bit, SEC v5+ are 40-bit capable.
Also set a DMA mask for any job ring devices created.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Yuan Kang e24f7c9e87 crypto: caam - hwrng support
caam_read copies random bytes from two buffers into output.

caam rng can fill empty buffer 0xffff bytes at a time,
but the buffer sizes are rounded down to multiple of cacheline size.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Yuan Kang 643b39b031 crypto: caam - chaining support
support chained scatterlists for aead, ablkcipher and ahash.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>

- fix dma unmap leak
- un-unlikely src == dst, due to experience with AF_ALG

Signed-off-by: Kudupudi Ugendreshwar <B38865@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:05 +08:00
Yuan Kang b0e09bae37 crypto: caam - unkeyed ahash support
caam supports and registers unkeyed sha algorithms and md5.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:05 +08:00
Yuan Kang 045e36780f crypto: caam - ahash hmac support
caam supports ahash hmac with sha algorithms and md5.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:05 +08:00
Yuan Kang a299c83704 crypto: caam - link_tbl rename
- rename scatterlist and link_tbl functions
- link_tbl changed to sec4_sg
- sg_to_link_tbl_one changed to dma_to_sec4_sg_one,
  since no scatterlist is use

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:05 +08:00
Yuan Kang 4c1ec1f930 crypto: caam - refactor key_gen, sg
create separate files for split key generation and scatterlist functions.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:05 +08:00
Yuan Kang 8009a383f2 crypto: caam - remove jr register/deregister
remove caam_jr_register and caam_jr_deregister
to allow sharing of job rings.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:04 +08:00
Yuan Kang 6ec4733493 crypto: caam - support external seq in/out lengths
functions for external storage of seq in/out lengths,
i.e., for 32-bit lengths.

These type-dependent functions automatically determine whether to
store the length internally (embedded in the command header word) or
externally (after the address pointer), based on size of the type
given.

Signed-off-by: Yuan Kang <Yuan.Kang@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:04 +08:00
Hemant Agrawal a23d80e0b7 crypto: caam - add PDB (Protocol Descriptor Block) definitions
Add a PDB header file to support building protocol descriptors.

Signed-off-by: Steve Cornelius <sec@pobox.com>
Signed-off-by: Hemant Agrawal <hemant@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:04 +08:00
Kim Phillips 991c569c5d crypto: caam - fix descriptor length adjustments for protocol descriptors
init_desc, by always ORing with 1 for the descriptor header inclusion
into the descriptor length, and init_sh_desc_pdb, by always specifying
the descriptor length modification for the PDB via options, would not
allow for odd length PDBs to be embedded in the constructed descriptor
length.  Fix this by simply changing the OR to an addition.

also round-up pdb_bytes to the next SEC command unit size, to
allow for, e.g., optional packet header bytes that aren't a
multiple of CAAM_CMD_SZ.

Reported-by: Radu-Andrei BULIE <radu.bulie@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Cc: Yashpal Dutta <yashpal.dutta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:03 +08:00
Yashpal Dutta c4b664063e crypto: caam - fix start index for Protocol shared descriptors
In case of protocol acceleration descriptors, Shared descriptor header must
carry size of header length + PDB length in words which will be skipped by
DECO while processing descriptor to provide first command word offset

Signed-off-by: Yashpal Dutta <yashpal.dutta@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:03 +08:00
Kim Phillips a68d259587 crypto: caam - fix input job ring element dma mapping size
SEC4 h/w gets configured in 32- vs. 36-bit physical
addressing modes depending on the size of dma_addr_t,
which is not always equal to sizeof(u32 *).

Also fixed alignment of a dma_unmap call whilst in there.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:03 +08:00
Kim Phillips 70d793cc30 crypto: caam - remove line continuations from ablkcipher_append_src_dst
presumably leftovers from possible macro development.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:03 +08:00
Jussi Kivilinna 70ef2601fe crypto: move arch/x86/include/asm/aes.h to arch/x86/include/asm/crypto/
Move AES header to the new asm/crypto directory.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:03 +08:00
Jussi Kivilinna d4af0e9d6e crypto: move arch/x86/include/asm/serpent-{sse2|avx}.h to arch/x86/include/asm/crypto/
Move serpent crypto headers to the new asm/crypto/ directory.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:02 +08:00
Jussi Kivilinna a7378d4e55 crypto: twofish-avx - remove duplicated glue code and use shared glue code from glue_helper
Now that shared glue code is available, convert twofish-avx to use it.

Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:02 +08:00
Jussi Kivilinna 414cb5e7cc crypto: twofish-x86_64-3way - remove duplicated glue code and use shared glue code from glue_helper
Now that shared glue code is available, convert twofish-x86_64-3way to use it.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:02 +08:00
Jussi Kivilinna 964263afdc crypto: camellia-x86_64 - remove duplicated glue code and use shared glue code from glue_helper
Now that shared glue code is available, convert camellia-x86_64 to use it.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:02 +08:00
Jussi Kivilinna 1d0debbd46 crypto: serpent-avx: remove duplicated glue code and use shared glue code from glue_helper
Now that shared glue code is available, convert serpent-avx to use it.

Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:01 +08:00
Jussi Kivilinna 596d875052 crypto: serpent-sse2 - split generic glue code to new helper module
Now that serpent-sse2 glue code has been made generic, it can be split to
separate module.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:01 +08:00
Jussi Kivilinna e81792fbc2 crypto: serpent-sse2 - prepare serpent-sse2 glue code into generic x86 glue code for 128bit block ciphers
Block cipher implementations in arch/x86/crypto/ contain common glue code that
is currently duplicated in each module (camellia-x86_64, twofish-x86_64-3way,
twofish-avx, serpent-sse2 and serpent-avx). This patch prepares serpent-sse2
glue into generic glue code for all 128bit block ciphers to use in
arch/x86/crypto.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:01 +08:00
Jussi Kivilinna a9629d7142 crypto: aes_ni - change to use shared ablk_* functions
Remove duplicate ablk_* functions and make use of ablk_helper module instead.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:01 +08:00
Jussi Kivilinna 30a0400882 crypto: twofish-avx - change to use shared ablk_* functions
Remove duplicate ablk_* functions and make use of ablk_helper module instead.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:01 +08:00
Jussi Kivilinna ffaf915632 crypto: ablk_helper - move ablk_* functions from serpent-sse2/avx glue code to shared module
Move ablk-* functions to separate module to share common code between cipher
implementations.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:00 +08:00
Seth Jennings 7c76bdd7c3 crypto: nx - fix typo in nx driver config option
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:00 +08:00
Seth Jennings 95ead5d7ff crypto: nx - move nx build to driver/crypto Makefile
When the nx driver was pulled, the Makefile that actually
builds it is arch/powerpc/Makefile. This is unnatural.

This patch moves the line that builds the nx driver from
arch/powerpc/Makefile to drivers/crypto/Makefile where it
belongs.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:00 +08:00
Benoît Thébaudeau 3621189064 hwrng: mxc-rnga - fix data_present API
Commit 45001e9, which added support for RNGA, ignored the previous commit
984e976, which changed the data_present API.

Cc: Matt Mackall <mpm@selenic.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Alan Carvalho de Assis <acassis@gmail.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:00 +08:00
Herbert Xu 398710379f crypto: algapi - Move larval completion into algboss
It has been observed that sometimes the crypto allocation code
will get stuck for 60 seconds or multiples thereof.  This is
usually caused by an algorithm failing to pass the self-test.

If an algorithm fails to be constructed, we will immediately notify
all larval waiters.  However, if it succeeds in construction, but
then fails the self-test, we won't notify anyone at all.

This patch fixes this by merging the notification in the case
where the algorithm fails to be constructed with that of the
the case where it pases the self-test.  This way regardless of
what happens, we'll give the larval waiters an answer.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-22 20:08:29 +08:00
Jussi Kivilinna 3387e7d690 crypto: serpent-sse2/avx - allow both to be built into kernel
Rename serpent-avx assembler functions so that they do not collide with
serpent-sse2 assembler functions when linking both versions in to same
kernel image.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-14 10:09:03 +08:00
Jussi Kivilinna d366db605c crypto: arc4 - improve performance by using u32 for ctx and variables
This patch changes u8 in struct arc4_ctx and variables to u32 (as AMD seems
to have problem with u8 array). Below are tcrypt results of old 1-byte block
cipher versus ecb(arc4) with u8 and ecb(arc4) with u32.

tcrypt results, x86-64 (speed ratios: new-u32/old, new-u8/old):

                  u32    u8
AMD Phenom II   : x3.6   x2.7
Intel Core 2    : x2.0   x1.9

tcrypt results, i386 (speed ratios: new-u32/old, new-u8/old):

                  u32    u8
Intel Atom N260 : x1.5   x1.4

Cc: Jon Oberheide <jon@oberheide.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-14 10:07:23 +08:00
Jussi Kivilinna ce6dd36898 crypto: arc4 - improve performance by adding ecb(arc4)
Currently arc4.c provides simple one-byte blocksize cipher which is wrapped
by ecb() module, giving function call overhead on every encrypted byte. This
patch adds ecb(arc4) directly into arc4.c for higher performance.

tcrypt results (speed ratios: new/old):

AMD Phenom II, x86-64 : x2.7
Intel Core 2, x86-64  : x1.9
Intel Atom N260, i386 : x1.4

Cc: Jon Oberheide <jon@oberheide.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-14 10:07:21 +08:00
Jussi Kivilinna 31b4cd2907 crypto: testmgr - add ecb(arc4) speed tests
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-14 10:07:20 +08:00
Paul Bolle d691af0002 crypto: s390 - clean up DES code a bit more
Commit 98971f8439 ("crypto: s390 - cleanup
DES code") should have also removed crypto_des.h. That file is unused
and unneeded since that commit. So let's clean up that file too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-14 10:07:15 +08:00
Johannes Goetzfried 7efe407672 crypto: serpent - add x86_64/avx assembler implementation
This patch adds a x86_64/avx assembler implementation of the Serpent block
cipher. The implementation is very similar to the sse2 implementation and
processes eight blocks in parallel. Because of the new non-destructive three
operand syntax all move-instructions can be removed and therefore a little
performance increase is provided.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

serpent-avx-x86_64 vs. serpent-sse2-x86_64
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.03x   1.01x   1.01x   1.01x   1.00x   1.00x   1.00x   1.00x   1.00x   1.01x
64B     1.00x   1.00x   1.00x   1.00x   1.00x   0.99x   1.00x   1.01x   1.00x   1.00x
256B    1.05x   1.03x   1.00x   1.02x   1.05x   1.06x   1.05x   1.02x   1.05x   1.02x
1024B   1.05x   1.02x   1.00x   1.02x   1.05x   1.06x   1.05x   1.03x   1.05x   1.02x
8192B   1.05x   1.02x   1.00x   1.02x   1.06x   1.06x   1.04x   1.03x   1.04x   1.02x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.01x   1.00x   1.01x   1.01x   1.00x   1.00x   0.99x   1.03x   1.01x   1.01x
64B     1.00x   1.00x   1.00x   1.00x   1.00x   1.00x   1.00x   1.01x   1.00x   1.02x
256B    1.05x   1.02x   1.00x   1.02x   1.05x   1.02x   1.04x   1.05x   1.05x   1.02x
1024B   1.06x   1.02x   1.00x   1.02x   1.07x   1.06x   1.05x   1.04x   1.05x   1.02x
8192B   1.05x   1.02x   1.00x   1.02x   1.06x   1.06x   1.04x   1.05x   1.05x   1.02x

serpent-avx-x86_64 vs aes-asm (8kB block):
         128bit  256bit
ecb-enc  1.26x   1.73x
ecb-dec  1.20x   1.64x
cbc-enc  0.33x   0.45x
cbc-dec  1.24x   1.67x
ctr-enc  1.32x   1.76x
ctr-dec  1.32x   1.76x
lrw-enc  1.20x   1.60x
lrw-dec  1.15x   1.54x
xts-enc  1.22x   1.64x
xts-dec  1.17x   1.57x

Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:47:43 +08:00
Johannes Goetzfried 4da7de4d8b crypto: testmgr - expand twofish test vectors
The AVX implementation of the twofish cipher processes 8 blocks parallel, so we
need to make test vectors larger to check parallel code paths. Test vectors are
also large enough to deal with 16 block parallel implementations which may occur
in the future.

Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:46:07 +08:00
Johannes Goetzfried 107778b592 crypto: twofish - add x86_64/avx assembler implementation
This patch adds a x86_64/avx assembler implementation of the Twofish block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the 3way-parallel functions from the twofish-x86_64-3way
module are called. A good performance increase is provided for blocksizes
greater or equal to 128B.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

twofish-avx-x86_64 vs. twofish-x86_64-3way
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.96x   0.97x   1.00x   0.95x   0.97x   0.97x   0.96x   0.95x   0.95x   0.98x
64B     0.99x   0.99x   1.00x   0.99x   0.98x   0.98x   0.99x   0.98x   0.99x   0.98x
256B    1.20x   1.21x   1.00x   1.19x   1.15x   1.14x   1.19x   1.20x   1.18x   1.19x
1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.24x   1.26x   1.28x   1.26x   1.27x
8192B   1.31x   1.32x   1.00x   1.31x   1.25x   1.25x   1.28x   1.29x   1.28x   1.30x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.96x   0.96x   1.00x   0.96x   0.97x   0.98x   0.95x   0.95x   0.95x   0.96x
64B     1.00x   0.99x   1.00x   0.98x   0.98x   1.01x   0.98x   0.98x   0.98x   0.98x
256B    1.20x   1.21x   1.00x   1.21x   1.15x   1.15x   1.19x   1.20x   1.18x   1.19x
1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.23x   1.26x   1.27x   1.26x   1.27x
8192B   1.31x   1.33x   1.00x   1.31x   1.26x   1.26x   1.29x   1.29x   1.28x   1.30x

twofish-avx-x86_64 vs aes-asm (8kB block):
         128bit  256bit
ecb-enc  1.19x   1.63x
ecb-dec  1.18x   1.62x
cbc-enc  0.75x   1.03x
cbc-dec  1.23x   1.67x
ctr-enc  1.24x   1.65x
ctr-dec  1.24x   1.65x
lrw-enc  1.15x   1.53x
lrw-dec  1.14x   1.52x
xts-enc  1.16x   1.56x
xts-dec  1.16x   1.56x

Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:46:07 +08:00
Phil Sutter 4d03c5047a crypto: mv_cesa - fix for hash finalisation with data
Since mv_hash_final_fallback() uses ctx->state, read out the digest
state register before calling it.

Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:46:05 +08:00
Phil Sutter 5741d2eeae crypto: mv_cesa - initialise the interrupt status field to zero
Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:41:21 +08:00
Phil Sutter 170dd56dfc crypto: mv_cesa - add an expiry timer in case anything goes wrong
The timer triggers when 500ms have gone by after triggering the engine
and no completion interrupt was received. The callback then tries to
sanitise things as well as possible.

Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:37:20 +08:00
Sonic Zhang b8840098b7 crypto: bfin_crc - CRC hardware driver for BF60x family processors.
The CRC peripheral is a hardware block used to compute the CRC of the block
of data. This is based on a CRC32 engine which computes the CRC value of 32b
data words presented to it. For data words of < 32b in size, this driver
pack 0 automatically into 32b data units. This driver implements the async
hash crypto framework API.

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:37:19 +08:00
Sonic Zhang a482b081a2 crypto: testmgr - Add new test cases for Blackfin CRC crypto driver
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:37:17 +08:00
Mathias Krause 65df577439 crypto: sha1 - use Kbuild supplied flags for AVX test
Commit ea4d26ae ("raid5: add AVX optimized RAID5 checksumming")
introduced x86/ arch wide defines for AFLAGS and CFLAGS indicating AVX
support in binutils based on the same test we have in x86/crypto/ right
now. To minimize duplication drop our implementation in favour to the
one in x86/.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-12 16:37:16 +08:00
Linus Torvalds 4e3c8a1b1c Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This push fixes an unaligned fault on x86-32 with aesni-intel and an
  RNG failure with atmel-rng (repeated bits)."

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: aesni-intel - fix unaligned cbc decrypt for x86-32
  hwrng: atmel-rng - fix race condition leading to repeated bits
2012-06-11 16:31:52 +03:00