Rearrange the last request handling for hardware finished hashes
by moving the generation of the fragment operation into this path.
This results in a simplified sequence to handle this case, and
allows us to move the software padded case further down into the
function. Add comments describing these parts.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the test for the last request out of mv_cesa_ahash_dma_last_req()
to its caller, and move the mv_cesa_dma_add_frag() down into this
function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Avoid adding the final operation within the loop, but instead add it
outside. We combine this with the handling for the no-data case.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When we process the last request of data, and the request contains user
data, the loop in mv_cesa_ahash_dma_req_init() marks the first data size
as being iter.base.op_len which does not include the size of the cache
data. This means we end up hashing an insufficient amount of data.
Fix this by always including the cache size in the first operation
length of any request.
This has the effect that for a request containing no user data,
iter.base.op_len === iter.src.op_offset === creq->cache_ptr
As a result, we include one further change to use iter.base.op_len in
the cache-but-no-user-data case to make the next change clearer.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the presence of the scatterlist to determine whether we should load
any new user data to the engine. The following shall always be true at
this point:
iter.base.op_len == 0 === iter.src.sg
In doing so, we can:
1. eliminate the test for iter.base.op_len inside the loop, which
makes the loop operation more obvious and understandable.
2. move the operation generation for the cache-only case.
This prepares the code for the next step in its transformation, and also
uncovers a bug that will be fixed in the next patch.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the calls to mv_cesa_dma_add_frag() into the parent function,
mv_cesa_ahash_dma_req_init(). This is in preparation to changing
when we generate the operation blocks, as we need to avoid generating
a block for a partial hash block at the end of the user data.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If we add a template first-fragment operation, always update the
template to be a mid-fragment. This ensures that mid-fragments
always follow on from a first fragment in every case.
This means we can move the first to mid-fragment update code out of
mv_cesa_ahash_dma_add_data().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper to add the fragment operation block followed by the DMA
entry to launch the operation.
Although at the moment this pattern only strictly appears at one site,
two other sites can be factored as well by slightly changing the order
in which the DMA operations are performed. This should be harmless as
the only thing which matters is to have all the data loaded into SRAM
prior to launching the operation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Multiple locations in the driver test the operation context fragment
type, checking whether it is a first fragment or not. Introduce a
mv_cesa_mac_op_is_first_frag() helper, which returns true if the
fragment operation is for a first fragment.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure that the template operation is fully initialised, otherwise we
end up loading data from the kernel stack into the engines, which can
upset the hash results.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The endianness of the bit length used in the final stage depends on the
endianness of the algorithm - md5 hashes need it to be in little endian
format, whereas SHA hashes need it in big endian format. Use the
previously added algorithm endianness flag to control this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rather than determining whether we're using a MD5 hash by looking at
the digest size, switch to a cleaner solution using a per-request flag
initialised by the method type.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, we read/write the state in CPU endian, but on the final
request, we convert its endian according to the requested algorithm.
(md5 is little endian, SHA are big endian.)
Always keep creq->state in CPU native endian format, and perform the
necessary conversion when copying the hash to the result.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There's an easier way to get at the hash transform - rather than
using crypto_ahash_tfm(ahash), we can get it directly from
req->base.tfm.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As all the import functions and export functions are virtually
identical, factor out their common parts into a generic
mv_cesa_ahash_import() and mv_cesa_ahash_export() respectively. This
performs the actual import or export, and we pass the data pointers and
length into these functions.
We have to switch a % const operation to do_div() in the common import
function to avoid provoking gcc to use the expensive 64-bit by 64-bit
modulus operation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Attempting to use the sha1 digest for openssh via openssl reveals that
the result from the hash is wrong: this happens when we export the
state from one socket and import it into another via calling accept().
The reason for this is because the operation is reset to "initial block"
state, whereas we may be past the first fragment of data to be hashed.
Arrange for the operation code to avoid the initialisation of the state,
thereby preserving the imported state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When a AF_ALG fd is accepted a second time (hence hash_accept() is
used), hash_accept_parent() allocates a new private context using
sock_kmalloc(). This context is uninitialised. After use of the new
fd, we eventually end up with the kernel complaining:
marvell-cesa f1090000.crypto: dma_pool_free cesa_padding, c0627770/0 (bad dma)
where c0627770 is a random address. Poisoning the memory allocated by
the above sock_kmalloc() produces kernel oopses within the marvell hash
code, particularly the interrupt handling.
The following simplfied call sequence occurs:
hash_accept()
crypto_ahash_export()
marvell hash export function
af_alg_accept()
hash_accept_parent() <== allocates uninitialised struct hash_ctx
crypto_ahash_import()
marvell hash import function
hash_ctx contains the struct mv_cesa_ahash_req in its req.__ctx member,
and, as the marvell hash import function only partially initialises
this structure, we end up with a lot of members which are left with
whatever data was in memory prior to sock_kmalloc().
Add zero-initialisation of this structure.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Boris Brezillon <boris.brezillon@free-electronc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Several of the algorithms in marvell/hash.c have a statesize of zero.
When an AF_ALG accept() on an already-accepted file descriptor to
calls into hash_accept(), this causes:
char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req))];
to be zero-sized, but we still pass this to:
err = crypto_ahash_export(req, state);
which proceeds to write to 'state' as if it was a "struct md5_state",
"struct sha1_state" etc. Add the necessary initialisers for the
.statesize member.
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for SHA256 operations.
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for MD5 operations.
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CESA IP supports CPU offload through a dedicated DMA engine (TDMA)
which can control the crypto block.
When you use this mode, all the required data (operation metadata and
payload data) are transferred using DMA, and the results are retrieved
through DMA when possible (hash results are not retrieved through DMA yet),
thus reducing the involvement of the CPU and providing better performances
in most cases (for small requests, the cost of DMA preparation might
exceed the performance gain).
Note that some CESA IPs do not embed this dedicated DMA, hence the
activation of this feature on a per platform basis.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The existing mv_cesa driver supports some features of the CESA IP but is
quite limited, and reworking it to support new features (like involving the
TDMA engine to offload the CPU) is almost impossible.
This driver has been rewritten from scratch to take those new features into
account.
This commit introduce the base infrastructure allowing us to add support
for DMA optimization.
It also includes support for one hash (SHA1) and one cipher (AES)
algorithm, and enable those features on the Armada 370 SoC.
Other algorithms and platforms will be added later on.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>