The caam job rings (input/output job ring) are allocated using
dma_map_single(). These job rings can be visualized as the ring
buffers in which the jobs are en-queued/de-queued. The s/w enqueues
the jobs in input job ring which h/w dequeues and after processing
it copies the jobs in output job ring. Software then de-queues the
job from output ring. Using dma_map/unmap_single() is not preferred
way to allocate memory for this type of requirements because this
adds un-necessary complexity.
Example, if bounce buffer (SWIOTLB) will get used then to make any
change visible in this memory to other processing unit requires
dmap_unmap_single() or dma_sync_single_for_cpu/device(). The
dma_unmap_single() can not be used as this will free the bounce
buffer, this will require changing the job rings on running system
and I seriously doubt that it will be not possible or very complex
to implement. Also using dma_sync_single_for_cpu/device() will also
add unnecessary complexity.
The simple and preferred way is using dma_alloc_coherent() for these
type of memory requirements.
This resolves the Linux boot crash issue when "swiotlb=force" is set
in bootargs on systems which have memory more than 4G.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
there is no noticeable benefit for multiple cores to process one
job ring's output ring: in fact, we can benefit from cache effects
of having the back-half stay on the core that receives a particular
ring's interrupts, and further relax general contention and the
locking involved with reading outring_used, since tasklets run
atomically.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Memory barriers are implied by the i/o register write implementation
(at least on Power). So we can remove the redundant wmb() in
caam_jr_enqueue, and, in dequeue(), hoist the h/w done notification
write up to before we need to increment the head of the ring, and
save an smp_mb.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Code was needlessly checking the s/w job ring when there
would be nothing to process if the h/w's output completion
ring were empty anyway.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The enqueue lock isn't used in any interrupt context, and
the dequeue lock isn't used in the h/w interrupt context,
only in bh context.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SEC v4.x were only 36-bit, SEC v5+ are 40-bit capable.
Also set a DMA mask for any job ring devices created.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SEC4 h/w gets configured in 32- vs. 36-bit physical
addressing modes depending on the size of dma_addr_t,
which is not always equal to sizeof(u32 *).
Also fixed alignment of a dma_unmap call whilst in there.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
- add IRQF_SHARED to request_irq flags to support parts such as
the p1023 that has one IRQ line per couple of rings.
- resetting a job ring triggers an interrupt, so move request_irq
prior to jr_reset to avoid 'got IRQ but nobody cared' messages.
- disable IRQs in h/w to avoid contention between reset and
interrupt status
- delete invalid comment - if there were incomplete jobs,
module would be in use, preventing an unload.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The SEC4 supercedes the SEC2.x/3.x as Freescale's
Integrated Security Engine. Its programming model is
incompatible with all prior versions of the SEC (talitos).
The SEC4 is also known as the Cryptographic Accelerator
and Assurance Module (CAAM); this driver is named caam.
This initial submission does not include support for Data Path
mode operation - AEAD descriptors are submitted via the job
ring interface, while the Queue Interface (QI) is enabled
for use by others. Only AEAD algorithms are implemented
at this time, for use with IPsec.
Many thanks to the Freescale STC team for their contributions
to this driver.
Signed-off-by: Steve Cornelius <sec@pobox.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>