Commit Graph

17 Commits

Author SHA1 Message Date
Dan Williams 074cc47679 ioat2,3: convert to producer/consumer locking
Use separate locks for the descriptor prep (producer) and descriptor
cleanup (consumer) paths.  Allows the producer path to run concurrently
with the cleanup path.  Inspired by Documentation/circular-buffer.txt.

Cc: David Howells <dhowells@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-01 15:22:55 -07:00
Dan Williams abb12dfd50 ioat: convert to circ_buf
Use the common power-of-2 circular buffer macros.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-01 15:22:54 -07:00
Dan Williams aa4d72ae94 ioat: cleanup ->timer_fn() and ->cleanup_fn() prototypes
If the calling convention of ->timer_fn() and ->cleanup_fn() are unified
across hardware versions we can drop parameters to ioat_init_channel() and
unify ioat_is_dma_complete() implementations.

Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct
dma_chan pointer.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 21:21:13 -07:00
Dan Williams 281befa559 ioat2: kill pending flag
The pending == 2 case no longer exists in the driver so, we can use
ioat2_ring_pending() outside the lock to determine if there might be any
descriptors in the ring that the hardware has not seen.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 11:47:43 -07:00
Dan Williams a6d52d7067 ioat2,3: put channel hardware in known state at init
Put the ioat2 and ioat3 state machines in the halted state with all
errors cleared.

The ioat1 init path is not disturbed for stability, there are no
reported ioat1 initiaization issues.

Cc: <stable@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Tested-by: Roland Dreier <rdreier@cisco.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-19 15:36:02 -07:00
Dan Williams bbb20089a3 Merge branch 'dmaengine' into async-tx-next
Conflicts:
	crypto/async_tx/async_xor.c
	drivers/dma/ioat/dma_v2.h
	drivers/dma/ioat/pci.c
	drivers/md/raid5.c
2009-09-08 17:55:21 -07:00
Dan Williams 162b96e63e ioat2,3: cacheline align software descriptor allocations
All the necessary fields for handling an ioat2,3 ring entry can fit into
one cacheline.  Move ->len prior to ->txd in struct ioat_ring_ent, and
move allocation of these entries to a hw-cache-aligned kmem cache to
reduce the number of cachelines dirtied for descriptor management.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:53:04 -07:00
Dan Williams e3232714d4 ioat3: segregate raid engines
The cleanup routine for the raid cases imposes extra checks for handling
raid descriptors and extended descriptors.  If the channel does not
support raid it can avoid this extra overhead by using the ioat2 cleanup
path.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:43:02 -07:00
Dan Williams b094ad3be5 ioat3: xor support
ioat3.2 adds xor offload support for up to 8 sources.  It can also
perform an xor-zero-sum operation to validate whether all given sources
sum to zero, without writing to a destination.  Xor descriptors differ
from memcpy in that one operation may require multiple descriptors
depending on the number of sources.  When the number of sources exceeds
5 an extended descriptor is needed.  These descriptors need to be
accounted for when updating the DMA_COUNT register.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:42:57 -07:00
Dan Williams 5669e31c5a ioat: add 'ioat' sysfs attributes
Export driver attributes for diagnostic purposes:
'ring_size': total number of descriptors available to the engine
'ring_active': number of descriptors in-flight
'capabilities': supported operation types for this channel
'version': Intel(R) QuickData specfication revision

This also allows some chattiness to be removed from the driver startup
as this information is now available via sysfs.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:42:56 -07:00
Dan Williams bf40a6869c ioat3: split ioat3 support to its own file, add memset
Up until this point the driver for Intel(R) QuickData Technology
engines, specification versions 2 and 3, were mostly identical save for
a few quirks.  Version 3.2 hardware adds many new capabilities (like
raid offload support) requiring some infrastructure that is not relevant
for v2.  For better code organization of the new funcionality move v3
and v3.2 support to its own file dma_v3.c, and export some routines from
the base files (dma.c and dma_v2.c) that can be reused directly.

The first new capability included in this code reorganization is support
for v3.2 memset operations.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:42:55 -07:00
Dan Williams 2aec048cdc ioat3: hardware version 3.2 register / descriptor definitions
ioat3.2 adds raid5 and raid6 offload capabilities.

Signed-off-by: Tom Picard <tom.s.picard@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:42:54 -07:00
Dan Williams a309218ace ioat2,3: dynamically resize descriptor ring
Increment the allocation order of the descriptor ring every time we run
out of descriptors up to a maximum of allocation order specified by the
module parameter 'ioat_max_alloc_order'.  After each idle period
decrement the allocation order to a minimum order of
'ioat_ring_alloc_order' (i.e. the default ring size, tunable as a module
parameter).

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:38:54 -07:00
Dan Williams 09c8a5b85e ioat: switch watchdog and reset handler from workqueue to timer
In order to support dynamic resizing of the descriptor ring or polling
for a descriptor in the presence of a hung channel the reset handler
needs to make progress while in a non-preemptible context.  The current
workqueue implementation precludes polling channel reset completion
under spin_lock().

This conversion also allows us to return to opportunistic cleanup in the
ioat2 case as the timer implementation guarantees at least one cleanup
after every descriptor is submitted.  This means the worst case
completion latency becomes the timer frequency (for exceptional
circumstances), but with the benefit of avoiding busy waiting when the
lock is contended.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:30:24 -07:00
Dan Williams 345d852391 ioat: ___devinit annotate the initialization paths
Mark all single use initialization routines with __devinit.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:30:24 -07:00
Dan Williams 6df9183a15 ioat: add some dev_dbg() calls
Provide some output for debugging the driver.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:30:23 -07:00
Dan Williams 5cbafa65b9 ioat2,3: convert to a true ring buffer
Replace the current linked list munged into a ring with a native ring
buffer implementation.  The benefit of this approach is reduced overhead
as many parameters can be derived from ring position with simple pointer
comparisons and descriptor allocation/freeing becomes just a
manipulation of head/tail pointers.

It requires a contiguous allocation for the software descriptor
information.

Since this arrangement is significantly different from the ioat1 chain,
move ioat2,3 support into its own file and header.  Common routines are
exported from driver/dma/ioat/dma.[ch].

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08 17:29:55 -07:00