Lustre uses an advertised max MR size of ~0ULL to indicate it should
use a dma_mr. Hence advertise max MR size as ~0ULL.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
When polling, we do a GTS update if the accumulated cidx_inc == the CQ
depth / 16. However, if the CQ is large enough, CQ depth / 16 exceeds
the size of the field in the GTS word. So we also need to update if
cidx_inc hits CIDXINC_MASK to avoid overflowing the field.
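
The check described above can be sketched roughly as follows. This is a
simplified user-space model, not the driver code: struct hyp_cq,
hyp_consume() and HYP_CIDXINC_MASK are hypothetical names, and the 12-bit
field width is an assumption made only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed width of the CIDXINC field in the GTS word (hypothetical value). */
#define HYP_CIDXINC_MASK 0xfffu

struct hyp_cq {
	uint32_t size;       /* CQ depth in entries */
	uint32_t cidx_inc;   /* consumed entries not yet reported to HW */
	uint32_t gts_writes; /* number of simulated GTS updates */
};

/* Account for one consumed CQE. Returns true when a GTS update was issued. */
static bool hyp_consume(struct hyp_cq *cq)
{
	cq->cidx_inc++;
	if (cq->cidx_inc == (cq->size >> 4) ||
	    cq->cidx_inc == HYP_CIDXINC_MASK) {
		cq->gts_writes++;  /* stand-in for the real register write */
		cq->cidx_inc = 0;
		return true;
	}
	return false;
}
```

With a depth of 64 the update fires every 4 entries. With a depth of 1M
entries, depth / 16 would be 65536, so the mask check is the one that fires.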
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch makes the following fixes in QP flush logic:
- correctly flushes unsignaled WRs followed by a signaled WR
- supports flushing a CQ bound to multiple QPs
- resets cidx_flush if an active queue starts getting HW CQEs again
- marks WQ in error when we leave RTS. This was only being done for
user queues, but we need it for kernel queues too so that
post_send/post_recv will start returning the appropriate error
synchronously
- eats unsignaled read resp CQEs. HW always inserts CQEs so we must
silently discard them if the read work request was unsignaled.
- handles QP flushes with pending SW CQEs. The flush and out of order
completion logic has a bug where if out of order completions are
flushed but not yet polled by the consumer and the qp is then
flushed then we end up inserting duplicate completions.
- c4iw_flush_sq() should only flush wrs that have not already been
flushed. Since we already track where in the SQ we've flushed via
sq.cidx_flush, just start at that point and flush any remaining.
This bug only caused a problem in the presence of unsignaled work
requests.
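
The last item above (start flushing at sq.cidx_flush rather than at the
head of the queue) can be modeled roughly like this. It is a simplified
sketch under assumptions: the hyp_ types and the fixed ring size are
hypothetical, and CQE insertion is reduced to setting a flag.

```c
#include <stdbool.h>

#define HYP_SQ_SIZE 8

struct hyp_swsqe {
	bool flushed;   /* set once a flush completion was generated */
};

struct hyp_sq {
	struct hyp_swsqe sw_sq[HYP_SQ_SIZE];
	int cidx;       /* oldest WR not yet completed */
	int pidx;       /* next free slot */
	int cidx_flush; /* first WR not yet flushed, -1 if nothing flushed */
};

/*
 * Flush every WR between cidx_flush and pidx.
 * Returns the number of WRs newly marked as flushed.
 */
static int hyp_flush_sq(struct hyp_sq *sq)
{
	int flushed = 0;
	int idx;

	if (sq->cidx_flush == -1)
		sq->cidx_flush = sq->cidx;
	idx = sq->cidx_flush;
	while (idx != sq->pidx) {
		sq->sw_sq[idx].flushed = true;
		flushed++;
		if (++idx == HYP_SQ_SIZE)
			idx = 0;  /* wrap around the ring */
	}
	sq->cidx_flush = idx;  /* remember where this flush stopped */
	return flushed;
}
```

Because the position is remembered, calling hyp_flush_sq() a second time
without new work requests flushes nothing, which is how duplicate
completions are avoided in this model.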
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
[ Fixed sparse warning due to htonl/ntohl confusion. - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>
It enables direct DMA by HW to memory region PBL arrays and fast register PBL
arrays from host memory, vs the T4 way of passing these arrays in the WR itself.
The result is lower latency for memory registration, and larger PBL array
support for fast register operations.
This patch also updates ULP_TX_MEM_WRITE command fields for T5. Ordering bit of
ULP_TX_MEM_WRITE is at bit position 22 in T5 and at 23 in T4.
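
A minimal sketch of selecting the ordering bit per chip generation. The bit
positions come from the text above; the enum, macro and function names are
hypothetical and are not the driver's definitions.

```c
#include <stdint.h>

enum hyp_chip { HYP_CHIP_T4, HYP_CHIP_T5 };

/* Bit positions of the ULP_TX_MEM_WRITE ordering bit, per the text above. */
#define HYP_T4_MEMIO_ORDER_SHIFT 23
#define HYP_T5_MEMIO_ORDER_SHIFT 22

/* Return the command-word flag that sets the ordering bit for this chip. */
static uint32_t hyp_memio_order_flag(enum hyp_chip chip)
{
	unsigned int shift = (chip == HYP_CHIP_T5) ?
		HYP_T5_MEMIO_ORDER_SHIFT : HYP_T4_MEMIO_ORDER_SHIFT;

	return UINT32_C(1) << shift;
}
```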
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support for Chelsio T5 adapter.
Enables T5's Write Combining feature.
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add module option db_fc_threshold which is the count of active QPs
that triggers automatic db flow control mode. Automatically transition
to/from flow control mode when the active qp count crosses
db_fc_threshold.
Add more db debugfs stats.
On DB DROP event from the LLD, recover all the iwarp queues.
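
The threshold-driven transition can be sketched as below. This is an
illustrative model only: struct hyp_dev and the helper names are
hypothetical, and treating "crosses" as "active count greater than the
threshold" is an assumption.

```c
#include <stdbool.h>

struct hyp_dev {
	unsigned int db_fc_threshold; /* value of the module option */
	unsigned int active_qps;      /* current number of active QPs */
	bool flow_control;            /* true while in db flow control mode */
};

/* Re-evaluate the mode after the active QP count changed. */
static void hyp_update_mode(struct hyp_dev *dev)
{
	/* Assumption: flow control is on while the count exceeds the threshold. */
	dev->flow_control = dev->active_qps > dev->db_fc_threshold;
}

static void hyp_qp_created(struct hyp_dev *dev)
{
	dev->active_qps++;
	hyp_update_mode(dev);
}

static void hyp_qp_destroyed(struct hyp_dev *dev)
{
	if (dev->active_qps > 0)
		dev->active_qps--;
	hyp_update_mode(dev);
}
```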
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit fe3cc0d99d ("powerpc: Add
pgprot_writecombine") in benh's tree exposes the pgprot_writecombine()
API to drivers on powerpc. cxgb4 has an open-coded version of the same,
so use the common API now that it's available.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This avoids the CIDX_INC overflow issue with T4A2 when running
kernel RDMA applications.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
- Remove dsgl support - doesn't work in T4.
- Wrap the immediate PBL as needed when building it in the wr.
- Adjust max pbl depth allowed based on ulptx alignment requirements.
- Bump the slots per SQ to 5 to allow up to 128MB fast registers.
- Advertise fastreg support by default.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
T4 supports on-chip SQs to reduce latency. This patch adds support for
this in iw_cxgb4:
- Manage ocqp memory like other adapter mem resources.
- Allocate user mode SQs from ocqp mem if available.
- Map ocqp mem to user process using write combining.
- Map PCIE_MA_SYNC reg to user process.
Bump uverbs ABI.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
T4 EQ entries are in multiples of 64 bytes. Currently the RDMA SQ and
RQ use fixed sized entries composed of 4 EQ entries for the SQ and 2
EQ entries for the RQ. For optimal latency with small IO, we need to
change this so the HW only needs to DMA the EQ entries actually used
by a given work request.
Implementation:
- add wq_pidx counter to track where we are in the EQ. cidx/pidx are
used for the sw sq/rq tracking and flow control.
- the variable part of work requests is the SGL. Add new functions to
build the SGL and/or immediate data directly in the EQ memory
wrapping when needed.
- adjust the min burst size for the EQ contexts to 64B.
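
The EQ accounting described above can be sketched as follows. It is a
simplified model under assumptions: the 64-byte entry size is from the text,
while struct hyp_wq and the helper names are hypothetical.

```c
#include <stdint.h>

#define HYP_EQ_ENTRY_SIZE 64u  /* EQ entries are multiples of 64 bytes */

struct hyp_wq {
	uint32_t eq_entries; /* total number of 64B EQ entries in the ring */
	uint32_t wq_pidx;    /* producer position, counted in EQ entries */
};

/* Number of EQ entries a work request of wr_bytes actually occupies. */
static uint32_t hyp_eq_entries_for(uint32_t wr_bytes)
{
	return (wr_bytes + HYP_EQ_ENTRY_SIZE - 1) / HYP_EQ_ENTRY_SIZE;
}

/* Advance wq_pidx past one work request, wrapping at the end of the EQ. */
static uint32_t hyp_produce(struct hyp_wq *wq, uint32_t wr_bytes)
{
	uint32_t n = hyp_eq_entries_for(wr_bytes);

	wq->wq_pidx = (wq->wq_pidx + n) % wq->eq_entries;
	return n;
}
```

A small work request that fits in 64 bytes then consumes one EQ entry
instead of a fixed block of four, so the HW has less to DMA.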
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This replaces the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.
No functional change.
For further information about the background:
http://marc.info/?l=linux-netdev&m=127037540020276&w=2
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- wrap cq->cqidx_inc based on cq size.
- optimize t4_arm_cq logic.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
1) save the timestamp flit in the cq when we consume a CQE.
2) always compare the saved flit with the previous entry flit when
reading the next CQE entry. If the flits don't compare, then we
have overflowed.
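
The two steps above can be sketched roughly like this. It is a simplified
model, not the driver code: each CQE is reduced to its timestamp flit, the
hyp_ names are hypothetical, and the check for whether the next entry is
valid is left out.

```c
#include <stdint.h>

#define HYP_CQ_SIZE 4

enum hyp_status { HYP_OK = 0, HYP_OVERFLOW = -1 };

struct hyp_ts_cq {
	uint64_t queue[HYP_CQ_SIZE]; /* timestamp flit of each CQE slot */
	uint32_t cidx;               /* next entry to read */
	uint64_t saved_flit;         /* flit of the last consumed CQE */
};

/* Step 1: save the timestamp flit when a CQE is consumed. */
static void hyp_ts_consume(struct hyp_ts_cq *cq)
{
	cq->saved_flit = cq->queue[cq->cidx];
	cq->cidx = (cq->cidx + 1) % HYP_CQ_SIZE;
}

/* Step 2: compare the saved flit with the previous entry before reading. */
static enum hyp_status hyp_ts_next(const struct hyp_ts_cq *cq, uint64_t *flit)
{
	uint32_t prev = (cq->cidx == 0) ? HYP_CQ_SIZE - 1 : cq->cidx - 1;

	if (cq->queue[prev] != cq->saved_flit)
		return HYP_OVERFLOW;  /* HW overwrote an entry we already consumed */
	*flit = cq->queue[cq->cidx];
	return HYP_OK;
}
```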
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>