Commit Graph

54 Commits

Author SHA1 Message Date
Michael J. Ruhl 2b0ad2da8f IB/{rdmavt, hfi1, qib}: Add helpers to hide SWQE WR details
Add some helper functions to hide struct rvt_swqe details.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-28 22:34:26 -03:00
Michael J. Ruhl d310c4bf8a IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs
Historically rdmavt destroy_ah() has returned an -EBUSY when the AH has a
non-zero reference count.  IBTA 11.2.2 notes no such return value or error
case:

	Output Modifiers:
	- Verb results:
	- Operation completed successfully.
	- Invalid HCA handle.
	- Invalid address handle.

ULPs never test for this error and this will leak memory.

The reference count exists to allow for driver independent progress
mechanisms to process UD SWQEs in parallel with post sends.  The SWQE will
hold a reference count until the UD SWQE completes and then drops the
reference.

Fix by removing need to reference count the AH.  Add a UD specific
allocation to each SWQE entry to cache the necessary information for
independent progress.  Copy the information during the post send
processing.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-28 22:34:26 -03:00
Michael J. Ruhl 0b79b27748 IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of posting
The post_send() path determines if it should post directly or, schedule
the post for later.  The current logic is:

  if the swqe ring is empty or (for hfi1) wqe->length <= piothreshold
    post the send
  else
    schedule

This can allow large requests to call the send engine directly.  Large
requests can potentially produce a large number of packets prior to
returning to the caller, blocking the caller from posting more requests,
and allowing better parallel processing.

Allow the driver(s) more say in this logic (pass call_send to the driver,
rather than examining a return value).

Update hfi1/qib logic to schedule the send engine if an RC or UC message
is larger than the QP MTU size.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:55:02 -06:00
Mike Marciniszyn 557fafe1bf IB/qib: Convert qp_stats debugfs interface to use new iterator API
Continue porting copy/paste code into rdmavt from qib.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28 19:12:30 -04:00
Leon Romanovsky 0f4d027c3b IB/{rdmavt, qib, hfi1}: Remove gfp flags argument
The caller to the driver marks GFP_NOIO allocations with help
of memalloc_noio-* calls now. This makes redundant to pass down
to the driver gfp flags, which can be GFP_KERNEL only.

The patch removes the gfp flags argument and updates all driver paths.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:23 -04:00
Dasaratharaman Chandramouli d8966fcd4c IB/core: Use rdma_ah_attr accessor functions
Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Venkata Sandeep Dhanalakota b4238e7057 IB/qib: Use new rdmavt timers
Reduce qib code footprint by using the rdmavt timers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:40 -05:00
Brian Welty 696513e8cf IB/hfi1, qib, rdmavt: Move AETH credit functions into rdmavt
Add rvt_compute_aeth() and rvt_get_credit() as shared functions in
rdmavt, moved from hfi1/qib logic.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:38 -05:00
Dennis Dalessandro 84b3adc243 IB/qib: Remove qpt_mask global
There is no need to have a global qpt_mask as that does not support the
multiple chip model which qib has. Instead rely on the value which
exists already in the device data (dd).

Fixes: 898fa52b4a "IB/qib: Remove qpn, qp tables and related variables from qib"
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02 08:42:11 -04:00
Mike Marciniszyn c62fb260a8 IB/hfi1,IB/qib: Fix qp_stats sleep with rcu read lock held
The qp init function does a kzalloc() while holding the RCU
lock that encounters the following warning with a debug kernel
when a cat of the qp_stats is done:

[  231.723948] rcu_scheduler_active = 1, debug_locks = 0
[  231.731939] 3 locks held by cat/11355:
[  231.736492]  #0:  (debugfs_srcu){......}, at: [<ffffffff813001a5>] debugfs_use_file_start+0x5/0x90
[  231.746955]  #1:  (&p->lock){+.+.+.}, at: [<ffffffff81289a6c>] seq_read+0x4c/0x3c0
[  231.755873]  #2:  (rcu_read_lock){......}, at: [<ffffffffa0a0c535>] _qp_stats_seq_start+0x5/0xd0 [hfi1]
[  231.766862]

The init functions do an implicit next which requires the rcu read lock
before the kzalloc().

Fix for both drivers is to change the scope of the init function to only
do the allocation and the initialization of the just allocated iter.

The implict next is moved back into the respective start functions to fix
the issue.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
CC: <stable@vger.kernel.org> # 4.6.x-
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22 14:31:34 -04:00
Mike Marciniszyn 9ec4faa391 IB/qib: Add qib post send table
Add initial table for table driven post_send support.

Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02 15:47:44 -04:00
Mike Marciniszyn 91702b4a39 IB/qib, staging/rdma/hfi1, IB/rdmavt: progress selection changes
The non-rdamvt versions of qib and hfi1 allow for a differing
heuristic to override a schedule progress in favor of a direct
call the the progress routine.

This patch adds that for both drivers and rdmavt.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:38:14 -05:00
Mike Marciniszyn 46a80d62e6 IB/qib, staging/rdma/hfi1: add s_hlock for use in post send
This patch adds an additional lock to reduce contention on the s_lock.

This lock is used in post_send() so that the post_send is not
serialized with the send engine and other send related processing.

To do this the s_next_psn is now maintained on post_send() while
post_send() related fields are moved to a new cache line.  There is
an s_avail maintained for the post_send() to mitigate trading cache
lines with the send engine.  The lock is released/acquired around
releasing the just built packet to the egress mechanism.

Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:38:07 -05:00
Harish Chegondi 20f333b613 IB/qib: Rename several functions by adding a "qib_" prefix
This would avoid conflict with the functions in hfi1 that have similar
names when both qib and hfi1 drivers are configured to be built into
the kernel. This issue came up in the 0-day build report.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:38:07 -05:00
Mike Marciniszyn 08279d5c94 staging/rdma/hfi1: use new RNR timer
Use the new RNR timer for hfi1.

For qib, this timer doesn't exist, so exploit driver
callbacks to use the new timer as appropriate.

Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:38:03 -05:00
Harish Chegondi 8e4c066634 IB/qib: Remove destroy queue pair code
Destroy QP functionality in rdmavt will be used instead.
Remove the remove_qp function being called exclusively by destroy qp code.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:34 -05:00
Harish Chegondi 70696ea75b IB/qib: Remove modify queue pair code
Modify queue pair functionality in rdmavt will be used instead.
Remove ancillary functions which are being used by modify QP code.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:34 -05:00
Harish Chegondi 1cefc2cd20 IB/qib: Remove qib_lookup_qpn and use rvt_lookup_qpn instead
Add calls to rcu_read_lock()/rcu_read_unlock() as rvt_lookup_qpn callers
must hold the rcu_read_lock before calling and keep the lock until the
returned qp is no longer in use.

Remove lookaside qp and some qp refcount atomics in the sdma send code
that is redundant with the s_dma_busy refcount, which will also stall
the state processing to the reset state.

Change the qpn hash function to hash_32 which is hash function used
in rvt_lookup_qpn. qpn_hash function would be eliminated in later patches.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:33 -05:00
Harish Chegondi 034a3e7079 IB/qib: Remove qib_query_qp function
Rely on rvt_query_qp function defined in rdmavt

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:30 -05:00
Harish Chegondi 18f6c582b3 IB/qib: Remove qib multicast verbs functions
Multicast is now supported by rdmavt. Remove the verbs multicast functions
and use that.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:30 -05:00
Harish Chegondi db3ef0eb84 IB/qib: Use rdmavt version of post_send
This patch removes the post_send and post_one_send from the qib driver.
The "posting" of sends will be done by rdmavt which will walk a WQE and
queue work. This patch will still provide the capability to schedule that
work as well as kick the progress. These are provided to the rdmavt layer.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:29 -05:00
Harish Chegondi 4bb88e5f84 IB/qib: Remove completion queue data structures and functions from qib
Use the completion queue functionality provided by rdmavt.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:29 -05:00
Harish Chegondi 47c7ea6d8e IB/qib: Remove create qp and create qp table functionality
Rely on rdmavt functions for creation of qp and qp table.  Function to
allocate a qpn is still being provided by qib as the algorithm to allocate
a qpn in qib is different from that of the algorithm in rdmavt.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:23 -05:00
Harish Chegondi 01ba79d4dd IB/qib: Use rdmavt send and receive flags
Use the definitions of the s_flags and r_flags which are now in rdmavt.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:23 -05:00
Harish Chegondi 898fa52b4a IB/qib: Remove qpn, qp tables and related variables from qib
This patch removes the private queue pair structure and the table which
holds the queue pair numbers in favor of using what is provided by rdmavt.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:22 -05:00
Harish Chegondi cd18201f5e IB/qib: Remove mmap from qib
Since mmap functionality has been moved into rdmavt, its time for qib to
use that.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:22 -05:00
Harish Chegondi f24a6d4887 IB/qib: Remove ibport and use rdmavt version
Remove several ibport members from qib and use the rdmavt version. rc_acks,
rc_qacks, and rc_delayed_comp are defined as per CPU variables in rdmavt.
Add support for these rdmavt per CPU variables which were not per cpu
variables in qib ibport structure.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:21 -05:00
Dennis Dalessandro 894c727b6a IB/qib: Remove srq from qib
Remove srq from qib now that it has been moved into rdmavt.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:18 -05:00
Dennis Dalessandro 96ab1ac13f IB/qib: Use address handle in rdmavt and remove from qib
Original patch from Kamal Heib <kamalh@mellanox.com>, split
apart from original.

Remove AH from qib and use rdmavt version.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:18 -05:00
Dennis Dalessandro 7c2e11fe2d IB/qib: Remove qp and mr functionality from qib
Remove qp and mr support from qib and use rdmavt. These two changes
cannot be reasonably be split apart into separate patches because they
depend on each other in mulitple places. This paves the way to remove
even more functions in subsequent patches.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:18 -05:00
Dennis Dalessandro ffc269075b IB/qib: Remove driver specific members from qib qp type
In preparation for moving the queue pair data structure to rdmavt the
members of the driver specific queue pairs which are not common need to be
pushed off to a private driver structure. This structure will be available
in the queue pair once moved to rdmavt as a void pointer. This patch while
not adding a lot of value in and of itself is a prerequisite to move the
queue pair out of the drivers and into rdmavt.

The driver specific, private queue pair data structure should condense as
more of the send side code moves to rdmavt.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:17 -05:00
Dennis Dalessandro 869a2a964a IB/qib: Use rdmavt lid defines in qib
Original patch for AH changes from Kamal Heib <kamalh@mellanox.com>, split
apart from original. This patch also removes the qib specific multicast
lid base and permissive lid defines since they are no longer needed.

Use common LID defines in qib driver.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:17 -05:00
Dennis Dalessandro 9ff198f5f2 IB/qib: Remove most uses of QIB_PERMISSIVE_LID and QIB_MULTICAST_LID_BASE
This patch removes most of the uses of QIB_PERMISSIBVE_LID and
QIB_MULTICAST_LID_BASE in favor of the recently added IB_* versions.
There are still minor uses in AH functions as well as the QIB_* defines
but those will be removed in a follow on patch.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:37:16 -05:00
Vinit Agnihotri fbbeb8632b IB/qib: Support creating qps with GFP_NOIO flag
The current code is problematic when the QP creation and ipoib is used to
support NFS and NFS desires to do IO for paging purposes. In that case, the
GFP_KERNEL allocation in qib_qp.c causes a deadlock in tight memory
situations.

This fix adds support to create queue pair with GFP_NOIO flag for connected
mode only to cleanly fail the create queue pair in those situations.

Cc: <stable@vger.kernel.org> # 3.16+
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:17:40 -05:00
Christoph Hellwig e622f2f4ad IB: split struct ib_send_wr
This patch split up struct ib_send_wr so that all non-trivial verbs
use their own structure which embedds struct ib_send_wr.  This dramaticly
shrinks the size of a WR for most common operations:

sizeof(struct ib_send_wr) (old):	96

sizeof(struct ib_send_wr):		48
sizeof(struct ib_rdma_wr):		64
sizeof(struct ib_atomic_wr):		96
sizeof(struct ib_ud_wr):		88
sizeof(struct ib_fast_reg_wr):		88
sizeof(struct ib_bind_mw_wr):		96
sizeof(struct ib_sig_handover_wr):	80

And with Sagi's pending MR rework the fast registration WR will also be
down to a reasonable size:

sizeof(struct ib_fastreg_wr):		64

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt]
Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc]
Tested-by: Haggai Eran <haggaie@mellanox.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
2015-10-08 11:09:10 +01:00
Andreea-Cristina Bernat 03c885913f IB/qib: Replace rcu_assign_pointer() with RCU_INIT_POINTER() in qib_qp.c
According to RCU_INIT_POINTER()'s block comment 3.a, it can be used if
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

"3.   The referenced data structure has already been exposed to readers either
at compile time or via rcu_assign_pointer() -and-
 a.   You have not made -any- reader-visible changes to this structure since
then".

These cases fulfill the conditions above because between the
rcu_dereference_protected() call and the rcu_assign_pointer() call
there is no update of that value.  Therefore, this patch makes the
replacement.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (...,
(
 rtnl_dereference(...)
|
 rcu_dereference_protected(...)
) )

[consolidated from http://marc.info/?l=linux-rdma&m=140836578119485&w=2 and
 http://marc.info/?l=linux-rdma&m=140906361403047&w=2]

Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-17 10:26:19 -08:00
Mike Marciniszyn 85cbb7c728 IB/qib: Correct reference counting in debugfs qp_stats
This particular reference count is not needed with the rcu protection,
and the current code leaks a reference count, causing a hang in
qib_qp_destroy().

Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-19 10:18:32 -07:00
Or Gerlitz 60093dc0c8 IB: Return error for unsupported QP creation flags
Fix the usnic and thw qib drivers to err when QP creation flags that
they don't understand are provided.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-06-02 14:58:11 -07:00
Matan Barak dd5f03beb4 IB/core: Ethernet L2 attributes in verbs/cm structures
This patch add the support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.

On the active side, the CM fills. its internal structures from the
path provided by the ULP.  We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC
associated with the REQ message.  We add there taking the ETH L2
attributes from the WC.

When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.

ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB.  Vendor drivers are modified to support the new
function signature.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-14 14:20:54 -08:00
Mike Marciniszyn 1dd173b01f IB/qib: Add qp_stats debug file
This adds a seq_file iterator for reporting the QP hash table when the
qp_stats file is read.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21 17:19:52 -07:00
Mike Marciniszyn f7cf9a618b IB/qib: Remove atomic_inc_not_zero() from QP RCU
Follow Documentation/RCU/rcuref.txt guidance in removing
atomic_inc_not_zero() from QP RCU implementation.

This patch also removes an unneeded synchronize_rcu() in the add path.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21 17:19:46 -07:00
Mike Marciniszyn bcc9b67a5b IB/qib: Fix QP locate/remove race
remove_qp() can execute concurrently with a qib_lookup_qpn() on
another CPU, which in of itself, is ok, given the RCU locking.

The issue is that remove_qp() NULLs out the qp->next field so that a
qib_lookup_qpn() might fail to find a qp if it occurs after the one
that is being deleted.  This is a momentary issue and subsequent
qib_lookup_qpn() calls would find the qp's since the search restarts
from the bucket head.  At scale, the issue might causes dropped
packets and unnecessary retransmissions.

The fix just deletes the qp->next NULL assignment to prevent the
remove_qp() from hiding qp's from qib_lookup_qpn().

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 17:04:18 -08:00
Mike Marciniszyn d359f35430 IB/qib: Fix for broken sparse warning fix
Commit 1fb9fed6d4 ("IB/qib: Fix QP RCU sparse warning") broke QP
hash list deletion in qp_remove() badly.

This patch restores the former for loop behavior, while still fixing
the sparse warnings.

Cc: <stable@vger.kernel.org>
Reviewed-by: Gary Leshner <gary.s.leshner@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-05 09:43:09 -08:00
Mike Marciniszyn 1fb9fed6d4 IB/qib: Fix QP RCU sparse warnings
Commit af061a644a ("IB/qib: Use RCU for qpn lookup") introduced sparse
warnings.

This patch corrects those issues.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-17 10:18:37 -07:00
Mike Marciniszyn 6a82649f21 IB/qib: Avoid returning EBUSY from MR deregister
A timing issue can occur where qib_mr_dereg can return -EBUSY if the
MR use count is not zero.

This can occur if the MR is de-registered while RDMA read response
packets are being progressed from the SDMA ring.  The suspicion is
that the peer sent an RDMA read request, which has already been copied
across to the peer.  The peer sees the completion of his request and
then communicates to the responder that the MR is not needed any
longer.  The responder tries to de-register the MR, catching some
responses remaining in the SDMA ring holding the MR use count.

The code now uses a get/put paradigm to track MR use counts and
coordinates with the MR de-registration process using a completion
when the count has reached zero.  A timeout on the delay is in place
to catch other EBUSY issues.

The reference count protocol is as follows:
- The return to the user counts as 1
- A reference from the lk_table or the qib_ibdev counts as 1.
- Transient I/O operations increase/decrease as necessary

A lot of code duplication has been folded into the new routines
init_qib_mregion() and deinit_qib_mregion().  Additionally, explicit
initialization of fields to zero is now handled by kzalloc().

Also, duplicated code 'while.*num_sge' that decrements reference
counts have been consolidated in qib_put_ss().

Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:05:19 -07:00
Mike Marciniszyn 1c94283ddb IB/qib: Add cache line awareness to qib_qp and qib_devdata structures
This patch reorganizes the QP and devdata files to be more cache line aware.

qib_qp fields in particular are split into read-mostly, send, and receive fields.

qib_devdata fields are split into read-mostly and read/write fields

Testing has show that bidirectional tests improve by as much as 100%
with this patch.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-14 12:43:34 -07:00
Mike Marciniszyn d0f2faf72d IB/qib: Precompute timeout jiffies to optimize latency
A new field is added to qib_qp called timeout_jiffies. It is
initialized upon create and modify.

The field is now used instead of a computation based on qp->timeout.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:56 -07:00
Mike Marciniszyn af061a644a IB/qib: Use RCU for qpn lookup
The heavy weight spinlock in qib_lookup_qpn() is replaced with RCU.
The hash list itself is now accessed via jhash functions instead of mod.

The changes should benefit multiple receive contexts in different
processors by not contending for the lock just to read the hash
structures.

The patch also adds a lookaside_qp (pointer) and a lookaside_qpn in
the context.  The interrupt handler will test the current packet's qpn
against lookaside_qpn if the lookaside_qp pointer is non-NULL.  The
pointer is NULL'ed when the interrupt handler exits.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:54 -07:00
Mike Marciniszyn cc6ea1385b IB/qib: Decode path MTU optimization
Store both the encoded and decoded MTU in the QP structure as a minor
optimization for UC/RC receive routines.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:50 -07:00
Mike Marciniszyn 7c3edd3ff3 IB/qib: Change QPN increment
Changing from +1 to +2 allows for better QP distribution across
receive contexts.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:42:22 -08:00