Expose new counters using the get_hw_stats callback.
We expose the following counters:
+---------------------+----------------------------------------+
| Name | Description |
|---------------------+----------------------------------------|
|sent_pkts | number of sent pkts |
|---------------------+----------------------------------------|
|rcvd_pkts | number of received packets |
|---------------------+----------------------------------------|
|out_of_sequence | number of errors due to packet |
| | transport sequence number |
|---------------------+----------------------------------------|
|duplicate_request | number of received duplicated packets. |
| | A request that previously executed is |
| | named duplicated. |
|---------------------+----------------------------------------|
|rcvd_rnr_err | number of received RNR by completer |
|---------------------+----------------------------------------|
|send_rnr_err | number of sent RNR by responder |
|---------------------+----------------------------------------|
|rcvd_seq_err | number of out of sequence packets |
| | received |
|---------------------+----------------------------------------|
|ack_deffered | number of deferred handling of ack |
| | packets. |
|---------------------+----------------------------------------|
|retry_exceeded_err | number of times retry exceeded |
|---------------------+----------------------------------------|
|completer_retry_err | number of times completer decided to |
| | retry |
|---------------------+----------------------------------------|
|send_err | number of failed send packet |
+---------------------+----------------------------------------+
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Additionally, make it easier to detect skb leaks by issuing a warning
if a leak occurs.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Cc: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Change do_complete() such that an error completion is not only
generated if a QP is in the error state but also if a work request
failed.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This change makes the code easier to read and avoids that code is
duplicated.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
It is strongly recommended to report kernel warnings once instead
of every time a condition is hit. Hence change WARN_ON() into
WARN_ON_ONCE() / BUILD_BUG_ON() as appropriate.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If the completer is in the middle of a large read operation, one
lost packet can cause havoc. Going to COMPST_ERROR_RETRY will
cause the requester to resend the request. After that, any packet
from the first attempt still in the receive queue will be
interpreted as an error, restarting the error/retry sequence.
The transfer will quickly exhaust its retries.
This behavior is very noticeable when doing 512KB reads on a
QEMU system configured with 1500B MTU.
Also, a resent request here will prompt the responder on the
other side to immediately start resending, but the resent
packets will get stuck in the already-loaded receive queue and
will never be processed.
Rather than erroring out every time an unexpected future packet
arrives, just drop it. Eventually the retry timer will send a
duplicate request; the completer will be able to make progress since
the queue will start relatively empty.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
It might be possible for all of a QP's references to be dropped
while one of that QP's tasklets is running.
For example, the completer might run during QP destroy.
If qp->valid is false, it will drop all of the packets on
the resp_pkts list, potentially removing the last reference.
Then it tries to advance the SQ consumer pointer. If the
SQ's buffer has already been destroyed, the system will
panic.
To be safe, hold a reference on the QP for the duration
of each tasklet.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A simple userspace application might poll the CQ, find a completion,
and then attempt to post a new WQE to the SQ. A spurious error can
occur if the userspace application detects a full SQ in the instant
before the kernel is able to advance the SQ consumer pointer.
This is noticeable when using single-entry SQs with ibv_rc_pingpong
if lots of kernel and userspace library debugging is enabled.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
1. Debugging qp state transitions and qp errors in loopback and
multiple QP tests is difficult without qp numbers in debug logs.
This patch adds qp number to important debug logs.
2. Instead of having rxe: prefix in few logs and not having in
few logs, using uniform module name prefix using pr_fmt macro.
3. Code cleanup for various warnings reported by checkpatch for
incomplete unsigned data type, line over 80 characters, return
statements.
Signed-off-by: Parav Pandit <pandit.parav@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Soft RoCE (RXE) - The software RoCE driver
ib_rxe implements the RDMA transport and registers to the RDMA core
device as a kernel verbs provider. It also implements the packet IO
layer. On the other hand ib_rxe registers to the Linux netdev stack
as a udp encapsulating protocol, in that case RDMA, for sending and
receiving packets over any Ethernet device. This yields a RDMA
transport over the UDP/Ethernet network layer forming a RoCEv2
compatible device.
The configuration procedure of the Soft RoCE drivers requires
binding to any existing Ethernet network device. This is done with
/sys interface.
A userspace Soft RoCE library (librxe) provides user applications
the ability to run with Soft RoCE devices. The use of rxe verbs ins
user space requires the inclusion of librxe as a device specifics
plug-in to libibverbs. librxe is packaged separately.
Architecture:
+-----------------------------------------------------------+
| Application |
+-----------------------------------------------------------+
+-----------------------------------+
| libibverbs |
User +-----------------------------------+
+----------------+ +----------------+
| librxe | | HW RoCE lib |
+----------------+ +----------------+
+---------------------------------------------------------------+
+--------------+ +------------+
| Sockets | | RDMA ULP |
+--------------+ +------------+
+--------------+ +---------------------+
| TCP/IP | | ib_core |
+--------------+ +---------------------+
+------------+ +----------------+
Kernel | ib_rxe | | HW RoCE driver |
+------------+ +----------------+
+------------------------------------+
| NIC driver |
+------------------------------------+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+-----------------------------------------------------------+
| Application |
+-----------------------------------------------------------+
+-----------------------------------+
| libibverbs |
User +-----------------------------------+
+----------------+ +----------------+
| librxe | | HW RoCE lib |
+----------------+ +----------------+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--------------+ +------------+
| Sockets | | RDMA ULP |
+--------------+ +------------+
+--------------+ +---------------------+
| TCP/IP | | ib_core |
+--------------+ +---------------------+
+------------+ +----------------+
Kernel | ib_rxe | | HW RoCE driver |
+------------+ +----------------+
+------------------------------------+
| NIC driver |
+------------------------------------+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Soft RoCE resources:
[1[ https://github.com/SoftRoCE/librxe-dev librxe - source code in
Github
[2] https://github.com/SoftRoCE/rxe-dev/wiki/rxe-dev:-Home - Soft RoCE
Wiki page
[3] https://github.com/SoftRoCE/librxe-dev - Soft RoCE userspace library
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>