Commit Graph

825134 Commits

Author SHA1 Message Date
Mike Marciniszyn 715ab1a862 IB/rdmavt: Fix ab/ba include issues
The currently include file ordering for rdmavt headers has an
ab/ba include issue the precludes using inlines from rdma_vt.h
in rdmavt_qp.h.

At the heart of the issue is that rdma_vt.h includes rdmavt_qp.h.

Fix the ordering issue by adjusting rdma_vt.h to not require rdmavt_qp.h
and move qp related inlines to rdmavt_qp.h.

Additionally, promote rvt_mmap_info to rdma_vt.h since it is shared
by rdmavt_cq.h and rdmavt_qp.h.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 11:31:49 -03:00
Mike Marciniszyn 62644c1d2b IB/hfi1: Make opfn.h self sufficient
The opfn.h include file build-ablility depends on the including file
having the correct includes.

Fix by making opfn.h self sufficient.

Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 11:31:49 -03:00
Kaike Wan ea752bc5e5 IB/{rdmavt, hfi1): Miscellaneous comment fixes
This patch fixes miscellaneous comment errors.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 11:31:48 -03:00
Josh Collier 07c5ba9124 IB/hfi1: Add debugfs to control expansion ROM write protect
Some kernels now enable CONFIG_IO_STRICT_DEVMEM which prevents multiple
handles to PCI resource0. In order to continue to support expansion ROM
updates while the driver is loaded, the driver must now provide an
interface to control the expansion ROM write protection.

This patch adds an exprom_wp debugfs interface that allows the hfi1_eprom
user tool to disable the expansion ROM write protection by opening the
file and writing a '1'.  The write protection is released when writing a
'0' or automatically re-enabled when the file handle is closed.  The
current implementation will only allow one handle to be opened at a time
across all hfi1 devices.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Josh Collier <josh.d.collier@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 11:26:41 -03:00
Leon Romanovsky 5742582222 RDMA/hns: Remove asynchronic QP destroy
Verbs destroy callbacks are synchronous operations and can't be delayed.
The expectation is that after driver returned from destroy function, the
memory can be freed and user won't be able to access it again.

Ditch workqueue implementation used in HNS driver.

Fixes: d838c481e0 ("IB/hns: Fix the bug when destroy qp")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: oulijun <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 10:55:31 -03:00
Parav Pandit 5d7ed2f27b RDMA/cma: Consider scope_id while binding to ipv6 ll address
When two netdev have same link local addresses (such as vlan and non
vlan), two rdma cm listen id should be able to bind to following different
addresses.

listener-1: addr=lla, scope_id=A, port=X
listener-2: addr=lla, scope_id=B, port=X

However while comparing the addresses only addr and port are considered,
due to which 2nd listener fails to listen.

In below example of two listeners, 2nd listener is failing with address in
use error.

$ rping -sv -a fe80::268a:7ff:feb3:d113%ens2f1 -p 4545&

$ rping -sv -a fe80::268a:7ff:feb3:d113%ens2f1.200 -p 4545
rdma_bind_addr: Address already in use

To overcome this, consider the scope_ids as well which forms the accurate
IPv6 link local address.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 10:39:16 -03:00
Parav Pandit 823b23da71 IB/core: Allow vlan link local address based RoCE GIDs
IPv6 link local address for a VLAN netdevice has nothing to do with its
resemblance with the default GID, because VLAN link local GID is in
different layer 2 domain.

Now that RoCE MAD packet processing and route resolution consider the
right GID index, there is no need for an unnecessary check which prevents
the addition of vlan based IPv6 link local GIDs.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-24 10:08:15 -03:00
Mark Bloch 5fb58c9e2f RDMA/mlx5: Don't create IB representors when in multiport RoCE mode
Switchdev mode and mutiport RoCE mode aren't compatible at this point.
Don't create IB reps when a user switches to switchdev mode and the driver
operates in that mode.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Mark Bloch d3b5cc1cd9 RDMA/mlx5: Initialize roce port info before multiport master init
When working in mutliport RoCE mode it is possible to attach a slave
before the master. In that case the slave is waiting for a master to be
attached.  When the master is attached it goes over the list of waiting
slaves, finds a slave that is compatible and tries to bind it to itself.

The call stack is:
mlx5_ib_init_multiport_master() -> mlx5_ib_bind_slave_port()

In the bind function we will create a netdev notifier, but this is done
before we initialize the RoCE structure (this is done at a later stage by
the master in the ROCE stage).

Once events are delivered to that notifier we will use
mlx5_ib_get_native_port_mdev() to get the actual port and as the native
port is zero we will access an invalid index in the port structure.

Move the RoCE structure initialization to an earlier stage.

Fixes: 32f69e4be2 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Mark Bloch 7f575103b0 RDMA/mlx5: Allow DEVX and raw creation flow on reps
Remove the limitations that were in place and provide support for DEVX and
raw flow creation on reps.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Maor Gottlieb 56e5acd405 RDMA/mlx5: Add query e-switch vport context to devx white list
Add MLX5_OP_QUERY_ESW_VPORT_CONTEXT to devx white list. It will be allowed
only if HCA_CAP.eswitch_manager==1.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Mark Bloch 52438be441 RDMA/mlx5: Allow inserting a steering rule to the FDB
Allow this only via mlx5 raw create flow API, legacy verbs are not
supported. To accommodate that, we add a new attribute to matcher creation
to indicate the type of flow table to be used.
	MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE
With this new attribute MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS is no longer
needed, we keep it for compatibility but at most only a single attribute can
be passed of the two.

When inserting a flow rule to the FDB we require that a DEVX FT is
provided as a destination, no other configuration is allowed.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Mark Bloch 3b70508a6b RDMA/mlx5: Create flow table with max size supported
Instead of failing the request, just use the supported number of flow
entries.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Mark Bloch 13a4376568 RDMA/mlx5: Access the prio bypass inside the FDB flow table namespace
Now that we have a specific prio inside the FDB namespace allow retrieving
it from the RDMA side.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 15:24:05 -03:00
Parav Pandit 2e5b8a0116 RDMA/core: Add a netlink command to change net namespace of rdma device
Provide an option to change the net namespace of a rdma device through a
netlink command. When multiple rdma devices exists in a system, and when
containers are used, this will limit rdma device visibility to a specified
net namespace.

An example command to change net namespace of mlx5_1 device to the
previously created net namespace 'foo' is:

$ ip netns add foo
$ rdma dev set mlx5_1 netns foo

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 14:44:58 -03:00
Parav Pandit decbc7a6b0 RDMA/core: Introduce a helper function to change net namespace of rdma device
Introduce a helper function that changes rdma device's net namespace which
performs mini disable/enable sequence to have device visible only in
assigned net namespace.

Device unregistration, device rename and device change net namespace
may be invoked concurrently.

(a) device unregistration needs to wait if a device change (rename or net
    namespace change) operation is in progress.
(b) device net namespace change should not proceed if the unregistration
    has started.
(c) while one cpu is changing device net namespace, other cpu should not
    be able to rename or change net namespace.

To address above concurrency,
(a) Use unreg_mutex to synchronize between ib_unregister_device() and net
    namespace change operation
(b) In cases where unregister_device() has started unregistration before
    change_netns got chance to acquire unreg_mutex, validate the refcount
    - if it dropped to zero, abort the net namespace change operation.

Finally use the helper function to change net namespace of ib device to
move the device back to init_net when such net is deleted.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 14:44:58 -03:00
Parav Pandit 3042492bd1 RDMA/core: Avoid freeing netdevs in disable_device()
So we can use the disable_device() helper while changing the net namespace
of the rdma device in a subsequent patch, move free_netdevs() out of it.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 14:44:58 -03:00
Chengguang Xu 2d95984977 infiniband/qib: Fix typo in comment
Fix typo 'faspath' -> 'pastpath'.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22 14:08:15 -03:00
Colin Ian King ff5eefe6d3 RDMA/cxgb4: Fix spelling mistake "immedate" -> "immediate"
There is a spelling mistake in a module parameter description. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-18 03:17:42 -03:00
Colin Ian King a6d2a5a92e RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure
Currently if alloc_skb fails to allocate the skb a null skb is passed to
t4_set_arp_err_handler and this ends up dereferencing the null skb.  Avoid
the NULL pointer dereference by checking for a NULL skb and returning
early.

Addresses-Coverity: ("Dereference null return")
Fixes: b38a0ad8ec ("RDMA/cxgb4: Set arp error handler for PASS_ACCEPT_RPL messages")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-16 08:03:17 -03:00
Colin Ian King 1db86318c4 RDMA/mlx5: Check for error return in flow_rule rather than err
Currently when the call to create_flow_rule_vport_sq fails, the error
check is being performed on err rather than on the return pointer
flow_rule.  The return flow_rule maybe NULL (which is not considered an
error) or an error code, so check for the error on flow_rule.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: d5ed8ac34c ("RDMA/mlx5: Move default representors SQ steering to rule to modify QP")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-12 11:19:49 -03:00
Devesh Sharma 1c00d7bc96 RDMA/ocrdma: Remove use of idr use pci bdf instead
Removing the use of IDR variable just to name the function ids. Using the
PCI_FUNC(pdev->devfn) instead to create the device name, associated
resources and to print driver into at various places.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-12 10:59:02 -03:00
Mark Bloch fb652d3299 RDMA/mlx5: Remove VF representor profile
Now that we have a single IB device with multiple ports we can remove the
VF representor profile.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:40 -03:00
Mark Bloch 26628e2d58 RDMA/mlx5: Move to single device multiport ports in switchdev mode
Move from IB device (representor) per virtual function to single IB device
with port per virtual function (port 1 represents the uplink). As number
of ports is a static property of an IB device, declare the IB device with
as many port as the possible according to the PCI bus.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch a989ea01cb RDMA/mlx5: Move SMI caps logic
We store the SMI information in the core device's struct, make sure we set
that information only once (and not per port), while here make the for
loop based on the actual size of the array.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch 35b0aa67b2 RDMA/mlx5: Refactor netdev affinity code
The design of representors is such that once an IB representor is created,
the netdev of representor already exists, we can use that fact to simplify
the netdev affinity code.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch d5ed8ac34c RDMA/mlx5: Move default representors SQ steering to rule to modify QP
Currently the steering for SQs created on representors is done on
creation, once we move to representors as ports of an IB device we need
the port argument which is given only at the modify QP stage, adjust the
code appropriately.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch 6a4d00be08 RDMA/mlx5: Move rep into port struct
In preparation of moving into a model of single IB device multiple ports
move rep to be part of the port structure. We mark a representor device by
setting is_rep, no functional change with this patch.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch 5d8f6a0e92 RDMA/mlx5: Use correct size for device resources
On allocation we use the array size and on destruction num_ports, use the
array size of destruction as well, in this context the array corresponds
to the native/actual ports on the NIC so no need to adjust this logic for
representors.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch da796ccb3e RDMA/mlx5: Move ports allocation to outside of INIT stage
In downstream patches we will need access to the ports before doing any
stages, in order to set net device per representor.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch 4a6dc8552a RDMA/mlx5: Free IB device on remove
Simplify the code and move the deallocation of the IB device into the
remove function.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Mark Bloch 95579e785a RDMA/mlx5: Move netdev info into the port struct
Netdev info is stored in a separate array and holds data relevant on a per
port basis, move it to be part of the port struct.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 15:05:39 -03:00
Jason Gunthorpe 5331fa0db7 Merge branch 'mlx5-next' into rdma.git for-next
From
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Required for dependencies on the next series

* branch 'mlx5-next':
  net/mlx5: E-Switch, add a new prio to be used by the RDMA side
  net/mlx5: E-Switch, don't use hardcoded values for FDB prios
  net/mlx5: Fix false compilation warning
  net/mlx5: Expose MPEIN (Management PCIE INfo) register layout
  net/mlx5: Add rate limit print macros
  net/mlx5: Add explicit bar address field
  net/mlx5: Replace dev_err/warn/info by mlx5_core_err/warn/info
  net/mlx5: Use dev->priv.name instead of dev_name
  net/mlx5: Make mlx5_core messages independent from mdev->pdev
  net/mlx5: Break load_one into three stages
  net/mlx5: Function setup/teardown procedures
  net/mlx5: Move health and page alloc init to mdev_init
  net/mlx5: Split mdev init and pci init
  net/mlx5: Remove redundant init functions parameter
  net/mlx5: Remove spinlock support from mlx5_write64
  net/mlx5: Remove unused MLX5_*_DOORBELL_LOCK macros

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10 14:59:27 -03:00
Mark Bloch d9cb06759e net/mlx5: E-Switch, add a new prio to be used by the RDMA side
Create a new prio in the FDB, it will be used when inserting steering rules
into the FDB from the RDMA side. We create a new PRIO so rules from the
net side and rules from the RDMA side won't be inserted to the same PRIO,
each side has it's own sandbox to play in.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-04-10 09:29:11 +03:00
Mark Bloch b6d9ccb112 net/mlx5: E-Switch, don't use hardcoded values for FDB prios
When creating the FDB prios, use the enum values already defined and not
the hardcoded values.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-04-10 09:28:41 +03:00
Steve Wise ab7efbe24b RDMA/cxgb4: Use ib_device_set_netdev()
cxgb4 has a simple non-dynamic use of get_netdev, so conversion is
straightforward.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-09 10:14:54 -03:00
Jason Gunthorpe 4b38da75e0 RDMA/drivers: Convert easy drivers to use ib_device_set_netdev()
Drivers that never change their ndev dynamically do not need to use
the get_netdev callback.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
2019-04-09 10:14:54 -03:00
chenglang 2b277dae06 RDMA/hns: Support to create 1M srq queue
In mhop 0 mode, 64*bt_num queues can be supported.
In mhop 1 mode, 32K*bt_num queues can be supported.
Config srqc_hop_num to 1 to support 1M SRQ queues.

Signed-off-by: chenglang <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:26:44 -03:00
Shiraz Saleem d0b5c01bb4 RDMA/umem: Use correct value for SG entries in sg_copy_to_buffer()
With page combining, the assumption that number of SG entries in umem SGL
equal to number of system pages in umem no longer holds.

umem->sg_nents tracks the SG entries in umem SGL. Use it in
sg_pcopy_to_buffer() as opposed to ib_umem_num_pages(umem).

Fixes: d10bcf947a ("RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs")
Reported-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Lijun Ou e1c9a0dc29 RDMA/hns: Dump detailed driver-specific CQ
This patch adds support of resource track for hip08 and take dumping cq
context state used for debugging as an example.  More resources track
supports for hns driver will be added in future.

The output should be as follows.
$ rdma res show cq dev hnseth0 -d
dev hnseth0 cqe 1023 users 2 poll-ctx WORKQUEUE pid 0 comm [ib_core] drv_state 2 drv_ceq
n 0 drv_cqn 0 drv_hopnum 1 drv_pi 0 drv_ci 0 drv_coalesce 0 drv_period 0 drv_cnt 0

Signed-off-by: Tao Tian <tiantao6@huawei.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: chenglang <chenglang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky 68e326dea1 RDMA: Handle SRQ allocations by IB/core
Convert SRQ allocation from drivers to be in the IB/core

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky d345691471 RDMA: Handle AH allocations by IB/core
Simplify drivers by ensuring lifetime of ib_ah object. The changes
in .create_ah() go hand in hand with relevant update in .destroy_ah().

We will use this opportunity and convert .destroy_ah() to don't fail, as
it was suggested a long time ago, because there is nothing to do in case
of failure during destroy.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky f6316032fd RDMA/core: Support object allocation in atomic context
AH objects are allocated in atomic context and those allocations should
be done with GFP_ATOMIC.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Jason Gunthorpe feec576a6a IB: When attrs.udata/ufile is available use that instead of uobject
The ucontext and ufile should not be accessed via the uobject, all these
cases have an attrs so use that instead.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Jason Gunthorpe e79c9c6062 IB/mlx5: Remove references to uboject->context
These should all go through udata now. Add mlx5_udata_to_mdev to convert
a udata into the struct mlx5_ib_dev as these call sites require.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky 9e886b39a7 RDMA/nldev: Return device protocol
Add new RDMA_NLDEV_ATTR_DEV_PROTOCOL attribute to give ability for UDEV
rules create IB device stable names based on link type protocol.  The
assumption that devices like mlx4 with duality in their link type under
one IB device struct won't be allowed in the future.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky c87e65cfb9 RDMA/cm: Move debug counters to be under relevant IB device
The sysfs layout is created by CM incorrectly presented RDMA devices with
InfiniBand link layer. Layout of such devices represents device tree of
connections. By moving CM statistics to be under relevant port of IB
device, we will fix the following issues:

 * Symlink name - It used device name instead of specific identifier.
 * Target location - It was supposed to point to PCI-ID/infiniband_cm/
   instead of PCI-ID/infiniband/
 * Target name - It created extra device file under already existing
   device folder, e.g. mlx5_0/mlx5_0
 * Crash during boot with RDMA persistent naming patches.

 sysfs: cannot create duplicate filename '/class/infiniband_cm/mlx5_0'
 CPU: 29 PID: 433 Comm: modprobe Not tainted 5.0.0-rc5+ #178
 Call Trace:
  dump_stack+0xcc/0x180
  sysfs_warn_dup.cold.3+0x17/0x2d
  sysfs_do_create_link_sd.isra.2+0xd0/0xf0
  device_add+0x7cb/0x1450
  device_create_groups_vargs+0x1ae/0x220
  device_create+0x93/0xc0
  cm_add_one+0x38f/0xf60 [ib_cm]
  add_client_context+0x167/0x210 [ib_core]
  enable_device_and_get+0x230/0x3f0 [ib_core]
  ib_register_device+0x823/0xbf0 [ib_core]
  __mlx5_ib_add+0x45/0x150 [mlx5_ib]
  mlx5_ib_add+0x1b3/0x5e0 [mlx5_ib]
  mlx5_add_device+0x130/0x3a0 [mlx5_core]
  mlx5_register_interface+0x1a9/0x270 [mlx5_core]
  do_one_initcall+0x14f/0x5de
  do_init_module+0x247/0x7c0
  load_module+0x4c2f/0x60d0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

After this change:
[leonro@server ~]$ ls -al /sys/class/infiniband/ibp0s12f0/ports/1/
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_rx_duplicates
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_rx_msgs
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_tx_msgs
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_tx_retries

Fixes: 110cf374a8 ("infiniband: make cm_device use a struct device and not a kobject.")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:24 -03:00
Colin Ian King 4d2e11d42f opa_vnic: fix check on record->event, incorrect operator used
The check on record->event is always true because the wrong operator
is being used, used && instead of ||

Addresses-Coverity: ("Constant expression result")
Fixes: fae7a699a9 ("opa_vnic: Convert vport_idr to XArray")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:24 -03:00
Shiraz Saleem d10bcf947a RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs
Combine contiguous regions of PAGE_SIZE pages into single scatter list
entry while building the scatter table for a umem. This minimizes the
number of the entries in the scatter list and reduces the DMA mapping
overhead, particularly with the IOMMU.

Set default max_seg_size in core for IB devices to 2G and do not combine
if we exceed this limit.

Also, purge npages in struct ib_umem as we now DMA map the umem SGL with
sg_nents and npage computation is not needed. Drivers should now be using
ib_umem_num_pages(), so fix the last stragglers.

Move npages tracking to ib_umem_odp as ODP drivers still need it.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Gal Pressman <galpress@amazon.com>
Tested-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:24 -03:00
Leon Romanovsky c7252a6532 RDMA/cm: Remove useless zeroing of static global variable
Static global variables are initialized to zero by C standard,
there is no need to zero them again.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-04 08:29:04 -03:00