OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Doug Ledford	884fa4f304	Merge branches 'chelsio', 'debug-cleanup', 'hns' and 'i40iw' into merge-test	2016-12-14 14:43:14 -05:00
Pan Bian	46d0703fac	IB/mlx4: fix improper return value If uhw->inlen is non-zero, the value of variable err is 0 if the copy succeeds. Then, if kzalloc() or kmalloc() returns a NULL pointer, it will return 0 to the callers. As a result, the callers cannot detect the errors. This patch fixes the bug, assign "-ENOMEM" to err before the NULL pointer checks, and remove the initialization of err at the beginning. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189031 Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:35:23 -05:00
Pan Bian	5b4c9cd7e4	IB/ocrdma: fix bad initialization In function ocrdma_mbx_create_ah_tbl(), returns the value of status on errors. However, because status is initialized with 0, 0 will be returned even if on error paths. This patch initialize status with "-ENOMEM". Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188831 Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:33:48 -05:00
Zhouyi Zhou	6a3a1056d6	infiniband: nes: return value of skb_linearize should be handled Return value of skb_linearize should be handled in function nes_netdev_start_xmit. Compiled in x86_64 Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:26:49 -05:00
Sebastian Ott	17069d32a3	IB/core: fix unmap_sg argument __ib_umem_release calls dma_unmap_sg with a different number of sg_entries than ib_umem_get uses for dma_map_sg. This might cause trouble for implementations that merge sglist entries and results in the following dma debug complaint: DMA-API: device driver frees DMA sg list with different entry count [map count=2] [unmap count=1] Fix it by using the correct value. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:21:26 -05:00
Souptick Joarder	7ceb740c54	IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc In mthca_create_ah(), pci_pool_alloc() followed by memset will be replaced by pci_pool_zalloc() Signed-off-by: Souptick joarder <jrdr.linux@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:58:39 -05:00
Bart Van Assche	1974ab9d9d	mlx5, calc_sq_size(): Make a debug message more informative Make it clear that qp->sq.wqe_cnt is not the number of WQEs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:45:38 -05:00
Bart Van Assche	3d6bdf1625	mlx5: Remove a set-but-not-used variable This has been detected by building the mlx5 driver with W=1. Fixes: `1a412fb1ca` ('net/mlx5: Fixes: `1a412fb1ca` (IB/mlx5: Modify QP commands via mlx5 ifc') Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:45:10 -05:00
Bart Van Assche	626bc02d4d	mlx5: Use { } instead of { 0 } to init struct Detected by sparse. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:42:32 -05:00
Bart Van Assche	4fa354c9db	IB/srp: Make writing the add_target sysfs attr interruptible Avoid that shutdown of srp_daemon is delayed if add_target_mutex is held by another process. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:47 -05:00
Bart Van Assche	290081b453	IB/srp: Make mapping failures easier to debug Make it easier to figure out what is going on if memory mapping fails because more memory regions than mr_per_cmd are needed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	3787d9908c	IB/srp: Make login failures easier to debug If login fails because memory region allocation failed it can be hard to figure out what happened. Make it easier to figure out why login failed by logging a message if ib_alloc_mr() fails. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	042dd765bd	IB/srp: Introduce a local variable in srp_add_one() This patch makes the srp_add_one() code more compact and does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	1a1faf7a8a	IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build Avoid that the kernel build fails as follows if dynamic debug support is disabled: drivers/infiniband/ulp/srp/ib_srp.c:2272:3: error: implicit declaration of function 'DEFINE_DYNAMIC_DEBUG_METADATA' drivers/infiniband/ulp/srp/ib_srp.c:2272:33: error: 'ddm' undeclared (first use in this function) drivers/infiniband/ulp/srp/ib_srp.c:2275:39: error: '_DPRINTK_FLAGS_PRINT' undeclared (first use in this function) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	d3a2418ee3	IB/multicast: Check ib_find_pkey() return value This patch avoids that Coverity complains about not checking the ib_find_pkey() return value. Fixes: commit `547af76521` ("IB/multicast: Report errors on multicast groups if P_key changes") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:27:34 -05:00
Bart Van Assche	11b642b84e	IPoIB: Avoid reading an uninitialized member variable This patch avoids that Coverity reports the following: Using uninitialized value port_attr.state when calling printk Fixes: commit `94232d9ce8` ("IPoIB: Start multicast join process only on active ports") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Erez Shitrit <erezsh@mellanox.com> Cc: <stable@vger.kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:27:34 -05:00
Bart Van Assche	2fe2f378dd	IB/mad: Fix an array index check The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS (80) elements. Hence compare the array index with that value instead of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity reports the following: Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127). Fixes: commit `b7ab0b19a8` ("IB/mad: Verify mgmt class in received MADs") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:27:34 -05:00
Bart Van Assche	b42dde478b	IB/mlx4: Rework special QP creation error path The special QP creation error path relies on offset_of(struct mlx4_ib_sqp, qp) == 0. Remove this assumption because that makes the QP creation code easier to understand. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:01:11 -05:00
Bart Van Assche	0d38c240f9	IB/srpt: Report login failures only once Report the following message only once if no ACL has been configured yet for an initiator port: "Rejected login because no ACL has been configured yet for initiator %s.\n" Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:58:30 -05:00
Julia Lawall	5f4c7e4eb5	IB/usnic: simplify IS_ERR_OR_NULL to IS_ERR The function usnic_ib_qp_grp_get_chunk only returns an ERR_PTR value or a valid pointer, never NULL. The same is true of get_qp_res_chunk, which just returns the result of calling usnic_ib_qp_grp_get_chunk. Simplify IS_ERR_OR_NULL to IS_ERR in both cases. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression t,e; @@ t = $usnic_ib_qp_grp_get_chunk(...)\\|get_qp_res_chunk(...)$ ... when != t=e - IS_ERR_OR_NULL(t) + IS_ERR(t) @@ expression t,e,e1; @@ t = $usnic_ib_qp_grp_get_chunk(...)\\|get_qp_res_chunk(...)$ ... when != t=e ?- t ? PTR_ERR(t) : e1 + PTR_ERR(t) ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:57:54 -05:00
Hans Westgaard Ry	9315bc9a13	IB/core: Issue DREQ when receiving REQ/REP for stale QP from "InfiBand Architecture Specifications Volume 1": A QP is said to have a stale connection when only one side has connection information. A stale connection may result if the remote CM had dropped the connection and sent a DREQ but the DREQ was never received by the local CM. Alternatively the remote CM may have lost all record of past connections because its node crashed and rebooted, while the local CM did not become aware of the remote node's reboot and therefore did not clean up stale connections. and: A local CM may receive a REQ/REP for a stale connection. It shall abort the connection issuing REJ to the REQ/REP. It shall then issue DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP. This patch solves a problem with reuse of QPN. Current codebase, that is IPoIB, relies on a REAP-mechanism to do cleanup of the structures in CM. A problem with this is the timeconstants governing this mechanism; they are up to 768 seconds and the interface may look inresponsive in that period. Issuing a DREQ (and receiving a DREP) does the necessary cleanup and the interface comes up. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:56:24 -05:00
Philippe Reynes	24dc08c3c9	IB/nes: use new api ethtool_{get\|set}_link_ksettings The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:52:25 -05:00
Alexey Khoroshilov	def4a6ffc9	IB/isert: do not ignore errors in dma_map_single() There are several places, where errors in dma_map_single() are ignored. The patch fixes them. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:51:31 -05:00
Jim Foraker	22dccc5454	IB/rdmavt: Only put mmap_info ref if it exists rvt_create_qp() creates qp->ip only when a qp creation request comes from userspace (udata is not NULL). If we exceed the number of available queue pairs however, the error path always attempts to put a kref to this structure. If the requestor is inside the kernel, this leads to a crash. We fix this by checking that qp->ip is not NULL before caling kref_put(). Signed-off-by: Jim Foraker <foraker1@llnl.gov> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:16:11 -05:00
Petr Mladek	f5eabf5e51	IB/rdmavt: Handle the kthread worker using the new API Use the new API to create and destroy the cq kthread worker. The API hides some implementation details. In particular, kthread_create_worker() allocates and initializes struct kthread_worker. It runs the kthread the right way and stores task_struct into the worker structure. In addition, the *on_cpu() variant binds the kthread to the given cpu and the related memory node. kthread_destroy_worker() flushes all pending works, stops the kthread and frees the structure. This patch does not change the existing behavior. Note that we must use the on_cpu() variant because the function starts the kthread and it must bind it to the right CPU before waking. The numa node is associated for given CPU as well. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:16:11 -05:00
Petr Mladek	6efaf10f16	IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker The memory barrier is not enough to protect queuing works into a destroyed cq kthread. Just imagine the following situation: CPU1 CPU2 rvt_cq_enter() worker = cq->rdi->worker; rvt_cq_exit() rdi->worker = NULL; smp_wmb(); kthread_flush_worker(worker); kthread_stop(worker->task); kfree(worker); // nothing queued yet => // nothing flushed and // happily stopped and freed if (likely(worker)) { // true => read before CPU2 acted cq->notify = RVT_CQ_NONE; cq->triggered++; kthread_queue_work(worker, &cq->comptask); BANG: worker has been flushed/stopped/freed in the meantime. This patch solves this by protecting the critical sections by rdi->n_cqs_lock. It seems that this lock is not much contended and looks reasonable for this purpose. One catch is that rvt_cq_enter() might be called from IRQ context. Therefore we must always take the lock with IRQs disabled to avoid a possible deadlock. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:16:11 -05:00
Arnd Bergmann	14ab8896f5	IB/mlx5: avoid bogus -Wmaybe-uninitialized warning We get a false-positive warning in linux-next for the mlx5 driver: infiniband/hw/mlx5/mr.c: In function ‘mlx5_ib_reg_user_mr’: infiniband/hw/mlx5/mr.c:1172:5: error: ‘order’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1161:6: note: ‘order’ was declared here infiniband/hw/mlx5/mr.c:1173:6: error: ‘ncont’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1160:6: note: ‘ncont’ was declared here infiniband/hw/mlx5/mr.c:1173:6: error: ‘page_shift’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1158:6: note: ‘page_shift’ was declared here infiniband/hw/mlx5/mr.c:1143:13: error: ‘npages’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1159:6: note: ‘npages’ was declared here I had a trivial workaround for gcc-5 or higher, but that didn't work on gcc-4.9 unfortunately. The only way I found to avoid the warnings for gcc-4.9, short of initializing each of the arguments first was to change the calling conventions to separate the error code from the umem pointer. This avoids casting the error codes from one pointer to another incompatible pointer, and lets gcc figure out when that the data is actually valid whenever we return successfully. Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:12:53 -05:00
Steve Wise	1e38a366ee	ib_isert: log the connection reject message Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	97540bb90a	ib_iser: log the connection reject message Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	5f24410408	rdma_cm: add rdma_consumer_reject_data helper function rdma_consumer_reject_data() will return the private data pointer and length if any is available. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	5042a73d3e	rdma_cm: add rdma_is_consumer_reject() helper function Return true if the peer consumer application rejected the connection attempt. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	77a5db1315	rdma_cm: add rdma_reject_msg() helper function rdma_reject_msg() returns a pointer to a string message associated with the transport reject reason codes. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Wei Yongjun	aecb66b2b0	qedr: remove pointless NULL check in qedr_post_send() Remove pointless NULL check for 'wr' in qedr_post_send(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:18:17 -05:00
Wei Yongjun	aafec388a1	qedr: Use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:18:17 -05:00
Wei Yongjun	181d80151f	qedr: Fix possible memory leak in qedr_create_qp() 'qp' is malloced in qedr_create_qp() and should be freed before leaving from the error handling cases, otherwise it will cause memory leak. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:18:17 -05:00
Colin Ian King	ea7ef2accd	qedr: return -EINVAL if pd is null and avoid null ptr dereference Currently, if pd is null then we hit a null pointer derference on accessing pd->pd_id. Instead of just printing an error message we should also return -EINVAL immediately. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:18:17 -05:00
Hal Rosenstock	9fa240bbfc	IB/mad: Eliminate redundant SM class version defines for OPA and rename class version define to indicate SM rather than SMP or SMI Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:01:58 -05:00
Bodong Wang	7d29f349a4	IB/mlx5: Properly adjust rate limit on QP state transitions - Add MODIFY_QP_EX CMD to extend modify_qp. - Rate limit will be updated in the following state transactions: RTR2RTS, RTS2RTS. The limit will be removed when SQ is in RST and ERR state. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:51 -05:00
Bodong Wang	189aba99e7	IB/uverbs: Extend modify_qp and support packet pacing An new uverbs command ib_uverbs_ex_modify_qp is added to support more QP attributes. User driver should choose to call the legacy/extended API based on input mask. IB_USER_LAST_QP_ATTR_MASK is added to indicated the maximum bit position which supports legacy ib_uverbs_modify_qp. IB_USER_LEGACY_LAST_QP_ATTR_MASK indicates the maximum bit position which supports ib_uverbs_ex_modify_qp, the value of this mask should be updated if new mask is added later. Along with this change, rate_limit is supported by the extended command, user driver could use it to control packet packing. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:51 -05:00
Bodong Wang	528e5a1bd3	IB/core: Support rate limit for packet pacing Add new member rate_limit to ib_qp_attr which holds the packet pacing rate in kbps, 0 means unlimited. IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW QPs when changing QP state from RTR to RTS, RTS to RTS. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:50 -05:00
Bodong Wang	d949167d68	IB/mlx5: Report mlx5 packet pacing capabilities when querying device Enable mlx5 based hardware to report packet pacing capabilities from kernel to user space. Packet pacing allows to limit the rate to any number between the maximum and minimum, based on user settings. The capabilities are exposed to user space through query_device by uhw. The following capabilities are reported: 1. The maximum and minimum rate limit in kbps supported by packet pacing. 2. Bitmap showing which QP types are supported by packet pacing operation. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:50 -05:00
Or Gerlitz	ca5b91d631	IB/mlx5: Support RAW Ethernet when RoCE is disabled On some environments, such as certain SRIOV VF configurations, RoCE is not supported for mlx5 Ethernet ports. Currently, the driver will not open IB device on that port. This is problematic, since we do want user-space RAW Ethernet (RAW_PACKET QPs) functionality to remain in place. For that end, enhance the relevant driver flows such that we do create a device instance in that case. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:49 -05:00
Or Gerlitz	45f95acd63	IB/mlx5: Rename RoCE related helpers to reflect being Eth ones This is a pre-step towards having mlx5 IB device also over Eth ports where RoCE is not supported. We change the roce enable/disable and roce_lag init/fini function names to have _eth instead of _roce. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:48 -05:00
Or Gerlitz	d012f5d6f8	IB/mlx5: Refactor registration to netdev notifier Refactor the netdev notifier registration into a small helper function. This is a pre-step towards having mlx5 IB device over an Ethernet port which doesn't support RoCE. Also, renamed the de-registration helper and the new helper as netdev notifier and not roce, to make it clear this is not only used with roce. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:48 -05:00
Maor Gottlieb	b216af408c	IB/mlx5: Use u64 for UMR length The fast_registration length is used to convey length for memory registrations through UMR which can be of any size up to 2^64. Change the length type to be u64. Fixes: `968e78dd96` ('IB/mlx5: Enhance UMR support to allow partial page table update') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:47 -05:00
Eli Cohen	afd02cd3a9	IB/mlx5: Avoid system crash when enabling many VFs When enabling many VFs, the total amount of DMA mappings increase significantly. This causes DMA allocations to take a lot of time since they are serialized in the kernel. As a result the driver enters into fatal condition due to timeout and the system hangs. To recover from this we disable MR cache for VFs. PFs will still have a full cache and VFs cache can be manipulated as usual after driver load. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:47 -05:00
Maor Gottlieb	c73b7911de	IB/mlx5: Assign SRQ type earlier Move the SRQ type assignment to be before actually using it in create_srq_user() and in create_srq_kernel() functions. Fixes: `af1ba291c5` ('{net, IB}/mlx5: Refactor internal SRQ API') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:46 -05:00
Jack Morgenstein	c482af646d	IB/mlx4: Fix out-of-range array index in destroy qp flow For non-special QPs, the port value becomes non-zero only at the RESET-to-INIT transition. If the QP has not undergone that transition, its port number value is still zero. If such a QP is destroyed before being moved out of the RESET state, subtracting one from the qp port number results in a negative value. Using that negative value as an index into the qp1_proxy array results in an out-of-bounds array reference. Fix this by testing that the QP type is one that uses qp1_proxy before using the port number. For special QPs of all types, the port number is specified at QP creation time. Fixes: `9433c18891` ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:46 -05:00
Moni Shoua	41c450fd8d	IB/mlx5: Make create/destroy_ah available to userspace Advertise that create_ah and destroy_ah verbs are accessible from uverbs interface. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:39:19 -05:00
Moni Shoua	5097e71f3e	IB/mlx5: Use kernel driver to help userspace create ah Resolving a MAC address for a given IP address in userspace is inefficient. This patch lets mlx5 user driver using the kernel driver to resolve the mac and get the answer in the private section of the response. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:38:49 -05:00
Moni Shoua	477864c8fc	IB/core: Let create_ah return extended response to user Add struct ib_udata to the signature of create_ah callback that is implemented by IB device drivers. This allows HW drivers to return extra data to the userspace library. This patch prepares the ground for mlx5 driver to resolve destination mac address for a given GID and return it to userspace. This patch was previously submitted by Knut Omang as a part of the patch set to support Oracle's Infiniband HCA (SIF). Signed-off-by: Knut Omang <knut.omang@oracle.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:38:27 -05:00
Moni Shoua	6ad279c5a2	IB/mlx5: Report that device has udata response in create_ah To make mlx5 user driver aware of whether kernel driver returns dmac in user data response add a new flag that will be returned back to user-space through alloc_ucontext. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:37:19 -05:00
Moni Shoua	c90ea9d8e5	IB/core: Change ib_resolve_eth_dmac to use it in create AH The function ib_resolve_eth_dmac() requires struct qp_attr * and qp_attr_mask as parameters while the function might be useful to resolve dmac for address handles. This patch changes the signature of the function so it can be used in the flow of creating an address handle. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:25 -05:00
Moses Reuben	2d1e697e9b	IB/mlx5: Add support to match inner packet fields Add support to match packet fields which are tunneled, i.e. support matching the header of the inner packet which is the result of or bit operation of the original header and the IB_FLOW_SPEC_INNER type. The combination of IB_FLOW_SPEC_INNER \| IB_FLOW_SPEC_VXLAN_TUNNEL is not needed to be checked, because the IB core has this check already. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:24 -05:00
Moses Reuben	fbf46860b1	IB/core: Introduce inner flow steering For a tunneled packet which contains external and internal headers, we refer to the external headers as "outer fields" and the internal headers as "inner fields". Example of a tunneled packet: { L2 \| L3 \| L4 \| tunnel header \| L2 \| L3 \| l4 \| data } \| \| \| \| \| \| \| { outer fields }{ inner fields } This patch introduces a new flag for flow steering rules - IB_FLOW_SPEC_INNER - which specifies that the rule applies to the inner fields, rather than to the outer fields of the packet. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:23 -05:00
Moses Reuben	ffb30d8f10	IB/mlx5: Support Vxlan tunneling specification Add support to receive specific Vxlan packet in ConnectX-4. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:23 -05:00
Moses Reuben	0dbf3332b7	IB/core: Add flow spec tunneling support In order to support tunneling, that can be used by the QP, both struct ib_flow_spec_tunnel and struct ib_flow_tunnel_filter can be used to more IP or UDP based tunneling protocols (e.g NVGRE, GRE, etc). IB_FLOW_SPEC_VXLAN_TUNNEL type flow specification is added to use this functionality and match specific Vxlan packets. In similar to IPv6, we check overflow of the vni value by comparing with the maximum size. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:21 -05:00
Bodong Wang	1cbe6fc86c	IB/mlx5: Add support for CQE compressing CQE compressing reduces PCI overhead by coalescing and compressing multiple CQEs into a single merged CQE. Successful compressing improves message rate especially for small packet traffic. CQE compressing is supported for all 64B CQE formats (with certain limitations) generated by RQ/Responder or by SQ/Requestor. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:20 -05:00
Bodong Wang	7e43a2a5ba	IB/mlx5: Report mlx5 CQE compression caps during query The capabilities include: - Max number of compressed and aggregated CQEs in a single session, while zero means unsupported. - For Responder, there are two formats of mini CQE: mini CQE with Rx hash and mini CQE with checksum. They're mutual exclusive. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:34:03 -05:00
Bodong Wang	191ded4a4d	IB/mlx5: Report mlx5 multi packet WQE caps during query The capabilities whether hardware support multi packet WQE or not is exposed to user space through query_device by uhw. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:33:25 -05:00
Yonatan Cohen	d680ebed91	IB/rxe: Increase max number of completions to 32k Increase limit of max CQE from 8K to 32K to allow demanding applications to work over SoftRoCE with same configuration as most RoCEv2 HW vendors have. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:33:24 -05:00
Eran Ben Elisha	bf08e884bf	IB/mlx4: Check if GRH is available before using it Before reading GRH attributes, need to make sure AH contains GRH, and in addition, initialize GID type. Fixes: `dbf727de74` ('IB/core: Use GID table in AH creation and dmac resolution') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:32:51 -05:00
Eran Ben Elisha	1f22e454df	IB/mlx4: When no DMFS for IPoIB, don't allow NET_IF QPs According to the firmware spec, FLOW_STEERING_IB_UC_QP_RANGE command is supported only if dmfs_ipoib bit is set. If it isn't set we want to ensure allocating NET_IF QPs fail. We do so by filling out the allocation bitmap. By thus, the NET_IF QPs allocating function won't find any free QP and will fail. Fixes: `c1c9850112` ('IB/mlx4: Add support for steerable IB UD QPs') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:29:46 -05:00
Henry Orosco	d6f7bbcc2e	i40iw: Reorganize structures to align with HW capabilities Some resources are incorrectly organized and at odds with HW capabilities. Specifically, ILQ, IEQ, QPs, MSS, QOS and statistics belong in a VSI. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:29 -05:00
Mustafa Ismail	0cc0d851cc	i40iw: Fix incorrect check for error In i40iw_ieq_handle_partial() the check for !status is incorrect. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:29 -05:00
Mustafa Ismail	6b0805c256	i40iw: Assign MSS only when it is a new MTU Currently we are changing the MSS regardless of whether there is a change or not in MTU. Fix to make the assignment of MSS dependent on an MTU change. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:28 -05:00
Shiraz Saleem	d627b50631	i40iw: Fix race condition in terminate timer's handler Add a QP reference when terminate timer is started to ensure the destroy QP doesn't race ahead to free the QP while it is being referenced in the terminate timer's handler. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:28 -05:00
Mustafa Ismail	fd90d4d4c2	i40iw: Fix memory leak in CQP destroy when in reset On a device close, the control QP (CQP) is destroyed by calling cqp_destroy which destroys the CQP and frees its SD buffer memory. However, if the reset flag is true, cqp_destroy is never called and leads to a memory leak on SD buffer memory. Fix this by always calling cqp_destroy, on device close, regardless of reset. The exception to this when CQP create fails. In this case, the SD buffer memory is already freed on an error check and there is no need to call cqp_destroy. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:27 -05:00
Shiraz Saleem	1cda28bb5b	i40iw: Fix QP flush to not hang on empty queues or failure When flush QP and there are no pending work requests, signal completion to unblock i40iw_drain_sq and i40iw_drain_rq which are waiting on completion for iwqp->sq_drained and iwqp->sq_drained respectively. Also, signal completion if flush QP fails to prevent the drain SQ or RQ from being blocked indefintely. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:27 -05:00
Mustafa Ismail	f4a87ca12a	i40iw: Fix double free of QP A QP can be double freed if i40iw_cm_disconn() is called while it is currently being freed by i40iw_rem_ref(). The fix in i40iw_cm_disconn() will first check if the QP is already freed before making another request for the QP to be freed. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:20:26 -05:00
Shiraz Saleem	91c42b72f8	i40iw: Use correct src address in memcpy to rdma stats counters hw_stats is a pointer to i40_iw_dev_stats struct in i40iw_get_hw_stats(). Use hw_stats and not &hw_stats in the memcpy to copy the i40iw device stats data into rdma_hw_stats counters. Fixes: `b40f4757da` ("IB/core: Make device counter infrastructure dynamic") Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:19:02 -05:00
Thomas Huth	5e58917122	i40iw: Remove macros I40IW_STAG_KEY_FROM_STAG and I40IW_STAG_INDEX_FROM_STAG The macros I40IW_STAG_KEY_FROM_STAG and I40IW_STAG_INDEX_FROM_STAG are apparently bad - they are using the logical "&&" operation which does not make sense here. It should have been a bitwise "&" instead. Since the macros seem to be completely unused, let's simply remove them so that nobody accidentially uses them in the future. And while we're at it, also remove the unused macro I40IW_CREATE_STAG. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 17:13:02 -05:00
Andrew Boyer	37f69f43fb	IB/rxe: Hold refs when running tasklets It might be possible for all of a QP's references to be dropped while one of that QP's tasklets is running. For example, the completer might run during QP destroy. If qp->valid is false, it will drop all of the packets on the resp_pkts list, potentially removing the last reference. Then it tries to advance the SQ consumer pointer. If the SQ's buffer has already been destroyed, the system will panic. To be safe, hold a reference on the QP for the duration of each tasklet. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:34:22 -05:00
Andrew Boyer	07bf9627d5	IB/rxe: Wait for tasklets to finish before tearing down QP The system may crash when a malformed request is received and the error is detected by the responder. NodeA: $ ibv_rc_pingpong -g 0 -d rxe0 -i 1 -n 1 -s 50000 NodeB: $ ibv_rc_pingpong -g 0 -d rxe0 -i 1 -n 1 -s 1024 <NodeA_ip> The responder generates a receive error on node B since the incoming SEND is oversized. If the client tears down the QP before the responder or the completer finish running, a page fault may occur. The fix makes the destroy operation spin until the tasks complete, which appears to be original intent of the design. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	5407f53012	IB/rxe: Fix ref leak in duplicate_request() A ref was added after the call to skb_clone(). Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	5b9ea16c54	IB/rxe: Fix ref leak in rxe_create_qp() The udata->inlen error path needs to clean up the ref added by rxe_alloc(). Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	accacb8f51	IB/rxe: Add support for IB_CQ_REPORT_MISSED_EVENTS Peek at the CQ after arming it so that we can return a hint. This avoids missed completions due to a race between posting CQEs and arming the CQ. For example, CM teardown waits on MAD requests to complete with ib_cq_poll_work(). Without this fix, the last completion might be left on the CQ, hanging the kthread doing the teardown. The console backtraces look like this: [ 4199.911284] Call Trace: [ 4199.911401] [<ffffffff9657fe95>] schedule+0x35/0x80 [ 4199.911556] [<ffffffff965830df>] schedule_timeout+0x22f/0x2c0 [ 4199.911727] [<ffffffff9657f7a8>] ? __schedule+0x368/0xa20 [ 4199.911891] [<ffffffff96580903>] wait_for_completion+0xb3/0x130 [ 4199.912067] [<ffffffff960a17e0>] ? wake_up_q+0x70/0x70 [ 4199.912243] [<ffffffffc074a06d>] cm_destroy_id+0x13d/0x450 [ib_cm] [ 4199.912422] [<ffffffff961615d5>] ? printk+0x57/0x73 [ 4199.912578] [<ffffffffc074a390>] ib_destroy_cm_id+0x10/0x20 [ib_cm] [ 4199.912759] [<ffffffffc076098c>] rdma_destroy_id+0xac/0x340 [rdma_cm] [ 4199.912941] [<ffffffffc076f2cc>] 0xffffffffc076f2cc Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	d4fb59256a	IB/rxe: Add support for zero-byte operations The last_psn algorithm fails in the zero-byte case: it calculates first_psn = N, last_psn = N-1. This makes the operation unretryable since the res structure will fail the (first_psn <= psn <= last_psn) test in find_resource(). While here, use BTH_PSN_MASK to mask the calculated last_psn. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	d38eb801aa	IB/rxe: Unblock loopback by moving skb_out increment skb_out is decremented in rxe_skb_tx_dtor(), which is not called in the loopback() path. Move the increment to the send() path rather than rxe_xmit_packet(). Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	2a7a85487e	IB/rxe: Don't update the response PSN unless it's going forwards A client might post a read followed by a send. The partner receives and acknowledges both transactions, posting an RCQ entry for the send, but something goes wrong with the read ACK. When the client retries the read, the partner's responder processes the duplicate read but incorrectly resets the PSN to the value preceding the original send. When the duplicate send arrives, the responder cannot tell that it is a duplicate, so the responder generates a duplicate RCQ entry, confusing the client. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	dd753d8743	IB/rxe: Advance the consumer pointer before posting the CQE A simple userspace application might poll the CQ, find a completion, and then attempt to post a new WQE to the SQ. A spurious error can occur if the userspace application detects a full SQ in the instant before the kernel is able to advance the SQ consumer pointer. This is noticeable when using single-entry SQs with ibv_rc_pingpong if lots of kernel and userspace library debugging is enabled. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Andrew Boyer	6e9bb530ff	IB/rxe: Remove buffer used for printing IP address Avoid smashing the stack when an ICRC error occurs on an IPv6 network. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Dan Carpenter	95db9d05b7	IB/rxe: Remove unneeded cast in rxe_srq_from_attr() It makes me nervous when we cast pointer parameters. I would estimate that around 50% of the time, it indicates a bug. Here the cast is not needed becaue u32 and and unsigned int are the same thing. Removing the cast makes the code more robust and future proof in case any of the types change. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Wei Yongjun	4ac4707102	IB/rxe: Use DEFINE_SPINLOCK() for spinlock spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than explicitly calling spin_lock_init(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanosky <leonro@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Arnd Bergmann	a0fa72683e	IB/rxe: avoid putting a large struct rxe_qp on stack A race condition fix added an rxe_qp structure to the stack in order to be able to perform rollback in rxe_requester(), but the structure is large enough to trigger the warning for possible stack overflow: drivers/infiniband/sw/rxe/rxe_req.c: In function 'rxe_requester': drivers/infiniband/sw/rxe/rxe_req.c:757:1: error: the frame size of 2064 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] This changes the rollback function to only save the psn inside the qp, which is the only field we access in the rollback_qp anyway. Fixes: `3050b99850` ("IB/rxe: Fix race condition between requester and completer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-12 16:31:45 -05:00
Bart Van Assche	66431b0e86	IB/hfi1: Define platform_config_table_limits once Defining static data structures in a header file is wrong because this causes the data structure to be instantiated once in every .c file it is included in. Hence move the definition of a static array from a header file into the only .c file in which it is used. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Bhumika Goyal	0fc859a657	IB/hfi1: constify mmu_notifier_ops structure Declare the structure mmu_notifier_ops as const as it is only stored in the ops field of a mmu_notifier structure. The ops field is of type const struct mmu_notifier_ops *, so mmu_notifier_ops structures having this property can be declared as const. Done using coccinelle: @r1 disable optional_qualifier @ identifier i; position p; @@ static struct mmu_notifier_ops i@p = {...}; @ok1@ identifier r1.i; position p; struct mmu_rb_handler handler; @@ handler.mn.ops=&i@p @bad@ position p!={r1.p,ok1.p}; identifier r1.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r1.i; @@ static +const struct mmu_notifier_ops i={...}; @depends on !bad disable optional_qualifier@ identifier r1.i; @@ +const struct mmu_notifier_ops i; File size before: text data bss dec hex filename 3566 72 16 3654 e46 drivers/infiniband/hw/hfi1/mmu_rb.o File size after: text data bss dec hex filename 3658 0 16 3674 e5a drivers/infiniband/hw/hfi1/mmu_rb.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Mike Marciniszyn	5dc806052a	IB/rdmavt, IB/hfi1, IB/qib: Add inlines for mtu division Add rvt_div_round_up_mtu() and rvt_div_mtu() routines to do the computation based on the pmtu and the log_pmtu. Change divides in qib, hfi1 to use the new inlines. Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Mike Marciniszyn	c64607aa8a	IB/hfi1,IB/qib: use rvt swqe mr deref helper Convert to use new swqe put routine. Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Harish Chegondi	9d8145a604	IB/hfi1: Avoid credit return allocation for cpu-less NUMA nodes Do not allocate credit return base and DMA memory for NUMA nodes without CPUs. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Mike Marciniszyn	0771da5a6e	IB/hfi1,IB/qib: Use new send completion helper Convert cq completion returns in both rdmavt drivers to use the new helper. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Mike Marciniszyn	f2dc9cdce8	IB/rdmavt: Add a send completion helper This is for use by client drivers to drive send completions into a CQ. A new exported table allows for the mapping of ib_wr_opcode into a ib_wc_opcode. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Sebastian Sanchez	238b1862b4	IB/qib: Use standard refcount wrapper for QPs Use the standard driver wrapper for QP reference counters. This makes the code more maintainable. Fixes: Commit `4d6f85c3fa` ("IB/rdmavt, IB/qib, IB/hfi1: Use new QP put get routines") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Sebastian Sanchez	f84dfa26e6	IB/hfi1: Use reference count wrapper for MRs Some parts of the code don't use the standard driver wrapper for memory region reference counters. Use the standard driver wrapper throughout the code. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Sebastian Sanchez	b44980f879	IB/hfi1: Replace qp->refcount release code with standard driver wrapper Some parts of the code don't use the standard release wrapper rvt_put_qp() for decrementing and testing the refcount to then try to use a resource. Replace this code with the standard driver wrapper. Fixes: Commit `4d6f85c3fa` ("IB/rdmavt, IB/qib, IB/hfi1: Use new QP put get routines") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Dean Luick	0080167467	IB/hfi1: Preserve external device completed bit The driver should not change the external device request completed bit when not actually doing an external device request. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Sebastian Sanchez	9b86071c5e	IB/hfi1: Remove critical section gap in sc_buffer_alloc() In sc_buffer_alloc(), the sc->alloc_lock is released before calling sc_release_update(), and it is reacquired after the function call. This causes CPU lock trading. Fix it by not dropping the lock before calling sc_release_update(). Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Mitko Haralanov	b777f154a0	IB/hfi1: Remove usage of qp->s_cur_sge The s_cur_sge field in the qp structure holds a pointer to the SGE of the currently processed WQE. It assumes the protection of the RVT_S_BUSY flag to prevent the changing of this field while the send engine is using it. This scheme works as long as there is only one instance of the send engine running at a time. Scaling of the send engine to multiple cores would break this assumption as there could be multiple instances of the send engine running on different CPUs. This opens a window where the QP's RVT_S_BUSY flag is not set but the send engine is still running. To prevent accidental changing of the s_cur_sge pointer, the QP's dependence on it is removed. The SGE pointer is now stored in the verbs_txreq, which is a per-packet data structure. This ensures that each individual packet has it's own pointer, which is setup while the RVT_S_BUSY flag is set. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Mike Marciniszyn	fcb29a6668	IB/rdmavt: Add trace of MR segs Add tracing of MR segment information. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00
Dean Luick	5213006ade	IB/hfi1: Add special setting for low power AOC Low power QSFP AOC cables require a different SerDes Tx PLL bandwidth setting than the default. The 8051 firmware does not know the details, so the driver needs to tell the firmware through a special setting. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-11 15:29:42 -05:00

1 2 3 4 5 ...

6278 Commits