OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Steve Wise	2d478b2859	iw_cxgb4: remove wr_id attributes Remove sq/rq wr_id attributes because typically they are pointers and we don't want to pass up kernel pointers. Fixes: `056f9c7f39` ("iw_cxgb4: dump detailed driver-specific QP information") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-05-15 16:18:03 -06:00
Steve Wise	1ea62e8164	iw_cxgb4: fix uninitialized variable warnings Fixes: `056f9c7f39` ("iw_cxgb4: dump detailed driver-specific QP information") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-05-15 16:14:25 -06:00
Doug Ledford	f5e27a203f	Merge branch 'k.o/for-rc' into k.o/wip/dl-for-next Several items of conflict have arisen between the RDMA stack's for-rc branch and upcoming for-next work: `9fd4350ba8` ("IB/rxe: avoid double kfree_skb") directly conflicts with `2e47350789` ("IB/rxe: optimize the function duplicate_request") Patches already submitted by Intel for the hfi1 driver will fail to apply cleanly without this merge Other people on the mailing list have notified that their upcoming patches also fail to apply cleanly without this merge Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-05-09 15:48:48 -04:00
Christophe Jaillet	3d69191086	iw_cxgb4: Fix an error handling path in 'c4iw_get_dma_mr()' The error handling path of 'c4iw_get_dma_mr()' does not free resources in the correct order. If an error occures, it can leak 'mhp->wr_waitp'. Fixes: `a3f12da0e9` ("iw_cxgb4: allocate wait object for each memory object") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-05-09 10:45:19 -04:00
Steve Wise	056f9c7f39	iw_cxgb4: dump detailed driver-specific QP information Provide a cxgb4-specific function to fill in qp state details. This allows dumping important c4iw_qp state useful for debugging. Included in the dump are the t4_sq, t4_rq structs, plus a dump of the t4_swsqe and t4swrqe descriptors for the first and last pending entries. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-05-03 15:51:27 -04:00
YueHaibing	ecb238f6a7	IB/cxgb4: use skb_put_zero()/__skb_put_zero Use the recently introduced helper to replace the pattern of skb_put_zero/__skb_put() && memset(). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-05-01 11:27:37 -04:00
Bharat Potnuri	2df19e19ae	iw_cxgb4: Atomically flush per QP HW CQEs When a CQ is shared by multiple QPs, c4iw_flush_hw_cq() needs to acquire corresponding QP lock before moving the CQEs into its corresponding SW queue and accessing the SQ contents for completing a WR. Ignore CQEs if corresponding QP is already flushed. Cc: stable@vger.kernel.org Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-27 14:38:44 -04:00
Raju Rangoju	26bff1bd74	RDMA/cxgb4: release hw resources on device removal The c4iw_rdev_close() logic was not releasing all the hw resources (PBL and RQT memory) during the device removal event (driver unload / system reboot). This can cause panic in gen_pool_destroy(). The module remove function will wait for all the hw resources to be released during the device removal event. Fixes c12a67fe(iw_cxgb4: free EQ queue memory on last deref) Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-27 13:52:31 -04:00
Linus Torvalds	19fd08b85b	Merge candidates for 4.17 merge window - Fix RDMA uapi headers to actually compile in userspace and be more complete - Three shared with netdev pull requests from Mellanox: * 7 patches, mostly to net with 1 IB related one at the back). This series addresses an IRQ performance issue (patch 1), cleanups related to the fix for the IRQ performance problem (patches 2-6), and then extends the fragmented completion queue support that already exists in the net side of the driver to the ib side of the driver (patch 7). * Mostly IB, with 5 patches to net that are needed to support the remaining 10 patches to the IB subsystem. This series extends the current 'representor' framework when the mlx5 driver is in switchdev mode from being a netdev only construct to being a netdev/IB dev construct. The IB dev is limited to raw Eth queue pairs only, but by having an IB dev of this type attached to the representor for a switchdev port, it enables DPDK to work on the switchdev device. * All net related, but needed as infrastructure for the rdma driver - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers - SRP performance updates - IB uverbs write path cleanup patch series from Leon - Add RDMA_CM support to ib_srpt. This is disabled by default. Users need to set the port for ib_srpt to listen on in configfs in order for it to be enabled (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port) - TSO and Scatter FCS support in mlx4 - Refactor of modify_qp routine to resolve problems seen while working on new code that is forthcoming - More refactoring and updates of RDMA CM for containers support from Parav - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device memory' user API features - Infrastructure updates for the new IOCTL interface, based on increased usage - ABI compatibility bug fixes to fully support 32 bit userspace on 64 bit kernel as was originally intended. See the commit messages for extensive details - Syzkaller bugs and code cleanups motivated by them -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCgAGBQJax5Z0AAoJEDht9xV+IJsacCwQAJBIgmLCvVp5fBu2kJcXMMVI y3l2YNzAUJvDDKv1r5yTC9ugBXEkDtgzi/W/C2/5es2yUG/QeT/zzQ3YPrtsnN68 5FkiXQ35Tt7+PBHMr0cacGRmF4M3Td3MeW0X5aJaBKhqlNKwA+aF18pjGWBmpVYx URYCwLb5BZBKVh4+1Leebsk4i0/7jSauAqE5M+9notuAUfBCoY1/Eve3DipEIBBp EyrEnMDIdujYRsg4KHlxFKKJ1EFGItknLQbNL1+SEa0Oe0SnEl5Bd53Yxfz7ekNP oOWQe5csTcs3Yr4Ob0TC+69CzI71zKbz6qPDILTwXmsPFZJ9ipJs4S8D6F7ra8tb D5aT1EdRzh/vAORPC9T3DQ3VsHdvhwpUMG7knnKrVT9X/g7E+gSji1BqaQaTr/xs i40GepHT7lM/TWEuee/6LRpqdhuOhud7vfaRFwn2JGRX9suqTcvwhkBkPUDGV5XX 5RkHcWOb/7KvmpG7S1gaRGK5kO208LgmAZi7REaJFoZB74FqSneMR6NHIH07ha41 Zou7rnxV68CT2bgu27m+72EsprgmBkVDeEzXgKxVI/+PZ1oadUFpgcZ3pRLOPWVx rEqjHu65rlA/YPog4iXQaMfSwt/oRD3cVJS/n8EdJKXi4Qt2RDDGdyOmt74w4prM QuLEdvJIFmwrND1KDoqn =Ku8g -----END PGP SIGNATURE----- Merge tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Doug and I are at a conference next week so if another PR is sent I expect it to only be bug fixes. Parav noted yesterday that there are some fringe case behavior changes in his work that he would like to fix, and I see that Intel has a number of rc looking patches for HFI1 they posted yesterday. Parav is again the biggest contributor by patch count with his ongoing work to enable container support in the RDMA stack, followed by Leon doing syzkaller inspired cleanups, though most of the actual fixing went to RC. There is one uncomfortable series here fixing the user ABI to actually work as intended in 32 bit mode. There are lots of notes in the commit messages, but the basic summary is we don't think there is an actual 32 bit kernel user of drivers/infiniband for several good reasons. However we are seeing people want to use a 32 bit user space with 64 bit kernel, which didn't completely work today. So in fixing it we required a 32 bit rxe user to upgrade their userspace. rxe users are still already quite rare and we think a 32 bit one is non-existing. - Fix RDMA uapi headers to actually compile in userspace and be more complete - Three shared with netdev pull requests from Mellanox: * 7 patches, mostly to net with 1 IB related one at the back). This series addresses an IRQ performance issue (patch 1), cleanups related to the fix for the IRQ performance problem (patches 2-6), and then extends the fragmented completion queue support that already exists in the net side of the driver to the ib side of the driver (patch 7). * Mostly IB, with 5 patches to net that are needed to support the remaining 10 patches to the IB subsystem. This series extends the current 'representor' framework when the mlx5 driver is in switchdev mode from being a netdev only construct to being a netdev/IB dev construct. The IB dev is limited to raw Eth queue pairs only, but by having an IB dev of this type attached to the representor for a switchdev port, it enables DPDK to work on the switchdev device. * All net related, but needed as infrastructure for the rdma driver - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers - SRP performance updates - IB uverbs write path cleanup patch series from Leon - Add RDMA_CM support to ib_srpt. This is disabled by default. Users need to set the port for ib_srpt to listen on in configfs in order for it to be enabled (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port) - TSO and Scatter FCS support in mlx4 - Refactor of modify_qp routine to resolve problems seen while working on new code that is forthcoming - More refactoring and updates of RDMA CM for containers support from Parav - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device memory' user API features - Infrastructure updates for the new IOCTL interface, based on increased usage - ABI compatibility bug fixes to fully support 32 bit userspace on 64 bit kernel as was originally intended. See the commit messages for extensive details - Syzkaller bugs and code cleanups motivated by them" * tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits) IB/rxe: Fix for oops in rxe_register_device on ppc64le arch IB/mlx5: Device memory mr registration support net/mlx5: Mkey creation command adjustments IB/mlx5: Device memory support in mlx5_ib net/mlx5: Query device memory capabilities IB/uverbs: Add device memory registration ioctl support IB/uverbs: Add alloc/free dm uverbs ioctl support IB/uverbs: Add device memory capabilities reporting IB/uverbs: Expose device memory capabilities to user RDMA/qedr: Fix wmb usage in qedr IB/rxe: Removed GID add/del dummy routines RDMA/qedr: Zero stack memory before copying to user space IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR IB/mlx5: Add information for querying IPsec capabilities IB/mlx5: Add IPsec support for egress and ingress {net,IB}/mlx5: Add ipsec helper IB/mlx5: Add modify_flow_action_esp verb IB/mlx5: Add implementation for create and destroy action_xfrm IB/uverbs: Introduce ESP steering match filter IB/uverbs: Add modify ESP flow_action ...	2018-04-06 17:35:43 -07:00
Bharat Potnuri	44016b3466	iw_cxgb4: print mapped ports correctly c4iw_ep_common structure holds the mapped addresses, so while printing them, use appropriate pointers. Fixes: `bab572f1d` ("iw_cxgb4: Guard against null cm_id in dump_ep/qp") Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-29 16:00:59 -06:00
Steve Wise	f215a3d244	iw_cxgb4: Add ib_device->get_netdev support This is useful to rdma ULPs. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-23 11:12:51 -06:00
Matan Barak	0ede73bc01	IB/uverbs: Extend uverbs_ioctl header with driver_id Extending uverbs_ioctl header with driver_id and another reserved field. driver_id should be used in order to identify the driver. Since every driver could have its own parsing tree, this is necessary for strace support. Downstream patches take off the EXPERIMENTAL flag from the ioctl() IB support and thus we add some reserved fields for future usage. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-19 14:45:17 -06:00
Ganesh Goudar	8b7372c101	cxgb4: notify fatal error to uld drivers notify uld drivers if the adapter encounters fatal error. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-16 14:46:57 -04:00
Jason Gunthorpe	7f86260b5f	RDMA/cxgb4: Use structs to describe the uABI instead of opencoding Open coding a loose value is not acceptable for describing the uABI in RDMA. Provide the missing struct. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-15 15:58:04 -06:00
Steve Wise	750fb1656a	iw_cxgb4: initialize ib_mr fields for user mrs Some of the struct ib_mr fields weren't getting initialized. This was benign, but will cause problems when dumping the mr resource via nldev/restrack. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-03-08 15:03:03 -05:00
Bharat Potnuri	f48fca4d81	iw_cxgb4: Change error/warn prints to pr_debug These prints not neccesarily mean error/warning, so changing them to debug. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-29 11:09:23 -07:00
Jason Gunthorpe	76a895d9e1	Merge branch 'from-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git Patches for 4.16 that are dependent on patches sent to 4.15-rc. These are small clean ups for the vmw_pvrdma and i40iw drivers. * 'from-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git: RDMA/vmw_pvrdma: Remove usage of BIT() from UAPI header RDMA/vmw_pvrdma: Use refcount_t instead of atomic_t RDMA/vmw_pvrdma: Use more specific sizeof in kcalloc RDMA/vmw_pvrdma: Clarify QP and CQ is_kernel logic RDMA/vmw_pvrdma: Add UAR SRQ macros in ABI header file i40iw: Change accelerated flag to bool	2017-12-27 21:50:46 -07:00
Steve Wise	d145873345	iw_cxgb4: when flushing, complete all wrs in a chain If a wr chain was posted and needed to be flushed, only the first wr in the chain was completed with FLUSHED status. The rest were never completed. This caused isert to hang on shutdown due to the missing completions which left iscsi IO commands referenced, stalling the shutdown. Fixes: `4fe7c2962e` ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Steve Wise	96a236ed28	iw_cxgb4: reflect the original WR opcode in drain cqes The flush/drain logic was not retaining the original wr opcode in its completion. This can cause problems if the application uses the completion opcode to make decisions. Use bit 10 of the CQE header word to indicate the CQE is a special drain completion, and save the original WR opcode in the cqe header opcode field. Fixes: `4fe7c2962e` ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Steve Wise	f55688c454	iw_cxgb4: Only validate the MSN for successful completions If the RECV CQE is in error, ignore the MSN check. This was causing recvs that were flushed into the sw cq to be completed with the wrong status (BAD_MSN instead of FLUSHED). Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Dan Carpenter	ccc04cdd55	RDMA/cxgb4: Add a sanity check in process_work() The story is that Smatch marks skb->data as untrusted so it generates a warning message here: drivers/infiniband/hw/cxgb4/cm.c:4100 process_work() error: buffer overflow 'work_handlers' 241 <= 255 In other places which handle this such as t4_uld_rx_handler() there is some checking to make sure that the function pointer is not NULL. I have added bounds checking and a check for NULL here as well. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-13 11:00:13 -07:00
Colin Ian King	0cb65d421a	iw_cxgb4: make pointer reg_workq static The pointer reg_workq is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/infiniband/hw/cxgb4/device.c:69:25: warning: symbol 'reg_workq' was not declared. Should it be static? Fixes: `1c8f1da5d8` ("iw_cxgb4: Fix possible circular dependency locking warning") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-13 11:00:13 -07:00
Arnd Bergmann	f8109d9e7d	infiniband: cxgb4: use ktime_get for timestamps The debugfs file prints the difference between host timestamps as a seconds/nanoseconds tuple, along with a 64-bit nanoseconds hardware timestamp. The host time is read using getnstimeofday() which is deprecated because of the y2038 overflow, and it suffers from time jumps during settimeofday() and leap seconds. Converting to ktime_get_ts64() would solve those two, but I'm going a little further here by changing to ktime_get() and printing 64-bit nanoseconds on both host and hw timestamps. This simplifies the code further and makes the output easier to understand. The format of the debugfs file obviously changes here, but this should only be read by humans and not scripts, so I assume it's fine. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-13 11:00:13 -07:00
Steve Wise	c058ecf6e4	iw_cxgb4: only insert drain cqes if wq is flushed Only insert our special drain CQEs to support ib_drain_sq/rq() after the wq is flushed. Otherwise, existing but not yet polled CQEs can be returned out of order to the user application. This can happen when the QP has exited RTS but not yet flushed the QP, which can happen during a normal close (vs abortive close). In addition never count the drain CQEs when determining how many CQEs need to be synthesized during the flush operation. This latter issue should never happen if the QP is properly flushed before inserting the drain CQE, but I wanted to avoid corrupting the CQ state. So we handle it and log a warning once. Fixes: `4fe7c2962e` ("iw_cxgb4: refactor sq/rq drain logic") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-11 15:33:51 -07:00
Steve Wise	335ebf6fa3	iw_cxgb4: only clear the ARMED bit if a notification is needed In __flush_qp(), the CQ ARMED bit was being cleared regardless of whether any notification is actually needed. This resulted in the iser termination logic getting stuck in ib_drain_sq() because the CQ was not marked ARMED and thus the drain CQE notification wasn't triggered. This new bug was exposed when this commit was merged: commit `cbb40fadd3` ("iw_cxgb4: only call the cq comp_handler when the cq is armed") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-07 14:09:59 -07:00
Steve Wise	bc52e9ca74	iw_cxgb4: atomically flush the qp __flush_qp() has a race condition where during the flush operation, the qp lock is released allowing another thread to possibly post a WR, which corrupts the queue state, possibly causing crashes. The lock was released to preserve the cq/qp locking hierarchy of cq first, then qp. However releasing the qp lock is not necessary; both RQ and SQ CQ locks can be acquired first, followed by the qp lock, and then the RQ and SQ flushing can be done w/o unlocking. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Steve Wise	cbb40fadd3	iw_cxgb4: only call the cq comp_handler when the cq is armed The ULPs completion handler should only be called if the CQ is armed for notification. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Bharat Potnuri	1c8f1da5d8	iw_cxgb4: Fix possible circular dependency locking warning Locking sequence of iw_cxgb4 and RoCE drivers in ib_register_device() is slightly different and this leads to possible circular dependency locking warning when both the devices are brought up. Here is the locking sequence upto ib_register_device(): iw_cxgb4: rtnl_mutex(net stack) --> uld_mutex --> device_mutex RoCE drivers: device_mutex --> rtnl_mutex Here is the possibility of cross locking: CPU #0 (iw_cxgb4) CPU #1 (RoCE drivers) -> on interface up cxgb4_up() executed with rtnl_mutex held -> hold uld_mutex and try registering ib device -> In ib_register_device() hold device_mutex -> hold device mutex in ib_register_device -> try acquiring rtnl_mutex in ib_enum_roce_netdev() Current patch schedules the ib_register_device() functionality of iw_cxgb4 to a workqueue to prevent the possible cross-locking. Also rename the labels in c4iw_reister_device(). Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Steve Wise	ba97b74997	iw_cxgb4: remove BUG_ON() usage. iw_cxgb4 has many BUG_ON()s that were left over from various enhancemnets made over the years. Almost all of them should just be removed. Some, however indicate a ULP usage error and can be handled w/o bringing down the system. If the condition cannot happen with correctly implemented cxgb4 sw/fw, then remove the BUG_ON. If the condition indicates a misbehaving ULP (like CQ overflows), add proper recovery logic. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:01:25 -05:00
Leon Romanovsky	9950acf945	RDMA/cxgb4: Protect from possible dereference Smatch tool reports the following error: drivers/infiniband/hw/cxgb4/qp.c:1886 c4iw_create_qp() error: we previously assumed 'ucontext' could be null (see line 1804) Cc: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 13:53:22 -05:00
Leon Romanovsky	7d7d065a5e	RDMA/cxgb4: Annotate r2 and stag as __be32 Chelsio cxgb4 HW is big-endian, hence there is need to properly annotate r2 and stag fields as __be32 and not __u32 to fix the following sparse warnings. drivers/infiniband/hw/cxgb4/qp.c:614:16: warning: incorrect type in assignment (different base types) expected unsigned int [unsigned] [usertype] r2 got restricted __be32 [usertype] <noident> drivers/infiniband/hw/cxgb4/qp.c:615:18: warning: incorrect type in assignment (different base types) expected unsigned int [unsigned] [usertype] stag got restricted __be32 [usertype] <noident> Cc: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-10 13:04:09 -05:00
Leon Romanovsky	35fb2a88ed	RDMA/cxgb4: Declare stag as __be32 The scqe.stag is actually __b32, fix it. drivers/infiniband/hw/cxgb4/cq.c:754:52: warning: cast to restricted __be32 Cc: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-25 15:25:00 -04:00
Doug Ledford	894b82c427	Merge branch 'timer_setup' into for-next Conflicts: drivers/infiniband/hw/cxgb4/cm.c drivers/infiniband/hw/qib/qib_driver.c drivers/infiniband/hw/qib/qib_mad.c There were minor fixups needed in these files. Just minor context diffs due to patches from independent sources touching the same basic area. Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-18 13:12:09 -04:00
Kees Cook	a9346abed5	RDMA/cxgb4: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Also removes an unused timer and drops a redundant initialization. Cc: Steve Wise <swise@chelsio.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-18 11:55:53 -04:00
Bart Van Assche	81e74ec286	RDMA/cxgb4: Remove a set-but-not-used variable Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-14 20:47:06 -04:00
Bart Van Assche	9ae970e277	RDMA/cxgb4: Suppress gcc 7 fall-through complaints Avoid that gcc 7 reports the following warning when building with W=1: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-14 20:47:05 -04:00
Bart Van Assche	76ca0d1b16	RDMA/cxgb4: Remove the obsolete kernel module option 'c4iw_debug' This patch avoids that building the cxgb4 module with W=1 triggers a complaint about a local variable that has not been declared static. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Steve Wise <swise@opengridcomputing.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-14 20:47:05 -04:00
Bart Van Assche	70d7256819	RDMA/cxgb4: Fix indentation This patch avoids that smatch reports the following: drivers/infiniband/hw/cxgb4/device.c:1105: copy_gl_to_skb_pkt() warn: inconsistent indenting drivers/infiniband/hw/cxgb4/cm.c:835: send_connect() warn: inconsistent indenting drivers/infiniband/hw/cxgb4/cm.c:841: send_connect() warn: inconsistent indenting drivers/infiniband/hw/cxgb4/cm.c:888: send_connect() warn: inconsistent indenting drivers/infiniband/hw/cxgb4/cm.c:894: send_connect() warn: inconsistent indenting Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-14 20:47:05 -04:00
Steve Wise	2015f26cfa	iw_cxgb4: add referencing to wait objects For messages sent from the host to fw that solicit a reply from fw, the c4iw_wr_wait struct pointer is passed in the host->fw message, and included in the fw->host fw6_msg reply. This allows the sender to wait until the reply is received, and the code processing the ingress reply to wake up the sender. If c4iw_wait_for_reply() times out, however, we need to keep the c4iw_wr_wait object around in case the reply eventually does arrive. Otherwise we have touch-after-free bugs in the wake_up paths. This was hit due to a bad kernel driver that blocked ingress processing of cxgb4 for a long time, causing iw_cxgb4 timeouts, but eventually resuming ingress processing and thus hitting the touch-after-free bug. So I want to fix iw_cxgb4 such that we'll at least keep the wait object around until the reply comes. If it never comes we leak a small amount of memory, but if it does come late, we won't potentially crash the system. So add a kref struct in the c4iw_wr_wait struct, and take a reference before sending a message to FW that will generate a FW6 reply. And remove the reference (and potentially free the wait object) when the reply is processed. The ep code also uses the wr_wait for non FW6 CPL messages and doesn't embed the c4iw_wr_wait object in the message sent to firmware. So for those cases we add c4iw_wake_up_noref(). The mr/mw, cq, and qp object create/destroy paths do need this reference logic. For these paths, c4iw_ref_send_wait() is introduced to take the wr_wait reference, send the msg to fw, and then wait for the reply. So going forward, iw_cxgb4 either uses c4iw_ofld_send(), c4iw_wait_for_reply() and c4iw_wake_up_noref() like is done in the some of the endpoint logic, or c4iw_ref_send_wait() and c4iw_wake_up_deref() (formerly c4iw_wake_up()) when sending messages with the c4iw_wr_wait object pointer embedded in the message and resulting FW6 reply. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-29 11:46:41 -04:00
Steve Wise	ef885dc66c	iw_cxgb4: allocate wait object for each ep object Remove the embedded c4iw_wr_wait object in preparation for correctly handling timeouts. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-29 11:46:41 -04:00
Steve Wise	7088a9ba62	iw_cxgb4: allocate wait object for each qp object Remove the local stack allocated c4iw_wr_wait object in preparation for correctly handling timeouts. Also cleaned up some error path unwind logic to make it more readable. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-29 11:46:41 -04:00
Steve Wise	13ce83174a	iw_cxgb4: allocate wait object for each cq object Remove the local stack allocated c4iw_wr_wait object in preparation for correctly handling timeouts. Also cleaned up some error path unwind logic to make it more readable. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-29 11:46:41 -04:00
Steve Wise	a3f12da0e9	iw_cxgb4: allocate wait object for each memory object Remove the local stack allocated c4iw_wr_wait object in preparation for correctly handling timeouts. Also refactored some code to simplify it and make errpath unwinding more readable. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-29 11:46:40 -04:00
Bharat Potnuri	4d45b7573b	iw_cxgb4: change pr_debug to appropriate log level Error logs of iw_cxgb4 needs to be printed by default. This patch changes the necessary pr_debug() to appropriate pr_<log level>. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-27 10:13:15 -04:00
Bharat Potnuri	548ddb19af	iw_cxgb4: Remove __func__ parameter from pr_debug() pr_debug() can be enabled to print function names, So removing the unwanted __func__ parameters from debug logs. Realign function parameters. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-27 10:13:14 -04:00
Yuval Shaia	be92d4891f	IB/{cxgb3,cxgb4}: Remove unneeded config dependencies CHELSIO_T3 already depend on INET CHELSIO_T4 already depend on (IPV6 \|\| IPV6=n) Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-27 08:54:19 -04:00
Steve Wise	8b1bbf36b7	iw_cxgb4: remove the stid on listen create failure If a listen create fails, then the server tid (stid) is incorrectly left in the stid idr table, which can cause a touch-after-free if the stid is looked up and the already freed endpoint is touched. So make sure and remove it in the error path. Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-22 12:59:42 -04:00
Steve Wise	3c8415cc7a	iw_cxgb4: drop listen destroy replies if no ep found If the thread waiting for a CLOSE_LISTSRV_RPL times out and bails, then we need to handle a subsequent CPL if it arrives and the stid has been released. In this case silently drop it. Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-22 12:59:42 -04:00
Steve Wise	3d318605f5	iw_cxgb4: put ep reference in pass_accept_req() The listening endpoint should always be dereferenced at the end of pass_accept_req(). Fixes: `f86fac79af` ("RDMA/iw_cxgb4: atomic find and reference for listening endpoints") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-09-22 12:59:42 -04:00
Doug Ledford	b0e32e20e3	Merge branch 'k.o/for-4.13-rc' into k.o/for-next Merging our (hopefully) final -rc pull branch into our for-next branch because some of our pending patches won't apply cleanly without having the -rc patches in our tree. Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-18 14:12:04 -04:00
Doug Ledford	d3cf4d9915	Merge branch 'misc' into k.o/for-next Conflicts: drivers/infiniband/core/iwcm.c - The rdma_netlink patches in HEAD and the iwarp cm workqueue fix (don't use WQ_MEM_RECLAIM, we aren't safe for that context) touched the same code. Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-18 14:10:23 -04:00
Christophe Jaillet	836daee2df	cxgb4: Remove some dead code This 'BUG_ON(!ep)' can never trigger because we have: if (!ep) return 0; just a few lines above. So it can be removed safely. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-17 18:06:37 -04:00
Steve Wise	d4ba61d218	iw_cxgb4: fix misuse of integer variable Fixes: `ee30f7d507` ("iw_cxgb4: Max fastreg depth depends on DSGL support") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-16 11:43:21 -04:00
Leon Romanovsky	9abb0d1bbd	RDMA: Simplify get firmware interface There is a need to forward FW version to user space application through RDMA netlink. In order to make it safe, there is need to declare nla_policy and limit the size of FW string. The new define IB_FW_VERSION_NAME_MAX will limit the size of FW version string. That define was chosen to be equal to ETHTOOL_FWVERS_LEN, because many drivers anyway are limited by that value indirectly. The introduction of this define allows us to remove the string size from get_fw_str function signature. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:10 +03:00
Leon Romanovsky	e1267b0124	RDMA: Remove useless MODULE_VERSION All modules in drivers/infiniband defined and used MODULE_VERSION, which was pointless because the kernel version describes their state more accurate then those arbitrary numbers. Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sagi Grimbrg <sagi@grimberg.me> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 08:45:11 -04:00
Ganesh Goudar	720336c42e	iw_cxgb4: don't use WR keys/addrs for 0 byte reads Only use the read sge lkey/addr and the remote rkey/addr if the length of the read is not zero. Otherwise the read response might be treated as the RTR read response and not delivered to the application. Or worse Terminator hardware will fail a 0B read if the STAG is 0 even if the read length is 0. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-20 11:20:50 -04:00
Dan Carpenter	6ebedacbb4	cxgb4: Fix error codes in c4iw_create_cq() If one of these kmalloc() calls fails then we return ERR_PTR(0) which is NULL. It results in a NULL dereference in the callers. Fixes: `cfdda9d764` ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-20 11:20:49 -04:00
David S. Miller	3d09198243	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two entries being added at the same time to the IFLA policy table, whilst parallel bug fixes to decnet routing dst handling overlapping with the dst gc removal in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-21 17:35:22 -04:00
yuan linyu	de77b966ce	net: introduce __skb_put_[zero, data, u8] follow Johannes Berg, semantic patch file as below, @@ identifier p, p2; expression len; expression skb; type t, t2; @@ ( -p = __skb_put(skb, len); +p = __skb_put_zero(skb, len); \| -p = (t)__skb_put(skb, len); +p = __skb_put_zero(skb, len); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, len); \| -memset(p, 0, len); ) @@ identifier p; expression len; expression skb; type t; @@ ( -t p = __skb_put(skb, len); +t p = __skb_put_zero(skb, len); ) ... when != p ( -memset(p, 0, len); ) @@ type t, t2; identifier p, p2; expression skb; @@ t p; ... ( -p = __skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); \| -p = (t )__skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, sizeof(p)); \| -memset(p, 0, sizeof(p)); ) @@ expression skb, len; @@ -memset(__skb_put(skb, len), 0, len); +__skb_put_zero(skb, len); @@ expression skb, len, data; @@ -memcpy(__skb_put(skb, len), data, len); +__skb_put_data(skb, data, len); @@ expression SKB, C, S; typedef u8; identifier fn = {__skb_put}; fresh identifier fn2 = fn ## "_u8"; @@ - (u8 )fn(SKB, S) = C; + fn2(SKB, C); Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-20 13:30:14 -04:00
Johannes Berg	d58ff35122	networking: make skb_push & __skb_push return void pointers It seems like a historic accident that these return unsigned char , and in many places that means casts are required, more often than not. Make these functions return void and remove all the casts across the tree, adding a (u8 ) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - (fn(SKB, LEN)) + (u8 )fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; type T; @@ - E = ((T )(fn(SKB, LEN))) + E = fn(SKB, LEN) @@ expression SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - fn(SKB, LEN)[0] + (u8 )fn(SKB, LEN) Note that the last part there converts from push(...)[0] to the more idiomatic (u8 *)push(...). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:40 -04:00
Johannes Berg	4df864c1d9	networking: make skb_put & friends return void pointers It seems like a historic accident that these return unsigned char , and in many places that means casts are required, more often than not. Make these functions (skb_put, __skb_put and pskb_put) return void and remove all the casts across the tree, adding a (u8 ) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_put, __skb_put }; @@ - (fn(SKB, LEN)) + (u8 )fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_put, __skb_put }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) which actually doesn't cover pskb_put since there are only three users overall. A handful of stragglers were converted manually, notably a macro in drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many instances in net/bluetooth/hci_sock.c. In the former file, I also had to fix one whitespace problem spatch introduced. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:39 -04:00
Johannes Berg	b080db5853	networking: convert many more places to skb_put_zero() There were many places that my previous spatch didn't find, as pointed out by yuan linyu in various patches. The following spatch found many more and also removes the now unnecessary casts: @@ identifier p, p2; expression len; expression skb; type t, t2; @@ ( -p = skb_put(skb, len); +p = skb_put_zero(skb, len); \| -p = (t)skb_put(skb, len); +p = skb_put_zero(skb, len); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, len); \| -memset(p, 0, len); ) @@ type t, t2; identifier p, p2; expression skb; @@ t p; ... ( -p = skb_put(skb, sizeof(t)); +p = skb_put_zero(skb, sizeof(t)); \| -p = (t )skb_put(skb, sizeof(t)); +p = skb_put_zero(skb, sizeof(t)); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, sizeof(p)); \| -memset(p, 0, sizeof(p)); ) @@ expression skb, len; @@ -memset(skb_put(skb, len), 0, len); +skb_put_zero(skb, len); Apply it to the tree (with one manual fixup to keep the comment in vxlan.c, which spatch removed.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:35 -04:00
Raju Rangoju	d470264583	rdma/cxgb4: Fix memory leaks during module exit Fix memory leaks of iw_cxgb4 module in the exit path Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-06-14 15:24:50 -04:00
Eric Dumazet	eed29f17f0	tcp: add a struct net parameter to tcp_parse_options() We want to move some TCP sysctls to net namespaces in the future. tcp_window_scaling, tcp_sack and tcp_timestamps being fetched from tcp_parse_options(), we need to pass an extra parameter. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 10:53:28 -04:00
Ganesh Goudar	1dec4cec9f	cxgb4: Fix tids count for ipv6 offload connection the adapter consumes two tids for every ipv6 offload connection be it active or passive, calculate tid usage count accordingly. Also change the signatures of relevant functions to get the address family. Signed-off-by: Rizwan Ansari <rizwana@chelsio.com> Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-07 15:14:34 -04:00
Raju Rangoju	98b80a2a73	RDMA/iw_cxgb4: fix the calculation of ipv6 header size Take care of ipv6 checks while computing header length for deducing mtu size of ipv6 servers. Due to the incorrect header length computation for ipv6 servers, wrong mss is reported to the peer (client). Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-06-01 17:03:02 -04:00
Ganesh Goudar	4bbfabede5	RDMA/iw_cxgb4: calculate t4_eq_status_entries properly use egrstatuspagesize to calculate t4_eq_status_entries. Fixes: `bb58d07964` ("cxgb4: Update IngPad and IngPack values") Reported-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-06-01 17:02:50 -04:00
Raju Rangoju	1dad0ebeea	RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers The patch `761e19a504` (RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()) from May 6, 2016 leads to the following static checker warning: drivers/infiniband/hw/cxgb4/cm.c:575 abort_arp_failure() warn: passing freed memory 'skb' Also fixes skb leak when l2t resolution fails Fixes: `761e19a504` (RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-06-01 17:01:28 -04:00
Dasaratharaman Chandramouli	90898850ec	IB/core: Rename struct ib_ah_attr to rdma_ah_attr This patch simply renames struct ib_ah_attr to rdma_ah_attr as these fields specify attributes that are not necessarily specific to IB. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-05-01 14:32:43 -04:00
Pan Bian	9ef63f31ad	iw_cxgb4: check return value of alloc_skb Function alloc_skb() will return a NULL pointer when there is no enough memory. However, the return value of alloc_skb() is directly used without validation in function send_fw_pass_open_req(). This patches checks the return value of alloc_skb() against NULL. Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-04-28 13:09:55 -04:00
Artemy Kovalyov	3e7e1193e2	IB: Replace ib_umem page_size by page_shift Size of pages are held by struct ib_umem in page_size field. It is better to store it as an exponent, because page size by nature is always power-of-two and used as a factor, divisor or ilog2's argument. The conversion of page_size to be page_shift allows to have portable code and avoid following error while compiling on ARM: ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined! CC: Selvin Xavier <selvin.xavier@broadcom.com> CC: Steve Wise <swise@chelsio.com> CC: Lijun Ou <oulijun@huawei.com> CC: Shiraz Saleem <shiraz.saleem@intel.com> CC: Adit Ranadive <aditr@vmware.com> CC: Dennis Dalessandro <dennis.dalessandro@intel.com> CC: Ram Amrani <Ram.Amrani@Cavium.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-04-25 15:40:28 -04:00
Ganesh Goudar	e821303c42	iw_cxgb4: Use dsgl by default Enable the use of dsgl by default and determine whether dsgl is supported from lld info. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Bharat Potnuri <bharat@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-04-25 14:04:41 -04:00
Doug Ledford	339e7575ad	cxgb4: Convert PDBG to pr_debug the second A couple spots were missed in the original patch to implement this change. Add those spots. Fixes: `a9a42886d0` (cxgb4: Convert PDBG to pr_debug) Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-04-20 22:18:54 -04:00
Joe Perches	a9a42886d0	cxgb4: Convert PDBG to pr_debug Use a more typical logging style. Miscellanea: o Obsolete the c4iw_debug module parameter o Coalesce formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-04-20 16:14:13 -04:00
Joe Perches	700456bd25	cxgb4: Use more common logging style Convert printks to pr_<level> Miscellanea: o Coalesce formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-04-20 16:13:20 -04:00
Ingo Molnar	589ee62844	sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h> Update code that relied on sched.h including various MM types for them. This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:37 +01:00
Linus Torvalds	ac1820fb28	This is a tree wide change and has been kept separate for that reason. Bart Van Assche noted that the ib DMA mapping code was significantly similar enough to the core DMA mapping code that with a few changes it was possible to remove the IB DMA mapping code entirely and switch the RDMA stack to use the core DMA mapping code. This resulted in a nice set of cleanups, but touched the entire tree. This branch will be submitted separately to Linus at the end of the merge window as per normal practice for tree wide changes like this. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9 2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX 0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy 1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS 3e7mgpcwC+3XfA/6vU3F =e0Si -----END PGP SIGNATURE----- Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma DMA mapping updates from Doug Ledford: "Drop IB DMA mapping code and use core DMA code instead. Bart Van Assche noted that the ib DMA mapping code was significantly similar enough to the core DMA mapping code that with a few changes it was possible to remove the IB DMA mapping code entirely and switch the RDMA stack to use the core DMA mapping code. This resulted in a nice set of cleanups, but touched the entire tree and has been kept separate for that reason." * tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits) IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it IB/core: Remove ib_device.dma_device nvme-rdma: Switch from dma_device to dev.parent RDS: net: Switch from dma_device to dev.parent IB/srpt: Modify a debug statement IB/srp: Switch from dma_device to dev.parent IB/iser: Switch from dma_device to dev.parent IB/IPoIB: Switch from dma_device to dev.parent IB/rxe: Switch from dma_device to dev.parent IB/vmw_pvrdma: Switch from dma_device to dev.parent IB/usnic: Switch from dma_device to dev.parent IB/qib: Switch from dma_device to dev.parent IB/qedr: Switch from dma_device to dev.parent IB/ocrdma: Switch from dma_device to dev.parent IB/nes: Remove a superfluous assignment statement IB/mthca: Switch from dma_device to dev.parent IB/mlx5: Switch from dma_device to dev.parent IB/mlx4: Switch from dma_device to dev.parent IB/i40iw: Remove a superfluous assignment statement IB/hns: Switch from dma_device to dev.parent ...	2017-02-25 13:45:43 -08:00
Linus Torvalds	af17fe7a63	Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYrx+WAAoJELgmozMOVy/du70P/1kpW2xY9Le04c3K7na2XOYl AUVIDrW/8Go63tpOaM7jBT3k4GlwVFr3IOmBpS24KbW/THxjhyUeP5L5+z2x+go+ jkQOgtPWWEHr5zP3MzsNyB8fDx1YQOnJwEXxybQRW/cbw4CLjnhP+ezd6FdV/3Yy pPEqDVlAErzvNweG+n2r1pjcUbR8uneC3inyMLnyzUBz4CHKmC8fgD3/qJIM+DNb gtFT5xHFIXKCigWdQ/EwsTDcHub43V8OXlI5sO7loG6vToOUATMkjI4oOUNhDmYS X7XLN3yRK9QHEfb5kutXIZEWzTGh7LiFtUYGaNNYqqzDfSiMRc9NC5kTOfplEXDV Uo+AGb6Fh1zYIOzNk7o+tazIv3LaLv6+Fcm+9bbe0VUIqasaylsePqaTwMuIzx/I xP5nitmd5lbYo8WdlasVdG6mH1DlJEUbU30v4DpmTpxCP6jGpog7lexyGyF3TgzS NhnG0IiIClWh3WQ2/GdsFK/obIdFkpLeASli1hwD81vzPfly9zc2YpgqydZI3WCr q6hTXYnANcP6+eciCpQPO7giRdXdiKey08Uoq/2jxb7Qbm4daG6UwopjvH9/lm1F m6UDaDvzNYm+Rx+bL/+KSx9JO9+fJB1L51yCmvLGpWi6yJI4ZTfanHNMBsCua46N Kev/DSpIAzX1WOBkte+a =rspQ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull Mellanox rdma updates from Doug Ledford: "Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (40 commits) IB/mlx5: Fix configuration of port capabilities IB/mlx4: Take source GID by index from HW GID table IB/mlx5: Fix blue flame buffer size calculation IB/mlx4: Remove unused variable from function declaration IB: Query ports via the core instead of direct into the driver IB: Add protocol for USNIC IB/mlx4: Support raw packet protocol IB/mlx5: Support raw packet protocol IB/core: Add raw packet protocol IB/mlx5: Add implicit MR support IB/mlx5: Expose MR cache for mlx5_ib IB/mlx5: Add null_mkey access IB/umem: Indicate that process is being terminated IB/umem: Update on demand page (ODP) support IB/core: Add implicit MR flag IB/mlx5: Support creation of a WQ with scatter FCS offload IB/mlx5: Enable QP creation with cvlan offload IB/mlx5: Enable WQ creation and modification with cvlan offload IB/mlx5: Expose vlan offloads capabilities IB/uverbs: Enable QP creation with cvlan offload ...	2017-02-23 11:27:49 -08:00
Linus Torvalds	4cc4b9323f	First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc. minor changes all over -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYrfewAAoJELgmozMOVy/dFnEP/2Qe7NqXRqxLS0ZqsQseFHgQ jd236E7R/XtQQTE3PTcrWL0mq0DRF6tMEjfhUASKTbZVfCBTniJAoXYrvWhN/STq LxAdigdV/0SPbxO3r9B1Xvk2v5BySaIBkaUDvcEXzT4e7UVQwZgxDkhhsYeY0Z/r 9bNB5760PzW8uO5cctXccNcWztZnW0IUZuAHVfQCPjZ7svoGwLnNDW6YQx+FsEkW tbPdzMXX8VKHlC5RcKbfOOBjdNyrUpWl+uvWEc/7mazKscp4yKVFZL7PcxqPJSfd aKdfqXYawhjZZpyws8Kn0rhkfT7xWKD/y9G5STykRJPj9/n1BDScFkmyDQhtP5bJ GANzdgH0z7Dt9LkcAs86A8EVBbIdbdT2cpPVu7t0uWEIsJw/O5ThKpgjnrrTm6m+ 89tgqLZooifTEsdj4UkZoyktrD3J9LSNZkgVmWtRn01W3oYFOPbdM4TmBZtg+/Yl VGmOJEHMEsNuJBcJcOuRJ1MVz2LebXmPUcB0RXzgmHHgulZ/DqoOtlpg5JNmJcr5 wpw/yppkBop4V4+etJBlzDsZNmZZlX+AY0ZLqQJsDHNszDjwXgAy5Rn5FYIdMyk4 ff0FKb5dzASSxHRDxAsu2uoGaREM0NkpA0UYiIZbepGLSO8PuFG2ScQ6qzU47vqu 9SEzOaaQY2S2uqFFFnYp =ugNm -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc minor changes all over" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits) RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0." rdma_cm: fail iwarp accepts w/o connection params IB/srp: Drain the send queue before destroying a QP IB/core: Add support for draining IB_POLL_DIRECT completion queues IB/srp: Improve an error path IB/srp: Make a diagnostic message more informative IB/srp: Document locking conventions IB/srp: Fix race conditions related to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS RDMA/qedr: Fix some error handling RDMA/bnxt_re: add DCB dependency IB/hns: include linux/module.h IB/vmw_pvrdma: Expose vendor error to ULPs vmw_pvrdma: switch to pci_alloc_irq_vectors IB/hfi1: use size_t for passing array length IB/ipoib: Remove redudant label IB/ipoib: remove the unnecessary memory free IB/mthca: switch to pci_alloc_irq_vectors IB/hfi1: Code reuse with memdup_copy ...	2017-02-23 08:27:57 -08:00
Linus Torvalds	42e1b14b6e	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Implement wraparound-safe refcount_t and kref_t types based on generic atomic primitives (Peter Zijlstra) - Improve and fix the ww_mutex code (Nicolai Hähnle) - Add self-tests to the ww_mutex code (Chris Wilson) - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr Bueso) - Micro-optimize the current-task logic all around the core kernel (Davidlohr Bueso) - Tidy up after recent optimizations: remove stale code and APIs, clean up the code (Waiman Long) - ... plus misc fixes, updates and cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) fork: Fix task_struct alignment locking/spinlock/debug: Remove spinlock lockup detection code lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS lkdtm: Convert to refcount_t testing kref: Implement 'struct kref' using refcount_t refcount_t: Introduce a special purpose refcount type sched/wake_q: Clarify queue reinit comment sched/wait, rcuwait: Fix typo in comment locking/mutex: Fix lockdep_assert_held() fail locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock() locking/rwsem: Reinit wake_q after use locking/rwsem: Remove unnecessary atomic_long_t casts jump_labels: Move header guard #endif down where it belongs locking/atomic, kref: Implement kref_put_lock() locking/ww_mutex: Turn off __must_check for now locking/atomic, kref: Avoid more abuse locking/atomic, kref: Use kref_get_unless_zero() more locking/atomic, kref: Kill kref_sub() locking/atomic, kref: Add kref_read() locking/atomic, kref: Add KREF_INIT() ...	2017-02-20 13:23:30 -08:00
Ganesh Goudar	192539f4ce	iw_cxgb4: clean up send_connect() Clean up send_connect() and make use of t6 specific active open request struct. Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Bharat Teja <bharat@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:30 -05:00
Doug Ledford	6dd7abae71	Merge branch 'k.o/for-4.10-rc' into HEAD	2017-02-19 09:18:21 -05:00
Or Gerlitz	c4550c63b3	IB: Query ports via the core instead of direct into the driver Change the drivers to call ib_query_port in their get port immutable handler instead of their own query port handler. Doing this required to set the core cap flags of this device before the ib_query_port call is made, since the IB core might need these caps to serve the port query. Drivers are ensured by the IB core that the port attributes passed to the port query verb implementation are zero, and hence we removed the zeroing from the drivers. This patch doesn't add any new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:22 -05:00
Amrani, Ram	d3f4aadd61	RDMA/core: Add the function ib_mtu_int_to_enum As the functionality to convert the MTU from a number to enum_ib_mtu is ubiquitous, define a dedicated function and remove the duplicated code. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:22 -05:00
Ganesh Goudar	bab572f1d4	iw_cxgb4: Guard against null cm_id in dump_ep/qp Endpoints that are aborting can have already dereferenced the cm_id and set ep->com.cm_id to NULL. So guard against that in dump_ep() and dump_qp(). Also create a common function for setting up ip address pointers since the same logic is needed in several places. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:44:01 -05:00
Bart Van Assche	d08868a15a	IB/cxgb4: Set dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hariprasad S <hariprasad@chelsio.com> Acked-by: Steve Wise <swise@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Peter Zijlstra	2c935bc572	locking/atomic, kref: Add kref_read() Since we need to change the implementation, stop exposing internals. Provide kref_read() to read the current reference count; typically used for debug messages. Kills two anti-patterns: atomic_read(&kref->refcount) kref->refcount.counter Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:37:18 +01:00
ssh10	b462b06eb6	RDMA/cxgb4: Use AF_INET for sin_family field Elsewhere the sin_family field holds a value with a name of the form AF_..., so it seems reasonable to do so here as well. Also the values of PF_INET and AF_INET are the same. The semantic patch that makes this change is as follows: //</smpl> @@ struct sockaddr_in sip; @@ ( sip.sin_family == - PF_INET + AF_INET \| sip.sin_family != - PF_INET + AF_INET \| sip.sin_family = - PF_INET + AF_INET ) //</smpl> Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:52 -05:00
Steve Wise	3bcf96e018	iw_cxgb4: do not send RX_DATA_ACK CPLs after close/abort Function rx_data(), which handles ingress CPL_RX_DATA messages, was always sending an RX_DATA_ACK with the goal of updating the credits. However, if the RDMA connection is moved out of FPDU mode abruptly, then it is possible for iw_cxgb4 to process queued RX_DATA CPLs after HW has aborted the connection. These CPLs should not trigger RX_DATA_ACKS. If they do, HW can see a READ after DELETE of the DB_LE hash entry for the tid and post a LE_DB HashTblMemCrcError. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Steve Wise	c12a67fec8	iw_cxgb4: free EQ queue memory on last deref Commit `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") introduced a bug where the RDMA QP EQ queue memory (and QIDs) are possibly freed before the underlying connection has been fully shutdown. The result being a possible DMA read issued by HW after the queue memory has been unmapped and freed. This results in possible WR corruption in the worst case, system bus errors if an IOMMU is in use, and SGE "bad WR" errors reported in the very least. The fix is to defer unmap/free of queue memory and QID resources until the QP struct has been fully dereferenced. To do this, the c4iw_ucontext must also be kept around until the last QP that references it is fully freed. In addition, since the last QP deref can happen in an IRQ disabled context, we need a new workqueue thread to do the final unmap/free of the EQ queue memory. Fixes: `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Steve Wise	4fe7c2962e	iw_cxgb4: refactor sq/rq drain logic With the addition of the IB/Core drain API, iw_cxgb4 supported drain by watching the CQs when the QP was out of RTS and signalling "drain complete" when the last CQE is polled. This, however, doesn't fully support the drain semantics. Namely, the drain logic is supposed to signal "drain complete" only when the application has _processed_ the last CQE, not just removed them from the CQ. Thus a small timing hole exists that can cause touch after free type bugs in applications using the drain API (nvmf, iSER, for example). So iw_cxgb4 needs a better solution. The iWARP Verbs spec mandates that "_at some point_ after the QP is moved to ERROR", the iWARP driver MUST synchronously fail post_send and post_recv calls. iw_cxgb4 was currently not allowing any posts once the QP is in ERROR. This was in part due to the fact that the HW queues for the QP in ERROR state are disabled at this point, so there wasn't much else to do but fail the post operation synchronously. This restriction is what drove the first drain implementation in iw_cxgb4 that has the above mentioned flaw. This patch changes iw_cxgb4 to allow post_send and post_recv WRs after the QP is moved to ERROR state for kernel mode users, thus still adhering to the Verbs spec for user mode users, but allowing flush WRs for kernel users. Since the HW queues are disabled, we just synthesize a CQE for this post, queue it to the SW CQ, and then call the CQ event handler. This enables proper drain operations for the various storage applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Linus Torvalds	296915912d	First round of -rc fixes for 4.10 kernel - Series of qedr fixes - Series of rxe fixes - One isolated i40iw fix - One isolated cma fix - One isolated cxgb4 fix -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYXAGvAAoJELgmozMOVy/dDukQAMMNarWp0U8KfNYRU5tyCBwd aIQC1gFT6GUCFys40Z6L84m1D3NpGR+vzVv3grVBeuge73b79zAOHXvVDwJCA+Jl QQLG3vZ13C3158sLDiK8zL+4Ob5OfOQ5nQ2spvDfJWpye9SD+pWFcrpqvK02ANRN kFHILk1gROBTNi46yBR5hjWOkw7Bua6XLsPxh6xoaDZ43NL0r0xgm43FTnj/19x3 0zpZYYKP+3C6U7678rqaog9zfXHvadghW5/WBJ/VgfKqEmH89ESx4J2MvbB8DxFD 1tWAOpr5TNY5jnh8mtUsceDjCzQivc/RWqAu05BspEwcavjSLFyRYr1epR0/4oAd PqLSmfORmhpJ8+5Kmn+chtXo3TT4SYGHIzSUbgbEV/ClwX/7UW+w8mfQZ3buUBq/ cQp/oRnJcsrQIEDFO3AH7P+6Sxy6t3zbSl5oKBUOI1u4RFmC7YBPqo9fQu2Z2mGk 3+AWQaPr7qgEcFzXBgLzvd4LhTYKsvmiNwrcXi9KjjwQjNEVg15qqF2YtmxEUgi9 kh3IOcGan3iSblhV/WLrxcOjlPQrPpBOVnTPhUskFtlsrD+032OxeOBpVoU3nCUt MjTYWoNTYdw4wHz0w373o0uR4+4nl4a5OmO4Fh6Drmg5hm4Bl9BWy0Kziu93Z1Ay Z2utZVWLWhBzn8yJujUz =NW9g -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "First round of -rc fixes for 4.10 kernel: - a series of qedr fixes - a series of rxe fixes - one i40iw fix - one cma fix - one cxgb4 fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: Don't check for null ptr in send() IB/rxe: Drop future atomic/read packets rather than retrying IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends qedr: Always notify the verb consumer of flushed CQEs qedr: clear the vendor error field in the work completion qedr: post_send/recv according to QP state qedr: ignore inline flag in read verbs qedr: modify QP state to error when destroying it qedr: return correct value on modify qp qedr: return error if destroy CQ failed qedr: configure the number of CQEs on CQ creation i40iw: Set 128B as the only supported RQ WQE size IB/cma: Fix a race condition in iboe_addr_get_sgid() IB/rxe: Fix a memory leak in rxe_qp_cleanup() iw_cxgb4: set correct FetchBurstMax for QPs	2016-12-23 10:38:48 -08:00
Steve Wise	b414fa01c3	iw_cxgb4: set correct FetchBurstMax for QPs The current QP FetchBurstMax value is 256B, which is incorrect since a WR can exceed that value. The result being a partial WR fetched by hardware, and a fatal "bad WR" error posted by the SGE. So bump the FetchBurstMax to 512B. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-18 13:35:19 -05:00
Linus Torvalds	4d5b57e05a	Updates for 4.10 kernel merge window - Shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - Driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - Debug cleanups - New connection rejection helpers - SRP updates - Various misc fixes - New paravirt driver from vmware -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYUbAPAAoJELgmozMOVy/dMXcP/iuG5MNzfN8Ny1JftyBQGWg3 cqoQ2OLj9CsXjwVB+5EqbcZHRZY852lKONaLoDKkIOx4YAXO2YuIKOp944vN7EQx 96wfqzT1F5jzAcy5mYZXgLaStGFDAwejKMqeHd0LfJj3OEtemGnVPWYzyqSQmSKo dzJraS1Z9GIRppzU5WaRpB9PtRBkqIqGJ5vZ0EKLGhed5hYY5r0iMJB0GfriMRDO lJ4UUVfpsAoLPnqDBFH6IMn2V2UeAw9IR5zNa1mrM1RBfvt/uYTxrw1w3p9WoaNs GRodhk4DCeAfeyqzVPNBLyXZ4Zq4FzGe3UWM4qysJ1RR4oFNw9Cuw0Fqk8mrfznr 7hv5TpGIckRZiKf8l6e+qLirF0qGtXJg29j2vPVQI9i5nSj95g1agA81PnLQlLLb flWyxeMj81my7lfMHN1xcV6pqPEKMCOysZmfcvVfJd2XxpjuVD7ekl/YXWp8o8kU YPdQMqPD626XsD8VpPdMszb9FPmx0JD0HEv+Y1rIFX8JegEI+c3H2X0dqC27T/Ou FEPWOy025EgHm0Fh/7eIzkG6tjZ4JHoCugJAcxNZGj2XW4eB6r5vY8UwJ8iQRv+n PVYHiy0UoIRePh0mrdOSSphGZMi/GO/DsqKwCtAMEK43WqZQju6wR7QSIGkh66mp 4uSHJqpf3YEYylxGMhk3 =QeGy -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "This is the complete update for the rdma stack for this release cycle. Most of it is typical driver and core updates, but there is the entirely new VMWare pvrdma driver. You may have noticed that there were changes in DaveM's pull request to the bnxt Ethernet driver to support a RoCE RDMA driver. The bnxt_re driver was tentatively set to be pulled in this release cycle, but it simply wasn't ready in time and was dropped (a few review comments still to address, and some multi-arch build issues like prefetch() not working across all arches). Summary: - shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - debug cleanups - new connection rejection helpers - SRP updates - various misc fixes - new paravirt driver from vmware" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits) IB: Add vmw_pvrdma driver IB/mlx4: fix improper return value IB/ocrdma: fix bad initialization infiniband: nes: return value of skb_linearize should be handled MAINTAINERS: Update Intel RDMA RNIC driver maintainers MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers IB/core: fix unmap_sg argument qede: fix general protection fault may occur on probe IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc mlx5, calc_sq_size(): Make a debug message more informative mlx5: Remove a set-but-not-used variable mlx5: Use { } instead of { 0 } to init struct IB/srp: Make writing the add_target sysfs attr interruptible IB/srp: Make mapping failures easier to debug IB/srp: Make login failures easier to debug IB/srp: Introduce a local variable in srp_add_one() IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build IB/multicast: Check ib_find_pkey() return value IPoIB: Avoid reading an uninitialized member variable IB/mad: Fix an array index check ...	2016-12-15 12:03:32 -08:00
Doug Ledford	86ef0beaa0	Merge branch 'mlx' into merge-test	2016-12-14 14:44:25 -05:00
Doug Ledford	884fa4f304	Merge branches 'chelsio', 'debug-cleanup', 'hns' and 'i40iw' into merge-test	2016-12-14 14:43:14 -05:00
Moni Shoua	477864c8fc	IB/core: Let create_ah return extended response to user Add struct ib_udata to the signature of create_ah callback that is implemented by IB device drivers. This allows HW drivers to return extra data to the userspace library. This patch prepares the ground for mlx5 driver to resolve destination mac address for a given GID and return it to userspace. This patch was previously submitted by Knut Omang as a part of the patch set to support Oracle's Infiniband HCA (SIF). Signed-off-by: Knut Omang <knut.omang@oracle.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-13 13:38:27 -05:00
Wei Yongjun	15f7e3c21b	iw_cxgb4: Fix error return code in c4iw_rdev_open() Fix to return error code -ENOMEM from the __get_free_page() error handling case instead of 0, as done elsewhere in this function. Fixes: `05eb23893c` ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-08 10:03:36 -05:00
Leon Romanovsky	9a88f96f21	IB/cxgb4: Remove debug prints after allocation failure The prints after [k\|v][m\|z\|c]alloc() functions are not needed, because in case of failure, allocator will print their internal error prints anyway. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-03 13:12:52 -05:00
David S. Miller	f9aa9dc7d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net All conflicts were simple overlapping changes except perhaps for the Thunder driver. That driver has a change_mtu method explicitly for sending a message to the hardware. If that fails it returns an error. Normally a driver doesn't need an ndo_change_mtu method becuase those are usually just range changes, which are now handled generically. But since this extra operation is needed in the Thunder driver, it has to stay. However, if the message send fails we have to restore the original MTU before the change because the entire call chain expects that if an error is thrown by ndo_change_mtu then the MTU did not change. Therefore code is added to nicvf_change_mtu to remember the original MTU, and to restore it upon nicvf_update_hw_max_frs() failue. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-22 13:27:16 -05:00
Hariprasad Shenai	ab677ff4ad	cxgb4: Allocate Tx queues dynamically Allocate resources dynamically for Upper layer driver's (ULD) like cxgbit, iw_cxgb4, cxgb4i and chcr. The resources allocated include Tx queues which are allocated when ULD register with cxgb4 driver and freed while un-registering. The Tx queues which are shared by ULD shall be allocated by first registering driver and un-allocated by last unregistering driver. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-18 14:04:29 -05:00
Steve Wise	5c6b2aaf93	iw_cxgb4: invalidate the mr when posting a read_w_inv wr Also, rearrange things a bit to have a common c4iw_invalidate_mr() function used everywhere that we need to invalidate. Fixes: `49b53a93a6` ("iw_cxgb4: add fast-path for small REG_MR operations") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-11-16 20:10:36 -05:00
Steve Wise	4ff522ea47	iw_cxgb4: set bad_wr for post_send/post_recv errors There are a few cases in c4iw_post_send() and c4iw_post_receive() where bad_wr is not set when an error is returned. This can cause a crash if the application tries to use bad_wr. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-11-16 20:10:36 -05:00
Linus Torvalds	b9044ac829	Merge of primary rdma-core code for 4.9 - Updates to mlx5 - Updates to mlx4 (two conflicts, both minor and easily resolved) - Updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - Improvements to uAPI (moved vendor specific API elements to uAPI area) - Add hns-roce driver and hns and hns-roce ACPI reset support - Conversion of all rdma code away from deprecated create_singlethread_workqueue - Security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging) -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJX+AwSAAoJELgmozMOVy/d0WkQAKxPzVccMWwHv28iZI4ey13u JwE+VoCNpCAZAVuEgzK5zzFdNHPvAk2jU93H4apA7dfXJBXPatVuj9Lnk+ieEEnW tbFwJjBpbQ3Zol3+SPfAHnsVMbtax+xmd6WDKExPXXEDl1L6rutwL3KKfmgWEitg ysX7XOJCiSdyM0hcg4T6UPB9a3jGPff9NLu0oGamV+yoUk5Y0WGoVFxHZ4MKcw8t OkFBYIxGz4SGwq2tulStuH03HteURX594KngtrA8dyq6l1R2GlGRv+bkJAUEIWUv aA0ow3VWusOM6fT+jLXPCv8iUwIXM8tR/U6F7X+cmORUUtWvCl+uCUVid113j/aN BK+Af2nJnfoJ5cDBPsD+bC76l5gQycNZO/Qh8op2kmgJtD+6OpGM3cBXsHx53+kk 0wloJ2lKCGShWxNj+ig8n8rR/rhhs/x3vV3ouCVWNMbOUgOSN3eYHxmK3wGFW4nd Qx+WYCjj9Yi/J6nmUDcfEQ4NWPR22Q2+0ENAabfhLhV6mDloAO5ILHd4GDqC3IA9 UtxlVjf4ZonaiLnTQQzCnDMGVVk6tT8FJ9D42s0ScwjbdYwjyCW9/rs/g2EhcprR Cc+AmjqLviCWGtzBSFO0SijqQon8lcQOwdLw61CdFFvPa/mlLdf1rbx9ArIyNVKn JSrbr3CGyoqyYj6qaEO5 =LC+S -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull main rdma updates from Doug Ledford: "This is the main pull request for the rdma stack this release. The code has been through 0day and I had it tagged for linux-next testing for a couple days. Summary: - updates to mlx5 - updates to mlx4 (two conflicts, both minor and easily resolved) - updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - improvements to uAPI (moved vendor specific API elements to uAPI area) - add hns-roce driver and hns and hns-roce ACPI reset support - conversion of all rdma code away from deprecated create_singlethread_workqueue - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits) staging/lustre: Disable InfiniBand support iw_cxgb4: add fast-path for small REG_MR operations cxgb4: advertise support for FR_NSMR_TPTE_WR IB/core: correctly handle rdma_rw_init_mrs() failure IB/srp: Fix infinite loop when FMR sg[0].offset != 0 IB/srp: Remove an unused argument IB/core: Improve ib_map_mr_sg() documentation IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets IB/mthca: Move user vendor structures IB/nes: Move user vendor structures IB/ocrdma: Move user vendor structures IB/mlx4: Move user vendor structures IB/cxgb4: Move user vendor structures IB/cxgb3: Move user vendor structures IB/mlx5: Move and decouple user vendor structures IB/{core,hw}: Add constant for node_desc ipoib: Make ipoib_warn ratelimited IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue IB/ipoib: Remove deprecated create_singlethread_workqueue ...	2016-10-09 17:04:33 -07:00
Steve Wise	49b53a93a6	iw_cxgb4: add fast-path for small REG_MR operations When processing a REG_MR work request, if fw supports the FW_RI_NSMR_TPTE_WR work request, and if the page list for this registration is <= 2 pages, and the current state of the mr is INVALID, then use FW_RI_NSMR_TPTE_WR to pass down a fully populated TPTE for FW to write. This avoids FW having to do an async read of the TPTE blocking the SQ until the read completes. To know if the current MR state is INVALID or not, iw_cxgb4 must track the state of each fastreg MR. The c4iw_mr struct state is updated as REG_MR and LOCAL_INV WRs are posted and completed, when a reg_mr is destroyed, and when RECV completions are processed that include a local invalidation. This optimization increases small IO IOPS for both iSER and NVMF. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-10-07 16:54:40 -04:00
Leon Romanovsky	e44ee2fd98	IB/cxgb4: Move user vendor structures This patch moves cxgb4 vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libcxgb4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-10-07 16:54:35 -04:00
Yuval Shaia	bd99fdea42	IB/{core,hw}: Add constant for node_desc Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-10-07 16:54:34 -04:00
Bhaktipriya Shridhar	52ee1a05d2	iw_cxgb4: Remove deprecated create_singlethread_workqueue alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces deprecated create_singlethread_workqueue(). This is the identity conversion. The workqueue "workq" queues work item &skb_work. It has been identity converted. WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-10-07 16:54:27 -04:00
David S. Miller	d6989d4bbe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-09-23 06:46:57 -04:00
Hariprasad Shenai	0fbc81b3ad	chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's Allocate resources dynamically to cxgb4's Upper layer driver's(ULD) like cxgbit, iw_cxgb4 and cxgb4i. Allocate resources when they register with cxgb4 driver and free them while unregistering. All the queues and the interrupts for them will be allocated during ULD probe only and freed during remove. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-19 01:37:32 -04:00
Varun Prakash	6e3b6fc201	libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_rx_data_ack() Add cxgb_mk_rx_data_ack() to remove duplicate code to form CPL_RX_DATA_ACK hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	052f4731ed	libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_abort_rpl() Add cxgb_mk_abort_rpl() to remove duplicate code to form CPL_ABORT_RPL hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	a7e1a97f88	libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_abort_req() Add cxgb_mk_abort_req() to remove duplicate code to form CPL_ABORT_REQ hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	29fb6f42e7	libcxgb, iw_cxgb4, cxgbit: add cxgb_mk_close_con_req() Add cxgb_mk_close_con_req() to remove duplicate code to form CPL_CLOSE_CON_REQ hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	a1a234542b	libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_tid_release() Add cxgb_mk_tid_release() to remove duplicate code to form CPL_TID_RELEASE hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	cc516700c7	libcxgb,iw_cxgb4,cxgbit: add cxgb_compute_wscale() Add cxgb_compute_wscale() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	44c6d06992	libcxgb,iw_cxgb4,cxgbit: add cxgb_best_mtu() Add cxgb_best_mtu() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:20 -04:00
Varun Prakash	b65eef0a5b	libcxgb,iw_cxgb4,cxgbit: add cxgb_is_neg_adv() Add cxgb_is_neg_adv() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:19 -04:00
Varun Prakash	95554761d1	libcxgb,iw_cxgb4,cxgbit: add cxgb_find_route6() Add cxgb_find_route6() in libcxgb_cm.c to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:19 -04:00
Varun Prakash	804c2f3e36	libcxgb,iw_cxgb4,cxgbit: add cxgb_find_route() Add cxgb_find_route() in libcxgb_cm.c to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:19 -04:00
Varun Prakash	85e42b044e	libcxgb,iw_cxgb4,cxgbit: add cxgb_get_4tuple() Add cxgb_get_4tuple() in libcxgb_cm.c to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-15 20:49:19 -04:00
Linus Torvalds	46626600d1	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A set of fixes for the current series in the realm of block. Like the previous pull request, the meat of it are fixes for the nvme fabrics/target code. Outside of that, just one fix from Gabriel for not doing a queue suspend if we didn't get the admin queue setup in the first place" * 'for-linus' of git://git.kernel.dk/linux-block: nvme-rdma: add back dependency on CONFIG_BLOCK nvme-rdma: fix null pointer dereference on req->mr nvme-rdma: use ib_client API to detect device removal nvme-rdma: add DELETING queue flag nvme/quirk: Add a delay before checking device ready for memblaze device nvme: Don't suspend admin queue that wasn't created nvme-rdma: destroy nvme queue rdma resources on connect failure nvme_rdma: keep a ref on the ctrl during delete/flush iw_cxgb4: block module unload until all ep resources are released iw_cxgb4: call dev_put() on l2t allocation failure	2016-09-15 13:22:59 -07:00
Jens Axboe	3bc42f3f0e	Merge branch 'nvmf-4.8-rc' of git://git.infradead.org/nvme-fabrics into for-linus Sagi writes: Here we have: - Kconfig dependencies fix from Arnd - nvme-rdma device removal fixes from Steve - possible bad deref fix from Colin	2016-09-13 07:58:34 -06:00
Steve Wise	37eb816c08	iw_cxgb4: block module unload until all ep resources are released Otherwise an endpoint can be still closing down causing a touch after free crash. Also WARN_ON if ulps have failed to destroy various resources during device removal. Fixes: `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Reviewed-by: Sagi Grimberg <sagi@grimbrg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>	2016-09-04 10:00:53 +03:00
Steve Wise	609e941a6b	iw_cxgb4: call dev_put() on l2t allocation failure Reviewed-by: Sagi Grimberg <sagi@grimbrg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>	2016-09-04 10:00:53 +03:00
Baoyou Xie	656aacea6c	IB/cxgb4: Make _free_qp static to silence build warning We get 1 warning when build kernel with W=1: drivers/infiniband/hw/cxgb4/qp.c:686:6: warning: no previous prototype for '_free_qp' [-Wmissing-prototypes] In fact, this function is only used in the file in which it is declared and don't need a declaration, but can be made static. so this patch marks it 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-09-02 13:46:33 -04:00
Bharat Potnuri	cff069b78c	iw_cxgb4: Fix cxgb4 arm CQ logic w/IB_CQ_REPORT_MISSED_EVENTS Current cxgb4 arm CQ logic ignores IB_CQ_REPORT_MISSED_EVENTS for request completion notification on a CQ. Due to this ib_poll_handler() assumes all events polled and avoids further iopoll scheduling. This patch adds logic to cxgb4 ib_req_notify_cq() handler to check if CQ is not empty and return accordingly. Based on the return value of ib_req_notify_cq() handler, ib_poll_handler() will schedule a run of iopoll handler. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-23 12:52:52 -04:00
Steve Wise	30b03b1528	iw_cxgb4: use the MPA initiator's IRD if < our ORD The i40iw initiator sends an MPA-request with ird=16 and ord=16. The cxgb4 responder sends an MPA-reply with ord = 32 causing i40iw to terminate due to insufficient resources. The logic to reduce the ORD to <= peer's IRD was wrong. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-22 15:00:42 -04:00
Steve Wise	7f446abf12	iw_cxgb4: limit IRD/ORD advertised to ULP by device max. The i40iw initiator sends an MPA-request with ird = 63, ord = 63. The cxgb4 responder sends a RST. Since the inbound ord=63 and it exceeds the max_ird/c4iw_max_read_depth (=32 default), chelsio decides to abort. Instead, cxgb4 should adjust the ord/ird down before presenting it to the ULP. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-22 15:00:42 -04:00
Doug Ledford	3e5e8e8a9a	Merge branches 'cxgb4' and 'mlx5' into k.o/for-4.8	2016-08-03 20:58:45 -04:00
Steve Wise	ad61a4c7a9	iw_cxgb4: don't block in destroy_qp awaiting the last deref Blocking in c4iw_destroy_qp() causes a deadlock when apps destroy a qp or disconnect a cm_id from their cm event handler function. There is no need to block here anyway, so just replace the refcnt atomic with a kref object and free the memory on the last put. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-02 13:15:17 -04:00
Steve Wise	1b1a889dbb	iw_cxgb4: explicitly move the qp to ERROR state during flush This forces the connection to abort if the application failed to disconnect before flushing. This is aligned with how the common flush services work. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-02 13:15:17 -04:00
Steve Wise	12eb5137ed	iw_cxgb4: stop MPA_REPLY timer when disconnecting There exists a race where the application can setup a connection and then disconnect it before iw_cxgb4 processes the fw4_ack message. For passive side connections, the fw4_ack message is used to know when to stop the ep timer for MPA_REPLY messages. If the application disconnects before the fw4_ack is handled then c4iw_ep_disconnect() needs to clean up the timer state and stop the timer before restarting it for the disconnect timer. Failure to do this results in a "timer already started" message and a premature stopping of the disconnect timer. Fixes: `e4b76a2` ("RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiation") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-02 13:15:17 -04:00
Hariprasad S	56b2eca3fe	RDMA/iw_cxgb4: Use kfree_skb instead of kfree The commit `0f8ab0b6e9` ("RDMA/iw_cxgb4: Low resource fixes for Memory registration") from Jun 10, 2016, leads to the following static checker warning: drivers/infiniband/hw/cxgb4/mem.c:612 c4iw_alloc_mw() error: use kfree_skb() here instead of kfree(mhp->dereg_skb) Also fixes skb leak in c4iw_dealloc_mw Fixes: `0f8ab0b6e9` ("RDMA/iw_cxgb4: Low resource fixes for Memory registration") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-08-02 13:04:42 -04:00
Doug Ledford	fb92d8fb1b	Merge branches 'cxgb4-4.8', 'mlx5-4.8' and 'fw-version' into k.o/for-4.8	2016-06-23 12:29:26 -04:00
Ira Weiny	ce1922435d	IB/cxgb4: Support device FW version string And remove sysfs fw_ver in favor of the core. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 12:08:33 -04:00
Hariprasad S	dd6b024126	RDMA/iw_cxgb4: Low resource fixes for Completion queue Pre-allocate buffers to deallocate completion queue, so that completion queue is deallocated during RDMA termination when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:44:18 -04:00
Hariprasad S	0f8ab0b6e9	RDMA/iw_cxgb4: Low resource fixes for Memory registration Pre-allocate buffers for deregistering memory region and memory window during RDMA connection close, when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:44:17 -04:00
Hariprasad S	4a740838bf	RDMA/iw_cxgb4: Low resource fixes for connection manager Pre-allocate buffers for sending various control messages to close connection, abort connection, etc so that we gracefully handle connections when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:44:17 -04:00
Hariprasad S	4c72efefd9	RDMA/iw_cxgb4: Add missing error codes for act open cmd Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:44:17 -04:00
Hariprasad S	bce2841f5a	RDMA/iw_cxgb4: clean up c4iw_reject_cr() Get rid of unneeded code, and refactor things a bit. For MPA version 0 we abort the connection. For > 0, we attempt to send an MPA_START/REJECT Reply, and then disconnect gracefully. If the send of the MPA message fails, then we abort the connection. We can ignore c4iw_ep_disconnect() errors here because it will clean up the endpoint if there are failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:44:16 -04:00
Hariprasad S	68cebcab59	RDMA/iw_cxgb4: allocate enough space for debugfs "qps" dump With IPv6 addresses, the "qps" debugfs is running out of space and truncating the output. Bump the required size accordingly. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:44:16 -04:00
Hariprasad S	3d4e79949c	RDMA/iw_cxgb4: only read markers_enabled mod param once markers_enabled should be read only once during MPA negotiation. The present code does read markers_enabled twice during negotiation which results in setting wrong recv/xmit markers if the markers_enabled is changed in the middle of negotiation. With this change the markers_enabled is read only once during MPA negotiation. recv markers are set based on markers enabled module parameter and xmit markers are set based on markers flag from the MPA_START_REQ/MPA_START_REP. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 10:43:19 -04:00
Christoph Lameter	b40f4757da	IB/core: Make device counter infrastructure dynamic In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]	2016-05-26 12:52:51 -04:00
Doug Ledford	0651ec932a	Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into k.o/for-4.7	2016-05-13 19:40:38 -04:00
Bart Van Assche	ba987e51a6	iw_cxgb4: Convert a __force cast __force casts should be avoided if there is a better alternative. Hence modify the comparison of s_addr with INADDR_ANY such that the __force cast is no longer necessary. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Vipul Pandya <vipul@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-13 19:38:11 -04:00
Hariprasad S	64bec74a9b	RDMA/iw_cxgb4: Add arp failure handlers to send_mpa_reply/reject() These handlers when called print error message to the kernel log, but the actual handling is done by _c4iw_free_ep() and process_timeout(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-13 19:38:11 -04:00
Hariprasad S	093108cb36	RDMA/iw_cxgb4: Always wake up waiter in c4iw_peer_abort_intr() Currently c4iw_peer_abort_intr() does not wake up the waiter if the endpoint state indicates we're using MPAv2 and we're currently trying to connect. This was introduced with commit `7c0a33d611` ("RDMA/cxgb4: Don't wakeup threads for MPAv2") However, this original fix is flawed because it introduces a race that can cause a deadlock of the iwarp stack. Here is the race: ->local side sets up an active offload connection. ->local side sends MPA_START request. ->peer sends MPA_START response. ->local side ingress cpl thread begins processing the MPA_START response, but before it changes the state from MPA_REQ_SENT to FPDU_MODE: ->peer sends a RST which results in a ABORT_REQ_RSS. This triggers peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the stack will re-attempt the connection using MPAv1. ->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT WR. ->So the cpl thread is stuck waiting for a reply and cannot process the ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a halt because no more ingress cpls are processed by the stack... The correct fix for the issue is to always do the wake up in c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect(). Fixes: `7c0a33d611` ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-13 19:38:10 -04:00
Hariprasad S	4a4dd8db9d	RDMA/iw_cxgb4: Handle ret value of process_mpa_reply() in rx_data Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-13 19:38:10 -04:00
Hariprasad S	f86fac79af	RDMA/iw_cxgb4: atomic find and reference for listening endpoints Add get_ep_from_stid() which will atomically find and reference the endpoint struct if found. This avoids touch-after-free races between threads destroying listening endpoints and the CPL processing thread processing an incoming PASS_ACCEPT_REQ CPL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-13 19:38:09 -04:00

1 2 3 4 5 ...

621 Commits