linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Nicholas Bellinger	6cc44a6fb4	iser-target: Add missing target_put_sess_cmd for ImmedateData failure This patch addresses a bug where an early exception for SCSI WRITE with ImmediateData=Yes was missing the target_put_sess_cmd() call to drop the extra se_cmd->cmd_kref reference obtained during the normal iscsit_setup_scsi_cmd() codepath execution. This bug was manifesting itself during session shutdown within isert_cq_rx_comp_err() where target_wait_for_sess_cmds() would end up waiting indefinately for the last se_cmd->cmd_kref put to occur for the failed SCSI WRITE + ImmediateData descriptors. This fix follows what traditional iscsi-target code already does for the same failure case within iscsit_get_immediate_data(). Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Cc: Sagi Grimberg <sagig@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-06-03 19:17:31 -07:00
Roland Dreier	5343c00dd0	IB/mad: Fix sparse warning about gfp_t use Properly convert gfp_t & result to bool to fix: drivers/infiniband/core/sa_query.c:621:33: warning: incorrect type in initializer (different base types) drivers/infiniband/core/sa_query.c:621:33: expected bool [unsigned] [usertype] preload drivers/infiniband/core/sa_query.c:621:33: got restricted gfp_t Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-06-03 10:24:24 -07:00
Jiri Kosina	40f2287bd5	IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO Modify the various routines used to allocate memory resources which serve QPs in mlx4 to get an input GFP directive. Have the Ethernet driver to use GFP_KERNEL in it's QP allocations as done prior to this commit, and the IB driver to use GFP_NOIO when the IB verbs IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided. Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-06-02 14:58:11 -07:00
Or Gerlitz	09b93088d7	IB: Add a QP creation flag to use GFP_NOIO allocations This addresses a problem where NFS client writes over IPoIB connected mode may deadlock on memory allocation/writeback. The problem is not directly memory reclamation. There is an indirect dependency between network filesystems writing back pages and ipoib_cm_tx_init() due to how a kworker is used. Page reclaim cannot make forward progress until ipoib_cm_tx_init() succeeds and it is stuck in page reclaim itself waiting for network transmission. Ordinarily this situation may be avoided by having the caller use GFP_NOFS but ipoib_cm_tx_init() does not have that information. To address this, take a general approach and add a new QP creation flag that tells the low-level hardware driver to use GFP_NOIO for the memory allocations related to the new QP. Use the new flag in the ipoib connected mode path, and if the driver doesn't support it, re-issue the QP creation without the flag. Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-06-02 14:58:11 -07:00
Or Gerlitz	60093dc0c8	IB: Return error for unsupported QP creation flags Fix the usnic and thw qib drivers to err when QP creation flags that they don't understand are provided. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-06-02 14:58:11 -07:00
Yann Droneaud	729ee4efcc	IB: Allow build of hw/ and ulp/ subdirectories independently It is not possible to build only the drivers/infiniband/hw/ (or ulp/) subdirectory with command such as: $ make ARCH=x86_64 O=./obj-x86_64/ drivers/infiniband/hw/ This fails with following error messages: make[2]: Nothing to be done for `all'. make[2]: Nothing to be done for `relocs'. CHK include/config/kernel.release Using /home/ydroneaud/src/linux as source for kernel GEN /home/ydroneaud/src/linux/obj-x86_64/Makefile CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h CALL /home/ydroneaud/src/linux/scripts/checksyscalls.sh /home/ydroneaud/src/linux/scripts/Makefile.build:44: /home/ydroneaud/src/linux/drivers/infiniband/hw/Makefile: No such file or directory make[2]: * No rule to make target `/home/ydroneaud/src/linux/drivers/infiniband/hw/Makefile'. Stop. make[1]: * [drivers/infiniband/hw/] Error 2 make: *** [sub-make] Error 2 This patch creates a Makefile in hw/ and ulp/ and moves each corresponding parts of drivers/infiniband/Makefile in the new Makefiles. It should not break build except if some hw/ drivers or ulp/ were allowed previously to be built while CONFIG_INFINIBAND is set to 'n', but according to drivers/infiniband/Kconfig, it's not possible. So it should be safe to apply. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-06-02 14:51:12 -07:00
David S. Miller	96b2e73c54	Revert "net/mlx4_en: Use affinity hint" This reverts commit `70a640d0da`. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 00:18:48 -07:00
Yuval Atias	70a640d0da	net/mlx4_en: Use affinity hint The “affinity hint” mechanism is used by the user space daemon, irqbalancer, to indicate a preferred CPU mask for irqs. Irqbalancer can use this hint to balance the irqs between the cpus indicated by the mask. We wish the HCA to preferentially map the IRQs it uses to numa cores close to it. To accomplish this, we use cpumask_set_cpu_local_first(), that sets the affinity hint according the following policy: First it maps IRQs to “close” numa cores. If these are exhausted, the remaining IRQs are mapped to “far” numa cores. Signed-off-by: Yuval Atias <yuvala@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 19:16:29 -07:00
Yann Droneaud	b6f04d3d21	RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp The i386 ABI disagrees with most other ABIs regarding alignment of data types larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABI struct c4iw_create_cq_resp gets implicitly padded to be aligned on a 8 bytes multiple, while for i386, such padding is not added. The tool pahole can be used to find such implicit padding: $ pahole --anon_include \ --nested_anon_include \ --recursive \ --class_name c4iw_create_cq_resp \ drivers/infiniband/hw/cxgb4/iw_cxgb4.o Then, structure layout can be compared between i386 and x86_64: +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 11:43:05.547432195 +0100 --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100 @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp { __u32 size; /* 28 4 / __u32 qid_mask; / 32 4 / - / size: 36, cachelines: 1, members: 6 / - / last cacheline: 36 bytes / + / size: 40, cachelines: 1, members: 6 / + / padding: 4 / + / last cacheline: 40 bytes */ }; This ABI disagreement will make an x86_64 kernel try to write past the buffer provided by an i386 binary. When boundary check will be implemented, the x86_64 kernel will refuse to write past the i386 userspace provided buffer and the uverbs will fail. If the structure is on a page boundary and the next page is not mapped, ib_copy_to_udata() will fail and the uverb will fail. This patch adds an explicit padding at end of structure c4iw_create_cq_resp, and, like `92b0ca7cb1` ("IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_create_cq() not writting this padding field to userspace. This way, x86_64 kernel will be able to write struct c4iw_create_cq_resp as expected by unpatched and patched i386 libcxgb4. Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Fixes: `cfdda9d764` ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Fixes: `e24a72a330` ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()") Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:44:57 -07:00
Joe Perches	d236cd0e20	IB/srp: Avoid problems if a header uses pr_fmt SRP defines pr_fmt(fmt) to be "PFX fmt", and then includes a bunch of header files before it gets around to defining PFX. This causes problems if any of the header files do a pr_... and use pr_fmt(). Fix this by using KBUILD_MODNAME instead of the private PFX. Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:32:54 -07:00
Bart Van Assche	8ec0a0e6b5	IB/umad: Fix error handling Avoid leaking a kref count in ib_umad_open() if port->ib_dev == NULL or if nonseekable_open() fails. Avoid leaking a kref count, that sm_sem is kept down and also that the IB_PORT_SM capability mask is not cleared in ib_umad_sm_open() if nonseekable_open() fails. Since container_of() never returns NULL, remove the code that tests whether container_of() returns NULL. Moving the kref_get() call from the start of ib_umad_*open() to the end is safe since it is the responsibility of the caller of these functions to ensure that the cdev pointer remains valid until at least when these functions return. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: <stable@vger.kernel.org> [ydroneaud@opteya.com: rework a bit to reduce the amount of code changed] Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> [ nonseekable_open() can't actually fail, but.... - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:27:30 -07:00
Jack Morgenstein	65fed8a8c1	IB/mlx4: Add interface for selecting VFs to enable QP0 via MLX proxy QPs This commit adds the sysfs interface for enabling QP0 on VFs for selected VF/port. By default, no VFs are enabled for QP0 operation. To enable QP0 operation on a VF/port, under /sys/class/infiniband/mlx4_x/iov/<b:d:f>/ports/x there are two new entries: - smi_enabled (read-only). Indicates whether smi is currently enabled for the indicated VF/port - enable_smi_admin (rw). Used by the admin to request that smi capability be enabled or disabled for the indicated VF/port. 0 = disable, 1 = enable. The requested enablement will occur at the next reset of the VF (e.g. driver restart on the VM which owns the VF). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:13:19 -07:00
Jack Morgenstein	99ec41d0a4	mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs This commit adds the infrastructure for enabling selected VFs to operate SMI (QP0) MADs without restriction. Additionally, for these enabled VFs, their QP0 proxy and tunnel QPs are MLX QPs. As such, they operate over VL15. Therefore, they are not affected by "credit" problems or changes in the VLArb table (which may shut down VL0). Non-enabled VFs may only create UD proxy QP0 qps (which are forced by the hypervisor to send packets using the q-key it assigns and places in the qp-context). Thus, non-enabled VFs will not pose a security risk. The hypervisor discards any privileged MADs it receives from these non-enabled VFs. By default, all VFs are NOT enabled, and must explicitly be enabled by the administrator. The sysfs interface which operates the VF enablement infrastructure is provided in the next commit. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:13:09 -07:00
Jack Morgenstein	97982f5a91	IB/mlx4: Preparation for VFs to issue/receive SMI (QP0) requests/responses Currently, VFs in SRIOV VFs are denied QP0 access. The main reason for this decision is security, since Subnet Management Datagrams (SMPs) are not restricted by network partitioning and may affect the physical network topology. Moreover, even the SM may be denied access from portions of the network by setting management keys unknown to the SM. However, it is desirable to grant SMI access to certain privileged VFs, so that certain network management activities may be conducted within virtual machines instead of the hypervisor. This commit does the following: 1. Create QP0 tunnel QPs for all VFs. 2. Discard SMI mads sent-from/received-for non-privileged VFs in the hypervisor MAD multiplex/demultiplex logic. SMI mads from/for privileged VFs are allowed to pass. 3. MAD_IFC wrapper changes/fixes. For non-privileged VFs, only host-view MAD_IFC commands are allowed, and only for SMI LID-Routed GET mads. For privileged VFs, there are no restrictions. This commit does not allow privileged VFs as yet. To determine if a VF is privileged, it calls function mlx4_vf_smi_enabled(). This function returns 0 unconditionally for now. The next two commits allow defining and activating privileged VFs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:12:58 -07:00
Jack Morgenstein	61565013cf	IB/mlx4: SET_PORT called by mlx4_ib_modify_port should be wrapped mlx4_ib_modify_port is invoked in IB for resetting the Q_Key violations counters and for modifying the IB port capability flags. For example, when opensm is started up on the hypervisor, mlx4_ib_modify_port is called to set the port's IsSM flag. In multifunction mode, the SET_PORT command used in this flow should be wrapped (so that the PF port capability flags are also tracked, thus enabling the aggregate of all the VF/PF capability flags to be tracked properly). The procedure mlx4_SET_PORT() in main.c is also renamed to mlx4_ib_SET_PORT() to differentiate it from procedure mlx4_SET_PORT() in port.c. mlx4_ib_SET_PORT() is used exclusively by mlx4_ib_modify_port(). Finally, the CM invokes ib_modify_port() to set the IsCMSupported flag even when running over RoCE. Therefore, when RoCE is active, mlx4_ib_modify_port should return OK unconditionally (since the capability flags and qkey violations counter are not relevant). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:12:58 -07:00
Vinit Agnihotri	0a66d2bd30	IB/qib: Additional Intel branding changes This patches changes user visible function names containing "qlogic" in module init and cleanup. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-29 21:06:39 -07:00
Dan Carpenter	3c735d481b	RDMA/cxgb3: Remove a couple unneeded conditions We know that "reset_tpt_entry" is false on this side of the if else statement so there is no need to check again. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-28 10:04:00 -07:00
Colin Ian King	bfdfcfee3c	IB/mlx4: fix unitialised variable is_mcast Commit `297e0dad72` ("IB/mlx4: Handle Ethernet L2 parameters for IP based GID addressing") introduced a bug where is_mcast is now no longer initialized on the non-multicast condition and so it can be any random value from the stack. This issue was detected by cppcheck: [drivers/infiniband/hw/mlx4/ah.c:103]: (error) Uninitialized variable: is_mcast Simple fix is to initialise is_mcast to zero. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-28 10:00:06 -07:00
Manuel Schölling	49410185c3	IB/ipath: Use time_before()/_after() Time comparisons must use time_after / time_before to avoid problems when jiffies wraps. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-28 09:57:06 -07:00
Roland Dreier	6c9b5d9b00	IB/mlx5: Fix warning about cast of wr_id back to pointer on 32 bits We need to cast wr_id to unsigned long before casting to a pointer. This fixes: drivers/infiniband/hw/mlx5/mr.c: In function 'mlx5_umr_cq_handler': >> drivers/infiniband/hw/mlx5/mr.c:724:13: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] context = (struct mlx5_ib_umr_context *)wc.wr_id; Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-28 09:23:03 -07:00
Upinder Malhi	ed477c4c83	IB/usnic: Fix source file missing copyright and license Prepends copyright and license to usnic_uiom_interval_tree.c Signed-off-by: Upinder Malhi <umalhi@cisco.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 13:24:40 -07:00
Dennis Dalessandro	7e6d3e5c70	IB/ipath: Translate legacy diagpkt into newer extended diagpkt This patch addresses an issue where the legacy diagpacket is sent in from the user, but the driver operates on only the extended diagpkt. This patch specifically initializes the extended diagpkt based on the legacy packet. Cc: <stable@vger.kernel.org> Reported-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 13:21:04 -07:00
Mike Marciniszyn	911eccd284	IB/qib: Fix port in pkey change event The code used a literal 1 in dispatching an IB_EVENT_PKEY_CHANGE. As of the dual port qib QDR card, this is not necessarily correct. Change to use the port as specified in the call. Cc: <stable@vger.kernel.org> Reported-by: Alex Estrin <alex.estrin@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 13:20:26 -07:00
Dan Carpenter	e4514cbd97	RDMA/cxgb3: Fix information leak in send_abort() The cpl_abort_req struct has several reserved members which need to be cleared to avoid disclosing kernel information. I have added a memset() so now it matches the cxgb4 version of this function. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:55:40 -07:00
Yann Droneaud	43bc889380	IB/mlx5: add missing padding at end of struct mlx5_ib_create_srq The i386 ABI disagrees with most other ABIs regarding alignment of data type larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABIs struct mlx5_ib_create_srq gets implicitly padded to be aligned on a 8 bytes multiple, while for i386, such padding is not added. Tool pahole could be used to find such implicit padding: $ pahole --anon_include \ --nested_anon_include \ --recursive \ --class_name mlx5_ib_create_srq \ drivers/infiniband/hw/mlx5/mlx5_ib.o Then, structure layout can be compared between i386 and x86_64: +++ obj-i386/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt 2014-03-28 11:43:07.386413682 +0100 --- obj-x86_64/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt 2014-03-27 13:06:17.788472721 +0100 @@ -69,7 +68,6 @@ struct mlx5_ib_create_srq { __u64 db_addr; /* 8 8 / __u32 flags; / 16 4 / - / size: 20, cachelines: 1, members: 3 / - / last cacheline: 20 bytes / + / size: 24, cachelines: 1, members: 3 / + / padding: 4 / + / last cacheline: 24 bytes */ }; ABI disagreement will make an x86_64 kernel try to read past the buffer provided by an i386 binary. When boundary check will be implemented, the x86_64 kernel will refuse to read past the i386 userspace provided buffer and the uverb will fail. Anyway, if the structure lay in memory on a page boundary and next page is not mapped, ib_copy_from_udata() will fail and the uverb will fail. This patch makes create_srq_user() takes care of the input data size to handle the case where no padding was provided. This way, x86_64 kernel will be able to handle struct mlx5_ib_create_srq as sent by unpatched and patched i386 libmlx5. Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapter") Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:16 -07:00
Yann Droneaud	a8237b32a3	IB/mlx5: add missing padding at end of struct mlx5_ib_create_cq The i386 ABI disagrees with most other ABIs regarding alignment of data type larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABI struct mlx5_ib_create_cq get padded to be aligned on a 8 bytes multiple, while for i386, such padding is not added. The tool pahole can be used to find such implicit padding: $ pahole --anon_include \ --nested_anon_include \ --recursive \ --class_name mlx5_ib_create_cq \ drivers/infiniband/hw/mlx5/mlx5_ib.o Then, structure layout can be compared between i386 and x86_64: +++ obj-i386/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt 2014-03-28 11:43:07.386413682 +0100 --- obj-x86_64/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt 2014-03-27 13:06:17.788472721 +0100 @@ -34,9 +34,8 @@ struct mlx5_ib_create_cq { __u64 db_addr; /* 8 8 / __u32 cqe_size; / 16 4 / - / size: 20, cachelines: 1, members: 3 / - / last cacheline: 20 bytes / + / size: 24, cachelines: 1, members: 3 / + / padding: 4 / + / last cacheline: 24 bytes */ }; This ABI disagreement will make an x86_64 kernel try to read past the buffer provided by an i386 binary. When boundary check will be implemented, a x86_64 kernel will refuse to read past the i386 userspace provided buffer and the uverb will fail. Anyway, if the structure lies in memory on a page boundary and next page is not mapped, ib_copy_from_udata() will fail when trying to read the 4 bytes of padding and the uverb will fail. This patch makes create_cq_user() takes care of the input data size to handle the case where no padding is provided. This way, x86_64 kernel will be able to handle struct mlx5_ib_create_cq as sent by unpatched and patched i386 libmlx5. Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapter") Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:13 -07:00
Shachar Raindel	a74d24168d	IB/mlx5: Refactor UMR to have its own context struct Instead of having the UMR context part of each memory region, allocate a struct on the stack. This allows queuing multiple UMRs that access the same memory region. Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:09 -07:00
Haggai Eran	48fea837bb	IB/mlx5: Set QP offsets and parameters for user QPs and not just for kernel QPs For user QPs, the creation process does not currently initialize the fields: * qp->rq.offset * qp->sq.offset * qp->sq.wqe_shift Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:08 -07:00
Haggai Eran	b475598aec	mlx5_core: Store MR attributes in mlx5_mr_core during creation and after UMR The patch stores iova, pd and size during mr creation and after UMRs that modify them. It removes the unused access flags field. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:06 -07:00
Haggai Eran	8605933a22	IB/mlx5: Add MR to radix tree in reg_mr_callback For memory regions that are allocated using reg_umr, the suffix of mlx5_core_create_mkey isn't being called. Instead the creation is completed in a callback function (reg_mr_callback). This means that these MRs aren't being added to the MR radix tree. Add them in the callback. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:05 -07:00
Haggai Eran	096f7e72c6	IB/mlx5: Fix error handling in reg_umr If ib_post_send fails when posting the UMR work request in reg_umr, the code doesn't release the temporary pas buffer allocated, and doesn't dma_unmap it. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:05 -07:00
Sagi Grimberg	c7f44fbda6	mlx5_core: Copy DIF fields only when input and output space values match Some DIF implementations (SCSI initiator/target) may want to use different input/output values for application tag and/or reference tag. So in case memory/wire domain values don't match HW must not copy them. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:02 -07:00
Sagi Grimberg	5c273b1677	mlx5_core: Simplify signature handover wqe for interleaved buffers No need for repetition format pattern in case the data and protection are already interleaved in the memory domain since the pattern already exists. A single key entry is sufficient and may save some extra fetch ops. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:52:58 -07:00
Sagi Grimberg	8524867b9c	mlx5_core: Fix signature handover operation for interleaved buffers When the data and protection are interleaved in the memory domain, no need to expand the mkey total length. At the moment no Linux user works (iSER initiator & target) in interleaved mode. This may change in the future as for SCSI pass-through devices there is no real point in target performing de-interleaving and re-interleaving of the protection data in the PT stage. Regardless, signature verbs support this mode. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:52:54 -07:00
Or Gerlitz	c7ca4b69d9	IB/iser: Bump version to 1.4 Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
Roi Dayan	e7eeffa4a0	IB/iser: Add missing newlines to logging messages Logging messages need terminating newlines to avoid possible message interleaving. Add them. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
Ariel Nahum	66d4e62d27	IB/iser: Fix a possible race in iser connection states transition In some circumstances (multiple targets), RDMA_CM ESTABLISHED event and ep_disconnect may race. In this case, the iser connection state may transition to UP (after ep_disconnect transitioned it to TERMINATING), while the connection is being torn down. Upon RDMA_CM event ESTABLISHED we allow iser connection state to transition to UP only from PENDING. We also make sure to protect this state change (done under the connection lock). Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
Ariel Nahum	b73c3adabd	IB/iser: Simplify connection management iSER relies on refcounting to manage iser connections establishment and teardown. Following commit `39ff05dbbb` ("IB/iser: Enhance disconnection logic for multi-pathing"), iser connection maintain 3 references: - iscsi_endpoint (at creation stage) - cma_id (at connection request stage) - iscsi_conn (at bind stage) We can avoid taking explicit refcounts by correctly serializing iser teardown flows (graceful and non-graceful). Our approach is to trigger a scheduled work to handle ordered teardown by gracefully waiting for 2 cleanup stages to complete: 1. Cleanup of live pending tasks indicated by iscsi_conn_stop completion 2. Flush errors processing Each completed stage will notify a waiting worker thread when it is done to allow teardwon continuation. Since iSCSI connection establishment may trigger endpoint disconnect without a successful endpoint connect, we rely on the iscsi <-> iser binding (.conn_bind) to learn about the teardown policy we should take wrt cleanup stages. Since all cleanup worker threads are scheduled (release_wq) in .ep_disconnect it is safe to assume that when module_exit is called, all cleanup workers are already scheduled. Thus proper module unload shall flush all scheduled works before allowing safe exit, to guarantee no resources got left behind. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
David S. Miller	54e5c4def0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 00:32:30 -04:00
Linus Torvalds	5fa6a683c0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "It looks like a sizeble collection but this is nearly 3 weeks of bug fixing while you were away. 1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute the packet through a non-IPSEC protected path and the code has to be able to handle SKBs attached to routes lacking an attached xfrm state. From Steffen Klassert. 2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported sub-protocols, also from Steffen Klassert. 3) Set local_df on fragmented netfilter skbs otherwise we won't be able to forward successfully, from Florian Westphal. 4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without holding RCU lock, from Bjorn Mork. 5) local_df test in ip_may_fragment is inverted, from Florian Westphal. 6) jme driver doesn't check for DMA mapping failures, from Neil Horman. 7) qlogic driver doesn't calculate number of TX queues properly, from Shahed Shaikh. 8) fib_info_cnt can drift irreversibly positive if we fail to allocate the fi->fib_metrics array, from Sergey Popovich. 9) Fix use after free in ip6_route_me_harder(), also from Sergey Popovich. 10) When SYSCTL is disabled, we don't handle local_port_range and ping_group_range defaults properly at all, from Cong Wang. 11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim driver, fix from Bjorn Mork. 12) cassini driver needs nested lock annotations for TX locking, from Emil Goode. 13) On init error ipv6 VTI driver can unregister pernet ops twice, oops. Fix from Mahtias Krause. 14) If macvlan device is down, don't propagate IFF_ALLMULTI changes, from Peter Christensen. 15) Missing NULL pointer check while parsing netlink config options in ip6_tnl_validate(). From Susant Sahani. 16) Fix handling of neighbour entries during ipv6 router reachability probing, from Duan Jiong. 17) x86 and s390 JIT address randomization has some address calculation bugs leading to crashes, from Alexei Starovoitov and Heiko Carstens. 18) Clear up those uglies with nop patching and net_get_random_once(), from Hannes Frederic Sowa. 19) Option length miscalculated in ip6_append_data(), fix also from Hannes Frederic Sowa. 20) A while ago we fixed a race during device unregistry when a namespace went down, turns out there is a second place that needs similar protection. From Cong Wang. 21) In the new Altera TSE driver multicast filtering isn't working, disable it and just use promisc mode until the cause is found. From Vince Bridgers. 22) When we disable router enabling in ipv6 we have to flush the cached routes explicitly, from Duan Jiong. 23) NBMA tunnels should not cache routes on the tunnel object because the key is variable, from Timo Teräs. 24) With stacked devices GRO information in skb->cb[] can be not setup properly, make sure it is in all code paths. From Eric Dumazet. 25) Really fix stacked vlan locking, multiple levels of nesting with intervening non-vlan devices are possible. From Vlad Yasevich. 26) Fallback ipip tunnel device's mtu is not setup properly, from Steffen Klassert. 27) The packet scheduler's tcindex filter can crash because we structure copy objects with list_head's inside, oops. From Cong Wang. 28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric Dumazet. 29) In some configurations 'itag' in __mkroute_input() can end up being used uninitialized because of how fib_validate_source() works. Fix it by explitly initializing itag to zero like all the other fib_validate_source() callers do, from Li RongQing" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) batman: fix a bogus warning from batadv_is_on_batman_iface() ipv4: initialise the itag variable in __mkroute_input bonding: Send ALB learning packets using the right source bonding: Don't assume 802.1Q when sending alb learning packets. net: doc: Update references to skb->rxhash stmmac: Remove unbalanced clk_disable call ipv6: gro: fix CHECKSUM_COMPLETE support net_sched: fix an oops in tcindex filter can: peak_pci: prevent use after free at netdev removal ip_tunnel: Initialize the fallback device properly vlan: Fix build error wth vlan_get_encap_level() can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option MAINTAINERS: Pravin Shelar is Open vSwitch maintainer. bnx2x: Convert return 0 to return rc bonding: Fix alb mode to only use first level vlans. bonding: Fix stacked device detection in arp monitoring macvlan: Fix lockdep warnings with stacked macvlan devices vlan: Fix lockdep warning with stacked vlan devices. net: Allow for more then a single subclass for netif_addr_lock net: Find the nesting level of a given device by type. ...	2014-05-23 15:29:43 -07:00
Linus Torvalds	a7aa96a92e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull scsi target fixes from Nicholas Bellinger: "This series include: - Close race between iser-target network portal shutdown + accepting new connection logins (sagi) - Fix free-after-use regression in tcm_fc post conversion to percpu-ida pre-allocation (nab) - Explicitly disable Immediate + Unsolicited Data for iser-target connections when T10-PI is enabled (sagi + nab) - Allow pi_prot_type + emulate_write_cache attributes to be set to zero regardless of backend support (andy) - memory leak fix (mikulas)" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: fix memory leak on XCOPY target: Don't allow setting WC emulation if device doesn't support iscsi-target: Disable Immediate + Unsolicited Data with ISER Protection tcm_fc: Fix free-after-use regression in ft_free_cmd iscsi-target: Change BUG_ON to REJECT in iscsit_process_nop_out Target/iscsi,iser: Avoid accepting transport connections during stop stage Target/iser: Fix iscsit_accept_np and rdma_cm racy flow Target/iser: Fix wrong connection requests list addition target: Allow non-supporting backends to set pi_prot_type to 0	2014-05-21 18:03:14 +09:00
Sagi Grimberg	5f80ff8ecc	Target/iser: Gracefully reject T10-PI enabled connect request if not supported In case user chose to set T10-PI enable on the target while the IB device does not support it, gracefully reject the request. Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-20 11:18:44 -07:00
Sagi Grimberg	f5ebec9629	Target/iser: Wait for proper cleanup before unloading disconnected_handler works are scheduled on system_wq. When attempting to unload, first make sure all works have cleaned up. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-20 11:18:43 -07:00
Sagi Grimberg	88c4015fda	Target/iser: Improve cm events handling There are 4 RDMA_CM events that all basically mean that the user should teardown the IB connection: - DISCONNECTED - ADDR_CHANGE - DEVICE_REMOVAL - TIMEWAIT_EXIT Only in DISCONNECTED/ADDR_CHANGE it makes sense to call rdma_disconnect (send DREQ/DREP to our initiator). So we keep the same teardown handler for all of them but only indicate calling rdma_disconnect for the relevant events. This patch also removes redundant debug prints for each single event. v2 changes: - Call isert_disconnected_handler() for DEVICE_REMOVAL (Or + Sag) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-20 11:17:40 -07:00
Bart Van Assche	5cfb17828d	IB/srp: Add fast registration support Certain HCA types (e.g. Connect-IB) and certain configurations (e.g. ConnectX VF) support fast registration but not FMR. Hence add fast registration support. In function srp_rport_reconnect(), move the the srp_finish_req() loop from after to before the srp_create_target_ib() call. This is needed to avoid that srp_finish_req() tries to queue any invalidation requests for rkeys associated with the old queue pair on the newly allocated queue pair. Invoking srp_finish_req() before the queue pair has been reallocated is safe since srp_claim_req() handles completions correctly that arrive after srp_finish_req() has been invoked. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	52ede08f00	IB/srp: Rename FMR-related variables The next patch will cause the renamed variables to be shared between the code for FMR and for FR memory registration. Make the names of these variables independent of the memory registration mode. This patch does not change any functionality. The start of this patch was the changes applied via the following shell command: sed -i.orig 's/SRP_FMR_SIZE/SRP_MAX_PAGES_PER_MR/g; \ s/fmr_page_mask/mr_page_mask/g;s/fmr_page_size/mr_page_size/g; \ s/fmr_page_shift/mr_page_shift/g;s/fmr_max_size/mr_max_size/g; \ s/max_pages_per_fmr/max_pages_per_mr/g;s/nfmr/nmdesc/g; \ s/fmr_len/dma_len/g' drivers/infiniband/ulp/srp/ib_srp.[ch] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	d1b4289e16	IB/srp: One FMR pool per SRP connection Allocate one FMR pool per SRP connection instead of one SRP pool per HCA. This improves scalability of the SRP initiator. Only request the SCSI mid-layer to retry a SCSI command after a temporary mapping failure (-ENOMEM) but not after a permanent mapping failure. This avoids that SCSI commands are retried indefinitely if a permanent memory mapping failure occurs. Tell the SCSI mid-layer to reduce queue depth temporarily in the unlikely case where an application is queuing many requests with more than max_pages_per_fmr sg-list elements. For FMR pool allocation, base the max_pages_per_fmr parameter on the HCA memory registration limit. Only try to allocate an FMR pool if FMR is supported. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	b1b8854d16	IB/srp: Introduce the 'register_always' kernel module parameter Add a kernel module parameter that enables memory registration also for SG-lists that can be processed without memory registration. This makes it easier for kernel developers to test the memory registration code. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	539dde6fc5	IB/srp: Introduce srp_finish_mapping() This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	76bc1e1ddd	IB/srp: Introduce srp_map_fmr() This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Bart Van Assche	62154b2e89	IB/srp: Introduce an additional local variable This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Bart Van Assche	af24663bc8	IB/srp: Fix kernel-doc warnings Avoid that the kernel-doc tool warns about missing argument descriptions for the ib_srp.[ch] source files. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Bart Van Assche	024ca90151	IB/srp: Fix a sporadic crash triggered by cable pulling Avoid that the loops that iterate over the request ring can encounter a pointer to a SCSI command in req->scmnd that is no longer associated with that request. If the function srp_unmap_data() is invoked twice for a SCSI command that is not in flight then that would cause ib_fmr_pool_unmap() to be invoked with an invalid pointer as argument, resulting in a kernel oops. Reported-by: Sagi Grimberg <sagig@mellanox.com> Reference: http://thread.gmane.org/gmane.linux.drivers.rdma/19068/focus=19069 Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Steve Wise	11b8e22d4d	RDMA/cxgb4: Fix vlan support RDMA connections over a vlan interface don't work due to import_ep() not using the correct egress device. - use the real device in import_ep() - use rdma_vlan_dev_real_dev() in get_real_dev(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-19 18:00:32 -07:00
Duan Jiong	0cc65dd691	RDMA/ocrdma: Convert to use simple_open() This removes an open-coded duplicate of simple_open(). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-19 17:55:54 -07:00
Christoph Jaeger	65b302ad31	RDMA/cxgb4: Fix memory leaks in c4iw_alloc() error paths c4iw_alloc() bails out without freeing the storage that 'devp' points to. Picked up by Coverity - CID 1204241. Fixes: `fa658a98a2` ("RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices") Signed-off-by: Christoph Jaeger <christophjaeger@linux.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-19 17:55:43 -07:00
Sagi Grimberg	9d49f5e284	Target/iser: Fix hangs in connection teardown In ungraceful teardowns isert close flows seem racy such that isert_wait_conn hangs as RDMA_CM_EVENT_DISCONNECTED never gets invoked (no one called rdma_disconnect). Both graceful and ungraceful teardowns will have rx flush errors (isert posts a batch once connection is established). Once all flush errors are consumed we invoke isert_wait_conn and it will be responsible for calling rdma_disconnect. This way it can be sure that rdma_disconnect was called and it won't wait forever. This patch also removes the logout_posted indicator. either the logout completion was consumed and no problem decrementing the post_send_buf_count, or it was consumed as a flush error. no point of keeping it for isert_wait_conn as there is no danger that isert_conn will be accidentally removed while it is running. (Drop unnecessary sleep_on_conn_wait_comp check in isert_cq_rx_comp_err - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-19 14:38:49 -07:00
Sagi Grimberg	e346ab343f	Target/iser: Bail from accept_np if np_thread is trying to close In case np_thread state is in RESET/SHUTDOWN/EXIT states, no point for isert to stall there as we may get a hang in case no one will wake it up later. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-19 14:32:58 -07:00
Matan Barak	9433c18891	IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes When we receive a netdev event indicating a netdev change and/or a netdev address change, we must change the MAC index used by the proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the VF will not carry the same source MAC address as the non-CM packets. We use the UPDATE_QP command to perform this change. In order to avoid modifying a QP context based on netdev event, while the driver attempts to destroy this QP (e.g either the mlx4_ib or ib_mad modules are unloaded), we use mutex locking in both flows. Since the relevant mlx4 proxy GSI QP is created indirectly by the mad module when they create their GSI QP, the mlx4 didn't need to keep track on that QP prior to this change. Now, when QP modifications are needed to this QP from within the driver, we added refernece to it. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 15:12:45 -04:00
Sagi Grimberg	14f4b54fe3	Target/iscsi,iser: Avoid accepting transport connections during stop stage When the target is in stop stage, iSER transport initiates RDMA disconnects. The iSER initiator may wish to establish a new connection over the still existing network portal. In this case iSER transport should not accept and resume new RDMA connections. In order to learn that, iscsi_np is added with enabled flag so the iSER transport can check when deciding weather to accept and resume a new connection request. The iscsi_np is enabled after successful transport setup, and disabled before iscsi_np login threads are cleaned up. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-15 17:09:11 -07:00
Sagi Grimberg	531b7bf4bd	Target/iser: Fix iscsit_accept_np and rdma_cm racy flow RDMA CM and iSCSI target flows are asynchronous and completely uncorrelated. Relying on the fact that iscsi_accept_np will be called after CM connection request event and will wait for it is a mistake. When attempting to login to a few targets this flow is racy and unpredictable, but for parallel login to dozens of targets will race and hang every time. The correct synchronizing mechanism in this case is pending on a semaphore rather than a wait_for_event. We keep the pending interruptible for iscsi_np cleanup stage. (Squash patch to remove dead code into parent - nab) Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-15 17:09:10 -07:00
Sagi Grimberg	9fe63c88b1	Target/iser: Fix wrong connection requests list addition Should be adding list_add_tail($new, $head) and not the other way around. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-15 17:09:10 -07:00
Wilfried Klaebe	7ad24ea4bf	net: get rid of SET_ETHTOOL_OPS net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops ops; struct net_device dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:43:20 -04:00
Hariprasad S	7d0a73a40c	RDMA/cxgb4: Update Kconfig to include Chelsio T5 adapter Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Steve Wise	c2f9da92f2	RDMA/cxgb4: Only allow kernel db ringing for T4 devs The whole db drop avoidance stuff is for T4 only. So we cannot allow that to be enabled for T5 devices. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Steve Wise	92e5011ab0	RDMA/cxgb4: Force T5 connections to use TAHOE congestion control This is required to work around a T5 HW issue. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Steve Wise	cc18b939e1	RDMA/cxgb4: Fix endpoint mutex deadlocks In cases where the cm calls c4iw_modify_rc_qp() with the endpoint mutex held, they must be called with internal == 1. rx_data() and process_mpa_reply() are not doing this. This causes a deadlock because c4iw_modify_rc_qp() might call c4iw_ep_disconnect() in some !internal cases, and c4iw_ep_disconnect() acquires the endpoint mutex. The design was intended to only do the disconnect for !internal calls. Change rx_data(), FPDU_MODE case, to call c4iw_modify_rc_qp() with internal == 1, and then disconnect only after releasing the mutex. Change process_mpa_reply() to call c4iw_modify_rc_qp(TERMINATE) with internal == 1 and set a new attr flag telling it to send a TERMINATE message. Previously this was implied by !internal. Change process_mpa_reply() to return whether the caller should disconnect after releasing the endpoint mutex. Now rx_data() will do the disconnect in the cases where process_mpa_reply() wants to disconnect after the TERMINATE is sent. Change c4iw_modify_rc_qp() RTS->TERM to only disconnect if !internal, and to send a TERMINATE message if attrs->send_term is 1. Change abort_connection() to not aquire the ep mutex for setting the state, and make all calls to abort_connection() do so with the mutex held. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Linus Torvalds	38137a5188	InfiniBand/RDMA updates for 3.15-rc2: - Mostly cxgb4 fixes unblocked by the merge of some prerequisites via the net tree. - Drop deprecated MSI-X API use. - A couple other miscellaneous things. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJTUXCpAAoJEENa44ZhAt0h4tEP/1P1TqA/90sO7WeChze6H2ph kkGJPsCj8i1AWpef/Ax2hvoH1pjxK/VQbpwgkGJElv+rRaP99pandYGPBFluUoJK jI6D/mZSAsl6BXIZZCTcz9ttQNYacDdDL1I22WIdntkkCUsoP1awuHnXS4WLAiPA 3D1XmiSHTGv9nbqd2wKA3sHWFaUN/s+RqsvIXUcNrN4qEUL2KXPOyt7i4Pz8eUq7 /lxG1S854QUrZsW7qFR6Jc3clhNupUDWBBRrII3MOV98W3upHOVsFAdtcLZAkcTA FBCk90VVi0hsPeHqqCZjKUW/JE414vEDzOZfmYOpMj0xzc9sF1POxlvqMfRjvKTD fwYAkNLT7AW74jSU38KZl9pdLsQx187c75pcYUZ4ph6m6bozbtlPGkJaa3M3yyUb DmWUZL1dX8i69dV+w/ABz/4uK0dbCwMeAM9cCCUiedqI5wIuEV4Ld3NOLe7HnAXz O81UrVVnwrOVl8uPmGzY3VliU0OtFqhMBCxTgsZRxuBr9DSx/dUzb4Tzfw7e3Sc+ +uxMMnbuCDysqEfzKsLG0+qu9P3hcwyafTJ9u8pq0D+BCB3tA8ZIKiqrKeMBua2h IJAy8M+zVG2HZCQFreH2oMnXmzVoyp+5Yl28izMDEfDptJ6hel4FLBiLLlYnU1DO zxUN5X1yOHxFkK4YYetE =/PTZ -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma updates from Roland Dreier: - mostly cxgb4 fixes unblocked by the merge of some prerequisites via the net tree - drop deprecated MSI-X API use. - a couple other miscellaneous things. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cxgb4: Fix over-dereference when terminating RDMA/cxgb4: Use uninitialized_var() RDMA/cxgb4: Add missing debug stats RDMA/cxgb4: Initialize reserved fields in a FW work request RDMA/cxgb4: Use pr_warn_ratelimited RDMA/cxgb4: Max fastreg depth depends on DSGL support RDMA/cxgb4: SQ flush fix RDMA/cxgb4: rmb() after reading valid gen bit RDMA/cxgb4: Endpoint timeout fixes RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices IB/mlx5: Add block multicast loopback support IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix() IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix()	2014-04-18 13:49:42 -07:00
Linus Torvalds	141eaccd01	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Here are the target pending updates for v3.15-rc1. Apologies in advance for waiting until the second to last day of the merge window to send these out. The highlights this round include: - iser-target support for T10 PI (DIF) offloads (Sagi + Or) - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung) - Pass in transport supported PI at session initialization (Sagi + MKP + nab) - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi) - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab) - Fix tcm_fc use-after-free of ft_tpg (Andy Grover) - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn) Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI metadata into KVM guest have been left-out for now, as there where a few comments from MST + Paolo that where not able to be addressed in time for v3.15. Please expect this feature for v3.16-rc1" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits) ib_srpt: Use correct ib_sg_dma primitives target/tcm_fc: Rename ft_tport_create to ft_tport_get target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn target/tcm_fc: Rename structs and list members for clarity target/tcm_fc: Limit to 1 TPG per wwn target/tcm_fc: Don't export ft_lport_list target/tcm_fc: Fix use-after-free of ft_tpg target: Add check to prevent Abort Task from aborting itself target: Enable READ_STRIP emulation in target_complete_ok_work target/sbc: Add sbc_dif_read_strip software emulation target: Enable WRITE_INSERT emulation in target_execute_cmd target/sbc: Add sbc_dif_generate software emulation target/sbc: Only expose PI read_cap16 bits when supported by fabric target/spc: Only expose PI mode page bits when supported by fabric target/spc: Only expose PI inquiry bits when supported by fabric target: Pass in transport supported PI at session initialization target/iblock: Fix double bioset_integrity_free bug Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist target/rd: T10-Dif: RAM disk is allocating more space than required. iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug ...	2014-04-12 16:51:08 -07:00
Mike Marciniszyn	b076808051	ib_srpt: Use correct ib_sg_dma primitives The code was incorrectly using sg_dma_address() and sg_dma_len() instead of ib_sg_dma_address() and ib_sg_dma_len(). This prevents srpt from functioning with the Intel HCA and indeed will corrupt memory badly. Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Vinod Kumar <vinod.kumar@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: stable@vger.kernel.org # 3.3+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-11 15:31:20 -07:00
Roland Dreier	5ae2866f52	Merge branches 'cxgb4', 'misc', 'mlx5' and 'qib' into for-next	2014-04-11 11:36:15 -07:00
Steve Wise	1d1ca9b4fd	RDMA/cxgb4: Fix over-dereference when terminating Need to get the endpoint reference before calling rdma_fini(), which might fail causing us to not get the reference. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:10 -07:00
Steve Wise	97df1c6736	RDMA/cxgb4: Use uninitialized_var() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:10 -07:00
Steve Wise	98a3e87990	RDMA/cxgb4: Add missing debug stats Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:09 -07:00
Steve Wise	c3f98fa291	RDMA/cxgb4: Initialize reserved fields in a FW work request Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:09 -07:00
Hariprasad Shenai	aec844df10	RDMA/cxgb4: Use pr_warn_ratelimited Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:08 -07:00
Steve Wise	a03d9f94cc	RDMA/cxgb4: Max fastreg depth depends on DSGL support The max depth of a fastreg mr depends on whether the device supports DSGL or not. So compute it dynamically based on the device support and the module use_dsgl option. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:08 -07:00
Steve Wise	b4e2901c52	RDMA/cxgb4: SQ flush fix There is a race when moving a QP from RTS->CLOSING where a SQ work request could be posted after the FW receives the RDMA_RI/FINI WR. The SQ work request will never get processed, and should be completed with FLUSHED status. Function c4iw_flush_sq(), however was dropping the oldest SQ work request when in CLOSING or IDLE states, instead of completing the pending work request. If that oldest pending work request was actually complete and has a CQE in the CQ, then when that CQE is proceessed in poll_cq, we'll BUG_ON() due to the inconsistent SQ/CQ state. This is a very small timing hole and has only been hit once so far. The fix is two-fold: 1) c4iw_flush_sq() MUST always flush all non-completed WRs with FLUSHED status regardless of the QP state. 2) In c4iw_modify_rc_qp(), always set the "in error" bit on the queue before moving the state out of RTS. This ensures that the state transition will not happen while another thread is in post_rc_send(), because set_state() and post_rc_send() both aquire the qp spinlock. Also, once we transition the state out of RTS, subsequent calls to post_rc_send() will fail because the "in error" bit is set. I don't think this fully closes the race where the FW can get a FINI followed a SQ work request being posted (because they are posted to differente EQs), but the #1 fix will handle the issue by flushing the SQ work request. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:08 -07:00
Steve Wise	def4771f4b	RDMA/cxgb4: rmb() after reading valid gen bit Some HW platforms can reorder read operations, so we must rmb() after we see a valid gen bit in a CQE but before we read any other fields from the CQE. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:07 -07:00
Steve Wise	b33bd0cbfa	RDMA/cxgb4: Endpoint timeout fixes 1) timedout endpoint processing can be starved. If there are continual CPL messages flowing into the driver, the endpoint timeout processing can be starved. This condition exposed the other bugs below. Solution: In process_work(), call process_timedout_eps() after each CPL is processed. 2) Connection events can be processed even though the endpoint is on the timeout list. If the endpoint is scheduled for timeout processing, then we must ignore MPA Start Requests and Replies. Solution: Change stop_ep_timer() to return 1 if the ep has already been queued for timeout processing. All the callers of stop_ep_timer() need to check this and act accordingly. There are just a few cases where the caller needs to do something different if stop_ep_timer() returns 1: 1) in process_mpa_reply(), ignore the reply and process_timeout() will abort the connection. 2) in process_mpa_request, ignore the request and process_timeout() will abort the connection. It is ok for callers of stop_ep_timer() to abort the connection since that will leave the state in ABORTING or DEAD, and process_timeout() now ignores timeouts when the ep is in these states. 3) Double insertion on the timeout list. Since the endpoint timers are used for connection setup and teardown, we need to guard against the possibility that an endpoint is already on the timeout list. This is a rare condition and only seen under heavy load and in the presense of the above 2 bugs. Solution: In ep_timeout(), don't queue the endpoint if it is already on the queue. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:07 -07:00
Steve Wise	fa658a98a2	RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fix cast from u64* to integer. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:01 -07:00
Eli Cohen	f360d88a2e	IB/mlx5: Add block multicast loopback support Add support for the block multicast loopback QP creation flag along the proper firmware API for that. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-10 18:43:32 -07:00
Alexander Gordeev	9684c2ea6d	IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix() As result of the deprecation of the MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block(), all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() or pci_enable_msi_exact() and pci_enable_msix_range() or pci_enable_msix_exact() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-10 18:41:34 -07:00
Alexander Gordeev	bf3f043e7b	IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix() As result of the deprecation of the MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block(), all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() and pci_enable_msix_range() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-10 18:39:13 -07:00
Nicholas Bellinger	e70beee783	target: Pass in transport supported PI at session initialization In order to support local WRITE_INSERT + READ_STRIP operations for non PI enabled fabrics, the fabric driver needs to be able signal what protection offload operations are supported. This is done at session initialization time so the modes can be signaled by individual se_wwn + se_portal_group endpoints, as well as optionally across different transports on the same endpoint. For iser-target, set TARGET_PROT_ALL if the underlying ib_device has already signaled PI offload support, and allow this to be exposed via a new iscsit_transport->iscsit_get_sup_prot_ops() callback. For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode operation. For all other drivers, set TARGET_PROT_NORMAL to disable fabric level PI. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:54 -07:00
Sagi Grimberg	f225225848	Target/iser: Use Fastreg only if device supports signature Fastreg is mandatory for signature, so if the device doesn't support it we don't need to use it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:52 -07:00
Nicholas Bellinger	03e7848a64	iser-target: Add missing se_cmd put for WRITE_PENDING in tx_comp_err This patch fixes a bug where outstanding RDMA_READs with WRITE_PENDING status require an extra target_put_sess_cmd() in isert_put_cmd() code when called from isert_cq_tx_comp_err() + isert_cq_drain_comp_llist() context during session shutdown. The extra kref PUT is required so that transport_generic_free_cmd() invokes the last target_put_sess_cmd() -> target_release_cmd_kref(), which will complete(&se_cmd->cmd_wait_comp) the outstanding se_cmd descriptor with WRITE_PENDING status, and awake the completion in target_wait_for_sess_cmds() to invoke TFO->release_cmd(). The bug was manifesting itself in target_wait_for_sess_cmds() where a se_cmd descriptor with WRITE_PENDING status would end up sleeping indefinately. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:51 -07:00
Nicholas Bellinger	131e6abc67	target: Add TFO->abort_task for aborted task resources release Now that TASK_ABORTED status is not generated for all cases by TMR ABORT_TASK + LUN_RESET, a new TFO->abort_task() caller is necessary in order to give fabric drivers a chance to unmap hardware / software resources before the se_cmd descriptor is released via the normal TFO->release_cmd() codepath. This patch adds TFO->aborted_task() in core_tmr_abort_task() in place of the original transport_send_task_abort(), and also updates all fabric drivers to implement this caller. The fabric drivers that include changes to perform cleanup via ->aborted_task() are: - iscsi-target - iser-target - srpt - tcm_qla2xxx The fabric drivers that currently set ->aborted_task() to NOPs are: - loopback - tcm_fc - usb-gadget - sbp-target - vhost-scsi For the latter five, there appears to be no additional cleanup required before invoking TFO->release_cmd() to release the se_cmd descriptor. v2 changes: - Move ->aborted_task() call into transport_cmd_finish_abort (Alex) Cc: Alex Leung <amleung21@yahoo.com> Cc: Mark Rustad <mark.d.rustad@intel.com> Cc: Roland Dreier <roland@kernel.org> Cc: Vu Pham <vu@mellanox.com> Cc: Chris Boot <bootc@bootc.net> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Cc: Saurav Kashyap <saurav.kashyap@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:51 -07:00
Nicholas Bellinger	f46d6a8a01	iser-target: Match FRMR descriptors to available session tags This patch changes isert_conn_create_fastreg_pool() to follow logic in iscsi_target_locate_portal() for determining how many FRMR descriptors to allocate based upon the number of possible per-session command slots that are available. This addresses an OOPs in isert_reg_rdma() where due to the use of ISCSI_DEF_XMIT_CMDS_MAX could end up returning a bogus fast_reg_descriptor when the number of active tags exceeded the original hardcoded max. Note this also includes moving isert_conn_create_fastreg_pool() from isert_connect_request() to isert_put_login_tx() before posting the final Login Response PDU in order to determine the se_nacl->queue_depth (eg: number of tags) per session the target will be enforcing. v2 changes: - Move isert_conn->conn_fr_pool list_head init into isert_conn_request() v3 changes: - Drop unnecessary list_empty() check in isert_reg_rdma() (Sagi) Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:50 -07:00
Sagi Grimberg	5bac4b1a1f	Target/iser: Fail SCSI WRITE command if device detected integrity error If during data-transfer a data-integrity error was detected we must fail the command with CHECK_CONDITION and not execute the command. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:49 -07:00
Sagi Grimberg	96b7973e1c	Target/iser: Move check signature status to a function Remove code duplication from RDMA_READ and RDMA_WRITE completions that do basically the same check. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:49 -07:00
Sagi Grimberg	897bb2c916	Target/iser: Consider DIF and RDMA_READ completions when calculating post_send counter If protection is involved, iSER target must wait for completion of RDMA_READ before sending SCSI response. So we must consider that when calculating post_send_buf_count additions, also when processing good/error completions. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:48 -07:00
Sagi Grimberg	c2caa20777	Target/iser: Fix signature work requests accounting As REG_SIG_MR work request and it's LOCAL_INVALIDATE are not accounted in post_send_buf_count we must color these with ISER_FASTREG_LI_WRID in order to process their error completions when the QP flushes. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:48 -07:00
Sagi Grimberg	9e961ae73c	IB/isert: Support T10-PI protected transactions In case the Target core passed transport T10 protection operation: 1. Register data buffer (data memory region) 2. Register protection buffer if exsists (prot memory region) 3. Register signature region (signature memory region) - use work request IB_WR_REG_SIG_MR 4. Execute RDMA 5. Upon RDMA completion check the signature status - if succeeded send good SCSI response - if failed send SCSI bad response with appropriate sense buffer (Fix up compile error in isert_reg_sig_mr, and fix up incorrect se_cmd->prot_type -> TARGET_PROT_NORMAL comparision - nab) (Fix failed sector assignment in isert_completion_rdma_* - Sagi + nab) (Fix enum assignements for protection type - Sagi) (Fix devision on 32-bit in isert_completion_rdma_* - Sagi + Fengguang) (Fix context change for v3.14-rc6 code - nab) (Fix iscsit_build_rsp_pdu inc_statsn flag usage - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:47 -07:00
Sagi Grimberg	f93f3a70da	IB/isert: Accept RDMA_WRITE completions In case of protected transactions, we will need to check the protection status of the transaction before sending SCSI response. So be ready for RDMA_WRITE completions. currently we don't ask for these completions, but for T10-PI we will. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:46 -07:00
Sagi Grimberg	d3e125dac1	IB/isert: Initialize T10-PI resources Introduce pi_context to hold relevant RDMA protection resources. We eliminate data_key_valid boolean and replace it with indicators container to indicate: - Is the descriptor protected (registered via signature MR) - Is the data_mr key valid (can spare LOCAL_INV WR) - Is the prot_mr key valid (can spare LOCAL_INV WR) - Is the sig_mr key valid (can spare LOCAL_INV WR) Upon connection establishment check if network portal is T10-PI enabled and allocate T10-PI resources if necessary, allocate signature enabled memory regions and mark connection queue-pair as signature enabled. (Fix context change for v3.14-rc6 code - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:46 -07:00
Sagi Grimberg	e3d7e4c30c	IB/isert: Introduce isert_map/unmap_data_buf export map/unmap data buffer to a routine that may be used in various places in the code and keep the mapping data in a designated descriptor. Also, let isert_fast_reg_mr to decide weather to use global MR or do fast registration. This commit does not change any functionality. (Fix context change for v3.14-rc6 code - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:45 -07:00
Linus Torvalds	877f075aac	Main batch of InfiniBand/RDMA changes for 3.15: - The biggest change is core API extensions and mlx5 low-level driver support for handling DIF/DIX-style protection information, and the addition of PI support to the iSER initiator. Target support will be arriving shortly through the SCSI target tree. - A nice simplification to the "umem" memory pinning library now that we have chained sg lists. Kudos to Yishai Hadas for realizing our code didn't have to be so crazy. - Another nice simplification to the sg wrappers used by qib, ipath and ehca to handle their mapping of memory to adapter. - The usual batch of fixes to bugs found by static checkers etc. from intrepid people like Dan Carpenter and Yann Droneaud. - A large batch of cxgb4, ocrdma, qib driver updates. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJTPYBnAAoJEENa44ZhAt0hGI4P/29eotGwpkANUQE6FQvxCUL2 CXJtSg52lmYvGJrPK4IhihpbtQmHJz3iXEzlOOWidTw1dJgObR6vFaRymh7+vDLs CdzybMcXdasarqTuYeJbFzhkimpwtWWrMy/8Ik/Jj/5glGQ6cUSpdYZzVtFhYNqf hCGE8iLi+tuekJJj1htut5D6apXM7udcdc2yLJNOdsSj/VUXt1oqG1x9xAi9R8Tq 7o8eFSStdlja0EBQ6Hli2zauCSnQkaUtr8h6EAFbcCtvBK8HqsHSc2gfq2ViFUiN ztt167oWoQnVkR0qCPL5nVt+CRQHHROprVXvbpcTI3aW61gNIl6OrUUOXefzHXac TNi+fdMpiEB/JQ4Z04Jzd1dGCSjYeTqPj4rO4meFjBmxRDdTgZHu7FWwejT1nYJ5 d2abVdCOT+QWlIlM7m/pjdWJII5OYM+4/jtTayGepEaR4fTUzKtPZPBLNUBDBKE+ 4f92PC8LiuPkwJgb6XT96onPz1bDCOnPSEdwoKUFKPeGUcwgVOM/Wx5NU4Yf7rfg RxQwZ7mJXbjCYFlmGGo/0QDy6UEGkIFYlJSzooP+wlK1JvZ5h2M+9QKX2FtwzR+R I2kBxcTXWsM/h88R7MkNqbNIllmhssrJwmAE46OneZbfoBOB+JZjb4nLRTu0jEcS zn6f16GmJ37BKn2/qYY/ =Ww6H -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "Main batch of InfiniBand/RDMA changes for 3.15: - The biggest change is core API extensions and mlx5 low-level driver support for handling DIF/DIX-style protection information, and the addition of PI support to the iSER initiator. Target support will be arriving shortly through the SCSI target tree. - A nice simplification to the "umem" memory pinning library now that we have chained sg lists. Kudos to Yishai Hadas for realizing our code didn't have to be so crazy. - Another nice simplification to the sg wrappers used by qib, ipath and ehca to handle their mapping of memory to adapter. - The usual batch of fixes to bugs found by static checkers etc. from intrepid people like Dan Carpenter and Yann Droneaud. - A large batch of cxgb4, ocrdma, qib driver updates" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (102 commits) RDMA/ocrdma: Unregister inet notifier when unloading ocrdma RDMA/ocrdma: Fix warnings about pointer <-> integer casts RDMA/ocrdma: Code clean-up RDMA/ocrdma: Display FW version RDMA/ocrdma: Query controller information RDMA/ocrdma: Support non-embedded mailbox commands RDMA/ocrdma: Handle CQ overrun error RDMA/ocrdma: Display proper value for max_mw RDMA/ocrdma: Use non-zero tag in SRQ posting RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr() RDMA/ocrdma: Increment abi version count RDMA/ocrdma: Update version string be2net: Add abi version between be2net and ocrdma RDMA/ocrdma: ABI versioning between ocrdma and be2net RDMA/ocrdma: Allow DPP QP creation RDMA/ocrdma: Read ASIC_ID register to select asic_gen RDMA/ocrdma: SQ and RQ doorbell offset clean up RDMA/ocrdma: EQ full catastrophe avoidance RDMA/cxgb4: Disable DSGL use by default RDMA/cxgb4: rx_data() needs to hold the ep mutex ...	2014-04-03 16:57:19 -07:00
Roland Dreier	f7eaa7ed8f	Merge branches 'core', 'cxgb4', 'ip-roce', 'iser', 'misc', 'mlx4', 'nes', 'ocrdma', 'qib', 'sgwrapper', 'srp' and 'usnic' into for-next	2014-04-03 08:30:17 -07:00
Selvin Xavier	2d8f57d56f	RDMA/ocrdma: Unregister inet notifier when unloading ocrdma Unregister the inet notifier during ocrdma unload to avoid a panic after driver unload. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:07 -07:00
Roland Dreier	7a1e89d8b7	RDMA/ocrdma: Fix warnings about pointer <-> integer casts We should cast pointers to and from unsigned long to turn them into ints. Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:07 -07:00
Devesh Sharma	fad51b7d36	RDMA/ocrdma: Code clean-up Clean up code. Also modifying GSI QP to error during ocrdma_close is fixed. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:06 -07:00
Selvin Xavier	334b8db3a6	RDMA/ocrdma: Display FW version Adding a sysfs file for getting the FW version. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:06 -07:00
Selvin Xavier	a51f06e167	RDMA/ocrdma: Query controller information Issue mailbox commands to query ocrdma controller information and phy information and print them while adding ocrdma device. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:05 -07:00
Selvin Xavier	bbc5ec524e	RDMA/ocrdma: Support non-embedded mailbox commands Added a routine to issue non-embedded mailbox commands for handling large mailbox request/response data. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:05 -07:00
Selvin Xavier	1228056bcf	RDMA/ocrdma: Handle CQ overrun error Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:05 -07:00
Selvin Xavier	ac578aef8b	RDMA/ocrdma: Display proper value for max_mw Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:04 -07:00
Selvin Xavier	cf5788ade7	RDMA/ocrdma: Use non-zero tag in SRQ posting As part of SRQ receive buffers posting we populate a non-zero tag which will be returned in SRQ receive completions. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:04 -07:00
Selvin Xavier	9d1878a369	RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr() Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:03 -07:00
Devesh Sharma	2e6e9f2bb8	RDMA/ocrdma: Increment abi version count Increment the ABI version count for driver/library interface. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:02 -07:00
Devesh Sharma	0154410bd4	RDMA/ocrdma: Update version string Update the driver vrsion string and node description string Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:59 -07:00
Devesh Sharma	b6b87d2e69	RDMA/ocrdma: ABI versioning between ocrdma and be2net While loading RoCE driver be2net driver should check for ABI version to catch functional incompatibilities. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:51 -07:00
Devesh Sharma	1eebbb6ec3	RDMA/ocrdma: Allow DPP QP creation Allow creating DPP QP even if inline-data is not requested. This is an optimization to lower latency. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:44 -07:00
Devesh Sharma	21c3391a9a	RDMA/ocrdma: Read ASIC_ID register to select asic_gen ocrdma driver selects execution path based on sli_family and asic generation number. This introduces code to read the asic gen number from pci register instead of obtaining it from the Emulex NIC driver. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:40 -07:00
Devesh Sharma	2df84fa87f	RDMA/ocrdma: SQ and RQ doorbell offset clean up Introducing new macros to define SQ and RQ doorbell offset. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:36 -07:00
Devesh Sharma	ea61762679	RDMA/ocrdma: EQ full catastrophe avoidance Stale entries in the CQ being destroyed causes hardware to generate EQEs indefinitely for a given CQ. Thus causing uncontrolled execution of irq_handler. This patch fixes this using following sementics: * irq_handler will ring EQ doorbell atleast once and implement budgeting scheme. * cq_destroy will count number of valid entires during destroy and ring cq-db so that hardware does not generate uncontrolled EQE. * cq_destroy will synchronize with last running irq_handler instance. * arm_cq will always defer arming CQ till poll_cq, except for the first arm_cq call. * poll_cq will always ring cq-db with arm=SET if arm_cq was called prior to enter poll_cq. * poll_cq will always ring cq-db with arm=UNSET if arm_cq was not called prior to enter poll_cq. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:34 -07:00
Steve Wise	96bb2706c8	RDMA/cxgb4: Disable DSGL use by default Current hardware doesn't correctly support DSGL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:53:54 -07:00
Steve Wise	c529fb5046	RDMA/cxgb4: rx_data() needs to hold the ep mutex To avoid racing with other threads doing close/flush/whatever, rx_data() should hold the endpoint mutex. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:53:54 -07:00
Steve Wise	977116c698	RDMA/cxgb4: Drop RX_DATA packets if the endpoint is gone Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:53:53 -07:00
Steve Wise	a7db89eb89	RDMA/cxgb4: Lock around accept/reject downcalls There is a race between ULP threads doing an accept/reject, and the ingress processing thread handling close/abort for the same connection. The accept/reject path needs to hold the lock to serialize these paths. Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fold in locking fix found by Dan Carpenter <dan.carpenter@oracle.com>. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:52:45 -07:00
Moni Shoua	b2853fd6c2	IB/core: Don't resolve passive side RoCE L2 address in CMA REQ handler The code that resolves the passive side source MAC within the rdma_cm connection request handler was both redundant and buggy, so remove it. It was redundant since later, when an RC QP is modified to RTR state, the resolution will take place in the ib_core module. It was buggy because this callback also deals with UD SIDR exchange, for which we incorrectly looked at the REQ member of the CM event and dereferenced a random value. Fixes: `dd5f03beb4` ("IB/core: Ethernet L2 attributes in verbs/cm structures") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 14:05:26 -07:00
Mike Marciniszyn	f3585a6ae3	IB/ehca: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads These methods appear to only mimic the sg_dma_address() and sg_dma_len() behavior. They can be safely removed. Suggested-by: Bart Van Assche <bvanassche@acm.org> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:16:31 -07:00
Mike Marciniszyn	49c5c27e05	IB/ipath: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads The removal of these methods is compensated for by code changes to .map_sg to insure that the vanilla sg_dma_address() and sg_dma_len() will do the same thing as the equivalent former ib_sg_dma_address() and ib_sg_dma_len() calls into the drivers. The introduction of this patch required that the struct ipath_dma_mapping_ops be converted to a C99 initializer. Suggested-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:16:31 -07:00
Mike Marciniszyn	446bf432a9	IB/qib: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads Remove the overload for .dma_len and .dma_address The removal of these methods is compensated for by code changes to .map_sg to insure that the vanilla sg_dma_address() and sg_dma_len() will do the same thing as the equivalent former ib_sg_dma_address() and ib_sg_dma_len() calls into the drivers. Suggested-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Vinod Kumar <vinod.kumar@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:16:31 -07:00
Or Gerlitz	5de2ad986d	IB/iser: Bump driver version to 1.3 Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Or Gerlitz	3ee07d27ac	IB/iser: Update Mellanox copyright note Update Mellanox copyrights for 2014 on the iser initiator driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Or Gerlitz	4f9208ad3f	IB/iser: Print QP information once connection is established Add an iser info print with the local/remote QP information carried out when the connection is established. While here, fix a little leftover from the T10 work and set a debug print to be carried in debug and not info level. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Ariel Nahum	4667f5dfb0	IB/iser: Remove struct iscsi_iser_conn The iscsi stack has existing mechanisms to link back and forth between the iscsi connection and the iscsi transport (e.g iser/tcp) connection. This is done through a dd_data pointer field in struct iscsi_conn which can be set to point to the transport connection, etc. The iscsi_iser_conn structure was used to get this linking done in another way, which is uneeded and adds extra complication to the iser code, so we just remove it. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Roi Dayan	1d6c2b736f	IB/iser: Drain the tx cq once before looping on the rx cq The iser disconnection flow isn't done before all the inflight recv/send buffers posted to the QP are either flushed or normally completed to the CQ that serves this connection. The condition check is done in iser_handle_comp_error(). Currently, it's possible for the send buffer completion that makes the posted send buffers counter reach zero to be polled in the drain tx call, which is after the rx cq is fully drained. Since this completion might be not an error one (for example, it might be a completion of the logout request iSCSI PDU) we will skip iser_handle_comp_error(). So the connection will never terminate from the iscsi stack point of view, and we hang. To resolve this race, do the draining of the tx cq before the loop on the rx cq. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Randy Dunlap	39c978cd17	IB/iser: Fix sector_t format warning Fix pr_err (printk) format warning: drivers/infiniband/ulp/iser/iser_verbs.c:1181:4: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'sector_t' [-Wformat] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:01:08 -07:00
Dan Carpenter	4661bd798f	mlx4_core: Make buffer larger to avoid overflow warning My static checker complains that the sprintf() here can overflow. drivers/infiniband/hw/mlx4/main.c:1836 mlx4_ib_alloc_eqs() error: format string overflow. buf_size: 32 length: 69 This seems like a valid complaint. The "dev->pdev->bus->name" string can be 48 characters long. I just made the buffer 80 characters instead of 69 and I changed the sprintf() to snprintf(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 10:53:29 -07:00
Dan Carpenter	3839d8ac1b	mlx4_core: Fix some indenting in mlx4_ib_add() The code was indented too far and also kernel style says we should have curly braces. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 10:52:18 -07:00
Yan Burman	2c34e68f42	IB/mad: Check and handle potential DMA mapping errors Running with DMA_API_DEBUG enabled and not checking for DMA mapping errors triggers a kernel stack trace with "DMA-API: device driver failed to check map error" message. Add these checks to the MAD module, both to be be more robust and also eliminate these false-positive stack traces. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 10:36:07 -07:00
Yann Droneaud	5bdb0f02ad	IB/ehca: Returns an error on ib_copy_to_udata() failure In case of error when writing to userspace, function ehca_create_cq() does not set an error code before following its error path. This patch sets the error code to -EFAULT when ib_copy_to_udata() fails. This was caught when using spatch (aka. coccinelle) to rewrite call to ib_copy_{from,to}_udata(). Link: `75ebf2c103`:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 10:36:07 -07:00
Yann Droneaud	08e74c4b00	IB/mthca: Return an error on ib_copy_to_udata() failure In case of error when writing to userspace, the function mthca_create_cq() does not set an error code before following its error path. This patch sets the error code to -EFAULT when ib_copy_to_udata() fails. This was caught when using spatch (aka. coccinelle) to rewrite call to ib_copy_{from,to}_udata(). Link: `75ebf2c103`:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 10:35:42 -07:00
Yann Droneaud	bfd2793c95	RDMA/cxgb4: set error code on kmalloc() failure If kmalloc() fails in c4iw_alloc_ucontext(), the function leaves but does not set an error code in ret variable: it will return 0 to the caller. This patch set ret to -ENOMEM in such case. Cc: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@chelsio.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-28 14:55:21 -04:00
Matan Barak	e471b40321	mlx4: Use actual number of PCI functions (PF + VFs) for alias GUID logic The code which is dealing with SRIOV alias GUIDs in the mlx4 IB driver has some logic which operated according to the maximal possible active functions (PF + VFs). After the single port VFs code integration this resulted in a flow of false-positive warnings going to the kernel log after the PF driver started the alias GUID work. Fix it by referring to the actual number of functions. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-25 20:48:05 -04:00
Steve Wise	9c88aa003d	RDMA/cxgb4: Update snd_seq when sending MPA messages Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:07:35 -07:00
Steve Wise	be13b2dff8	RDMA/cxgb4: Connect_request_upcall fixes When processing an MPA Start Request, if the listening endpoint is DEAD, then abort the connection. If the IWCM returns an error, then we must abort the connection and release resources. Also abort_connection() should not post a CLOSE event, so clean that up too. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:07:35 -07:00
Steve Wise	70b9c66053	RDMA/cxgb4: Ignore read reponse type 1 CQEs These are generated by HW in some error cases and need to be silently discarded. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:07:35 -07:00
Steve Wise	1ce1d471ac	RDMA/cxgb4: Fix possible memory leak in RX_PKT processing If cxgb4_ofld_send() returns < 0, then send_fw_pass_open_req() must free the request skb and the saved skb with the tcp header. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:07:35 -07:00
Steve Wise	dbb084cc5f	RDMA/cxgb4: Don't leak skb in c4iw_uld_rx_handler() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:07:35 -07:00
Bart Van Assche	b3fe628da2	IB/srp: Fix a race condition between failing I/O and I/O completion Avoid that srp_terminate_io() can access req->scmnd after it has been cleared by the I/O completion code. Do this by protecting req->scmnd accesses from srp_terminate_io() via locking Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:05:31 -07:00
Bart Van Assche	ac72d766a8	IB/srp: Avoid that writing into "add_target" hangs due to a cable pull If a cable is pulled while srp_connect_target() is in progress that can result in that function never to return. That makes the process, e.g. srp_daemon, that invoked this function unkillable. Avoid this by letting srp_connect_target() finish if the event IB_CM_TIMEWAIT_EXIT is received. This patch fixes a hang with the following call trace: [<ffffffff814eae85>] schedule_timeout+0x215/0x2e0 [<ffffffff814eab03>] wait_for_common+0x123/0x180 [<ffffffff814eac1d>] wait_for_completion+0x1d/0x20 [<ffffffffa03b398c>] srp_connect_target+0x1dc/0x410 [ib_srp] [<ffffffffa03b5809>] srp_create_target+0xba9/0xe70 [ib_srp] [<ffffffff8133e590>] dev_attr_store+0x20/0x30 [<ffffffff811eb8f5>] sysfs_write_file+0xe5/0x170 [<ffffffff811767c8>] vfs_write+0xb8/0x1a0 [<ffffffff811770c1>] sys_write+0x51/0x90 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:05:31 -07:00
Bart Van Assche	a702adceec	IB/srp: Make writing into the "add_target" sysfs attribute interruptible Avoid that stopping srp_daemon takes unusually long due to a cable pull by making writing into the "add_target" sysfs attribute interruptible. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:05:31 -07:00
Bart Van Assche	2d7091bcf6	IB/srp: Avoid duplicate connections The connection uniqueness check is performed before a new connection is added to the target list. This patch protects both actions by a mutex such that simultaneous writes from two different threads into the "add_target" variable do not result in duplicate connections. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:05:31 -07:00
Bart Van Assche	e7ffde0164	IB/srp: Add more logging Log sgid and dgid when reporting that a login has been rejected or when a host has been added. This makes it easy to figure out which initiator and target ports these messages apply to. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:05:30 -07:00
Sagi Grimberg	2088ca66f5	IB/srp: Check ib_query_gid return value Detected by Coverity. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-03-24 10:05:30 -07:00
Matan Barak	449fc48866	net/mlx4: Adapt code for N-Port VF Adds support for N-Port VFs, this includes: 1. Adding support in the wrapped FW command In wrapped commands, we need to verify and convert the slave's port into the real physical port. Furthermore, when sending the response back to the slave, a reverse conversion should be made. 2. Adjusting sqpn for QP1 para-virtualization The slave assumes that sqpn is used for QP1 communication. If the slave is assigned to a port != (first port), we need to adjust the sqpn that will direct its QP1 packets into the correct endpoint. 3. Adjusting gid[5] to modify the port for raw ethernet In B0 steering, gid[5] contains the port. It needs to be adjusted into the physical port. 4. Adjusting number of ports in the query / ports caps in the FW commands When a slave queries the hardware, it needs to view only the physical ports it's assigned to. 5. Adjusting the sched_qp according to the port number The QP port is encoded in the sched_qp, thus in modify_qp we need to encode the correct port in sched_qp. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-20 16:18:30 -04:00
Matan Barak	82373701be	IB/mlx4_ib: Adapt code to use caps.num_ports instead of a constant Some code in the mlx4 IB driver stack assumed MLX4_MAX_PORTS ports. Instead, we should only loop until the number of actual ports in i the device, which is stored in dev->caps.num_ports. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-20 16:18:29 -04:00

1 2 3 4 5 ...

4044 Commits