OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Yonatan Cohen	0fd586de65	IB/mlx4: Add CQ moderation capability to query_device query_device can now obtain the maximum values for cq_max_count and cq_period, needed for cq moderation. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Yonatan Cohen	18bd907292	IB/uverbs: Add CQ moderation capability to query_device The query_device function can now obtain the maximum values for cq_max_count and cq_period, needed for CQ moderation. cq_max_count is a 16 bits number that determines the number of CQEs to accumulate before generating an event. cq_period is a 16 bits number that determines the timeout in micro seconds from the last event generated, upon which a new event will be generated even if cq_max_count was not reached. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Yonatan Cohen	b0e9df6da2	IB/mlx5: Exposing modify CQ callback to uverbs layer Exposed mlx5_ib_modify_cq to be called from ib device verb list. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Yonatan Cohen	34d9a270e7	IB/mlx4: Exposing modify CQ callback to uverbs layer Exposed mlx4_ib_modify_cq to be called from ib device verb list. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Yonatan Cohen	869ddcf8b3	IB/uverbs: Allow CQ moderation with modify CQ Uverbs support in modify_cq for CQ moderation only. Gives ability to change cq_max_count and cq_period. CQ moderation enhance performance by moderating the number of CQEs needed to create an event instead of application having to suffer from event per-CQE. To achieve CQ moderation the application needs to set cq_max_count and cq_period. cq_max_count - defines the number of CQEs needed to create an event. cq_period - defines the timeout (micro seconds) between last event and a new one that will occur even if cq_max_count was not satisfied Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Steve Wise	bc52e9ca74	iw_cxgb4: atomically flush the qp __flush_qp() has a race condition where during the flush operation, the qp lock is released allowing another thread to possibly post a WR, which corrupts the queue state, possibly causing crashes. The lock was released to preserve the cq/qp locking hierarchy of cq first, then qp. However releasing the qp lock is not necessary; both RQ and SQ CQ locks can be acquired first, followed by the qp lock, and then the RQ and SQ flushing can be done w/o unlocking. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Steve Wise	cbb40fadd3	iw_cxgb4: only call the cq comp_handler when the cq is armed The ULPs completion handler should only be called if the CQ is armed for notification. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Bharat Potnuri	1c8f1da5d8	iw_cxgb4: Fix possible circular dependency locking warning Locking sequence of iw_cxgb4 and RoCE drivers in ib_register_device() is slightly different and this leads to possible circular dependency locking warning when both the devices are brought up. Here is the locking sequence upto ib_register_device(): iw_cxgb4: rtnl_mutex(net stack) --> uld_mutex --> device_mutex RoCE drivers: device_mutex --> rtnl_mutex Here is the possibility of cross locking: CPU #0 (iw_cxgb4) CPU #1 (RoCE drivers) -> on interface up cxgb4_up() executed with rtnl_mutex held -> hold uld_mutex and try registering ib device -> In ib_register_device() hold device_mutex -> hold device mutex in ib_register_device -> try acquiring rtnl_mutex in ib_enum_roce_netdev() Current patch schedules the ib_register_device() functionality of iw_cxgb4 to a workqueue to prevent the possible cross-locking. Also rename the labels in c4iw_reister_device(). Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:59:22 -05:00
Devesh Sharma	84511455ac	RDMA/bnxt_re: report vlan_id and sl in qp1 recv completion In a real RoCE v2 network it is possible to have two Sections of network have same IP hence same gid. However those may have different vlans. During connection resolution it is important to report the actual vlan on which the MAD packet was received instead of relying on other means to resolve vlan-id. ib_find_gid_index should not be used to resolve the vlan-id using sgid of the local system where the packet was received. Our device has the capability to report the actual VLAN-ID in the GSI qp completions. Since we have the capability our driver should move away from resolving the vlan-id with the help of SGID at the destination port. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Reported-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:28:44 -05:00
Daniel Jurgens	877add2817	IB/core: Only maintain real QPs in the security lists When modify QP is called on a shared QP update the security context for the real QP. When security is subsequently enforced the shared QP handles will be checked as well. Without this change shared QP handles get added to the port/pkey lists, which is a bug, because not all shared QP handles will be checked for access. Also the shared QP security context wouldn't get removed from the port/pkey lists causing access to free memory and list corruption when they are destroyed. Cc: stable@vger.kernel.org Fixes: `d291f1a652` ("IB/core: Enforce PKey security on QPs") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:24:17 -05:00
Gustavo A. R. Silva	f4e96c1a71	IB/ocrdma_hw: remove unnecessary code in ocrdma_mbx_dealloc_lkey Check on return value and goto label mbx_err are unnecessary. Addresses-Coverity-ID: 1268780 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:18:33 -05:00
Yuval Shaia	e08ce2e82b	RDMA/core: Make function rdma_copy_addr return void Function returns zero - make it void. While there make struct net_device const. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:18:33 -05:00
Bryan Tan	8b10ba783c	RDMA/vmw_pvrdma: Add shared receive queue support Add the required functions needed to support SRQs. Currently, kernel clients are not supported. SRQs will only be available in userspace. Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Reviewed-by: Nitish Bhat <bnitish@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:18:33 -05:00
Arnd Bergmann	cb9fd89f91	RDMA/core: avoid uninitialized variable warning in create_udata As Dan pointed out, the rework I did makes it harder for smatch and other static checkers to figure out what is going on with the uninitialized pointers. By open-coding the call in create_udata(), we make it more readable for both humans and tools. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `12f727721e` ("IB/uverbs: clean up INIT_UDATA_BUF_OR_NULL usage") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 16:11:11 -05:00
Selvin Xavier	051276658b	RDMA/bnxt_re: synchronize poll_cq and req_notify_cq verbs Synchronize poll_cq and req_notify_cq verbs using cq_lock, instead of the lower level qplib->hwq.lock. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Selvin Xavier	c88a7858d7	RDMA/bnxt_re: Flush CQ notification Work Queue before destroying QP Destroy_qp shall wait for any outstanding CQ notification to be flushed out before proceeding with QP destroy. Flushing the WQ before destroying the QP. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Selvin Xavier	237379fc33	RDMA/bnxt_re: Set QP state in case of response completion errors Moves the driver QP state to error in case of response completion errors. Handles the scenarios which doesn't generate a terminal CQE. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Somnath Kotur	9b40183c08	RDMA/bnxt_re: Add memory barriers when processing CQ/EQ entries The code determines if the next ring entry is valid before proceeding further to read the rest of the entry. The CPU can re-order and read the rest of the entry first, possibly reading a stale entry, if DMA of a new entry happens right after reading it. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Dennis Dalessandro	685894dd9b	IB/hfi1: Handle initial value of 0 for CCTI setting When the driver is loaded it sets the default CCTI value to be 0. When the FM starts and CCA is disabled the driver sets the max value to 65535 due the driver subtracting 1 from 0 and the fact that the CCTI value is a u16. Special case the subtraction to find the index for a 0 value. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Don Hiatt	b64581adba	IB/hfi1: Mask upper 16Bits of Extended LID prior to rvt_cq_entry Pass only the lower 16Bits of an Extended LIDs to rvt_cq_entry to avoid triggering a WARN_ON_ONCE during conversion there. These upper 16Bits are okay to drop as they are obtained elsewhere. Fixes: `62ede77799` ("Add OPA extended LID support") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Don Hiatt	19b57c6c44	IB/core: Convert OPA AH to IB for Extended LIDs only When deciding to convert an OPA AH to IB we were incorrectly including the IB multicast range. At this layer, all Extended LIDs will be larger than IB_LID_PERMISSIVE. Change comparison accordingly. Fixes: `d541e45500` ("IB/core: Convert ah_attr from OPA to IB when copying to user") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Jan Sokolowski	e8d5aff650	IB/hfi1: Send 'reboot' as planned down remote reason On host shutdown, driver sends 'SMA_Disabled' as a reason for link down. This is incorrect. Send 'reboot' as a linkdown reason. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Grzegorz Morys	a276672ed7	IB/hfi1: Prohibit invalid Init to Armed state transition It is invalid to change Link state from Init to Armed if IsSmConfigurationStarted bit is not set in Attribute modifier for Set subnet management method in case of PortInfo and PortStateInfo attribute. Set response MAD status field bits accordingly to react correctly in such situations and avoid changing Link state. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Morys <grzegorz.morys@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Niranjana Vishwanathapura	cc9a97ea2c	IB/hfi1: Do not allocate PIO send contexts for VNIC OPA VNIC does not use PIO contexts and instead only uses SDMA engines. Do not allocate PIO contexts for VNIC ports. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Jan Sokolowski	e4c397eed9	IB/hfi1: Remove unnecessary if check A for loop condition of data_iovs in user_sdma_free_request is unnecessarily repeated before the loop as an if check. Remove the if enveloping the loop. Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Mike Marciniszyn	d61ea0751a	IB/hfi1: Fix a wrapping test to insure the correct timeout The "2 * UINT_MAX" statement: if ((u64)(ts - cce->timestamp) > 2 * UINT_MAX) { is equivalent to: if ((u64)(ts - cce->timestamp) > UINT_MAX - 1) { This results in a premature timeout of the cong log entry. Fix by using unsigned 64 bit integers, removing casts, and using an algebraic equivalent test to avoid the "2 * UINT_MAX" issue. Also make use of kernel API to get nanoseconds instead of open coding. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:57 -05:00
Kamenee Arumugam	0e31a2e195	IB/hfi1: Remove wrapper function in mmu_rb Wrapper functions were used to call the same function mmu_notifier_mem_invalidate for 2 callbacks in mmu_notifier. The commit `7def96f0a9` ("IB/hfi1: update to new mmu_notifier semantic") removed the invalidate_page callback. Therefore, the wrapper function is no longer needed. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:56 -05:00
Jakub Byczkowski	22a3ffa780	IB/hfi1: Reduce 8051 command timeout Timeout of 20 seconds is too long for active wait performed for 8051 command completion. It was required for scenarios when transition to polling was requested before offline.quiet state was reached. Currently wait for offline.quiet is properly implemented and timeout can be reduced to 1 second. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Duane McCrory <duane.mccrory@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:56 -05:00
Jan Sokolowski	641f348bbd	IB/hfi1: Allow MgmtAllowed on B2B setups HFI's are hard-wired to send Device Info frames with MgmtAllowed bit set to 0. This means in B2B setups, MgmtAllowed would never be allowed, which prevents remote opa management tools from working properly. Assume MgmtAllowed if a neighbor is also an HFI. Fixes: `98b9ee2002` ("IB/hfi1: Cache neighbor secure data after link up") Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:53:56 -05:00
Bart Van Assche	57b0c46054	IB/srpt: Ensure that modifying the use_srq configfs attribute works The use_srq configfs attribute is created after it is read. Hence modify srpt_tpg_attrib_use_srq_store() such that this function switches dynamically between non-SRQ and SRQ mode. Reported-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:25:16 -05:00
Bart Van Assche	8b6dc529ea	IB/srpt: Wait until channel release has finished during module unload Introduce the helper function srpt_set_enabled(). Protect sport->enabled changes with sdev->mutex. Makes configfs writes into 'enabled' wait until all channel resources have been freed. Wait until channel release has finished during kernel module unload. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:25:16 -05:00
Bart Van Assche	01b3ee13c2	IB/srpt: Introduce srpt_disconnect_ch_sync() Except for changing a BUG_ON() call into a WARN_ON_ONCE() call, this patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:25:16 -05:00
Bart Van Assche	c76d7d64f8	IB/srpt: Introduce helper functions for SRQ allocation and freeing The only functional change in this patch is in the srpt_add_one() error path: if allocating the ring buffer for the SRQ fails, fall back to non-SRQ mode instead of disabling SRP target functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:25:16 -05:00
Mike Marciniszyn	321e329b9c	IB/srpt: Post receive work requests after qp transition to INIT state IB and iWARP specs both spell out that posting a receive work request to a queue pair in the RESET state is an invalid operation and required to fail. Postpone posting receive work requests until after the transition to the INIT state. Fixes: commit `dea262094c` ("IB/srpt: Change default behavior from using SRQ to using RC") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:25:16 -05:00
Steve Wise	ba97b74997	iw_cxgb4: remove BUG_ON() usage. iw_cxgb4 has many BUG_ON()s that were left over from various enhancemnets made over the years. Almost all of them should just be removed. Some, however indicate a ULP usage error and can be handled w/o bringing down the system. If the condition cannot happen with correctly implemented cxgb4 sw/fw, then remove the BUG_ON. If the condition indicates a misbehaving ULP (like CQ overflows), add proper recovery logic. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:01:25 -05:00
Sriharsha Basavapatna	063fb5bd1a	bnxt_re: changing the ip address shouldn't affect new connections While adding a new gid, the driver currently does not return the context back to the stack. A subsequent del_gid() (e.g, when ip address is changed) doesn't find the right context in the driver and it ends up dropping that request. This results in the HW caching a stale gid entry and traffic fails because of that. Fix by returning the proper context in bnxt_re_add_gid(). Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:01:25 -05:00
Sriharsha Basavapatna	d6d5c59905	bnxt_re: fix a crash in qp error event processing In bnxt_qplib_process_qp_event(), for qp error events we look up the qp-handle and pass it for further processing. But we don't check if the handle is NULL. This could lead to a crash in the called functions when that qp-handle is dereferenced, if the qp is destroyed in the meantime. Fix this by checking for a valid qp-handle in that function. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 15:01:25 -05:00
Noa Osherovich	f17966f195	IB/mlx5: Fix ABI alignment to 64 bit Struct mlx5_ib_striding_rq_caps was not aligned to 64 bit as it should have been. Add a 32 bit reserved field. Fixes: `b4f34597a5` ('IB/mlx5: Expose multi-packet RQ capabilities') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 14:42:05 -05:00
Parav Pandit	2e4c85c6ed	IB/core: Avoid unnecessary return value check Since there is nothing done with non zero return value, such check is avoided. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 14:42:04 -05:00
Mark Bloch	5f22a1d87c	IB/mlx4: Increase maximal message size under UD QP Maximal message should be used as a limit to the max message payload allowed, without the headers. The ConnectX-3 check is done against this value includes the headers. When the payload is 4K this will cause the NIC to drop packets. Increase maximal message to 8K as workaround, this shouldn't change current behaviour because we continue to set the MTU to 4k. To reproduce; set MTU to 4296 on the corresponding interface, for example: ifconfig eth0 mtu 4296 (both server and client) On server: ib_send_bw -c UD -d mlx4_0 -s 4096 -n 1000000 -i1 -m 4096 On client: ib_send_bw -d mlx4_0 -c UD <server_ip> -s 4096 -n 1000000 -i 1 -m 4096 Fixes: `6e0d733d92` ("IB/mlx4: Allow 4K messages for UD QPs") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 14:42:04 -05:00
Guy Levi	ed8637d361	IB/mlx4: Add contig support for control objects Taking advantage of the optimization which was introduced in previous commit ("IB/mlx4: Use optimal numbers of MTT entries") to optimize the MTT usage for QP and CQ. Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 14:42:04 -05:00
Guy Levi	9901abf583	IB/mlx4: Use optimal numbers of MTT entries Optimize the device performance by assigning multiple physical pages, which are contiguous, to a single MTT. As a result, the number of MTTs is reduced and in turn save cache misses of MTTs. Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 14:42:04 -05:00
Majd Dibbiny	2b621851ac	IB/mlx5: Fix RoCE Address Path fields When working over a RoCE network, the UDP source port should be set only for statically connected QPs (RC, UC and XRC). Fixes: `2811ba51b0` ("IB/mlx5: Add RoCE fields to Address Vector") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 13:53:22 -05:00
Majd Dibbiny	31fde034a8	IB/mlx5: Assign send CQ and recv CQ of UMR QP The UMR's QP is created by calling mlx5_ib_create_qp directly, and therefore the send CQ and the recv CQ on the ibqp weren't assigned. Assign them right after calling the mlx5_ib_create_qp to assure that any access to those pointers will work as expected and won't crash the system as might happen as part of reset flow. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 13:53:22 -05:00
Leon Romanovsky	9950acf945	RDMA/cxgb4: Protect from possible dereference Smatch tool reports the following error: drivers/infiniband/hw/cxgb4/qp.c:1886 c4iw_create_qp() error: we previously assumed 'ucontext' could be null (see line 1804) Cc: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 13:53:22 -05:00
Leon Romanovsky	e32d2d7144	RDMA/bnxt_re: Remove unused vlan_tag variable The Broadcom driver produces the following compilation warning drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function ‘bnxt_re_create_ah’: drivers/infiniband/hw/bnxt_re/ib_verbs.c:668:6: warning: variable ‘vlan_tag’ set but not used [-Wunused-but-set-variable] u16 vlan_tag; Let's remove it till vlan_tag will be implemented properly. Cc: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-13 13:53:22 -05:00
Noa Osherovich	b1383aa641	IB/mlx5: Add PCI write end padding support Add the PCI write end padding flag to device_cap_flags enum and set it during mlx5_ib_query_device so it will be reported to user-space. During WQ/QP creation, set that capability for WQ/QP if user requested it and HW supports it. PCI write end padding modification is not supported for now. There's no such flag for a QP but for a WQ, create and modify use the same flag. Return an error if PCI write end padding flag is set during modify_wq. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-10 13:50:27 -05:00
Noa Osherovich	e1d2e88733	IB/core: Add PCI write end padding flags for WQ and QP There are root complexes that are able to optimize their performance when incoming data is multiple full cache lines. PCI write end padding is the device's ability to pad the ending of incoming packets (scatter) to full cache line such that the last upstream write generated by an incoming packet will be a full cache line. Add a relevant entry to ib_device_cap_flags to report such capability of an RDMA device. Add the QP and WQ create flags: * A QP/WQ created with a scatter end padding flag will cause HW to pad the last upstream write generated by a packet to cache line. User should consider several factors before activating this feature: - In case of high CPU memory load (which may cause PCI back pressure in turn), if a large percent of the writes are partial cache line, this feature should be checked as an optional solution. - This feature might reduce performance if most packets are between one and two cache lines and PCIe throughput has reached its maximum capacity. E.g. 65B packet from the network port will lead to 128B write on PCIe, which may cause traffic on PCIe to reach high throughput. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-10 13:50:27 -05:00
Thomas Bogendoerfer	3192c53e5a	IB/rxe: don't crash, if allocation of crc algorithm failed Following crash happens, if crc algorithm couldn't be allocated: [ 1087.989072] rdma_rxe: loaded [ 1097.855397] PCLMULQDQ-NI instructions are not detected. [ 1097.901220] rdma_rxe: failed to allocate crc algorithmi err:-2 [ 1097.901248] BUG: unable to handle kernel [ 1097.901249] NULL pointer dereference [ 1097.901250] at 0000000000000046 [...] Reason is that rxe->tfm is assigned the error return, which will then be used for crypto_free_shash() in rxe_cleanup. Fix by using a temporary variable and assigning it rxe->tfm after allocation succeeded. Fixes: `cee2688e3c` ("IB/rxe: Offload CRC calculation when possible") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-10 13:43:50 -05:00
Parav Pandit	89548bcafe	IB/core: Avoid crash on pkey enforcement failed in received MADs Below kernel crash is observed when Pkey security enforcement fails on received MADs. This issue is reported in [1]. ib_free_recv_mad() accesses the rmpp_list, whose initialization is needed before accessing it. When security enformcent fails on received MADs, MAD processing avoided due to security checks failed. OpenSM[3770]: SM port is down kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 kernel: IP: ib_free_recv_mad+0x44/0xa0 [ib_core] kernel: PGD 0 kernel: P4D 0 kernel: kernel: Oops: 0002 [#1] SMP kernel: CPU: 0 PID: 2833 Comm: kworker/0:1H Tainted: P IO 4.13.4-1-pve #1 kernel: Hardware name: Dell XS23-TY3 /9CMP63, BIOS 1.71 09/17/2013 kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] kernel: task: ffffa069c6541600 task.stack: ffffb9a729054000 kernel: RIP: 0010:ib_free_recv_mad+0x44/0xa0 [ib_core] kernel: RSP: 0018:ffffb9a729057d38 EFLAGS: 00010286 kernel: RAX: ffffa069cb138a48 RBX: ffffa069cb138a10 RCX: 0000000000000000 kernel: RDX: ffffb9a729057d38 RSI: 0000000000000000 RDI: ffffa069cb138a20 kernel: RBP: ffffb9a729057d60 R08: ffffa072d2d49800 R09: ffffa069cb138ae0 kernel: R10: ffffa069cb138ae0 R11: ffffa072b3994e00 R12: ffffb9a729057d38 kernel: R13: ffffa069d1c90000 R14: 0000000000000000 R15: ffffa069d1c90880 kernel: FS: 0000000000000000(0000) GS:ffffa069dba00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000000000008 CR3: 00000011f51f2000 CR4: 00000000000006f0 kernel: Call Trace: kernel: ib_mad_recv_done+0x5cc/0xb50 [ib_core] kernel: __ib_process_cq+0x5c/0xb0 [ib_core] kernel: ib_cq_poll_work+0x20/0x60 [ib_core] kernel: process_one_work+0x1e9/0x410 kernel: worker_thread+0x4b/0x410 kernel: kthread+0x109/0x140 kernel: ? process_one_work+0x410/0x410 kernel: ? kthread_create_on_node+0x70/0x70 kernel: ? SyS_exit_group+0x14/0x20 kernel: ret_from_fork+0x25/0x30 kernel: RIP: ib_free_recv_mad+0x44/0xa0 [ib_core] RSP: ffffb9a729057d38 kernel: CR2: 0000000000000008 [1] : https://www.spinics.net/lists/linux-rdma/msg56190.html Fixes: `47a2b338fe` ("IB/core: Enforce security on management datagrams") Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reported-by: Chris Blake <chrisrblake93@gmail.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-11-10 13:26:00 -05:00

1 2 3 4 5 ...

707405 Commits All Branches Search

707405 Commits

All Branches