OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
James Smart	c490850a09	scsi: lpfc: Adapt partitioned XRI lists to efficient sharing The XRI get/put lists were partitioned per hardware queue. However, the adapter rarely had sufficient resources to give a large number of resources per queue. As such, it became common for a cpu to encounter a lack of XRI resource and request the upper io stack to retry after returning a BUSY condition. This occurred even though other cpus were idle and not using their resources. Create as efficient a scheme as possible to move resources to the cpus that need them. Each cpu maintains a small private pool which it allocates from for io. There is a watermark that the cpu attempts to keep in the private pool. The private pool, when empty, pulls from a global pool from the cpu. When the cpu's global pool is empty it will pull from other cpu's global pool. As there many cpu global pools (1 per cpu or hardware queue count) and as each cpu selects what cpu to pull from at different rates and at different times, it creates a radomizing effect that minimizes the number of cpu's that will contend with each other when the steal XRI's from another cpu's global pool. On io completion, a cpu will push the XRI back on to its private pool. A watermark level is maintained for the private pool such that when it is exceeded it will move XRI's to the CPU global pool so that other cpu's may allocate them. On NVME, as heartbeat commands are critical to get placed on the wire, a single expedite pool is maintained. When a heartbeat is to be sent, it will allocate an XRI from the expedite pool rather than the normal cpu private/global pools. On any io completion, if a reduction in the expedite pools is seen, it will be replenished before the XRI is placed on the cpu private pool. Statistics are added to aid understanding the XRI levels on each cpu and their behaviors. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	ace44e48b1	scsi: lpfc: Synchronize hardware queues with SCSI MQ interface Now that the lower half has much better per-cpu parallelization using the hardware queues, the SCSI MQ support needs to be tied into it. The involves the following mods: - Use the hardware queue info from the midlayer to help select the hardware queue to utilize. This required change to the get_scsi-buf_xxx routines. - Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed. - Includes fix for SLI-3 that does not have multi queue parallelization. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	1fbf974250	scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting. SLI4 nvme functions are passing the SLI3 ring number when posting wqe to hardware. This should be indicating the hardware queue to use, not the ring number. Replace ring number with the hardware queue that should be used. Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb routine that properly adapts. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	4c47efc140	scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures Many io statistics were being sampled and saved using adapter-based data structures. This was creating a lot of contention and cache thrashing in the I/O path. Move the statistics to the hardware queue data structures. Given the per-queue data structures, use of atomic types is lessened. Add new sysfs and debugfs stat routines to collate the per hardware queue values and report at an adapter level. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:08 -05:00
James Smart	5e5b511d8b	scsi: lpfc: Partition XRI buffer list across Hardware Queues Once the IO buff allocations were made shared, there was a single XRI buffer list shared by all hardware queues. A single list isn't great for performance when shared across the per-cpu hardware queues. Create a separate XRI IO buffer get/put list for each Hardware Queue. As SGLs and associated IO buffers get allocated/posted to the firmware; round robin their assignment across all available hardware Queues so that there is an equitable assignment. Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic for XRI allocation. Add a debugfs interface to display hardware queue statistics Added new empty_io_bufs counter to track if a cpu runs out of XRIs. Replace common_ variables/names with io_ to make meanings clearer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:24:22 -05:00
James Smart	cdb42becdd	scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu Currently, both nvme and fcp each have their own concept of an io_channel, which is a combination wq/cq and associated msix. Different cpus would share an io_channel. The driver is now moving to per-cpu wq/cq pairs and msix vectors. The driver will still use separate wq/cq pairs per protocol on each cpu, but the protocols will share the msix vector. Given the elimination of the nvme and fcp io channels, the module parameters will be removed. A new parameter, lpfc_hdw_queue is added which allows the wq/cq pair allocation per cpu to be overridden and allocated to lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will be based on the number of cpus. If non-zero, the parameter specifies the number of queues to allocate. At this time, the maximum non-zero value is 64. To manage this new paradigm, a new hardware queue structure is created to track queue activity and relationships. As MSIX vector allocation must be known before setting up the relationships, msix allocation now occurs before queue datastructures are allocated. If the number of vectors allocated is less than the desired hardware queues, the hardware queue counts will be reduced to the number of vectors Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
James Smart	7370d10ac9	scsi: lpfc: Remove extra vector and SLI4 queue for Expresslane There is a extra queue and msix vector for expresslane. Now that the driver will be doing queues per cpu, this oddball queue is no longer needed. Expresslane will utilize the normal per-cpu queues. Updated debugfs sli4 queue output to go along with the change Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
James Smart	0794d601d1	scsi: lpfc: Implement common IO buffers between NVME and SCSI Currently, both NVME and SCSI get their IO buffers from separate pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split between protocols. Eliminate the independent pools and use a single pool. Each buffer structure now has a common section and a protocol section. Per protocol routines for SGL initialization are removed and replaced by common routines. Initialization of the buffers is only done on the common area. All other fields, which are protocol specific, are initialized when the buffer is allocated for use in the per-protocol allocation routine. In the past, the SCSI side allocated IO buffers as part of slave_alloc calls until the maximum XRIs for SCSI was reached. As all XRIs are now common and may be used for either protocol, allocation for everything is done as part of adapter initialization and the scsi side has no action in slave alloc. As XRI's are no longer split, the lpfc_xri_split module parameter is removed. Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put routines. All SLI4 adapters utilize the new IO buffer scheme Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
Linus Torvalds	938edb8a31	SCSI misc on 20181224 This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj u1j3H7tha9j1it+4pUw= =GDa+ -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits) scsi: isci: request: mark expected switch fall-through scsi: isci: remote_node_context: mark expected switch fall-throughs scsi: isci: remote_device: Mark expected switch fall-throughs scsi: isci: phy: Mark expected switch fall-through scsi: iscsi: Capture iscsi debug messages using tracepoints scsi: myrb: Mark expected switch fall-throughs scsi: megaraid: fix out-of-bound array accesses scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through scsi: fcoe: remove set but not used variable 'port' scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() scsi: smartpqi: fix build warnings scsi: smartpqi: update driver version scsi: smartpqi: add ofa support scsi: smartpqi: increase fw status register read timeout scsi: smartpqi: bump driver version scsi: smartpqi: add smp_utils support scsi: smartpqi: correct lun reset issues scsi: smartpqi: correct volume status scsi: smartpqi: do not offline disks for transient did no connect conditions scsi: smartpqi: allow for larger raid maps ...	2018-12-28 14:48:06 -08:00
James Smart	5021267af1	scsi: lpfc: Adding ability to reset chip via pci bus reset This patch adds a "pci_bus_reset" option to the board_mode sysfs attribute. This option uses the pci_reset_bus() api to reset the PCIe link the adapter is on, which will reset the chip/adapter. Prior to issuing this option, all functions on the same chip must be placed in the offline state by the admin. After the reset, all of the instances may be brought online again. The primary purpose of this functionality is to support cases where firmware update required a chip reset but the admin did not want to reboot the machine in order to instantiate the firmware update. Sanity checks take place prior to the reset to ensure the adapter is the sole entity on the PCIe bus and that all functions are in the offline state. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:08 -05:00
Christoph Hellwig	2a3d4eb8e2	scsi: flip the default on use_clustering Most SCSI drivers want to enable "clustering", that is merging of segments so that they might span more than a single page. Remove the ENABLE_CLUSTERING define, and require drivers to explicitly set DISABLE_CLUSTERING to disable this feature. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-18 23:13:12 -05:00
James Smart	7c4042a4d0	scsi: lpfc: Fix dif and first burst use in write commands When dif and first burst is used in a write command wqe, the driver was not properly setting fields in the io command request. This resulted in no dif bytes being sent and invalid xfer_rdy's, resulting in the io being aborted by the hardware. Correct the wqe initializaton when both dif and first burst are used. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:33 -05:00
James Smart	2c4c91415a	scsi: lpfc: Fix a duplicate 0711 log message number. Renumber one of the 0711 log messages so there isn't a duplication. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
Jens Axboe	f664a3cc17	scsi: kill off the legacy IO path This removes the legacy (non-mq) IO path for SCSI. Cc: linux-scsi@vger.kernel.org Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-11-07 13:42:32 -07:00
James Smart	18027a8ccc	scsi: lpfc: reduce locking when updating statistics Currently, on each io completion, the stats update routine indiscriminately holds a lock. While holding the adapter-wide lock, checks are made to check whether status are being tracked. When disabled (the default), the locking wasted a lot of cycles. Check for stats enablement before taking the lock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	ca7fb76e09	scsi: lpfc: Correct race with abort on completion path On io completion, the driver is taking an adapter wide lock and nulling the scsi command back pointer. The nulling of the back pointer is to signify the io was completed and the scsi_done() routine was called. However, the routine makes no check to see if the abort routine had done the same thing and possibly nulled the pointer. Thus it may doubly-complete the io. Make the following mods: - Check to make sure forward progress (call scsi_done()) only happens if the command pointer was non-null. - As the taking of the lock, which is adapter wide, is very costly on a system under load, null the pointer using an xchg operation rather than under lock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
Dan Carpenter	26c724a690	scsi: lpfc: remove an unnecessary NULL check Smatch complains about this code: drivers/scsi/lpfc/lpfc_scsi.c:1053 lpfc_get_scsi_buf_s4() warn: variable dereferenced before check 'lpfc_cmd' (see line 1039) Fortunately the NULL check isn't required so I have removed it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-30 07:33:55 -04:00
James Smart	2a5b7d626e	scsi: lpfc: Limit tracking of tgt queue depth in fast path Performance is affected when target queue depth is tracked. An atomic counter is incremented on the submission path which competes with it being decremented on the completion path. In addition, multiple CPUs can simultaniously be manipulating this counter for the same ndlp. Reduce the overhead by only performing the target increment/decrement when the target queue depth is less than the overall adapter depth, thus is actually meaningful. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	8931c73bee	scsi: lpfc: Fix list corruption on the completion queue. Enabling list_debug showed the drivers txcmplq was suffering list corruption. The systems will eventually crash because the iocb free list gets crossed linked with the prings txcmplq. Most systems will run for a while after the corruption, but will eventually crash when a scsi eh reset occurs and the txcmplq is attempted to be flushed. The flush gets stuck in an endless loop. The problem is the abort handler does not hold the sli4 ring lock while validating the IO so the IO could complete while the driver is still preping the abort. The erroneously generated abort, when it completes, has pointers to the original IO that has already completed, and the IO manipulation (for the second time) corrupts the list. Correct by taking the ring lock early in the abort handler so the erroneous abort won't be sent if the io has/is completing. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:18 -04:00
Johannes Thumshirn	c6668cae16	scsi: lpfc: remove ScsiResult macro Remove the ScsiResult macro and open code it on all call sites. This will make subsequent refactoring in this area easier. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:42:47 -04:00
James Smart	b0e830125b	scsi: lpfc: devloss timeout race condition caused null pointer reference A race condition between the context of devloss timeout handler and I/O completion caused devloss timeout handler de-referencing pointer that had been released. Added the check in lpfc_sli_validate_fcp_iocb() on LPFC_IO_ON_TXCMPLQ to capture the race condition of I/O completion and devloss timeout handler attemption for aborting the I/O. Also, added check on lpfc_cmd->rdata pointer before de-referenceing lpfc_cmd->rdata->pnode. Also, added protection in lpfc_sli_abort_iocb() routine on driver performed FCP I/O FLUSHING already under way before proceeding to aborting I/Os. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	414abe0ab6	scsi: lpfc: Make PBDE optimizations configurable The PBDE optimizations aren't supported in all firmware revs. Make optimizations configurable in case there's a side effect on old firmware. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	3e21d1cb0f	scsi: lpfc: Comment cleanup regarding Broadcom copyright header Fix small formatting and wording nits in Broadcom copyright header Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	59c68eaad7	scsi: lpfc: Fix Abort request WQ selection When running loads that generated aborts, io errors where seen. Turns out the abort requests where not placed on the proper WQ resulting in the errors. Closer inspection inspection of this error also showed improper spinlock api use. Correct the WQ selection policy for the abort requests. Correct spin_lock/spin_lock_irq/spin_lock_irqsave usage. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:02 -04:00
James Smart	f91bc594ba	scsi: lpfc: Correct target queue depth application changes The max_scsicmpl_time parameter can be used to perform scsi cmd queue depth mgmt based on io completion time: the queue depth is reduced to make completion time shorter. However, as soon as an io completes and the completion time is within limits, the code immediately bumps the queue depth limit back up to the target queue depth. Thus the procedure restarts, effectively limiting the usefulness of adjusting queue depth to help completion time. This patch makes the following changes: - Removes the code at io completion that resets the queue depth as soon as within limits. - As the code removed was where the target queue depth was first applied, change target queue depth application so that it occurs when the parameter is changed. - Makes target queue depth a standard parameter: both a module parameter and a sysfs parameter. - Optimizes the command pending count by using atomics rather than locks. - Updates the debugfs nodelist stats to allow better debugging of pending command counts. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:01 -04:00
James Smart	f44ac12f1d	scsi: lpfc: Memory allocation error during driver start-up on power8 The driver fails to allocate command buffers in the routine lpfc_new_scsi_buf_s4 There is an inconsistency between lpfc_mem_alloc(), where the phba->lpfc_sg_dma_buf_pool is created, and lpfc_new_scsi_buf_s4(), when we allocate a buffer from the pool and check the alignment. The alignment should be on a page boundary, based on LPFC_SLI3_BG_ENABLED in sli3_options, for both cases. Fix by explicitly tracking sli4 vs sli3 and BG options. The result is that phba->cfg_sg_dma_buf_size is now set correctly for SLI-4. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-03-12 21:55:24 -04:00
James Smart	0bc2b7c531	scsi: lpfc: Add embedded data pointers for enhanced performance The current driver isn't taking advantage of a performance hint whereby the initial data buffer descriptor can be placed in the WQE as well as the SGL. Add the logic to detect support for the feature and to use it when supported. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-02-22 20:39:29 -05:00
James Smart	128bddacc4	scsi: lpfc: Update 11.4.0.7 modified files for 2018 Copyright Updated Copyright in files updated 11.4.0.7 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-02-12 11:43:24 -05:00
James Smart	45634a86ca	scsi: lpfc: Treat SCSI Write operation Underruns as an error Currently, write underruns (mismatch of amount transferred vs scsi status and its residual) detected by the adapter are not being flagged as an error. Its expected the target controls the data transfer and would appropriately set the RSP values. Only read underruns are treated as errors. Revise the SCSI error handling to treat write underruns as an error as well. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-02-12 11:43:24 -05:00
James Smart	64bf009933	scsi: lpfc: Allow set of maximum outstanding SCSI cmd limit for a target Make the attribute writeable. Remove the ramp up to logic as its unnecessary, simply set depth. Add debug message if depth changed, possibly reducing limit, yet our outstanding count has yet to catch up with it. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-02-12 11:43:23 -05:00
Kees Cook	f22eb4d31c	scsi: lpfc: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-01 11:27:07 -07:00
Romain Perier	771db5c0e3	scsi: lpfc: Replace PCI pool old API The PCI pool API is deprecated. This commit replaces the PCI pool old API by the appropriate function with the DMA pool API. It also updates some comments, accordingly. Signed-off-by: Romain Perier <romain.perier@collabora.com> Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:01 -04:00
Linus Torvalds	130568d5ea	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull more block updates from Jens Axboe: "This is a followup for block changes, that didn't make the initial pull request. It's a bit of a mixed bag, this contains: - A followup pull request from Sagi for NVMe. Outside of fixups for NVMe, it also includes a series for ensuring that we properly quiesce hardware queues when browsing live tags. - Set of integrity fixes from Dmitry (mostly), fixing various issues for folks using DIF/DIX. - Fix for a bug introduced in cciss, with the req init changes. From Christoph. - Fix for a bug in BFQ, from Paolo. - Two followup fixes for lightnvm/pblk from Javier. - Depth fix from Ming for blk-mq-sched. - Also from Ming, performance fix for mtip32xx that was introduced with the dynamic initialization of commands" * 'for-linus' of git://git.kernel.dk/linux-block: (44 commits) block: call bio_uninit in bio_endio nvmet: avoid unneeded assignment of submit_bio return value nvme-pci: add module parameter for io queue depth nvme-pci: compile warnings in nvme_alloc_host_mem() nvmet_fc: Accept variable pad lengths on Create Association LS nvme_fc/nvmet_fc: revise Create Association descriptor length lightnvm: pblk: remove unnecessary checks lightnvm: pblk: control I/O flow also on tear down cciss: initialize struct scsi_req null_blk: fix error flow for shared tags during module_init block: Fix __blkdev_issue_zeroout loop nvme-rdma: unconditionally recycle the request mr nvme: split nvme_uninit_ctrl into stop and uninit virtio_blk: quiesce/unquiesce live IO when entering PM states mtip32xx: quiesce request queues to make sure no submissions are inflight nbd: quiesce request queues to make sure no submissions are inflight nvme: kick requeue list when requeueing a request instead of when starting the queues nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues ...	2017-07-11 15:36:52 -07:00
Dmitry Monakhov	128b6f9fdd	t10-pi: Move opencoded contants to common header Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-07-03 16:56:25 -06:00
James Smart	2cee780800	scsi: lpfc: Fix counters so outstandng NVME IO count is accurate NVME FC counters don't reflect actual results Since counters are not atomic, or protected by a lock, the values often get screwed up. Make them atomic, like NVMET. Fix up sysfs and debugfs display accordingly Added Outstanding IOs to stats display Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 21:37:31 -04:00
James Smart	856984b71c	scsi: lpfc: add transport eh_timed_out reference Christoph's prior patch missed the template for the sli3 adapters, which is now the "no host reset" template. Add the transport eh_timed_out handler to the no host reset template Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-03-06 23:04:22 -05:00
James Smart	96418b5e2c	scsi: lpfc: Fix eh_deadline setting for sli3 adapters. A previous change unilaterally removed the hba reset entry point from the sli3 host template. This was done to allow tape devices being used for back up from being removed. Why was this done ? When there was non-responding device on the fabric, the error escalation policy would escalate to the reset handler. When the reset handler was called, it would reset the adapter, dropping link, thus logging out and terminating all i/o's - on any target. If there was a tape device on the same adapter that wasn't in error, it would kill the tape i/o's, effectively killing the tape device state. With the reset point removed, the adapter reset avoided the fabric logout, allowing the other devices to continue to operate unaffected. A hack - yes. Hint: we really need a transport I_T nexus reset callback added to the eh process (in between the SCSI target reset and hba reset points), so a fc logout could occur to the one bad target only and stop the error escalation process. This patch commonizes the approach so it can be used for sli3 and sli4 adapters, but mandates the admin, via module parameter, specifically identify which adapters the resets are to be removed for. Additionally, bus_reset, which sends Target Reset TMFs to all targets, is also removed from the template as it too has the same effect as the adapter reset. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-03-06 23:04:22 -05:00
James Smart	d080abe0a8	scsi: lpfc: Update copyrights Update copyrights to 2017 for all files touched in this patch set Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:44 -05:00
James Smart	a0f2d3ef37	scsi: lpfc: NVME Initiator: Merge into FC discovery NVME Initiator: Merge into FC discovery Adds NVME PRLI support and Nameserver registrations and Queries for NVME Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:43 -05:00
James Smart	895427bd01	scsi: lpfc: NVME Initiator: Base modifications NVME Initiator: Base modifications This patch adds base modifications for NVME initiator support. The base modifications consist of: - Formal split of SLI3 rings from SLI-4 WQs (sometimes referred to as rings as well) as implementation now widely varies between the two. - Addition of configuration modes: SCSI initiator only; NVME initiator only; NVME target only; and SCSI and NVME initiator. The configuration mode drives overall adapter configuration, offloads enabled, and resource splits. NVME support is only available on SLI-4 devices and newer fw. - Implements the following based on configuration mode: - Exchange resources are split by protocol; Obviously, if only 1 mode, then no split occurs. Default is 50/50. module attribute allows tuning. - Pools and config parameters are separated per-protocol - Each protocol has it's own set of queues, but share interrupt vectors. SCSI: SLI3 devices have few queues and the original style of queue allocation remains. SLI4 devices piggy back on an "io-channel" concept that eventually needs to merge with scsi-mq/blk-mq support (it is underway). For now, the paradigm continues as it existed prior. io channel allocates N msix and N WQs (N=4 default) and either round robins or uses cpu # modulo N for scheduling. A bunch of module parameters allow the configuration to be tuned. NVME (initiator): Allocates an msix per cpu (or whatever pci_alloc_irq_vectors gets) Allocates a WQ per cpu, and maps the WQs to msix on a WQ # modulo msix vector count basis. Module parameters exist to cap/control the config if desired. - Each protocol has its own buffer and dma pools. I apologize for the size of the patch. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> ---- Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:43 -05:00
James Smart	2ea259eead	scsi: lpfc: minor code cleanups This contains code cleanups that were in the prior patch set. This allows better review of real changes later. minor code cleanups: fix indentation, punctuation, line length addition/reduction of whitespace remove unneeded parens, braces lpfc_debugfs_nodelist_data: print as u64 rather than byte by byte covert printk(KERN_ERR to pr_err small print string deltas use num_present_cpus() rather than count them comment updates rctl/type names moved to module variable, not on stack Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Christoph Hellwig	b6a05c823f	scsi: remove eh_timed_out methods in the transport template Instead define the timeout behavior purely based on the host_template eh_timed_out method and wire up the existing transport implementations in the host templates. This also clears up the confusion that the transport template method overrides the host template one, so some drivers have to re-override the transport template one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-06 19:10:03 -05:00
James Smart	b5749fe182	scsi: lpfc: Fix Xlane dynamic LUN set for LUN priority. Fix Xlane dynamic LUN set for LUN priority. Dynamic changing of the priority was not getting reflected on the LUN. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-01-05 00:21:12 -05:00
Souptick Joarder	b4b22a012e	scsi: lpfc: Replace pci_pool_alloc by pci_pool_zalloc In lpfc_new_scsi_buf_s3() and lpfc_new_scsi_buf_s4() pci_pool_alloc followed by memset will be replaced by pci_pool_zalloc() Signed-off-by: Souptick joarder <jrdr.linux@gmail.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-11-30 11:40:20 -05:00
James Smart	89533e9be0	scsi: lpfc: Correct panics with eh_timeout and eh_deadline Correct panics with eh_timeout and eh_deadline We were having double completions on our SLI-3 version of adapters. Solved by clearing our command pointer before calling scsi_done. The eh paths potentially ran simulatenously and would see the non-null value and invoke scsi_done again. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-11-08 17:29:50 -05:00
James Smart	eed695d70e	scsi: lpfc: Fix sg_reset on SCSI device causing kernel crash Fix sg_reset on SCSI device causing kernel crash Driver could reference stale node pointers in task mgmt call. Changed to use resetting cmd and look up node pointer in task mgmt function. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-11-08 17:29:49 -05:00
Milan P. Gandhi	4b160ae8a3	scsi: lpfc: Fix few small typos in lpfc_scsi.c This patch does a cleanup and fixes few small typos in lpfc_scsi.c Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-11-08 17:29:49 -05:00
Mauricio Faria de Oliveira	05a05872c8	lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd 'lpfc_cmd->pCmd' not to be null, and point to the midlayer command. That's not true in the .eh_(device\|target\|bus)_reset_handler path, because lpfc_send_taskmgmt() sends commands not from the midlayer, so does not set 'lpfc_cmd->pCmd'. That is true in the .queuecommand path because lpfc_queuecommand() stores the scsi_cmnd from midlayer in lpfc_cmd->pCmd; and lpfc_cmd is stored by lpfc_scsi_prep_cmnd() in piocbq->context1 -- which is passed to lpfc_sli4_scmd_to_wqidx_distr() as lpfc_cmd parameter. This problem can be hit on SCSI EH, and immediately with sg_reset. These 2 test-cases demonstrate the problem/fix with next-20160601. Test-case 1) sg_reset # strace sg_reset --device /dev/sdm <...> open("/dev/sdm", O_RDWR\|O_NONBLOCK) = 3 ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...> +++ killed by SIGSEGV +++ Segmentation fault # dmesg Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xd00000001c88442c Oops: Kernel access of bad area, sig: 11 [#1] <...> CPU: 104 PID: 16333 Comm: sg_reset Tainted: G W 4.7.0-rc1-next-20160601-00004-g95b89dc #6 <...> NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc] LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc] Call Trace: [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable) [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc] [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc] [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc] [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc] [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0 [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0 [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0 [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120 [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70 [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90 [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890 [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0 [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108 Instruction dump: <...> With fix: # strace sg_reset --device /dev/sdm <...> open("/dev/sdm", O_RDWR\|O_NONBLOCK) = 3 ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0 close(3) = 0 exit_group(0) = ? +++ exited with 0 +++ # dmesg [ 424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002 Test-case 2) SCSI EH Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl(): - cmd->scsi_done(cmd); + if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000) + printk(KERN_ALERT "lpfc: skip scsi_done()\n"); + else + cmd->scsi_done(cmd); # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose # dd if=/dev/sdm of=/dev/null iflag=direct & <...> After a while: # dmesg lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000) lpfc: skip scsi_done() <...> Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xd0000000199e448c Oops: Kernel access of bad area, sig: 11 [#1] <...> CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G W 4.7.0-rc1-next-20160601-00004-g95b89dc #6 <...> NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc] LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc] Call Trace: [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable) [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc] [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc] [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc] [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc] [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0 [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0 [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680 [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130 [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4 Instruction dump: <...> With fix: # dmesg lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000) lpfc: skip scsi_done() <...> lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002 <...> lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002 <...> lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002 <...> lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data: <...> Fixes: `8b0dff1416` ("lpfc: Add support for using block multi-queue") Cc: <stable@vger.kernel.org> # v4.2+ Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-07-27 00:31:09 -04:00
James Smart	51f4ca3c93	lpfc: Copyright updates Copyright updates Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-07-15 15:25:06 -04:00
James Smart	c92c841cc7	lpfc: Add support for XLane LUN priority Add support for XLane LUN priority Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-07-15 15:25:06 -04:00

1 2 3 4 5 ...

291 Commits