OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
James Smart	b7b95fb863	scsi: lpfc: Fix miss of register read failure check Coverity flagged missing status check on register read that flags a poisoned data return value. Add checking of register read status. Link: https://lore.kernel.org/r/20190922035906.10977-4-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:07:08 -04:00
James Smart	65a3df63e7	scsi: lpfc: Fix premature re-enabling of interrupts in lpfc_sli_host_down Use of spin_lock_irq may re-enable interrupts prematurely. Convert to spin_lock. Note: code is under the phba->hba_lock which has been locked with irqsave. Link: https://lore.kernel.org/r/20190922035906.10977-3-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:07:08 -04:00
James Smart	359e10f087	scsi: lpfc: Fix pt2pt discovery on SLI3 HBAs After exchanging PLOGI on an SLI-3 adapter, the PRLI exchange failed. Link trace showed the port was assigned a non-zero n_port_id, but didn't use the address on the PRLI. The assigned address is set on the port by the CONFIG_LINK mailbox command. The driver responded to the PRLI before the mailbox command completed. Thus the PRLI response used the old n_port_id. Defer the PRLI response until CONFIG_LINK completes. Link: https://lore.kernel.org/r/20190922035906.10977-2-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:07:08 -04:00
Linus Torvalds	10fd71780f	SCSI misc on 20190919 This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi, lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates. The only core change this time around is the addition of request batching for virtio. Since batching requires an additional flag to use, it should be invisible to the rest of the drivers. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXYQE/yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXs9AP4usPY5 OpMlF6OiKFNeJrCdhCScVghf9uHbc7UA6cP+EgD/bCtRgcDe1ZjOTYWdeTwvwWqA ltWYonnv6Lg3b1f9yqI= =jRC/ -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi, lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates. The only core change this time around is the addition of request batching for virtio. Since batching requires an additional flag to use, it should be invisible to the rest of the drivers" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (264 commits) scsi: hisi_sas: Fix the conflict between device gone and host reset scsi: hisi_sas: Add BIST support for phy loopback scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation scsi: hisi_sas: Remove some unused function arguments scsi: hisi_sas: Remove redundant work declaration scsi: hisi_sas: Remove hisi_sas_hw.slot_complete scsi: hisi_sas: Assign NCQ tag for all NCQ commands scsi: hisi_sas: Update all the registers after suspend and resume scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset() scsi: hisi_sas: add debugfs auto-trigger for internal abort time out scsi: virtio_scsi: unplug LUNs when events missed scsi: scsi_dh_rdac: zero cdb in send_mode_select() scsi: fcoe: fix null-ptr-deref Read in fc_release_transport scsi: ufs-hisi: use devm_platform_ioremap_resource() to simplify code scsi: ufshcd: use devm_platform_ioremap_resource() to simplify code scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code scsi: ufs: Use kmemdup in ufshcd_read_string_desc() ...	2019-09-21 10:50:15 -07:00
James Smart	4fb86a6bc5	scsi: lpfc: Fix reset recovery paths that are not recovering A recent patch unconditionally marks the hba as in error as part of resetting the adapter. The driver flow that called the adapter reset was a recovery path, which expects the adapter to not be in an error state in order to finish the recovery. Given the new error state being set, the recovery fails and the adapter is left in limbo. Revise the adapter reset routine so that it will only mark the adapter in error if it was unable to reset the adapter. Fixes: `8c24a4f643` ("scsi: lpfc: Fix crash due to port reset racing vs adapter error handling") Link: https://lore.kernel.org/r/20190903215441.10490-1-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-07 16:28:15 -04:00
Sakari Ailus	2d44d165e9	scsi: lpfc: Convert existing %pf users to %ps Convert the remaining %pf users to %ps to prepare for the removal of the old %pf conversion specifier support. Fixes: `3235066449` ("scsi: lpfc: Migrate to %px and %pf in kernel print calls") Link: https://lore.kernel.org/r/20190904160423.3865-1-sakari.ailus@linux.intel.com Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-07 16:26:40 -04:00
James Smart	01f2ef6d18	scsi: lpfc: fix 12.4.0.0 GPF at boot The 12.4.0.0 patch that merged WQ/CQ pairs into single per-cpu pair contained a bug: a local variable was set to the queue pair by index. This should have allowed the local variable to be natively used. Instead, the code reused the index relative to the local variable, obtaining a random pointer value that when used eventually faulted the system Convert offending code to use local variable. Fixes: `c00f62e6c5` ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair") Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-29 18:15:07 -04:00
James Smart	0622800d2e	scsi: lpfc: Raise config max for lpfc_fcp_mq_threshold variable Raise the config max for lpfc_fcp_mq_threshold variable to 256. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> CC: Hannes Reinecke <hare@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-29 18:12:27 -04:00
James Smart	9db6c14c36	scsi: lpfc: Remove bg debugfs buffers Capturing and downloading dif command data and dif data was done a dozen years ago and no longer being used. Also creates a potential security hole. Remove the debugfs buffer for dif debugging. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> CC: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> CC: Hannes Reinecke <hare@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-29 18:08:58 -04:00
James Smart	7f9989bace	scsi: lpfc: Resolve checker warning for lpfc_new_io_buf() Per Dan Carpenter: The patch d79c9e9d4b3d: "scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware." from Aug 14, 2019, leads to the following static checker warning: drivers/scsi/lpfc/lpfc_init.c:4107 lpfc_new_io_buf() error: not allocating enough data 784 vs 768 There was no need to compare sizes nor to allocate size based on a define. Change allocation to use actual structure length Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-29 18:07:59 -04:00
James Smart	10541f037b	scsi: lpfc: Update lpfc version to 12.4.0.0 Update lpfc version to 12.4.0.0 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	c00f62e6c5	scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair Currently, each hardware queue, typically allocated per-cpu, consists of a WQ/CQ pair per protocol. Meaning if both SCSI and NVMe are supported 2 WQ/CQ pairs will exist for the hardware queue. Separate queues are unnecessary. The current implementation wastes memory backing the 2nd set of queues, and the use of double the SLI-4 WQ/CQ's means less hardware queues can be supported which means there may not always be enough to have a pair per cpu. If there is only 1 pair per cpu, more cpu's may get their own WQ/CQ. Rework the implementation to use a single WQ/CQ pair by both protocols. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	0d8af09643	scsi: lpfc: Add NVMe sequence level error recovery support FC-NVMe-2 added support for sequence level error recovery in the FC-NVME protocol. This allows for the detection of errors and lost frames and immediate retransmission of data to avoid exchange termination, which escalates into NVMeoFC connection and association failures. A significant RAS improvement. The driver is modified to indicate support for SLER in the NVMe PRLI is issues and to check for support in the PRLI response. When both sides support it, the driver will set a bit in the WQE to enable the recovery behavior on the exchange. The adapter will take care of all detection and retransmission. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	d79c9e9d4b	scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware. Typical SLI-4 hardware supports up to 2 4KB pages to be registered per XRI to contain the exchanges Scatter/Gather List. This caps the number of SGL elements that can be in the SGL. There are not extensions to extend the list out of the 2 pages. The G7 hardware adds a SGE type that allows the SGL to be vectored to a different scatter/gather list segment. And that segment can contain a SGE to go to another segment and so on. The initial segment must still be pre-registered for the XRI, but it can be a much smaller amount (256Bytes) as it can now be dynamically grown. This much smaller allocation can handle the SG list for most normal I/O, and the dynamic aspect allows it to support many MB's if needed. The implementation creates a pool which contains "segments" and which is initially sized to hold the initial small segment per xri. If an I/O requires additional segments, they are allocated from the pool. If the pool has no more segments, the pool is grown based on what is now needed. After the I/O completes, the additional segments are returned to the pool for use by other I/Os. Once allocated, the additional segments are not released under the assumption of "if needed once, it will be needed again". Pools are kept on a per-hardware queue basis, which is typically 1:1 per cpu, but may be shared by multiple cpus. The switch to the smaller initial allocation significantly reduces the memory footprint of the driver (which only grows if large ios are issued). Based on the several K of XRIs for the adapter, the 8KB->256B reduction can conserve 32MBs or more. It has been observed with per-cpu resource pools that allocating a resource on CPU A, may be put back on CPU B. While the get routines are distributed evenly, only a limited subset of CPUs may be handling the put routines. This can put a strain on the lpfc_put_cmd_rsp_buf_per_cpu routine because all the resources are being put on a limited subset of CPUs. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	e62245d923	scsi: lpfc: Add MDS driver loopback diagnostics support Added code to support driver loopback with MDS Diagnostics. This style of diagnostics passes frames from the fabric to the driver who then echo them back out the link. SEND_FRAME WQEs are used to transmit the frames. Added the SOF and EOF field location definitions for use by SEND_FRAME. Also ensure that enable_mds_diags is a RW parameter. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	ec76242f3b	scsi: lpfc: Add first and second level hardware revisions to sysfs reporting To aid better hardware detection when there are issues, report the first and second level hardware revisions from the READ_REV command. Add the elements to the existing hardware id string. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	3235066449	scsi: lpfc: Migrate to %px and %pf in kernel print calls In order to see real addresses, convert %p with %px for kernel addresses and replace %p with %pf for functions. While converting, standardize on "x%px" throughout (not %px or 0x%px). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	1df0944990	scsi: lpfc: Add simple unlikely optimizations to reduce NVME latency While performing code review, several relatively simple optimizations can be done in the fast path. Add these optimizations (unlikely designators). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	d9f492a1a1	scsi: lpfc: Fix coverity warnings Running on Coverity produced the following errors: - coding style (indentation) - memset size mismatch errors note: comment cases where it is purposely a mismatch Fix the errors. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	db197bc469	scsi: lpfc: Fix nvme first burst module parameter description modinfo for lpfc_nvme_enable_fb is incorrect. FirstBurst on lpfc target is not fully supported. Update the attribute description Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	4945c0f95b	scsi: lpfc: Fix BlockGuard enablement on FCoE adapters The driver is allowing the user to change lpfc_enable_bg while loading the driver against a FCoE adapter. This is not supported. No check is made for the adapter type when applying the blockguard enablement value. Fix by verifying the adapter type before setting the enablement flag. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	845d0327bf	scsi: lpfc: Fix reported physical link speed on a disabled trunked link GetTrunkInfo is displaying an incorrect link speed when the link is a trunk and the link has gone down. The driver is not clearing the logical speed as part of the link down transition. Fix by setting the logical speed to UNKNOWN SPEED when the link goes down. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	f98b2fd796	scsi: lpfc: Fix Max Frame Size value shown in fdmishow output Max Frame Size value is shown as 34816 in fdmishow from Switch. The driver uses bbRcvSize in common service param which is obtained from the READ_SPARM mailbox command. The bbRcvSize field which is displayed is a three nibble field but the driver is printing a full four nibbles. Fix by masking off the upper nibble. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	6db51abb8d	scsi: lpfc: Fix upcall to bsg done in non-success cases The scsi transport fc bsg interface does not expect the bsg_job_done() callback to be done if the bsg request call returns failure. Several of the HST_VENDOR cases in the driver unconditionally call bsg_job_done() regardless of the returning value. Fix the code to only call bsg_job_done() if the call to lpfc_bsg_request() will return success. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	07b1b91412	scsi: lpfc: Fix sli4 adapter initialization with MSI When forcing the use of MSI (vs MSI-X) the driver is crashing in pci_irq_get_affinity. The driver was not using the new pci_alloc_irq_vectors interface in the MSI path. Fix by using pci_alloc_irq_vectors() with PCI_RQ_MSI in the MSI path. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	6a224b47fd	scsi: lpfc: Fix nvme sg_seg_cnt display if HBA does not support NVME The driver is currently reporting a non-zero nvme sg_seg_cnt value of 256 when nvme is disabled. It should be zero. Fix by ensuring the value is cleared. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	005d8eb928	scsi: lpfc: Fix nvme target mode ABTSing a received ABTS If an unsolicited ABTS was received, the driver looks up the exchange it references. It it does various searches looking for the exchange context. When one is eventually matched and it is associated with an XRI context, the driver sends an ABORT WQE to terminate the exchange. Current code looks at whether the transport had taken action on the XRI yet or not (no action if set to LPFC_NVMET_STE_RCV; action if non-LPFC_NVMET_STE_RCV). Based on action or not one of two (sol vs unsol) issue abort routines are called. The unsol version cheats and transmits a sequence containing an ABTS with no interaction with the adapter. The sol version issues an Abort WQE and lets the adapter manage whether the ABTS is sent to not. The issue is the unsol version is sending ABTS unconditionally for the exchange that received the ABTS. It's unnecessary. Remove the conditional and just call the adapter command-based routine to let the adapter manage the ABTS. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	84f2ddf8cf	scsi: lpfc: Fix hang when downloading fw on port enabled for nvme As part of firmware download, the adapter is reset. On the adapter the reset causes the function to stop and all outstanding io is terminated (without responses). The reset path then starts teardown of the adapter, starting with deregistration of the remote ports with the nvme-fc transport. The local port is then deregistered and the driver waits for local port deregistration. This never finishes. The remote port deregistrations terminated the nvme controllers, causing them to send aborts for all the outstanding io. The aborts were serviced in the driver, but stalled due to its state. The nvme layer then stops to reclaim it's outstanding io before continuing. The io must be returned before the reset on the controller is deemed complete and the controller delete performed. The remote port deregistration won't complete until all the controllers are terminated. And the local port deregistration won't complete until all controllers and remote ports are terminated. Thus things hang. The issue is the reset which stopped the adapter also stopped all the responses that would drive i/o completions, and the aborts were also stopped that stopped i/o completions. The driver, when resetting the adapter like this, needs to be generating the completions as part of the adapter reset so that I/O complete (in error), and any aborts are not queued. Fix by adding flush routines whenever the adapter port has been reset or discovered in error. The flush routines will generate the completions for the scsi and nvme outstanding io. The abort ios, if waiting, will be caught and flushed as well. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	5e0e2318aa	scsi: lpfc: Fix too many sg segments spamming in kernel log This issue is specific to SLI-3 adapters, specifically when DIF is used. Once seen, this message floods the logs: 9064 BLKGRD: lpfc_scsi_prep_dma_buf_s3: Too many sg segments from dma_map_sg The driver, upon detecting an error such as too many elements in an sglist, misrepresents the error by treating it as a temporary resource issue by returning MLQUEUE_HOST_BUSY. In these cases, no retry will fix it and it should have been a hard error. The repeated retry was causing the spamming of the log. As for the initial reason of why an I/O encountered this issue at all is not clear as parameters set by the driver should have avoided this. The dm multipath maintainer has been notified of the issue. Fix by changing the return code for the dma mapping routines to indicate cases that are not retryable and return DID_ERROR on those cases. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	8c24a4f643	scsi: lpfc: Fix crash due to port reset racing vs adapter error handling If the adapter encounters a condition which causes the adapter to fail (driver must detect the failure) simultaneously to a request to the driver to reset the adapter (such as a host_reset), the reset path will be racing with the asynchronously-detect adapter failure path. In the failing situation, one path has started to tear down the adapter data structures (io_wq's) while the other path has initiated a repeat of the teardown and is in the lpfc_sli_flush_xxx_rings path and attempting to access the just-freed data structures. Fix by the following: - In cases where an adapter failure is detected, rather than explicitly calling offline_eratt() to start the teardown, change the adapter state and let the later calls of posted work to the slowpath thread invoke the adapter recovery. In essence, this means all requests to reset are serialized on the slowpath thread. - Clean up the routine that restarts the adapter. If there is a failure from brdreset, don't immediately error and leave things in a partial state. Instead, ensure the adapter state is set and finish the teardown of structures before returning. - If in the scsi host reset handler and the board fails to reset and restart (which can be due to parallel reset/recovery paths), instead of hard failing and explicitly calling offline_eratt() (which gets into the redundant path), just fail out and let the asynchronous path resolve the adapter state. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	894bb17f0c	scsi: lpfc: Fix deadlock on host_lock during cable pulls During cable pull testing a deadlock was seen between lpfc_nlp_counters() vs lpfc_mbox_process_link_up() vs lpfc_work_list_done(). They are all waiting on the shost->host_lock. Issue is all of these cases raise irq when taking out the lock but use spin_unlock_irq() when unlocking. The unlock path is will unconditionally re-enable interrupts in cases where irq state should be preserved. The re-enablement allowed the other paths to execute which then causes the deadlock. Fix by converting the lock/unlock to irqsave/irqrestore. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	6825b7bd32	scsi: lpfc: Fix error in remote port address change In a test with high nvme remote port counts connected via a multi-hop FC switch config where switches were systematically reset (e.g. fabric partitioning and re-establishment), the nvme remote ports would switch addresses based on the switch reconfiguration events. The driver would get into a situation where the nvme port changed address, PLOGI and PRLI would succeed nvme transport registration occurred, but subsequent LS requests by the nvme subsystem failed due to a bad ndlp state and connectivity to the device failed. The driver hit a race condition on multiple devices that address swapped simultaneously. In cases where the driver notices the remote port structure came back as the same value as previously (meaning a nvme_rport structure was re-enabled and did not go through devloss_tmo/connect_tmo_failures on all controllers) the driver would unconditionally exit assuming the ndlp information was correct. But, if the ndlp's had been swapped, the ndlp had stale port state information, which when used by the LS request commands, would fail the commands. Fix by checking whether a node swap had occurred, and only exit if no ndlp swap had occurred. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	a6d10f24a0	scsi: lpfc: Fix driver nvme rescan logging In situations where zoning is not being used, thus NVMe initiators see other NVMe initiators as well as NVMe targets, a link bounce on an initiator will cause the NVMe initiators to spew "6169" State Error messages. The driver is not qualifying whether the remote port is a NVMe targer or not before calling the lpfc_nvme_rescan_port(), which validates the role and prints the message if its only an NVMe initiator. Fix by the following: - Before calling lpfc_nvme_rescan_port() ensure that the node is a NVMe storage target or a NVMe discovery controller. - Clean up implementation of lpfc_nvme_rescan_port. remoteport pointer will always be NULL if a NVMe initiator only. But, grabbing of remoteport pointer should be done under lock to coincide with the registering of the remote port with the fc transport. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	c26c265b16	scsi: lpfc: Fix sg_seg_cnt for HBAs that don't support NVME On an SLI-3 adapter which does not support NVMe, but with the driver global attribute to enable nvme on any adapter if it does support NVMe (e.g. module parameter lpfc_enable_fc4_type=3), the SGL and total SGE values are being munged by the protocol enablement when it shouldn't be. Correct by changing the location of where the NVME sgl information is being applied, which will avoid any SLI-3-based adapter. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	a643c6de14	scsi: lpfc: Fix propagation of devloss_tmo setting to nvme transport If admin changes the devloss_tmo on an rport via the fc_remote_port rport dev_loss_tmo attribute, the value is on set on scsi stack. The change is not propagated to NVMe. The set routine in the lldd lacks the call to nvme_fc_set_remoteport_devloss() to set the value. Fix by adding the call to the lldd set routine. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	b95b21193c	scsi: lpfc: Fix loss of remote port after devloss due to lack of RPIs In tests with remote ports contantly logging out/logging coupled with occassional local link bounce, if a remote port is disocnnected for longer than devloss_tmo and then subsequently reconnected, eventually the test will fail to login with the remote port and remote port connectivity is lost. When devloss_tmo expires, the driver does not free the node struct until the port or npiv instances is being deleted. The node is left allocated but the state set to UNUSED. If the node was in the process of logging in when the local link drop occurred, meaning the RPI was allocated for the node in order to send the ELS, but not yet registered which comes after successful login, the node is moved to the NPR state, and if devloss expires, to UNUSED state. If the remote port comes back, the node associated with it is restarted and this path happens to allocate a new RPI and overwrites the prior RPI value. In the cases where the port was logged in and loggs out, the path did release the RPI but did not set the node rpi value. In the cases where the remote port never finished logging in, the path never did the call to release the rpi. In this latter case, when the node is subsequently restore, the new rpi allocation overwrites the rpi that was not released, and the rpi is now leaked. Eventually the port will run out of RPI resources to log into new remote ports. Fix by following changes: - When an rpi is released, do so under locks and ensure the node rpi value is set to a non-allocated value (LPFC_RPI_ALLOC_ERROR). Note: refactored to a small service routine to avoid indentation issues. - When re-enabling a node, check the rpi value to determine if a new allocation is necessary. If already set, use the prior rpi. Enhanced logging to help in the future. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	96d156f95c	scsi: lpfc: Fix devices that don't return after devloss followed by rediscovery If a remote port is removed and remains removed for devloss_tmo, if an RSCN is subsequently received indicating the presence of the remte port, the driver does not login to and rediscovery the remote port. Currently, in order to for a port to be rediscovered post an RSCN, the node state must be NPR to reflect not logged in. When devloss expires, the node state is marked UNUSED. When an RSCN occurs, the nodes referenced by the RSCN will have a NPR_2B_DISC flag set, but the re-login will only be attempted if the node is in NPR_NODE state. Thus the node is skipped over. Fix by recognizing the NPR_2B_DISC and UNUSED and transition the node back to NPR state to allow the re-login to take place. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	07f50997d6	scsi: lpfc: Fix null ptr oops updating lpfc_devloss_tmo via sysfs attribute If an admin updates lpfc's devloss_tmo sysfs attribute, the kernel will oops. Coding of a loop allowed a new value (rport) to be set/checked for null followed by an older value (remoteport) checked for null to allow progress where the new value, even though null, will be referenced. Rework the logic to validate and prevent any reference to the null ptr. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	6ede2ddd8b	scsi: lpfc: Fix FLOGI handling across multiple link up/down conditions It's possible for the driver to initiate an FLOGI and before it completes, another link down/up transition occurs requiring a new FLOGI. Currently, nothing is done to abort/noop the older FLOGI request to the adapter, so if this transition occurs and the FLOGI completion is received after the link down/up transition, the driver may erroneously act on the older FLOGI. In most cases, the adapter properly terminates/fails the FLOGI, but there is a timing condition where the FLOGI may complete on the wire prior to the transition, but the response may not be seen/processed by the driver before the driver sees the link transition. Fix by having the link down handler in the driver run through any outstanding ELS's and change the completion handler of the ELS so that it will be no-op'd and released. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	3ad348d944	scsi: lpfc: Fix oops when fewer hdwqs than cpus When tearing down the adapter for a reset, online/offline, or driver unload, the queue free routine would hit a GPF oops. This only occurs on conditions where the number of hardware queues created is fewer than the number of cpus in the system. In this condition cpus share a hardware queue. And of course, it's the 2nd cpu that shares a hardware that attempted to free it a second time and hit the oops. Fix by reworking the cpu to hardware queue mapping such that: Assignment of hardware queues to cpus occur in two passes: first pass: is first time assignment of a hardware queue to a cpu. This will set the LPFC_CPU_FIRST_IRQ flag for the cpu. second pass: for cpus that did not get a hardware queue they will be assigned one from a primary cpu (one set in first pass). Deletion of hardware queues is driven by cpu itteration, and queues will only be deleted if the LPFC_CPU_FIRST_IRQ flag is set. Also contains a few small cleanup fixes and a little better logging. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	4b0a42be26	scsi: lpfc: Fix irq raising in lpfc_sli_hba_down The adapter reset path (lpfc_sli_hba_down) is taking/releasing a lock with irq. But, the path is already under the hbalock which raised irq so it's unnecessary. Convert to simple lock/unlock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	61184f1742	scsi: lpfc: Fix Oops in nvme_register with target logout/login lpfc_nvme_register_port hit a null prev_ndlp pointer in a test with lots of target ports swapping addresses. The oldport value was stale, thus it's ndlp (prev_ndlp set to it) was used. Fix by moving oldrport pointer checks, and if used prev_ndlp pointer assignment, to be done while the lock is held. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	08180db254	scsi: lpfc: Fix issuing init_vpi mbox on SLI-3 card The driver is inadvertently trying to issue an INIT_VPI mailbox command on an SLI-3 driver. The command is specific to SLI-4. When the call is made to send the command, if on an SLI-3 adapter, an array pointer is NULL and the driver will oops. Fix by restricting the command to SLI-4 adapters only. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	26d824ca45	scsi: lpfc: Fix ADISC reception terminating login state if a NVME target If a target issues an ADISC to the port and the target is a NVME target, the driver is inadvertantly invalidating the login and marking the remote port as logged out. Communication with the target is lost. Revise the ADISC check so that FCP or NVME targets will be marked valid at the end of ADISC processing. Enhance logging to recognize condition better. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	7f20c1cb23	scsi: lpfc: Fix discovery when target has no GID_FT information Some remote ports may be slow in registering their GID_FT protocol information with the fabric. If the remote port is an initiator, it may send PLOGI to the port before the GID_FT logic is complete. Meaning, after accepting the PLOGI, when the driver may see no response to the GID_FT that is issued after the login to determine the protocols supported so that proper PRLI's may be transmit. If the driver has no fc4 information, it currently stops and the remote port is not discovered. Fix by issuing a LOGO when there is no GID_FT information. The LOGO completion handling will attempt to re-login if the nport_id is still present. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
James Smart	57178b9275	scsi: lpfc: Fix port relogin failure due to GID_FT interaction In cases of remote-port-side cable pull/replug, there happens to be a target that upon replug will send the port a PLOGI, a PRLI, and a LOGO. When this sequence is received by the driver, the PLOGI accepted and a GFT_ID is issued to find the protocol support for the remote port. While the GFT_ID is outstanding, a LOGO is received. The driver logs the remote port out and unregisters the RPI and schedules a new PLOGI transmission. However, the GFT_ID was not terminated. When it completed, the driver attempted to transition the remote port to PRLI transmission, which cancels the PLOGI scheduling. The PRLI transmit attempt is rejected by the adapter as the remote port is not logged in. No retry is attempted as it's expected the logout is noted and the supposedly scheduled PLOGI should address the state. As there is no PLOGI, the remote port does not get re-discovered. Fix by aborting the outstanding GFT_ID if the related remote port is logged out. Ensure a PRLI transmit attempt only occurs if the remote port is logging in. This avoids the incorrect attempt while logged out. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
James Smart	296012285c	scsi: lpfc: Fix leak of ELS completions on adapter reset If the adapter is reset while there are outstanding ELS's, subsequent reinitialization of the adapter will fail as it has not recovered all of the io contexts relative to the ELS's. If an ELS timed out or otherwise failed and an the ELS was attempted to be aborted (which changes the ELS completion context), in causes where the driver generates completions for the outstanding IO as the adapter would not due to being reset, the driver released only the ELS context and failed to release the abort context. When the adapter went to reinit, as it had not received all of the contexts, it failed to reinit. Fix by having the ELS completion handler identify the driver-generated completion status and release the abort context. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
James Smart	8d34a59cae	scsi: lpfc: Fix failure to clear non-zero eq_delay after io rate reduction Unusually high IO latency can be observed with little IO in progress. The latency may remain high regardless of amount of IO and can only be cleared by forcing lpfc_fcp_imax values to non-zero and then back to zero. The driver's eq_delay mechanism that scales the interrupt coalescing based on io completion load failed to reduce or turn off coalescing when load decreased. Specifically, if no io completed on a cpu within an eq_delay polling window, the eq delay processing was skipped and no change was made to the coalescing values. This left the coalescing values set when they were no longer applicable. Fix by always clearing the percpu counters for each time period and always run the eq_delay calculations if an eq has a non-zero coalescing value. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
James Smart	3cee98db26	scsi: lpfc: Fix crash on driver unload in wq free If a timer routine uses workqueues, it could fire before the workqueue is allocated. Fix by allocating the workqueue before the timer routines are setup Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
James Smart	1d755d6477	scsi: lpfc: Fix ELS field alignments After seeing some interoperability issues with ADISC, it was determined the ELS definitions in lpfc were using types that allowed the compiler to add pad to the structure, causing the structure to no longer be per spec. The offending structures are ADISC, FAN, and RNID. This patch implements the simple fix of eliminating the pad by forcing the compiler to pack the structure. Care was taken to ensure field accesses won't be by operations that would hit a bad field alignment. The better solution would be to convert to the uapi fc header definitions, but the number of changes required to do is rather intrusive so this course of action was deferred. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
James Smart	4f1a2fef2a	scsi: lpfc: Fix PLOGI failure with high remoteport count When connected to a high number of remote ports, the driver is encountering PLOGI errors. The errors are due to adapter detected failures indicating illegal field values. Turns out the driver was prematurely clearing an RPI bitmask before waiting for an UNREG_RPI mailbox completion. This allowed the RPI to be reused before it was actually available. Fix by clearing RPI bitmask only after UNREG_RPI mailbox completion. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
James Smart	31f06d2e73	scsi: lpfc: Limit xri count for kdump environment scsi-mq operation inherently performs pre-allocation of resources for blk-mq request queues. Even though the kdump environment reduces the configuration to a single CPU, thus 1 hardware queue, which helps significantly, the resources are still rather large due to the per request allocations. blk-mq pre-allocations can be over 4KB per request. With adapter can_queue values in the 4k or 8k range, this can easily be 32MBs before any other driver memory is factored in. Driver SGL DMA buffer allocation can be up to 8KB per request as well adding an additional 64MB. Totals are well over 100MB for a single shost. Given kdump memory auto-sizing utilities don't accommodate this amount of memory well, it's very possible for kdump to fail due to lack of memory. Fix by having the driver recognize that it is booting within a kdump context and reduce the number of requests it will support to a more reasonable value. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:08 -04:00
Hariprasad Kelam	a967783300	scsi: lpfc: remove NULL check before some freeing functions As dma_pool_destroy and mempool_destroy functions has NULL check. We may not need NULL check before calling them. Fix below warnings reported by coccicheck ./drivers/scsi/lpfc/lpfc_mem.c:252:2-18: WARNING: NULL check before some freeing functions is not needed. ./drivers/scsi/lpfc/lpfc_mem.c:255:2-18: WARNING: NULL check before some freeing functions is not needed. ./drivers/scsi/lpfc/lpfc_mem.c:258:2-18: WARNING: NULL check before some freeing functions is not needed. ./drivers/scsi/lpfc/lpfc_mem.c:261:2-18: WARNING: NULL check before some freeing functions is not needed. ./drivers/scsi/lpfc/lpfc_mem.c:265:2-18: WARNING: NULL check before some freeing functions is not needed. ./drivers/scsi/lpfc/lpfc_mem.c:269:2-17: WARNING: NULL check before some freeing functions is not needed. Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:21:57 -04:00
James Smart	77ffd3465b	scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ When SCSI-MQ is enabled, the SCSI-MQ layers will do pre-allocation of MQ resources based on shost values set by the driver. In newer cases of the driver, which attempts to set nr_hw_queues to the cpu count, the multipliers become excessive, with a single shost having SCSI-MQ pre-allocation reaching into the multiple GBytes range. NPIV, which creates additional shosts, only multiply this overhead. On lower-memory systems, this can exhaust system memory very quickly, resulting in a system crash or failures in the driver or elsewhere due to low memory conditions. After testing several scenarios, the situation can be mitigated by limiting the value set in shost->nr_hw_queues to 4. Although the shost values were changed, the driver still had per-cpu hardware queues of its own that allowed parallelization per-cpu. Testing revealed that even with the smallish number for nr_hw_queues for SCSI-MQ, performance levels remained near maximum with the within-driver affiinitization. A module parameter was created to allow the value set for the nr_hw_queues to be tunable. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:14:10 -04:00
Fuqian Huang	78d4b1327d	scsi: lpfc: use spin_lock_irqsave in IRQ context As spin_unlock_irq will enable interrupts. Function lpfc_findnode_rpi is called from lpfc_sli_abts_err_handler (./drivers/scsi/lpfc/lpfc_sli.c) <- lpfc_sli_async_event_handler <- lpfc_sli_process_unsol_iocb <- lpfc_sli_handle_fast_ring_event <- lpfc_sli_fp_intr_handler <- lpfc_sli_intr_handler and lpfc_sli_intr_handler is an interrupt handler. Interrupts are enabled in interrupt handler. Use spin_lock_irqsave/spin_unlock_irqrestore instead of spin_(un)lock_irq in IRQ context to avoid this. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:10:03 -04:00
Fuqian Huang	ee9a256cd8	scsi: lpfc: remove redundant code Remove the redundant initialization code. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:07:50 -04:00
James Smart	a86c71ba30	scsi: lpfc: Fix crash when cpu count is 1 and null irq affinity mask When a configurations runs with a single cpu (such as a kdump kernel), which causes the driver to request a single vector, when the driver subsequently requests an irq affinity mask, the mask comes back null. The driver currently does nothing in this scenario, which leaves mappings to hardware queues incomplete and crashes the system. Fix by recognizing the null mask and assigning the vector to the first cpu in the system. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-07 21:56:41 -04:00
YueHaibing	70a51d8c53	scsi: lpfc: Remove unnecessary null check before kfree A null check before a kfree is redundant, so remove it. This is detected by coccinelle. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-07-22 16:44:07 -04:00
Linus Torvalds	f65420df91	SCSI fixes on 20190720 This is the final round of mostly small fixes in our initial submit. It's mostly minor fixes and driver updates. The only change of note is adding a virt_boundary_mask to the SCSI host and host template to parametrise this for NVMe devices instead of having them do a call in slave_alloc. It's a fairly straightforward conversion except in the two NVMe handling drivers that didn't set it who now have a virtual infinity parameter added. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXTJS/yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQTNAQCsTdkA IN1BvDBbE+KO8mvL5DuRxLtnDU6Pq5K6fkrE3gD/a1GkqyPPaJIuspq7fQY87DH/ o7VsJd/5uGphIE2Ls+M= =38XV -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is the final round of mostly small fixes in our initial submit. It's mostly minor fixes and driver updates. The only change of note is adding a virt_boundary_mask to the SCSI host and host template to parametrise this for NVMe devices instead of having them do a call in slave_alloc. It's a fairly straightforward conversion except in the two NVMe handling drivers that didn't set it who now have a virtual infinity parameter added" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: megaraid_sas: set an unlimited max_segment_size scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs scsi: IB/srp: set virt_boundary_mask in the scsi host scsi: IB/iser: set virt_boundary_mask in the scsi host scsi: storvsc: set virt_boundary_mask in the scsi host template scsi: ufshcd: set max_segment_size in the scsi host template scsi: core: take the DMA max mapping size into account scsi: core: add a host / host template field for the virt boundary scsi: core: Fix race on creating sense cache scsi: sd_zbc: Fix compilation warning scsi: libfc: fix null pointer dereference on a null lport scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized scsi: zfcp: fix request object use-after-free in send path causing wrong traces scsi: zfcp: fix request object use-after-free in send path causing seqno errors scsi: megaraid_sas: Update driver version to 07.710.50.00 scsi: megaraid_sas: Add module parameter for FW Async event logging scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers scsi: megaraid_sas: Fix calculation of target ID scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade ...	2019-07-20 10:04:58 -07:00
Linus Torvalds	ef8f3d48af	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: "Am experimenting with splitting MM up into identifiable subsystems perhaps with a view to gitifying it in complex ways. Also with more verbose "incoming" emails. Most of MM is here and a few other trees. Subsystems affected by this patch series: - hotfixes - iommu - scripts - arch/sh - ocfs2 - mm:slab-generic - mm:slub - mm:kmemleak - mm:kasan - mm:cleanups - mm:debug - mm:pagecache - mm:swap - mm:memcg - mm:gup - mm:pagemap - mm:infrastructure - mm:vmalloc - mm:initialization - mm:pagealloc - mm:vmscan - mm:tools - mm:proc - mm:ras - mm:oom-kill hotfixes: mm: vmscan: scan anonymous pages on file refaults mm/nvdimm: add is_ioremap_addr and use that to check ioremap address mm/memcontrol: fix wrong statistics in memory.stat mm/z3fold.c: lock z3fold page before __SetPageMovable() nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header MAINTAINERS: nilfs2: update email address iommu: include/linux/dmar.h: replace single-char identifiers in macros scripts: scripts/decode_stacktrace: match basepath using shell prefix operator, not regex scripts/decode_stacktrace: look for modules with .ko.debug extension scripts/spelling.txt: drop "sepc" from the misspelling list scripts/spelling.txt: add spelling fix for prohibited scripts/decode_stacktrace: Accept dash/underscore in modules scripts/spelling.txt: add more spellings to spelling.txt arch/sh: arch/sh/configs/sdk7786_defconfig: remove CONFIG_LOGFS sh: config: remove left-over BACKLIGHT_LCD_SUPPORT sh: prevent warnings when using iounmap ocfs2: fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat" ocfs2/dlm: use struct_size() helper ocfs2: add last unlock times in locking_state ocfs2: add locking filter debugfs file ocfs2: add first lock wait time in locking_state ocfs: no need to check return value of debugfs_create functions fs/ocfs2/dlmglue.c: unneeded variable: "status" ocfs2: use kmemdup rather than duplicating its implementation mm:slab-generic: Patch series "mm/slab: Improved sanity checking": mm/slab: validate cache membership under freelist hardening mm/slab: sanity-check page type when looking up cache lkdtm/heap: add tests for freelist hardening mm:slub: mm/slub.c: avoid double string traverse in kmem_cache_flags() slub: don't panic for memcg kmem cache creation failure mm:kmemleak: mm/kmemleak.c: fix check for softirq context mm/kmemleak.c: change error at _write when kmemleak is disabled docs: kmemleak: add more documentation details mm:kasan: mm/kasan: print frame description for stack bugs Patch series "Bitops instrumentation for KASAN", v5: lib/test_kasan: add bitops tests x86: use static_cpu_has in uaccess region to avoid instrumentation asm-generic, x86: add bitops instrumentation for KASAN Patch series "mm/kasan: Add object validation in ksize()", v3: mm/kasan: introduce __kasan_check_{read,write} mm/kasan: change kasan_check_{read,write} to return boolean lib/test_kasan: Add test for double-kzfree detection mm/slab: refactor common ksize KASAN logic into slab_common.c mm/kasan: add object validation in ksize() mm:cleanups: include/linux/pfn_t.h: remove pfn_t_to_virt() Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect": arm: remove ARCH_SELECT_MEMORY_MODEL s390: remove ARCH_SELECT_MEMORY_MODEL sparc: remove ARCH_SELECT_MEMORY_MODEL mm/gup.c: make follow_page_mask() static mm/memory.c: trivial clean up in insert_page() mm: make !CONFIG_HUGE_PAGE wrappers into static inlines include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info mm: remove the account_page_dirtied export mm/page_isolation.c: change the prototype of undo_isolate_page_range() include/linux/vmpressure.h: use spinlock_t instead of struct spinlock mm: remove the exporting of totalram_pages include/linux/pagemap.h: document trylock_page() return value mm:debug: mm/failslab.c: by default, do not fail allocations with direct reclaim only Patch series "debug_pagealloc improvements": mm, debug_pagelloc: use static keys to enable debugging mm, page_alloc: more extensive free page checking with debug_pagealloc mm, debug_pagealloc: use a page type instead of page_ext flag mm:pagecache: Patch series "fix filler_t callback type mismatches", v2: mm/filemap.c: fix an overly long line in read_cache_page mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page jffs2: pass the correct prototype to read_cache_page 9p: pass the correct prototype to read_cache_page mm/filemap.c: correct the comment about VM_FAULT_RETRY mm:swap: mm, swap: fix race between swapoff and some swap operations mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device() mm, swap: use rbtree for swap_extent mm/mincore.c: fix race between swapoff and mincore mm:memcg: memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL memcg, fsnotify: no oom-kill for remote memcg charging mm, memcg: introduce memory.events.local mm: memcontrol: dump memory.stat during cgroup OOM Patch series "mm: reparent slab memory on cgroup removal", v7: mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache() mm: memcg/slab: rename slab delayed deactivation functions and fields mm: memcg/slab: generalize postponed non-root kmem_cache deactivation mm: memcg/slab: introduce __memcg_kmem_uncharge_memcg() mm: memcg/slab: unify SLAB and SLUB page accounting mm: memcg/slab: don't check the dying flag on kmem_cache creation mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock mm: memcg/slab: rework non-root kmem_cache lifecycle management mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages mm: memcg/slab: reparent memcg kmem_caches on cgroup removal mm, memcg: add a memcg_slabinfo debugfs file mm:gup: Patch series "switch the remaining architectures to use generic GUP", v4: mm: use untagged_addr() for get_user_pages_fast addresses mm: simplify gup_fast_permitted mm: lift the x86_32 PAE version of gup_get_pte to common code MIPS: use the generic get_user_pages_fast code sh: add the missing pud_page definition sh: use the generic get_user_pages_fast code sparc64: add the missing pgd_page definition sparc64: define untagged_addr() sparc64: use the generic get_user_pages_fast code mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP mm: reorder code blocks in gup.c mm: consolidate the get_user_pages* implementations mm: validate get_user_pages_fast flags mm: move the powerpc hugepd code to mm/gup.c mm: switch gup_hugepte to use try_get_compound_head mm: mark the page referenced in gup_hugepte mm/gup: speed up check_and_migrate_cma_pages() on huge page mm/gup.c: remove some BUG_ONs from get_gate_page() mm/gup.c: mark undo_dev_pagemap as __maybe_unused mm:pagemap: asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel] alpha: switch to generic version of pte allocation arm: switch to generic version of pte allocation arm64: switch to generic version of pte allocation csky: switch to generic version of pte allocation m68k: sun3: switch to generic version of pte allocation mips: switch to generic version of pte allocation nds32: switch to generic version of pte allocation nios2: switch to generic version of pte allocation parisc: switch to generic version of pte allocation riscv: switch to generic version of pte allocation um: switch to generic version of pte allocation unicore32: switch to generic version of pte allocation mm/pgtable: drop pgtable_t variable from pte_fn_t functions mm/memory.c: fail when offset == num in first check of __vm_map_pages() mm:infrastructure: mm/mmu_notifier: use hlist_add_head_rcu() mm:vmalloc: Patch series "Some cleanups for the KVA/vmalloc", v5: mm/vmalloc.c: remove "node" argument mm/vmalloc.c: preload a CPU with one object for split purpose mm/vmalloc.c: get rid of one single unlink_va() when merge mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va() mm/vmalloc.c: spelling> s/informaion/information/ mm:initialization: mm/large system hash: use vmalloc for size > MAX_ORDER when !hashdist mm/large system hash: clear hashdist when only one node with memory is booted mm:pagealloc: arm64: move jump_label_init() before parse_early_param() Patch series "add init_on_alloc/init_on_free boot options", v10: mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options mm: init: report memory auto-initialization features at boot time mm:vmscan: mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned mm: vmscan: correct some vmscan counters for THP swapout mm:tools: tools/vm/slabinfo: order command line options tools/vm/slabinfo: add partial slab listing to -X tools/vm/slabinfo: add option to sort by partial slabs tools/vm/slabinfo: add sorting info to help menu mm:proc: proc: use down_read_killable mmap_sem for /proc/pid/maps proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup proc: use down_read_killable mmap_sem for /proc/pid/pagemap proc: use down_read_killable mmap_sem for /proc/pid/clear_refs proc: use down_read_killable mmap_sem for /proc/pid/map_files mm: use down_read_killable for locking mmap_sem in access_remote_vm mm: smaps: split PSS into components mm: vmalloc: show number of vmalloc pages in /proc/meminfo mm:ras: mm/memory-failure.c: clarify error message mm:oom-kill: mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks() mm, oom: refactor dump_tasks for memcg OOMs mm, oom: remove redundant task_in_mem_cgroup() check oom: decouple mems_allowed from oom_unkillable_task mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()" * akpm: (147 commits) mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process() oom: decouple mems_allowed from oom_unkillable_task mm, oom: remove redundant task_in_mem_cgroup() check mm, oom: refactor dump_tasks for memcg OOMs mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks() mm/memory-failure.c: clarify error message mm: vmalloc: show number of vmalloc pages in /proc/meminfo mm: smaps: split PSS into components mm: use down_read_killable for locking mmap_sem in access_remote_vm proc: use down_read_killable mmap_sem for /proc/pid/map_files proc: use down_read_killable mmap_sem for /proc/pid/clear_refs proc: use down_read_killable mmap_sem for /proc/pid/pagemap proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup proc: use down_read_killable mmap_sem for /proc/pid/maps tools/vm/slabinfo: add sorting info to help menu tools/vm/slabinfo: add option to sort by partial slabs tools/vm/slabinfo: add partial slab listing to -X tools/vm/slabinfo: order command line options mm: vmscan: correct some vmscan counters for THP swapout mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned ...	2019-07-12 11:40:28 -07:00
Paul Walmsley	cc0e5f1ce0	scripts/spelling.txt: drop "sepc" from the misspelling list The RISC-V architecture has a register named the "Supervisor Exception Program Counter", or "sepc". This abbreviation triggers checkpatch.pl's misspelling detector, resulting in noise in the checkpatch output. The risk that this noise could cause more useful warnings to be missed seems to outweigh the harm of an occasional misspelling of "spec". Thus drop the "sepc" entry from the misspelling list. [akpm@linux-foundation.org: fix existing "sepc" instances, per Joe] Link: http://lkml.kernel.org/r/20190518210037.13674-1-paul.walmsley@sifive.com Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:41 -07:00
Arnd Bergmann	057959c6e3	scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE The lpfc_debug_dump_all_queues() function repeatedly calls into lpfc_debug_dump_qe() which has a temporary 128 byte buffer. This was fine before the introduction of CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE because each instance could occupy the same stack slot. However, now they each get their own copy, which leads to a huge increase in stack usage as seen from the compiler warning: drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_debug_dump_all_queues': drivers/scsi/lpfc/lpfc_debugfs.c:6474:1: error: the frame size of 1712 bytes is larger than 100 bytes [-Werror=frame-larger-than=] Avoid this by not marking lpfc_debug_dump_qe() as inline so the compiler can choose to emit a static version of this function when it's needed or otherwise silently drop it. As an added benefit, not inlining multiple copies of this function means we save several kilobytes of .text section, reducing the file size from 47kb to 43. It is somewhat unusual to have a function that is static but not inline in a header file, but this does not cause problems here because it is only used by other inline functions. It would however seem reasonable to move all the lpfc_debug_dump_* functions into lpfc_debugfs.c and not mark them inline as a later cleanup. Fixes: `81a56f6dcd` ("gcc-plugins: structleak: Generalize to all variable types") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-07-11 20:42:30 -04:00
Linus Torvalds	1f7563f743	SCSI sg on 20190709 This topic branch covers a fundamental change in how our sg lists are allocated to make mq more efficient by reducing the size of the preallocated sg list. This necessitates a large number of driver changes because the previous guarantee that if a driver specified SG_ALL as the size of its scatter list, it would get a non-chained list and didn't need to bother with scatterlist iterators is now broken and every driver must use scatterlist iterators. This was broken out as a separate topic because we need to convert all the drivers before pulling the trigger and unconverted drivers kept being found, necessitating a rebase. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTzzCYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishZB+AP9I8j/s wWfg0Z3WNuf4D5I3rH4x1J3cQTqPJed+RjwgcQEA1gZvtOTg1ZEn/CYMVnaB92x0 t6MZSchIaFXeqfD+E7U= =cv8o -----END PGP SIGNATURE----- Merge tag 'scsi-sg' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI scatter-gather list updates from James Bottomley: "This topic branch covers a fundamental change in how our sg lists are allocated to make mq more efficient by reducing the size of the preallocated sg list. This necessitates a large number of driver changes because the previous guarantee that if a driver specified SG_ALL as the size of its scatter list, it would get a non-chained list and didn't need to bother with scatterlist iterators is now broken and every driver must use scatterlist iterators. This was broken out as a separate topic because we need to convert all the drivers before pulling the trigger and unconverted drivers kept being found, necessitating a rebase" * tag 'scsi-sg' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (21 commits) scsi: core: don't preallocate small SGL in case of NO_SG_CHAIN scsi: lib/sg_pool.c: clear 'first_chunk' in case of no preallocation scsi: core: avoid preallocating big SGL for data scsi: core: avoid preallocating big SGL for protection information scsi: lib/sg_pool.c: improve APIs for allocating sg pool scsi: esp: use sg helper to iterate over scatterlist scsi: NCR5380: use sg helper to iterate over scatterlist scsi: wd33c93: use sg helper to iterate over scatterlist scsi: ppa: use sg helper to iterate over scatterlist scsi: pcmcia: nsp_cs: use sg helper to iterate over scatterlist scsi: imm: use sg helper to iterate over scatterlist scsi: aha152x: use sg helper to iterate over scatterlist scsi: s390: zfcp_fc: use sg helper to iterate over scatterlist scsi: staging: unisys: visorhba: use sg helper to iterate over scatterlist scsi: usb: image: microtek: use sg helper to iterate over scatterlist scsi: pmcraid: use sg helper to iterate over scatterlist scsi: ipr: use sg helper to iterate over scatterlist scsi: mvumi: use sg helper to iterate over scatterlist scsi: lpfc: use sg helper to iterate over scatterlist scsi: advansys: use sg helper to iterate over scatterlist ...	2019-07-11 15:17:41 -07:00
Linus Torvalds	ba6d10ab80	SCSI misc on 20190709 This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs, mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the removal of the osst driver (I heard from Willem privately that he would like the driver removed because all his test hardware has failed). Plus number of minor changes, spelling fixes and other trivia. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE fjy17dQLvsJ4GsidHy8= =aS5B -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs, mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the removal of the osst driver (I heard from Willem privately that he would like the driver removed because all his test hardware has failed). Plus number of minor changes, spelling fixes and other trivia. The big merge conflict this time around is the SPDX licence tags. Following discussion on linux-next, we believe our version to be more accurate than the one in the tree, so the resolution is to take our version for all the SPDX conflicts" Note on the SPDX license tag conversion conflicts: the SCSI tree had done its own SPDX conversion, which in some cases conflicted with the treewide ones done by Thomas & co. In almost all cases, the conflicts were purely syntactic: the SCSI tree used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the treewide conversion had used the new-style ones ("GPL-2.0-only" and "GPL-2.0-or-later"). In these cases I picked the new-style one. In a few cases, the SPDX conversion was actually different, though. As explained by James above, and in more detail in a pre-pull-request thread: "The other problem is actually substantive: In the libsas code Luben Tuikov originally specified gpl 2.0 only by dint of stating: * This file is licensed under GPLv2. In all the libsas files, but then muddied the water by quoting GPLv2 verbatim (which includes the or later than language). So for these files Christoph did the conversion to v2 only SPDX tags and Thomas converted to v2 or later tags" So in those cases, where the spdx tag substantially mattered, I took the SCSI tree conversion of it, but then also took the opportunity to turn the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag. Similarly, when there were whitespace differences or other differences to the comments around the copyright notices, I took the version from the SCSI tree as being the more specific conversion. Finally, in the spdx conversions that had no conflicts (because the treewide ones hadn't been done for those files), I just took the SCSI tree version as-is, even if it was old-style. The old-style conversions are perfectly valid, even if the "-only" and "-or-later" versions are perhaps more descriptive. * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits) scsi: qla2xxx: move IO flush to the front of NVME rport unregistration scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition scsi: qla2xxx: on session delete, return nvme cmd scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1 scsi: megaraid_sas: Introduce various Aero performance modes scsi: megaraid_sas: Use high IOPS queues based on IO workload scsi: megaraid_sas: Set affinity for high IOPS reply queues scsi: megaraid_sas: Enable coalescing for high IOPS queues scsi: megaraid_sas: Add support for High IOPS queues scsi: megaraid_sas: Add support for MPI toolbox commands scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD scsi: megaraid_sas: Handle sequence JBOD map failure at driver level scsi: megaraid_sas: Don't send FPIO to RL Bypass queue scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout scsi: megaraid_sas: Call disable_irq from process IRQ poll scsi: megaraid_sas: Remove few debug counters from IO path ...	2019-07-11 15:14:01 -07:00
Martin K. Petersen	893ca250ed	Merge branch '5.3/scsi-sg' into scsi-next	2019-06-27 00:19:33 -04:00
James Smart	41b194b843	lpfc: add sysfs interface to post NVME RSCN To support scenarios which aren't bound to nvmetcli add port scenarios, which is currently where the nvmet_fc transport invokes the discovery event callbacks, a syfs attribute is added to lpfc which can be written to cause an RSCN to be generated for the nport. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-06-21 11:08:38 +02:00
James Smart	6f2589f478	lpfc: add support for translating an RSCN rcv into a discovery rescan This patch updates RSCN receive processing to check for the remote port being an NVME port, and if so, invoke the nvme_fc callback to rescan the remote port. The rescan will generate a discovery udev event. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-06-21 11:08:38 +02:00
James Smart	ab723121a8	lpfc: add nvmet discovery_event op support This patch adds support for the nvmet discovery op. When the callback routine is called, the driver will call the routine to generate an RSCN to the port on the other end of the link. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-06-21 11:08:37 +02:00
James Smart	f60cb93bbf	lpfc: add support to generate RSCN events for nport This patch adds general RSCN support: - The ability to transmit an RSCN to the port on the other end of the link (regular port if pt2pt, or fabric controller if fabric). - And general recognition of an RSCN ELS when an ELS is received. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-06-21 11:08:37 +02:00
Ming Lei	46e8e475a1	scsi: lpfc: use sg helper to iterate over scatterlist Unlike the legacy I/O path, scsi-mq preallocates a large array to hold the scatterlist for each request. This static allocation can consume substantial amounts of memory on modern controllers which support a large number of concurrently outstanding requests. To facilitate a switch to a smaller static allocation combined with a dynamic allocation for requests that need it, we need to make sure all SCSI drivers handle chained scatterlists correctly. Convert remaining drivers that directly dereference the scatterlist array to using the iterator functions. [mkp: clarified commit message] Reviewed by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-20 14:39:21 -04:00
Nathan Chancellor	336df6eb62	scsi: lpfc: Avoid unused function warnings When building powerpc pseries_defconfig or powernv_defconfig: drivers/scsi/lpfc/lpfc_nvmet.c:224:1: error: unused function 'lpfc_nvmet_get_ctx_for_xri' [-Werror,-Wunused-function] drivers/scsi/lpfc/lpfc_nvmet.c:246:1: error: unused function 'lpfc_nvmet_get_ctx_for_oxid' [-Werror,-Wunused-function] These functions are only compiled when CONFIG_NVME_TARGET_FC is enabled. Use that same condition so there is no more warning. While the fixes commit did not introduce these functions, it caused these warnings. Fixes: 4064b27417a7 ("scsi: lpfc: Make some symbols static") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:25 -04:00
YueHaibing	d7b761b069	scsi: lpfc: Make some symbols static Fix sparse warnings: drivers/scsi/lpfc/lpfc_sli.c:115:1: warning: symbol 'lpfc_sli4_pcimem_bcopy' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_sli.c:7854:1: warning: symbol 'lpfc_sli4_process_missed_mbox_completions' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvmet.c:223:27: warning: symbol 'lpfc_nvmet_get_ctx_for_xri' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvmet.c:245:27: warning: symbol 'lpfc_nvmet_get_ctx_for_oxid' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_init.c:75:10: warning: symbol 'lpfc_present_cpu' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:24 -04:00
YueHaibing	a82b3539dc	scsi: lpfc: Remove set but not used variables 'qp' Fixes gcc '-Wunused-but-set-variable' warnings: drivers/scsi/lpfc/lpfc_init.c: In function lpfc_setup_cq_lookup: drivers/scsi/lpfc/lpfc_init.c:9359:30: warning: variable qp set but not used [-Wunused-but-set-variable] It's not used since commit e70596a60f88 ("scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:24 -04:00
Thomas Meyer	a5c990eea5	scsi: lpfc: Use _pool_zalloc rather than _pool_alloc Use _pool_zalloc rather than _pool_alloc followed by memset with 0. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:23 -04:00
James Smart	852eb63a71	scsi: lpfc: Update lpfc version to 12.2.0.3 Update lpfc version to 12.2.0.3 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	01d53c0463	scsi: lpfc: Fix kernel warnings related to smp_processor_id() Kernel warnings may be seen with preempt debugging enabled. Replace smp_processor_id calls with raw_smp_processor_id or cpu information stored in hdwq structures. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	aa6ff30918	scsi: lpfc: Fix BFS crash with DIX enabled Crashes in scsi_queue_rq or in dma_unmap_direct_sg during BFS when lpfc has lpfc_enable_bg=1. lpfc is setting DIX and prot sg after scsi_add_host_with_dma() has been called. The scsi_host_set_prot() and scsi_host_set_guard() routines need to be called before scsi_add_host_with_dma(). Revise the calling sequence to set the protection/guard data before calling scsi_add_host_with_dma(). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	93f647f93d	scsi: lpfc: Fix FDMI fc4type for nvme support FDMI protocol support registration was not accurately showing nvme support. The fcponly-path clears the parameter object. Move the code out of the fcponly code path. Fix the FDMI registration data to properly check for nvme support. Commonize the manner in which the fdmi routines set protocol support. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	996a02aeb9	scsi: lpfc: Fix fcp_rsp_len checking on lun reset Issuing a LUN reset was resulting in a command failure which then escalated to a host reset. The FCP-4 spec allows fcp_rsp_len field to specify the number of valid bytes of FCP_RSP_INFO, and the value could be 4 or 8. The driver is allowing only a value of 8, thus it failed the command. Revise the driver to allow 4 or 8. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	657add4e5e	scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors While fixing the resources per socket, realized the driver was not using hardware queues (up to 1 per cpu) if there were fewer interrupt vectors. The driver was only using the hardware queue assigned to the cpu with the vector. Rework the affinity map check to use the additional hardware queue elements that had been allocated. If the cpu count exceeds the hardware queue count - share, but choose what is shared with by: hyperthread peer, core peer, socket peer, or finally similar cpu in a different socket. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	d9954a2d18	scsi: lpfc: Fix oops when driver is loaded with 1 interrupt vector The driver was coded expecting enough hardware queues and interrupt vectors such that at least there was one per socket. In the case where there were fewer than sockets, cpus were left unassigned thus null pointers. Rework the affinity mappings. Map settings for the cpu's that are in the irq cpu mask. For each cpu not in the mask, map to another cpu that does have a mask. Choice of the "other" cpu will attempt to map to the same cpu but differing hyperthread, or cpu within in same core, or cpu within same socket, or finally cpu in the base socket. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:22 -04:00
James Smart	b8e6f13617	scsi: lpfc: Fix incorrect logical link speed on trunks when links down Invalid logical speed is displayed for trunk enabled ports when all ports are down. Also noted that link speed is incorrectly reported for the units when links are up. Current code is returning the logical link speed from the last event from the adapter. In cases where the last link went down, the link speed in the event was not valid - meaning that although the links where down the field had a bogus value. Rework the event handling to qualify the trunk link state before using the event speed data. Also correct units on other areas where the logical link speed was taken from a link event. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	04d210c98e	scsi: lpfc: Fix memory leak in abnormal exit path from lpfc_eq_create eq create is leaking mailbox memory if it encounters an error. rework error path to free the memory. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	c15e07047e	scsi: lpfc: Rework misleading nvme not supported in firmware message The driver unconditionally says fw doesn't support nvme when in truth it was a driver parameter settings that disabled nvme support. Rework the code validating nvme support to accurately report what condition is disabling nvme support. Save state on whether nvme fw supports nvme in case sysfs attributes change dynamically. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	b9e5a2d961	scsi: lpfc: Fix hardlockup in scsi_cmd_iocb_cmpl There is a race condition with the abort handler declaring a waitq item on it's stack, followed by a timeout in the abort handler that has it give up on the abort return to its caller. When the io is finally aborted and its completion handler called, it references the waitq element that the abort_handler set up, which is no longer valid resulting in a deadlock. Fix by clearing the waitq reference, under lock, when the abort handler timeout gives up. Have the completion handler validate the waitq before referencing it. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	6594d31bab	scsi: lpfc: Cancel queued work for an IO when processing a received ABTS When queued work is executed posting a new command to the transport the driver is reporting a null buffer. The driver had received an ABTS which matched a command that had been scheduled for delivery to the transport. The driver proceeded to cancel the command, but the work item was never cancelled. Fix by cancelling the queued work item. Also turns out the ABTS response was not properly sending a BA_ACC, so set the flag to send the ACC. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	51d23fb28c	scsi: lpfc: Prevent 'use after free' memory overwrite in nvmet LS handling Use-after-free memory overwrite detected. Problem reported by Ewan Milne at Red Hat after running lpfc target with additional memory checking enabled. Race condition when lpfc_nvmet_xmt_ls_rsp_cmp frees the ctxp memory in interrupt context before lpfc_nvmet_xmt_ls_rsp clears a field in the ctxp after successfully issuing the wqe. Remove the unnecessary ctxp write after reposting the rq buffer. The ctxp->rqb_buffer field is not checked in LS handling after the wqe is submitted. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reported-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	f22bfe8d1c	scsi: lpfc: Fix PT2PT PLOGI collison stopping discovery Under heavy load the target stops responding, the drivers aborts timeout and we start recovery by logging out of the target, but the target is never logged into again. In a point-to-point scenario, there were battling PLOGI's. When we received a PLOGI request after having sent one, the driver cancels the processing of the original plogi. However, the completion path of the remaining plogi was coded to skip the reg_rpi that should be happening on the 2nd plogi. Correct by adding a simple pt2pt check such that the 2nd plogi isn't skipped and the reg_login occurs. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	f6978f4163	scsi: lpfc: Revert message logging on unsupported topology Turns out the message change in 12.2.0.1 for unsupported topology makes the linux driver out of sync with other products. Revert the message back to the prior content for product consistency. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	79d8c4ce01	scsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames The driver currently is relying on firmware to match ABTSs to existing exchanges. This works fine as long as an exchange has been assigned to the io and work posted to it. However, for unmapped frames (rxid=0xFFFF), the driver has yet to assign an xri. The driver was blindly saying it couldn't match the ABTS and sending the BA_xxx. However, the command frame may have been in queues waiting on xri's before posting to the nvmet_fc layer. When xri's became available, the command frame would still be pushed to the transport and that io would execute, even though the io had been killed by ABTS. The initiator, seeing the io ABTS'd, would reuse the exchange for a different io which would be received on the target and pushed up. If the "zombie" io then came back down and started transmitting, the initiator would match the oxid and accept erroneous data. Bad things happened. Add tracking of active exchanges in the target to allow matching of a received ABTS against active or pending IO requests. If the ABTS is matched to a pending or active IO, the drive initiates cleanup and conditionally notifies the transport. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	d74a89aab9	scsi: lpfc: Separate CQ processing for nvmet_fc upcalls Currently the driver is notified of new command frame receipt by CQEs. As part of the CQE processing, the driver upcalls the nvmet_fc transport to deliver the command. nvmet_fc, as part of receiving the command builds out a context for it, where one of the first steps is to allocate memory for the io. When running with tests that do large ios (1MB), it was found on some systems, the total number of outstanding I/O's, at 1MB per, completely consumed the system's memory. Thus additional ios were getting blocked in the memory allocator. Given that this blocked the lpfc thread processing CQEs, there were lots of other commands that were received and which are then held up, and given CQEs are serially processed, the aggregate delays for an IO waiting behind the others became cummulative - enough so that the initiator hit timeouts for the ios. The basic fix is to avoid the direct upcall and instead schedule a work item for each io as it is received. This allows the cq processing to complete very quickly, and each io can then run or block on it's own. However, this general solution hurts latency when there are few ios. As such, implemented the fix such that the driver watches how many CQEs it has processed sequentially in one run. As long as the count is below a threshold, the direct nvmet_fc upcall will be made. Only when the count is exceeded will it revert to work scheduling. Given that debug of this showed a surprisingly long delay in cq processing, the io timer stats were updated to better reflect the processing of the different points. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	2ab70c2106	scsi: lpfc: Revise message when stuck due to unresponsive adapter Revise a stalled adapter message to also include the number of jobs that are stalling the thread. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	4767c58af9	scsi: lpfc: Correct nvmet buffer free race condition A race condition resulted in receive buffers being placed in the free list twice. Change the locking and handling to check whether the "other" path will be freeing the entry in a later thread and skip it if it is. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:20 -04:00
James Smart	32b9386564	scsi: lpfc: Fix nvmet target abort cmd matching After receiving an unsolicited ABTS (meaning rxid is 0xFFFF), the driver used the oxid from the initiator to match against a local xri which may have been allocated for the io. The xri would be the rxid - it's an invalid check resulting in the command not being matched or erroneously matched. Change the lookup to use the oxid and the SID to match against received IO's original values. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:20 -04:00
James Smart	2d71dc8eb6	scsi: lpfc: Fix alloc context on oas lun creations Softlockups are seen in low memory situations. They are due to doing oas_lun allocation with GFP_KERNEL in atomic contexts. Change the calls to oas_lun to indicate atomic context so that GFP_ATOMIC is used. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:20 -04:00
James Smart	69b9c52ca5	scsi: lpfc: Update lpfc version to 12.2.0.2 Due to the couple of bug fixes, update lpfc version to 12.2.0.2 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-05-13 20:32:50 -04:00
James Smart	c8cb261a07	scsi: lpfc: add check for loss of ndlp when sending RRQ There was a missing qualification of a valid ndlp structure when calling to send an RRQ for an abort. Add the check. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-05-13 20:32:50 -04:00
James Smart	79080d349f	scsi: lpfc: correct rcu unlock issue in lpfc_nvme_info_show Many of the exit cases were not releasing the rcu read lock. Corrected the exit paths. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-05-13 20:32:50 -04:00
James Smart	e2a8be5696	scsi: lpfc: resolve lockdep warnings There were a number of erroneous comments and incorrect older lockdep checks that were causing a number of warnings. Resolve the following: - Inconsistent lock state warnings in lpfc_nvme_info_show(). - Fixed comments and code on sequences where ring lock is now held instead of hbalock. - Reworked calling sequences around lpfc_sli_iocbq_lookup(). Rather than locking prior to the routine and have routine guess on what lock, take the lock within the routine. The lockdep check becomes unnecessary. - Fixed comments and removed erroneous hbalock checks. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> CC: Bart Van Assche <bvanassche@acm.org> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-05-13 20:32:49 -04:00
YueHaibing	c709297525	scsi: lpfc: Make lpfc_sli4_oas_verify static Fix sparse warning: drivers/scsi/lpfc/lpfc_init.c:13091:1: warning: symbol 'lpfc_sli4_oas_verify' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-18 20:35:58 -04:00
Martin K. Petersen	17631462cd	Merge branch '5.1/scsi-fixes' into 5.2/merge We have a few submissions for 5.2 that depend on fixes merged post 5.1-rc1. Merge the fixes branch into queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-12 21:27:23 -04:00
Hannes Reinecke	a6a6d0589a	scsi: scsi_transport_fc: nvme: display FC-NVMe port roles Currently the FC-NVMe driver is leverating the SCSI FC transport class to access the remote ports. Which means that all FC-NVMe remote ports will be visible to the fc transport layer, but due to missing definitions the port roles will always be 'unknown'. This patch adds the missing definitions to the fc transport class to that the port roles are correctly displayed. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Giridhar Malavali <gmalavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-12 20:09:34 -04:00
James Smart	1a61e5486a	scsi: lpfc: add support for posting FC events on FPIN reception This patch adds support to recognize FPIN ELS's that are received. When one is received, the fc transport will be called to handle the the FPIN. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-08 21:29:16 -04:00
Bart Van Assche	d964b3e534	scsi: lpfc: Fix a recently introduced compiler warning This patch avoids that the following compiler warning is reported with CONFIG_NVME_FC=n: drivers/scsi/lpfc/lpfc_nvme.c:2140:1: warning: 'lpfc_nvme_lport_unreg_wait' defined but not used [-Wunused-function] lpfc_nvme_lport_unreg_wait(struct lpfc_vport *vport, ^~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: `3999df75bc` ("scsi: lpfc: Declare local functions static") Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-08 21:26:11 -04:00
James Smart	4eb0153588	scsi: lpfc: Fix missing wakeups on abort threads Abort thread wakeups, on some wqe types, are not happening. The thread wakeup logic is dependent upon the LPFC_DRIVER_ABORTED flag. However, on these wqes, the completion handler running prior to the io completion routine ends up clearing the flag. Rework the wakeup logic to look at a non-null waitq element which must be set if the abort thread is waiting. This is reverting the change in the indicated patch. Fixes: `c2017260ee` ("scsi: lpfc: Rework locking on SCSI io completion") Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:35:09 -04:00
Bart Van Assche	d6d189ceab	scsi: lpfc: Change smp_processor_id() into raw_smp_processor_id() This patch avoids that a kernel warning appears when smp_processor_id() is called with preempt debugging enabled. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:36 -04:00
Bart Van Assche	d8c2040bf9	scsi: lpfc: Remove unused functions Remove those functions that are not called from outside the removed functions. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:36 -04:00
Bart Van Assche	b27cbd5549	scsi: lpfc: Remove set-but-not-used variables This patch does not change any functionality but avoids that the compiler complains about set-but-not-used variables when building with W=1. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:36 -04:00
Bart Van Assche	a73cb81492	scsi: lpfc: Move trunk_errmsg[] from a header file into a .c file Arrays should be defined in .c files instead of in a header file. This patch reduces the size of the lpfc kernel module. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:36 -04:00
Bart Van Assche	cd05c155d7	scsi: lpfc: Annotate switch/case fall-through This patch avoids that the compiler warns about missing fall-through annotation when building with W=1. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:35 -04:00
Bart Van Assche	ffd43814d9	scsi: lpfc: Fix indentation and balance braces This patch avoid that smatch complains about misleading indentation. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:35 -04:00
Bart Van Assche	3999df75bc	scsi: lpfc: Declare local functions static This patch avoids that the compiler complains about missing declarations when building with W=1. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:35 -04:00
Arnd Bergmann	faf5a744f4	scsi: lpfc: avoid uninitialized variable warning clang -Wuninitialized incorrectly sees a variable being used without initialization: drivers/scsi/lpfc/lpfc_nvme.c:2102:37: error: variable 'localport' is uninitialized when used here [-Werror,-Wuninitialized] lport = (struct lpfc_nvme_lport )localport->private; ^~~~~~~~~ drivers/scsi/lpfc/lpfc_nvme.c:2059:38: note: initialize the variable 'localport' to silence this warning struct nvme_fc_local_port localport; ^ = NULL 1 error generated. This is clearly in dead code, as the condition leading up to it is always false when CONFIG_NVME_FC is disabled, and the variable is always initialized when nvme_fc_register_localport() got called successfully. Change the preprocessor conditional to the equivalent C construct, which makes the code more readable and gets rid of the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-25 22:17:26 -04:00
Silvio Cesare	e7f7b6f38a	scsi: lpfc: change snprintf to scnprintf for possible overflow Change snprintf to scnprintf. There are generally two cases where using snprintf causes problems. 1) Uses of size += snprintf(buf, SIZE - size, fmt, ...) In this case, if snprintf would have written more characters than what the buffer size (SIZE) is, then size will end up larger than SIZE. In later uses of snprintf, SIZE - size will result in a negative number, leading to problems. Note that size might already be too large by using size = snprintf before the code reaches a case of size += snprintf. 2) If size is ultimately used as a length parameter for a copy back to user space, then it will potentially allow for a buffer overflow and information disclosure when size is greater than SIZE. When the size is used to index the buffer directly, we can have memory corruption. This also means when size = snprintf... is used, it may also cause problems since size may become large. Copying to userspace is mitigated by the HARDENED_USERCOPY kernel configuration. The solution to these issues is to use scnprintf which returns the number of characters actually written to the buffer, so the size variable will never exceed SIZE. Signed-off-by: Silvio Cesare <silvio.cesare@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Kees Cook <keescook@chromium.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-25 22:14:16 -04:00
James Smart	92f3b32718	scsi: lpfc: Fixup eq_clr_intr references Declaring interrupt clear routines as inline is bogus as they are used as an indirect pointer. Remove the inline references. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-20 20:03:47 -04:00
James Bottomley	c88725dd14	scsi: lpfc: Fix build error You can't declare a function inline in a header if it doesn't have a body available to the compiler. So realistically you either don't declare it inline or you make it a static inline in the header. I think the latter applies in this case, so this should be the fix Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-20 20:03:47 -04:00
Matteo Croce	92684bfc9b	scsi: be2iscsi: lpfc: fix typo Fix spelling mistake: "lenght" -> "length" Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 17:38:20 -04:00
James Smart	d095986d69	scsi: lpfc: Update lpfc version to 12.2.0.1 Update lpfc version to 12.2.0.0 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:10 -04:00
James Smart	95df18c253	scsi: lpfc: Update Copyright in driver version Revise driver copyright message to show 2019. Update couple of files modified by 12.2.0.1 patch set. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:10 -04:00
James Smart	2c013a3a6b	scsi: lpfc: Enhance 6072 log string Update the 6072 log message string to print the whole 32 bits of the extended status. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:10 -04:00
James Smart	c835c0854c	scsi: lpfc: Fix duplicate log message numbers Driver had duplicated log message numbers making debug difficult. Make all messages unique. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:10 -04:00
James Smart	c1a21ebc0f	scsi: lpfc: Specify node affinity for queue memory allocation Change the SLI4 queue creation code to use NUMA node based memory allocation based on the cpu the queues will be related to. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	9afbee3d62	scsi: lpfc: Reduce memory footprint for lpfc_queue Currently the driver maintains a sideband structure which has a pointer for each queue element. However, at 8 bytes per pointer, and up to 4k elements per queue, and 100s of queues, this can take up a lot of memory. Convert the driver to using an access routine that calculates the element address based on its index rather than using the pointer table. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	9a66d990c7	scsi: lpfc: Add loopback testing to trunking mode When in trunking mode, the adapter can be placed into diagnostic mode and each link in the trunk tested via loopback. Add support to the driver to perform per-link loopback testing when in trunking mode. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	f3339800f9	scsi: lpfc: Fix link speed reporting for 4-link trunk Driver is using uint16_t and is encountering an overflow of the 16bits when calculating link speed. Fix by using a u32 type. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	19193ff34e	scsi: lpfc: Fix handling of trunk links state reporting If all the trunk links drop and a single link resumes, the link_state is not properly reported. When trunked, the driver receives two async cqes. One acqe reports the trunk link states, which the driver records. The other cqe reports the overall state of the trunk. In the failing case, the trunk link state acqe preceeds the overall trunk link state acqe. The trunk link state acqe, as it's an "up" transition, calls a code path which ensures a down transition before moving to the up state. The down transition had a side effect of clearing the just-saved trunk link states. Fix by not clearing the trunk link states if we've already transitioned to a down state. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	e4771ec3c8	scsi: lpfc: Fix protocol support on G6 and G7 adapters Invalid test is allowing Loop to be a supported topology on G6 and G7 adapters. The chips do not support loop as their link speeds prohibit loop per standard. Correct the conditional so that loop is not reported. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	b3b4f3e1d5	scsi: lpfc: Correct boot bios information to FDMI registration The driver is currently reporting the firmware revision not the actual boot bios version in FDMI data. Modify the driver to obtain the boot bios version from the adapter and use that data in the FMDI data sent to the switch. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	f4f87861d9	scsi: lpfc: Fix HDMI2 registration string for symbolic name The switch is rejecting FDMI2 registration for symbolic name. There is a "\n" in the name string, which the switch dislikes thus rejects the registration. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	32a80c093b	scsi: lpfc: Fix fc4type information for FDMI The driver is reporting support for NVME even when not configured for NVME operation. Fix (and make more readable) when NVME protocol support is indicated. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	d67f935b79	scsi: lpfc: Fix FDMI manufacturer attribute value The FDMI manufacturer value being reported on Linux is inconsistent with other OS's. Set the value to "Emulex Corporation" for consistency. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	c66a919746	scsi: lpfc: Fix io lost on host resets If the driver undergoes repeated host resets it starts losing exchange structures and eventually returns SCSI_MLQUEUE_HOST_BUSY and does not recover. The offline path is not reclaiming the outstanding ios on the fcp pring txcmplq before calling lpfc_destroy_multixripool, which causes the txmcplq to be reinit and the resources lost. Flush the fcp rings before destroying the multixripools. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:09 -04:00
James Smart	e8869f5b0a	scsi: lpfc: Fix mailbox hang on adapter init The adapter initialization sequence enables interrupts, initializes the adapter link_state to LINK_DOWN, then issues commands to initialize the adapter. The interrupt handler on the adapter validates the link_state (has to be at least LINK_DOWN) and if invalid, will discard the interrupting event. In most cases, there is not a command completion, thus an interrupt until the initialization commands have been sent which is post the setting of state to LINK_DOWN. However, in cases of firmware reset, the reset will modify the link_state to an invalid value (indicating a reset of the adapter) and there occasionally are cases where the adapter will generate an asynchronous event which shares the eq/cq used for mailbox commands. In the failure case, an interrupt is generated immediately after enabling them due to the async event. As link_state is invalid, the eq is list and the CQ not serviced. At this point link_state is initialized and the mailbox command sent. As the CQ has not been serviced, it is not armed, so no interrupt event is generated when the mailbox command completes. Modify the initialization sequence so that interrupts are enabled after link_state is properly initialized, which avoids the race condition with the async event. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:08 -04:00
James Smart	bbd3d7380b	scsi: lpfc: Fix driver crash in target reset handler It's possible for the scsi error handler to fire and call the target reset handler simultaneously to the driver logging out and relogging into the system. If hit just right, the re-login may not have fully re-established the remote port and the rdata->pnod structure may be null. Check for NULL in the reset handler and return failure if NULL. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:08 -04:00
James Smart	2a0fb340fc	scsi: lpfc: Correct localport timeout duration error Current code incorrectly specifies a completion wait timeout duration in 5 jiffies, when it should have been 5 seconds. Fix the adjust for units for the completion timeout call. [mkp: manual merge] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 13:15:08 -04:00
James Smart	e2ffe4d5dc	scsi: lpfc: Convert bootstrap mbx polling from msleep to udelay Current code is using msleep when polling for hw ready. Unfortunately the msleep routine isn't very accurate on rescheduling. In fact, on a busy systems which reset the adapter, it became 10s of seconds before it was rescheduled. Fix by busy waiting using udelay. As we're now busy waiting, significantly reduce the wait time so that we can exit the pool loop as soon as possible. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	4645f7b56a	scsi: lpfc: Coordinate adapter error handling with offline handling The driver periodically checks for adapter error in a background thread. If the thread detects an error, the adapter will be reset including the deletion and reallocation of workqueues on the adapter. Simultaneously, there may be a user-space request to offline the adapter which may try to do many of the same steps, in parallel, on a different thread. As memory was deallocated while unexpected, the parallel offline request hit a bad pointer. Add coordination between the two threads. The error recovery thread has precedence. So, when an error is detected, a flag is set on the adapter to indicate the error thread is terminating the adapter. But, before doing that work, it will look for a flag that is set by the offline flow, and if set, will wait for it to complete before then processing the error handling path. Similarly, in the offline thread, it first checks for whether the error thread is resetting the adapter, and if so, will then wait for the error thread to finish. Only after it has finished, will it set its flag and offline the adapter. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	32a9310076	scsi: lpfc: Stop adapter if pci errors detected In a couple of cases, the driver detected a pci error (via pci device state or via failed register reads) but didn't take any action to disable the device. Additionally, the driver is ignoring the status of pci configuration space reads. Having the driver take the adapter offline whenever the pci error is detected. Pay attention to pci_config_space_read status and return failure if an error is seen. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	731eedcb31	scsi: lpfc: Fix deadlock due to nested hbalock call If an adapter fails, causing a board reset, the board reset routine lpfc_hba_down_s4() takes the hbalock out then calls lpfc_nvmet_ctxbuf_post() who then tries to take out the same lock. As the context lists are now protected under the buf_list_locks, there is no need for the hbalock to be held by the board reset routine. Fix by no longer taking the hbalock in the board reset routine. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	22b738ac33	scsi: lpfc: Fix nvmet handling of first burst cmd With negative test injection, the driver is receiving a command with first burst enabled, meaning Sequence initiative is not passed with the command frame. The driver notes the condition and discards the frame. However the driver calls the incorrect buffer free routine, resulting in a NULL pointer reference. For hbq buffer free, convert to using lpfc_rq_buf_free(). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	982ab128dc	scsi: lpfc: Fix lpfc_nvmet_mrq attribute handling when 0 Currently, when lpfc_nvmet_mrq is 0 it could mean 2 different things depending on when its looked at. If at module load time it specifies the default number of hardware queues to allocate, with 0 meaning default to the number of CPUs. But post module load, a value of zero means to disable mrq use. Changed the driver so that enablement of mrq is based on whether nvme target mode is enabled or not. When enabled, mrq is enabled. Thus, the cfg_nvemt_mrq field only specifies the number of mrq queues to enable, with 0 defaulting to the number of cpus. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	4552e0f6fa	scsi: lpfc: Fix nvmet async receive buffer replenishment Under circustances with high load, the driver is running out of async receive buffers which may result in one of the following messages: 0:6401 RQE Error x13, posted 226 err_cnt 0: 925c6050 925c604e 925c5d54 or 0:2885 Port Status Event: port status reg 0x81800000, port smphr reg 0xc000, error 1=0x52004a01, error 2=0x0 The driver is waiting for full io completion before returning receive buffers to the adapter. There is no need for such a relationship. Whenever a new command is received from the wire, the driver will have two contexts - an io context (ctxp) and a receive buffer context. In current code, the receive buffer context stays 1:1 with the io and won't be reposted to the hardware until the io completes. There is no need for such a relationship. Change the driver so that up on successful handing of the command to the transport, where the transport has copied what it needed thus the buffer is returned to the driver, have the driver immediately repost the buffer to the hardware. If the command cannot be successfully handed to the transport as transport resources are temporarily busy, have the driver allocate a new and separate receive buffer and post it to the hardware so that hardware can continue while the command is queued for the transport. When an io is complete, the transport returns the io context to the driver, and the driver may be waiting for more contexts, thus immediately reuse the io context. In this path, there was a buffer posted when the receive buffer was queued waiting for an io context so a replacement is not needed in the new code additions. Thus, exempt this the context reuse case from the buffer reposting. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	def11a58c1	scsi: lpfc: Fix location of SCSI ktime counters The debug ktime counters that trace an io were inadvertently not placed in the common section of an io buffer. Thus, they generate an invalid opcode error when accessed. Move the ktime counters into the common area. Fixes: `0794d601d1` ("scsi: lpfc: Implement common IO buffers between NVME and SCSI") Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	c95a3b4b0f	scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices During debug, it was seen that the driver is issuing commands specific to SLI3 on SLI4 devices. Although the adapter correctly rejected the command, this should not be done. Revise the code to stop sending these commands on a SLI4 adapter. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:02 -04:00
James Smart	9b16406864	scsi: lpfc: Fix use-after-free mailbox cmd completion When unloading the driver, mailbox commands may be sent without holding a reference on the ndlp. By the time the mailbox command completes, the ndlp may have reduced its ref counts and been freed. The problem was reported by KASAN. While unregistering due to driver unload, have the completion noop'd by setting the ndlp context NULL'd. Due to the unload, no further action was necessary. Also, while reviewing this path, the generic nulling of the context after handling should be slightly moved. Reported by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:01 -04:00
James Smart	50e3f871fb	scsi: lpfc: Resolve irq-unsafe lockdep heirarchy warning in lpfc_io_free A patch in the 12.2.0.0 set caused a new lockdep warning: WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 5.0.0-rc8-next-20190301-dbg+ #1 Not tainted Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&qp->io_buf_list_put_lock)->rlock); local_irq_disable(); lock(&(&phba->hbalock)->rlock); lock(&(&qp->io_buf_list_put_lock)->rlock); <Interrupt> lock(&(&phba->hbalock)->rlock); see: https://www.spinics.net/lists/linux-scsi/msg128389.html In summary, the new patch added taking the io_buf_list_put_lock while under an irq-disabled hbalock. This created a lock heirarchy dependent upon irq being disabled, and there are paths that take the io_buf_list_put_lock without disabling irq. Looking at the lpfc_io_free routine, which is where the new heirarchy was introduced, there is no reason to be taking out the hbalock and raising irq, as the functionality is replaced by the io_buf_list_xxx locks. Resolve by removing the hbalock/irq calls in lpfc_io_free. Fixes: `5e5b511d8b` ("scsi: lpfc: Partition XRI buffer list across Hardware Queues") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:01 -04:00
James Smart	ff6bf89717	scsi: lpfc: Resolve inconsistent check of hdwq in lpfc_scsi_cmd_iocb_cmpl A prior patch which added support for non-uniform allocation of MSIX vectors now causes a smatch complaint: drivers/scsi/lpfc/lpfc_scsi.c:3674 lpfc_scsi_cmd_iocb_cmpl() error: we previously assumed 'phba->sli4_hba.hdwq' could be null (see line 3667) Resolve by removing the unnecessary check for a NULL hdwq table. Fixes 6a828b0f6192: ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-19 12:57:01 -04:00
Linus Torvalds	477558d7e8	SCSI misc on 20190315 This is the final round of mostly small fixes and performance improvements to our initial submit. The main regression fix is the ia64 simscsi build failure which was missed in the serial number elimination conversion. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIxBayYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisherpAP4rxLpX bcUnQnEsvoxys/JyoK08Qfv1JebZo1B2MAZ62wD/VZ7LpOuzVLhsM2KhLFGRrs1/ 7D2K4tgtO2dQsFix7H0= =pcHl -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull more SCSI updates from James Bottomley: "This is the final round of mostly small fixes and performance improvements to our initial submit. The main regression fix is the ia64 simscsi build failure which was missed in the serial number elimination conversion" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: ia64: simscsi: use request tag instead of serial_number scsi: aacraid: Fix performance issue on logical drives scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup() scsi: libiscsi: Hold back_lock when calling iscsi_complete_task scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port scsi: hisi_sas: Set PHY linkrate when disconnected scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO scsi: hisi_sas: Change return variable type in phy_up_v3_hw() scsi: qla2xxx: check for kstrtol() failure scsi: lpfc: fix 32-bit format string warning scsi: lpfc: fix unused variable warning scsi: target: tcmu: Switch to bitmap_zalloc() scsi: libiscsi: fall back to sendmsg for slab pages scsi: qla2xxx: avoid printf format warning scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check scsi: ufs: hisi: fix ufs_hba_variant_ops passing scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show ...	2019-03-16 12:51:50 -07:00
Dan Carpenter	3a487ff78c	scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup() It used to be that "error" was set to -ENODEV at the start of the function but we shifted some code around an now "error" is set to zero for most error paths. There is a mix of direct returns and "goto out" but I changed everything to direct returns for consistency. Fixes: `56de835704` ("scsi: lpfc: fix calls to dma_set_mask_and_coherent()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-14 07:01:36 -04:00
Linus Torvalds	92fff53b71	SCSI misc on 20190306 This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc, hisi_sas, target/iscsi and target/core. Additionally Christoph refactored gdth as part of the dma changes. The major mid-layer change this time is the removal of bidi commands and with them the whole of the osd/exofs driver and filesystem. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIC54SYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishT1GAPwJEV23 ExPiPsnuVgKj49nLTagZ3rILRQcYNbL+MNYqxQEA0cT8FHzSDBfWY5OKPNE+RQ8z f69LpXGmMpuagKGvvd4= =Fhy1 -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc, hisi_sas, target/iscsi and target/core. Additionally Christoph refactored gdth as part of the dma changes. The major mid-layer change this time is the removal of bidi commands and with them the whole of the osd/exofs driver and filesystem. This is a major simplification for block and mq in particular" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits) scsi: cxgb4i: validate tcp sequence number only if chip version <= T5 scsi: cxgb4i: get pf number from lldi->pf scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c scsi: mpt3sas: Add missing breaks in switch statements scsi: aacraid: Fix missing break in switch statement scsi: kill command serial number scsi: csiostor: drop serial_number usage scsi: mvumi: use request tag instead of serial_number scsi: dpt_i2o: remove serial number usage scsi: st: osst: Remove negative constant left-shifts scsi: ufs-bsg: Allow reading descriptors scsi: ufs: Allow reading descriptor via raw upiu scsi: ufs-bsg: Change the calling convention for write descriptor scsi: ufs: Remove unused device quirks Revert "scsi: ufs: disable vccq if it's not needed by UFS device" scsi: megaraid_sas: Remove a bunch of set but not used variables scsi: clean obsolete return values of eh_timed_out scsi: sd: Optimal I/O size should be a multiple of physical block size scsi: MAINTAINERS: SCSI initiator and target tweaks scsi: fcoe: make use of fip_mode enum complete ...	2019-03-09 16:53:47 -08:00
Arnd Bergmann	f996861be1	scsi: lpfc: fix 32-bit format string warning On 32-bit architectures, we see a warning when %ld is used to print a size_t: In file included from drivers/scsi/lpfc/lpfc_init.c:62: drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_new_io_buf': drivers/scsi/lpfc/lpfc_logmsg.h:62:45: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'unsigned int' [-Werror=format=] This is harmless, but portable code should just use %zd to avoid the warning. Fixes: `0794d601d1` ("scsi: lpfc: Implement common IO buffers between NVME and SCSI") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-06 19:26:46 -05:00
Arnd Bergmann	352b205a3b	scsi: lpfc: fix unused variable warning The newly introduced 'cpu' variable is only used inside of an optional block, so we get a warning without CONFIG_SCSI_LPFC_DEBUG_FS: drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_nvme_io_cmd_wqe_cmpl': drivers/scsi/lpfc/lpfc_nvme.c:968:30: error: unused variable 'cpu' [-Werror=unused-variable] uint32_t code, status, idx, cpu; Move the declaration into the same block to avoid the warning. Fixes: `63df6d637e` ("scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-06 19:26:46 -05:00
James Smart	1ffdd2c044	scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset The patch that replaced io channels for hdw_queues now reports the following static checker warning: drivers/scsi/lpfc/lpfc_init.c:11136 lpfc_sli4_hba_unset() error: we previously assumed 'phba->pport' could be null (see line 11074) Resolve by adding a pport NULL check. [mkp: tag tweak] Fixes: `cdb42becdd` ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu"_ Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-06 19:26:45 -05:00
James Smart	cda7fa1865	scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check The outer routine lpfc_sli_issue_iocb(), which decomposes into the SLI3 (s3) or SLI4 (s4) subroutines takes out the locks. For s3, it takes out the hbalock. For s4, it takes out the ring_lock. The lockdep check in the s3 and s4 subroutines both check hbalock, which is incorrect for s4. Revise the s4 subroutine to lockdep check the ring_lock. Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-03-06 19:26:45 -05:00
Linus Torvalds	df49fd0ff8	SCSI fixes on 20190302 Nine small fixes. The resume fix is a cosmetic removal of a warning with an incorrect condition causing it to alarm people wrongly. The other eight patches correct a thinko in Christoph Hellwig's DMA conversion series. Without it all these drivers end up with 32 bit DMA masks meaning they bounce any page over 4GB before sending it to the controller. Nowadays, even laptops mostly have memory above 4GB, so this can lead to significant performance degradation with all the bouncing. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXHql8CYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQmvAQCSVQRf kx3ABDGnaj4Km4/Jzibj44aCYwh+ewwtLCWwFQD9GWaEaDxBkbxQDf/YndQKRhYg VJQjjj6a9VlNSmWoW28= =L9Fe -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Nine small fixes. The resume fix is a cosmetic removal of a warning with an incorrect condition causing it to alarm people wrongly. The other eight patches correct a thinko in Christoph Hellwig's DMA conversion series. Without it all these drivers end up with 32 bit DMA masks meaning they bounce any page over 4GB before sending it to the controller. Nowadays, even laptops mostly have memory above 4GB, so this can lead to significant performance degradation with all the bouncing" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: core: Avoid that system resume triggers a kernel warning scsi: hptiop: fix calls to dma_set_mask() scsi: hisi_sas: fix calls to dma_set_mask_and_coherent() scsi: csiostor: fix calls to dma_set_mask_and_coherent() scsi: bfa: fix calls to dma_set_mask_and_coherent() scsi: aic94xx: fix calls to dma_set_mask_and_coherent() scsi: 3w-sas: fix calls to dma_set_mask_and_coherent() scsi: 3w-9xxx: fix calls to dma_set_mask_and_coherent() scsi: lpfc: fix calls to dma_set_mask_and_coherent()	2019-03-02 11:39:54 -08:00
Hannes Reinecke	56de835704	scsi: lpfc: fix calls to dma_set_mask_and_coherent() The change to use dma_set_mask_and_coherent() incorrectly made a second call with the 32 bit DMA mask value when the call with the 64 bit DMA mask value succeeded. This resulted in NVMe/FC connections failing due to corrupted data buffers, and various other SCSI/FCP I/O errors. Fixes: `f30e1bfd61` ("scsi: lpfc: use dma_set_mask_and_coherent") Cc: <stable@vger.kernel.org> Suggested-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-25 21:37:25 -05:00
YueHaibing	59e54d9aab	scsi: lpfc: Remove set but not used variable 'phys_id' Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_cpu_affinity_check': drivers/scsi/lpfc/lpfc_init.c:10599:19: warning: variable 'phys_id' set but not used [-Wunused-but-set-variable] It never used since introduction in commit `6a828b0f61` ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-19 18:58:34 -05:00
Colin Ian King	258f84fae3	scsi: lpfc: fix a handful of indentation issues There are a handful of statements that are indented incorrectly. Fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-13 22:15:42 -05:00
Dan Carpenter	fad28e3d9a	scsi: lpfc: Fix error code if kcalloc() fails This should return -ENOMEM if kcalloc() fails, but it accidentally returns success instead. Fixes: `6a828b0f61` ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-12 22:15:54 -05:00
James Smart	42fb055a57	scsi: lpfc: Update lpfc version to 12.2.0.0 Update lpfc version to 12.2.0.0 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	0d041215f0	scsi: lpfc: Update 12.2.0.0 file copyrights to 2019 For files modified as part of 12.2.0.0 patches, update copyright to 2019 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	c160c0f806	scsi: lpfc: Fix nvmet issues when link bounce under IO load Various null pointer dereference and general protection fault panics occur when there is a link bounce under load. There are a large number of "error" message 6413 indicating "bad release". The issues resolve to list corruptions due to missing or inconsistent lock protection. Lockups are due to nested locks in the unsolicited abort path. The unsolicited abort path calls the wrong abort processing routine. There was also duplicate context release while aborts were still active in the hardware. Removed duplicate locks and added lock protection around list item removal. Commonized lock handling around the abort processing routines. Prevent context release while still in ABTS list. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	472e146d1c	scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall When the transport calls into the lpfc target to release an IO job structure, which corresponds to an exchange, and if the driver was waiting for an exchange in order to post a previously received command to the transport, the driver immediately takes the IO job and reuses the context for the prior command and calls nvmet_fc_rcv_fcp_req() to tell the transport about a newly received command. Problem is, the execution of the IO job release may be in the context of the back end driver and its bio completion handlers, thus it may be in a irq context and protection code kicks in in the bio and request layers that are subsequently called. Rework lpfc so that instead of immediately upcalling, queue it to a deferred work thread and have the thread make the upcall. Took advantage of this change to remove duplicated code with the normal command receive path that preps the IO job and upcalls nvmet_fc. Created a common routine both paths use. Also corrected some errors that were found during review of the context freeing and reuse - basically unlocked operations and a somewhat disjoint set of calls to release associated job elements. Cleaned up this path and added locks for coherency. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	f6e8479052	scsi: lpfc: Fix default driver parameter collision for allowing NPIV support The conversion to enable SCSI and NVME fc4 support ran into an issue with NPIV support. With NVME, NPIV is not currently supported, but with SCSI it was. The driver reverted to its lowest setting meaning NPIV with SCSI was not allowed. Convert the NPIV checks and implementation so that SCSI can continue to allow NPIV support. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	c2017260ee	scsi: lpfc: Rework locking on SCSI io completion A scsi host lock is taken on every io completion to check whether the abort handler is waiting on the io completion. This is an expensive lock to take on all completion when rarely in an abort condition. Replace scsi host lock with command-specific lock. Synchronize completion and abort paths by new cmd lock. Ensure all flag changing and nulling of context pointers taken under lock. When adding lock to task management abort, realized it was missing other synchronization locks. Added that synchronization to match normal paths. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	b1684a0b42	scsi: lpfc: Enable SCSI and NVME fc4s by default Now that performance mods don't split resources by protocol and enable both protocols by default, there's no reason not to enable concurrent SCSI and NVME fc4 support. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	222e9239c6	scsi: lpfc: Resize cpu maps structures based on possible cpus The work done to date utilized the number of present cpus when sizing per-cpu structures. Structures should have been sized based on the max possible cpu count. Convert the driver over to possible cpu count for sizing allocation. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	75508a8b8b	scsi: lpfc: Utilize new IRQ API when allocating MSI-X vectors Current driver uses the older IRQ API for MSIX allocation Change driver to utilize pci_alloc_irq_vectors when allocating IRQ vectors. Make lpfc_cpu_affinity_check use pci_irq_get_affinity to determine how the kernel mapped all the IRQs. Remove msix_entries from SLI4 structure, replaced with pci_irq_vector() usage. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:49 -05:00
James Smart	32517fc097	scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing When driving high iop counts, auto_imax coalescing kicks in and drives the performance to extremely small iops levels. There are two issues: 1) auto_imax is enabled by default. The auto algorithm, when iops gets high, divides the iops by the hdwq count and uses that value to calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether they have load or not. The EQ_delay is only manipulated every 5s (a long time). Thus there were large 5s swings of no interrupt delay followed by large/maximum delay, before repeating. 2) When processing a CQ, the driver got mixed up on the rate of when to ring the doorbell to keep the chip appraised of the eqe or cqe consumption as well as how how long to sit in the thread and process queue entries. Currently, the driver capped its work at 64 entries (very small) and exited/rearmed the CQ. Thus, on heavy loads, additional overheads were taken to exit and re-enter the interrupt handler. Worse, if in the large/maximum coalescing windows,k it could be a while before getting back to servicing. The issues are corrected by the following: - A change in defaults. Auto_imax is turned OFF and fcp_imax is set to 0. Thus all interrupts are immediate. - Cleanup of field names and their meanings. Existing names were non-intuitive or used for duplicate things. - Added max_proc_limit field, to control the length of time the handlers would service completions. - Reworked EQ handling: Added common routine that walks eq, applying notify interval and max processing limits. Use queue_claimed to claim ownership of the queue while processing. Always rearm the queue whenever the common routine is called. Rework queue element processing, namely to eliminate hba_index vs host_index. Only one index is necessary. The queue entry can be marked invalid and the host_index updated immediately after eqe processing. After rework, xx_release routines are now DB write functions. Renamed the routines as such. Moved lpfc_sli4_eq_flush(), which does similar action, to same area. Replaced the 2 individual loops that walk an eq with a call to the common routine. Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax. Added per-cpu counters to detect interrupt rates and scale interrupt coalescing values. - Reworked CQ handling: Added common routine that walks cq, applying notify interval and max processing limits. Use queue_claimed to claim ownership of the queue while processing. Always rearm the queue whenever the common routine is called. Rework queue element processing, namely to eliminate hba_index vs host_index. Only one index is necessary. The queue entry can be marked invalid and the host_index updated immediately after cqe processing. After rework, xx_release routines are now DB write functions. Renamed the routines as such. Replaced the 3 individual loops that walk a cq with a call to the common routine. Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with queue reference. Add increment for mbox completion to handler. - Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow dynamic changing of the CQ max_proc_limit value being used. Although this leaves an EQ as an immediate interrupt, that interrupt will only occur if a CQ bound to it is in an armed state and has cqe's to process. By staying in the cq processing routine longer, high loads will avoid generating more interrupts as they will only rearm as the processing thread exits. The immediately interrupt is also beneficial to idle or lower-processing CQ's as they get serviced immediately without being penalized by sharing an EQ with a more loaded CQ. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:49 -05:00
James Smart	cb733e3587	scsi: lpfc: cleanup: convert eq_delay to usdelay Review of the eq coalescing logic showed the code was a bit fragmented. Sometimes it would save/set via an interrupt max value, while in others it would do so via a usdelay. There were also two places changing eq delay, one place that issued mailbox commands, and another that changed via register writes if supported. Clean this up by: - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so that it is always told of a us delay to impose. The routine then chooses the best way to set that - via register or via mbx. - Rather than two value types stored in eq->q_mode (usdelay if change via register, imax if change via mbox) - q_mode always contains usdelay. Before any value change, old vs new value is compared and only if different is a change done. - Revised the dmult calculation. dmult is not set based on overall imax divided by hardware queues - instead imax applies to a single cpu and the value will be replicated to all cpus. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:49 -05:00
James Smart	6a828b0f61	scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues So far MSIX vector allocation assumed it would be 1:1 with hardware queues. However, there are several reasons why fewer MSIX vectors may be allocated than hardware queues such as the platform being out of vectors or adapter limits being less than cpu count. This patch reworks the MSIX/EQ relationships with the per-cpu hardware queues so they can function independently. MSIX vectors will be equitably split been cpu sockets/cores and then the per-cpu hardware queues will be mapped to the vectors most efficient for them. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:49 -05:00
James Smart	b3295c2a75	scsi: lpfc: Fix setting affinity hints to correlate with hardware queues The desired affinity for the hardware queue behavior is for hdwq 0 to be affinitized with cpu 0, hdwq 1 to cpu 1, and so on. The implementation so far does not do this if the number of cpus is greater than the number of hardware queues (e.g. hardware queue allocation was administratively reduced or hardware queue resources could not scale to the cpu count). Correct the queue affinitization logic when queue count is less than cpu count. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	45aa312e21	scsi: lpfc: Allow override of hardware queue selection policies Default behavior is to use the information from the upper IO stacks to select the hardware queue to use for IO submission. Which typically has good cpu affinity. However, the driver, when used on some variants of the upstream kernel, has found queuing information to be suboptimal for FCP or IO completion locked on particular cpus. For command submission situations, the lpfc_fcp_io_sched module parameter can be set to specify a hardware queue selection policy that overrides the os stack information. For IO completion situations, rather than queing cq processing based on the cpu servicing the interrupting event, schedule the cq processing on the cpu associated with the hardware queue's cq. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	c490850a09	scsi: lpfc: Adapt partitioned XRI lists to efficient sharing The XRI get/put lists were partitioned per hardware queue. However, the adapter rarely had sufficient resources to give a large number of resources per queue. As such, it became common for a cpu to encounter a lack of XRI resource and request the upper io stack to retry after returning a BUSY condition. This occurred even though other cpus were idle and not using their resources. Create as efficient a scheme as possible to move resources to the cpus that need them. Each cpu maintains a small private pool which it allocates from for io. There is a watermark that the cpu attempts to keep in the private pool. The private pool, when empty, pulls from a global pool from the cpu. When the cpu's global pool is empty it will pull from other cpu's global pool. As there many cpu global pools (1 per cpu or hardware queue count) and as each cpu selects what cpu to pull from at different rates and at different times, it creates a radomizing effect that minimizes the number of cpu's that will contend with each other when the steal XRI's from another cpu's global pool. On io completion, a cpu will push the XRI back on to its private pool. A watermark level is maintained for the private pool such that when it is exceeded it will move XRI's to the CPU global pool so that other cpu's may allocate them. On NVME, as heartbeat commands are critical to get placed on the wire, a single expedite pool is maintained. When a heartbeat is to be sent, it will allocate an XRI from the expedite pool rather than the normal cpu private/global pools. On any io completion, if a reduction in the expedite pools is seen, it will be replenished before the XRI is placed on the cpu private pool. Statistics are added to aid understanding the XRI levels on each cpu and their behaviors. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	ace44e48b1	scsi: lpfc: Synchronize hardware queues with SCSI MQ interface Now that the lower half has much better per-cpu parallelization using the hardware queues, the SCSI MQ support needs to be tied into it. The involves the following mods: - Use the hardware queue info from the midlayer to help select the hardware queue to utilize. This required change to the get_scsi-buf_xxx routines. - Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed. - Includes fix for SLI-3 that does not have multi queue parallelization. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	1fbf974250	scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting. SLI4 nvme functions are passing the SLI3 ring number when posting wqe to hardware. This should be indicating the hardware queue to use, not the ring number. Replace ring number with the hardware queue that should be used. Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb routine that properly adapts. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:09 -05:00
James Smart	4c47efc140	scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures Many io statistics were being sampled and saved using adapter-based data structures. This was creating a lot of contention and cache thrashing in the I/O path. Move the statistics to the hardware queue data structures. Given the per-queue data structures, use of atomic types is lessened. Add new sysfs and debugfs stat routines to collate the per hardware queue values and report at an adapter level. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:08 -05:00
James Smart	63df6d637e	scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues Similar to the io execution path that reports cpu context information, the debugfs routines for cpu information needs to be aligned with new hardware queue implementation. Convert debugfs cnd nvme cpucheck statistics to report information per Hardware Queue. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:28:11 -05:00
James Smart	18c27a6216	scsi: lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event Both NVME and SCSI aborts are now processed off the CQ workqueue and do not generate events for the slowpath any more. Remove the unused event code. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:24:22 -05:00
James Smart	5e5b511d8b	scsi: lpfc: Partition XRI buffer list across Hardware Queues Once the IO buff allocations were made shared, there was a single XRI buffer list shared by all hardware queues. A single list isn't great for performance when shared across the per-cpu hardware queues. Create a separate XRI IO buffer get/put list for each Hardware Queue. As SGLs and associated IO buffers get allocated/posted to the firmware; round robin their assignment across all available hardware Queues so that there is an equitable assignment. Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic for XRI allocation. Add a debugfs interface to display hardware queue statistics Added new empty_io_bufs counter to track if a cpu runs out of XRIs. Replace common_ variables/names with io_ to make meanings clearer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:24:22 -05:00
James Smart	cdb42becdd	scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu Currently, both nvme and fcp each have their own concept of an io_channel, which is a combination wq/cq and associated msix. Different cpus would share an io_channel. The driver is now moving to per-cpu wq/cq pairs and msix vectors. The driver will still use separate wq/cq pairs per protocol on each cpu, but the protocols will share the msix vector. Given the elimination of the nvme and fcp io channels, the module parameters will be removed. A new parameter, lpfc_hdw_queue is added which allows the wq/cq pair allocation per cpu to be overridden and allocated to lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will be based on the number of cpus. If non-zero, the parameter specifies the number of queues to allocate. At this time, the maximum non-zero value is 64. To manage this new paradigm, a new hardware queue structure is created to track queue activity and relationships. As MSIX vector allocation must be known before setting up the relationships, msix allocation now occurs before queue datastructures are allocated. If the number of vectors allocated is less than the desired hardware queues, the hardware queue counts will be reduced to the number of vectors Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
James Smart	7370d10ac9	scsi: lpfc: Remove extra vector and SLI4 queue for Expresslane There is a extra queue and msix vector for expresslane. Now that the driver will be doing queues per cpu, this oddball queue is no longer needed. Expresslane will utilize the normal per-cpu queues. Updated debugfs sli4 queue output to go along with the change Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
James Smart	0794d601d1	scsi: lpfc: Implement common IO buffers between NVME and SCSI Currently, both NVME and SCSI get their IO buffers from separate pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split between protocols. Eliminate the independent pools and use a single pool. Each buffer structure now has a common section and a protocol section. Per protocol routines for SGL initialization are removed and replaced by common routines. Initialization of the buffers is only done on the common area. All other fields, which are protocol specific, are initialized when the buffer is allocated for use in the per-protocol allocation routine. In the past, the SCSI side allocated IO buffers as part of slave_alloc calls until the maximum XRIs for SCSI was reached. As all XRIs are now common and may be used for either protocol, allocation for everything is done as part of adapter initialization and the scsi side has no action in slave alloc. As XRI's are no longer split, the lpfc_xri_split module parameter is removed. Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put routines. All SLI4 adapters utilize the new IO buffer scheme Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
James Smart	e960f5ab40	scsi: lpfc: cleanup: Remove excess check on NVME io submit code path lpfc_nvme_prep_io_cmd() checks for null pnode, but caller lpfc_nvme_fcp_io_submit() has already ensured it's non-null. Remove the pnode null check. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
James Smart	0b05e9fe1f	scsi: lpfc: cleanup: remove nrport from nvme command structure An hba-wide lock is taken in the nvme io completion routine. The lock covers null'ing of the nrport pointer in the cmd structure. The nrport member isn't necessary. After extracting the pointer from the command, the pointer was dereferenced to get the fc discovery node pointer. But the fc discovery node pointer is alrady in the command structure so the dereferrence was unnecessary. Eliminated the nrport structure member and its use, which also eliminates the port-wide lock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:22:42 -05:00
Greg Kroah-Hartman	50e931679a	scsi: lpfc: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-01-29 00:40:54 -05:00
Linus Torvalds	7930851ef1	SCSI fixes on 20190125 Six fixes, all of which appear to have user visible consequences. The DMA one is a regression fix from the merge window and of the others, four are driver specific and one specific to the target code. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXEt6GSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbxoAP0R0z0J fdCRn9AMRbZ3xJzD+E6V2gw38Gdgm8rR5mT1gQEAv1VxZwOAb9Qxbcr+l4uybqZr Q55H31I/IvBgNm84jH0= =vkW7 -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six fixes, all of which appear to have user visible consequences. The DMA one is a regression fix from the merge window and of the others, four are driver specific and one specific to the target code" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: Use explicit access size in ufshcd_dump_regs scsi: tcmu: fix use after free scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state() scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport scsi: communicate max segment size to the DMA mapping code	2019-01-26 15:03:43 -08:00
Ewan D. Milne	c41f59884b	scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport We cannot wait on a completion object in the lpfc_nvme_targetport structure in the _destroy_targetport() code path because the NVMe/fc transport will free that structure immediately after the .targetport_delete() callback. This results in a use-after-free, and a hang if slub_debug=FZPU is enabled. Fix this by putting the completion on the stack. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-01-22 20:40:59 -05:00
Ewan D. Milne	7961cba6f7	scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport We cannot wait on a completion object in the lpfc_nvme_lport structure in the _destroy_localport() code path because the NVMe/fc transport will free that structure immediately after the .localport_delete() callback. This results in a use-after-free, and a hang if slub_debug=FZPU is enabled. Fix this by putting the completion on the stack. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-01-22 20:40:59 -05:00
Linus Torvalds	4d5f6e0201	SCSI fixes on 20190118 A set of 17 fixes. Most of these are minor or trivial. The one fix that may be serious is the isci one: the bug can cause hba parameters to be set from uninitialized memory. I don't think it's exploitable, but you never know. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXEKL0SYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVZpAQCwuPTk fqOt4v4hJ0oUHtEBsQK3VMXSdUvWdb5Lbn3WeQD/RFYTyNxcIF7ADSWw71b+IigT ejUrMzI8ig+nZ1jbFZ4= =BdS/ -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "A set of 17 fixes. Most of these are minor or trivial. The one fix that may be serious is the isci one: the bug can cause hba parameters to be set from uninitialized memory. I don't think it's exploitable, but you never know" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: cxgb4i: add wait_for_completion() scsi: qla1280: set 64bit coherent mask scsi: ufs: Fix geometry descriptor size scsi: megaraid_sas: Retry reads of outbound_intr_status reg scsi: qedi: Add ep_state for login completion on un-reachable targets scsi: ufs: Fix system suspend status scsi: qla2xxx: Use correct number of vectors for online CPUs scsi: hisi_sas: Set protection parameters prior to adding SCSI host scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes scsi: isci: initialize shost fully before calling scsi_add_host() scsi: lpfc: lpfc_sli: Mark expected switch fall-throughs scsi: smartpqi_init: fix boolean expression in pqi_device_remove_start scsi: core: Synchronize request queue PM status only on successful resume scsi: pm80xx: reduce indentation scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param scsi: megaraid_sas: correct an info message scsi: target/iscsi: fix error msg typo when create lio_qr_cache failed scsi: sd: Fix cache_type_store()	2019-01-20 09:15:04 +12:00
Gustavo A. R. Silva	5bd5f66cf1	scsi: lpfc: lpfc_sli: Mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that, in this particular case, I replaced "Drop thru" and "Fall Thru" with "fall through" annotations, which is what GCC is expecting to find. Also, in some cases a dash is added as a token in order to separate the "fall through" annotation from the rest of the comment on the same line, which is what GCC is expecting to find. Addresses-Coverity-ID: 114979 ("Missing break in switch") Addresses-Coverity-ID: 114980 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-01-08 21:57:26 -05:00
Luis Chamberlain	750afb08ca	cross-tree: phase out dma_zalloc_coherent() We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-01-08 07:58:37 -05:00
Linus Torvalds	938edb8a31	SCSI misc on 20181224 This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj u1j3H7tha9j1it+4pUw= =GDa+ -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits) scsi: isci: request: mark expected switch fall-through scsi: isci: remote_node_context: mark expected switch fall-throughs scsi: isci: remote_device: Mark expected switch fall-throughs scsi: isci: phy: Mark expected switch fall-through scsi: iscsi: Capture iscsi debug messages using tracepoints scsi: myrb: Mark expected switch fall-throughs scsi: megaraid: fix out-of-bound array accesses scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through scsi: fcoe: remove set but not used variable 'port' scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() scsi: smartpqi: fix build warnings scsi: smartpqi: update driver version scsi: smartpqi: add ofa support scsi: smartpqi: increase fw status register read timeout scsi: smartpqi: bump driver version scsi: smartpqi: add smp_utils support scsi: smartpqi: correct lun reset issues scsi: smartpqi: correct volume status scsi: smartpqi: do not offline disks for transient did no connect conditions scsi: smartpqi: allow for larger raid maps ...	2018-12-28 14:48:06 -08:00
James Smart	9e1f03e4d3	scsi: lpfc: Update lpfc version to 12.0.0.10 Update lpfc version to 12.0.0.10 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:08 -05:00
James Smart	5021267af1	scsi: lpfc: Adding ability to reset chip via pci bus reset This patch adds a "pci_bus_reset" option to the board_mode sysfs attribute. This option uses the pci_reset_bus() api to reset the PCIe link the adapter is on, which will reset the chip/adapter. Prior to issuing this option, all functions on the same chip must be placed in the offline state by the admin. After the reset, all of the instances may be brought online again. The primary purpose of this functionality is to support cases where firmware update required a chip reset but the admin did not want to reboot the machine in order to instantiate the firmware update. Sanity checks take place prior to the reset to ensure the adapter is the sole entity on the PCIe bus and that all functions are in the offline state. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:08 -05:00
James Smart	72ca6b2220	scsi: lpfc: Add log messages to aid in debugging fc4type discovery issues Current messages report generic actions (like send GID_FT), but misses reporting for what protocol type the action is taken. Revise the messages to reflect the FC4 protocol type being worked on. [mkp: typo] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:07 -05:00
James Smart	00292e0306	scsi: lpfc: Fix discovery failure when PLOGI is defered When a target's link dropped, an RSCN was received to communicate the change. The driver detected the loss of the target and issued and UNREG_RPI mailbox command. While that was being processed, another RSCN was received to communicate the port coming back. The driver deferred the PLOGI to the port until the mailbox command finishes. When the mailbox command completed it saw the pending port and called the routines to issue the PLOGI. However, it forgot to clear the UNREG_INP state flag, so the PLOGI xmt routine nooped the PLOGI request assuming it needed to wait for the mailbox command. At this point, login would never be re-attempted. Clear UNREG_INP before issuing the deferred PLOGI. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:07 -05:00
James Smart	529b3ddcff	scsi: lpfc: update fault value on successful trunk events. Currently, when a trunk link goes down due to some fault, the driver snapshots the fault code. If the link then comes back up, meaning there is no fault, the driver is not clearing the fault code so the sysfs link_state entry reports old/stale data. Revise the logic so that on successful link up the fault code is cleared. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:07 -05:00
James Smart	e817e5d703	scsi: lpfc: Correct MDS loopback diagnostics support The existing MDS loopback diagnostics support processing received frames in the slowpath work thread. It caps the number of frames it will process at 64, before waiting for another event to indicate additional frame reception. The net-net is this results in very slow frame processing during loopback tests and sometimes orphans an io, causing the loopback test to report failure by the switch. Move MDS loopback frame processing out of the slow path worker thread and into the normal RQ processing routines. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:07 -05:00
James Smart	2977a09512	scsi: lpfc: Fix link state reporting for trunking when adapter is offline If the adapter is taken offline, the trunk link port attributes continue to report trunk links as up even though all links are down as the adapter is offline. Clear the trunk links state as part of taking the adapter offline. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-19 22:13:07 -05:00
Ewan D. Milne	4e87eb2f46	scsi: lpfc: do not set queue->page_count to 0 if pc_sli4_params.wqpcnt is invalid Certain older adapters such as the OneConnect OCe10100 may not have a valid wqpcnt value. In this case, do not set queue->page_count to 0 in lpfc_sli4_queue_alloc() as this will prevent the driver from initializing. Fixes: `895427bd01` ("scsi: lpfc: NVME Initiator: Base modifications") Cc: stable@vger.kernel.org # 4.11+ Signed-off-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-18 23:25:35 -05:00
Christoph Hellwig	2a3d4eb8e2	scsi: flip the default on use_clustering Most SCSI drivers want to enable "clustering", that is merging of segments so that they might span more than a single page. Remove the ENABLE_CLUSTERING define, and require drivers to explicitly set DISABLE_CLUSTERING to disable this feature. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-18 23:13:12 -05:00
James Smart	719162bd5b	scsi: lpfc: Enable Management features for IF_TYPE=6 Addition of support for if_type=6 missed several checks for interface type, resulting in the failure of several key management features such as firmware dump and loopback testing. Correct the checks on the if_type so that both SLI4 IF_TYPE's 2 and 6 are supported. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-12 20:33:08 -05:00
Martin K. Petersen	2d1036aea4	Revert "scsi: lpfc: ls_rjt erroneus FLOGIs" This reverts commit `287aba2592`. We killed the bad firmware and this mod is no longer necessary. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-12 20:26:56 -05:00
Jens Axboe	96f774106e	Linux 4.20-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlwNpb0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGwGwH/00UHnXfxww3ixxz zwTVDzptA6SPm6s84yJOWatM5fXhPiAltZaHSYF9lzRzNU71NCq7Frhq3fQUIXKM OxqDn9nfSTWcjWTk2q5n2keyRV/KIn67YX7UgqFc1bO/mqtVjEgNWaMyblhI+e9E giu1ZXayHr43jK1cDOmGExZubXUq7Vsc9TOlrd+d2SwIqeEP7TCMrPhnHDwCNvX2 UU5dtANpVzGtHaBcr37wJj+L8kODCc0f+PQ3g2ar5jTHst5SLlHp2u0AMRnUmgdi VkGx+mu/uk8mtwUqMIMqhplklVoqK6LTeLqsY5Xt32SKruw9UqyJGdphLjW2QP/g MkmA1lI= =7kaD -----END PGP SIGNATURE----- Merge tag 'v4.20-rc6' into for-4.21/block Pull in v4.20-rc6 to resolve the conflict in NVMe, but also to get the two corruption fixes. We're going to be overhauling the direct dispatch path, and we need to do that on top of the changes we made for that in mainline. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-12-09 17:45:40 -07:00
James Smart	de55b786b8	scsi: lpfc: update driver version to 12.0.0.9 Update the driver version to 12.0.0.9 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:33 -05:00
James Smart	7c4042a4d0	scsi: lpfc: Fix dif and first burst use in write commands When dif and first burst is used in a write command wqe, the driver was not properly setting fields in the io command request. This resulted in no dif bytes being sent and invalid xfer_rdy's, resulting in the io being aborted by the hardware. Correct the wqe initializaton when both dif and first burst are used. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:33 -05:00
James Smart	1165a5c220	scsi: lpfc: Fix driver release of fw-logging buffers On driver termination, after the driver stops fw logging by writing a register on the chip, the driver immediately unmaps and frees the logging buffer, without confirming in any way that the chip has received the write and terminated the logging. As termination on the chip is not immediate, the chip may issue a dma request to the now unmapped dma buffer, resulting in a iommu fault. Change the driver to receive a confirmation that logging ahs been terminated. As the driver always issues an SLI reset with the device as part of shutdown, and as part of that is receiving confirmation that the reset is complete - the driver was modified to perform the write to disable fw logging prior to the SLI reset and only free the fw log buffer after the SLI reset is complete. That guarantees use of the fw log buffer is fully terminated when it is unmapped. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:33 -05:00
James Smart	76558b2573	scsi: lpfc: Correct topology type reporting on G7 adapters Driver missed classifying the chip type for G7 when reporting supported topologies. This resulted in loop being shown as supported on FC links that are not supported per the standard. Add the chip classifications to the topology checks in the driver. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:33 -05:00
James Smart	1c36833d82	scsi: lpfc: Correct code setting non existent bits in sli4 ABORT WQE Driver is setting bits in word 10 of the SLI4 ABORT WQE (the wqid). The field was a carry over from a prior SLI revision. The field does not exist in SLI4, and the action may result in an overlap with future definition of the WQE. Remove the setting of WQID in the ABORT WQE. Also cleaned up WQE field settings - initialize to zero, don't bother to set fields to zero. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	0a9e9687ac	scsi: lpfc: Defer LS_ACC to FLOGI on point to point logins The current discovery state machine the driver treated FLOGI oddly. When point to point, an FLOGI is to be exchanged by the two ports, with the port with the most significant WWN then proceeding with PLOGI. The implementation in the driver was keyed to closely with "what have I sent", not with what has happened between the two endpoints. Thus, it blatantly would ACC an FLOGI, but reject PLOGI's until it had its FLOGI ACC'd. The problem is - the sending of FLOGI may be delayed for some reason, or the response to FLOGI held off by the other side. In the failing situation the other side sent an FLOGI, which was ACC'd, then sent PLOGIs which were then rjt'd until the retry count for the PLOGIs were exceeded and the port gave up. The FLOGI may have been very late in transmit, or the response held off until the PLOGIs failed. Given the other port had the higher WWN, no PLOGIs would occur and communication stopped. Correct the situation by changing the FLOGI handling. Defer any response to an FLOGI until the driver has sent its FLOGI as well. Then, upon either completion of the sent FLOGI, or upon sending an ACC to a received FLOGI (which may be received before or just after FLOGI was sent). the driver will act on who has the higher WWN. if the other port does, the driver will noop any handling of an FLOGI response (if outstanding) and wait for PLOGI. If the local port does, the driver will transition to sending PLOGI and will noop any action on responding to an FLOGI (if not yet received). Fortunately, to implement this, it only took another state flag and deferring any FLOGI response if the FLOGI has yet to be transmit. All subsequent actions were already in place. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	287aba2592	scsi: lpfc: ls_rjt erroneus FLOGIs In some link initialization sequences, the fw generates an erroneous FLOGI payload to the driver without an intervening link bounce. The driver, when it sees a 2nd FLOGI without an intervening link bounce, automatically performs a link bounce. In this, the link bounce causes the situate to repeat and in a nasty loop of link bounces. Resolve the issue by validating the FLOGI payload. The erroneous FLOGI will contain VVL signatures that are not normal. When the driver sees these, it will simply reject the flogi rather than bouncing the link. The reject is consumed within the firmware. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	92ea83a878	scsi: lpfc: rport port swap discovery issue. Two initiator ports were cable swapped and after swap both went down. The driver internally swaps the nlp nodes based on matching node wwn's but not the same nport id as before. After detecting a change in the nodes RPI, the driver sends an UNREG_RPI command and clears the NLP_RPI_REGISTERED flag, then swaps the node information with the other node. But the other node's NLP_RPI_REGISTERED flag is also cleared, but it is done so without an UNREG_RPI being sent, which causes the later REG_RPI for that other node to fail as the hardware believes its still registered. Additionally, if the node swap occurred while the two nodes had PLOGI's in flight, the fc4_types weren't properly getting swapped such that when the PLOGIs commpleted and PRLI's were then sent, the PRLI's acted on bad protocol types so the PRLI was for the wrong protocol. NVME devices saw SCSI FCP PRLIs and vice versa. Clean up the node swap so that the NLP_RPI_REGISTERED flag is handled properly. Fix the handling of the fc4_types when the nodes are swapped as well Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	8b47ae69e0	scsi: lpfc: Cap NPIV vports to 256 Depending on the chipset, the number of NPIV vports may vary and be in excess of what most switches support (256). To avoid confusion with the users, limit the reported NPIV vports to 256. Additionally correct the 16G adapter which is reporting a bogus NPIV vport number if the link is down. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	5a9eeff57f	scsi: lpfc: Fix kernel Oops due to null pring pointers Driver is hitting null pring pointers in lpfc_do_work(). Pointer assignment occurs based on SLI-revision. If recovering after an error, its possible the sli revision for the port was cleared, making the lpfc_phba_elsring() not return a ring pointer, thus the null pointer. Add SLI revision checking to lpfc_phba_elsring() and status checking to all callers. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	2c4c91415a	scsi: lpfc: Fix a duplicate 0711 log message number. Renumber one of the 0711 log messages so there isn't a duplication. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	dea16bdae2	scsi: lpfc: Fix discovery failures during port failovers with lots of vports The driver is getting hit with 100s of RSCNs during remote port address changes. Each of those RSCN's ends up generating UNREG_RPI and REG_PRI mailbox commands. The discovery engine within the driver doesn't wait for the mailbox command completions. Instead it sets state flags and moves forward. At some point, there's a massive backlog of mailbox commands which take time for the adapter to process. Additionally, it appears there were duplicate events from the switch so the driver generated duplicate mailbox commands for the same remote port. During this window, failures on PLOGI and PRLI ELS's are see as the adapter is rejecting them as they are for remote ports that still have pending mailbox commands. Streamline the discovery engine so that PLOGI log checks for outstanding UNREG_RPIs and defer the processing until the commands complete. This better synchronizes the ELS transmission vs the RPI registrations. Filter out multiple UNREG_RPIs being queued up for the same remote port. Beef up log messages in this area. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	3e1f071892	scsi: lpfc: refactor mailbox structure context fields The driver data structure for managing a mailbox command contained two context fields. Unfortunately, the context were considered "generic" to be used at the whim of the command code. Of course, one section of code used fields this way, while another did it that way, and eventually there were mixups. Refactored the structure so that the generic contexts become a node context and a buffer context and all code standardizes on their use. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	0f31e9593a	scsi: lpfc: update manufacturer attribute to reflect Broadcom Update manufacturer attribute to reflect Broadcom Inc, not Emulex Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:31 -05:00
James Smart	cb34990b90	scsi: lpfc: Fix panic when FW-log buffsize is not initialized While trying to get adapter fw-log for a function whose buffsize was set to 0, kernel panic occurred. When buffsize is 0, the kernel buffer for the log won't be allocated. When fw log usage was enabled, it failed to check the buffer size, and log usage was started. Eventually the driver referenced the unallocated log buffer. Added checks of the buffer size before allowing fw logging to be enabled and added check for valid buffer if enabling fw log. Performed a couple other minor cleanups while fixing this: - clarified log messages - re-evaluated log message severity - treat any error as an error, not only a couple codes Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:31 -05:00
Martin Wilck	dfb7513374	scsi: lpfc: fix block guard enablement on SLI3 adapters Since `f44ac12f1d`, BG enablement is tracked with the LPFC_SLI3_BG_ENABLED bit, which is set in lpfc_get_cfgparam before lpfc_sli_config_sli_port() is called. The bit shouldn't be cleared before checking the feature. Based on problem analysis by David Bond. Fixes: `f44ac12f1d` "scsi: lpfc: Memory allocation error during driver start-up on power8" Tested-by: David Bond <dbond@suse.com> Signed-off-by: Martin Wilck <mwilck@suse.com> Cc: stable@vger.kernel.org # 4.17.x Cc: stable@vger.kernel.org # 4.18.x Cc: stable@vger.kernel.org # 4.19.x Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-28 12:14:25 -05:00
Sabyasachi Gupta	359d0ac1e8	scsi: lpfc: Use dma_zalloc_coherent Replaced dma_alloc_coherent + memset with dma_zalloc_coherent. Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-21 22:31:22 -05:00
Jens Axboe	a78b03bc73	Linux 4.20-rc3 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlvx2sAeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGycgIAIuxobwt0RRKa0zO ROS+34JGoC2yU2P9VdEGWdtxS6ANMVQgKPBhWL6s+xR89Kd+V4xSdJLD1pNTxxqP 0DCva0np1/Q4juH+JbU50v/lykoLgteZ0P0LBRGf1y8p3WiLPv45IbnNsMDNYhB2 7a8rOmZYakRY9CPznRDw3X8cJt3sddKgFJHIOGz1OQJVWtCD0KPGcJmQNsbDSagY Zx6Z5BKSIdjRqaAdN5gDa1Pft3WQo7TpaQGl80lSsgr5LcjmscXA3sClOCy+25Mo FZLx0PcwP+Efq8RTGzNK51WSOMa6d37hvjDqUAdQBOR0KbyjRyXQwyQVw/MGbPJs 7J3Pzm0= =56Mt -----END PGP SIGNATURE----- Merge tag 'v4.20-rc3' into for-4.21/block Merge in -rc3 to resolve a few conflicts, but also to get a few important fixes that have gone into mainline since the block 4.21 branch was forked off (most notably the SCSI queue issue, which is both a conflict AND needed fix). Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-11-18 15:46:03 -07:00
Christoph Hellwig	f30e1bfd61	scsi: lpfc: use dma_set_mask_and_coherent The driver currently uses pci_set_dma_mask despite otherwise using the generic DMA API. Switch it over to the better generic DMA API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-15 14:27:08 -05:00
Jens Axboe	f664a3cc17	scsi: kill off the legacy IO path This removes the legacy (non-mq) IO path for SCSI. Cc: linux-scsi@vger.kernel.org Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-11-07 13:42:32 -07:00
James Smart	ed5b3994c6	scsi: lpfc: update driver version to 12.0.0.8 Update the driver version to 12.0.0.8 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	1dc5ec2452	scsi: lpfc: add Trunking support Add trunking support to the driver. Trunking is found on more recent asics. In general, trunking appears as a single "port" to the driver and overall behavior doesn't differ. Link speed is reported as an aggregate value, while link speed control is done on a per-physical link basis with all links in the trunk symmetrical. Some commands returning port information are updated to additionally provide trunking information. And new ACQEs are generated to report physical link events relative to the trunk. This patch contains the following modifications: - Added link speed settings of 128GB and 256GB. - Added handling of trunk-related ACQEs, mainly logging and trapping of physical link statuses. - Added additional bsg interface to query trunk state by applications. - Augment link_state sysfs attribtute to display trunk link status Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	7ea92eb458	scsi: lpfc: Implement GID_PT on Nameserver query to support faster failover The switches seem to respond faster to GID_PT vs GID_FT NameServer queries. Add support for GID_PT to be used over GID_FT to enable faster storage failover detection. Includes addition of new module parameter to select between GID_PT and GID_FT (GID_FT is default). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	d83ca3ea83	scsi: lpfc: Correct loss of fc4 type on remote port address change An address change for a remote port cause PRLI for the wrong protocol to be sent. The node copy done in the discovery code skipped copying the fc4 protocols supported as well. Fix the copy logic for the address change. Beefed up log messages in this area as well. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	d496b9a724	scsi: lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point Testing a point-to-point topology and a case of re-FLOGI without intervening link bouncing, showed an odd interaction with firmware and a resulting scenario where the driver no longer probed after accepting the new FLOGI. Work around the firmware issue by issuing a link bounce if a FLOGI is received after the link is already up and FLOGI's accepted. While debugging the issue, realized that some debug traces should be clarified to help in the future. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	b114d9009d	scsi: lpfc: Correct LCB RJT handling When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	036cad1f1a	scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces On FCoE adapters, when running link bounce test in a loop, initiator failed to login with switch switch and required driver reload to recover. Switch reached a point where all subsequent FLOGIs would be LS_RJT'd. Further testing showed the condition to be related to not performing FCF discovery between FLOGI's. Fix by monitoring FLOGI failures and once a repeated error is seen repeat FCF discovery. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	191e2f7493	scsi: lpfc: Correct errors accessing fw log This patch corrects two issues: - An oops would occur if reading based on a non-zero offset. Offset calculation was incorrect. - Updates to ras config (logging level) were ignored if change was made while fw logging was enabled. Revise to dynamically update. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
James Smart	5cca2ab1b3	scsi: lpfc: Reset link or adapter instead of doing infinite nameserver PLOGI retry Currently, PLOGI failures are infinitely delayed/retried. There have been some fabric situations where the PLOGI's were to the nameserver and it stopped responding. The retries would never clear up. A better resolution in this situation is to retry a couple of times, then drop the link and reinit. This brings back connectivity to the nameserver. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
James Smart	30e196cace	scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to re-establish the login. An nlp_type check in the LOGO completion handler failed to restart discovery for NVME targets. Revised the nlp_type check for NVME as well as SCSI. While reviewing the LOGO handling a few other issues were seen and were addressed: - Better lock synchronization around ndlp data types - When the ABTS times out, unregister the RPI before sending the LOGO so that all local exchange contexts are cleared and nothing received while awaiting LOGO/PLOGI handling will be accepted. - LOGO handling optimized to: Wait only R_A_TOV for a response. It doesn't need to be retried on timeout. If there wasn't a response, a PLOGI will be sent, thus an implicit logout applies as well when the other port sees it. If there is a response, any kind of response is considered "good" and the XRI quarantined for a exchange qualifier window. - PLOGI is issued as soon a LOGO state is resolved. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
James Smart	3952e91f11	scsi: lpfc: Fix lpfc_sli4_read_config return value check An error is an error - but not to the existing return value check. Revise check to handle any failure, not just EIO. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
James Smart	cd71348ad7	scsi: lpfc: Correct speeds on SFP swap Supported speeds is not updated when SFP is removed or replaced Supported speed is obtained from lmt field in READ_CONFIG mailbox response. Driver updates supported speeds only once from PCI probe path. After that it is never updated. So, supported speeds remains the same till reboot or driver reload. When SFP is removed or inserted, driver gets SLI-Port Event ACQE. If SFP is removed, lmt wil have value 0. If a different SFP is inserted, lmt will have value according to its supported speeds. So, afterr SLI-Port Event ACQE handling path, send READ_CONFIG mailbox and update supported speeds. If READ_CONFIG fails, set supported speeds to unknown and log. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
Arnd Bergmann	f8d2943245	scsi: lpfc: fix remoteport access The addition of a spinlock in lpfc_debugfs_nodelist_data() introduced a bug that lets us not skip NULL pointers correctly, as noticed by gcc-8: drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_debugfs_nodelist_data.constprop': drivers/scsi/lpfc/lpfc_debugfs.c:728:13: error: 'nrport' may be used uninitialized in this function [-Werror=maybe-uninitialized] if (nrport->port_role & FC_PORT_ROLE_NVME_INITIATOR) This changes the logic back to what it was, while keeping the added spinlock. Fixes: `9e21017826` ("scsi: lpfc: Synchronize access to remoteport via rport") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-05 22:35:01 -05:00
Linus Torvalds	d49f8a52b1	SCSI misc on 20181024 This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380, qla2xxx, lpfc, libsas, hisi_sas. In addition there's a set of mostly small updates to the target subsystem a set of conversions to the generic DMA API, which do have some potential for issues in the older drivers but we'll handle those as case by case fixes. A new myrs for the DAC960/mylex raid controllers to replace the block based DAC960 which is also being removed by Jens in this merge window. Plus the usual slew of trivial changes. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW9BQJSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishU3MAP41T8yW UJQDCprj65pCR+9mOUWzgMvgAW/15ouK89x/7AD/XAEQZqoAgpFUbgnoZWGddZkS LykIzSiLHP4qeDOh1TQ= =2JMU -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380, qla2xxx, lpfc, libsas, hisi_sas. In addition there's a set of mostly small updates to the target subsystem a set of conversions to the generic DMA API, which do have some potential for issues in the older drivers but we'll handle those as case by case fixes. A new myrs driver for the DAC960/mylex raid controllers to replace the block based DAC960 which is also being removed by Jens in this merge window. Plus the usual slew of trivial changes" [ "myrs" stands for "MYlex Raid Scsi". Obviously. Silly of me to even wonder. There's also a "myrb" driver, where the 'b' stands for 'block'. Truly, somebody has got mad naming skillz. - Linus ] * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (237 commits) scsi: myrs: Fix the processor absent message in processor_show() scsi: myrs: Fix a logical vs bitwise bug scsi: hisi_sas: Fix NULL pointer dereference scsi: myrs: fix build failure on 32 bit scsi: fnic: replace gross legacy tag hack with blk-mq hack scsi: mesh: switch to generic DMA API scsi: ips: switch to generic DMA API scsi: smartpqi: fully convert to the generic DMA API scsi: vmw_pscsi: switch to generic DMA API scsi: snic: switch to generic DMA API scsi: qla4xxx: fully convert to the generic DMA API scsi: qla2xxx: fully convert to the generic DMA API scsi: qla1280: switch to generic DMA API scsi: qedi: fully convert to the generic DMA API scsi: qedf: fully convert to the generic DMA API scsi: pm8001: switch to generic DMA API scsi: nsp32: switch to generic DMA API scsi: mvsas: fully convert to the generic DMA API scsi: mvumi: switch to generic DMA API scsi: mpt3sas: switch to generic DMA API ...	2018-10-25 07:40:30 -07:00
Linus Torvalds	bd6bf7c104	pci-v4.20-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlvPV7IUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vyaUg//WnCaRIu2oKOp8c/bplZJDW5eT10d oYAN9qeyptU9RYrg4KBNbZL9UKGFTk3AoN5AUjrk8njxc/dY2ra/79esOvZyyYQy qLXBvrXKg3yZnlNlnyBneGSnUVwv/kl2hZS+kmYby2YOa8AH/mhU0FIFvsnfRK2I XvwABFm2ZYvXCqh3e5HXaHhOsR88NQ9In0AXVC7zHGqv1r/bMVn2YzPZHL/zzMrF mS79tdBTH+shSvchH9zvfgIs+UEKvvjEJsG2liwMkcQaV41i5dZjSKTdJ3EaD/Y2 BreLxXRnRYGUkBqfcon16Yx+P6VCefDRLa+RhwYO3dxFF2N4ZpblbkIdBATwKLjL npiGc6R8yFjTmZU0/7olMyMCm7igIBmDvWPcsKEE8R4PezwoQv6YKHBMwEaflIbl Rv4IUqjJzmQPaA0KkRoAVgAKHxldaNqno/6G1FR2gwz+fr68p5WSYFlQ3axhvTjc bBMJpB/fbp9WmpGJieTt6iMOI6V1pnCVjibM5ZON59WCFfytHGGpbYW05gtZEod4 d/3yRuU53JRSj3jQAQuF1B6qYhyxvv5YEtAQqIFeHaPZ67nL6agw09hE+TlXjWbE rTQRShflQ+ydnzIfKicFgy6/53D5hq7iH2l7HwJVXbXRQ104T5DB/XHUUTr+UWQn /Nkhov32/n6GjxQ= =58I4 -----END PGP SIGNATURE----- Merge tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - Fix ASPM link_state teardown on removal (Lukas Wunner) - Fix misleading _OSC ASPM message (Sinan Kaya) - Make _OSC optional for PCI (Sinan Kaya) - Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set (Patrick Talbert) - Remove x86 and arm64 node-local allocation for host bridge structures (Punit Agrawal) - Pay attention to device-specific _PXM node values (Jonathan Cameron) - Support new Immediate Readiness bit (Felipe Balbi) - Differentiate between pciehp surprise and safe removal (Lukas Wunner) - Remove unnecessary pciehp includes (Lukas Wunner) - Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner) - Tolerate PCIe Slot Presence Detect being hardwired to zero to workaround broken hardware, e.g., the Wilocity switch/wireless device (Lukas Wunner) - Unify pciehp controller & slot structs (Lukas Wunner) - Constify hotplug_slot_ops (Lukas Wunner) - Drop hotplug_slot_info (Lukas Wunner) - Embed hotplug_slot struct into users instead of allocating it separately (Lukas Wunner) - Initialize PCIe port service drivers directly instead of relying on initcall ordering (Keith Busch) - Restore PCI config state after a slot reset (Keith Busch) - Save/restore DPC config state along with other PCI config state (Keith Busch) - Reference count devices during AER handling to avoid race issue with concurrent hot removal (Keith Busch) - If an Upstream Port reports ERR_FATAL, don't try to read the Port's config space because it is probably unreachable (Keith Busch) - During error handling, use slot-specific reset instead of secondary bus reset to avoid link up/down issues on hotplug ports (Keith Busch) - Restore previous AER/DPC handling that does not remove and re-enumerate devices on ERR_FATAL (Keith Busch) - Notify all drivers that may be affected by error recovery resets (Keith Busch) - Always generate error recovery uevents, even if a driver doesn't have error callbacks (Keith Busch) - Make PCIe link active reporting detection generic (Keith Busch) - Support D3cold in PCIe hierarchies during system sleep and runtime, including hotplug and Thunderbolt ports (Mika Westerberg) - Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots are empty or occupied (Jon Derrick) - Remove duplicated include from pci/pcie/err.c and unused variable from cpqphp (YueHaibing) - Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza Pawandeep) - Uninline PCI bus accessors for better ftracing (Keith Busch) - Remove unused AER Root Port .error_resume method (Keith Busch) - Use kfifo in AER instead of a local version (Keith Busch) - Use threaded IRQ in AER bottom half (Keith Busch) - Use managed resources in AER core (Keith Busch) - Reuse pcie_port_find_device() for AER injection (Keith Busch) - Abstract AER interrupt handling to disconnect error injection (Keith Busch) - Refactor AER injection callbacks to simplify future improvments (Keith Busch) - Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski) - Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko) - Add switch fall-through annotations (Gustavo A. R. Silva) - Remove unused Switchtec quirk variable (Joshua Abraham) - Fix pci.c kernel-doc warning (Randy Dunlap) - Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig) - Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng) - Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid useless dmesg errors (Logan Gunthorpe) - Update Switchtec NTB documentation (Wesley Yung) - Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz) - Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang) - Add PCI support for peer-to-peer DMA (Logan Gunthorpe) - Add sysfs group for PCI peer-to-peer memory statistics (Logan Gunthorpe) - Add PCI peer-to-peer DMA scatterlist mapping interface (Logan Gunthorpe) - Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan Gunthorpe) - Add PCI peer-to-peer DMA driver writer's documentation (Logan Gunthorpe) - Add block layer flag to indicate driver support for PCI peer-to-peer DMA (Logan Gunthorpe) - Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P memory (Logan Gunthorpe) - Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan Gunthorpe) - Add nvme-pci support for PCI peer-to-peer memory in requests (Logan Gunthorpe) - Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise, Christoph Hellwig, Logan Gunthorpe) - Cache VF config space size to optimize enumeration of many VFs (KarimAllah Ahmed) - Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas) - Fix VMD AERSID quirk Device ID matching (Jon Derrick) - Fix Cadence PHY handling during probe (Alan Douglas) - Signal Cadence Endpoint interrupts via AXI region 0 instead of last region (Alan Douglas) - Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan Douglas) - Remove redundant controller tests for "device_type == pci" (Rob Herring) - Document R-Car E3 (R8A77990) bindings (Tho Vu) - Add device tree support for R-Car r8a7744 (Biju Das) - Drop unused mvebu PCIe capability code (Thomas Petazzoni) - Add shared PCI bridge emulation code (Thomas Petazzoni) - Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni) - Add aardvark Root Port emulation (Thomas Petazzoni) - Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach) - Add initial power management for i.MX7 (Leonard Crestez) - Add PME_Turn_Off support for i.MX7 (Leonard Crestez) - Fix qcom runtime power management error handling (Bjorn Andersson) - Update TI dra7xx unaligned access errata workaround for host mode as well as endpoint mode (Vignesh R) - Fix kirin section mismatch warning (Nathan Chancellor) - Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare) - Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I) - Update Keystone to use MRRS quirk for host bridge instead of open coding (Kishon Vijay Abraham I) - Refactor Keystone link establishment (Kishon Vijay Abraham I) - Simplify and speed up Keystone link training (Kishon Vijay Abraham I) - Remove unused Keystone host_init argument (Kishon Vijay Abraham I) - Merge Keystone driver files into one (Kishon Vijay Abraham I) - Remove redundant Keystone platform_set_drvdata() (Kishon Vijay Abraham I) - Rename Keystone functions for uniformity (Kishon Vijay Abraham I) - Add Keystone device control module DT binding (Kishon Vijay Abraham I) - Use SYSCON API to get Keystone control module device IDs (Kishon Vijay Abraham I) - Clean up Keystone PHY handling (Kishon Vijay Abraham I) - Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I) - Clean up Keystone config space access checks (Kishon Vijay Abraham I) - Get Keystone outbound window count from DT (Kishon Vijay Abraham I) - Clean up Keystone outbound window configuration (Kishon Vijay Abraham I) - Clean up Keystone DBI setup (Kishon Vijay Abraham I) - Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I) - Fix Keystone IRQ status checking (Kishon Vijay Abraham I) - Add debug messages for all Keystone errors (Kishon Vijay Abraham I) - Clean up Keystone includes and macros (Kishon Vijay Abraham I) - Fix Mediatek unchecked return value from devm_pci_remap_iospace() (Gustavo A. R. Silva) - Fix Mediatek endpoint/port matching logic (Honghui Zhang) - Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui Zhang) - Remove redundant Mediatek PM domain check (Honghui Zhang) - Convert Mediatek to pci_host_probe() (Honghui Zhang) - Fix Mediatek MSI enablement (Honghui Zhang) - Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang) - Add Mediatek loadable module support (Honghui Zhang) - Detach VMD resources after stopping root bus to prevent orphan resources (Jon Derrick) - Convert pcitest build process to that used by other tools (iio, perf, etc) (Gustavo Pimentel) * tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits) PCI/AER: Refactor error injection fallbacks PCI/AER: Abstract AER interrupt handling PCI/AER: Reuse existing pcie_port_find_device() interface PCI/AER: Use managed resource allocations PCI: pcie: Remove redundant 'default n' from Kconfig PCI: aardvark: Implement emulated root PCI bridge config space PCI: mvebu: Convert to PCI emulated bridge config space PCI: mvebu: Drop unused PCI express capability code PCI: Introduce PCI bridge emulated config space common logic PCI: vmd: Detach resources after stopping root bus nvmet: Optionally use PCI P2P memory nvmet: Introduce helper functions to allocate and free request SGLs nvme-pci: Add support for P2P memory in requests nvme-pci: Use PCI p2pmem subsystem to manage the CMB IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init\|destroy]() block: Add PCI P2P flag for request queue PCI/P2PDMA: Add P2P DMA driver writer's documentation docs-rst: Add a new directory for PCI documentation PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset ...	2018-10-25 06:50:48 -07:00
YueHaibing	d021613ee3	scsi: lpfc: Remove set but not used variables 'tgtp' Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_debugfs_nodelist_data': drivers/scsi/lpfc/lpfc_debugfs.c:553:29: warning: variable 'tgtp' set but not used [-Wunused-but-set-variable] It never used since `2b65e18202` ("scsi: lpfc: NVME Target: Add debugfs support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-10-16 21:58:05 -04:00
YueHaibing	f41d84d44a	scsi: lpfc: Remove set but not used variable 'psli' Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/lpfc/lpfc_hbadisc.c: In function 'lpfc_free_tx': drivers/scsi/lpfc/lpfc_hbadisc.c:5431:19: warning: variable 'psli' set but not used [-Wunused-but-set-variable] Since commit `895427bd01` ("scsi: lpfc: NVME Initiator: Base modifications") 'psli' is not used any more. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-10-16 21:46:46 -04:00
YueHaibing	feb59a3413	scsi: lpfc: Remove set but not used variables 'fc_hdr' and 'hw_page_size' Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_sli4_sp_handle_rcqe': drivers/scsi/lpfc/lpfc_sli.c:13430:26: warning: variable 'fc_hdr' set but not used [-Wunused-but-set-variable] drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_cq_create': drivers/scsi/lpfc/lpfc_sli.c:14852:11: warning: variable 'hw_page_size' set but not used [-Wunused-but-set-variable] Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-10-16 21:23:55 -04:00
Colin Ian King	c4dba187e6	scsi: lpfc: fix spelling mistake "Resrouce" -> "Resource" Trivial fix to spelling mistake in lpfc_printf_log message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-10-16 17:51:47 -04:00
Christoph Hellwig	416c461372	scsi: lpfc: remove a bogus pci_dma_sync_single_for_device call dma_alloc_coherent allocates memory that can be used by the cpu and the device at the same time, calls to pci_dma_sync_* are not required, and in fact actively harmful on some architectures like arm. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-10-15 22:41:00 -04:00
Oza Pawandeep	62b36c3ea6	PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls After `bfcb79fca1` ("PCI/ERR: Run error recovery callbacks for all affected devices"), AER errors are always cleared by the PCI core and drivers don't need to do it themselves. Remove calls to pci_cleanup_aer_uncorrect_error_status() from device driver error recovery functions. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, remove PCI core changes, remove unused variables] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-10-02 16:04:40 -05:00
James Smart	9e21017826	scsi: lpfc: Synchronize access to remoteport via rport The driver currently uses the ndlp to get the local rport which is then used to get the nvme transport remoteport pointer. There can be cases where a stale remoteport pointer is obtained as synchronization isn't done through the different dereferences. Correct by using locks to synchronize the dereferences. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-20 22:02:36 -04:00
YueHaibing	a63eba9efd	scsi: lpfc: Remove set but not used variable 'sgl_size' Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_new_nvme_buf': drivers/scsi/lpfc/lpfc_nvme.c:2238:24: warning: variable 'sgl_size' set but not used [-Wunused-but-set-variable] int bcnt, num_posted, sgl_size; ^ Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-17 03:01:43 -04:00
James Smart	6318cb7fb0	scsi: lpfc: update driver version to 12.0.0.7 Update the driver version to 12.0.0.7 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:34 -04:00
James Smart	d2cc9bcd7f	scsi: lpfc: add support to retrieve firmware logs This patch adds the ability to read firmware logs from the adapter. The driver registers a buffer with the adapter that is then written to by the adapter. The adapter posts CQEs to indicate content updates in the buffer. While the adapter is writing to the buffer in a circular fashion, an application will poll the driver to read the next amount of log data from the buffer. Driver log buffer size is configurable via the ras_fwlog_buffsize sysfs attribute. Verbosity to be used by firmware when logging to host memory is controlled through the ras_fwlog_level attribute. The ras_fwlog_func attribute enables or disables loggy by firmware. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	18027a8ccc	scsi: lpfc: reduce locking when updating statistics Currently, on each io completion, the stats update routine indiscriminately holds a lock. While holding the adapter-wide lock, checks are made to check whether status are being tracked. When disabled (the default), the locking wasted a lot of cycles. Check for stats enablement before taking the lock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	2879265f51	scsi: lpfc: Fix errors in log messages. Message 6408 is displayed for each entry in an array, but the cpu and queue numbers were incorrect for the entry. Message 6001 includes an extraneous character. Resolve both issues Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	aad59d5d34	scsi: lpfc: Correct invalid EQ doorbell write on if_type=6 During attachment, the driver writes the EQ doorbell to disable potential interrupts from an EQ. The current EQ doorbell format used for clearing the interrupt is incorrect and uses an if_type=2 format, making the operation act on the wrong EQ. Correct the code to use the proper if_type=6 EQ doorbell format. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	523128e53b	scsi: lpfc: Correct irq handling via locks when taking adapter offline When taking the board offline while performing i/o, unsafe locking errors occurred and irq level isn't properly managed. In lpfc_sli_hba_down, spin_lock_irqsave(&phba->hbalock, flags) does not disable softirqs raised from timer expiry. It is possible that a softirq is raised from the lpfc_els_retry_delay routine and recursively requests the same phba->hbalock spinlock causing deadlock. Address the deadlocks by creating a new port_list lock. The softirq behavior can then be managed a level deeper into the calling sequences. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	0ef01a2d95	scsi: lpfc: Correct soft lockup when running mds diagnostics When running an mds diagnostic that passes frames with the switch, soft lockups are detected. The driver is in a CQE processing loop and has sufficient amount of traffic that it never exits the ring processing routine, thus the "lockup". Cap the number of elements in the work processing routine to 64 elements. This ensures that the cpu will be given up and the handler reschedule to process additional items. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	ca7fb76e09	scsi: lpfc: Correct race with abort on completion path On io completion, the driver is taking an adapter wide lock and nulling the scsi command back pointer. The nulling of the back pointer is to signify the io was completed and the scsi_done() routine was called. However, the routine makes no check to see if the abort routine had done the same thing and possibly nulled the pointer. Thus it may doubly-complete the io. Make the following mods: - Check to make sure forward progress (call scsi_done()) only happens if the command pointer was non-null. - As the taking of the lock, which is adapter wide, is very costly on a system under load, null the pointer using an xchg operation rather than under lock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	faf0a5f829	scsi: lpfc: Raise nvme defaults to support a larger io and more connectivity When nvme is enabled, change the default for two parameters: sg_seg_cnt - raise the per-io sg list size so that 1MB ios are supported (based on a 4k buffer per element). iocb_cnt - raise the number of buffers used for things like NVME LS request/responses to allow more concurrent requests to for larger nvme configs. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	5b9e70b22c	scsi: lpfc: raise sg count for nvme to use available sg resources The driver allocates a sg list per io struture based on a fixed maximum size. When it registers with the protocol transports and indicates the max sg list size it supports, the driver manipulates the fixed value to report a lesser amount so that it has reserved space for sg elements that are used for DIF. The driver initialization path sets the cfg_sg_seg_cnt field to the manipulated value for scsi. NVME initialization ran afterward and capped it's maximum by the manipulated value for SCSI. This erroneously made NVME report the SCSI-reduce-for-DIF value that reduced the max io size for nvme and wasted sg elements. Rework the driver so that cfg_sg_seg_cnt becomes the overall maximum size and allow the max size to be tunable. A separate (new) scsi sg count is then setup with the scsi-modified reduced value. NVME then initializes based off the overall maximum. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	01a8aed6a0	scsi: lpfc: Fix GFT_ID and PRLI logic for RSCN Driver only sends NVME PRLI to a device that also supports FCP. This resuls in remote ports that don't have fc_remote_ports created for them. The driver is clearing the nlp_fc4_type for a ndlp at the wrong time. Fix by moving the nlp_fc4_type clearing to the discovery engine in the DEVICE_RECOVERY state. Also ensure that rport registration is done for all nlp_fc4_types. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:32 -04:00
Dan Carpenter	26c724a690	scsi: lpfc: remove an unnecessary NULL check Smatch complains about this code: drivers/scsi/lpfc/lpfc_scsi.c:1053 lpfc_get_scsi_buf_s4() warn: variable dereferenced before check 'lpfc_cmd' (see line 1039) Fortunately the NULL check isn't required so I have removed it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-30 07:33:55 -04:00
James Smart	53e13ee087	scsi: lpfc: Correct MDS diag and nvmet configuration A recent change added some MDS processing in the lpfc_drain_txq routine that relies on the fcp_wq being allocated. For nvmet operation the fcp_wq is not allocated because it can only be an nvme-target. When the original MDS support was added LS_MDS_LOOPBACK was defined wrong, (0x16) it should have been 0x10 (decimal value used for hex setting). This incorrect value allowed MDS_LOOPBACK to be set simultaneously with LS_NPIV_FAB_SUPPORTED, causing the driver to crash when it accesses the non-existent fcp_wq. Correct the bad value setting for LS_MDS_LOOPBACK. Fixes: `ae9e28f36a` ("lpfc: Add MDS Diagnostic support.") Cc: <stable@vger.kernel.org> # v4.12+ Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Tested-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-27 12:26:10 -04:00
James Smart	9abd9990e9	scsi: lpfc: Default fdmi_on to on Change default behavior for fdmi registration to on. [mkp: patch was mangled] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-27 12:26:10 -04:00
James Smart	7fa8512330	scsi: lpfc: update driver version to 12.0.0.6 Update the driver version to 12.0.0.6 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	06b6fa3815	scsi: lpfc: Remove lpfc_enable_pbde as module parameter Enablement of the PBDE optimization brought out some incompatible behaviors under error scenarios. Best to disable and remove the PBDE optimization. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	24bc311942	scsi: lpfc: Correct LCB ACCept payload After memory allocation for the LCB response frame, the memory wasn't zero initialized, and not all fields are set. Thus garbage shows up in the payload. Fix by zeroing the memory at allocation. Also properly set the Capability field based on duration support. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	2a5b7d626e	scsi: lpfc: Limit tracking of tgt queue depth in fast path Performance is affected when target queue depth is tracked. An atomic counter is incremented on the submission path which competes with it being decremented on the completion path. In addition, multiple CPUs can simultaniously be manipulating this counter for the same ndlp. Reduce the overhead by only performing the target increment/decrement when the target queue depth is less than the overall adapter depth, thus is actually meaningful. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	93a3922da4	scsi: lpfc: Fix driver crash when re-registering NVME rports. During remote port loss fault testing, the driver crashed with the following trace: general protection fault: 0000 [#1] SMP RIP: ... lpfc_nvme_register_port+0x250/0x480 [lpfc] Call Trace: lpfc_nlp_state_cleanup+0x1b3/0x7a0 [lpfc] lpfc_nlp_set_state+0xa6/0x1d0 [lpfc] lpfc_cmpl_prli_prli_issue+0x213/0x440 lpfc_disc_state_machine+0x7e/0x1e0 [lpfc] lpfc_cmpl_els_prli+0x18a/0x200 [lpfc] lpfc_sli_sp_handle_rspiocb+0x3b5/0x6f0 [lpfc] lpfc_sli_handle_slow_ring_event_s4+0x161/0x240 [lpfc] lpfc_work_done+0x948/0x14c0 [lpfc] lpfc_do_work+0x16f/0x180 [lpfc] kthread+0xc9/0xe0 ret_from_fork+0x55/0x80 After registering a new remoteport, the driver is pulling an ndlp pointer from the lpfc rport associated with the private area of a newly registered remoteport. The private area is uninitialized, so it's garbage. Correct by pulling the the lpfc rport pointer from the entering ndlp point, then ndlp value from at rport. Note the entering ndlp may be replacing by the rport->ndlp due to an address change swap. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	8931c73bee	scsi: lpfc: Fix list corruption on the completion queue. Enabling list_debug showed the drivers txcmplq was suffering list corruption. The systems will eventually crash because the iocb free list gets crossed linked with the prings txcmplq. Most systems will run for a while after the corruption, but will eventually crash when a scsi eh reset occurs and the txcmplq is attempted to be flushed. The flush gets stuck in an endless loop. The problem is the abort handler does not hold the sli4 ring lock while validating the IO so the IO could complete while the driver is still preping the abort. The erroneously generated abort, when it completes, has pointers to the original IO that has already completed, and the IO manipulation (for the second time) corrupts the list. Correct by taking the ring lock early in the abort handler so the erroneous abort won't be sent if the io has/is completing. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:18 -04:00
James Smart	b615a20adf	scsi: lpfc: Fix sysfs Speed value on CNA ports CNA ports were showing speed as "unknown" even if the link is up. Add speed decoding for FCOE-based adapters. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:18 -04:00
James Smart	faa832e97a	scsi: lpfc: Fix ELS abort on SLI-3 adapters For ABORT_XRI_CN command, firmware identifies XRI to abort by IOTAG and RPI combination. For ELS aborts, driver specifies IOTAG correctly but RPI is not specified. Fix by setting RPI in WQE. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:18 -04:00
Colin Ian King	cc74e31d41	scsi: lpfc: remove null check on nvmebuf The null checks on nvmebuf are redundant as nvmebuf is always obtained from a container_of() and hence can never be null. Remove all the redundant null checks. This also cleans up a static analysis warning. Detected by CoverityScan, CID#1471753 ("Dereference before null check") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-30 23:17:53 -04:00
Johannes Thumshirn	c6668cae16	scsi: lpfc: remove ScsiResult macro Remove the ScsiResult macro and open code it on all call sites. This will make subsequent refactoring in this area easier. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:42:47 -04:00
James Smart	4ae2ebde31	scsi: lpfc: Revise copyright for new company language Change references from "Broadcom Limited" to "Broadcom Inc." in the copyright message. Update copyright duration if not yet updated for 2018. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	3e1ebadd88	scsi: lpfc: update driver version to 12.0.0.5 Update the driver version to 12.0.0.5 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	b0e830125b	scsi: lpfc: devloss timeout race condition caused null pointer reference A race condition between the context of devloss timeout handler and I/O completion caused devloss timeout handler de-referencing pointer that had been released. Added the check in lpfc_sli_validate_fcp_iocb() on LPFC_IO_ON_TXCMPLQ to capture the race condition of I/O completion and devloss timeout handler attemption for aborting the I/O. Also, added check on lpfc_cmd->rdata pointer before de-referenceing lpfc_cmd->rdata->pnode. Also, added protection in lpfc_sli_abort_iocb() routine on driver performed FCP I/O FLUSHING already under way before proceeding to aborting I/Os. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	6871e8144f	scsi: lpfc: Fix NVME Target crash in defer rcv logic Kernel occasionally crashed with the following ops on NVME Target: BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffffa042ee50>] lpfc_nvmet_defer_rcv+0x50/0x70 [lpfc] Callback routine was called for deferred rcv when it should be treated as a normal rcv. Added code in callback routine to detect this condition and log a message, then bail. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	66e9e6bf07	scsi: lpfc: Support duration field in Link Cable Beacon V1 command Current implementation missed setting the duration field. Correct the code to set the field. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	414abe0ab6	scsi: lpfc: Make PBDE optimizations configurable The PBDE optimizations aren't supported in all firmware revs. Make optimizations configurable in case there's a side effect on old firmware. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	68c9b55dee	scsi: lpfc: Fix abort error path for NVMET rmmod of driver hangs As driver instances were being unloaded, the NVME target port was unloaded first. During the unload, the NVME initiator port sent a heartbeat IO. Because of the target port state, that IO was scheduled for an Abort; however, that abort subsequently failed. The failure was not cleaned up properly and lpfc_sli4_xri_exchange_busy_wait silently hung forever. Clean failed abort properly and make lpfc_sli4_xri_exchange_busy_wait not hangs silently while waiting for aborts to complete. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	d580c61374	scsi: lpfc: Fix panic if driver unloaded when port is offline System crashes when the lpfc module is unloaded after making the port offline The nvme queue pointers were freed during port offline, but were later accessed in pci remove path. Validate the pointers in pci remove path before accessing them. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:08 -04:00
James Smart	5cc167ddb7	scsi: lpfc: Fix driver not setting dpp bits correctly in doorbell word Driver is incorrectly formatting a register on new hardware, using a format for an older chip. This can result in non-deterministic behavior. Ensure driver is not setting "workqueue index" in the WQ doorbell when making a non-dpp doorbell write. The field must be zero when non-dpp. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:08 -04:00
James Smart	afff0d2321	scsi: lpfc: Add Buffer overflow check, when nvme_info larger than PAGE_SIZE Kernel crashes during fill_read_buffer when nvme_info sysfs file read. With multiple NVME targets, approx 40, nvme_info may grow larger than PAGE_SIZE bytes. snprintf(buf + len, PAGE_SIZE - len, ...) logic is flawed as PAGE_SIZE - len can be < 0 and is accepted by snprintf. This results in buffer overflow, and is detected with check from dev_attr_show and fill_read_buffer. Change to use scnprintf to a tmp array, before calling strlcat to ensure no buffer overflow over PAGE_SIZE bytes. Message "6314" created as a new message indicating when there is more nvme info, but is truncated to fit within PAGE_SIZE bytes. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:08 -04:00
Arnd Bergmann	c4d6204dc1	scsi: lpfc: use monotonic timestamps for statistics The get_seconds() function suffers from a possible overflow in 2038 or 2106, as well as jitter due to settimeofday or leap second updates, and is deprecated. As we are interested in elapsed time only, using ktime_get_seconds() to read the CLOCK_MONOTONIC timebase is ideal here. This also lets us remove the hack that tries to deal with get_seconds() going slightly backwards, which cannot happen with montonic timestamps. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-06-26 12:00:27 -04:00
Kees Cook	6396bb2215	treewide: kzalloc() -> kcalloc() The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(char) * COUNT + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) \| kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) \| kzalloc(sizeof(TYPE) * C2, ...) \| kzalloc(C1 * C2 * C3, ...) \| kzalloc(C1 * C2, ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) \| - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) \| - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Kees Cook	6da2ec5605	treewide: kmalloc() -> kmalloc_array() The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kmalloc( - sizeof(char) * COUNT + COUNT , ...) \| kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) \| kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) \| kmalloc(sizeof(TYPE) * C2, ...) \| kmalloc(C1 * C2 * C3, ...) \| kmalloc(C1 * C2, ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) \| - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) \| - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
James Smart	1b5c2cb196	scsi: lpfc: update driver version to 12.0.0.4 Update the driver version to 12.0.0.4 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-28 22:40:33 -04:00
James Smart	b92dc72df3	scsi: lpfc: Fix port initialization failure. The driver exits port setup after failing the lpfc_sli4_get_parameters command (messages 0356, 2541, & 1412). The older CNA adapters do not support the MBX command. In the past the code was allowed to fail and continue on with initialization. However a nvme change moved a closing bracket and now makes all failures terminal. Revise the logic so that terminal failure only occurs if the command failed on the newer adapters. Additionally, if parameters are set that require information from the command and the command failed, the parameters are erroneous and port set up should fail even on the older adapters. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-28 22:40:33 -04:00
James Smart	c221768bd4	scsi: lpfc: Fix 16gb hbas failing cq create. The lancer G5 chip family fails the CQ create with 16k page size. The hardware incorrectly reports it supports large page sizes when it is actually limited to 4k pages. A prior patch resolved this for the A0 chip revision only. This patch excludes all revisions of the G5 asic from using large page sizes. As knowing the actual chip revision is unnecessary, the now unused definitions are removed Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-28 22:40:33 -04:00
James Smart	7438273fa2	scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc modprobe -r lpfc produces the following: Call Trace: __blk_mq_run_hw_queue+0xa2/0xb0 __blk_mq_delay_run_hw_queue+0x9d/0xb0 ? blk_mq_hctx_has_pending+0x32/0x80 blk_mq_run_hw_queue+0x50/0xd0 blk_mq_sched_insert_request+0x110/0x1b0 blk_execute_rq_nowait+0x76/0x180 nvme_keep_alive_work+0x8a/0xd0 [nvme_core] process_one_work+0x17f/0x440 worker_thread+0x126/0x3c0 ? manage_workers.isra.24+0x2a0/0x2a0 kthread+0xd1/0xe0 ? insert_kthread_work+0x40/0x40 ret_from_fork_nospec_begin+0x21/0x21 ? insert_kthread_work+0x40/0x40 However, rmmod lpfc would run correctly. When an nvme remoteport is unregistered with the host nvme transport, it needs to set the remoteport->dev_loss_tmo value 0 to indicate an immediate termination of device loss and prevent any further keep alives to that rport. The driver was never setting dev_loss_tmo causing the nvme transport to continue to send the keep alive. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-28 22:40:33 -04:00
James Smart	4d5e789a2e	scsi: lpfc: correct oversubscription of nvme io requests for an adapter Under large configurations, the driver would start to log message 6065 - NVME out of buffers (exchanges). The driver is using the ndlp cmd_qdepth value when determining the max outstanding ios for an adapter. This value, by default, is set to 65536, which exceeds the maximum exchange counts supported on an adapter. The ndlp cmd_qdepth has no relevance and outstanding io count should be capped at the max exchange count with IO requests beyond that level getting bounced back with an EBUSY status so that they are retried by the block layer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-28 22:40:33 -04:00
James Smart	dc19e3b4a8	scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx) MDS diagnostics fail because of frame count mismatch. Unavailability of SGL is the trigger for this issue. If ELS SGL is not available to process MDS frame, IOCB is put in FCP txq but not attempted to post afterwards. So, driver stops processing incoming frames as it runs out of IOCB. lpfc_drain_txq attempts to submit IOCBS that are queued in ELS txq but MDS frames are posted to FCP WQ. Attempt to submit IOCBs that are present in FCP txq when MDS loopback is running. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-28 22:40:32 -04:00
Colin Ian King	7afc0ce912	scsi: lpfc: fix spelling mistakes: "mabilbox" and "maibox" Trivial fix to spelling mistakes in lpfc_printf_log log message "mabilbox" -> "mailbox" "maibox" -> "mailbox" Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:26:44 -04:00
James Smart	3e21d1cb0f	scsi: lpfc: Comment cleanup regarding Broadcom copyright header Fix small formatting and wording nits in Broadcom copyright header Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	e65d8c3411	scsi: lpfc: update driver version to 12.0.0.3 Update the driver version to 12.0.0.3 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	11f0e34ff4	scsi: lpfc: Enhance log messages when reporting CQE errors Enhance log messages for CQEs as they were not reporting certain fields. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	44c2757b76	scsi: lpfc: Fix up log messages and stats counters in IO submit code path Fix up log messages and add an fcp error stat counter in the IO submit code path to make diagnosing problems easier Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	d38f33b304	scsi: lpfc: Driver NVME load fails when CPU cnt > WQ resource cnt If the cpu count is larger than the number of WQ resources available, adapter attachment eventually failes due to a WQ_CREATE failure. Calculate the number of WQs desired (which initializes to cpu count) after accounting for the number of queues the adapter supports and the number allocated to SCSI and the control/ELS path, and scale down if necessary. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	23288b78a1	scsi: lpfc: Handle new link fault code returned by adapter firmware. The driver encounters a link event ACQE with a fault code it doesn't recognize, it logs an "Invalid" fault type and futher treats the unknown value as a mailbox command failure. First off, there is no "invalid" value, only values that are unknown. Secondly, the fault code doesn't indicate status - the rest of the ACQE contains that status so there is no reason to "fail the commands". Change the "Invalid" to "Unknown". There is no "invalid" code value. Separate fault code parsing and message genaration from any mbx handling status. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	a72d56b2a6	scsi: lpfc: Correct fw download error message In situations when the firmware image in inappropriate for the chip type, initial validation checks were light, allowing the checks to pass, thus allowing the firmware to be downloaded. Eventually, after the download, the chip rejects the firmware but it is logged as a generic firmware download error. Revise the initial checks to validate the image vs asic type so that the correct message is displayed and the download process is avoided. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	48f8fdb4b4	scsi: lpfc: enhance LE data structure copies to hardware The driver builds the control structures in host memory using definitions that are based on 32-bit words. After building the structure it is then written to the adapter. This patch slightly optimizes LE hosts by copying the structures via 64-bit copies. This is doable as the adapter interface is LE thus there is no byteswapping as the copy is performed. The same optimization would be nice on BE systems, but when byteswapping occurs, it swaps 32-bit words as well, thus trashing the control structure. Given amount of code that is dependent upon the 32-bit word definition, it was decided to not change things for the minor optimization. Thus PPC 64-bit systems sticks with doing 32-bit copies. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:15 -04:00

... 4 5 6 7 8 ...

1859 Commits