Sync code to the same with tk4 pub/lts/0017-kabi, except deleted rue
and wujing. Partners can submit pull requests to this branch, and we
can pick the commits to tk4 pub/lts/0017-kabi easly.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Gitee limit the repo's size to 3GB, to reduce the size of the code,
sync codes to ock 5.4.119-20.0009.21 in one commit.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Sync kernel codes to the same with 590eaf1fec ("Init Repo base on
linux 5.4.32 long term, and add base tlinux kernel interfaces."), which
is from tk4, and it is the base of tk4.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Nine changes, eight in drivers [ufs, target, lpfc x 2, qla2xxx x 4]
and one core change in sd that fixes an I/O failure on DIF type 3
devices.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXbzO+iYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYOpAP9/BCSY
2TAFlli2rVQe+ZNjhHcE4Gj92HNPO7ZgvDQvWgD9F184tjG+1pntYGFutoso7Ak6
QimtBw4AuYg9eDKJDKU=
=bQRX
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Nine changes, eight in drivers [ufs, target, lpfc x 2, qla2xxx x 4]
and one core change in sd that fixes an I/O failure on DIF type 3
devices"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: stop timer in shutdown path
scsi: sd: define variable dif as unsigned int instead of bool
scsi: target: cxgbit: Fix cxgbit_fw4_ack()
scsi: qla2xxx: Fix partial flash write of MBI
scsi: qla2xxx: Initialized mailbox to prevent driver load failure
scsi: lpfc: Honor module parameter lpfc_use_adisc
scsi: ufs-bsg: Wake the device before sending raw upiu commands
scsi: lpfc: Check queue pointer before use
scsi: qla2xxx: fixup incorrect usage of host_byte
Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
53c710[x2], target) and one core change that tries to close a race
between sysfs delete and module removal.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXbN1gSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWUzAP4tB9Z+
X5zfnMLmeAtSCnVwIgFX3/GVSFfzEmi+3VxfBQEA3nfs5AAJCPsaTk9z+jLtAKPk
6uYoHwsyTHal19Ojt9g=
=IOPn
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
53c710[x2], target) and one core change that tries to close a race
between sysfs delete and module removal"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: remove left-over BUILD_NVME defines
scsi: core: try to get module before removing device
scsi: hpsa: add missing hunks in reset-patch
scsi: target: core: Do not overwrite CDB byte 1
scsi: ch: Make it possible to open a ch device multiple times again
scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
scsi: sni_53c710: fix compilation error
scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
scsi: qla2xxx: fix a potential NULL pointer dereference
The initial lpfc_desc_set_adisc implementation in commit
dea3101e0a ("lpfc: add Emulex FC driver version 8.0.28") enabled ADISC if
cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE
In commit 92d7f7b0cd ("[SCSI] lpfc: NPIV: add NPIV support on top of
SLI-3") this changed to
(cfg_use_adisc && RSC_MODE) || FCP_2_DEVICE
and later in commit ffc954936b ("[SCSI] lpfc 8.3.13: FC Discovery Fixes
and enhancements.") to
(cfg_use_adisc && RSC_MODE) || (FCP_2_DEVICE && FCP_TARGET)
A customer reports that after a devloss, an ADISC failure is logged. It
turns out the ADISC flag is set even the user explicitly set lpfc_use_adisc
= 0.
[Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x82000000 x8 xa
[Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure DID:05DF00 Status:x9/x70000
[mkp: fixed Hannes' email]
Fixes: 92d7f7b0cd ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3")
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: James Smart <james.smart@broadcom.com>
Link: https://lore.kernel.org/r/20191022072112.132268-1-dwagner@suse.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The queue pointer might not be valid. The rest of the code checks the
pointer before accessing it. lpfc_sli4_process_missed_mbox_completions is
the only place where the check is missing.
Fixes: 657add4e5e ("scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors")
Cc: James Smart <jsmart2021@gmail.com>
Link: https://lore.kernel.org/r/20191018162111.8798-1-dwagner@suse.de
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The BUILD_NVME define never got defined anywhere, causing NVMe commands to
be treated as SCSI commands when freeing the buffers. This was causing a
stuck discovery and a horrible crash in lpfc_set_rrq_active() later on.
Link: https://lore.kernel.org/r/20191017150019.75769-1-hare@suse.de
Fixes: c00f62e6c5 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates.
The only core change this time around is the addition of request
batching for virtio. Since batching requires an additional flag to
use, it should be invisible to the rest of the drivers.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXYQE/yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXs9AP4usPY5
OpMlF6OiKFNeJrCdhCScVghf9uHbc7UA6cP+EgD/bCtRgcDe1ZjOTYWdeTwvwWqA
ltWYonnv6Lg3b1f9yqI=
=jRC/
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates. The
only core change this time around is the addition of request batching
for virtio. Since batching requires an additional flag to use, it
should be invisible to the rest of the drivers"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (264 commits)
scsi: hisi_sas: Fix the conflict between device gone and host reset
scsi: hisi_sas: Add BIST support for phy loopback
scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation
scsi: hisi_sas: Remove some unused function arguments
scsi: hisi_sas: Remove redundant work declaration
scsi: hisi_sas: Remove hisi_sas_hw.slot_complete
scsi: hisi_sas: Assign NCQ tag for all NCQ commands
scsi: hisi_sas: Update all the registers after suspend and resume
scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device
scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails
scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled
scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset()
scsi: hisi_sas: add debugfs auto-trigger for internal abort time out
scsi: virtio_scsi: unplug LUNs when events missed
scsi: scsi_dh_rdac: zero cdb in send_mode_select()
scsi: fcoe: fix null-ptr-deref Read in fc_release_transport
scsi: ufs-hisi: use devm_platform_ioremap_resource() to simplify code
scsi: ufshcd: use devm_platform_ioremap_resource() to simplify code
scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code
scsi: ufs: Use kmemdup in ufshcd_read_string_desc()
...
A recent patch unconditionally marks the hba as in error as part of
resetting the adapter. The driver flow that called the adapter reset was a
recovery path, which expects the adapter to not be in an error state in
order to finish the recovery. Given the new error state being set, the
recovery fails and the adapter is left in limbo.
Revise the adapter reset routine so that it will only mark the adapter in
error if it was unable to reset the adapter.
Fixes: 8c24a4f643 ("scsi: lpfc: Fix crash due to port reset racing vs adapter error handling")
Link: https://lore.kernel.org/r/20190903215441.10490-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Convert the remaining %pf users to %ps to prepare for the removal of the
old %pf conversion specifier support.
Fixes: 3235066449 ("scsi: lpfc: Migrate to %px and %pf in kernel print calls")
Link: https://lore.kernel.org/r/20190904160423.3865-1-sakari.ailus@linux.intel.com
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The 12.4.0.0 patch that merged WQ/CQ pairs into single per-cpu pair
contained a bug: a local variable was set to the queue pair by index. This
should have allowed the local variable to be natively used. Instead, the
code reused the index relative to the local variable, obtaining a random
pointer value that when used eventually faulted the system
Convert offending code to use local variable.
Fixes: c00f62e6c5 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Raise the config max for lpfc_fcp_mq_threshold variable to 256.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Capturing and downloading dif command data and dif data was done a dozen
years ago and no longer being used. Also creates a potential security hole.
Remove the debugfs buffer for dif debugging.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com>
CC: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Per Dan Carpenter:
The patch d79c9e9d4b3d: "scsi: lpfc: Support dynamic unbounded SGL lists on
G7 hardware." from Aug 14, 2019, leads to the following static checker
warning:
drivers/scsi/lpfc/lpfc_init.c:4107 lpfc_new_io_buf()
error: not allocating enough data 784 vs 768
There was no need to compare sizes nor to allocate size based on a define.
Change allocation to use actual structure length
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 12.4.0.0
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, each hardware queue, typically allocated per-cpu, consists of a
WQ/CQ pair per protocol. Meaning if both SCSI and NVMe are supported 2
WQ/CQ pairs will exist for the hardware queue. Separate queues are
unnecessary. The current implementation wastes memory backing the 2nd set
of queues, and the use of double the SLI-4 WQ/CQ's means less hardware
queues can be supported which means there may not always be enough to have
a pair per cpu. If there is only 1 pair per cpu, more cpu's may get their
own WQ/CQ.
Rework the implementation to use a single WQ/CQ pair by both protocols.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FC-NVMe-2 added support for sequence level error recovery in the FC-NVME
protocol. This allows for the detection of errors and lost frames and
immediate retransmission of data to avoid exchange termination, which
escalates into NVMeoFC connection and association failures. A significant
RAS improvement.
The driver is modified to indicate support for SLER in the NVMe PRLI is
issues and to check for support in the PRLI response. When both sides
support it, the driver will set a bit in the WQE to enable the recovery
behavior on the exchange. The adapter will take care of all detection and
retransmission.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Typical SLI-4 hardware supports up to 2 4KB pages to be registered per XRI
to contain the exchanges Scatter/Gather List. This caps the number of SGL
elements that can be in the SGL. There are not extensions to extend the
list out of the 2 pages.
The G7 hardware adds a SGE type that allows the SGL to be vectored to a
different scatter/gather list segment. And that segment can contain a SGE
to go to another segment and so on. The initial segment must still be
pre-registered for the XRI, but it can be a much smaller amount (256Bytes)
as it can now be dynamically grown. This much smaller allocation can
handle the SG list for most normal I/O, and the dynamic aspect allows it to
support many MB's if needed.
The implementation creates a pool which contains "segments" and which is
initially sized to hold the initial small segment per xri. If an I/O
requires additional segments, they are allocated from the pool. If the
pool has no more segments, the pool is grown based on what is now
needed. After the I/O completes, the additional segments are returned to
the pool for use by other I/Os. Once allocated, the additional segments are
not released under the assumption of "if needed once, it will be needed
again". Pools are kept on a per-hardware queue basis, which is typically
1:1 per cpu, but may be shared by multiple cpus.
The switch to the smaller initial allocation significantly reduces the
memory footprint of the driver (which only grows if large ios are
issued). Based on the several K of XRIs for the adapter, the 8KB->256B
reduction can conserve 32MBs or more.
It has been observed with per-cpu resource pools that allocating a resource
on CPU A, may be put back on CPU B. While the get routines are distributed
evenly, only a limited subset of CPUs may be handling the put routines.
This can put a strain on the lpfc_put_cmd_rsp_buf_per_cpu routine because
all the resources are being put on a limited subset of CPUs.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added code to support driver loopback with MDS Diagnostics. This style of
diagnostics passes frames from the fabric to the driver who then echo them
back out the link. SEND_FRAME WQEs are used to transmit the frames. Added
the SOF and EOF field location definitions for use by SEND_FRAME.
Also ensure that enable_mds_diags is a RW parameter.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To aid better hardware detection when there are issues, report the first
and second level hardware revisions from the READ_REV command. Add the
elements to the existing hardware id string.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In order to see real addresses, convert %p with %px for kernel addresses
and replace %p with %pf for functions.
While converting, standardize on "x%px" throughout (not %px or 0x%px).
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While performing code review, several relatively simple optimizations can
be done in the fast path.
Add these optimizations (unlikely designators).
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Running on Coverity produced the following errors:
- coding style (indentation)
- memset size mismatch errors
note: comment cases where it is purposely a mismatch
Fix the errors.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
modinfo for lpfc_nvme_enable_fb is incorrect. FirstBurst on lpfc target is
not fully supported.
Update the attribute description
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is allowing the user to change lpfc_enable_bg while loading the
driver against a FCoE adapter. This is not supported.
No check is made for the adapter type when applying the blockguard
enablement value.
Fix by verifying the adapter type before setting the enablement flag.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
GetTrunkInfo is displaying an incorrect link speed when the link is a trunk
and the link has gone down. The driver is not clearing the logical speed
as part of the link down transition.
Fix by setting the logical speed to UNKNOWN SPEED when the link goes down.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Max Frame Size value is shown as 34816 in fdmishow from Switch.
The driver uses bbRcvSize in common service param which is obtained from
the READ_SPARM mailbox command. The bbRcvSize field which is displayed is a
three nibble field but the driver is printing a full four nibbles.
Fix by masking off the upper nibble.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The scsi transport fc bsg interface does not expect the bsg_job_done()
callback to be done if the bsg request call returns failure. Several of the
HST_VENDOR cases in the driver unconditionally call bsg_job_done()
regardless of the returning value.
Fix the code to only call bsg_job_done() if the call to lpfc_bsg_request()
will return success.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When forcing the use of MSI (vs MSI-X) the driver is crashing in
pci_irq_get_affinity.
The driver was not using the new pci_alloc_irq_vectors interface in the MSI
path.
Fix by using pci_alloc_irq_vectors() with PCI_RQ_MSI in the MSI path.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is currently reporting a non-zero nvme sg_seg_cnt value of 256
when nvme is disabled. It should be zero.
Fix by ensuring the value is cleared.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If an unsolicited ABTS was received, the driver looks up the exchange it
references. It it does various searches looking for the exchange
context. When one is eventually matched and it is associated with an XRI
context, the driver sends an ABORT WQE to terminate the exchange. Current
code looks at whether the transport had taken action on the XRI yet or not
(no action if set to LPFC_NVMET_STE_RCV; action if non-LPFC_NVMET_STE_RCV).
Based on action or not one of two (sol vs unsol) issue abort routines are
called. The unsol version cheats and transmits a sequence containing an
ABTS with no interaction with the adapter. The sol version issues an Abort
WQE and lets the adapter manage whether the ABTS is sent to not.
The issue is the unsol version is sending ABTS unconditionally for the
exchange that received the ABTS. It's unnecessary.
Remove the conditional and just call the adapter command-based routine to
let the adapter manage the ABTS.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As part of firmware download, the adapter is reset. On the adapter the
reset causes the function to stop and all outstanding io is terminated
(without responses). The reset path then starts teardown of the adapter,
starting with deregistration of the remote ports with the nvme-fc
transport. The local port is then deregistered and the driver waits for
local port deregistration. This never finishes.
The remote port deregistrations terminated the nvme controllers, causing
them to send aborts for all the outstanding io. The aborts were serviced in
the driver, but stalled due to its state. The nvme layer then stops to
reclaim it's outstanding io before continuing. The io must be returned
before the reset on the controller is deemed complete and the controller
delete performed. The remote port deregistration won't complete until all
the controllers are terminated. And the local port deregistration won't
complete until all controllers and remote ports are terminated. Thus things
hang.
The issue is the reset which stopped the adapter also stopped all the
responses that would drive i/o completions, and the aborts were also
stopped that stopped i/o completions. The driver, when resetting the
adapter like this, needs to be generating the completions as part of the
adapter reset so that I/O complete (in error), and any aborts are not
queued.
Fix by adding flush routines whenever the adapter port has been reset or
discovered in error. The flush routines will generate the completions for
the scsi and nvme outstanding io. The abort ios, if waiting, will be caught
and flushed as well.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This issue is specific to SLI-3 adapters, specifically when DIF is used.
Once seen, this message floods the logs:
9064 BLKGRD: lpfc_scsi_prep_dma_buf_s3: Too many sg segments from
dma_map_sg
The driver, upon detecting an error such as too many elements in an sglist,
misrepresents the error by treating it as a temporary resource issue by
returning MLQUEUE_HOST_BUSY. In these cases, no retry will fix it and it
should have been a hard error. The repeated retry was causing the spamming
of the log.
As for the initial reason of why an I/O encountered this issue at all is
not clear as parameters set by the driver should have avoided this. The
dm multipath maintainer has been notified of the issue.
Fix by changing the return code for the dma mapping routines to indicate
cases that are not retryable and return DID_ERROR on those cases.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the adapter encounters a condition which causes the adapter to fail
(driver must detect the failure) simultaneously to a request to the driver
to reset the adapter (such as a host_reset), the reset path will be racing
with the asynchronously-detect adapter failure path. In the failing
situation, one path has started to tear down the adapter data structures
(io_wq's) while the other path has initiated a repeat of the teardown and
is in the lpfc_sli_flush_xxx_rings path and attempting to access the
just-freed data structures.
Fix by the following:
- In cases where an adapter failure is detected, rather than explicitly
calling offline_eratt() to start the teardown, change the adapter state
and let the later calls of posted work to the slowpath thread invoke the
adapter recovery. In essence, this means all requests to reset are
serialized on the slowpath thread.
- Clean up the routine that restarts the adapter. If there is a failure
from brdreset, don't immediately error and leave things in a partial
state. Instead, ensure the adapter state is set and finish the teardown
of structures before returning.
- If in the scsi host reset handler and the board fails to reset and
restart (which can be due to parallel reset/recovery paths), instead of
hard failing and explicitly calling offline_eratt() (which gets into the
redundant path), just fail out and let the asynchronous path resolve the
adapter state.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During cable pull testing a deadlock was seen between lpfc_nlp_counters()
vs lpfc_mbox_process_link_up() vs lpfc_work_list_done(). They are all
waiting on the shost->host_lock.
Issue is all of these cases raise irq when taking out the lock but use
spin_unlock_irq() when unlocking. The unlock path is will unconditionally
re-enable interrupts in cases where irq state should be preserved. The
re-enablement allowed the other paths to execute which then causes the
deadlock.
Fix by converting the lock/unlock to irqsave/irqrestore.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In a test with high nvme remote port counts connected via a multi-hop FC
switch config where switches were systematically reset (e.g. fabric
partitioning and re-establishment), the nvme remote ports would switch
addresses based on the switch reconfiguration events. The driver would get
into a situation where the nvme port changed address, PLOGI and PRLI would
succeed nvme transport registration occurred, but subsequent LS requests by
the nvme subsystem failed due to a bad ndlp state and connectivity to the
device failed.
The driver hit a race condition on multiple devices that address swapped
simultaneously. In cases where the driver notices the remote port structure
came back as the same value as previously (meaning a nvme_rport structure
was re-enabled and did not go through devloss_tmo/connect_tmo_failures on
all controllers) the driver would unconditionally exit assuming the ndlp
information was correct. But, if the ndlp's had been swapped, the ndlp had
stale port state information, which when used by the LS request commands,
would fail the commands.
Fix by checking whether a node swap had occurred, and only exit if no ndlp
swap had occurred.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In situations where zoning is not being used, thus NVMe initiators see
other NVMe initiators as well as NVMe targets, a link bounce on an
initiator will cause the NVMe initiators to spew "6169" State Error
messages.
The driver is not qualifying whether the remote port is a NVMe targer or
not before calling the lpfc_nvme_rescan_port(), which validates the role
and prints the message if its only an NVMe initiator.
Fix by the following:
- Before calling lpfc_nvme_rescan_port() ensure that the node is a NVMe
storage target or a NVMe discovery controller.
- Clean up implementation of lpfc_nvme_rescan_port. remoteport pointer
will always be NULL if a NVMe initiator only. But, grabbing of
remoteport pointer should be done under lock to coincide with the
registering of the remote port with the fc transport.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On an SLI-3 adapter which does not support NVMe, but with the driver global
attribute to enable nvme on any adapter if it does support NVMe
(e.g. module parameter lpfc_enable_fc4_type=3), the SGL and total SGE
values are being munged by the protocol enablement when it shouldn't be.
Correct by changing the location of where the NVME sgl information is being
applied, which will avoid any SLI-3-based adapter.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If admin changes the devloss_tmo on an rport via the fc_remote_port rport
dev_loss_tmo attribute, the value is on set on scsi stack. The change is
not propagated to NVMe.
The set routine in the lldd lacks the call to
nvme_fc_set_remoteport_devloss() to set the value.
Fix by adding the call to the lldd set routine.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In tests with remote ports contantly logging out/logging coupled with
occassional local link bounce, if a remote port is disocnnected for longer
than devloss_tmo and then subsequently reconnected, eventually the test
will fail to login with the remote port and remote port connectivity is
lost.
When devloss_tmo expires, the driver does not free the node struct until
the port or npiv instances is being deleted. The node is left allocated but
the state set to UNUSED. If the node was in the process of logging in when
the local link drop occurred, meaning the RPI was allocated for the node in
order to send the ELS, but not yet registered which comes after successful
login, the node is moved to the NPR state, and if devloss expires, to
UNUSED state. If the remote port comes back, the node associated with it
is restarted and this path happens to allocate a new RPI and overwrites the
prior RPI value. In the cases where the port was logged in and loggs out,
the path did release the RPI but did not set the node rpi value. In the
cases where the remote port never finished logging in, the path never did
the call to release the rpi. In this latter case, when the node is
subsequently restore, the new rpi allocation overwrites the rpi that was
not released, and the rpi is now leaked. Eventually the port will run out
of RPI resources to log into new remote ports.
Fix by following changes:
- When an rpi is released, do so under locks and ensure the node rpi value
is set to a non-allocated value (LPFC_RPI_ALLOC_ERROR). Note:
refactored to a small service routine to avoid indentation issues.
- When re-enabling a node, check the rpi value to determine if a new
allocation is necessary. If already set, use the prior rpi.
Enhanced logging to help in the future.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a remote port is removed and remains removed for devloss_tmo, if an RSCN
is subsequently received indicating the presence of the remte port, the
driver does not login to and rediscovery the remote port.
Currently, in order to for a port to be rediscovered post an RSCN, the node
state must be NPR to reflect not logged in. When devloss expires, the node
state is marked UNUSED. When an RSCN occurs, the nodes referenced by the
RSCN will have a NPR_2B_DISC flag set, but the re-login will only be
attempted if the node is in NPR_NODE state. Thus the node is skipped over.
Fix by recognizing the NPR_2B_DISC and UNUSED and transition the node back
to NPR state to allow the re-login to take place.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If an admin updates lpfc's devloss_tmo sysfs attribute, the kernel will
oops.
Coding of a loop allowed a new value (rport) to be set/checked for null
followed by an older value (remoteport) checked for null to allow progress
where the new value, even though null, will be referenced.
Rework the logic to validate and prevent any reference to the null ptr.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It's possible for the driver to initiate an FLOGI and before it completes,
another link down/up transition occurs requiring a new FLOGI. Currently,
nothing is done to abort/noop the older FLOGI request to the adapter, so if
this transition occurs and the FLOGI completion is received after the link
down/up transition, the driver may erroneously act on the older FLOGI. In
most cases, the adapter properly terminates/fails the FLOGI, but there is a
timing condition where the FLOGI may complete on the wire prior to the
transition, but the response may not be seen/processed by the driver before
the driver sees the link transition.
Fix by having the link down handler in the driver run through any
outstanding ELS's and change the completion handler of the ELS so that it
will be no-op'd and released.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When tearing down the adapter for a reset, online/offline, or driver
unload, the queue free routine would hit a GPF oops. This only occurs on
conditions where the number of hardware queues created is fewer than the
number of cpus in the system. In this condition cpus share a hardware
queue. And of course, it's the 2nd cpu that shares a hardware that
attempted to free it a second time and hit the oops.
Fix by reworking the cpu to hardware queue mapping such that:
Assignment of hardware queues to cpus occur in two passes:
first pass: is first time assignment of a hardware queue to a cpu.
This will set the LPFC_CPU_FIRST_IRQ flag for the cpu.
second pass: for cpus that did not get a hardware queue they will
be assigned one from a primary cpu (one set in first pass).
Deletion of hardware queues is driven by cpu itteration, and queues
will only be deleted if the LPFC_CPU_FIRST_IRQ flag is set.
Also contains a few small cleanup fixes and a little better logging.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The adapter reset path (lpfc_sli_hba_down) is taking/releasing a lock with
irq. But, the path is already under the hbalock which raised irq so it's
unnecessary.
Convert to simple lock/unlock.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_nvme_register_port hit a null prev_ndlp pointer in a test with lots of
target ports swapping addresses. The oldport value was stale, thus it's
ndlp (prev_ndlp set to it) was used.
Fix by moving oldrport pointer checks, and if used prev_ndlp pointer
assignment, to be done while the lock is held.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is inadvertently trying to issue an INIT_VPI mailbox command on
an SLI-3 driver. The command is specific to SLI-4. When the call is made to
send the command, if on an SLI-3 adapter, an array pointer is NULL and the
driver will oops.
Fix by restricting the command to SLI-4 adapters only.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a target issues an ADISC to the port and the target is a NVME target,
the driver is inadvertantly invalidating the login and marking the remote
port as logged out. Communication with the target is lost.
Revise the ADISC check so that FCP or NVME targets will be marked valid at
the end of ADISC processing. Enhance logging to recognize condition
better.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some remote ports may be slow in registering their GID_FT protocol
information with the fabric. If the remote port is an initiator, it may
send PLOGI to the port before the GID_FT logic is complete. Meaning, after
accepting the PLOGI, when the driver may see no response to the GID_FT that
is issued after the login to determine the protocols supported so that
proper PRLI's may be transmit. If the driver has no fc4 information, it
currently stops and the remote port is not discovered.
Fix by issuing a LOGO when there is no GID_FT information. The LOGO
completion handling will attempt to re-login if the nport_id is still
present.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>