clang static analysis reports this problem:
qla_nx2.c:694:3: warning: 6th function call argument is
an uninitialized value
ql_log(ql_log_fatal, vha, 0xb090,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In qla8044_poll_reg(), when reading the reg fails, the error is reported by
reusing the timeout error reporter. Because the value is unset, a garbage
value will be reported. Initialize the value.
Link: https://lore.kernel.org/r/20201005144544.25335-1-trix@redhat.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
qla2xx_process_get_sp_from_handle() will clear the slot in which the
current srb is stored. As a result it can't be used in
qla24xx_process_mbx_iocb_response() to check for consistency and later
again in qla24xx_mbx_iocb_entry().
Move the consistency check directly into qla24xx_mbx_iocb_entry() and avoid
the double call or any open coding of the
qla2xx_process_get_sp_from_handle() functionality.
Link: https://lore.kernel.org/r/20200929073802.18770-1-dwagner@suse.de
Fixes: 31a3271ff1 ("scsi: qla2xxx: Handle incorrect entry_type entries")
Reviewed-by: Arun Easi <aeasi@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Directly return constant when it is known to make code easier to
understand.
Link: https://lore.kernel.org/r/20200921112340.GA19336@duo.ucw.cz
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warning:
[drivers/scsi/qla2xxx/qla_dbg.c:2451]: (warning) %ld in format string (no. 4)
requires 'long' but the argument type is 'unsigned long'.
Link: https://lore.kernel.org/r/20200930022515.2862532-4-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warnings:
[drivers/scsi/qla2xxx/qla_os.c:4882]: (warning) %ld in format string (no. 2)
requires 'long' but the argument type is 'unsigned long'.
[drivers/scsi/qla2xxx/qla_os.c:5011]: (warning) %ld in format string (no. 1)
requires 'long' but the argument type is 'unsigned long'.
Link: https://lore.kernel.org/r/20200930022515.2862532-3-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warnings:
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:884]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:885]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:886]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:887]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:888]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
Link: https://lore.kernel.org/r/20200930022515.2862532-2-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver was using a shorter timeout waiting for PLOGI from the peer in
point-to-point configurations. Some devices takes some time (~4 seconds) to
initiate the PLOGI. This peer initiating PLOGI is when the peer has a
higher P-WWN.
Increase the wait time based on N2N R_A_TOV.
Link: https://lore.kernel.org/r/20200929102152.32278-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On unload, session cleanup prematurely gave the signal for driver unload
path to advance.
Link: https://lore.kernel.org/r/20200929102152.32278-6-njavali@marvell.com
Fixes: 726b854870 ("qla2xxx: Add framework for async fabric discovery")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Normally, the MPI firmware is reset when an MPI dump is collected. If an
unsaved MPI dump exists in the driver, though, an alternate mechanism is
used. This mechanism, which was not fully correct, is not recommended and
instead an MPI dump template walk is suggested to perform the MPI reset.
To allow for the MPI dump template walk, extra space is reserved in the MPI
dump buffer which gets used only when there is already an MPI dump in
place.
Link: https://lore.kernel.org/r/20200929102152.32278-5-njavali@marvell.com
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When printing the message:
"MPI Heartbeat stop. MPI reset is not needed.."
..the wrong register was checked leading to always printing that MPI reset
is not needed, even when it is needed. Fix the MPI reset message.
Link: https://lore.kernel.org/r/20200929102152.32278-4-njavali@marvell.com
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Current code uses wrong mailbox option to extract bbc from firmware. This
field is nested inside of PLOGI payload. Extract bbc from PLOGI template
payload.
Link: https://lore.kernel.org/r/20200929102152.32278-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since the version string has been modified, sscanf() returns 4 instead of
6.
Link: https://lore.kernel.org/r/20200929102152.32278-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update internal driver version and remove module version macro.
Link: https://lore.kernel.org/r/20200904045128.23631-14-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
BIT_13 of extended FW attribute informs about NVMe-2 support. Set BIT_15
of special feature control block for enabling SLER in FW. Set bit 8 (SLER
supported) to 1 for the service parameter information when sending NVMe
PRLI request. Set BIT_14 of special feature control block for enabling PI
Control in FW. Driver should set bit 9 (PI Control supported) to 1 for the
service parameter information when sending NVMe PRLI request. Set BIT_13
for NVMe Async events.
Link: https://lore.kernel.org/r/20200904045128.23631-13-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch tracks number of IOCB resources used in the I/O fast path. If
the number of used IOCBs reach a high water limit, driver would return the
I/O as busy and let upper layer retry. This prevents over subscription of
IOCB resources where any future error recovery command is unable to cut
through. Enable IOCB throttling by default.
Link: https://lore.kernel.org/r/20200904045128.23631-12-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
tgt_port_database data is today exported only in target mode, allow it to
be shown in initiator mode as well.
Link: https://lore.kernel.org/r/20200904045128.23631-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In .fcp_io(), returning ENODEV as soon as remote port delete has started
can cause I/O errors. Fix this by returning EBUSY until the remote port
delete finishes.
Link: https://lore.kernel.org/r/20200904045128.23631-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Memory size calculations for Extended Login used in hardware offload got
truncated. Fix this by changing definition of exlogin_size to use uint32_t.
Link: https://lore.kernel.org/r/20200904045128.23631-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FCP-4 (referred FCP-4 rev-2b) identifies the earlier known "retry delay
timer" field as "status qualifier", which is described in SAM-5 and later
specs. This fix makes appropriate driver side modifications to honor the
new definition. The SAM document referred was SAM-6 rev-5.
Link: https://lore.kernel.org/r/20200904045128.23631-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Create a base for adding remote port related entries in debugfs.
Link: https://lore.kernel.org/r/20200904045128.23631-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver was using a lower value for dev_loss_tmo making it more prone to I/O
failures during remote port toggle testing. Set dev_loss_tmo to zero during
remote port registration to allow nvme-fc default dev_loss_tmo to be used,
which is higher than what driver was using.
Link: https://lore.kernel.org/r/20200904045128.23631-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This addresses the following coccinelle warning:
drivers/scsi/qla2xxx/qla_init.c:7112:5-9: Unneeded variable: "rval".
Return "QLA_SUCCESS" on line 7115
Link: https://lore.kernel.org/r/20200911091021.2937708-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It was observed on an ISP8324 16Gb HBA with fw=8.08.203 (d0d5) in a
PowerPC64 machine that pkt->entry_type was MBX_IOCB_TYPE/0x39 with an
sp->type SRB_SCSI_CMD which is invalid and should not be possible.
Reading the entry_type from the crash dump shows the expected value of
STATUS_TYPE/0x03 but the call trace shows that qla24xx_mbx_iocb_entry() is
used.
Add a check to verify for consistency and reset the HBA if an invalid state
is reached. Obviously, this is only a workaround until the real problem is
solved.
Link: https://lore.kernel.org/r/20200908081516.8561-5-dwagner@suse.de
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 7c3df1320e ("[SCSI] qla2xxx: Code changes to support new dynamic
logging infrastructure.") removed the use of the func argument. Let's add
it back.
Link: https://lore.kernel.org/r/20200908081516.8561-4-dwagner@suse.de
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Refactor qla2x00_get_sp_from_handle() to avoid the unnecessary goto if
early returns are used. With this we can also avoid preinitialzing the sp
pointer.
Link: https://lore.kernel.org/r/20200908081516.8561-3-dwagner@suse.de
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Emit a warning when ->done or ->free are called on an already freed
srb. There is a hidden use-after-free bug in the driver which corrupts
the srb memory pool which originates from the cleanup callbacks.
An extensive search didn't bring any lights on the real problem. The
initial fix was to set both pointers to NULL and try to catch invalid
accesses. But instead the memory corruption was gone and the driver
didn't crash. Since not all calling places check for NULL pointer, add
explicitly default handlers. With this we workaround the memory
corruption and add a debug help.
Link: https://lore.kernel.org/r/20200908081516.8561-2-dwagner@suse.de
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A negative error code should be returned.
Link: https://lore.kernel.org/r/20200829075746.19166-1-tian.xianting@h3c.com
Acked-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Xianting Tian <tian.xianting@h3c.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 98aee70d19 ("qla2xxx: Add endianizer to max_payload_size
modifier.") in 2014 broke qla2xxx on sparc64, e.g. as in the Sun Blade 1000
/ 2000. Unbreak by partial revert to fix endianness in nvram firmware
default initialization. Also mark the second frame_payload_size in nvram_t
__le16 to avoid new sparse warnings.
Link: https://lore.kernel.org/r/20200827.222729.1875148247374704975.rene@exactcode.com
Fixes: 98aee70d19 ("qla2xxx: Add endianizer to max_payload_size modifier.")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: René Rebe <rene@exactcode.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On an error exit path, a negative error code should be returned instead of
a positive return value.
Link: https://lore.kernel.org/r/20200802111530.5020-1-tianjia.zhang@linux.alibaba.com
Fixes: 8777e4314d ("scsi: qla2xxx: Migrate NVME N2N handling into state machine")
Cc: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the case of a failed retry, a positive value EIO is returned here. I
think this is a typo error. It is necessary to return an error value.
[mkp: caller checks != 0 but the rest of the file uses -Exxx so fix this up
to be consistent]
Link: https://lore.kernel.org/r/20200802111528.4974-1-tianjia.zhang@linux.alibaba.com
Fixes: 0691094ff3 ("scsi: qla2xxx: Add logic to detect ABTS hang and response completion")
Cc: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The initialization value of `rc` is wrong. It is unnecessary to initialize
`rc` variables, so remove its initialization operation.
Link: https://lore.kernel.org/r/20200802111527.4928-1-tianjia.zhang@linux.alibaba.com
Fixes: 84905dfe78 ("scsi: qla2xxx: Fix TMF and Multi-Queue config")
Cc: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update the size used in 'dma_free_coherent()' in order to match the one
used in the corresponding 'dma_alloc_coherent()'.
[mkp: removed memset() hunk that has already been addressed]
Link: https://lore.kernel.org/r/20200802110721.677707-1-christophe.jaillet@wanadoo.fr
Fixes: 4161cee52d ("[SCSI] qla4xxx: Add host statistics support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes coccicheck warning:
./drivers/scsi/qla2xxx/qla_mbx.c:4928:15-33: WARNING: dma_alloc_coherent use in els_cmd_map already zeroes out memory, so memset is not needed
dma_alloc_coherent() already zeroes out memory so memset() is not needed.
Link: https://lore.kernel.org/r/1596079918-41115-3-git-send-email-liheng40@huawei.com
Signed-off-by: Li Heng <liheng40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FCP T10-PI and NVMe features are independent of each other. This patch
allows both features to co-exist.
This reverts commit 5da05a26b8.
Link: https://lore.kernel.org/r/20200806111014.28434-12-njavali@marvell.com
Fixes: 5da05a26b8 ("scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FCoE adapter initialization failed for ISP8021 with the following patch
applied. In addition, reproduction of the issue the patch originally tried
to address has been unsuccessful.
This reverts commit 3cb182b3fa.
Link: https://lore.kernel.org/r/20200806111014.28434-11-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
OS boot during Boot from SAN was stuck at dracut emergency shell after
enabling NVMe driver parameter. For non-MQ support the driver was enabling
MQ. Add a check to confirm if FW supports MQ.
Link: https://lore.kernel.org/r/20200806111014.28434-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
qla_nvme_register_hba() puts out a warning when there are not enough queue
pairs available for FC-NVME. Just fail the NVME registration rather than a
WARNING + call Trace.
Link: https://lore.kernel.org/r/20200806111014.28434-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ql2xextended_error_logging can now be set to 1 to get the default mask
value, as opposed to at module load time only.
Link: https://lore.kernel.org/r/20200806111014.28434-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Multipath errors were seen during failback due to login timeout. The
remote device sent LOGO, the local host tore down the session and did
relogin. The RSCN arrived indicates remote device is going through failover
after which the relogin is in a 20s timeout phase. At this point the
driver is stuck in the relogin process. Add a fix to delete the session as
part of abort/flush the login.
Link: https://lore.kernel.org/r/20200806111014.28434-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>