On link down, transport is calling driver to abort outstanding ios.
Driver erroneously rejects the abort if the port indicates it isn't
logged in - which will be the case after the link down. Thus, the io
can't clean up. This prevents reconnection at the transport level.
Fix by allowing abort to proceed.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
NVME FC counters don't reflect actual results
Since counters are not atomic, or protected by a lock, the values often
get screwed up.
Make them atomic, like NVMET. Fix up sysfs and debugfs display
accordingly Added Outstanding IOs to stats display
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As the devloss API was implemented in the nvmei driver, an evaluation of
the nvme transport and the lpfc driver showed dual management of the
rports. This creates a bug possibility when the thread count and SAN
size increases.
The nvmei driver code was based on a very early transport and was not
revisited until the devloss API was introduced.
Remove the listhead in the driver's rport data structure and the
listhead in the driver's lport data structure. Remove all rport_list
traversal. Convert the driver to use the nrport (nvme rport) pointer
that is now NULL or nonNULL depending on a devloss action. Convert
debugfs and nvme_info in sysfs to use the fc_nodes list in the vport.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add nvme initiator devloss support
The existing implementation was based on no devloss behavior in the
transport (e.g. immediate teardown) so code didn't properly handle
delayed nvme rport device unregister calls. In addition, the driver was
not correctly cycling the rport port role for each
register-unregister-reregister process.
This patch does the following:
Rework the code to properly handle rport device unregister calls and
potential re-allocation of the remoteport structure if the port comes
back in under dev_loss_tmo.
Correct code that was incorrectly cycling the rport port role for each
register-unregister-reregister process.
Prep the code to enable calling the nvme_fc transport api to dynamically
update dev_loss_tmo when the scsi sysfs interface changes it.
Memset the rpinfo structure in the registration call to enforce "accept
nvme transport defaults" in the registration call. Driver parameters do
influence the dev_loss_tmo transport setting dynamically.
Simplifies the register function: the driver was incorrectly searching
its local rport list to determine resume or new semantics, which is not
valid as the transport already handles this. The rport was resumed if
the rport handed back matches the ndlp->nrport pointer. Otherwise,
devloss fired and the ndlp's nrport is NULL.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc was changing the private pointer that is set/maintained by
the nvme_fc transport. This caused two issues: a) the transport, on
teardown may erroneous attempt to free whatever address was set;
and b) lfpc uses any value set in lpfc_nvme_fcp_abort() and
assumes its a valid io request.
Correct issue by properly defining a context structure for lpfc.
Lpfc also updated to clear the private context structure on io
completion.
Since this bug caused scrutiny of the way lpfc moves local request
structures between lists, also cleaned up list_del()'s to
list_del_inits()'s.
This is a nvme-specific bug. The patch was cut against the
linux-block tree, for-4.12/block tree. It should be pulled in through
that tree.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The driver with nvme had this routine stubbed.
Right now XRI_ABORTED_CQE is not handled and the FC NVMET
Transport has a new API for the driver.
Missing code path, new NVME abort API
Update ABORT processing for NVMET
There are 3 new FC NVMET Transport API/ template routines for NVMET:
lpfc_nvmet_xmt_fcp_release
This NVMET template callback routine called to release context
associated with an IO This routine is ALWAYS called last, even
if the IO was aborted or completed in error.
lpfc_nvmet_xmt_fcp_abort
This NVMET template callback routine called to abort an exchange that
has an IO in progress
nvmet_fc_rcv_fcp_req
When the lpfc driver receives an ABTS, this NVME FC transport layer
callback routine is called. For this case there are 2 paths thru the
driver: the driver either has an outstanding exchange / context for the
XRI to be aborted or not. If not, a BA_RJT is issued otherwise a BA_ACC
NVMET Driver abort paths:
There are 2 paths for aborting an IO. The first one is we receive an IO and
decide not to process it because of lack of resources. An unsolicated ABTS
is immediately sent back to the initiator as a response.
lpfc_nvmet_unsol_fcp_buffer
lpfc_nvmet_unsol_issue_abort (XMIT_SEQUENCE_WQE)
The second one is we sent the IO up to the NVMET transport layer to
process, and for some reason the NVME Transport layer decided to abort the
IO before it completes all its phases. For this case there are 2 paths
thru the driver:
the driver either has an outstanding TSEND/TRECEIVE/TRSP WQE or no
outstanding WQEs are present for the exchange / context.
lpfc_nvmet_xmt_fcp_abort
if (LPFC_NVMET_IO_INP)
lpfc_nvmet_sol_fcp_issue_abort (ABORT_WQE)
lpfc_nvmet_sol_fcp_abort_cmp
else
lpfc_nvmet_unsol_fcp_issue_abort
lpfc_nvmet_unsol_issue_abort (XMIT_SEQUENCE_WQE)
lpfc_nvmet_unsol_fcp_abort_cmp
Context flags:
LPFC_NVMET_IOP - his flag signifies an IO is in progress on the exchange.
LPFC_NVMET_XBUSY - this flag indicates the IO completed but the firmware
is still busy with the corresponding exchange. The exchange should not be
reused until after a XRI_ABORTED_CQE is received for that exchange.
LPFC_NVMET_ABORT_OP - this flag signifies an ABORT_WQE was issued on the
exchange.
LPFC_NVMET_CTX_RLS - this flag signifies a context free was requested,
but we are deferring it due to an XBUSY or ABORT in progress.
A ctxlock is added to the context structure that is used whenever these
flags are set/read within the context of an IO.
The LPFC_NVMET_CTX_RLS flag is only set in the defer_relase routine when
the transport has resolved all IO associated with the buffer. The flag is
cleared when the CTX is associated with a new IO.
An exchange can has both an LPFC_NVMET_XBUSY and a LPFC_NVMET_ABORT_OP
condition active simultaneously. Both conditions must complete before the
exchange is freed.
When the abort callback (lpfc_nvmet_xmt_fcp_abort) is envoked:
If there is an outstanding IO, the driver will issue an ABORT_WQE. This
should result in 3 completions for the exchange:
1) IO cmpl with XB bit set
2) Abort WQE cmpl
3) XRI_ABORTED_CQE cmpl
For this scenerio, after completion #1, the NVMET Transport IO rsp
callback is called. After completion #2, no action is taken with respect
to the exchange / context. After completion #3, the exchange context is
free for re-use on another IO.
If there is no outstanding activity on the exchange, the driver will send a
ABTS to the Initiator. Upon completion of this WQE, the exchange / context
is freed for re-use on another IO.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cannot set NVME segment counts to a large number
The existing module parameter lpfc_sg_seg_cnt is used for both
SCSI and NVME.
Limit the module parameter lpfc_sg_seg_cnt to 128 with the
default being 64 for both NVME and NVMET, assuming NVME is enabled in the
driver for that port. The driver will set max_sgl_segments in the
NVME/NVMET template to lpfc_sg_seg_cnt + 1.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
The symptom is that the driver will fail to login to the fabric.
The reason is because it is out of iocb resources.
There is a one to one relationship between MRQs
(receive buffers for NVMET-FC) and iocbs and the default number of
IOCBs was not accounting for the number of MRQs that were being created.
This fix aligns the number of MRQ resources with the total resources so
that it can handle fabric events when needed.
Also the initialization of ctxlock to be on FCP commands, NOT LS commands.
And modified log messages so that the log output can be correlated with
the analyzer trace.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Fix nvme initiator handline when CONFIG_LPFC_NVME_INITIATOR is not enabled.
With update nvme upstream driver sources, loading
the driver with nvme enabled resulting in this Oops.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: lpfc_nvme_update_localport+0x23/0xd0 [lpfc]
PGD 0
Oops: 0000 [#1] SMP
CPU: 0 PID: 10256 Comm: lpfc_worker_0 Tainted
Hardware name: ...
task: ffff881028191c40 task.stack: ffff880ffdf00000
RIP: 0010:lpfc_nvme_update_localport+0x23/0xd0 [lpfc]
RSP: 0018:ffff880ffdf03c20 EFLAGS: 00010202
Cause: As the initiator driver completes discovery at different stages,
it call lpfc_nvme_update_localport to hint that the DID and role may have
changed. In the implementation of lpfc_nvme_update_localport, the driver
was not validating the localport or the lport during the execution
of the update_localport routine. With the recent upstream additions to
the driver, the create_localport routine didn't run and so the localport
was NULL causing the page-fault Oops.
Fix: Add the CONFIG_LPFC_NVME_INITIATOR preprocessor inclusions to
lpfc_nvme_update_localport to turn off all routine processing when
the running kernel does not have NVME configured. Add NULL pointer
checks on the localport and lport in lpfc_nvme_update_localport and
dump messages if they are NULL and just exit.
Also one alingment issue fixed.
Repalces the ifdef with the IS_ENABLED macro.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
In the lpfc_nvme_io_cmd_wqe_cmpl routine the driver was printing two
pointers and the DID for the rport whenever an IO completed on a now
that had transitioned to a non active state.
There is no need to print the node pointer address for a node that
is not active the DID should be enough to debug.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
In this case, the NVME initiator is sending an LS REQ command on an NDLP
that is not MAPPED. The FW rejects it.
The lpfc_nvme_ls_req routine checks for a NULL ndlp pointer
but does not check the NDLP state. This allows the routine
to send an LS IO when the ndlp is disconnected.
Check the ndlp for NULL, actual node, Target and MAPPED
or Initiator and UNMAPPED. This avoids Fabric nodes getting
the Create Association or Create Connection commands. Initiators
are free to Reject either Create.
Also some of the messages numbers in lpfc_nvme_ls_req were changed because
they were already used in other log messages.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
During some link event testing it was observed that the
wait_for_completion_timeout in the lpfc_nvme_unregister_port
was timing out all the time.
The initiator is claiming the nvme_fc_unregister_remoteport upcall is
not completing the unregister in the time allotted.
[ 2186.151317] lpfc 0000:07:00.0: 0:(0):6169 Unreg nvme wait failed 0
The wait_for_completion_timeout returns 0 when the wait has
been outstanding for the jiffies passed by the caller. In this error
message, the nvme initiator passed value 5 - meaning 5 jiffies -
and this is just wrong.
Calculate 5 seconds in Jiffies and pass that value
from the current jiffies.
Also the log message for the unregister timeout was reduced
because timeout failure is the same as timeout.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewing the result of what was just added for Kconfig, we made a poor
choice. It worked well for full kernel builds, but not so much for how
it would be deployed on a distro.
Here's the final result:
- lpfc will compile in NVME initiator and/or NVME target support based
on whether the kernel has the corresponding subsystem support.
Kconfig is not used to drive this specifically for lpfc.
- There is a module parameter, lpfc_enable_fc4_type, that indicates
whether the ports will do FCP-only or FCP & NVME (NVME-only not yet
possible due to dependency on fc transport). As FCP & NVME divvys up
exchange resources, and given NVME will not be often initially, the
default is changed to FCP only.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reworked Kconfig so that lfpc only requires the scsi stack.
NVME Initiator and NVME Target support can be enabled if
the other NVMe subsystems have been enabled.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
previous code did little more than log a message.
This patch adds abort path support, modeled after the SCSI code paths.
Currently addresses only the initiator path. Target path under
development, but stubbed out.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
nvme bufs get allocated even when the registration fails.
Move allocation into the rsgistration success path.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For both initiator and target: if WQ is full, return -EBUSY.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Word 1 in NVME CMD IU appears byte swapped from value placed in WQE
Should be Big Endian value in WQE word 16
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
NVME LS requests and responses had wrong R_CTL values.
Use the FC4 ELS Request and Response defines (defines badly
named, they are FC4 LS's) instead of the base ELS values.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
dma_addr_t may be either u32 or u64, depending on the kernel configuration,
and we get a warning for the 32-bit case:
drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_nvme_ls_req':
drivers/scsi/lpfc/lpfc_logmsg.h:52:52: error: format '%llu' expects argument of type 'long long unsigned int', but argument 11 has type 'dma_addr_t {aka unsigned int}' [-Werror=format=]
drivers/scsi/lpfc/lpfc_logmsg.h:52:52: error: format '%llu' expects argument of type 'long long unsigned int', but argument 12 has type 'dma_addr_t {aka unsigned int}' [-Werror=format=]
drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_nvme_ls_abort':
drivers/scsi/lpfc/lpfc_logmsg.h:52:52: error: format '%llu' expects argument of type 'long long unsigned int', but argument 11 has type 'dma_addr_t {aka unsigned int}' [-Werror=format=]
drivers/scsi/lpfc/lpfc_logmsg.h:52:52: error: format '%llu' expects argument of type 'long long unsigned int', but argument 12 has type 'dma_addr_t {aka unsigned int}' [-Werror=format=]
printk has a special "%pad" format string that passes the dma address by
reference to solve this problem.
Fixes: 01649561a8 ("scsi: lpfc: NVME Initiator: bind to nvme_fc api")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2017 for all files touched in this patch set
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
NVME Initiator: Add debugfs support
Adds debugfs snippets to cover the new NVME initiator functionality
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
NVME Initiator: Tie in to NVME Fabrics nvme_fc LLDD initiator api
Adds the routines to:
- register and deregister the FC port as a nvme-fc initiator localport
- register and deregister remote FC ports as a nvme-fc remoteport
- binding of nvme queues to adapter WQs
- send/perform NVME LS's
- send/perform NVME FCP initiator io operations
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>