Commit Graph

1013548 Commits

Author SHA1 Message Date
Kashyap Desai 392bbeb85b scsi: mpi3mr: Hardware workaround for UNMAP commands to NVMe drives
The controller hardware can not handle certain UNMAP commands for NVMe
drives. Add support in the driver for checking those commands and handle
them appropriately.

Link: https://lore.kernel.org/r/20210520152545.2710479-17-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:17 -04:00
Kashyap Desai 82141ddba9 scsi: mpi3mr: Allow certain commands during pci-remove hook
Instead of driver returning DID_NO_CONNECT during driver unload allow SSU
and Sync Cache commands to be sent to the controller to flush any cached
data from the drive.

Link: https://lore.kernel.org/r/20210520152545.2710479-16-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:17 -04:00
Kashyap Desai 0ea177343f scsi: mpi3mr: Add change queue depth support
Link: https://lore.kernel.org/r/20210520152545.2710479-15-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:17 -04:00
Kashyap Desai e844adb1fb scsi: mpi3mr: Implement SCSI error handler hooks
Link: https://lore.kernel.org/r/20210520152545.2710479-14-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Cc: hare@suse.de
Cc: thenzl@redhat.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:17 -04:00
Kashyap Desai 8f9c6173ca scsi: mpi3mr: Add bios_param SCSI host template hook
Link: https://lore.kernel.org/r/20210520152545.2710479-13-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai ff9561e910 scsi: mpi3mr: Print IOC info for debugging
Link: https://lore.kernel.org/r/20210520152545.2710479-12-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai 54dfcffb41 scsi: mpi3mr: Add support for timestamp sync with firmware
This operation requests that the IOC update the TimeStamp.

When the I/O Unit is powered on it sets the TimeStamp field value to
0x0000_0000_0000_0000 and increments the current value every millisecond.
A host driver sets the TimeStamp field to the current time by using an
IOCInit request. The TimeStamp field is periodically updated by the host
driver.

Link: https://lore.kernel.org/r/20210520152545.2710479-11-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai fb9b04574f scsi: mpi3mr: Add support for recovering controller
Detection of firmware fault or any kind of unresponsiveness in the
controller (any admin command which times out) results in resetting the
controller. The primary reset mechanisms used are either soft reset or diag
fault reset. A reset is performed if the host sets the ResetAction field in
the HostDiagnostic register to either 001b (soft reset) or 007b (diag fault
reset). After successfully resetting the controller the driver
reinitializes the controller by going through start of the day
initialization procedure. Pending I/Os during the reset are returned back
to the SCSI midlayer for retry.

Link: https://lore.kernel.org/r/20210520152545.2710479-10-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.co
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai e36710dc06 scsi: mpi3mr: Additional event handling
Implement support for handling the following MPI events:

 - MPI3_EVENT_SAS_BROADCAST_PRIMITIVE
 - MPI3_EVENT_CABLE_MGMT
 - MPI3_EVENT_ENERGY_PACK_CHANGE

Link: https://lore.kernel.org/r/20210520152545.2710479-9-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai 8e65345554 scsi: mpi3mr: Add support for PCIe device event handling
Implement support for the following PCIe-related MPI events:

 - MPI3_EVENT_PCIE_TOPOLOGY_CHANGE_LIST
 - MPI3_EVENT_PCIE_ENUMERATION

Link: https://lore.kernel.org/r/20210520152545.2710479-8-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai 13ef29ea4a scsi: mpi3mr: Add support for device add/remove event handling
Firmware can report various MPI Events. Enable support for processing the
following events related to device addition/removal to the driver:

 - MPI3_EVENT_DEVICE_ADDED
 - MPI3_EVENT_DEVICE_INFO_CHANGED
 - MPI3_EVENT_DEVICE_STATUS_CHANGE
 - MPI3_EVENT_ENCL_DEVICE_STATUS_CHANGE
 - MPI3_EVENT_SAS_TOPOLOGY_CHANGE_LIST
 - MPI3_EVENT_SAS_DISCOVERY
 - MPI3_EVENT_SAS_DEVICE_DISCOVERY_ERROR

Link: https://lore.kernel.org/r/20210520152545.2710479-7-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai 672ae26c82 scsi: mpi3mr: Add support for internal watchdog thread
The watchdog thread is the driver's internal thread which does a few things
such as detecting firmware faults, resetting the controller, performing
timestamp sync, etc.

Link: https://lore.kernel.org/r/20210520152545.2710479-6-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai 023ab2a9b4 scsi: mpi3mr: Add support for queue command processing
Send Port Enable Request to FW for Device Discovery.  As part of port
enable completion driver calls scan_start and scan_finished hooks.  SCSI
layer references like sdev, starget, etc. are added but actual device
discovery will be supported once driver adds complete event process
handling.

Link: https://lore.kernel.org/r/20210520152545.2710479-5-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Cc: hare@suse.de
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai c9566231cf scsi: mpi3mr: Create operational request and reply queue pair
Create operational request and reply queue pair.

The MPI3 transport interface consists of an Administrative Request Queue,
an Administrative Reply Queue, and Operational Messaging Queues.  The
Operational Messaging Queues are the primary communication mechanism
between the host and the I/O Controller (IOC).  Request messages, allocated
in host memory, identify I/O operations to be performed by the IOC. These
operations are queued on an Operational Request Queue by the host driver.
Reply descriptors track I/O operations as they complete.  The IOC queues
these completions in an Operational Reply Queue.

To fulfil large contiguous memory requirement, driver creates multiple
segments and provide the list of segments. Each segment size should be 4K
which is a hardware requirement. An element array is contiguous or
segmented.  A contiguous element array is located in contiguous physical
memory.  A contiguous element array must be aligned on an element size
boundary.  An element's physical address within the array may be directly
calculated from the base address, the Producer/Consumer index, and the
element size.

Expected phased identifier bit is used to find out valid entry on reply
queue. Driver sets <ephase> bit and IOC inverts the value of this bit on
each pass.

Link: https://lore.kernel.org/r/20210520152545.2710479-4-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:16 -04:00
Kashyap Desai 824a156633 scsi: mpi3mr: Base driver code
Implement basic pci device driver requirements: Device probing, memory
allocation, mapping system registers, allocate irq lines, etc.

Source is managed in mainly three different files:

 - mpi3mr_fw.c:  Common code which interacts with underlying fw/hw.

 - mpi3mr_os.c:  Common code which interacts with SCSI midlayer.

 - mpi3mr_app.c: Common code which interacts with application/ioctl.
		 This is currently work in progress.

Link: https://lore.kernel.org/r/20210520152545.2710479-3-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Cc: bvanassche@acm.org
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Kashyap Desai c4f7ac6461 scsi: mpi3mr: Add mpi30 Rev-R headers and Kconfig
This adds the Kconfig and mpi30 headers.

Link: https://lore.kernel.org/r/20210520152545.2710479-2-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Cc: bvanassche@acm.org
Cc: hch@infradead.org
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Bean Huo f6b4142942 scsi: ufs: Fix a kernel-doc related formatting issue
Fix the following W=1 kernel build warning:

drivers/scsi/ufs/ufshcd.c:9773: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

[mkp: upcase abbreviations]

Link: https://lore.kernel.org/r/20210531163122.451375-1-huobean@gmail.com
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Kees Cook 5250db63d1 scsi: isci: Use correctly sized target buffer for memcpy()
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), avoid intentionally writing across
neighboring array fields.

Switch from rsp_ui to resp_buf, since resp_ui isn't SSP_RESP_IU_MAX_SIZE
bytes in length. This avoids future compile-time warnings.

Link: https://lore.kernel.org/r/20210528181337.792268-4-keescook@chromium.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Kees Cook 66fc475bd9 scsi: esas2r: Switch to flexible array member
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), avoid intentionally writing across
neighboring array fields.

Remove old-style 1-byte array in favor of a flexible array[1] to avoid
future false-positive cross-field memcpy() warning in:

esas2r_vda.c:
	memcpy(vi->cmd.gsv.version_info, esas2r_vdaioctl_versions, ...)

The change in struct size doesn't change other structure sizes (it is
already maxed out to 256 bytes, for example here:

        union {
                struct atto_ioctl_vda_scsi_cmd scsi;
                struct atto_ioctl_vda_flash_cmd flash;
                struct atto_ioctl_vda_diag_cmd diag;
                struct atto_ioctl_vda_cli_cmd cli;
                struct atto_ioctl_vda_smp_cmd smp;
                struct atto_ioctl_vda_cfg_cmd cfg;
                struct atto_ioctl_vda_mgt_cmd mgt;
                struct atto_ioctl_vda_gsv_cmd gsv;
                u8 cmd_info[256];
        } cmd;

No sizes are calculated using the enclosing structure, so no other
updates are needed.

Link: https://lore.kernel.org/r/20210528181337.792268-3-keescook@chromium.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Randy Dunlap 4d431153e7 scsi: FlashPoint: Rename si_flags field
The BusLogic driver has build errors on ia64 due to a name collision (in
the #included FlashPoint.c file). Rename the struct field in struct
sccb_mgr_info from si_flags to si_mflags (manager flags) to mend the build.

This is the first problem. There are 50+ others after this one:

In file included from ../include/uapi/linux/signal.h:6,
                 from ../include/linux/signal_types.h:10,
                 from ../include/linux/sched.h:29,
                 from ../include/linux/hardirq.h:9,
                 from ../include/linux/interrupt.h:11,
                 from ../drivers/scsi/BusLogic.c:27:
../arch/ia64/include/uapi/asm/siginfo.h:15:27: error: expected ':', ',', ';', '}' or '__attribute__' before '.' token
   15 | #define si_flags _sifields._sigfault._flags
      |                           ^
../drivers/scsi/FlashPoint.c:43:6: note: in expansion of macro 'si_flags'
   43 |  u16 si_flags;
      |      ^~~~~~~~
In file included from ../drivers/scsi/BusLogic.c:51:
../drivers/scsi/FlashPoint.c: In function 'FlashPoint_ProbeHostAdapter':
../drivers/scsi/FlashPoint.c:1076:11: error: 'struct sccb_mgr_info' has no member named '_sifields'
 1076 |  pCardInfo->si_flags = 0x0000;
      |           ^~
../drivers/scsi/FlashPoint.c:1079:12: error: 'struct sccb_mgr_info' has no member named '_sifields'

Link: https://lore.kernel.org/r/20210529234857.6870-1-rdunlap@infradead.org
Fixes: 391e2f2560 ("[SCSI] BusLogic: Port driver to 64-bit.")
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Gustavo A. R. Silva 84a84cc6af scsi: mpt3sas: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding break statements instead of just letting
the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Link: https://lore.kernel.org/r/20210528200828.GA39349@embeddedor
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:15 -04:00
Daniel Wagner 27c707b146 scsi: qla2xxx: Log PCI address in qla_nvme_unregister_remote_port()
Pass in fcport->vha to ql_log() in order to add the PCI address to the log.

Currently NULL is passed in which gives this confusing log entry:

> qla2xxx [0000:00:00.0]-2112: : qla_nvme_unregister_remote_port: unregister remoteport on 0000000009d6a2e9 50000973981648c7

Link: https://lore.kernel.org/r/20210531122444.116655-1-dwagner@suse.de
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:14 -04:00
Alice.Chao f9c602f3bd scsi: ufs: ufs-mediatek: Disable HCI before HW reset
MediaTek ufshci needs to be disabled before HW reset to avoid potential
issues.

Link: https://lore.kernel.org/r/20210528033624.12170-3-alice.chao@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Alice.Chao <alice.chao@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:14 -04:00
Alice.Chao 3a95f5b392 scsi: ufs: core: Export ufshcd_hba_stop()
Export ufshcd_hba_stop() to allow vendors to disable HCI in variant ops.

Link: https://lore.kernel.org/r/20210528033624.12170-2-alice.chao@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Alice.Chao <alice.chao@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 00:56:14 -04:00
Bart Van Assche 40d2fd05ec scsi: ufs: Suppress false positive unhandled interrupt messages
From ufshcd_transfer_req_compl():

    Resetting interrupt aggregation counters first and reading the
    DOOR_BELL afterward allows us to handle all the completed requests.  In
    order to prevent other interrupts starvation the DB is read once after
    reset. The down side of this solution is the possibility of false
    interrupt if device completes another request after resetting
    aggregation and before reading the DB.

Prevent that ufshcd_intr() reports a false positive "Unhandled interrupt"
message if the above scenario is triggered.

Link: https://lore.kernel.org/r/20210519202058.12634-2-bvanassche@acm.org
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 22:48:20 -04:00
Suganath Prabu S a0815c45c8 scsi: mpt3sas: Handle firmware faults during second half of IOC init
If a firmware fault occurs while scanning the devices during IOC
initialization then the driver issues the hard reset operation to recover
the IOC. However, the driver is not issuing a Port enable request
message as part of hard reset operation during IOC initialization.  Due to
this, the driver will not receive get any device discovery-related events
and hence devices will not be accessible.

Teach the driver to gracefully handle firmware faults while scanning for
target devices during IOC initialization. Make the driver issue a port
enable request message as part of hard reset operation. This permits
receiving device discovery-related events from the firmware after the hard
reset operation completes.

Link: https://lore.kernel.org/r/20210518051625.1596742-4-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 22:48:20 -04:00
Suganath Prabu S 19a622c39a scsi: mpt3sas: Handle firmware faults during first half of IOC init
During first half of IOC initialization (i.e.  before going for device
scanning), if any firmware fault occurs then driver is aborting the IOC
initialization operation.

Modify the driver to issue a diag reset operation to recover IOC from fault
state and reinitialize the IOC.

Link: https://lore.kernel.org/r/20210518051625.1596742-3-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 22:48:20 -04:00
Suganath Prabu S e2fac6c44a scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
Do not cancel current running firmware event work if the event type is
different from MPT3SAS_REMOVE_UNRESPONDING_DEVICES.  Otherwise a deadlock
can be observed while cancelling the current firmware event work if a hard
reset operation is called as part of processing the current event.

Link: https://lore.kernel.org/r/20210518051625.1596742-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 22:48:20 -04:00
John Garry ea2f0f7753 scsi: core: Cap scsi_host cmd_per_lun at can_queue
The sysfs handling function sdev_store_queue_depth() enforces that the sdev
queue depth cannot exceed shost can_queue. The initial sdev queue depth
comes from shost cmd_per_lun. However, the LLDD may manually set
cmd_per_lun to be larger than can_queue, which leads to an initial sdev
queue depth greater than can_queue.

Such an issue was reported in [0], which caused a hang. That has since been
fixed in commit fc09acb7de ("scsi: scsi_debug: Fix cmd_per_lun, set to
max_queue").

Stop this possibly happening for other drivers by capping shost cmd_per_lun
at shost can_queue.

[0] https://lore.kernel.org/linux-scsi/YHaez6iN2HHYxYOh@T590/

Link: https://lore.kernel.org/r/1621434662-173079-1-git-send-email-john.garry@huawei.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-22 00:34:39 -04:00
James Smart e5e0280db7 scsi: lpfc: Update lpfc version to 12.8.0.10
Update lpfc version to 12.8.0.10

Link: https://lore.kernel.org/r/20210514195559.119853-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart 8eced80707 scsi: lpfc: Reregister FPIN types if ELS_RDF is received from fabric controller
FC-LS-5 specifies that a received RDF implies a possible change to fabric
supported diagnostic functions. Endpoints are to re-perform the RDF
exchange with the fabric to enable possible new features or adapt to
changes in values.

This patch adds the logic to RDF receive to re-perform the RDF exchange
with the switch.

Link: https://lore.kernel.org/r/20210514195559.119853-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart 3e49af9393 scsi: lpfc: Add a option to enable interlocked ABTS before job completion
Default behavior for the driver, when aborting an I/O, is to terminate the
I/O with the adapter. The adapter will initiate an ABTS to terminate the
exchange on the link and mark the exchange is terminated so that no further
use of the sgl or any traffic for the exchange is worked on. Completion on
the Abort is then posted to the driver, which as the I/O is terminated can
complete the I/O to the OS. This completion may occur prior to the ABTS
handshake completing on the wire. The ABTS handshake can take a long time
to complete with timeouts and retries reaching 60+ seconds. Note: if
retries fail, LOGO occurs.

Some devices want to ensure that the ABTS handshake fully completes (this
device has fully ack'd it) before the I/O completion is posted back to the
OS, where a failed I/O may be retried via a different path.

To support this behavior, an option was added to the driver to change I/O
completion from the Abort cmd completion to the Exchange termination (aka
ABTS) completion.

Link: https://lore.kernel.org/r/20210514195559.119853-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart 5aa615d195 scsi: lpfc: Fix crash when lpfc_sli4_hba_setup() fails to initialize the SGLs
The driver is encountering a crash in lpfc_free_iocb_list() while
performing initial attachment.

Code review found this to be an errant failure path that was taken, jumping
to a tag that then referenced structures that were uninitialized.

Fix the failure path.

Link: https://lore.kernel.org/r/20210514195559.119853-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart 04c1d9c50a scsi: lpfc: Ignore GID-FT response that may be received after a link flip
When a link bounce happens, there is a possibility that responses to
requests posted prior to the link bounce could be received. This is
problematic as the counter to track reglogin completion after link up can
become out of sync with the real state.

As there is no reason to process a request made in a prior link up context,
eliminate all the disturbance by tagging the request with the event_tag
maintained by the SLI Port for the link. The event_tag will change on every
link state transition.  As long as the tag matches the current event_tag,
the response can be processed. If it doesn't match, just discard the
response.

Link: https://lore.kernel.org/r/20210514195559.119853-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart fe83e3b9b4 scsi: lpfc: Fix node handling for Fabric Controller and Domain Controller
During link bounce testing, RPI counts were seen to differ from the number
of nodes. For fabric and domain controllers, a temporary RPI is assigned,
but the code isn't registering it. If the nodes do go away, such as on link
down, the temporary RPI isn't being released.

Change the way these two fabric services are managed, make them behave like
any other remote port. Register the RPI and register with the transport.
Never leave the nodes in a NPR or UNUSED state where their RPI is in limbo.
This allows them to follow normal dev_loss_tmo handling, RPI refcounting,
and normal removal rules. It also allows fabric I/Os to use the RPI for
traffic requests.

Note: There is some logic that still has a couple of exceptions when the
Domain controller (0xfffcXX). There are cases where the fabric won't have a
valid login but will send RDP. Other times, it will it send a LOGO then an
RDP. It makes for ad-hoc behavior to manage the node. Exceptions are
documented in the code.

Link: https://lore.kernel.org/r/20210514195559.119853-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart 4012baeab6 scsi: lpfc: Fix Node recovery when driver is handling simultaneous PLOGIs
When lpfc is handling a solicited and unsolicited PLOGI with another
initiator, the remote initiator is never recovered. The node for the
initiator is erroneouosly removed and all resources released.

In lpfc_cmpl_els_plogi(), when lpfc_els_retry() returns a failure code, the
driver is calling the state machine with a device remove event because the
remote port is not currently registered with the SCSI or NVMe
transports. The issue is that on a PLOGI "collision" the driver correctly
aborts the solicited PLOGI and allows the unsolicited PLOGI to complete the
process, but this process is interrupted with a device_rm event.

Introduce logic in the PLOGI completion to capture the PLOGI collision
event and jump out of the routine.  This will avoid removal of the node.
If there is no collision, the normal node removal will occur.

Fixes: 	52edb2caf6 ("scsi: lpfc: Remove ndlp when a PLOGI/ADISC/PRLI/REG_RPI ultimately fails")
Cc: <stable@vger.kernel.org> # v5.11+
Link: https://lore.kernel.org/r/20210514195559.119853-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart 1037e4b4f8 scsi: lpfc: Add ndlp kref accounting for resume RPI path
The driver is crashing due to a bad pointer during driver load due in an
adisc acc receive routine. The driver is missing node get/put in the
mbx_resume_rpi paths.

Fix by adding the proper gets and puts into the resume_rpi path.

Link: https://lore.kernel.org/r/20210514195559.119853-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart e30d55137e scsi: lpfc: Fix "Unexpected timeout" error in direct attach topology
An 'unexpected timeout' message may be seen in a point-2-point topology.
The message occurs when a PLOGI is received before the driver is notified
of FLOGI completion. The FLOGI completion failure causes discovery to be
triggered for a second time. The discovery timer is restarted but no new
discovery activity is initiated, thus the timeout message eventually
appears.

In point-2-point, when discovery has progressed before the FLOGI completion
is processed, it is not a failure. Add code to FLOGI completion to detect
that discovery has progressed and exit the FLOGI handling (noop'ing it).

Link: https://lore.kernel.org/r/20210514195559.119853-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart fa21189db9 scsi: lpfc: Fix non-optimized ERSP handling
When processing an NVMe ERSP IU which didn't match the optimized CQE-only
path, the status was being left to the WQE status. WQE status is non-zero
as it is indicating a non-optimized completion that needs to be handled by
the driver.

Fix by clearing the status field when falling into the non-optimized
case. Log message added to track optimized vs non-optimized debug.

Link: https://lore.kernel.org/r/20210514195559.119853-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart 01131e7aae scsi: lpfc: Fix unreleased RPIs when NPIV ports are created
While testing NPIV and watching logins and used RPI levels, it was seen the
used RPI count was much higher than the number of remote ports discovered.

Code inspection showed that remote port removals on any NPIV instance are
releasing the RPI, but not performing an UNREG_RPI with the adapter thus
the reference counting never fully drops and the RPI is never fully
released. This was happening on NPIV nodes due to a log of fabric ELS's to
fabric addresses. This lack of UNREG_RPI was introduced by a prior node
rework patch that performed the UNREG_RPI as part of node cleanup.

To resolve the issue, do the following:

 - Restore the RPI release code, but move the location to so that it is in
   line with the new node cleanup design.

 - NPIV ports now release the RPI and drop the node when the caller sets
   the NLP_RELEASE_RPI flag.

 - Set the NLP_RELEASE_RPI flag in node cleanup which will trigger a
   release of RPI to free pool.

 - Ensure there's an UNREG_RPI at LOGO completion so that RPI release is
   completed.

 - Stop offline_prep from skipping nodes that are UNUSED. The RPI may
   not have been released.

 - Stop the default RPI handling in lpfc_cmpl_els_rsp() for SLI4.

 - Fixed up debugfs RPI displays for better debugging.

Fixes: a70e63eee1 ("scsi: lpfc: Fix NPIV Fabric Node reference counting")
Link: https://lore.kernel.org/r/20210514195559.119853-2-jsmart2021@gmail.com
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
Martin Wilck ee8868c5c7 scsi: scsi_dh_alua: Retry RTPG on a different path after failure
If an RTPG fails, we can't infer anything wrt. the state of the ports in
the port group except that we were unable to reach the one port on which
the RTPG had failed. "offline" is just a secondary port state, which means
that we can't infer the state of any port in the PG from the failure (in
fact, even the failed port might still be in "active/optimized" primary
port access state).

Therefore, when we encounter an RTPG failure, we should retry the RTPG on a
different port. This avoids falsely setting port states to offline for
unreachable ports. To do this, ports on which an RTPG has failed are
temporarily set to "disabled" to avoid repeating the failed I/O on the same
target port. Once the RTPG has either succeeded on one port or failed on
all ports of the PG, the ports are enabled again.

Link: https://lore.kernel.org/r/20210514153214.5626-1-mwilck@suse.com
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:06:29 -04:00
Jiapeng Chong cb9eb11fd5 scsi: qla2xxx: Remove redundant assignment to rval
Variable rval is set to QLA_SUCCESS but this value is never read as it is
overwritten later on. Hence it is a redundant assignment and can be
removed.

Clean up the following clang-analyzer warning:

drivers/scsi/qla2xxx/qla_init.c:4359:2: warning: Value stored to 'rval'
is never read [clang-analyzer-deadcode.DeadStores].

Link: https://lore.kernel.org/r/1620643206-127930-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 22:58:49 -04:00
Colin Ian King 5ac3c649f1 scsi: ufs: ufs-exynos: Make a const array static, makes object smaller
Don't populate the const array granularity_tbl on the stack but instead
make it static. Makes the object code smaller by 190 bytes:

Before:
   text    data     bss     dec     hex filename
  25563    6908       0   32471    7ed7 ./drivers/scsi/ufs/ufs-exynos.o

After:
   text    data     bss     dec     hex filename
  25213    7068       0   32281    7e19 ./drivers/scsi/ufs/ufs-exynos.o

(gcc version 10.3.0)

Link: https://lore.kernel.org/r/20210505190104.70112-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 22:50:22 -04:00
Wei Ming Chen 86cfe4ad24 scsi: fas216: Use fallthrough pseudo-keyword
Replace /*FALLTHROUGH*/ comment with pseudo-keyword macro 'fallthrough'.

Link: https://lore.kernel.org/r/20210518131823.2586-1-jj251510319013@gmail.com
Signed-off-by: Wei Ming Chen <jj251510319013@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 22:14:39 -04:00
Keoseong Park ecd7beb378 scsi: ufs: core: Clean up whitespace
checkpatch reports the following errors:

	ERROR: space prohibited before that ',' (ctx:WxW)
	#945: FILE: drivers/scsi/ufs/ufshcd.h:945:
	+int ufshcd_init(struct ufs_hba * , void __iomem * , unsigned int);
	                                  ^

	ERROR: space prohibited before that ',' (ctx:WxW)
	#945: FILE: drivers/scsi/ufs/ufshcd.h:945:
	+int ufshcd_init(struct ufs_hba * , void __iomem * , unsigned int);
	                                                   ^
Remove unnecessary whitespace in ufshcd.h.

Link: https://lore.kernel.org/r/2038148563.21621340102306.JavaMail.epsvc@epcpadp3
Signed-off-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 22:08:41 -04:00
Zhen Lei 40d6b939e4 scsi: Fix spelling mistakes in header files
Fix some spelling mistakes in comments:

  pathes ==> paths
  Resouce ==> Resource
  retreived ==> retrieved
  recevied ==> received
  interruped ==> interrupted

[mkp: kept 'keep-alives' and 'busses']

Link: https://lore.kernel.org/r/20210517095945.7363-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 17:22:45 -04:00
Juerg Haefliger 98f92dff14 scsi: core: Remove leading spaces in Kconfig
Remove leading spaces before tabs in Kconfig file(s) by running the
following command:

  $ find drivers/scsi -name 'Kconfig*' | xargs sed -r -i 's/^[ ]+\t/\t/'

Link: https://lore.kernel.org/r/20210517095835.81733-1-juergh@canonical.com
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 17:17:41 -04:00
kernel test robot 824731258b scsi: target: tcmu: Fix boolreturn.cocci warnings
drivers/target/target_core_user.c:1424:9-10: WARNING: return of 0/1 in function 'tcmu_handle_completions' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.

Generated by: scripts/coccinelle/misc/boolreturn.cocci

Link: https://lore.kernel.org/r/20210515230358.GA97544@60d1edce16e0
Fixes: 9814b55cde ("scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found")
CC: Bodo Stroesser <bostroesser@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 17:14:21 -04:00
Bart Van Assche e2ac7ab281 scsi: ufs: Use designated initializers in ufs_pm_lvl_states[]
The comments in the enum ufs_pm_level definition are redundant. Remove the
comments from the ufs_pm_level enum and use designated initializers in the
ufs_pm_lvl_states[] definition instead.

Link: https://lore.kernel.org/r/20210519202058.12634-3-bvanassche@acm.org
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 17:07:05 -04:00
Sergey Shtylyov ab17122e75 scsi: hisi_sas: Propagate errors in interrupt_init_v1_hw()
After commit 6c11dc0604 ("scsi: hisi_sas: Fix IRQ checks") we have the
error codes returned by platform_get_irq() ready for the propagation
upsream in interrupt_init_v1_hw() -- that will fix still broken deferred
probing. Let's propagate the error codes from devm_request_irq() as well
since I don't see the reason to override them with -ENOENT...

Link: https://lore.kernel.org/r/49ba93a3-d427-7542-d85a-b74fe1a33a73@omp.ru
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 17:04:13 -04:00