Commit Graph

270 Commits

Author SHA1 Message Date
Don Brace f54f85dfd7 scsi: smartpqi: Update version to 2.1.18-045
Link: https://lore.kernel.org/r/165730608687.177165.11815510982277242966.stgit@brunhilda
Reviewed-by: Gerry Morong <gerry.morong@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:05 -04:00
Don Brace e4b73b3fa2 scsi: smartpqi: Update copyright to current year
Update copyright to current year.

Link: https://lore.kernel.org/r/165730608177.177165.13184715486635363193.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:04 -04:00
Kevin Barnett 6d567dfee0 scsi: smartpqi: Add ctrl ready timeout module parameter
Allow user to override the default driver timeout for controller ready.

There are some rare configurations which require the driver to wait longer
than the normal 3 minutes for the controller to complete its bootup
sequence and be ready to accept commands from the driver.

The module parameter is:

ctrl_ready_timeout= { 0 | 30-1800 }

and specifies the timeout in seconds for the driver to wait for controller
ready. The valid range is 0 or 30-1800. The default value is 0, which
causes the driver to use a timeout of 180 seconds (3 minutes).

Link: https://lore.kernel.org/r/165730607666.177165.9221211345284471213.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:04 -04:00
Kevin Barnett 2d80f4054f scsi: smartpqi: Update deleting a LUN via sysfs
Change removing a LUN using sysfs from an internal driver function
pqi_remove_all_scsi_devices() to using the .slave_destroy entry in the
scsi_host_template.

A LUN can be deleted via sysfs using this syntax:

echo 1 > /sys/block/sdX/device/delete

Link: https://lore.kernel.org/r/165730607154.177165.9723066932202995774.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:04 -04:00
Mike McGowen cf15c3e734 scsi: smartpqi: Add module param to disable managed ints
Allow SMP affinity to be changeable by disabling managed interrupts.

On distributions where the driver is enabled for multi-queue support the
driver utilizes kernel managed interrupts, which automatically distributes
interrupts to all available CPUs and assigns SMP affinity.

On most distributions, the affinity can not be changed by the user.

This change will allow managed interrupts to be disabled by the user via a
module parameter while still allowing multi-queue support to function
properly.

Use the module parameter disable_managed_interrupts=1

Link: https://lore.kernel.org/r/165730606638.177165.12846020942931640329.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:04 -04:00
Kevin Barnett 6ce3cfb365 scsi: smartpqi: Fix RAID map race condition
Correct a rare stale RAID map access when performing AIO during a RAID
configuration change.

A race condition in the driver could cause it to access a stale RAID map
when a logical volume is reconfigured.

Modify the driver logic to invalidate a RAID map very early when a RAID
configuration change is detected and only switch to a new RAID map after
the driver detects that the RAID map has changed.

Link: https://lore.kernel.org/r/165730606128.177165.7671413443814750829.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:04 -04:00
Mahesh Rajashekhara 69695aeaa6 scsi: smartpqi: Fix DMA direction for RAID requests
Correct a SOP READ and WRITE DMA flags for some requests.

This update corrects DMA direction issues with SCSI commands removed from
the controller's internal lookup table.

Currently, SCSI READ BLOCK LIMITS (0x5) was removed from the controller
lookup table and exposed a DMA direction flag issue.

SCSI READ BLOCK LIMITS was recently removed from our controller lookup
table so the controller uses the respective IU flag field to set the DMA
data direction. Since the DMA direction is incorrect the FW never completes
the request causing a hang.

Some SCSI commands which use SCSI READ BLOCK LIMITS

      * sg_map
      * mt -f /dev/stX status

After updating controller firmware, users may notice their tape units
failing. This patch resolves the issue.

Also, the AIO path DMA direction is correct.

The DMA direction flag is a day-one bug with no reported BZ.

Fixes: 6c223761eb ("smartpqi: initial commit of Microsemi smartpqi driver")
Link: https://lore.kernel.org/r/165730605618.177165.9054223644512926624.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mahesh Rajashekhara <Mahesh.Rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:04 -04:00
Kevin Barnett 85b41834b0 scsi: smartpqi: Stop logging spurious PQI reset failures
Change method used to detect controller firmware crash during PQI reset.

PQI reset can fail with error -6 if firmware takes > 100ms to complete
reset.

Method used by driver to detect controller firmware crash during PQI was
incorrect in some cases.

Link: https://lore.kernel.org/r/165730605108.177165.1132931838384767071.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:03 -04:00
Mike McGowen 2a9c2ba2bc scsi: smartpqi: Add PCI IDs for Lenovo controllers
Add PCI IDs for Lenovo controllers (values in hex):

                                        VID  / DID  / SVID / SDID
                                        ----   ----   ----   ----
Lenovo 4350-8i HBA                      9005 / 028f / 1d49 / 0220
Lenovo 4350-16i HBA                     9005 / 028f / 1d49 / 0221
Lenovo 5350-8i RAID                     9005 / 028f / 1d49 / 0520
Lenovo 5350-8i Internal RAID            9005 / 028f / 1d49 / 0522
Lenovo 9350-8i RAID                     9005 / 028f / 1d49 / 0620
Lenovo 9350-8i Internal RAID            9005 / 028f / 1d49 / 0621
Lenovo 9350-16i RAID                    9005 / 028f / 1d49 / 0622
Lenovo 9350-16i Internal RAID           9005 / 028f / 1d49 / 0623

Link: https://lore.kernel.org/r/165730604598.177165.9910276232981721083.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:03 -04:00
Mike McGowen 44e68c4af5 scsi: smartpqi: Add PCI ID for Adaptec SmartHBA 2100-8i
Add the PCI ID for (values in hex):
                                        VID  / DID  / SVID / SDID
                                        ----   ----   ----   ----
Adaptec SmartHBA 2100-8i-o              9005 / 0285 / 9005 / 0659

Link: https://lore.kernel.org/r/165730604089.177165.17257514581321583667.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:03 -04:00
Sagar Biradar 331f7e998b scsi: smartpqi: Fix PCI control linkdown system hang
Fail all outstanding requests after a PCI linkdown.

Block access to device SCSI attributes during the following conditions:

  "Cable pull" is called PQI_CTRL_SURPRISE_REMOVAL.

  "PCIe Link Down" is called PQI_CTRL_GRACEFUL_REMOVAL.

Block access to device SCSI attributes during and in rare instances when
the controller goes offline.

Either outstanding requests or the access of SCSI attributes post linkdown
can lead to a hang.

Post linkdown, driver does not fail the outstanding requests leading to
long wait time before all the IOs eventually fail.

Also access of the SCSI attributes by host applications can lead to a
system hang.

Link: https://lore.kernel.org/r/165730603578.177165.4699352086827187263.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:03 -04:00
Kumar Meiyappan 904f2bfda6 scsi: smartpqi: Add driver support for multi-LUN devices
Add driver support for up to 256 LUNs per device.

Update AIO path to pass the appropriate LUN number for base-code to target
the correct LUN.

Update RAID IO path to pass the appropriate LUN number for FW to target the
correct LUN.

Pass the correct LUN number while doing a LUN reset.

Count the outstanding commands based on LUN number.  While removing a
Multi-LUN device, wait for all outstanding commands to complete for all
LUNs.

Add Feature bit support.

Link: https://lore.kernel.org/r/165730603067.177165.14016422176841798336.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kumar Meiyappan <Kumar.Meiyappan@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:03 -04:00
Mike McGowen 297bdc540f scsi: smartpqi: Close write read holes
Insert a minimum 1 millisecond delay after writing to a register before
reading from it.

SIS and PQI registers that can be both written to and read from can return
stale data if read from too soon after having been written to.

There is no read/write ordering or hazard detection on the inbound path to
the MSGU from the PCIe bus, therefore reads could pass writes.

Link: https://lore.kernel.org/r/165730602555.177165.11181012469428348394.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com>
Co-developed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:03 -04:00
Murthy Bhat dab5378485 scsi: smartpqi: Add PCI IDs for ramaxel controllers
Add the following controllers (values in hex):

                               VID  / DID  / SVID / SDID
                               ---- / ---- / ---- / ----
Ramaxel FBGF-RAD PM8204        9005 / 028F / 1CC4 / 0101
Ramaxel FBGF-RAD PM8222        9005 / 028F / 1CC4 / 0201

Link: https://lore.kernel.org/r/165730602045.177165.3720208650043407285.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:02 -04:00
Gilbert Wu 1d393227fc scsi: smartpqi: Add controller fw version to console log
Print controller firmware version to OS message log during driver
initialization or after OFA.

Link: https://lore.kernel.org/r/165730601536.177165.17698744242908911822.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:02 -04:00
Mike McGowen 4e7d26029e scsi: smartpqi: Shorten drive visibility after removal
Check the response code returned from the LUN reset task management
function and if it indicates the LUN is not valid, do not retry.

Reduce rescan worker delay to 5 seconds for the event handler only.

The removal of a drive from the OS could have been delayed up to 30 seconds
after being physically pulled.

The driver was retrying a LUN reset 3 times even though the return code
indiciated the LUN was no longer valid. There was a 10 second delay between
each retry. Additionally, the rescan worker was scheduled to run 10 seconds
after the driver received the event.

Link: https://lore.kernel.org/r/165730601025.177165.9416869335174437006.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-13 23:42:02 -04:00
Julia Lawall 8946ea2838 scsi: smartpqi: Fix typo in comment
Spelling mistake (triple letters) in comment.  Detected with the help of
Coccinelle.

Link: https://lore.kernel.org/r/20220521111145.81697-58-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-23 23:24:10 -04:00
Bart Van Assche c1ea387d99 scsi: smartpqi: Stop using the SCSI pointer
Set .cmd_size in the SCSI host template instead of using the SCSI pointer
from struct scsi_cmnd. This patch prepares for removal of the SCSI pointer
from struct scsi_cmnd.

Link: https://lore.kernel.org/r/20220218195117.25689-44-bvanassche@acm.org
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-22 21:11:06 -05:00
Don Brace 31b17c3aeb scsi: smartpqi: Fix unused variable pqi_pm_ops for clang
Driver added a new dev_pm_ops structure used only if CONFIG_PM is set.  The
CONFIG_PM MACRO needed to be moved up in the code to avoid the compiler
warnings. The hunk to move the location was missing from the above patch.

Found by kernel test robot by building driver with CONFIG_PM disabled.

Link: https://lore.kernel.org/all/202202090657.bstNLuce-lkp@intel.com/
Link: https://lore.kernel.org/r/20220210201151.236170-1-don.brace@microchip.com
Fixes: c66e078ad8 ("scsi: smartpqi: Fix hibernate and suspend")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike Mcgowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-11 17:06:12 -05:00
Don Brace 62ed6622aa scsi: smartpqi: Update version to 2.1.14-035
Link: https://lore.kernel.org/r/164375215867.440833.17567317655622946368.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Gerry Morong <gerry.morong@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:36 -05:00
Kevin Barnett 291c2e0071 scsi: smartpqi: Fix lsscsi -t SAS addresses
Correct lsscsi -t output for newer controllers that support 16-byte WWID in
the SAS address field. lsscsi -t was displaying all zeros for SAS
addresses.

When we added support to smartpqi for 16-byte WWIDs in the RPL data for
newer controllers, we were copying the wrong part of the 16-byte WWID to
the SAS address field.

Link: https://lore.kernel.org/r/164375215363.440833.7298523719813806902.stgit@brunhilda.pdev.net
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:36 -05:00
Kevin Barnett c66e078ad8 scsi: smartpqi: Fix hibernate and suspend
Restructure the hibernate/suspend code to allow workarounds for the
controller boot differences.

Newer controllers have subtle differences in the way that they boot up.

Link: https://lore.kernel.org/r/164375214859.440833.14683009064111314948.stgit@brunhilda.pdev.net
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:36 -05:00
Mike McGowen 5e6935864d scsi: smartpqi: Fix BUILD_BUG_ON() statements
Add calls to the functions at the beginning driver initialization.

The BUILD_BUG_ON() statements that are currently in functions named
verify_structures() in the modules smartpqi_init.c and smartpqi_sis.c do
not work as currently implemented.

Link: https://lore.kernel.org/r/164375214355.440833.13129778749209816497.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:36 -05:00
Mike McGowen c52efc9238 scsi: smartpqi: Fix NUMA node not updated during init
Correct NUMA node association when calling pqi_pci_probe().

In the function pqi_pci_probe(), the driver makes an OS call to get the
NUMA node associated with a controller. If the call returns that there is
no associated node, the driver attempts to set it to node 0. The problem is
that a different local variable (cp_node) was being used to do this, but
the original local variable (node) was still being used in the call to
pqi_alloc_ctrl_info().

The value of "node" is not updated if the conditional after the call to
dev_to_node() evaluates to TRUE.

Link: https://lore.kernel.org/r/164375213850.440833.5243014942807747074.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:35 -05:00
Kevin Barnett 00598b056a scsi: smartpqi: Expose SAS address for SATA drives
Remove UNIQUE_WWID_IN_REPORT_PHYS_LUN PQI feature.

This feature was originally added to solve a problem with NVMe drives, but
this problem has since been solved a different way, so this PQI feature is
no longer required for any type of drive.

The kernel was not creating symbolic links in sysfs between SATA drives and
their enclosure.

The driver was enabling the UNIQUE_WWID_IN_REPORT_PHYS_LUN PQI feature,
which causes the FW to return a WWID for SATA drives that is derived from a
unique ID read from the SATA drive itself. The driver was exposing this
WWID as the drive's SAS address. However, because this SAS address does not
match the SAS address returned by an enclosure's SES Page 0xA data, the
Linux kernel was not able to match a SATA drive with its enclosure.

Link: https://lore.kernel.org/r/164375213346.440833.12379222470149882747.stgit@brunhilda.pdev.net
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:35 -05:00
Mike McGowen 5d8fbce04d scsi: smartpqi: Speed up RAID 10 sequential reads
Use all data disks for sequential read operations.

Testing discovered inconsistent performance on RAID 10 volumes when
performing 256K sequential reads. The driver was only using a single
tracker to determine which physical drive to send a request to for AIO
requests.

Change the single tracker (next_bypass_group) to an array of trackers based
on the number of data disks in a row of the RAID map.

Link: https://lore.kernel.org/r/164375212842.440833.6733971458765002128.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:35 -05:00
Mahesh Rajashekhara 3ada501d60 scsi: smartpqi: Fix kdump issue when controller is locked up
Avoid dropping into shell if the controller is in locked up state.

Driver issues SIS soft reset to bring back the controller to SIS mode while
OS boots into kdump mode.

If the controller is in lockup state, SIS soft reset does not work.

Since the controller lockup code has not been cleared, driver considers the
firmware is no longer up and running. Driver returns back an error code to
OS and the kdump fails.

Link: https://lore.kernel.org/r/164375212337.440833.11955356190354940369.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:35 -05:00
Mahesh Rajashekhara 27655e9db4 scsi: smartpqi: Update volume size after expansion
After modifying logical volume size, lsblk command still shows previous
size of logical volume.

When the driver gets any event from firmware it schedules rescan worker
with delay of 10 seconds. If array expansion is so quick that it gets
completed in a second, the driver could not catch logical volume expansion
due to worker delay.

Since driver does not detect volume expansion, driver would not call
rescan device to update new size to the OS.

Link: https://lore.kernel.org/r/164375211833.440833.17023155389220583731.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:35 -05:00
Sagar Biradar b73357a1fd scsi: smartpqi: Avoid drive spin-down during suspend
On certain systems (based on PCI IDs), when the OS transitions the system
into the suspend (S3) state, the BMIC flush cache command will indicate a
system RESTART instead of SUSPEND.

This avoids drive spin-down.

Link: https://lore.kernel.org/r/164375211330.440833.7203813692110347698.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:35 -05:00
Balsundar P 42dc0426fb scsi: smartpqi: Resolve delay issue with PQI_HZ value
Change PQI_HZ to HZ.

PQI_HZ macro was set to 1000 when HZ value is less than 1000.  By default,
PQI_HZ will result into a delay of 10 seconds(for kernel, which has HZ =
100). So in this case when firmware raises an event, rescan worker will be
scheduled after a delay of (10 x PQI_HZ) = 100 seconds instead of 10
seconds.

Also driver uses PQI_HZ at many instances, which might result in some other
issues with respect to delay.

Link: https://lore.kernel.org/r/164375210825.440833.15510172447583227486.stgit@brunhilda.pdev.net
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Balsundar P <balsundar.p@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:34 -05:00
Kevin Barnett 9e98e60bfc scsi: smartpqi: Fix a typo in func pqi_aio_submit_io()
Use correct pqi_aio_path_request structure to calculate the offset to
sg_descriptors.

The function pqi_aio_submit_io() uses the pqi_raid_path_request structure
to calculate the offset of the structure member sg_descriptors.  This is
incorrect.  It should be using the pqi_aio_path_request structure instead.

This typo is benign because the offsets are the same in both structures.

Link: https://lore.kernel.org/r/164375210321.440833.2566086558909686629.stgit@brunhilda.pdev.net
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:34 -05:00
Kevin Barnett b4dc06a907 scsi: smartpqi: Fix a name typo and cleanup code
Rename the function pqi_is_io_high_priority() to pqi_is_io_high_priority().
Remove 2 unnecessary lines from the function, and adjusted the white space.

Link: https://lore.kernel.org/r/164375209818.440833.10908948825731635853.stgit@brunhilda.pdev.net
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:34 -05:00
Murthy Bhat 94a68c8143 scsi: smartpqi: Quickly propagate path failures to SCSI midlayer
Return DID_NO_CONNECT when a path failure is detected.

When a path fails during I/O and AIO path gets disabled for a multipath
device, the I/O was retried in the RAID path slowing down path fail
detection. Returning DID_NO_CONNECT allows multipath to switch paths more
quickly.

Link: https://lore.kernel.org/r/164375209313.440833.9992416628621839233.stgit@brunhilda.pdev.net
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:34 -05:00
Sagar Biradar 70ba20be4b scsi: smartpqi: Eliminate drive spin down on warm boot
Avoid drive spin down during a warm boot.

Call the BMIC Flush Cache command (0xc2) to indicate the reason for the
flush cache (shutdown, hibernate, suspend, or restart).

Link: https://lore.kernel.org/r/164375208810.440833.11254644025650871435.stgit@brunhilda.pdev.net
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:34 -05:00
Gilbert Wu 2a47834d94 scsi: smartpqi: Enable SATA NCQ priority in sysfs
Add device attribute 'sas_ncq_prio_enable' to enable SATA NCQ priority
support and provide I/O priority in SCSI command and pass priority
information to controller firmware.

This device attribute works only when device has NCQ priority support and
controller firmware can handle I/Os with NCQ priority attribute.

Link: https://lore.kernel.org/r/164375208306.440833.7392577382127815362.stgit@brunhilda.pdev.net
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:34 -05:00
Don Brace c57ee4ccb3 scsi: smartpqi: Add PCI IDs
Add in new ZTE controllers:
                                    VID  / DID  / SVID / SDID
                                    ----   ----   ----   ----
ZTE SmartROC3100 RS241-18i 2G       9005 / 028F / 1CF2 / 5449
ZTE SmartROC3100 RS242-18i 4G       9005 / 028F / 1CF2 / 544A
ZTE SmartIOC2100 RS243-18i          9005 / 028F / 1CF2 / 544B
ZTE SmartROC3100 RM241B-18i 2G      9005 / 028F / 1CF2 / 544D
ZTE SmartROC3100 RM242B-18i 4G      9005 / 028F / 1CF2 / 544E
ZTE SmartIOC2100 RM243B-18i         9005 / 028F / 1CF2 / 544F

Add PCI ID for 1100-24i controller:
                                    VID  / DID  / SVID / SDID
                                    ----   ----   ----   ----
HBA 1100-24i                        9005 / 028F / 9005 / 1304

Add PCI IDs for HPE and Adaptec devices:
                                    VID  / DID  / SVID / SDID
                                    ----   ----   ----   ----
Adaptec Smart HBA 2200-8io /e       9005 / 028F / 9005 / 1463
Adaptec Smart HBA 2200-16io /e      9005 / 028F / 9005 / 14C2
HPE SR308i-p Gen11                  9005 / 028F / 1590 / 0382
HPE SR308i-o Gen11                  9005 / 028F / 1590 / 0383
HPE SR932i-p Gen11                  9005 / 028F / 1590 / 0381

Add PCI IDs for Inspur controllers:
                                    VID  / DID  / SVID / SDID
                                    ----   ----   ----   ----
INSPUR RS0800M5H24i                 9005 / 028F / 1BD4 / 006B
INSPUR RS0800M5E8I                  9005 / 028F / 1BD4 / 006C
INSPUR RS0800M5H8I                  9005 / 028F / 1BD4 / 006D
INSPUR RS0804M5R16i                 9005 / 028F / 1BD4 / 006F
INSPUR RS0800M5E16i                 9005 / 028F / 1BD4 / 0070
INSPUR RS0800M5H16i                 9005 / 028F / 1BD4 / 0071
INSPUR RS0800M5E16i                 9005 / 028F / 1BD4 / 0072
NT RAID 3100-24i                    9005 / 028F / 1F0C / 3161

Add HPE and Adaptec OROC PCI IDs:
                                    VID  / DID  / SVID / SDID
                                    ----   ----   ----   ----
HPE SR216i-o Gen11                  9005 / 028F / 9005 / 036F
Adaptec SmartRAID 3284-16io /e/uC   9005 / 028F / 9005 / 1473
Adaptec SmartRAID 3254-16io /e      9005 / 028F / 9005 / 1474

Add PCI IDs for new channel controllers:
                                    VID  / DID  / SVID / SDID
                                    ----   ----   ----   ----
Adaptec SmartRAID 3254-8i /e        9005 / 028F / 9005 / 14A4
Adaptec SmartRAID 3252-8i /e        9005 / 028F / 9005 / 14A5
Adaptec SmartRAID 3204-8i /e        9005 / 028F / 9005 / 14A6

Align PCI IDs with OOB driver. Easier to check for differences.

Link: https://lore.kernel.org/r/164375207802.440833.15947108153078495425.stgit@brunhilda.pdev.net
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Co-developed-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Co-developed-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Co-developed-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:33 -05:00
Don Brace c4ff687d25 scsi: smartpqi: Fix rmmod stack trace
Prevent "BUG: scheduling while atomic: rmmod" stack trace.

Stop setting spin_locks before calling OS functions to remove devices.

Link: https://lore.kernel.org/r/164375207296.440833.4996145011193819683.stgit@brunhilda.pdev.net
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-07 23:38:33 -05:00
Bart Van Assche 64fc9015fb scsi: smartpqi: Switch to attribute groups
struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.

Link: https://lore.kernel.org/r/20211012233558.4066756-43-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 21:45:59 -04:00
Bart Van Assche 0ca1908057 scsi: smartpqi: Call scsi_done() directly
Conditional statements are faster than indirect calls. Hence call
scsi_done() directly.

Link: https://lore.kernel.org/r/20211007202923.2174984-71-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 21:31:41 -04:00
Don Brace 605ae389ea scsi: smartpqi: Update version to 2.1.12-055
Update driver version to reflect changes.

Link: https://lore.kernel.org/r/20210928235442.201875-12-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:16 -04:00
Mike McGowen 80982656b7 scsi: smartpqi: Add 3252-8i PCI id
Add PCI ID information for the Adaptec SmartRAID 3252-8i controller:

    9005 / 028F / 9005 / 14A2

Link: https://lore.kernel.org/r/20210928235442.201875-11-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:16 -04:00
Kevin Barnett d4dc6aea93 scsi: smartpqi: Fix duplicate device nodes for tape changers
Stop the OS from re-discovering multiple LUNs for tape drive and medium
changer.

Duplicate device nodes for Ultrium tape drive and medium changer are being
created.

The Ultrium tape drive is a multi-LUN SCSI target.  It presents a LUN for
the tape drive and a 2nd LUN for the medium changer.  Our controller FW
lists both LUNs in the RPL results.

As a result, the smartpqi driver exposes both devices to the OS. Then the
OS does its normal device discovery via the SCSI REPORT LUNS command, which
causes it to re-discover both devices a 2nd time, which results in the
duplicate device nodes.

Link: https://lore.kernel.org/r/20210928235442.201875-10-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:15 -04:00
Mike McGowen 987d35605b scsi: smartpqi: Fix boot failure during LUN rebuild
Move the delay in the register polling loop to the beginning of the loop to
ensure there is always a delay between writing the register and reading it.

Link: https://lore.kernel.org/r/20210928235442.201875-9-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:15 -04:00
Mike McGowen 28ca6d876c scsi: smartpqi: Add extended report physical LUNs
Add support for the new extended formats in the data returned from the
Report Physical LUNs command for controllers that enable this feature.

The new formats allow the reporting of 16-byte WWIDs.

Link: https://lore.kernel.org/r/20210928235442.201875-8-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:15 -04:00
Mahesh Rajashekhara 4f3cefc308 scsi: smartpqi: Avoid failing I/Os for offline devices
Prevent kernel crash by failing outstanding I/O request when the OS takes
device offline.

When posted I/Os to the controller's inbound queue are not picked by the
controller, the driver will halt the controller and take the controller
offline.

When the driver takes the controller offline, the driver will fail all the
outstanding requests which can sometimes lead to an OS crash.

Link: https://lore.kernel.org/r/20210928235442.201875-7-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:15 -04:00
Don Brace be76f90668 scsi: smartpqi: Add TEST UNIT READY check for SANITIZE operation
Send a TEST UNIT READY to HBA disks and do not present them to the OS if
0x02/0x04/0x1b (SANITIZE IN PROGRESS) is returned.

During boot-up, some OSes appear to hang when there are one or more disks
undergoing a sanitize operation.

According to SCSI SBC4 specification section 4.11.2 "Commands allowed
during SANITIZE", some SCSI commands are permitted, but read/write
operations are not.

When the OS attempts to read the disk partition table a CHECK CONDITION ASC
0x04 ASCQ 0x1b is returned which causes the OS to retry the read until
SANITIZE has completed. This can take hours.

According to document HPE Smart Storage Administrator User Guide, during
the sanitize erase operation, the drive is unusable. I.e. the expected
behavior for SANITIZE is the that disk remains offline even after SANITIZE
has completed. The customer is expected to re-enable the disk using the
management utility.

Link: https://lore.kernel.org/r/20210928235442.201875-6-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:15 -04:00
Kevin Barnett 6ce1ddf532 scsi: smartpqi: Update LUN reset handler
Enhance check for commands queued to the controller.  Add new function
pqi_nonempty_inbound_queue_count() that will wait for all I/O queued for
submission to controller across all queue groups to drain.  Add helper
functions to obtain queue command counts for each queue group.  These
queues should drain quickly as they are already staged to be submitted down
to the controller's IB queue.

Enhance check for outstanding command completion.  Update the count of
outstanding commands while waiting.  This value was not re-obtained and was
potentially causing infinite wait for all completions.

Link: https://lore.kernel.org/r/20210928235442.201875-5-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:15 -04:00
Murthy Bhat 5d1f03e6f4 scsi: smartpqi: Capture controller reason codes
In some rare cases, the driver can halt the controller. Add a reason code
describing why the controller was halted.  Store this reason code in a
controller register to aid in debugging the issue.

Link: https://lore.kernel.org/r/20210928235442.201875-4-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:14 -04:00
Mahesh Rajashekhara 9ee5d6e9ac scsi: smartpqi: Add controller handshake during kdump
Correct kdump hangs when controller is locked up.

There are occasions when a controller reboot (controller soft reset) is
issued when a controller firmware crash dump is in progress.

This leads to incomplete controller firmware crash dump:

 - When the controller crash dump is in progress, and a kdump is initiated,
   the driver issues inbound doorbell reset to bring back the controller in
   SIS mode.

 - If the controller is in locked up state, the inbound doorbell reset does
   not work causing controller initialization failures. This results in the
   driver hanging waiting for SIS mode.

To avoid an incomplete controller crash dump, add in a controller crash
dump handshake:

 - Controller will indicate start and end of the controller crash dump by
   setting some register bits.

 - Driver will look these bits when a kdump is initiated.  If a controller
   crash dump is in progress, the driver will wait for the controller crash
   dump to complete before issuing the controller soft reset then complete
   driver initialization.

Link: https://lore.kernel.org/r/20210928235442.201875-3-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:14 -04:00
Don Brace 819225b03d scsi: smartpqi: Update device removal management
Update device removal path to handle issues for:

 - rmmod: Correct stack trace when removing devices.
 - rmmod: Synchronize SCSI cache.
 - Update handling for removing devices using sysfs.

Link: https://lore.kernel.org/r/20210928235442.201875-2-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-05 00:13:14 -04:00