Commit Graph

436 Commits

Author SHA1 Message Date
Xiang Chen e74006edd0 scsi: hisi_sas: Fix the conflict between device gone and host reset
When device gone, it will check whether it is during reset, if not, it will
send internal task abort. Before internal task abort returned, reset
begins, and it will check whether SAS_PHY_UNUSED is set, if not, it will
call hisi_sas_init_device(), but at that time domain_device may already be
freed or part of it is freed, so it may referenece null pointer in
hisi_sas_init_device(). It may occur as follows:

    thread0				thread1
hisi_sas_dev_gone()
    check whether in RESET(no)
    internal task abort
				    reset prep
				    soft_reset
				    ... (part of reset_done)
    internal task abort failed
    release resource anyway
    clear_itct
    device->lldd_dev=NULL
				    hisi_sas_reset_init_all_device
					check sas_dev->dev_type is SAS_PHY_UNUSED and
					!device
    set dev_type SAS_PHY_UNUSED
    sas_free_device
					hisi_sas_init_device
					...

Semaphore hisi_hba.sema is used to sync the processes of device gone and
host reset.

To solve the issue, expand the scope that semaphore protects and let them
never occur together.

And also some places will check whether domain_device is NULL to judge
whether the device is gone. So when device gone, need to clear
sas_dev->sas_device.

Link: https://lore.kernel.org/r/1567774537-20003-14-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:57 -04:00
Xiang Chen 97b151e758 scsi: hisi_sas: Add BIST support for phy loopback
Add BIST (built in self test) support for phy loopback.

Through the new debugfs interface, the user can configure loopback
mode/linkrate/phy id/code mode before enabling it. And also user can
enable/disable BIST function.

Link: https://lore.kernel.org/r/1567774537-20003-13-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:57 -04:00
Luo Jiaxing 7ec7082c57 scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation
We extract the code of memory allocate and construct an new function for
it. We think it's convenient for subsequent optimization.

Link: https://lore.kernel.org/r/1567774537-20003-12-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:57 -04:00
Luo Jiaxing 4bc058097a scsi: hisi_sas: Remove some unused function arguments
Some function arguments are unused, so remove them.

Also move the timeout print in for wait_cmds_complete_timeout_vX_hw()
callsites into that same function.

Link: https://lore.kernel.org/r/1567774537-20003-11-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen 27f22723c3 scsi: hisi_sas: Remove redundant work declaration
Remove redundant work declaration in HISI_SAS_DECLARE_RST_WORK_ON_STACK

Link: https://lore.kernel.org/r/1567774537-20003-10-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Luo Jiaxing 971b59443f scsi: hisi_sas: Remove hisi_sas_hw.slot_complete
We never call hisi_sas_hw.slot_complete, so remove it.

Link: https://lore.kernel.org/r/1567774537-20003-9-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen 435a05cf8c scsi: hisi_sas: Assign NCQ tag for all NCQ commands
Currently the NCQ tag is only assigned for FPDMA READ and FPDMA WRITE
commands, and for other NCQ commands (such as FPDMA SEND), their NCQ tags
are set in the delivery command to 0.

So for all the NCQ commands, we also need to assign normal NCQ tag for
them, so drop the command type check in hisi_sas_get_ncq_tag() [drop
hisi_sas_get_ncq_tag() altogether actually], and always use the ATA command
NCQ tag when appropriate.

Link: https://lore.kernel.org/r/1567774537-20003-8-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen 73a4925d15 scsi: hisi_sas: Update all the registers after suspend and resume
After suspend and resume, the HW registers will be set back to their
initial value. We use init_reg_v3_hw() to set some registers, but some
registers are set via firmware in ACPI "_RST" method, so add reset handler
before init_reg_v3_hw().

Link: https://lore.kernel.org/r/1567774537-20003-7-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen b45e05aa5d scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device
When init device for SAS disks, it will send TMF IO to clear disks. At that
time TMF IO is broken by some operations such as injecting controller reset
from HW RAs event, the TMF IO will be timeout, and at last device will be
gone. Print is as followed:

hisi_sas_v3_hw 0000:74:02.0: dev[240:1] found
...
hisi_sas_v3_hw 0000:74:02.0: controller resetting...
hisi_sas_v3_hw 0000:74:02.0: phyup: phy7 link_rate=10(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy0 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy2 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy3 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy6 link_rate=10(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy5 link_rate=11
hisi_sas_v3_hw 0000:74:02.0: phyup: phy4 link_rate=11
hisi_sas_v3_hw 0000:74:02.0: controller reset complete
hisi_sas_v3_hw 0000:74:02.0: abort tmf: TMF task timeout and not done
hisi_sas_v3_hw 0000:74:02.0: dev[240:1] is gone
sas: driver on host 0000:74:02.0 cannot handle device 5000c500a75a860d,
error:5

To improve the reliability, retry TMF IO max of 3 times for SAS disks which
is the same as softreset does.

Link: https://lore.kernel.org/r/1567774537-20003-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Luo Jiaxing 76dd768b44 scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails
At expander environment, we delay after issue phy reset to wait for
hardware to handle phy reset. But if sas_smp_phy_control() fails, the
delay is unnecessary so remove it.

Link: https://lore.kernel.org/r/1567774537-20003-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Luo Jiaxing c2bae4f7d7 scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled
At hisi_sas_debug_I_T_nexus_reset(), we call sas_phy_reset() to reset a
phy. But if the phy is disabled, sas_phy_reset() will directly return
-ENODEV without issue a phy reset request.

If so, We can directly return -ENODEV to libsas before issue a phy
reset.

Link: https://lore.kernel.org/r/1567774537-20003-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:55 -04:00
Luo Jiaxing af01b2b924 scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset()
When calling sas_phy_reset(), we need to specify whether the reset type
is hard reset or link reset - use true/false for clarity.

Link: https://lore.kernel.org/r/1567774537-20003-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:55 -04:00
Luo Jiaxing 7105e68afa scsi: hisi_sas: add debugfs auto-trigger for internal abort time out
This trigger is add at _hisi_sas_internal_task_abort()

Link: https://lore.kernel.org/r/1567774537-20003-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:55 -04:00
YueHaibing c0c1a71e95 scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Link: https://lore.kernel.org/r/20190904130256.24704-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-07 16:40:56 -04:00
zhengbin 328bc6debf scsi: hisi_sas: remove set but not used variable 'irq_value'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/hisi_sas/hisi_sas_v1_hw.c: In function cq_interrupt_v1_hw:
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c:1542:6: warning: variable irq_value set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-29 18:01:58 -04:00
Luo Jiaxing a5ac1f5d9a scsi: hisi_sas: Consolidate internal abort calls in LU reset operation
In hisi_sas_lu_reset(), we call internal abort for SAS and SATA device
codepaths -> consolidate into a single call.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:16 -04:00
Xiang Chen e7513f666b scsi: hisi_sas: replace "%p" with "%pK"
The format specifier "%p" can leak kernel address, and use "%pK" instead.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen a07b48766c scsi: hisi_sas: Remove some unnecessary code
Remove some unnecessary code, including:

 - Explicit zeroing of memory allocated for dmam_alloc_coherent()

 - Some duplicated code

 - Some redundant masking

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing 7bf18e849d scsi: hisi_sas: Modify return type of debugfs functions
For functions which always return 0, which is never checked, make to return
void.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen e16963f378 scsi: hisi_sas: Drop free_irq() when devm_request_irq() failed
It will free irq automatically if devm_request_irq() failed, so drop
free_irq() if devm_request_irq() failed.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
John Garry 5f6c32d7ce scsi: hisi_sas: Drop SMP resp frame DMA mapping
The SMP frame response is written to the command table and not the SMP
response pointer from libsas, so don't bother DMA mapping (and unmapping)
the SMP response from libsas.

Suggested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
John Garry 1c003146c6 scsi: hisi_sas: Drop kmap_atomic() in SMP command completion
The call to kmap_atomic() in the SMP command completion code is
unnecessary, since kmap() is only really concerned with highmem, which is
not relevant on arm64. The controller only finds itself in arm64 systems.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen 599aefc81e scsi: hisi_sas: Make slot buf minimum allocation of PAGE_SIZE
For a system with PAGE_SIZE of 16K or 64K, the size every time we want to
alloc may be small like 4K, but for function dmam_alloc_coherent(), the
least size it allocates is PAGE_SIZE, so it will waste much memory for the
situation.

To solve the issue, limit the minimum allocation size of slot buf to
PAGE_SIZE.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen d380f55503 scsi: hisi_sas: Don't bother clearing status buffer IU in task prep
For struct hisi_sas_status_buffer, it contains struct hisi_sas_err_record
and iu[1024]. The struct iu[1024] will be filled fully by the response of
disks, so it is not need to initialize them to 0, but for the struct
hisi_sas_err_record, SAS controller only fill some fields of
hisi_sas_err_record according to hw designer, so it should be initialised
to 0.  After the change, cpu utilization percentage of memset() is changed
from 1.7% to 0.12%.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing 445ee2de11 scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset()
Fix a possible out-of-bounds access in hisi_sas_debug_I_T_nexus_reset().

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing b0b3e4290e scsi: hisi_sas: Snapshot AXI and RAS register at debugfs
The AXI and RAS register values should also should be snapshot at debugfs.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing bbe0a7b348 scsi: hisi_sas: Snapshot HW cache of IOST and ITCT at debugfs
The value of IOST/ITCT is updated to cache first, and then synchronize to
DDR periodically. So the value in IOST/ITCT cache is the latest data and
it's important for debugging.

So, the HW cache of IOST and ITCT should be snapshot at debugfs.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
Luo Jiaxing bee0cf25c0 scsi: hisi_sas: Fix pointer usage error in show debugfs IOST/ITCT
Fix how the pointer is set in hisi_sas_debugfs_iost_show() and
hisi_sas_debugfs_itct_show().

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
John Garry 897cc769bc scsi: hisi_sas: Drop hisi_sas_hw.get_free_slot
In commit 1273d65f29 ("scsi: hisi_sas: change queue depth from 512 to
4096"), the depth of each queue is the same as the max IPTT in the system.

As such, as long as we have an IPTT allocated, we will have enough space on
any delivery queue.

All .get_free_slot functions were checking for space on the queue by
reading the DQ read pointer. Drop this, and also raise the code into common
code, as there is nothing hw specific remaining.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
John Garry 93352abc81 scsi: hisi_sas: Make max IPTT count equal for all hw revisions
There is a small optimisation to be had by making the max IPTT the same for
all hw revisions, that being we can drop the check for read and write
pointer being the same in the get free slot function.

Change v1 hw to have max IPTT of 4096 - same as v2 and v3 hw - and
drop hisi_sas_hw.max_command_entries.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
Linus Torvalds ba6d10ab80 SCSI misc on 20190709
This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
 mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
 removal of the osst driver (I heard from Willem privately that he
 would like the driver removed because all his test hardware has
 failed).  Plus number of minor changes, spelling fixes and other
 trivia.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd
 fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE
 fjy17dQLvsJ4GsidHy8=
 =aS5B
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
  mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
  removal of the osst driver (I heard from Willem privately that he
  would like the driver removed because all his test hardware has
  failed). Plus number of minor changes, spelling fixes and other
  trivia.

  The big merge conflict this time around is the SPDX licence tags.
  Following discussion on linux-next, we believe our version to be more
  accurate than the one in the tree, so the resolution is to take our
  version for all the SPDX conflicts"

Note on the SPDX license tag conversion conflicts: the SCSI tree had
done its own SPDX conversion, which in some cases conflicted with the
treewide ones done by Thomas & co.

In almost all cases, the conflicts were purely syntactic: the SCSI tree
used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the
treewide conversion had used the new-style ones ("GPL-2.0-only" and
"GPL-2.0-or-later").

In these cases I picked the new-style one.

In a few cases, the SPDX conversion was actually different, though.  As
explained by James above, and in more detail in a pre-pull-request
thread:

 "The other problem is actually substantive: In the libsas code Luben
  Tuikov originally specified gpl 2.0 only by dint of stating:

  * This file is licensed under GPLv2.

  In all the libsas files, but then muddied the water by quoting GPLv2
  verbatim (which includes the or later than language). So for these
  files Christoph did the conversion to v2 only SPDX tags and Thomas
  converted to v2 or later tags"

So in those cases, where the spdx tag substantially mattered, I took the
SCSI tree conversion of it, but then also took the opportunity to turn
the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag.

Similarly, when there were whitespace differences or other differences
to the comments around the copyright notices, I took the version from
the SCSI tree as being the more specific conversion.

Finally, in the spdx conversions that had no conflicts (because the
treewide ones hadn't been done for those files), I just took the SCSI
tree version as-is, even if it was old-style.  The old-style conversions
are perfectly valid, even if the "-only" and "-or-later" versions are
perhaps more descriptive.

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits)
  scsi: qla2xxx: move IO flush to the front of NVME rport unregistration
  scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
  scsi: qla2xxx: on session delete, return nvme cmd
  scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices
  scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1
  scsi: megaraid_sas: Introduce various Aero performance modes
  scsi: megaraid_sas: Use high IOPS queues based on IO workload
  scsi: megaraid_sas: Set affinity for high IOPS reply queues
  scsi: megaraid_sas: Enable coalescing for high IOPS queues
  scsi: megaraid_sas: Add support for High IOPS queues
  scsi: megaraid_sas: Add support for MPI toolbox commands
  scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
  scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
  scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD
  scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
  scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
  scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
  scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout
  scsi: megaraid_sas: Call disable_irq from process IRQ poll
  scsi: megaraid_sas: Remove few debug counters from IO path
  ...
2019-07-11 15:14:01 -07:00
John Garry 924a3541ea scsi: libsas: aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander()
Many times in libsas, and in LLDDs which use libsas, the check for an
expander device is re-implemented or open coded.

Use dev_is_expander() instead. We rename this from
sas_dev_type_is_expander() to not spill so many lines in referencing.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-20 15:37:02 -04:00
Xiang Chen 97fcf176b4 scsi: hisi_sas: Disable stash for v3 hw
For v3 hw, stash is enabled to promote performance, but it does little to
improve performance according to current tests. What's more, it causes
exceptions for some situations, so disable it.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Luo Jiaxing e4c19deba6 scsi: hisi_sas: Ignore the error code between phy down to phy up
Several error codes will be generated between PHY down to up.

This issue was introduced by HW design. The designers came to the
conclusion that we should ignore these errors.

Signed-off-by: Jiaxing Luo <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Xiang Chen 0ab7bc825a scsi: hisi_sas: Change the type of some numbers to unsigned
It reports a error as follows from some tools at two places in our code:
runtime error: left shift of 4 by 29 places cannot be represented in type
'int' So change the type of the two numbers to unsigned to avoid the error.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
John Garry c7669f5012 scsi: hisi_sas: Reduce HISI_SAS_SGE_PAGE_CNT in size
Macro HISI_SAS_SGE_PAGE_CNT is defined to SG_CHUNK_SIZE, which is 128.

This means that sizeof(struct hisi_sas_slot_buf_table) is 4192. This is
just over a 4K, which can mean inefficient DMA memory usage (for no PI).

Reduce the size of HISI_SAS_SGE_PAGE_CNT to 124 to fit in a 4K page. With
this change, we experience no performance hit.

Cc: dann frazier <dann.frazier@canonical.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Xiaofei Tan 794327ab53 scsi: hisi_sas: Fix the issue of argument mismatch of printing ecc errors
The argument of dev_err() called by multi_bit_ecc_error_process_v3_hw() is
not right. We pass two arguments, but there is only one printk format
specifier in the string.

Also move the print format string to dev_err().

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Xiang Chen 6c86e046cf scsi: hisi_sas: Delete PHY timers when rmmod or probe failed
When removing the driver or when probe fails, we need to delete the PHY
timers.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Thomas Gleixner 2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
Thomas Gleixner ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Xiang Chen 01d4e3a2fc scsi: hisi_sas: Some misc tidy-up
Do some minor tidy-up.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Luo Jiaxing 246ea3c0ad scsi: hisi_sas: Don't fail IT nexus reset for Open Reject timeout
Currently we call hisi_sas_softreset_ata_disk() in
hisi_sas_I_T_nexus_reset().

If this fails for open reject reason, there is no reason to fail the IT
nexus reset, so only fail for TMF_RESP_FUNC_FAILED.

Some other strings spilled over multiple lines are reunited.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Luo Jiaxing a311570027 scsi: hisi_sas: Don't hard reset disk during controller reset
In the function of hisi_sas_init_device(), we added ops->hardreset() to
clear affiliation of STP target port or handle [STP pending] state.

Function hisi_sas_init_device() will be called when a device is found or
during controller reset. At controller reset, we call
hisi_sas_init_device() to re-init the disks, so calling hardreset() is
unnecessary and it also will cause some delay at controller reset.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiaofei Tan 3168d4f800 scsi: hisi_sas: Support all RAS events with MSI interrupts
This patch is to switch HW all error handling from PCI AER to MSI interrupt
due to non-standard PCI implementation. All HW errors which were being
reported through PCI AER can be reported through MSI interrupt also.

Do two things to complete the switch:

1. Notify FW to switch to MSI handling through ACPI DSM.

2. Add MSI handling for some hw errors, ECC errors and poison errors (we
   also call some of them AXI reuser error). They were handled only through
   PCI AER before.

For old FW reporting PCI AER events, the PCI AER handler will see that the
driver on longer support AER, and will leave the device in offlined state,
which is safe.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen adb5b38c19 scsi: hisi_sas: allocate different SAS address for directly attached situation
In commit 8b8d665315 ("scsi: hisi_sas: make SAS address of SATA disks
unique"), we ensured that each SATA disk in the system has a unique SAS
address, even if it is fake. That was for v2 hw.

Add this for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen 18a54b329c scsi: hisi_sas: Adjust the printk format of functions hisi_sas_init_device()
In function hisi_sas_init_device(), the log is as follows when error for
hardreset:

  hisi_sas_v3_hw 0000:74:02.0: SATA disk hardreset fail: 0xffffffed

Actually if hardreset failed, its return value is negative, so change the
print format from %x to %d.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
John Garry c63b88ccff scsi: hisi_sas: Fix for setting the PHY linkrate when disconnected
In commit efdcad62e7 ("scsi: hisi_sas: Set PHY linkrate when
disconnected"), we use the sas_phy_data.enable flag to track whether the
PHY was enabled or not, so that we know if we should set the PHY negotiated
linkrate at SAS_LINK_RATE_UNKNOWN or SAS_PHY_DISABLED.

However, it is not proper to use sas_phy_data.enable, since it is only set
when libsas attempts to set the PHY disabled/enabled; hence, it may not
even have an initial value.

As a solution to this problem, introduce hisi_sas_phy.enable to track
whether the PHY is enabled or not, so that we can set the negotiated
linkrate properly when the PHY comes down.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen 447f78c0e1 scsi: hisi_sas: Remedy inconsistent PHY down state in software
Currently there are two scenarioes which may cause PHY state of hardware
(which is 0) is inconsistent with the state held in software:

- Unplug SAS wire before get_phys_state when SAS controller reset, then the
  interrupts of phy down are ignored, phy state is 0 before reset, and it
  also gets 0 after reset, so phy down doesn't occur even if unplugged SAS
  wire;

- For v3 hw later version, it will close bus when 2 bit ECC error occurs.
  So if unplug SAS wire at that time, interrupts of phy down also not
  occur. So at last it will cause host reset. It also get phy state 0
  before and after reset, the same issue occurs.

To solve it, use hisi_sas_phy_down() directly in rescan topology function.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen a97fa58680 scsi: hisi_sas: add host reset interface for test
Add host reset interface to make it easier for testing the host reset
feature.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:11 -04:00
Luo Jiaxing 0e83fc61ee scsi: hisi_sas: Add softreset in hisi_sas_I_T_nexus_reset()
We found out that for v2 hw, a SATA disk can not be written to after the
system comes up.

In commit ffb1c820b8 ("scsi: hisi_sas: remove the check of sas_dev status
in hisi_sas_I_T_nexus_reset()"), we introduced a path where we may issue an
internal abort for a SATA device, but without following it with a
softreset.

We need to always follow an internal abort with a software reset, as per HW
programming flow, so add this.

Fixes: ffb1c820b8 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()")
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-20 14:24:04 -04:00