When adding/removing slots from device list, we need to lock this
operation with hisi_hba lock for safety.
This patch adds missing instances of this.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We used spin_lock() to grab hisi_hba.lock in two places where
spin_lock_irqsave() should be used, as hisi_hba.lock can be taken in
interrupt context.
This patch is to fix this.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When an internal abort times out in hisi_sas_internal_task_abort(),
goto the exit label in and not go through the other task status
checks.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We already relocated hisi_sas_get_ncq_tag() into common file main.c,
so delete get_ncq_tag_v3_hw() and use hisi_sas_get_ncq_tag() instead.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove unused variable sas_ha to clean up build warning
"unused variable sas_ha [-Wunused-variable]"
Fixes: 042ebd293b ("scsi: libsas: kill useless ha_event and do some cleanup")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The ha_event now has only one event HAE_RESET, and this event does
nothing. Kill it and do some cleanup.
This is a preparation for enhance libsas hotplug feature in the next
patches.
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The bus reset handler is calling I_T Nexus reset, which logically is a
target reset as it need to specify both the initiator and the target.
So move it to target reset.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver version is not updated with changes to the driver, so it has
no value, so just get rid of it.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instances of kfree(shost) should be replaced with scsi_host_put().
In addition, a missing scsi_host_put() is added for error path in
hisi_sas_shost_alloc_pci() and v3 driver removal.
Signed-off-by: Pan Bian <bianpan2016@163.com> # For main.c changes
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Variable res only holds value 0, so remove it.
This cleans up a coccicheck warning.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add function to set linkrate for v3 hw.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch updates some register setting according to recommendation
from HW designer and experiment.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use ACPI "_RST" method to reset the controller, since FLR is not
supported.
Function hisi_sas_stop_phys() is introduced to remove some code
duplication.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds calls to kill CQ takslets v3 hw during probe failure.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The channel interrupt is to process all the interrupts except PHY
UP/DOWN and broadcast interrupt. So we need to clear all the interrupts
except those 3 interrupts after processing channel interrupts.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Modify STP link timer from 10ms to 500ms. Also add the register address.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For v3 hw, internal abort function required status and command buffer to
be set, so add necessary code for this.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add two ATA commands, ATA_CMD_ZAC_MGMT_IN and ATA_CMD_ZAC_MGMT_OUT in
hisi_sas_get_ata_protocol(), to support SATA SMR disk.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch is a fix related to freeing a device in v2 hw driver.
Before, we polled to ITCT CLR interrupt to check if a device is free.
This was error prone, as if the interrupt doesn't occur in 10us, we miss
processing it.
To avoid this situation, service this interrupt and sync the event with
a completion.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds support to clean-up allocated IRQs and kill tasklets
when probe fails and for driver removal.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch removes some repeated configurations:
(1) The device id of the device is already set in the alloc function, so
we don't need to modify in free device function.
(2) Field dev_type and dev_status are configured in hisi_sas_dev_gone(),
so there is no need for repeated config in free_device_v3_hw.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The code to print ECC errors in v2 hw driver is very repetitive. This
patch condensed the code by looping an array of errors.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add DFX feature for v2 hw. We are adding support for
the following errors:
- loss_of_dword_sync_count
- invalid_dword_count
- phy_reset_problem_count
- running_disparity_error_count
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The value dw0 is the residual bytes when UNDERFLOW error happens, but we
filled the residual with the value of dw3 before. So change the residual
from dw3 to dw0.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When some interrupts happen together, we need to process every interrupt
one-by-one, and should not return immediately when one interrupt process
is finished being processed.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch provides fixes for the following issues:
1. Fix issue of controller reset required to send commands. For reset
process, it may be required to send commands to the controller, but
not during soft reset. So add HISI_SAS_NOT_ACCEPT_CMD_BIT to prevent
executing a task during this period.
2. Send a broadcast event in rescan topology to detect any topology
changes during reset.
3. Previously it was not ensured that libsas has processed the PHY up
and down events after reset. Potentially this could cause an issue
that we still process the PHY event after reset. So resolve this by
flushing shot workqueue in LLDD reset.
4. Port ID requires refresh after reset. The port ID generated after
reset is not guaranteed to be the same as before reset, so it needs
to be refreshed for each device's ITCT.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don't populate various tables on the stack but make them static const.
Makes the object code smaller by over 280 bytes:
Before:
text data bss dec hex filename
39887 5080 64 45031 afe7 hisi_sas_v2_hw.o
After:
text data bss dec hex filename
39318 5368 64 44750 aece hisi_sas_v2_hw.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently we allocate 3 sets of DMA memories from separate pools for
each slot. This is inefficient in terms of memory usage
(buffers are less than 1 page in size, so we lose due to alignment),
and also time spent in doing 3 allocations + de-allocations per slot,
instead of 1.
To optimise, combine the 3 DMA buffers into a single buffer from a
single pool.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Element phy_type is a bitmask and it only ever has 2 bits possibly set,
and it is overkill to define as a u64, so redefine as a u32.
This change resolves static code check complaint that "phy->phy_type &=
~PORT_TYPE_SAS;" would unintentionally clear the high 32 bits as well.
Structure hisi_sas_phy is also reordered to ensure packing efficiency.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a change for abort dev for v3 hw: add registers to configure
unaborted iptt for a device, and then inform this to logic.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to fill the interface of phy_hard_reset, phy_get_max_linkrate,
and phy enable/disable.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code for interface get_wideport_bitmap.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to prepare internal abort command.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to itct setup and free for v3 hw.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to prepare ATA frame for v3 hw
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to prepare SMP frame.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to prepare SSP frame and deliver it to hardware.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add v3 cq interrupt handler slot_complete_v3_hw().
Note: The slot error handling needs to be further refined in the future
to examine all fields in the error record, and handle appropriately,
instead of current solution - just report SAS_OPEN_REJECT.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to initialise interrupts and add some interrupt handlers.
Also add function hisi_sas_v3_destroy_irqs() to clean-up irqs upon
module unloading.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to configure PHYs for v3 hw.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code to initialise v3 hardware.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the code to initialise the controller which is based on pci device
in hisi_sas_v3_hw.c
The core controller routines are still in hisi_sas_main.c; some common
initialisation functions are also exported from hisi_sas_main.c
For pci-based controller, the device properties, like phy count and sas
address are read from the firmware, same as platform device-based
controller.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add skeleton driver for v3 hw in hisi_sas_v3_hw.c
File hisi_sas_v3_hw.c will serve 2 purposes:
- probing and initialisation of the controller based on pci device
- hw layer for v3-based controllers
The controller design is quite similar to v2 hw in hip07.
However key differences include:
-All v2 hw bugs are fixed (hopefully), so workarounds are not required
-support for device deregistration
-some interrupt modifications
-configurable max device support
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the functionality to retrieve the fw info into a dedicated device
type-agnostic function, hisi_sas_get_fw_info().
The reasoning is that this function will be required for future
pci-based platforms.
Also add some debug logs for failure.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since hip08 SAS controller is based on pci device, add hisi_hba.pci_dev
for hip08 (will be v3), and also rename hisi_hba.pdev to .platform_dev
for clarity.
In addition, for common code which wants to reference the controller
device struct, add hisi_hba.dev, and change the common code to use it.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Relocate get_ncq_tag_v2_hw() to a common location, as future hw versions
will require it. Also rename with "hisi_sas_" prefix for consistency.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Relocate get_ata_protocol() to a common location, as future hw versions
will require it. Also rename with "hisi_sas_" prefix for consistency.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Relocate get_ata_protocol() to a common location, as future hw versions
will require it. Also rename with "hisi_sas_" prefix for consistency.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently hisi_hba.lock is locked to deliver and receive a command
to/from any hw queue. This causes much contention at high data-rates.
To boost performance, lock on a per queue basis for sending and
receiving commands to/from hw.
Certain critical regions still need to be locked in the delivery and
completion stages with hisi_hba.lock.
New element hisi_sas_device.dq is added to store the delivery queue for
a device, so it does not need to be needlessly re-calculated for every
task.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently hisi_sas_device.device_id is a u64. This can create a problem
in selecting the queue for a device, in that this code does a 64b
division on device id. For some 32b systems, 64b division is slow and
the lib reference must be explicitly included.
The device id does not need to be 64b in size, so, as a solution, just
make as an int.
Also, struct hisi_sas_device elements are re-ordered to improve packing
efficiency.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We need to check for timeout before task status, or the task will be
mistook as completed internal abort command. Also add protection for
sas_task.task_state_flags in hisi_sas_tmf_timedout().
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add null check before indirectly dereferencing pointer task->lldd_task
in statement u32 tag = slot->idx;
Addresses-Coverity-ID: 1373843
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move scsi_remove_host call into sas_remove_host and remove it from SAS
HBA drivers, so we don't mess up the ordering. This solves an issue with
double deleting sysfs entries that was introduced by the change of sysfs
behaviour from commit bcdde7e221 ("sysfs: make __sysfs_remove_dir()
recursive").
[mkp: addressed checkpatch complaints]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Suggested-by: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: James Bottomley <jejb@linux.vnet.ibm.com>
Cc: Jinpu Wang <jinpu.wang@profitbricks.com>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jinpu Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For 1 bit ECC errors, those errors can be recovered by hw. But for
multi-bits ECC and AXI errors, there are something wrong with whole
module or system, so try reset the controller to recover those errors
instead of calling panic().
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a TMF timeouts (maybe due to unlikely scenario of an expander being
unplugged when TMF for remote device is active), when we eventually try
to free the slot, we crash as we dereference the slot's task, which has
already been released.
As a fix, add checks in the slot release code for a NULL task.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch is a workaround for a SoC bug where an internal abort command
may timeout. In v2 hw, the channel should become idle in order to finish
abort process. If the target side has been sending HOLD, host side
channel failed to complete the frame to send, and can not enter the idle
state. Then internal abort command will timeout.
As this issue is only in v2 hw, we deal with it in the hw layer. Our
workaround solution is: If abort is not finished within a certain period
of time, we will check HOLD status. If HOLD has been sending, we will
send break command.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds a workaround solution for a SoC bug which may cause SoC
logic fatal error when disabling a PHY. Then we find internal abort IO
timeout may occur, and the controller IO breakpoint may be corrupted.
We work around this SoC bug by optimizing the flow of disabling a PHY.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch provides a workaround a SoC bug where SATA IPTTs for
different devices may conflict.
The workaround solution requests the following:
1. SATA device id must be even and not equal to SAS IPTT.
2. SATA device can not share the same IPTT with other SAS or
SATA device.
Besides we shall consider IPTT value 0 is reserved for another SoC bug
(STP device open link at firstly after SAS controller reset).
To sum up, the solution is: Each SATA device uses independent and
continuous 32 even IPTT from 64 to 4094, then v2 hw can only support 63
SATA devices. All SAS device(SSP/SMP devices) share odd IPTT value from
1 to 4095.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After resetting the controller, the process of scanning SATA disks
attached to an expander may fail occasionally. The issue is that the
controller can't close the STP link created by target if the max link
time is 0.
To workaround this issue, we reject STP link after resetting the
controller, and change the corresponding PHY to accept STP link only
after receiving data.
We do this check in cq interrupt handler. In order not to reduce
efficiency, we use an variable to control whether we should check and
change PHY to accept STP link.
The function phys_reject_stp_links_v2_hw() should be called after
resetting the controller.
The solution of another SoC bug "SATA IO timeout", that also uses the
same register to control STP link, is not effective before the PHY
accepts STP link.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Removing the 'select SCSI_SAS_LIBSAS' statement in Kconfig resulted in a
link failure in configurations that have hisi_sas built-in but libsas as
a loadable module:
drivers/scsi/built-in.o: In function `hisi_sas_scan_finished':
hisi_sas_main.c:(.text+0x37ce9): undefined reference to `sas_drain_work'
drivers/scsi/built-in.o: In function `hisi_sas_slave_configure':
hisi_sas_main.c:(.text+0x37d17): undefined reference to `sas_slave_configure'
hisi_sas_main.c:(.text+0x37d40): undefined reference to `sas_change_queue_depth'
drivers/scsi/built-in.o: In function `hisi_sas_remove':
All other libsas users have the 'select' statement, so we should do the
same here for consistency. For all I can tell, the patch that added the
sata softreset does not actually introduce a dependency on SCSI_SAS_ATA
but instead adds calls into libata itself, so we can express that with a
more specific dependency.
We cannot have 'select SCSI_SAS_LIBSAS; depends on SCSI_SAS_ATA' as that
would cause a dependency loop.
Fixes: 7c594f0407 ("scsi: hisi_sas: add softreset function for SATA disk")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It appears that a break in the TRANS_TX_OPEN_CNX_ERR_NO_DESTINATION case
got accidentally removed in an earlier commit, as it stands, the
ts->stat and ts->open_rej_reason are being updated twice for this case
which looks incorrect. Fix this by adding in the missing break
statement.
Detected by CoverityScan, CID#1422110 ("Missing break in switch")
Fixes: 634a9585f4 ("scsi: hisi_sas: process error codes according to their priority")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper function is_sata_phy_v2_hw() to judge whether the attached
device is SATA disk for a root PHY.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When SMP IO is sent, sas_protocol_ata couldn't judge whether the disk is
SATA or SAS disk. So use dev_is_sata to identify SATA or SAS disk.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unless we actually get some sort of failure in hisi_sas_lu_reset(),
don't print a message.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When an SMP task timeouts, it will call lldd_abort_task to release the
associated slot, and then will release the sas_task.
Currently in lldd_abort_task, if we fail to internally abort IO, then
the slot of SMP IO is not released, but sas_task will still be later
released, so the slot's sas_task is NULL, which will cause NULL pointer
when hisi_sas_slot_task_free happens later.
To resolve, check the return value of internal abort, and release the
slot if it failed.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add function for upper-layer to reset controller when all else fails.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For consistency, remove the "hisi_sas_" prefix.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Handle the situation that PHY UP and DOWN irq happen simultaneously.
There is no mechanism of SoC HW to ensure this situation will never
happen. So, we add this handle just in case.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch includes:
(1) Disable transport layer retry
(2) Support CQ time and count interrupt coal
(3) fix link FIFO full issue
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Zhao Nenglong <zhaonenglong@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are some rules to decide which error code has the high priority
when errors happen together:
(1) Error phase of CQ decides the error happens on RX or TX;
(2) For TX error, when DMA/TRANS TX error happen simultaneously, the
priority of DMA TX error is higher than TRANS TX error, so for the
priority of TX error: DW2 (DMA TX part) > DW0;
(3) For RX error, when TRANS/DMA/SIPC RX error happen simultaneously,
the priority of TRANS RX error is higher than DMA and SIPC RX error,
and we should also keep the rules (the priority of DW3 > DW2), so
for the priority of RX error: DW1 > DW3 > DW2(SIPC RX part);
(4) There are also a priority we should keep in the same error type.
So, modify slot error code to handle this.
In addition to this, some some error codes are modified according to
recommendation from SoC designer.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a TMF or internal abort times-out, do not free slot. We expect this
to be done upon later escalated error handling.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some more locking needs to be added/modified for when
read-modify-writing sas_task.task_state_flags.
Note: since we can attempt to grab this lock in interrupt
context we should use irq variant of spin_lock.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After hardreset, we clear up IOs of remote disks, so we need to free
those slots in LLDD.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check in slot_complete_v2_hw() for whether a task has already been
completed by upper layer.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When issuing an LU reset for a SATA target, issue an internal abort and
a hard reset.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently an internal abort is executed regardless of the result of the
TMF. We should also check the result of the internal abort to see if we
should free the slot.
So change the status code STAT_IO_COMPLETE to TMF_RESP_FUNC_SUCC,
meaning the slot has been successfully aborted.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For error codes which need abort-and-retry, simulate IO timeout and let
SCSI+ATA layers process those errors.
Previously for SSP, we should try to abort the IO in the LLDD and then
pass back to upper layer, but sometimes this would also error. So
Instead of adding special error handling for this scenario in the LLDD,
allow the upper layer to handle completely.
No performance hit is seen by taking this approach.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We currently do a hard reset for a link reset. Change this to do a link
reset only.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When sas_port is NULL, then return SAS_PHY_DOWN.
In addition, when the sas_dev is gone then explicitly return
SAS_PHY_DOWN.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently when a root PHY is deformed from a asd_sas_port we try to
release the slots in the LLDD, and fail.
Regardless, it is not right to release this early.
This patch removes the deformed function. As it was before, port
deformation is still done in hisi_sas_phy_down().
It would be nice to actually remove the hisi_sas_port_{de}formed() pair,
however we cannot as we need to know the asd_sas_port index libsas has
associated with an asd_sas_phy.
The hw does actually generate a port id for a PHY, but this seems to a
random number, so ignored for this purpose.
This patch also changes the code to link slots to the hisi_sas_device,
and not hisi_sas_port.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add softreset to clear IO after internal abort device for SATA disk.
The SATA error handling for the controller is based on device internal
abort and softreset function.
The controller does not support internal abort for single IO, so we need
to execute internal abort for device.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Relocate the PHY init code from LLDD hw init path to
hisi_sas_scan_start().
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are some scenarios that we need to warm-reset to reset registers
of SAS controller. During reset we disable interrupts/DQs/PHYs, and
after reset we re-init the hardware and rescan the topology to see if
anything changed.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce function to get hisi_sas_port from asd_sas_port.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is an issue that hisi_sas_dev.running_req is not
decremented properly for internal abort and TMF.
To resolve, only decrease running_req in hisi_sas_slot_task_free()
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a potential probe issue in how we trigger the hw initialisation.
Although we use 1s timer to delay hw initialisation, there is still a
potential that sas_register_ha() is not be finished before we start
the PHY init from hw->hw_init().
To avoid this issue, initialise the hw after sas_register_ha() in the
same probe context.
Note: it is not necessary to use 1s timer now (modified v2 hw only).
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Downgrade the exit print in hisi_sas_internal_task_abort()
to dbg level, as info is not required.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correctly set registers in v2 for root PHY hardreset for directly
attached disk.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The message to inform that the controller has no refclk
is currently at warning level, which is unnecessary, so
downgrade to debug.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Set SMP connection timeout and continue AWT timer;
Clear ITCT table when dev gone.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The v2 SAS controller needs more time to detect channel idle
and send setup link request than SATA disk does, so it is
difficult for the SAS controller to setup an STP link. Therefore
it may cause some IO timeouts.
We need to periodically configure the SAS controller so it
doesn't receive STP setup requests from SATA disks for a while,
so IO can be sent during this period.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When we call hisi_sas_slot_task_free() we should grab the hisi_hba.lock,
as hisi_sas_slot_task_free() accesses common hisi_hba elements.
Function hisi_sas_slot_abort() is missing this, so add it.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a bug in the current driver in that certain hisi_hba and port
structure elements which we access when servicing the CQ interrupt do
not use thread-safe accesses; these include hisi_sas_port linked-list of
active slots (hisi_sas_port.entry), bitmap of currently allocated IPTT
(in hisi_hba.slot_index_tags), and completion queue read pointer.
As a solution, lock these elements with the hisi_hba.lock.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the all the slot processing for the completion queue is done
in ISR context. It is judged that the slot processing can take a long
time, especially when a SATA NCQ completes (upto 32 slots).
So, as a solution, defer the bulk of the ISR processing to tasklet
context. Each CQ will have its down tasklet.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the hip06 and hip07 SoCs, the interrupt lines from the SAS
controllers are connected to mbigen hw module [1]. The mbigen module is
probed with module_init, and, as such, is not guaranteed to probe before
the SAS driver. So we need to support deferred probe.
We check for probe deferral in the hw layer probe, so we not probe into
the main layer and allocate shost, memories, etc., to later learn that
we need to defer the probe.
[1] ./Documentation/devicetree/bindings/interrupt-controller/hisilicon,mbigen-v2.txt
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch addresses 4 problems in the module probe/remove:
- When hisi_sas_shost_alloc() fails after we alloc shost memory, we
should free shost memory before the function returns.
- When hisi_sas_probe() fails after we alloc the HBA memories, we
should also free the HBA memories.
- We should free shost memory at the end of hisi_sas_remove().
- sha->core.shost is set twice, so remove extra set.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are some typos where we intended "<<" but have "<". Seems likely
to cause a bunch of problems.
Fixes: d3b688d3c6 ("scsi: hisi_sas: add v2 hw support for ECC and AXI bus fatal error")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the function to set PHY min and max linkrate through
sysfs interface.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sometimes the value of hisi_sas_device.running_req
would go negative unless we have the check for
running_req >= 0 before trying to decrement.
This is because using running_req is not thread-safe.
As such, the value for running_req may be actually incorrect,
so use atomic64_t instead.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check ERR bit of status to decide whether there is something wrong with
initial register-D2H FIS. If error exists, PHY reset the channel to
restart OOB.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>