There is an issue that hisi_sas_dev.running_req is not
decremented properly for internal abort and TMF.
To resolve, only decrement running_req in hisi_sas_slot_task_free().
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a potential probe issue in how we trigger the hw initialisation.
Although we use a 1s timer to delay hw initialisation, it is still
possible that sas_register_ha() has not finished before we start
the PHY init from hw->hw_init().
To avoid this issue, initialise the hw after sas_register_ha() in the
same probe context.
Note: it is not necessary to use 1s timer now (modified v2 hw only).
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Downgrade the exit print in hisi_sas_internal_task_abort()
to dbg level, as info is not required.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correctly set registers in v2 for root PHY hardreset for directly
attached disk.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The message to inform that the controller has no refclk
is currently at warning level, which is unnecessary, so
downgrade to debug.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Set the SMP connection timeout and continue AWT timer;
clear the ITCT table when the device is gone.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The v2 SAS controller needs more time than a SATA disk does to detect
channel idle and send a setup link request, so it is difficult for the
SAS controller to set up an STP link. This may cause some IO timeouts.
We need to periodically configure the SAS controller so it
doesn't receive STP setup requests from SATA disks for a while,
so IO can be sent during this period.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When we call hisi_sas_slot_task_free() we should grab the hisi_hba.lock,
as hisi_sas_slot_task_free() accesses common hisi_hba elements.
Function hisi_sas_slot_abort() is missing this, so add it.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a bug in the current driver in that certain hisi_hba and port
structure elements which we access when servicing the CQ interrupt do
not use thread-safe accesses; these include hisi_sas_port linked-list of
active slots (hisi_sas_port.entry), bitmap of currently allocated IPTT
(in hisi_hba.slot_index_tags), and completion queue read pointer.
As a solution, lock these elements with the hisi_hba.lock.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently all the slot processing for the completion queue is done
in ISR context. It is judged that the slot processing can take a long
time, especially when a SATA NCQ completes (up to 32 slots).
So, as a solution, defer the bulk of the ISR processing to tasklet
context. Each CQ will have its own tasklet.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the hip06 and hip07 SoCs, the interrupt lines from the SAS
controllers are connected to mbigen hw module [1]. The mbigen module is
probed with module_init, and, as such, is not guaranteed to probe before
the SAS driver. So we need to support deferred probe.
We check for probe deferral in the hw layer probe, so that we do not
probe into the main layer and allocate shost, memories, etc., only to
learn later that we need to defer the probe.
[1] ./Documentation/devicetree/bindings/interrupt-controller/hisilicon,mbigen-v2.txt
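The ordering described above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: every `example_*` name is hypothetical, the IRQ lookup is simulated, and `EX_EPROBE_DEFER` simply mirrors the kernel's `EPROBE_DEFER` value.

```c
#include <assert.h>

#define EX_EPROBE_DEFER 517	/* mirrors the kernel's EPROBE_DEFER value */

int example_allocs;		/* counts allocations done by the main layer */

/* Hypothetical IRQ lookup: fails while the interrupt parent is not probed. */
int example_get_irq(int irq_parent_ready)
{
	return irq_parent_ready ? 42 : -EX_EPROBE_DEFER;
}

/* Stands in for the main layer probe, which allocates shost, memories, etc. */
int example_main_probe(void)
{
	example_allocs++;
	return 0;
}

/* hw layer probe: check for deferral before calling into the main layer. */
int example_hw_probe(int irq_parent_ready)
{
	int irq = example_get_irq(irq_parent_ready);

	if (irq < 0)
		return irq;	/* defer before anything is allocated */

	return example_main_probe();
}
```

With this ordering a deferred probe returns early and leaves nothing to unwind.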
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch addresses 4 problems in the module probe/remove:
- When hisi_sas_shost_alloc() fails after we alloc shost memory, we
should free shost memory before the function returns.
- When hisi_sas_probe() fails after we alloc the HBA memories, we
should also free the HBA memories.
- We should free shost memory at the end of hisi_sas_remove().
- sha->core.shost is set twice, so remove extra set.
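The first three points follow the usual goto-based unwind pattern. The sketch below is a user-space illustration, not the driver code: the `example_*` and `ex_*` names are hypothetical, and malloc()/free() stand in for the shost and HBA memory allocations.

```c
#include <assert.h>
#include <stdlib.h>

int example_live_allocs;	/* outstanding allocations in this sketch */

static void *ex_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		example_live_allocs++;
	return p;
}

static void ex_free(void *p)
{
	if (p) {
		free(p);
		example_live_allocs--;
	}
}

/* Returns 0 on success; on failure, frees everything it allocated. */
int example_probe(int fail_late, void **shost_out, void **hba_mem_out)
{
	void *shost, *hba_mem;

	shost = ex_alloc(64);
	if (!shost)
		return -1;

	hba_mem = ex_alloc(256);
	if (!hba_mem)
		goto err_free_shost;

	if (fail_late)		/* simulate a failure after both allocations */
		goto err_free_hba_mem;

	*shost_out = shost;
	*hba_mem_out = hba_mem;
	return 0;

err_free_hba_mem:
	ex_free(hba_mem);
err_free_shost:
	ex_free(shost);
	return -1;
}

/* Remove path: release the HBA memories and, last, the shost memory. */
void example_remove(void *shost, void *hba_mem)
{
	ex_free(hba_mem);
	ex_free(shost);
}
```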
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are some typos where we intended "<<" but have "<". Seems likely
to cause a bunch of problems.
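To illustrate why this class of typo matters, the sketch below contrasts the two operators. The macro and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical bit position, for illustration only. */
#define EXAMPLE_ERR_OFF 3

/* Intended: a mask with bit 3 set. */
unsigned int example_mask_intended(void)
{
	return 1u << EXAMPLE_ERR_OFF;
}

/* Typo: a comparison, which evaluates to 1 because 1 is less than 3. */
unsigned int example_mask_typo(void)
{
	return 1u < EXAMPLE_ERR_OFF;
}

void example_check_masks(void)
{
	assert(example_mask_intended() == 0x8u);
	assert(example_mask_typo() == 0x1u);
}
```

Both forms compile, so the mistake only shows up as the wrong bit being tested or set.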
Fixes: d3b688d3c6 ("scsi: hisi_sas: add v2 hw support for ECC and AXI bus fatal error")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the function to set PHY min and max linkrate through
sysfs interface.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sometimes the value of hisi_sas_device.running_req
would go negative unless we have the check for
running_req >= 0 before trying to decrement.
This is because using running_req is not thread-safe.
As such, the value for running_req may be actually incorrect,
so use atomic64_t instead.
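As a rough user-space analogue, the counter can be modelled with C11 atomics. This is a sketch only: the kernel uses atomic64_t and its own helpers rather than <stdatomic.h>, and the `example_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-device state; not the driver's struct. */
struct example_dev {
	atomic_llong running_req;
};

void example_dev_init(struct example_dev *dev)
{
	atomic_init(&dev->running_req, 0);
}

/* Each update is a single indivisible read-modify-write. */
void example_req_start(struct example_dev *dev)
{
	atomic_fetch_add(&dev->running_req, 1);
}

void example_req_done(struct example_dev *dev)
{
	atomic_fetch_sub(&dev->running_req, 1);
}

long long example_req_count(struct example_dev *dev)
{
	return atomic_load(&dev->running_req);
}
```

Because increments and decrements cannot interleave part-way, concurrent callers cannot lose an update, which is what made the plain counter drift.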
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check the ERR bit of status to decide whether there is something wrong
with the initial register-D2H FIS. If an error exists, PHY-reset the
channel to restart OOB.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Modify and add some SATA commands according to SATA protocol.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Delete repeated configuration items for hisi_sas_device() when
we free a device. These items are now only set in
hisi_sas_dev_gone().
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sas_scsi_find_task() only deals with return value
TMF_RESP_FUNC_FAILED/TMF_RESP_FUNC_SUCC/TMF_RESP_FUNC_COMPLETE of
query task. So for LLDD errors just return TMF_RESP_FUNC_FAILED.
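A minimal sketch of this normalisation is shown below. The constant values are illustrative placeholders (the real ones are defined by libsas) and `example_query_task_result()` is a hypothetical helper, not a driver function.

```c
#include <assert.h>

/* Illustrative values; the real constants are defined by libsas. */
enum {
	EX_TMF_RESP_FUNC_COMPLETE = 0x00,
	EX_TMF_RESP_FUNC_FAILED   = 0x05,
	EX_TMF_RESP_FUNC_SUCC     = 0x08,
};

/* Pass through the three values the caller understands; fold the rest. */
int example_query_task_result(int rc)
{
	switch (rc) {
	case EX_TMF_RESP_FUNC_COMPLETE:
	case EX_TMF_RESP_FUNC_SUCC:
	case EX_TMF_RESP_FUNC_FAILED:
		return rc;
	default:
		/* for example, a negative error code from the LLDD */
		return EX_TMF_RESP_FUNC_FAILED;
	}
}
```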
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When we form a wideport, we should use hardware PHY port_id instead
of sas_phy->id.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are many BROADCAST primitives generated by the host.
We are currently only interested in BROADCAST (CHANGE) primitives,
so only process these.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently slots are allocated from queues in a round-robin fashion.
This causes a problem for internal commands in device mode. For this
mode, we should ensure that the internal abort command is the last
command seen in the host for that device. We can only ensure this when
we place the internal abort command after the preceding commands for
that device in the same queue, as there is no order in which the host
will select a queue to execute the next command.
This queue restriction makes supporting scsi mq more tricky in
the future, but should not be a blocker.
Note: Even though v1 hw does not support internal abort, the
allocation method is chosen to be the same for consistency.
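One way to obtain this ordering is a fixed device-to-queue mapping, sketched below. The modulo mapping and the `example_pick_queue()` name are assumptions for illustration; the text above does not state which mapping the driver uses.

```c
#include <assert.h>

/* Hypothetical mapping: every command for a device lands in one queue. */
unsigned int example_pick_queue(unsigned int device_id,
				unsigned int queue_count)
{
	assert(queue_count > 0);
	return device_id % queue_count;
}
```

Since the result depends only on the device, an internal abort for that device is queued behind all of its earlier commands.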
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For a 1-bit ECC error, the logic can recover, so we only print
a warning.
For multi-bit ECC and AXI bus fatal errors, we panic.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The hip06 D03 and hip07 D05 boards have different reference clock
frequencies for the SAS controller.
Register PHY_CTRL needs to be programmed differently according to this
frequency, so add support for this.
The default register setting in PHY_CTRL is for 50MHz, so only update
this register when the refclk frequency is 66MHz.
For ACPI we expect the _RST handler to set the correct value for
PHY_CTRL (we're forced to take a different approach for DT and ACPI as
ACPI does not support fixed-clock devices).
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the v2 hw is attached to many disks through an expander, there
may be an OOB reset resulting in a PHY going down after the speed is
negotiated (very low probability).
This issue is resolved by modifying the link control registers to send
three identify frames before the PHY is ready (according to 6.10.3.3.2
in SAS 3.0 spec) and close ready when the PHY is down.
Signed-off-by: NengLong Zhao <zhaonenglong@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In setup_itct_v2_hw(), SATA device type SAS_SATA_PENDING is missing, so
add it.
Note: The HiSi SAS controller does not support SATA PM, so do not handle
SAS_SATA_PM_PORT or SAS_SATA_PM.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Function config_id_frame_v1_hw() is called twice for each PHY during
initialisation, which is unneeded.
So remove init_id_frame_v1_hw(), which only calls
config_id_frame_v1_hw().
We will keep the call to config_id_frame_v1_hw() in start_phy_v1_hw()
since it will be used for PHY reset functions.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Function config_id_frame_v2_hw() is called twice for each PHY during
initialisation, which is unneeded.
So remove init_id_frame_v2_hw(), which only calls
config_id_frame_v2_hw().
We will keep the call to config_id_frame_v2_hw() in start_phy_v2_hw()
since it will be used for PHY reset functions.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The endianness for the SAS address in the TX_ID_DWORD registers is set
incorrectly. We see errors like this in the boot log for v2 hw (which
would have the same issue as v1 hw):
[ 7.583284] sas: target proto 0x0 at 50000d1108e7923f:0x1f not handled
This is due to the host SAS addr not matching the PHY SAS addr in the
expander host-attached phy discovery responses.
To fix, we byte swap the SAS addr from BE to LE (which is the endianness
of the SAS controller).
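The byte swap itself can be sketched in portable C as below. `example_swab64()` is a hypothetical helper written for illustration (the kernel has its own byte-order helpers), and the address from the log line above is used purely as a sample value.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
uint64_t example_swab64(uint64_t v)
{
	uint64_t out = 0;
	int i;

	for (i = 0; i < 8; i++) {
		out = (out << 8) | (v & 0xff);	/* append the lowest byte */
		v >>= 8;
	}
	return out;
}

void example_check_swab64(void)
{
	assert(example_swab64(0x50000d1108e7923fULL) == 0x3f92e708110d0050ULL);
}
```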
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The endianness for the SAS address in the TX_ID_DWORD registers is set
incorrectly. We see errors like this in the boot log:
[ 7.583284] sas: target proto 0x0 at 50000d1108e7923f:0x1f not handled
This is due to the host SAS addr not matching the PHY SAS addr in the
expander host-attached phy discovery responses.
To fix, we byte swap the SAS addr from BE to LE (which is the endianness
of the SAS controller).
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The device DMA mask was being set after the bulk of the DMA allocations
in the driver init, so those DMA allocations could potentially fail. To
resolve, set the DMA mask before allocating the DMA memory when
initialising the driver.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If hisi_sas_task_prep() fails for a SATA device due to PHY down, we
return a failure to libata and also call task_done(), which will cause
ata_qc_complete() to be called twice:
- first call from hisi_sas_task_prep(), which will clear flag
ATA_QCFLAG_ACTIVE
- ata_qc_complete() called from libata
The warning call trace is as follows:
[ 117.070206] [<ffff0000084f59b0>] __ata_qc_complete+0xf4/0x11c
[ 117.070208] [<ffff0000084f5b58>] ata_qc_complete+0x180/0x200
[ 117.070210] [<ffff0000084f5dd0>] ata_qc_issue+0x110/0x354
[ 117.070212] [<ffff0000084f6254>] ata_exec_internal_sg+0x240/0x4d0
[ 117.070214] [<ffff0000084f6544>] ata_exec_internal+0x60/0xa0
[ 117.070217] [<ffff000008501580>] ata_read_log_page+0x188/0x1b4
[ 117.070218] [<ffff0000085017dc>] ata_eh_analyze_ncq_error+0xa8/0x274
[ 117.070220] [<ffff000008501a3c>] ata_eh_link_autopsy+0x94/0x8c8
[ 117.070222] [<ffff0000085022a4>] ata_eh_autopsy+0x34/0xe8
[ 117.070223] [<ffff00000850540c>] ata_do_eh+0x28/0xc0
[ 117.070225] [<ffff0000085054e0>] ata_std_error_handler+0x3c/0x84
[ 117.070227] [<ffff000008505140>] ata_scsi_port_error_handler+0x480/0x674
[ 117.070230] [<ffff0000084e3020>] async_sas_ata_eh+0x44/0x78
[ 117.070231] [<ffff0000080d6b8c>] async_run_entry_fn+0x40/0x104
[ 117.070234] [<ffff0000080ce518>] process_one_work+0x128/0x2f0
[ 117.070235] [<ffff0000080ce738>] worker_thread+0x58/0x434
[ 117.070237] [<ffff0000080d416c>] kthread+0xd4/0xe8
[ 117.070240] [<ffff000008084e10>] ret_from_fork+0x10/0x40
The issue is resolved by simply returning a failure status code to the
upper layer.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In function phy_up_v2_hw(), we needlessly recalculate the phy linkrate
for all phys, and the calculation is incorrect for phy8, so remove this
code.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The Delivery queue enable register should only be written to once at
reset for v2 hw.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The memory calculation for the tags bitmap should use the BITS_PER_BYTE
macro instead of sizeof(unsigned long), which only coincidentally has
the same value.
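The sizing arithmetic can be sketched as follows. `EX_BITS_PER_BYTE` and `example_bitmap_bytes()` are hypothetical names used for illustration; the exact expression in the driver is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

#define EX_BITS_PER_BYTE 8

/* Bytes needed to hold nbits bits, rounded up to a whole byte. */
size_t example_bitmap_bytes(size_t nbits)
{
	return (nbits + EX_BITS_PER_BYTE - 1) / EX_BITS_PER_BYTE;
}
```

Dividing by sizeof(unsigned long) gives the same number only where that type happens to be 8 bytes wide; where it is 4 bytes the result would differ.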
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the slot memory is zeroed when it is freed and also when it is
reused, like in hisi_sas_task_prep(). Optimise by avoiding the redundant
zeroing in the free.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
None of the CHL_INT2 interrupts are serviced in the channel irq ISR, so
leave the interrupt source masked. The interrupt mask is initially set
in init_reg_v2_hw().
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Optimise by saving an avoidable read in the get_free_slot function. The
delivery queue write pointer will only be updated by software, so don't
bother re-reading what was already written in the previous call to
start_delivery function.
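The idea can be modelled in user-space C as below. This is a sketch under assumptions: the structure, its fields and the `example_*` functions are hypothetical, and a plain struct member stands in for the hardware register.

```c
#include <assert.h>

/* Hypothetical model of a delivery queue; not the driver's structures. */
struct example_dq {
	unsigned int hw_wr_reg;	/* stands in for the hardware register */
	unsigned int cached_wr;	/* software copy of the last value written */
	unsigned int reg_reads;	/* counts simulated register reads */
	unsigned int depth;	/* number of entries in the queue */
};

/* Old approach: re-read the write pointer from the register each time. */
unsigned int example_next_slot_reread(struct example_dq *dq)
{
	dq->reg_reads++;
	return dq->hw_wr_reg;
}

/* New approach: software is the only writer, so use the cached copy. */
unsigned int example_next_slot_cached(const struct example_dq *dq)
{
	return dq->cached_wr;
}

/* Advance the write pointer and remember what was written. */
void example_start_delivery(struct example_dq *dq)
{
	dq->cached_wr = (dq->cached_wr + 1) % dq->depth;
	dq->hw_wr_reg = dq->cached_wr;
}
```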
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Optimise by saving an avoidable read in the cq interrupt. The queue
read pointer will only be updated by software, so don't bother
re-reading what was already written in the previous interrupt.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a tmf is issued, various response codes can be returned from the
target. For a query tmf the response may be TMF_RESP_FUNC_COMPLETE or
TMF_RESP_FUNC_SUCC. Add a condition for TMF_RESP_FUNC_SUCC to
hisi_sas_exec_internal_tmf_task(). This affects query tmf, as previously
a successful result was returned as a failure.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the port is detached we cannot execute a TMF, as there can be no
device attached to the port.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add code in slot_complete_v2_hw() to deal with the slots which have
completed due to internal abort.
The status codes have the following meaning:
- STAT_IO_ABORTED: the IO has been aborted due to internal abort,
whether by device or individual abort command
- STAT_IO_COMPLETE: internal abort command has completed successfully
for device or individual abort command
- STAT_IO_NO_DEVICE: internal abort command has completed for device but
cannot find any IO
- STAT_IO_NOT_VALID: internal abort command has completed for single
command but could not find the command
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add function to prepare an internal abort for v2 hw.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Execute an internal abort when executing a task abort.
This covers the case where the command is still present
in the host when the abort is executed.
For a SATA internal abort, we set abort for all tasks
associated with the device.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Execute an internal abort for that device when it is removed, so that
commands for that device are not processed.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add main code for internal abort functionality.
The internal abort feature allows the host controller to abort commands
which are still active in the controller but have not yet been sent to
the slave device.
Typically a command only spends a relatively short time in the
controller when compared to the amount of the time after it is sent to
the slave device.
Two modes of internal abort are supported:
- device
- individual command
For device, when the internal abort is issued all commands in the host
for that device are aborted. For a single command, only that command is
aborted if it is still in the host.
In HW the internal abort command is executed similarly to any other sort
of command, like SSP.
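The difference between the two modes can be modelled in plain C as below. This is an illustrative sketch only: the slot structure, the enum and `example_internal_abort()` are hypothetical and do not reflect how the hardware or driver tracks commands.

```c
#include <assert.h>
#include <stddef.h>

enum example_abort_mode { EX_ABORT_DEVICE, EX_ABORT_SINGLE };

/* Hypothetical model of a command slot; not the driver's structures. */
struct example_slot {
	int device_id;
	int tag;
	int in_host;	/* 1 while the command has not been sent to the device */
	int aborted;
};

/* Abort matching commands still in the host; returns how many were hit. */
int example_internal_abort(struct example_slot *slots, size_t n,
			   enum example_abort_mode mode,
			   int device_id, int tag)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct example_slot *s = &slots[i];

		if (!s->in_host || s->aborted || s->device_id != device_id)
			continue;
		/* single mode only touches the one requested command */
		if (mode == EX_ABORT_SINGLE && s->tag != tag)
			continue;
		s->aborted = 1;
		count++;
	}
	return count;
}
```

Commands that have already left the host are skipped in both modes, matching the description above.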
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>