SATA MICROCODE DOWNALOAD fails on isci driver. After receiving Register
Device to Host (FIS 0x34) frame Initiator resets phy.
In the frame handler routine response (FIS 0x34) was copied into wrong
buffer and upper layer did not receive any answer which resulted in
timeout and reset.
This patch corrects this bug.
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is a large set of updates, mostly for drivers (qla2xxx [including support
for new 83xx based card], qla4xxx, mpt2sas, bfa, zfcp, hpsa, be2iscsi, isci,
lpfc, ipr, ibmvfc, ibmvscsi, megaraid_sas). There's also a rework for tape
adding virtually unlimited numbers of tape drives plus a set of dif fixes for
sd and a fix for a live lock on hot remove of SCSI devices.
This round includes a signed tag pull of isci-for-3.6
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJQaqCFAAoJEDeqqVYsXL0MKJ4IALg/Obnk0/fNvBUNIrh5zRmj
r9UlXFJnlEDT03qRGdn8okgWMChbgaD1ZrwDTQnjNsabVQoTXI6oO6/uL2c8crpY
BFBwJvkNJS99nbcZv10CpJ3K7ykmRnKlkYon12iknhGwdtU+XJ14Z4PUcZkI9jmg
sBQQ6uNVWyosaONNE+k6o+dw6OTttJkzRX8e9in3thstxNTcG+h9iB1zZ/ETkSEj
tD4MyOgDiPf8kPV2awQThQGpni9Tu3SQr5dEn/iUUktUjiYsDNQuyaAk+QzyhUU7
D35iIJnIHlXTSTMQkrG4qpJHBvqPkWlYJzaOmheQryQ3vzp2C5Ly/hS9il45uIQ=
=49u9
-----END PGP SIGNATURE-----
Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull first round of SCSI updates from James Bottomley:
"This is a large set of updates, mostly for drivers (qla2xxx [including
support for new 83xx based card], qla4xxx, mpt2sas, bfa, zfcp, hpsa,
be2iscsi, isci, lpfc, ipr, ibmvfc, ibmvscsi, megaraid_sas).
There's also a rework for tape adding virtually unlimited numbers of
tape drives plus a set of dif fixes for sd and a fix for a live lock
on hot remove of SCSI devices.
This round includes a signed tag pull of isci-for-3.6
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
Fix up trivial conflict in drivers/scsi/qla2xxx/qla_nx.c due to new PCI
helper function use in a function that was removed by this pull.
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (198 commits)
[SCSI] st: remove st_mutex
[SCSI] sd: Ensure we correctly disable devices with unknown protection type
[SCSI] hpsa: gen8plus Smart Array IDs
[SCSI] qla4xxx: Update driver version to 5.03.00-k1
[SCSI] qla4xxx: Disable generating pause frames for ISP83XX
[SCSI] qla4xxx: Fix double clearing of risc_intr for ISP83XX
[SCSI] qla4xxx: IDC implementation for Loopback
[SCSI] qla4xxx: update copyrights in LICENSE.qla4xxx
[SCSI] qla4xxx: Fix panic while rmmod
[SCSI] qla4xxx: Fail probe_adapter if IRQ allocation fails
[SCSI] qla4xxx: Prevent MSI/MSI-X falling back to INTx for ISP82XX
[SCSI] qla4xxx: Update idc reg in case of PCI AER
[SCSI] qla4xxx: Fix double IDC locking in qla4_8xxx_error_recovery
[SCSI] qla4xxx: Clear interrupt while unloading driver for ISP83XX
[SCSI] qla4xxx: Print correct IDC version
[SCSI] qla4xxx: Added new mbox cmd to pass driver version to FW
[SCSI] scsi_dh_alua: Enable STPG for unavailable ports
[SCSI] scsi_remove_target: fix softlockup regression on hot remove
[SCSI] ibmvscsi: Fix host config length field overflow
[SCSI] ibmvscsi: Remove backend abstraction
...
Pull the trivial tree from Jiri Kosina:
"Tiny usual fixes all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
doc: fix old config name of kprobetrace
fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc
btrfs: fix the commment for the action flags in delayed-ref.h
btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID
vfs: fix kerneldoc for generic_fh_to_parent()
treewide: fix comment/printk/variable typos
ipr: fix small coding style issues
doc: fix broken utf8 encoding
nfs: comment fix
platform/x86: fix asus_laptop.wled_type module parameter
mfd: printk/comment fixes
doc: getdelays.c: remember to close() socket on error in create_nl_socket()
doc: aliasing-test: close fd on write error
mmc: fix comment typos
dma: fix comments
spi: fix comment/printk typos in spi
Coccinelle: fix typo in memdup_user.cocci
tmiofb: missing NULL pointer checks
tools: perf: Fix typo in tools/perf
tools/testing: fix comment / output typos
...
We always assign a dummy task context to a port in order to address a
silicon issue. We have 4 ports per controller. So when idle, there are always
exactly 4 TCs "active". The adaptive interrupt coalescing code uses number of
active TCs to figure out the coalescing values. However, we never hit "0" TCs
because of the 4 dummy TCs. Putting in fix so that we calculate this correctly.
Reported-by: Dan Melnic <dan@seamicro.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This commit fixes a driver bug for SSP tasks that require task management
in the target after they complete in the SCU hardware. The problem was
manifested in the function "isci_task_abort_task", which tests
to see if the sas_task.lldd_task is non-NULL before allowing task
management; this bug would always NULL lldd_task in the SCU I/O completion
path even if target management was required, which would prevent
task / target manangement from happening.
Note that in the case of SATA/STP targets, error recovery is provided by
the libata error handler which is why SATA/STP device recovery worked
correctly even though SSP handling did not.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
1/ Fix the workaround for drives that have a slow response to COMSAS.
Drives with this problem intermittently take a long time to be
identified, or fail to be identified altogether.
2/ A minor fix for the efi variable code failure path
3/ A handful of smatch fixups from Dan Carpenter
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP/GIxAAoJEB7SkWpmfYgCVW8P/iW9RkFx1yNCJGrWYzD/1/TC
HtiO0AYbJQejc7LP04AtbU1+EeH8S13SN1zyWCQIFN+dpxZmTv4mGEUY5Kp91SRf
vttWVqFlkLm2w8+K3Thp31UZWmNFyHguWGCoDQtKfTj3byZoaw7pLEv1KdrKnCbL
q0Mu+Dbsp/GcuulCumjVrGoxMUv++1McVgcqu/Loa2RU5eActGAkJmJpYpZS12uI
HPy/DcQKY0afpcIjNnavwxufqYcCupg5km6gkIsglQGecra7xAL5GICQG8vGYR6m
yt7mK0P8UdNA8r2O64UWstKRwZRn6McabxJoL/ZjYhyap7i1kbGBfu3wXLgqQp+E
L0I5SCDlJm159GDId6cU6XPdWiKvVKaofC5XvAjG5gQZZonqNhdQDqBTfk2+bawF
QEE0tSK6/fTWOS3TaVehmLQjgp7t+4EzBj3P1YfyEBqDh7xEsAjUJKfgyvWdkgDy
MGIPD95e6EAg1NxWpWg1GOdHV33y+Lh16xCj2wjWVQfqkr6CE04Av65kQp/QKlPI
fMXPtUvgp+f0+Pvpo3iviBQSh++fBl3iFkOmEDGlNAi0C56SW3rW2U+0GYMGHhEp
U9FXCBHTjlVEb05/ee7I9tRSFM30w/zRjXY6ubGtyP/MIzYCjvyAEm2Hc5LjhIwA
HNHI8ehCaNywVCFoqCyu
=pwks
-----END PGP SIGNATURE-----
Merge tag 'isci-for-3.6' into for-next
isci update for 3.6
1/ Fix the workaround for drives that have a slow response to COMSAS.
Drives with this problem intermittently take a long time to be
identified, or fail to be identified altogether.
2/ A minor fix for the efi variable code failure path
3/ A handful of smatch fixups from Dan Carpenter
Provide a "simple-dev-pm-ops" implementation that shuts down the domain
and the device on suspend, and resumes the device and the domain on
resume. All of the mechanics of restoring domain connectivity are
handled by libsas once isci has notified libsas that all links should be
back up. libsas is in charge of handling links that did not resume, or
resumed out of order.
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
...now that the strategy handlers guarantee eh context and notify
the driver of bus reset.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
These are __iomem. Sparse complains if we don't have that.
drivers/scsi/isci/phy.c +149 70: warning:
incorrect type in initializer (different address spaces)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Sparse complains that we redeclare this with a different type, because
in the .c file we use an enum and in the .h file we declare the
parameter as a u32. Probably it's best to use an enum in both places.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The following patch is a fix for the WD workaround
COMSAS negation timeout change. This patch disables the
OOB SM when the OOB is placed in reset, which allows
the updated COMSAS negation timeout value to take
effect.
Cc: Dan Thompson <daniel.j.thompson@intel.com>
Reported-by: Dan Thompson <daniel.j.thompson@intel.com>
Signed-off-by: Dave Maurer <david.c.maurer@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The oem parameter image embedded in the efi variable is at an offset
from the start of the variable. However, in the failure path we try to
free the 'orom' pointer which is only valid when the paramaters are
being read from the legacy option-rom space.
Since failure to load the oem parameters is unlikely and we keep the
memory around in the success case just defer all de-allocation to devm.
Cc: <stable@vger.kernel.org>
Reported-by: Don Morris <don.morris@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
While the RNC is suspended for I/O cleanup, the remote device can be
stopped and the RNC setup for destruction. These changes accomodate that
case in the abort path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This fix corrects the saving of resume parameters when the destruction
of the RNC has already been directed, and makes sure not to overwrite
the RNC destruction callbacks.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The RNC state machine would incorrectly transition from
SCI_RNC_AWAIT_SUSPENSION directly to SCI_RNC_INVALIDATING when a destruct
request was made. This would skip the increment of the suspension count
and the abort of pending TCs (although the invalidating state would at
least cleanup outstanding TCs).
Instead, the RNC will transition to SCI_RNC_SUSPENDED and then start the
destruction process.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Since there is a possibilty of a timeout waiting for the RNC suspension,
handle the exit case from the task termination under scic_lock, and leave
the tag allocated if the termination timed-out.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Since the callbacks to libsas now occur under scic_lock, there is no
longer any reason to save the completed requests in a separate list
for completion to libsas.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the link fail path, set IDEV_GONE for every device on the domain
when the last link in the port fails.
In the abort path functions like isci_reset_device, make sure that
there has not already been a detected domain failure with the device
by checking IDEV_GONE, before performing any kind of hard reset, SMP
phy control, or TMF operation.
The check for IDEV_GONE makes sure that the device in the abort path
really has control of the port with which it is associated. This
prevents starting hard resets at incorrect times and scheduling
unnecessary LUN resets for SATA devices.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The ATAPI specific and STP general RNC suspension code had been
incorrectly removed from the remote device code.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Make sure that the wait for suspend can handle the RNC destruction case.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There is an apparent HW lockup caused when the PE is disabled while there
is an outstanding TC in progress. This change puts the link into OOB to
force the TC to end before the PE is disabled.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This change adds timeouts to the RNC suspension wait. It makes the
suspend and resume timeouts the same.
The previous resume timeout of 5 ms was too short, and timeouts were
seen in resumptions of devices in the abort task/LUN reset path - which
would receive an RNC resumed message within a tenth of a second later.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Requests contructed as task management requests need to have the protocol
indicator set so the completion decode can observe any RNC suspension
conditions.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
TMF requests, unlike normal I/O requests, need to handle I/O management
conditions in the completion function because TMFs are not handled in the
completion tasklet.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the case of TMF execution, or device resets, wait for the RNC to fully
resume before returning to the caller. This ensures that the remote
device will not fail I/O requests while waiting for the RNC resumption to
complete.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Instead of immediately transitioning to the SCI_RNC_AWAIT_SUSPENSION
state, handle the SCI_RNC_RESUMING suspend transition from the
SCI_RNC_READY state like the SCI_RNC_INVALIDATING --> SCI_RNC_POSTING
transitions do now, by setting the destination state for the entry
into the READY state.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When an individual request is being terminated, the request's tag
is managed in the terminate function.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch changes the callback mechanism to libsas to only occur while
the scic_lock is held; the abort path cleanup of I/Os also checks to make
sure IREQ_ABORT_PATH_ACTIVE is clear before proceding.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Completion of I/Os during the one of the abort path interface calls
from libsas can drive remote device state changes and the resumption
of the device RNC. This is a problem when the abort path is
attempting to cleanup outstanding I/O at the same time - the resumption
can prevent the termination from occuring correctly.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In order to prevent a device from receiving an I/O request while still
in an RNC suspending or resuming state (and therefore failing that
I/O back to libsas with a reset required status) wait for the RNC state
change before proceding in the abort path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the libsas error path, SATA disks require extra handling in
libata to recover operation. However, libsas expects to be able
to immediately recover all outstanding I/O once the error handler
escalation stops. This patch fixes the condition where the libata
error handler is scheduled for operation but libsas has already
deleted the outstanding sas_tasks.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The LLHANG timer should be enabled once per device. This patch corrects
both the timer enable and the timer disable for the remote device.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the case of a suspend call while in SCI_RNC_POSTING or INVALIDATING
states, the LLHANG detect needed to be saved so the upcoming suspension
would enable it correctly. The unused suspend callback parameters were
removed.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This addresses a regression from the commit "isci: Redesign
device suspension, abort, cleanup." in which the sas_task end
condition for terminated I/Os was made to call back on
sas_task_abort()".
This commit will be rolled into the original.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For NCQ error conditions among others, there is no need to enable
the link layer hang detect timer.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The RNC can be any of the states in the loop from suspended to
ready when the API "suspend" or "resume" are called. This change
adds destination states parameters that control the suspension /
resumption action of the RNC statemachine for those transition states.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This commit changes the means by which outstanding I/Os are handled
for cleanup.
The likelihood is that this commit will be broken into smaller pieces,
however that will be a later revision. Among the changes:
- All completion structures have been removed from the tmf and
abort paths.
- Now using one completed I/O list, with the I/O completed in host bit being
used to select error or normal callback paths.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If LUN reset sees that the device is gone, it returns TMF_RESP_FUNC_FAILED
to cause libsas to escalate to an I_T_Nexus_Reset.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fixing the remote device state machine to suspend and terminate
all outstanding I/O before the device stopped state is reached.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When the remote device enters the NCQ error state, the device must
be suspended so that the I/O terminations can take place.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
TCs must be terminated only while the RNC is suspended. This commit
adds remote device suspensions and resumptions in the abort, reset and
termination paths.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
TCs must only be terminated when RNCs are suspended.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add comprehensive decode for all TC completions that generate RNC
suspensions.
Note that this commit also removes unconditional resumptions of ATAPI
devices when in the SCI_STP_DEV_ATAPI_ERROR state, and STP devices
when in the SCI_STP_DEV_IDLE state. This is because the SCI_STP_DEV_IDLE
and SCI_STP_DEV_ATAPI state entry functions manage the RNC resumption.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>