It is unavoidable that zfcp_scsi_queuecommand() has to finish requests
with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time
window when zfcp detected an unavailable rport but
fc_remote_port_delete(), which is asynchronous via
zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.
However, for the case when the rport becomes available again, we should
prevent unblocking the rport too early. In contrast to other FCP LLDDs,
zfcp has to open each LUN with the FCP channel hardware before it can
send I/O to a LUN. So if a port already has LUNs attached and we
unblock the rport just after port recovery, recoveries of LUNs behind
this port can still be pending which in turn force
zfcp_scsi_queuecommand() to unnecessarily finish requests with
DID_IMM_RETRY.
This also opens a time window with unblocked rport (until the followup
LUN reopen recovery has finished). If a scsi_cmnd timeout occurs during
this time window fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents a clean and
timely path failover. This should not happen if the path issue can be
recovered on FC transport layer such as path issues involving RSCNs.
Fix this by only calling zfcp_scsi_schedule_rport_register(), to
asynchronously trigger fc_remote_port_add(), after all LUN recoveries as
children of the rport have finished and no new recoveries of equal or
higher order were triggered meanwhile. Finished intentionally includes
any recovery result no matter if successful or failed (still unblock
rport so other successful LUNs work). For simplicity, we check after
each finished LUN recovery if there is another LUN recovery pending on
the same port and then do nothing. We handle the special case of a
successful recovery of a port without LUN children the same way without
changing this case's semantics.
For debugging we introduce 2 new trace records written if the rport
unblock attempt was aborted due to still unfinished or freshly triggered
recovery. The records are only written above the default trace level.
Benjamin noticed the important special case of new recovery that can be
triggered between having given up the erp_lock and before calling
zfcp_erp_action_cleanup() within zfcp_erp_strategy(). We must avoid the
following sequence:
ERP thread rport_work other context
------------------------- -------------- --------------------------------
port is unblocked, rport still blocked,
due to pending/running ERP action,
so ((port->status & ...UNBLOCK) != 0)
and (port->rport == NULL)
unlock ERP
zfcp_erp_action_cleanup()
case ZFCP_ERP_ACTION_REOPEN_LUN:
zfcp_erp_try_rport_unblock()
((status & ...UNBLOCK) != 0) [OLD!]
zfcp_erp_port_reopen()
lock ERP
zfcp_erp_port_block()
port->status clear ...UNBLOCK
unlock ERP
zfcp_scsi_schedule_rport_block()
port->rport_task = RPORT_DEL
queue_work(rport_work)
zfcp_scsi_rport_work()
(port->rport_task != RPORT_ADD)
port->rport_task = RPORT_NONE
zfcp_scsi_rport_block()
if (!port->rport) return
zfcp_scsi_schedule_rport_register()
port->rport_task = RPORT_ADD
queue_work(rport_work)
zfcp_scsi_rport_work()
(port->rport_task == RPORT_ADD)
port->rport_task = RPORT_NONE
zfcp_scsi_rport_register()
(port->rport == NULL)
rport = fc_remote_port_add()
port->rport = rport;
Now the rport was erroneously unblocked while the zfcp_port is blocked.
This is another situation we want to avoid due to scsi_eh
potential. This state would at least remain until the new recovery from
the other context finished successfully, or potentially forever if it
failed. In order to close this race, we take the erp_lock inside
zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or
LUN. With that, the possible corresponding rport state sequences would
be: (unblock[ERP thread],block[other context]) if the ERP thread gets
erp_lock first and still sees ((port->status & ...UNBLOCK) != 0),
(block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock
after the other context has already cleard ...UNBLOCK from port->status.
Since checking fields of struct erp_action is unsafe because they could
have been overwritten (re-used for new recovery) meanwhile, we only
check status of zfcp_port and LUN since these are only changed under
erp_lock elsewhere. Regarding the check of the proper status flags (port
or port_forced are similar to the shown adapter recovery):
[zfcp_erp_adapter_shutdown()]
zfcp_erp_adapter_reopen()
zfcp_erp_adapter_block()
* clear UNBLOCK ---------------------------------------+
zfcp_scsi_schedule_rports_block() |
write_lock_irqsave(&adapter->erp_lock, flags);-------+ |
zfcp_erp_action_enqueue() | |
zfcp_erp_setup_act() | |
* set ERP_INUSE -----------------------------------|--|--+
write_unlock_irqrestore(&adapter->erp_lock, flags);--+ | |
.context-switch. | |
zfcp_erp_thread() | |
zfcp_erp_strategy() | |
write_lock_irqsave(&adapter->erp_lock, flags);------+ | |
... | | |
zfcp_erp_strategy_check_target() | | |
zfcp_erp_strategy_check_adapter() | | |
zfcp_erp_adapter_unblock() | | |
* set UNBLOCK -----------------------------------|--+ |
zfcp_erp_action_dequeue() | |
* clear ERP_INUSE ---------------------------------|-----+
... |
write_unlock_irqrestore(&adapter->erp_lock, flags);-+
Hence, we should check for both UNBLOCK and ERP_INUSE because they are
interleaved. Also we need to explicitly check ERP_FAILED for the link
down case which currently does not clear the UNBLOCK flag in
zfcp_fsf_link_down_info_eval().
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8830271c48 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
Fixes: a2fa0aede0 ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e1 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e066 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change FC drivers to use 'struct bsg_job' from bsg-lib.h instead of
'struct fc_bsg_job' from scsi_transport_fc.h and remove 'struct
fc_bsg_job'.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since commit a54ca0f62f
("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
HBA records no longer contain WWPN, D_ID, or LUN
to reduce duplicate information which is already in REC records.
In contrast to "regular" target ports, we don't use recovery to open
WKA ports such as directory/nameserver, so we don't get REC records.
Therefore, introduce pseudo REC running records without any
actual recovery action but including D_ID of WKA port on open/close.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While retaining the actual filtering according to trace level,
the following commits started to write such filtered records
with a hardcoded record level of 1 instead of the actual record level:
commit 250a1352b9
("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
commit a54ca0f62f
("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Now we can distinguish written records again for offline level filtering.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 250a1352b9 ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
Fixes: a54ca0f62f ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch improves the Fibre Channel port scan behaviour of the zfcp lldd.
Without it the zfcp device driver may churn up the storage area network by
excessive scanning and scan bursts, particularly in big virtual server
environments, potentially resulting in interference of virtual servers and
reduced availability of storage connectivity.
The two main issues as to the zfcp device drivers automatic port scan in
virtual server environments are frequency and simultaneity.
On the one hand, there is no point in allowing lots of ports scans
in a row. It makes sense, though, to make sure that a scan is conducted
eventually if there has been any indication for potential SAN changes.
On the other hand, lots of virtual servers receiving the same indication
for a SAN change had better not attempt to conduct a scan instantly,
that is, at the same time.
Hence this patch has a two-fold approach for better port scanning:
the introduction of a rate limit to amend frequency issues, and the
introduction of a short random backoff to amend simultaneity issues.
Both approaches boil down to deferred port scans, with delays
comprising parts for both approaches.
The new port scan behaviour is summarised best by:
NEW: NEW:
no_auto_port_rescan random rate flush
backoff limit =wait
adapter resume/thaw yes yes no yes*
adapter online (user) no yes no yes*
port rescan (user) no no no yes
adapter recovery (user) yes yes yes no
adapter recovery (other) yes yes yes no
incoming ELS yes yes yes no
incoming ELS lost yes yes yes no
Implementation is straight-forward by converting an existing worker to
a delayed worker. But care is needed whenever that worker is going to be
flushed (in order to make sure work has been completed), since a flush
operation cancels the timer set up for deferred execution (see * above).
There is a small race window whenever a port scan work starts
running up to the point in time of storing the time stamp for that port
scan. The impact is negligible. Closing that gap isn't trivial, though, and
would the destroy the beauty of a simple work-to-delayed-work conversion.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Get rid of unused function zfcp_fsf_get_req and corresponding
prototype definition.
Commit a54ca0f62f in v2.6.28
"[SCSI] zfcp: Redesign of the debug tracing for HBA records."
accidentally introduced this code which was dead in the first place.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch removes an interface that was used to manage access control
tables within the HBA. The patch consequently removes the handling
for conditions related to those access control tables, too.
That initiator-based access control feature was only needed until the
introduction of NPIV and was withdrawn with z10 years ago.
It's time to cleanup the corresponding device driver code.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Remove the now unused function zfcp_device_unregister since all
users have been converted to use device_unregister directly.
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Let the driver core handle device attribute creation and removal. This
will simplify the code and eliminates races between attribute
availability and userspace notification via uevents.
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Let the driver core handle device attribute creation and removal. This
will simplify the code and eliminates races between attribute
availability and userspace notification via uevents.
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In FC fabrics with large zones, the automatic port_rescan on incoming ELS
and any adapter recovery can cause quite some traffic at the very same
time, especially if lots of Linux images share an HBA, which is common on
s390. This can cause trouble and failures. Fix this by making such port
rescans dependent on a user configurable module parameter.
The following unconditional automatic port rescans remain as is:
On setting an adapter online and
on manual user-triggered writes to the sysfs attribute port_rescan.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Upstream commit f3450c7b91
"[SCSI] zfcp: Replace local reference counting with common kref"
accidentally dropped a reference count check before tearing down
zfcp_ports that are potentially in use by zfcp_units.
Even remote ports in use can be removed causing
unreachable garbage objects zfcp_ports with zfcp_units.
Thus units won't come back even after a manual port_rescan.
The kref of zfcp_port->dev.kobj is already used by the driver core.
We cannot re-use it to track the number of zfcp_units.
Re-introduce our own counter for units per port
and check on port_remove.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.33+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If the mapping of FCP device bus ID and corresponding subchannel
is modified while the Linux image is suspended, the resume of FCP
devices can fail. During resume, zfcp gets callbacks from cio regarding
the modified subchannels but they can be arbitrarily mixed with the
restore/resume callback. Since the cio callbacks would trigger
adapter recovery, zfcp could wakeup before the resume callback.
Therefore, ignore the cio callbacks regarding subchannels while
being suspended. We can safely do so, since zfcp does not deal itself
with subchannels. For problem determination purposes, we still trace the
ignored callback events.
The following kernel messages could be seen on resume:
kernel: <WWPN>: parent <FCP device bus ID> should not be sleeping
As part of adapter reopen recovery, zfcp performs auto port scanning
which can erroneously try to register new remote ports with
scsi_transport_fc and the device core code complains about the parent
(adapter) still sleeping.
kernel: zfcp.3dff9c: <FCP device bus ID>:\
Setting up the QDIO connection to the FCP adapter failed
<last kernel message repeated 3 more times>
kernel: zfcp.574d43: <FCP device bus ID>:\
ERP cannot recover an error on the FCP device
In such cases, the adapter gave up recovery and remained blocked along
with its child objects: remote ports and LUNs/scsi devices. Even the
adapter shutdown as part of giving up recovery failed because the ccw
device state remained disconnected. Later, the corresponding remote
ports ran into dev_loss_tmo. As a result, the LUNs were erroneously
not available again after resume.
Even a manually triggered adapter recovery (e.g. sysfs attribute
failed, or device offline/online via sysfs) could not recover the
adapter due to the remaining disconnected state of the corresponding
ccw device.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.32+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.
Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
FICON Express8S supports hardware data router, which requires an
adapted qdio request format.
This part 2/2 exploits the functionality in zfcp.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Query the FC symbolic port name for reporting in the fc_host sysfs and
enable the symbolic_name attribute in the fc_host sysfs. When running
in NPIV mode, extend the symbolic port name with the devno and the
hostname. This allows better identification of Linux systems for SAN
and storage administrators.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The SCSI host and transport templates are the only members left in the
global zfcp_data struct. Move them out of zfcp_data and remove the
now unused zfcp_data struct. Also update the names of the register and
unregister functions to use the zfcp_scsi prefix.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the kmem_cache for allocating the qtcb to zfcp_fsf.c and rename
it accordingly.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A data buffer that is passed to the hardware must not cross a page
boundary. zfcp uses a series of kmem_caches to align the data to not
cross a page boundary. Introduce a new kmem_cache for the FC requests
sent from the zfcp driver and use it for the ELS ADISC data. The goal
is to migrate to the FC kmem_cache in later patches and remove the
request specific kmem_caches.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the final cleanup of the redesign from the zfcp tracing.
Structures and elements which were used by multiple areas of the
former debug tracing are now changed to the new scheme.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the continuation to redesign the zfcp tracing to a more
straight-forward and easy to extend scheme.
This patch deals with all trace records of the zfcp SCSI area.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the continuation to redesign the zfcp tracing to a more
straight-forward and easy to extend scheme.
This patch deals with all trace records of the zfcp HBA area.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the continuation to redesign the zfcp tracing to a more
straight-forward and easy to extend scheme.
This patch deals with all trace records of the zfcp SAN area.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The tracing environment of the zfcp LLD has become very bulky and hard
to maintain. Small changes involve a large modification process which
is error-prone and not effective. This patch is the first of a set to
redesign the zfcp tracing to a more straight-forward and easy to
extend scheme. It removes all interpretation and visualization parts
and focuses on bare logging of the information.
This patch deals with all trace records of the zfcp error recovery.
Signed-off-by: Swen schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Replace the zfcp_modify_<xxx>_status functions and its accompanying wrappers
with dedicated status modifier functions. This eases code readability and
maintenance.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the code evaluating the ACL/CFDC specific errors to the file
zfcp_cfdc.c. With this change, all code related to the old access
control feature is kept in one file, not split across zfcp_erp.c and
zfcp_fsf.c.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is the large change to switch from using the data in
zfcp_unit to zfcp_scsi_dev. Keeping everything working requires doing
the switch in one piece. To ensure that no code keeps using the data
in zfcp_unit, this patch also removes the data from zfcp_unit that is
now being replaced with zfcp_scsi_dev.
For zfcp, the scsi_device together with zfcp_scsi_dev exist from the
call of slave_alloc to the call of slave_destroy. The data in
zfcp_scsi_dev is initialized in zfcp_scsi_slave_alloc and the LUN is
opened; the final shutdown for the LUN is run from slave_destroy.
Where the scsi_device or zfcp_scsi_dev is needed, the pointer to the
scsi_device is passed as function argument and inside the function
converted to the pointer to zfcp_scsi_dev; this avoids back and forth
conversion betweeen scsi_device and zfcp_scsi_dev.
While changing the function arguments from zfcp_unit to scsi_device,
the functions names are renamed form "unit" to "lun". This is to have
a seperation between zfcp_scsi_dev/LUN and the zfcp_unit; only code
referring to the remaining configuration information in zfcp_unit
struct uses "unit".
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the change for the LUN data to be part of the scsi_device, the
slave_destroy callback will be the final call to the
zfcp_erp_unit_shutdown function. The erp tries to acquire a reference
to run the action asynchronously and fail, if it cannot get the
reference. But calling scsi_device_get from slave_destroy will fail,
because the scsi_device is already in the state SDEV_DEL.
Introduce a new call into the zfcp erp to solve this: The function
zfcp_erp_unit_shutdown_wait will close the LUN and wait for the erp to
finish without acquiring an additional reference. The wait allows to
omit the reference; the caller waiting for the erp to finish already
has a reference that holds the struct in place.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the code for managing zfcp_unit devices to the new file
zfcp_unit.c. This is in preparation for the change that zfcp_unit will
only track the LUNs configured via unit_add, other data will be moved
from zfcp_unit to the new struct zfcp_scsi_dev.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Exploit the cio siosl function to trigger logging in the FCP channel
on qdio error conditions. Add a helper function in zfcp_qdio to ensure
that tracing is only triggered once before calling qdio_shutdown.
Trigger in zfcp for hardware logs are:
- timeout for FSF requests to the FCP channel
- "no recommendation" status from FCP channel
- invalid FSF protocol status
- stalled outbound queue
- unknown request id on inbound queue
- QDIO_ERROR_SLSB_STATE
All of the above triggers run from the Linux qdio softirq context, so
no additional synchronization is necessary for the handling of the
ZFCP_STATUS_ADAPTER_SIOSL_ISSUED flag.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Introduce support for DIF/DIX in zfcp: Report the capabilities for the
Scsi_host, map the protection data when issuing I/O requests and
handle the new error codes. Also add the fsf data_direction field to
the hba trace, it is useful information for debugging in that area.
This is an EXPERIMENTAL feature for now.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Post FC transport class netlink events for usage in the userspace,
e.g. for HBAAPI. Supported events are those required for the
polled events in HBAAPI.
- link up
- link down
- incoming RSCN
(events related to FC-AL are not supported, as zfcp has no support for FC-AL)
Signed-off-by: Sven Schuetz <sven@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A lot of functions require the amount of SBALs as one of their
parameter which is most times invariable. Therefore remove this
parameter and set the SBAL value explicitly if a non standard value is
required. In addition the warning message "oversized data" is
replaced with a BUG_ON() statement assuring the limits defined and
requested by zfcp.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When the successful return of an adisc is the final step to set the
port online, the registration of SCSI devices might be omitted. SCSI
devices that have been removed before (due to a short dev_loss_tmo
setting) might not be attached again.
The problem is that the registration of SCSI devices is done only
after erp has finished. The correct place would be after the call to
fc_remote_port_add to mimick the scan in the FC transport class.
Change the registration of SCSI devices to be triggered after the
fc_remote_port_add call. For the initial inquiry command to succeed,
the unit must also be open. If the unit reopen is still pending, the
inquiry command to the LUN will be deferred with DID_IMM_RETRY, so
there is no harm from this approach.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Waiting for a free sbal is a operation on the qdio queue. Move the
code implementing the wait to zfcp_qdio.c and rename the functions
accordingly.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the code accessing the qdio sbales and zfcp_qdio_req struct to
the zfcp_qdio files and provide helper functions for accessing the
qdio related parts.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the qdio related structs and some helper functions to a new
zfcp_qdio.h header file. While doing this, rename the struct
zfcp_queue_req to zfcp_qdio_req to adhere to the naming scheme used in
zfcp. This allows a better seperation of the qdio code and inlining
the helper functions will save some function calls.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the code for tracking FSF requests to new file to have this code
in one place. The functions for adding and removing requests on the
I/O path are already inline. The alloc and free functions are only
called once, so it does not hurt to inline them and add them to the
same file.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The hardware used with zfcp provides a timer for CT and ELS requests
instead of an abort capability for these commands. To correctly handle
the FC BSG timeouts, pass the timeout from the BSG requests to the
hardware.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Introduce a zfcp callback for timeouts triggered from FC BSG. With
zfcp, the underlying hardware cannot abort CT or ELS requests, so
there is nothing to do when the block layer timeout expires. To avoid
interference with the block layer timeout, simply indicate that the
block layer timer should be reset. The timer running in the hardware
for the pending CT or ELS request will return the request when it
expires.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove some redundancies in FC related code and trace:
- drop redundant data from SAN trace (local s_id that only changes
during link down, ls_code that is already part of payload, d_id in
ct response trace that is always the same as in ct request trace)
- use one common fsf struct to hold zfcp data for ct and els requests
- leverage common fsf struct for FC passthrough job data, allocate it
with dd_bsg_data for passthrough requests and unify common code for
ct and els passthrough request
- simplify callback handling in zfcp_fc
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The well-known-address (WKA) port handling code is part of the FC code
in zfcp. Move everything WKA related to the zfcp_fc files and use the
common zfcp_fc prefix for structs and functions. Drop the unused key
management service while renaming the struct, no request could ever
reach this service in zfcp and it is obsolete anyway.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use common code definitions for FC plogi, logo, rscn and adisc structs
instead of inventing private ones. Move the private struct for issuing
ELS ADISC inside zfcp to zfcp_fc header file.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use common data structures for FCP CMND, FCP RSP and related
definitions and remove zfcp private definitions. Split the FCP CMND
setup and FCP RSP evaluation code in seperate functions. Use inline
functions to not negatively impact the I/O path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The port_scan work was scheduled to the work_queue provided by the
kernel. This resulted on SMP systems to a likely situation that more
than one scan_work were processed in parallel. This is not required
and openes the possibility of race conditions between the removal of
invalid ports and the enqueue of just scanned ports. This patch
synchronizes the scan_work tasks by scheduling them to adapter local
work_queue.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the reference counting for zfcp data structures, it is now
possible to implement module unloading again. Module unloading
requires to free all data structures in the module exit function. This
is done by unregistering zfcp from s390 cio and the SCSI midlayer
first in the module exit function.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The global config_mutex was required for the serialization of a
configuration change within the zfcp driver. This global locking is
now obsolete and can be removed. The requirement of serializing the
access to a zfcp_adapter reference via a ccw_device is realized wth a
static spinlock.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Replace the local reference counting by already available mechanisms
offered by kref. Where possible existing device structures were used,
including the same functionality.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For ports, zfcp gets the DID from the FC nameserver and tries to open
the port. If the open succeeds, zfcp compares the WWPN from the
nameserver with the WWPN in the PLOGI payload. In case of a mismatch,
zfcp assumes that the DID of the port just changed and we opened the
wrong port. This means that zfcp has to forget the DID, lookup the DID
again and retry.
This error case had a problem that zfcp forgets the DID, but never
looks up a new one, stalling the ERP in this case. Fix this by
triggering the DID lookup and properly exit from the ERP. The DID
lookup will trigger a new ERP action.
Also ensure when trying to open the port again with the new DID, first
close the open port, even in the NOESC case.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the change for delaying the allocation of zfcp_adapter, the
initial device parameter function has to first call
ccw_device_set_online which allocates the zfcp_adapter structure.
Change this and adapt the cfdc part accordingly.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>