firewire-core uses fw_card.lock to protect topology data and transaction
data. firewire-sbp2 uses fw_card.lock for entirely unrelated purposes.
Introduce a sbp2_target.lock to firewire-sbp2 and replace all
fw_card.lock uses in the driver. fw_card.lock is now entirely private
to firewire-core. This has no immediate advantage apart from making it
clear in the code that firewire-sbp2 does not interact with the core
via the core lock.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Users of card->lock          Calling context
------------------------------------------------------------------------
sbp2_status_write            AR-req handler, tasklet
complete_transaction         AR-resp or AT-req handler, tasklet
sbp2_send_orb                among others scsi host .queuecommand, which
                             may be called in some sort of atomic context
sbp2_cancel_orbs             sbp2_send_management_orb/
                                 sbp2_{login,reconnect,remove},
                                 worklet or process
                             sbp2_scsi_abort, scsi eh thread
sbp2_allow_block             sbp2_login, worklet
sbp2_conditionally_block     among others complete_command_orb, tasklet
sbp2_conditionally_unblock   sbp2_{login,reconnect}, worklet
sbp2_unblock                 sbp2_{login,remove}, worklet or process
Drop the IRQ flags saving from sbp2_cancel_orbs,
sbp2_conditionally_unblock, and sbp2_unblock.
It was already omitted in sbp2_allow_block.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The assertion in the comment in sbp2_allow_block() is no longer true.
Or maybe it never was true. At least now, the sole caller of
sbp2_allow_block(), sbp2_login, can run concurrently with one of
sbp2_unblock()'s callers, sbp2_remove.
sbp2_login is performed by sbp2_logical_unit.work.
sbp2_remove is performed by fw_device.work.
sbp2_remove cancels sbp2_logical_unit.work, but only after it called
sbp2_unblock.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
PREPARE_[DELAYED_]WORK() are being phased out. They have few users
and a nasty surprise in terms of reentrancy guarantee as workqueue
considers work items to be different if they don't have the same work
function.
firewire core-device and sbp2 have been multiplexing work items
with multiple work functions. Introduce fw_device_workfn() and
sbp2_lu_workfn() which invoke fw_device->workfn and
sbp2_logical_unit->workfn respectively and always use the two
functions as the work functions and update the users to set the
->workfn fields instead of overriding work functions using
PREPARE_DELAYED_WORK().
This fixes a variety of possible regressions since a2c1c57be8
"workqueue: consider work function when searching for busy work items"
due to which fw_workqueue lost its required non-reentrancy property.
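
The multiplexing scheme described above can be illustrated in plain C outside the kernel. This is only a userspace sketch under stated assumptions, not the driver code: struct fake_work, struct fake_lu, lu_workfn(), fake_login() and fake_reconnect() are hypothetical stand-ins for the work item, sbp2_logical_unit, sbp2_lu_workfn() and the driver's work functions. The point it shows is that the queue only ever sees one function per item, while the step to run next is selected through the ->workfn field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a work item; the kernel's work_struct
 * carries much more state than a function pointer. */
struct fake_work {
	void (*func)(struct fake_work *work);
};

/* Hypothetical stand-in for struct sbp2_logical_unit. */
struct fake_lu {
	struct fake_work work;
	void (*workfn)(struct fake_work *work);	/* step to run next */
	int logins;
	int reconnects;
};

/* Recover the containing fake_lu from its embedded work item. */
#define lu_of(w) \
	((struct fake_lu *)((char *)(w) - offsetof(struct fake_lu, work)))

static void fake_reconnect(struct fake_work *work)
{
	lu_of(work)->reconnects++;
}

static void fake_login(struct fake_work *work)
{
	struct fake_lu *lu = lu_of(work);

	lu->logins++;
	/* Assumption for this sketch: after login, the next run reconnects. */
	lu->workfn = fake_reconnect;
}

/* The only function the queue ever sees for this item. */
static void lu_workfn(struct fake_work *work)
{
	struct fake_lu *lu = lu_of(work);

	assert(lu->workfn != NULL);
	lu->workfn(work);
}

static void lu_init(struct fake_lu *lu)
{
	lu->work.func = lu_workfn;	/* set once, never overridden */
	lu->workfn = fake_login;
	lu->logins = 0;
	lu->reconnects = 0;
}
```

Because work.func never changes, a queue that identifies work items by their function would keep treating successive runs as the same item, which is the property the change relies on.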
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: linux1394-devel@lists.sourceforge.net
Cc: stable@vger.kernel.org # v3.9+
Cc: stable@vger.kernel.org # v3.8.2+
Cc: stable@vger.kernel.org # v3.4.60+
Cc: stable@vger.kernel.org # v3.2.40+
Commit 54b2b50c20 "[SCSI] Disable WRITE SAME for RAID and virtual
host adapter drivers" disabled WRITE SAME support for all SBP-2 attached
targets. But as described in the changelog of commit b0ea5f19d3
"firewire: sbp2: allow WRITE SAME and REPORT SUPPORTED OPERATION CODES",
it is not required to blacklist WRITE SAME.
Bring the feature back by reverting the sbp2.c hunk of commit 54b2b50c20.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: stable@kernel.org
Some host adapters do not pass commands through to the target disk
directly. Instead they provide an emulated target which may or may not
accurately report its capabilities. In some cases the physical device
characteristics are reported even when the host adapter is processing
commands on the device's behalf. This can lead to adapter firmware hangs
or excessive I/O errors.
This patch disables WRITE SAME for devices connected to host adapters
that provide an emulated target. Driver writers can disable WRITE SAME
by setting the no_write_same flag in the host adapter template.
[jejb: fix up rejections due to eh_deadline patch]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
FireWire upper layer drivers are converted from generic
struct driver.probe() and .remove()
to bus-specific
struct fw_driver.probe() and .remove().
The new .probe() adds a const struct ieee1394_device_id *id argument,
indicating the entry in the driver's device identifiers table which
matched the fw_unit to be probed. This new argument is used by the
snd-firewire-speakers driver to look up device-specific parameters and
methods. There is at least one other FireWire audio driver currently in
development in which this will be useful too.
The new .remove() drops the unused error return code.
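
As an illustration of why the matched table entry is useful to a driver, here is a small userspace sketch. All types and names in it (fake_device_id, fake_unit, fake_params, match_unit(), fake_probe(), fake_remove()) are hypothetical and much simpler than struct ieee1394_device_id and struct fw_unit; the vendor and model IDs are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device-specific parameters a driver might look up. */
struct fake_params {
	const char *name;
	int channels;
};

/* Hypothetical, simplified device ID table entry. */
struct fake_device_id {
	uint32_t vendor_id;
	uint32_t model_id;
	const struct fake_params *driver_data;
};

/* Hypothetical, simplified unit. */
struct fake_unit {
	uint32_t vendor_id;
	uint32_t model_id;
	const struct fake_params *params;
};

static const struct fake_params params_a = { "speaker A", 2 };
static const struct fake_params params_b = { "speaker B", 6 };

static const struct fake_device_id id_table[] = {
	{ 0x0000aa, 0x000001, &params_a },
	{ 0x0000bb, 0x000002, &params_b },
	{ 0, 0, NULL }	/* terminator */
};

/* What a bus core could do: find the entry that matches the unit. */
static const struct fake_device_id *
match_unit(const struct fake_unit *unit, const struct fake_device_id *table)
{
	for (; table->driver_data != NULL; table++)
		if (table->vendor_id == unit->vendor_id &&
		    table->model_id == unit->model_id)
			return table;
	return NULL;
}

/* Bus-specific probe: receives the matched entry, no second lookup needed. */
static int fake_probe(struct fake_unit *unit, const struct fake_device_id *id)
{
	assert(unit != NULL && id != NULL);
	unit->params = id->driver_data;
	return 0;
}

/* Bus-specific remove: no return code. */
static void fake_remove(struct fake_unit *unit)
{
	unit->params = NULL;
}
```
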
Although all in-tree drivers are being converted to the new methods,
support for the old methods is left in place in this commit. This
allows public developer trees to merge this commit and then move to the
new fw_driver methods.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Clemens Ladisch <clemens@ladisch.de> (for sound/firewire/)
Cc: Peter Hurley <peter@hurleysoftware.com> (for drivers/staging/fwserial/)
No need to crash and burn if S/G element sizes cannot be set to our
liking; just leave a message in the log.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The old IEEE 1394 driver stack was removed in v2.6.37. That made the
checks for two Kconfig (module) macros unneeded, since they will now
always evaluate to true. Remove these two checks.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The commits
3c6bdaeab4 "[SCSI] Add a report opcode helper"
5db44863b6 "[SCSI] sd: Implement support for WRITE SAME"
introduced in-kernel uses of the mentioned commands but cautiously
blacklisted them for any IEEE 1394 (SBP-2/3) targets and some other
transports.
I looked through a range of SBP devices and found that the blacklist
flags can be removed:
The kernel never attempts these commands if the device's INQUIRY
data claim a SCSI revision of less than 0x05. This is the case with
all SBP devices that I checked, except for three more recent devices
which claimed a revision of 0x05 i.e. conformance with SPC-3 (two
devices based on the OXUF936QSE chip but having different firmwares,
one based on OXUF934DSB.)
I tried "sg_opcodes" from sg3_utils on several older and newer devices
and did not encounter any apparent firmware bugs with it. All devices
returned Illegal Request/ Invalid command operation code and carried on.
I furthermore tried "sg_write_same -U" on the OXUF934DSB device with the
same result. Alas I did not have a TRIM enabled SSD available for these
tests. All of the bridges were correctly identified by the kernel as
"fully provisioned", CD-ROM devices aside.
The kernel won't issue WRITE SAME to fully provisioned devices, nor
would it attempt REPORT SUPPORTED OPERATION CODES or WRITE SAME with
UNMAP bit on devices which do not claim conformance to SPC-3 or later.
Hence let's remove the no_report_opcodes and no_write_same blacklist
flags so that these commands can be used on newer targets with
respective capabilities. I guess the Linux sbp-target could be such a
target.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk
driver.
- We set the default maximum to 0xFFFF because there are several
devices out there that only support two-byte block counts even with
WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the
device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK
LIMITS VPD.
- max_write_same_blocks can be overridden on a per-device basis in sysfs.
- The UNMAP discovery heuristics remain unchanged but the discard
limits are tweaked to match the "real" WRITE SAME commands.
- In the error handling logic we now distinguish between WRITE SAME
with and without UNMAP set.
The discovery process heuristics are:
- If the device reports a SCSI level of SPC-3 or greater we'll issue
REPORT SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is
supported. If that's the case we will use it.
- If the device supports the block limits VPD and reports a MAXIMUM
WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16).
- Otherwise we will use WRITE SAME(10) unless the target LBA is beyond
0xFFFFFFFF or the block count exceeds 0xFFFF.
- no_write_same is set for ATA, FireWire and USB.
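
The discovery heuristics above can be condensed into a small decision function. This is a userspace sketch, not the sd driver code: struct ws_caps, enum ws_cmd and pick_write_same() are hypothetical names, and falling back to WRITE SAME(16) when the LBA or block count does not fit WRITE SAME(10) is an assumption read from the "unless" clause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ws_cmd { WS_NONE, WS_10, WS_16 };

/* Hypothetical summary of what discovery found out about a device. */
struct ws_caps {
	bool no_write_same;	/* blacklist flag from the host template */
	bool rsoc_reports_ws16;	/* RSOC says WRITE SAME(16) is supported */
	uint64_t max_ws_len;	/* MAXIMUM WRITE SAME LENGTH, 0 if not reported */
};

static enum ws_cmd pick_write_same(const struct ws_caps *caps,
				   uint64_t lba, uint32_t nr_blocks)
{
	assert(caps != NULL);

	if (caps->no_write_same)
		return WS_NONE;
	if (caps->rsoc_reports_ws16)
		return WS_16;
	if (caps->max_ws_len > 0xFFFF)
		return WS_16;
	/* WRITE SAME(10) has a 32-bit LBA and a 16-bit block count. */
	if (lba > 0xFFFFFFFFULL || nr_blocks > 0xFFFF)
		return WS_16;
	return WS_10;
}
```
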
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The REPORT SUPPORTED OPERATION CODES command can be used to query
whether a given opcode is supported by a device. Add a helper function
that allows us to look up commands.
We only issue RSOC if the device reports compliance with SPC-3 or
later. But to err on the side of caution we disable the command for ATA,
FireWire and USB.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The SBP-2/3 specifications do not require any alignment of data
buffers; only their own data structures need to be quadlet-aligned
[SR: or octlet-aligned].
Fix the comments to reflect this, but leave the actual alignment at
32 bits to avoid theoretical problems with target implementations
that might handle this incorrectly.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The SCSI framework automatically initializes the block queue's segment
size with the DMA device's segment size.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Use the scsi_dma_map/scsi_dma_unmap helper to simplify the code
a little.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The sbp2 driver does DMA not on the unit but on the card device.
The driver worked even with the wrong device because at the moment, it
happens to reimplement the DMA functions of the SCSI framework.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:
perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: David Howells <dhowells@redhat.com>
Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull IEEE 1394 (FireWire) subsystem updates post v3.3 from Stefan Richter:
- Some SBP-2 initiator fixes, side product from ongoing work on a target.
- Reintroduction of an isochronous I/O feature of the older ieee1394 driver
stack (flush buffer completions); it was evidently rarely used but not
actually unused. Matching libraw1394 code is already available.
- Be sure to prefix all kernel log messages with device name or card name,
and other logging related cleanups.
- Misc other small cleanups, among them a small API change that affects
sound/firewire/ too. Clemens Ladisch is aware of it.
* tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: (26 commits)
firewire: allow explicit flushing of iso packet completions
firewire: prevent dropping of completed iso packet header data
firewire: ohci: factor out iso completion flushing code
firewire: ohci: simplify iso header pointer arithmetic
firewire: ohci: optimize control bit checks
firewire: ohci: remove unused excess_bytes field
firewire: ohci: copy_iso_headers(): make comment match the code
firewire: cdev: fix IR multichannel event documentation
firewire: ohci: fix too-early completion of IR multichannel buffers
firewire: ohci: move runtime debug facility out of #ifdef
firewire: tone down some diagnostic log messages
firewire: sbp2: replace a GFP_ATOMIC allocation
firewire: sbp2: Fix SCSI sense data mangling
firewire: sbp2: Ignore SBP-2 targets on the local node
firewire: sbp2: Take into account Unit_Unique_ID
firewire: nosy: Use the macro DMA_BIT_MASK().
firewire: core: convert AR-req handler lock from _irqsave to _bh
firewire: core: fix race at address_handler unregistration
firewire: core: remove obsolete comment
firewire: core: prefix log messages with card name
...
sbp2_send_management_orb() is called by sbp2_login, sbp2_reconnect, and
sbp2_remove, all of which are able to sleep during memory allocations.
Actually, sbp2_send_management_orb() itself is a sleeping function.
Login and remove could allocate with GFP_KERNEL but reconnect needs
GFP_NOIO to ensure progress in low memory situations.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
SCSI sense data in SBP-2/3 is carried in an unusual format that means we
have to un-mangle it on our end before we pass it to the SCSI subsystem.
Currently our un-mangling code doesn't quite follow the SBP-2 standard
in that we always assume Current and never Deferred error types, we
never set the VALID bit, and we mishandle the FILEMARK, EOM and ILI
bits.
This patch fixes the sense un-mangling to correctly handle those and
follow the spec.
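
To make the bit shuffling concrete, here is a userspace sketch of such a conversion into fixed-format sense data. It is not the patch itself. The layout of the status bytes is an assumption stated in the comment (sfmt and status in byte 0; valid, mark, eom, ili and sense key in byte 1; sense code and qualifier in bytes 2 and 3), as is the reading of sfmt 0 as Current and 1 as Deferred. unmangle_sense() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumed layout of the command set-dependent status bytes:
 *   byte 0: sfmt (bits 7..6), SCSI status (bits 5..0)
 *   byte 1: valid (bit 7), mark (6), eom (5), ili (4), sense key (3..0)
 *   byte 2: sense code
 *   byte 3: sense qualifier
 * Returns 0 on success, -1 for sfmt values this sketch does not handle.
 */
static int unmangle_sense(const uint8_t status[4], uint8_t sense[18])
{
	unsigned int sfmt;

	assert(status != NULL && sense != NULL);

	sfmt = (status[0] >> 6) & 0x03;
	if (sfmt > 1)
		return -1;

	memset(sense, 0, 18);
	/* Response code 0x70 (current) or 0x71 (deferred), plus VALID. */
	sense[0] = 0x70 | sfmt | (status[1] & 0x80);
	/* mark/eom/ili move up one bit to FILEMARK/EOM/ILI; key stays put. */
	sense[2] = ((status[1] << 1) & 0xe0) | (status[1] & 0x0f);
	sense[7] = 10;		/* additional sense length for 18 bytes */
	sense[12] = status[2];	/* additional sense code */
	sense[13] = status[3];	/* additional sense code qualifier */
	return 0;
}
```
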
Signed-off-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The firewire-sbp2 module tries to login to an SBP-2/3 target even when
it is running on the local node, which fails because of the inability to
fetch data from DMA mapped regions using firewire transactions on the
local node. It also doesn't make much sense to have the initiator and
target on the same node, so this patch prevents this behaviour.
Signed-off-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (changed the comment)
If the target's unit directory contains a Unit_Unique_ID entry, we
should use that as the target's GUID for identification purposes. The
SBP-2 standards document says:
"Although the node unique ID (EUI-64) present in the bus information
block is sufficient to uniquely identify nodes attached to Serial Bus,
it is insufficient to identify a target when a vendor implements a
device with multiple Serial Bus node connections. In this case initiator
software requires information by which a particular target may be
uniquely identified, regardless of the Serial Bus access path used."
[ IEEE T10 P1155D Revision 4, Section 7.6 (page 51) ] and
[ IEEE T10 P1467D Revision 5, Section 7.9 (page 74) ]
Signed-off-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Commit eba9ebaaa2 "firewire: sbp2: use dev_printk API" changed
messages from e.g.
firewire_sbp2: fw3.0: logged in to LUN 0000 (0 retries)
to
sbp2 fw3.0: logged in to LUN 0000 (0 retries)
because the driver calls itself "sbp2" when registering with driver
core and with SCSI core. This is of course confusing, so switch to the
name "firewire_sbp2" for driver core in order to match what lsmod and
/sys/module/ show. So we are back to
firewire_sbp2 fw3.0: logged in to LUN 0000 (0 retries)
in the kernel log.
This also changes
/sys/bus/firewire/drivers/sbp2
/sys/bus/firewire/devices/fw3.0/driver -> [...]/sbp2
/sys/module/firewire_sbp2/drivers/firewire:sbp2
to
/sys/bus/firewire/drivers/firewire_sbp2
/sys/bus/firewire/devices/fw3.0/driver -> [...]/firewire_sbp2
/sys/module/firewire_sbp2/drivers/firewire:firewire_sbp2
but "cat /sys/class/scsi_host/host27/proc_name" stays "sbp2" just in
case that proc_name is used by any userland.
The transport detection in lsscsi is not affected. (Tested with lsscsi
version 0.25.) Udev's /dev/disk/by-id and by-path symlinks are not
affected either.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use kernel.h's convenience macros. Also omit a printk that should never
happen and won't matter much if it ever happened.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
sbp2_release_target() is folded into its primary user, sbp2_remove().
The only other caller, a failure path in sbp2_probe(), now uses
sbp2_remove(). This adds unnecessary cancel_delayed_work_sync() calls
to that failure path but results in less code and text.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Implement sbp2_queue_work(), which is now a very simple accessor to one
of the struct sbp2_logical_unit members, right after the definition of
struct sbp2_logical_unit.
Put the sbp2_reconnect() implementation right after the sbp2_login()
implementation. They are both part of the SBP-2 access protocol.
Implement the driver methods sbp2_probe(), sbp2_update(), sbp2_remove()
in this order, reflecting the lifetime of an SBP-2 target.
Place the sbp2_release_target() implementation right next to
sbp2_remove() which is its primary user, and after sbp2_probe() which is
the counterpart to sbp2_release_target().
There are no changes to the implementations here, or at least not meant
to be.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Since commit 0278ccd9d5 "firewire: sbp2:
fix panic after rmmod with slow targets", the lifetime of an sbp2_target
instance no longer extends past the return of sbp2_remove().
Therefore it is no longer necessary to call fw_unit_get/put() and
fw_device_get/put() in sbp2_probe/remove().
Furthermore, said commit also ensures that lu->work is not going to be
executed or requeued at a time when the sbp2_target is no longer in use.
Hence there is no need for sbp2_target reference counting for lu->work.
Other concurrent contexts:
- Processes which access the sysfs of the SCSI host device or of one
of its subdevices are safe because these interfaces are all removed
by scsi_remove_device/host() in sbp2_release_target().
- SBP-2 command block ORB transactions are finished when
scsi_remove_device() in sbp2_release_target() returns.
- SBP-2 management ORB transactions are finished when
cancel_delayed_work_sync(&lu->work) before sbp2_release_target()
returns.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
If firewire-sbp2 starts a login to a target that doesn't complete ORBs
in a timely manner (and has to retry the login), and the module is
removed before the operation times out, you end up with a null-pointer
dereference and a kernel panic.
[SR: This happens because sbp2_target_get/put() do not maintain
module references. scsi_device_get/put() do, but on occasions like
the one Chris describes, nobody holds a reference to an SBP-2 sdev.]
This patch cancels pending work for each unit in sbp2_remove(), which
hopefully means there are no extra references around that prevent us
from unloading. This fixes my crash.
Signed-off-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The struct sbp2_logical_unit.work items can all be executed in parallel
but are not reentrant. Furthermore, reconnect or re-login work must be
executed in a WQ_MEM_RECLAIM workqueue.
Hence replace the old single-threaded firewire-sbp2 workqueue by a
concurrency-managed but non-reentrant workqueue with rescuer.
firewire-core already maintains one, hence use this one.
In earlier versions of this change, I observed occasional failures of
parallel INQUIRY to an Initio INIC-2430 FireWire 800 to dual IDE bridge.
More testing indicates that parallel INQUIRY is not actually a problem,
but too quick successions of logout and login + INQUIRY, e.g. a quick
sequence of cable plugout and plugin, can result in failed INQUIRY.
This does not seem to be something that should or could be addressed by
serialization.
Another dual-LU device to which I currently have access, an
OXUF924DSB FireWire 800 to dual SATA bridge with firmware from MacPower,
has been successfully tested with this too.
This change is beneficial to environments with two or more FireWire
storage devices, especially if they are located on the same bus.
Management tasks that should be performed as soon and as quickly as
possible, especially reconnect, are no longer held up by tasks on other
devices that may take a long time, especially login with INQUIRY and sd
or sr driver probe.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
We do not need slab allocations for ORB pointer write transactions
anymore in order to satisfy streaming DMA mapping constraints, thanks to
commit da28947e7e "firewire: ohci: avoid separate DMA mapping for
small AT payloads".
(Besides, the slab-allocated buffers that firewire-sbp2 used to provide
for 8-byte write requests were still not fully portable since they
shared a cacheline with unrelated CPU-accessed data.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
firewire-sbp2 already takes care of internal serialization where
required (ORB list accesses), and it does not use cmd->serial_number
internally. Hence it is safe to not grab the shost lock around
queuecommand.
While we are at housekeeping, drop a redundant struct member:
sbp2_command_orb.done is set once in a hot path and dereferenced once in
a hot path. We can as well dereference sbp2_command_orb.cmd->scsi_done
instead.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Now that firewire-core sets the local node's SPLIT_TIMEOUT to 2 seconds
by default, commit a481e97d3c is no
longer required.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.
The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.
Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)
Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.
Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix I/O stalls with some 4-bay RAID enclosures which are based on
OXUF936QSE:
- Onnto dataTale RSM4QO, old firmware (not anymore with current
firmware),
- inXtron Hydra Super-S LCM, old as well as current firmware
when used in RAID-5 mode, perhaps also in other RAID modes.
The stalls happen during heavy or moderate disk traffic in periods that
are a multiple of 5 minutes, roughly twice per hour. They are caused
by the target responding too late to an ORB_Pointer register write:
The target responds after Split_Timeout, hence firewire-core cancels
the transaction, and firewire-sbp2 fails the SCSI request. The SCSI
core retries the request, that fails again (and again), hence SCSI core
calls firewire-sbp2's abort handler (and even the Management_Agent
register write in the abort handler has the transaction timeout
problem).
During all that, the process which issued the I/O is stalled in I/O
wait state.
Meanwhile, the target actually acts on the first failed SCSI request:
It responds to the ORB_Pointer write later (seen in the kernel log as
"firewire_core: Unsolicited response") and also finishes the SCSI
request with proper status (seen in the kernel log as "firewire_sbp2:
status write for unknown orb").
So let's just ignore RCODE_CANCELLED in the transaction callback and
wait for the target to complete the ORB nevertheless. This requires
a small modification in sbp2_cancel_orbs(); it now needs to call
orb->callback() regardless of whether fw_cancel_transaction() found the
transaction unfinished or finished.
A different solution is to increase Split_Timeout on the local node.
(Tested: 2000ms timeout; maybe 1000ms or something like that works too.
200ms is insufficient. Standard is 100ms.) However, I would rather not do
this because any software on any node could change the Split_Timeout to
something unsuitable. Or such a large Split_Timeout may be undesirable
for other purposes.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
When an ORB was canceled (Command ORB i.e. SCSI request timed out, or
Management ORB timed out), or there was a send error in the initial
transaction, we failed to drop one of the ORB's references and thus
leaked memory.
Background:
In total, we hold 3 references to each Operation Request Block:
- 1 during sbp2_scsi_queuecommand() or sbp2_send_management_orb()
respectively,
- 1 for the duration of the write transaction to the ORB_Pointer or
Management_Agent register of the target,
- 1 for as long as the ORB stays within the lu->orb_list, until
the ORB is unlinked from the list and the orb->callback was
executed.
The last of these 3 references is dropped
- normally by sbp2_status_write() when the target wrote status
for a pending ORB,
- or by sbp2_cancel_orbs() in case of an ORB time-out,
- or by complete_transaction() in case of a send error.
Of these, the last two lacked the kref_put().
Add the missing kref_put()s. Add comments to the gets and puts of
references for transaction callbacks and ORB callbacks so that it is
easier to see what is supposed to happen.
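
The reference accounting described above can be traced with a plain counter. This is a userspace sketch with hypothetical names (struct fake_orb, orb_send(), orb_transaction_done(), orb_unlink()); it uses an int where the driver uses a kref and only marks the ORB as freed instead of releasing memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ORB with a plain reference counter. */
struct fake_orb {
	int refs;
	bool freed;
	bool on_list;
};

static void orb_get(struct fake_orb *orb)
{
	assert(!orb->freed);
	orb->refs++;
}

static void orb_put(struct fake_orb *orb)
{
	assert(orb->refs > 0);
	if (--orb->refs == 0)
		orb->freed = true;	/* the driver would free memory here */
}

/* Submit: one reference each for the caller, the transaction, the list. */
static void orb_send(struct fake_orb *orb)
{
	orb->refs = 1;		/* held by the submitting function */
	orb->freed = false;
	orb_get(orb);		/* write transaction to the target */
	orb_get(orb);		/* membership in the ORB list */
	orb->on_list = true;
}

/* Transaction callback: always drops the transaction reference. On a
 * send error it also unlinks the ORB and drops the list reference. */
static void orb_transaction_done(struct fake_orb *orb, bool send_error)
{
	if (send_error && orb->on_list) {
		orb->on_list = false;
		orb_put(orb);	/* list reference */
	}
	orb_put(orb);		/* transaction reference */
}

/* Status write or cancel: unlink and drop the list reference. */
static void orb_unlink(struct fake_orb *orb)
{
	if (orb->on_list) {
		orb->on_list = false;
		orb_put(orb);
	}
}
```

In every path the counter reaches zero only after the submitter's final put, which is the invariant the missing puts violated.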
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Conflicts:
drivers/firewire/core-card.c
drivers/firewire/core-cdev.c
and forgotten #include <linux/time.h> in drivers/firewire/ohci.c
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
void (*fw_address_callback_t)(..., int speed, ...) is the speed that a
remote node chose to transmit a request to us. In case of split
transactions, firewire-core will transmit the response at that speed.
Upper layer drivers on the other hand (firewire-net, -sbp2, firedtv, and
userspace drivers) cannot do anything useful with that speed datum,
except log it for debug purposes. But data that is merely potentially
(not even actually) used for debug purposes does not belong in the API.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
In case of fw_card_bm_work()'s lock request, the present sizeof
expression is going to be wrong if somebody changes the fw_card's DMA
scratch buffer's size in the future.
In case of quadlet write requests, sizeof(u32) is just silly; it's 4.
In case of SBP-2 ORB pointer write requests, 8 is arguably quicker to
understand as the correct and only possible value than
sizeof(some_datum).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (23 commits)
firewire: ohci: extend initialization log message
firewire: ohci: fix IR/IT context mask mixup
firewire: ohci: add module parameter to activate quirk fixes
firewire: ohci: use an ID table for quirks detection
firewire: ohci: reorder struct fw_ohci for better cache efficiency
firewire: ohci: remove unused dualbuffer IR code
firewire: core: combine a bit of repeated code
firewire: core: change type of a data buffer
firewire: cdev: increment ABI version number
firewire: cdev: add more flexible cycle timer ioctl
firewire: core: rename an internal function
firewire: core: fix an information leak
firewire: core: increase stack size of config ROM reader
firewire: core: don't fail device creation in case of too large config ROM blocks
firewire: core: fix "giving up on config rom" with Panasonic AG-DV2500
firewire: remove incomplete Bus_Time CSR support
firewire: get_cycle_timer optimization and cleanup
firewire: ohci: enable cycle timer fix on ALi and NEC controllers
firewire: ohci: work around cycle timer bugs on VIA controllers
firewire: make PCI device id constant
...
The block layer calling convention is blk_queue_<limit name>.
blk_queue_max_sectors predates this practice, leading to some confusion.
Rename the function to appropriately reflect that its intended use is to
set max_hw_sectors.
Also introduce a temporary wrapper for backwards compatibility. This can
be removed after the merge window is closed.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Several config ROM related functions only peek at the ROM cache; mark
their arguments as const pointers. Ditto fw_device.config_rom and
fw_unit.directory, as the memory behind them is meant to be write-once.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
A few stylistic changes to unify some code patterns in the subsystem:
- The similar queue_delayed_work helpers fw_schedule_bm_work,
schedule_iso_resource, and sbp2_queue_work now have the same call
convention.
- Two conditional calls of schedule_iso_resource are factored into
another small helper.
- An sbp2_target_get helper is added as counterpart to
sbp2_target_put.
Object size of firewire-core is decreased a little bit, object size of
firewire-sbp2 remains unchanged.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The Unit_Characteristics entry of an SBP-2 unit directory is not
mandatory as far as I can tell. If it is missing, we would probably
fail to log in to the target because firewire-sbp2 would not wait for
status after it sent the login request.
The fix moves the cleanup of tgt->mgt_orb_timeout into a place where it
is executed exactly once before login, rather than 0..n times depending
on the target's config ROM. With targets with one or more
Unit_Characteristics entries, the result is the same as before.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Per SBP-2 clause 5.3, a target shall store 8...32 bytes of status
information. Trailing zeros after the first 8 bytes don't need to be
stored, they are implicit. Fix the status write handler to clear all
unwritten status data.
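
A minimal userspace sketch of this handling, assuming a 32-byte status buffer: the bytes the target wrote are copied and the remainder is zeroed. struct fake_status and store_status() are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STATUS_MIN_LEN 8	/* lower bound named in the text */
#define STATUS_MAX_LEN 32	/* upper bound named in the text */

/* Hypothetical stand-in for the driver's status buffer. */
struct fake_status {
	uint8_t bytes[STATUS_MAX_LEN];
};

/* Returns 0 on success, -1 if the write length is outside 8..32 bytes. */
static int store_status(struct fake_status *st, const void *payload,
			size_t len)
{
	assert(st != NULL && payload != NULL);

	if (len < STATUS_MIN_LEN || len > STATUS_MAX_LEN)
		return -1;

	memcpy(st->bytes, payload, len);
	/* Bytes the target did not write are implicitly zero. */
	memset(st->bytes + len, 0, STATUS_MAX_LEN - len);
	return 0;
}
```
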
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
If a target writes invalid status (typically status of a command that
already timed out), firewire-sbp2 attempts to put away an ORB that
doesn't exist. https://bugzilla.redhat.com/show_bug.cgi?id=519772
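
The guard amounts to dropping a reference only if the lookup found a pending ORB. A userspace sketch with hypothetical names (struct pending_orb, lookup_orb(), status_write()) and an array in place of the driver's list:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record of an ORB awaiting status. */
struct pending_orb {
	uint64_t request_bus;	/* address the target echoes in its status */
	int refs;
	bool pending;
};

static struct pending_orb *lookup_orb(struct pending_orb *orbs, size_t n,
				      uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (orbs[i].pending && orbs[i].request_bus == addr)
			return &orbs[i];
	return NULL;
}

/* Returns true if the status matched a pending ORB and completed it. */
static bool status_write(struct pending_orb *orbs, size_t n, uint64_t addr)
{
	struct pending_orb *orb;

	assert(orbs != NULL);

	orb = lookup_orb(orbs, n, addr);
	if (orb == NULL)
		return false;	/* unknown ORB: nothing to complete or put */

	orb->pending = false;
	orb->refs--;		/* drop the reference held for the list */
	return true;
}
```
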
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>