The same case is handled further below in sbp2scsi_complete_command.
Note, the second version behaves slightly different but looks
preferable. It's an extremely unlikely case by the way.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Remove unnecessary function prototypes.
Remove variable names from function prototypes.
Move declarations from sbp2.h to sbp2.c.
Move definitions of driver templates together near the top of sbp2.c.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The queue depth can be read from /sys/bus/scsi/devices/*/queue_depth,
so don't log it. And the hint about speed improvements is misleading,
at least under current kernels. If serialization is switched off, read
performance is typically increased by less than 10%. (I did not test
write performance recently.) On the other hand, serialize_io=0 is not
yet safe due to some implementation issues that are not trivial to fix.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Put the target's fetch agent into reset state before the underlying ORB
DMA is unmapped and the ->done handler is called. It is highly unlikely
but the target could access that ORB right before sbp2 sends the reset
request.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This is a follow-up to patch "ieee1394: sbp2: enable auto spin-up for
Maxtor disks". When I 'ejected' an OXUF922 based HDD from a Mac OS X
box, it was spun down by the Mac and did not spin up by itself when
attached to a Linux box right after that. The first SCSI command that
required the bridge to access the drive ended in
sda:<6>sd 18:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
Additional sense: Logical unit not ready, initializing cmd. required
Therefore the flag which instructs scsi_mod to send START STOP UNIT with
START=1 ("make medium ready") after such a condition is now enabled
unconditionally for all FireWire storage devices.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
In the old days, sbp2 used to coerce all MODE SENSE commands into the
10 bytes version. When all command set conversions were removed from
sbp2 several months ago, sdev->use_10_for_ms = 1 was added. Meaning,
higher SCSI layers preferred the 10 bytes version but would try the 6
bytes version if the former failed.
Recently, a problem with the 10 bytes version was discovered. An Initio
INIC-1530 firmware accepted the 10 bytes version but replied with bogus
data, showing the HDD incorrectly as write-protected. Since RBC
actually mandates MODE SENSE (6), I checked which version was sent by
Windows XP and Mac OS X 10.3 to an SBP-2 target hosted by Linux --- it
was the 6 bytes version. (Exception: OS X sent the 10 bytes version to
an MMC target. RBC and SBC got MODE SENSE (6).)
Therefore, drop the use_10_for_ms flag from sbp2. Now the upper layers
will try MODE SENSE (6) before MODE SENSE (10) on all SBP-2 devices.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Scsi_remove_device() may go into uninterruptible sleep if blocked.
Therefore sbp2_remove() unblocks the Scsi_Host before the device is
requested to be removed. But there could be another 1394 bus reset
after that which would block the host again. The 1394 subsystem won't
call sbp2_update() concurrently to sbp2_remove(), which is why there is
no chance for sbp2_remove() to be unblocked by sbp2_update().
The fix is to tell sbp2's bus reset handler when a device is to be shut
down so that it skips scsi_block_requests() on that host. As before,
any new commands after a reset without reconnect will be failed quickly
by sbp2scsi_queuecommand().
In the long term, means to go without scsi_block_requests() should be
found.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Remove unused includes. Add missing includes, i.e. explicitly include
all used headers. Sort includes alphabetically. Replace one call to
signal_pending(current) to avoid to include headers just for this line.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
If the target signals a transport failure via status block, complete the
request with DID_BUSY to indicate to the SCSI subsystem that the command
may succeed when retried.
Also log diagnostic information if the status block shows a transport
related problem. It may point to hardware faults.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
While sbp2_update() is doing its duties after a bus reset, another reset
could happen. Don't accept new requests until the next undisturbed
sbp2_update() or until sbp2_remove().
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The scsi_host_template's eh_abort_handler and eh_device_reset_handler
are allowed to sleep. Use this to run sbp2_agent_reset in the more
reliable mode which returns _after_ its write transaction was finished.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Fix for http://bugzilla.kernel.org/show_bug.cgi?id=6948
Because sbp2 writes to the target's fetch agent's registers from within
atomic context, it cannot sleep to guaranteedly get a free transaction
label. This may repeatedly lead to "sbp2util_node_write_no_wait failed"
and consequently to SCSI command abortion after timeout. A likely cause
is that many queue_command softirqs may occur before khpsbpkt (the
ieee1394 driver's thread which cleans up after finished transactions) is
woken up to recycle tlabels.
Sbp2 now schedules a workqueue job whenever sbp2_link_orb_command fails
in sbp2util_node_write_no_wait. The job will reliably get a transaction
label because it can sleep.
We use the kernel-wide shared workqueue because it is unlikely that the
job itself actually needs to sleep. In the improbable case that it has
to sleep, it doesn't need to sleep long since the standard transaction
timeout is 100ms.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
A deactivated macro, defined as "#define foo(bar)", will result in
silent corruption if somebody forgets a semicolon after a call to foo.
Replace it by "#define foo(bar) do {} while (0)" which will reveal any
respective syntax errors.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This patch contains the scheduled removal of the force_inquiry_hack
module parameter.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The waitqueue API is used to replace a custom wait mechanism. Only one
global waitqueue (instead of per-device waitqueues or completions) is
added because there is usually just one waiter.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
- Add checks for the (very unlikely) cases that the target writes too
little or too much status data or writes unsolicited status.
- Indicate that these and similar conditions are unlikely().
- Check the 'resp' and 'sbp_status' fields for possible failure status.
- Slightly optimize access macros for the status block bitfields.
- Unify a few related log messages.
TODO: Check if 'src'==1, then withhold the respective ORB from reuse
until status for any subsequent ORB was received. This is an old bug
whose fix requires more complex command queue handling.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Sbp2's copy of the status fifo was cleared when management ORBs or new
command ORBs were prepared. The latter had potential for a race
condition if the block layer's soft IRQ and the 1394 LLD's interrupt
handler ran on different CPUs. It would also yield wrong status if a
command was completed with non-zero completion status before other
commands that had zero completion status, and no new command was
enqueued in the meantime.
Now, the status buffer is cleared right before it is written. Thus it
ends up in the following simpler and safer access pattern:
- sbp2_alloc_device: allocates and implicitly clears once,
- sbp2_handle_status_write: clears, writes, and reads,
- sbp2_query_logins, sbp2_login_device, sbp2_reconnect_device: read.
The latter three do not race with sbp2_handle_status_write because of
how the protocol works.
As a tiny optimization, the first two quadlets of the status never need
to be cleared.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Only the driver writes ORBs, the device just reads them. Therefore
PCI_DMA_BIDIRECTIONAL can be replaced by PCI_DMA_TODEVICE which may be
cheaper on some architectures.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Since sbp2 is at the moment unable to do anything with the return value
of sbp2_link_orb_command, just discard it.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The sbp2 initiator has two ways to tell a target's fetch agent about new
command ORBs:
- Write the ORB's address to the ORB_POINTER register. This must not
be done while the fetch agent is active.
- Put the ORB's address into the previously submitted ORB's next_ORB
field and write to the DOORBELL register. This may be done while the
fetch agent is active or suspended. It must not be done while the
fetch agent is in reset state.
Sbp2 has a last_orb pointer which indicates in what way a new command
should be announced. That pointer is concurrently accessed at various
occasions. Furthermore, initiator and target are accessing the next_ORB
field of ORBs concurrently and asynchronously.
This patch does:
- Protect all initiator accesses to last_orb by sbp2_command_orb_lock.
- Add pci_dma_sync_single_for_device before a previously submitted
ORB's next_ORB field is overwritten.
- Insert a memory barrier between when next_ORB_lo and next_ORB_hi are
overwritten. Next_ORB_hi must not be updated before next_ORB_lo.
- Remove the rather unspecific and now superfluous qualifier "volatile"
from the next_ORB fields.
- Add comments on how last_orb is connected with what is known about
the target's fetch agent's state.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
At least Maxtor OneTouch III require a "start stop unit" command after
auto spin-down before the next access can proceed. This patch activates
the responsible code in scsi_mod for all Maxtor SBP-2 disks.
https://bugzilla.novell.com/show_bug.cgi?id=183011
Maybe that should be done for all SBP-2 disks, but better be cautious.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Replace occurrences of the magic value ~(u64)0 for invalid
CSR address spaces by a named constant for better readability.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
The proper designator of an invalid CSR address is ~(u64)0, not (u64)0.
Use the correct value in initialization and deregistration.
Also, scsi_id->sbp2_lun does not need to be initialized twice.
(scsi_id was kzalloc'd.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
If sbp2 is forced to move data via ARM handler, the maximum packet size
allowed for S800 transfers exceeds ohci1394's buffer size on platforms
where PAGE_SIZE is 4096.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Since this is useful information, promote it from a debug macro to
a regular log message. The message appears only if the user set
exclusive_login=0, therefore won't clutter the logs in normal use.
Also update the comment on exclusive_login.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
This code became ineffective a few Linux releases ago and is not
required anyway.
Note from Christoph Hellwig: scsi_cmnd.request_buffer is always a
scatterlist these days. Checking random bites into it and then
mangling the data in sbp2_check_sbp2_response will cause really bad
memory corruption when you're not lucky enough to have the check not
trigger by luck.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Add support for the following types of hardware:
+ nodes that have a link speed < PHY speed
+ 1394b PHYs that are less than S800 capable
+ 1394b/1394a adapter cable between two 1394b PHYs
Also, S1600 and S3200 are now supported if IEEE1394_SPEED_MAX is raised.
A probing function is added to nodemgr's config ROM fetching routine
which adjusts the allowable speed if an access problem was encountered.
Pros and Cons of the approach:
+ minimum code footprint to support this less widely used hardware
+ nearly no overhead for unaffected hardware
- ineffective before nodemgr began to read the ROM of affected nodes
- ineffective if ieee1394 is loaded with disable_nodemgr=1
The speed map CSRs which are published to the bus are not touched by the
patch.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Hakan Ardo <hakan@debian.org>
Cc: Calculex <linux@calculex.com>
Cc: Robert J. Kosinski <robk@cmcherald.com>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
The workarounds are not required for DViCO Momobay FX-3A and AFAIR not
for Momobay CX-2. These contain an TSB42AA9A but feature the same
firmware_revision value as the older DViCO Momobay CX-1.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
From: Stefan Richter <stefanr@s5r6.in-berlin.de>
I added a failure check in patch "sbp2: variable status FIFO address (fix
login timeout)" --- alas for a wrong error value. This is a bug since
Linux 2.6.16. Leads to NULL pointer dereference if the call failed, and
bogus failure handling if call succeeded.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
Cc: Ben Collins <bcollins@debian.org>
Cc: Jody McIntyre <scjody@modernduck.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Re-enable posted writes for status FIFO.
Besides bringing back a very minor bandwidth tweak from Linux 2.6.15.x
and older, this also fixes an interoperability regression since 2.6.16:
http://bugzilla.kernel.org/show_bug.cgi?id=6356
(sbp2: scsi_add_device failed. IEEE1394 HD is not working anymore.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Vanei Heidemann <linux@javanei.com.br>
Tested-by: Martin Putzlocher <mputzi@gmx.de> (chip type unconfirmed)
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In case the blacklist with workarounds for device bugs yields a false
positive, the module load parameter can now also be used as an override
instead of an addition to the blacklist.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Apple decided to copy some USB stupidity over to FireWire.
The sector number returned by iPods from read_capacity is one too many.
This may cause I/O errors, especially if the kernel is configured for EFI
partition support. We use the same workaround as usb-storage but have to
check for different model IDs.
http://marc.theaimsgroup.com/?t=114233262300001https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=187409
Acknowledgements:
Diagnosis and therapy by Mathieu Chouquet-Stringer <ml2news@free.fr>,
additional data about affected and unaffected Apple hardware from
Vladimir Kotal, Sander De Graaf, Bryan Olmstead and Hugh Dixon.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Grand unification of the three types of workarounds we have so far.
The "skip mode page 8" workaround is now limited to devices which
pretend to be of TYPE_DISK instead of TYPE_RBC. This workaround is no
longer enabled for Initio bridges.
Patch update in anticipation of more workarounds:
- Add module parameter "workarounds".
- Deprecate parameter "force_inquiry_hack".
- Compose the blacklist of a compound type for better readability and
extensibility.
- Remove a now unused #define.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sbp2util_mark_command_completed takes a lock which was already taken by
sbp2scsi_complete_all_commands. This is a regression in Linux 2.6.15.
Reported by Kristian Harms at
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=187394
[ More complete commentary, as response to questions by Andrew: ]
> This changes the call environment for all implementations of
> ->Current_done(). Are they all safe to call under this lock?
Short answer: Yes, trust me. ;-) Long answer:
The done() callbacks are passed on to sbp2 from the SCSI stack along
with each SCSI command via the queuecommand hook. The done() callback
is safe to call in atomic context. So does
Documentation/scsi/scsi_mid_low_api.txt say, and many if not all SCSI
low-level handlers rely on this fact. So whatever this callback does,
it is "self-contained" and it won't conflict with sbp2's internal ORB
list handling. In particular, it won't race with the
sbp2_command_orb_lock.
Moreover, sbp2 already calls the done() handler with
sbp2_command_orb_lock taken in sbp2scsi_complete_all_commands(). I
admit this is ultimately no proof of correctness, especially since this
portion of code introduced the spinlock recursion in the first place and
we didn't realize it since this code's submission before 2.6.15 until
now. (I have learned a lesson from this.)
I stress-tested my patch on x86 uniprocessor with a preemptible SMP
kernel (alas I have no SMP machine yet) and made sure that all code
paths which involve the sbp2_command_orb_lock were gone through multiple
times.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- move call of scsi_print_command from sbp2_send_command to the beginning of
sbp2_queue_command to show also commands which are not sent
- put sbp2's name into scsi_print_sense
- use __FUNCTION__ in log messages
- remove a few less useful log messages and comments
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
Sbp2 relied on DID_OK to be defined as 0. Always shift DID_OK into the right
position anyway, and explicitly return DID_OK together with CHECK_CONDITION.
Also comment on some #if 0 code. The patch does not change current behaviour.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
Sbp2 did not check for successful registration of the lower address range
when CONFIG_IEEE1394_SBP2_PHYS_DMA was set. If hpsb_register_addrspace
failed, a "login timed-out" would occur which is misleading. Now sbp2 logs
a sensible error message.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
When a new SBP-2 unit is added, sbp2 now takes a reference on the 1394
low-level driver (ohci1394 or pcilynx). This prevents the 1394 host driver
module from being unloaded, e.g. by an administrative routine cleanup of
unused kernel modules or when another 1394 driver which depends on ohci1394
is unloaded.
The reference is dropped when the SBP-2 unit was disconnected, when sbp2 is
unloaded or detached from the unit, or when addition of the SBP-2 unit failed.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
Since about Linux 2.6.14, sbp2's inquiry workaround did not work anymore
due to changes in the SCSI layer. Update it to become effective again.
Testing one of the two known affected bridges has shown that skip_ms_page_8
is required as well.
Also, make force_inquiry_hack tunable via /sys/module/sbp2/parameters.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
(cherry picked from 99496037c6744fd938ffb8ccfc8fc91762322ff8 commit)
Let the ieee1394 core select a suitable 1394 address range for sbp2's
status FIFO instead of using a fixed range. Since the core only selects
addresses which are guaranteed to be out of the "physical range" as per
OHCI 1.1, this patch also fixes an old bug:
OHCI controllers which implement a writeable PhysicalUpperBound register
included sbp2's status FIFO in the physical range. That way sbp2 was
never notified of a succesful login and always failed after timeout.
Affected OHCI host adapters include ALi and Fujitsu controllers.
As another side effect of this patch, the status FIFO is no longer
located in a range for which OHCI chips perform "posted writes". Each
status write now requires a response subaction. But since large data
transfers involve only few status writes, there is no measurable
decrease of I/O throughput. What's more, the status FIFO is now safe
from potential host bus errors. Nevertheless, posted writes could be
re-enabled by extensions to the ARM features of the 1394 stack.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
(cherry picked from b2d38cccad4ef80d6b672b8f89aae5fe2907b113 commit)
If there were commands enqueued but not completed before an SBP-2 unit
was unplugged (or an attempt to reconnect failed), knodemgrd or any
process which tried to remove the device would sleep uninterruptibly
in blk_execute_rq(). Therefore make sure that all commands are
completed when sbp2 retreats.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
(cherry picked from 61daa34c132c5d4ed8630e2c46e9bf2f0c7b3428 commit)
sbp2.c mangles INQUIRY response in a way that only applies to standard
inquiry data (i.e. when both cmddt and evpd bits are 0). Leave other cases
alone; e.g. when asking for VPD the length of reply is in byte 3, not 4
and byte 4 is the first byte of device serial number.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Use sbp2_remove_device() to free FIFO and ORB DMAs in a failure case.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
sbp2_create_command_orb() code cleanup:
- add two helper functions to reduce nesting depth
- omit the return value which was always ignored
- remove unnecessary declaration from sb2.h
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
If scsi_add_device() at the end of sbp2_start_device() fails, e.g. due to
transport errors during SCSI inquiry, sbp2 needs to log out of the device
and release all associated resources.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>