Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment ends up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.
Tomo: restorethe code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbd
according to padding alignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).
This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbd.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
Interrupts must be disabled if using kmap_atomic(KM_IRQ0), but that was
not the case in a few code paths coming directly from ATA driver
interrupt handlers (which use spin_lock rather than spin_lock_irqsave).
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Back in 2.6.17-rc2, a libata module parameter was added for atapi_dmadir.
That's nice, but most SATA devices which need it will tell us about it
in their IDENTIFY PACKET response, as bit-15 of word-62 of the
returned data (as per ATA7, ATA8 specifications).
So for those which specify it, we should automatically use the DMADIR bit.
Otherwise, disc writing will fail by default on many SATA-ATAPI drives.
This patch adds ATA_DFLAG_DMADIR and make ata_dev_configure() set it
if atapi_dmadir is set or identify data indicates DMADIR is necessary.
atapi_xlat() is converted to check ATA_DFLAG_DMADIR before setting
DMADIR.
Original patch is from Mark Lord.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix libata kernel-doc parameter:
Warning(linux-2.6.25-rc2-git3//drivers/ata/libata-scsi.c:845): No description found for parameter 'rq'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This just updates the libata slave configure routine to take advantage
of the block layer drain buffers. It also adjusts the size lengths in
the atapi code to add the drain buffer to the DMA length so the driver
knows it can rely on it.
I suspect I should also be checking for AHCI as well as ATA_DEV_ATAPI,
but I couldn't see how to do that easily.
tj: * atapi_drain_needed() added such that draining is applied to only
misc ATAPI commands.
* q->bounce_gfp used when allocating drain buffer.
* Now duplicate ATAPI PIO drain logic dropped.
* ata_dev_printk() used instead of sdev_printk().
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
that provided by the block layer
ATA requires that all DMA transfers begin and end on word boundaries.
Because of this, a large amount of machinery grew up in ide to adjust
scatterlists on this basis. However, as of 2.5, the block layer has a
dma_alignment variable which ensures both the beginning and length of a
DMA transfer are aligned on the dma_alignment boundary. Although the
block layer does adjust the beginning of the transfer to ensure this
happens, it doesn't actually adjust the length, it merely makes sure
that space is allocated for transfers beyond the declared length. The
upshot of this is that scatterlists may be padded to any size between
the actual length and the length adjusted to the dma_alignment safely
knowing that memory is allocated in this region.
Right at the moment, SCSI takes the default dma_aligment which is on a
512 byte boundary. Note that this aligment only applies to transfers
coming in from user space. However, since all kernel allocations are
automatically aligned on a minimum of 32 byte boundaries, it is safe to
adjust them in this manner as well.
tj: * Adjusting sg after padding is done in block layer. Make libata
set queue alignment correctly for ATAPI devices and drop broken
sg mangling from ata_sg_setup().
* Use request->raw_data_len for ATAPI transfer chunk size.
* Killed qc->raw_nbytes.
* Separated out killing qc->n_iter.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
[SCSI] usbstorage: use last_sector_bug flag universally
[SCSI] libsas: abstract STP task status into a function
[SCSI] ultrastor: clean up inline asm warnings
[SCSI] aic7xxx: fix firmware build
[SCSI] aacraid: fib context lock for management ioctls
[SCSI] ch: remove forward declarations
[SCSI] ch: fix device minor number management bug
[SCSI] ch: handle class_device_create failure properly
[SCSI] NCR5380: fix section mismatch
[SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices
[SCSI] IB/iSER: add logical unit reset support
[SCSI] don't use __GFP_DMA for sense buffers if not required
[SCSI] use dynamically allocated sense buffer
[SCSI] scsi.h: add macro for enclosure bit of inquiry data
[SCSI] sd: add fix for devices with last sector access problems
[SCSI] fix pcmcia compile problem
[SCSI] aacraid: add Voodoo Lite class of cards.
[SCSI] aacraid: add new driver features flags
[SCSI] qla2xxx: Update version number to 8.02.00-k7.
[SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command.
...
Hugh Dickens noticed that SMART commands issued from user space can
end up corupting memory. The problem occurs if the buffer used to
read data spans two pages. The reason is that the PIO sector routines
in libata are expecting physically contiguous pages when they do
sector operations, so the left overs on the second page go into the
next physically adjacent page rather than the next page in the sg
mapping.
Fix this by enforcing strict 512 byte alignment on all buffers from
userspace.
Acked-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata used private sg iterator to handle padding sg. Now that sg can
be chained, padding can be handled using standard sg ops. Convert to
chained sg.
* s/qc->__sg/qc->sg/
* s/qc->pad_sgent/qc->extra_sg[]/. Because chaining consumes one sg
entry. There need to be two extra sg entries. The renaming is also
for future addition of other extra sg entries.
* Padding setup is moved into ata_sg_setup_extra() which is organized
in a way that future addition of other extra sg entries is easy.
* qc->orig_n_elem is unused and removed.
* qc->n_elem now contains the number of sg entries that LLDs should
map. qc->mapped_n_elem is added to carry the original number of
mapped sgs for unmapping.
* The last sg of the original sg list is used to chain to extra sg
list. The original last sg is pointed to by qc->last_sg and the
content is stored in qc->saved_last_sg. It's restored during
ata_sg_clean().
* All sg walking code has been updated. Unnecessary assertions and
checks for conditions the core layer already guarantees are removed.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and
ATA_PROT_ATAPI_* are inconsistent causing confusion. Rename them to
ATAPI_PROT_* and make them consistent with ATA counterpart.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
atapi_request_sense() is now the only left user of ata_sg_init_one().
Convert it to use sg interface.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Rusty Russel <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Historically word 48 in the identify data was used to mean 32bit I/O
was supported for VLB IDE etc. ATA8 reassigns this word to the Trusted
Computing Group, where it is used for TCG features. This means that
an ATA8 TCG drive is going to trigger 32bit I/O on some systems which
will be funny.
Anyway we need to sort this out ready for ATA8 so:
- Reorder the ata.h header a bit so the ata_version function occurs early
in it
- Make dword_io check the ATA version
- Add an ATA8 version checking TCG presence test
While we are at it the current drafts have a flaw where it may not be
possible to disable TCG features at boot (and opt out of the trusted
model) as TCG intends because it relies on presence of a different
optional feature (DCS). Handle this in software by refusing the TCG
commands if libata.allow_tpm is not set. (We must make it possible
as some environments such as proprietary VDR devices will doubtless
want to use it to lock up content)
Finally as with CPRM print a warning so that the user knows they may
not be able to full access and use the device.
Signed-off-by: Alan Cox <alan@redhat.com>
After 9b8e8de7, manage_start_stop configuration depends on valid ATA
device. Move it into ata_scsi_dev_config(). This was detected by the
coverity checker.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch relaxes the default SCSI DMA alignment from 512 bytes to 4
bytes. I remember from previous discussions that usb and firewire have
sector size alignment requirements, so I upped their alignments in the
respective slave allocs.
The reason for doing this is so that we don't get such a huge amount of
copy overhead in bio_copy_user() for udev. (basically all inquiries it
issues can now be directly mapped).
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
None of the drives I have follows what the standard says about
transfer chunk size. Of the four SATA and six PATA ATAPI devices
tested, four ignore transfer chunk size completely and the ones which
honor it don't behave according to the spec when it's odd.
According to the spec, transfer chunk size can be odd if the amount of
data to transfer equals or is smaller than the chunk size and the
device can indicate the same odd number and transfer the whole thing
at one go with a pad byte appended. However, in reality, none of the
drives I have does that. They all indicate and transfer even number
of bytes one byte shorter than the chunk size first; then indicate and
transfer two bytes, which is clearly out of spec.
In addition to unnecessary second PIO data phase, this also creates a
weird problem when combined with SATA controllers which perform PIO
via DMA. Some of these controllers use actualy number of bytes
received to update DMA pointer so chunks which are sized 4n + 2 makes
DMA pointer off by two bytes. This causes data corruption and buffer
overruns.
This patch rounds nbytes up to the nearest even number such that ATAPI
devices don't split data transfer for the last odd byte. This
shouldn't confuse controllers which depend on transfer chunk size as
devices will report the rounded-up number, actually transfer that much
and padding buffer is there to receive them.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Sebastian Kemper reported that issuing CD/DVD commands under libata is
not fully compatible with ide-scsi. In particular, the
GPCMD_SET_STREAMING was being rejected at the host level in some
instances.
The reason is that libata-scsi insists upon the cmd_len field exactly
matching the SCSI opcode being issued, whereas ide-scsi tolerates
12-byte commands contained within a 16-byte (cmd_len) CDB.
There doesn't seem to be a good reason for us to not be compatible
there, so here is a patch to fix libata-scsi to permit SCSI opcodes so
long as they fit within whatever size CDB is provided.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement ATA_QCFLAG_QUIET which indicates that there's no need to
report if the command fails with AC_ERR_DEV and set it for passthrough
commands.
Combined with previous changes, this now makes device errors for all
direct commands reported directly to the issuer without going through
EH actions and reporting.
Note that EH is still invoked after non-IO device errors to determine
the nature of the error and resume command execution (some controller
requires special care after error to continue). It just performs
default maintenance after error, examines what's going on, realizes
that it's none of its business and reports the command failure without
logging any error messages.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_QCFLAG_IO is used to mark commands which are used to perform
regluar IO transfers via block layer. These commands are assumed to
be valid and taken more seriously during error handling. Cache flush
is used by regular IO path and necessary for data integrity. Mark it
with ATA_QCFLAG_IO.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Device Initiated Power Management, which is defined
in SATA 2.5 can be enabled for disks which support it.
This patch enables DIPM when the user sets the link
power management policy to "min_power".
Additionally, libata drivers can define a function
(enable_pm) that will perform hardware specific actions to
enable whatever power management policy the user set up
for Host Initiated Power management (HIPM).
This power management policy will be activated after all
disks have been enumerated and intialized. Drivers should
also define disable_pm, which will turn off link power
management, but not change link power management policy.
Documentation/scsi/link_power_management_policy.txt has additional
information.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Some commands need post-processing after successful completion. This
was done in ata_scsi_qc_complete() till now but this has the following
problems.
* Post-command processing gets executed when qc is completed from EH.
Some qc's are retried from EH with zero err_mask and thus triggers
unnecessary/incorrect post-command processing.
* Command post processing doesn't belong to SAT layer.
* Link-wide revalidation was scheduled where device revalidation
suffices.
This patch moves post-command processing to success completion path of
ata_qc_complete() which is travelled iff the command is going to be
completed without passing through EH and updates post-command
processing such that device-specific action is used. While at it,
restructure code a bit for readability.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tackle the relatively sane complaints of checkpatch --file.
The vast majority is indentation and whitespace changes, the rest are
* #include fixes
* printk KERN_xxx prefix addition
* BSS/initializer cleanups
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
LIBATA_MAX_PRD is the maximum number of DMA scatter/gather elements
permitted by the HBA's DMA engine. It's properly set to
q->max_hw_segments via the sg_tablesize parameter.
libata shouldn't call blk_queue_max_phys_segments. Now LIBATA_MAX_PRD
is equal to SCSI_MAX_PHYS_SEGMENTS by default (both is 128), so
everything is fine. But if they are changed, some code (like the scsi
mid layer, sg chaining, etc) might not work properly.
(Addition from Jens) The basic issue is that the physical segment
setting is purely a driver issue. And since SCSI is managing the sglist,
libata has no business changing the setting. All libata should care
about is the hw segment setting.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fix libata docbook warnings.
Warning(linux-2.6.23-git8//drivers/ata/libata-scsi.c:3251): No description found for parameter 'dev'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commands which can change device configuration, EH is scheduled
to revalidate and reconfigure the device. Host link was incorrectly
used unconditionally when scheduling EH action. This resulted in
bogus revalidation request and mismatched configuration between device
and driver. Fix it.
This bug was reported by Igor Durdanovic.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Igor Durdanovic <idurdanovic@comcast.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some controller variants snoop the ATAPI length value for Packet
transfers to do state machine and FIFO management. Thus we want to
set it properly, even for cases where it is otherwise meaningless.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
AN serves multiple purposes. For ATAPI, it's used for media change
notification. For PMP, for downstream PHY status change notification.
Implement sata_async_notification() which demultiplexes AN.
To avoid unnecessary port events, ATAPI AN is not enabled if PMP is
attached but SNTF is not available.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kriten Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some pseudo devices fail PM commands unnecessarily aborting system
suspend. Implement ATA_HORKAGE_SKIP_PM which makes libata skip PM
commands for these devices.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Controllers which support PMP have various restrictions on which
combinations of commands are allowed to what number of devices
concurrently. This patch implements ops->qc_defer() which determines
whether a qc can be issued at the moment or should be deferred.
If the function returns ATA_DEFER_LINK, the qc will be deferred until
a qc completes on the link. If ATA_DEFER_PORT, until a qc completes
on any link. The defer conditions are advisory and in general
ATA_DEFER_LINK can be considered as lower priority deferring than
ATA_DEFER_PORT.
ops->qc_defer() replaces fixed ata_scmd_need_defer(). For standard
NCQ/non-NCQ exclusion, ata_std_qc_defer() is implemented. ahci and
sata_sil24 are converted to use ata_std_qc_defer().
ops->qc_defer() is heavier than the original mechanism because full qc
is prepped before determining to defer it, but various information is
needed to determine defer conditinos and fully translating a qc is the
only way to supply such information in generic manner.
IMHO, this shouldn't cause any noticeable performance issues as
* for most cases deferring occurs rarely (except for NCQ-aware
cmd-switching PMP)
* translation itself isn't that expensive
* once deferred the command won't be repeated until another command
completes which usually is a very long time cpu-wise.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update AN support in preparation of PMP support.
* s/ata_id_has_AN/ata_id_has_atapi_AN/
* add AN enabled reporting during configuration
* add err_mask to AN configuration failure reporting
* update LOCKING comment for ata_scsi_media_change_notify()
* check whether ATA dev is attached to SCSI dev ata_scsi_media_change_notify()
* set ATA_FLAG_AN in ahci and sata_sil24
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kriten Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Clear ARRE, we don't do auto-reallocation on reads, just on writes.
Also, hardcode the size of the array using RW_RECOVERY_MPAGE_LEN,
following the style of the surrounding code.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* SAT specifies that FORMAT UNIT should be translated into a series
of READ and WRITE commands that zero the ATA device. That is far too
cumbersome to bother with.
Since we don't actually format the device, the old behavior of
always returning success was inaccurate. Change FORMAT UNIT from
returning success immediately (old behavior) to always returning
an error (new behavior).
* Add some comments around SYNCHRONIZE CACHE
* Shuffle scsi command code around a bit, so that things are close
to alphabetic order.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
A few pedantic apps care about missing or lame "mandatory" SCSI
commands, so
REQUEST SENSE -- as we autosense, R.S. just returns zeroes
SEND DIAGNOSTIC -- our default (no-op) self-test succeeds, all
other requests for testing fail.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
simple search-and-replace of direct scsi_cmnd access to
use the data buffer accessors.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This is a minimal patch needed to remove use of !use_sg
but it is not a complete clean up of the !use_sg paths.
Libata-core still has the qc->flags & ATA_QCFLAG_SG
and !qc->n_elem code paths. Perhaps an ata maintainer
would have a go at it.
- TODO: further cleanup of qc->flags & ATA_QCFLAG_SG
and !qc->n_elem code paths in libata-core
- TODO: Use scsi_dma_{map,unmap} where applicable.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When we get an SDB FIS with the 'N' bit set, we should send
an event to user space to indicate that there has been a
media change. This will be done via the scsi device.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add support for issuing ATA_16 passthru commands to ATAPI devices
managed by libata. It requires the previous CDB length fix patch.
A boot/module parameter, "atapi_passthru16=0" can be used to globally
disable this feature, if ever desired.
tj: restructured __ata_scsi_queuecmd() according to Jeff's suggestion.
Signed-off-by: Mark Lord <liml@rtr.ca>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update hotplug to handle PMP links. When PMP is attached, the PMP
number corresponds to C of SCSI H:C:I:L. While at it, change argument
to ata_find_dev() to @devno from @id to avoid confusion with SCSI
device ID.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Multiple links and different number of devices per link should be
considered to iterate over links and devices. This patch implements
and uses link and device iterators - ata_port_for_each_link() and
ata_link_for_each_dev() - and ata_link_max_devices().
This change makes a lot of functions iterate over only possible
devices instead of from dev 0 to dev ATA_MAX_DEVICES. All such
changes have been examined and nothing should be broken.
While at it, add a separating comment before device helpers to
distinguish them better from link helpers and others.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>