Fix commands timeout with Sil3124/3132 based HBA when pass-through ATA
commands [where ATA_QCFLAG_RESULT_TF is set] are used while other
commands are active on other devices connected to the same port with a
Port Multiplier. Due to a hardware bug, these commands must be sent
alone, like ATAPI commands.
Signed-off-by: Gwendal Grignou <gwendal@google.com>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
After non-classifying reset, ehc->classes[] could contain
ATA_DEV_UNKNOWN which used to be normalized to ATA_DEV_NONE for
consistency. However, this causes unfortunate side effect for drivers
which have non-classifying hardresets (e.g. sata_nv) by making
hardreset report ATA_DEV_NONE for non-classifying resets and thus
makes EH believe that the port is unoccupied and recovery can be
skipped. The end result is that after a device is swapped with
another one, the new device isn't attached after the old one is
detached.
This patch makes ata_eh_reset() not normalize UNKNOWN to NONE after
non-classifying resets. This fixes the above problem. As UNKNOWN and
NONE are handled differently by only EH hotplug logic, this doesn't
cause other behavior changes.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Timeout on downstream command may indicate transmission problem on
host link. Propagate timeouts to host link.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Turns out distros always enabled burst mode and it is pretty essential so
do the same. Also sort out the post DMA mode restore properly.
My 20263 card now seems happy but needs some four drive tests done yet
(when I've persuaded the kernel not to hang in the edd boot code if I
plug them in ..)
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It is legitimate (although annoying and silly) for a PCI IDE controller
not to be assigned an interrupt and to be polled. The libata-sff code
should therefore not try and request IRQ 0 in this case.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
During conversion to new EH, sata_qstor was accidentaly changed to use
softreset, which is buggy on this chip, instead of hardreset. This
patch updates sata_qstor such that it uses hardreset again.
This fixes bugzilla bug 9631.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
With ATAPI transfer chunk size properly programmed, libata PIO HSM
should be able to handle full spurious data chunks. Also, it's a good
idea to suppress trailing data warning for misc ATAPI commands as
there can be many of them per command - for example, if the chunk size
is 16 and the drive tries to transfer 510 bytes, there can be 31
trailing data messages.
This patch makes the following updates to libata ATAPI PIO HSM
implementation.
* Make it drain full spurious chunks.
* Suppress trailing data warning message for misc commands.
* Put limit on how many bytes can be drained.
* If odd, round up consumed bytes and the number of bytes to be
drained. This gets the number of bytes to drain right for drivers
which do 16bit PIO.
This patch is partial backport of improve-ATAPI-data-xfer patchset
pending for #upstream.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
While updating lbam/h for ATAPI commands, atapi_eh_request_sense() was
left out. Update it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement _GTF command filtering which can be controlled by
libata.acpi_filter kernel parameter. Currently SETXFER and LOCK
commands are filtered.
libata configures transfer mode by itself and _GTF SETXFER commands
can potentially disrupt device configuration. _GTM/_STM mechanism
can't handle hotplugging too well and when _GTF is executed,
controller is in PIO0 rather than the mode _STM configured.
Note that detecting SET MAX LOCK requires looking at the previous
command. This adds a bit to code complexity.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
As _GTF commands can't transfer data, device error never signals
transfer error. It indicates that the device vetoed the operation, so
it's meaningless to retry.
This patch makes libata-acpi to report and continue on device errors
when executing _GTF commands. Also commands rejected by device don't
contribute to the number of _GTF commands executed.
While at it, update _GTF execution reporting such that all successful
commands are logged at KERN_DEBUG and rename taskfile_load_raw() to
ata_acpi_run_tf() for consistency.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* If _GTF evalution fails, it's pointless to retry. If nothing else
is wrong, just ignore the error.
* After disabling ACPI, return success iff the number of executed _GTF
command equals zero. Otherwise, tell EH to retry. This change
fixes bogus 1 return bug where ata_acpi_on_devcfg() expects the
caller to reload IDENTIFY data and continue but the caller
interprets it as an error.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
On certain implementations, _GTF evaluation depends on preceding _STM
and both can be pretty picky about the configuration. Using _GTM
result cached during controller initialization satisfies the most
neurotic _STM implementation. However, libata evaluates _GTF after
reset during device configuration and the hardware state can be
different from what _GTF expects and can cause evaluation failure.
This patch adds dev->gtf_cache and updates ata_dev_get_GTF() such that
it uses the cached value if available. Cache is cleared with a call
to ata_acpi_clear_gtf().
Because for SATA ACPI nodes _GTF must be evaluated after _SDD which
can't be done till IDENTIFY is complete, _GTF caching from
ata_acpi_on_resume() is used only for IDE ACPI nodes.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
_GTM fetches currently configured transfer mode while _STM configures
controller according to _GTM parameter and prepares transfer mode
configuration TFs for _GTF. In many cases _GTM and _STM
implementations are quite brittle and can't cope with configuration
changed by libata.
libata does not depend on ATA ACPI to configure devices. The only
reason libata performs _GTM and _STM are to make _GTF evaluation
succeed and libata also doesn't care about how _GTF TFs configure
transfer mode. It overrides that configuration anyway, so from
libata's POV, it doesn't matter what value is feeded to _STM as long
as evaluation succeeds for _STM and following _GTF.
This patch adds dev->__acpi_init_gtm and store initial _GTM values on
host initialization before modified by reset and mode configuration.
If the field is valid, ata_acpi_init_gtm() returns pointer to the
saved _GTM structure; otherwise, NULL.
This saved value is used for _STM during resume and peek at
BIOS/firmware programmed initial timing for later use. The accessor
is there to make building w/o ACPI easy as dev->__acpi_init doesn't
exist if ACPI is not enabled.
On driver detach, the initial BIOS configuration is restored by
executing _STM with the initial _GTM values such that the next driver
can also use the initial BIOS configured values.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add two hooks - ata_acpi_dissociate() which is called during driver
detach after the whole host is shutdown and ata_acpi_on_disable()
which is called when a device is disabled.
Signed-off-by: Tejun heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_port_detach() calls ata_dev_disable() with host lock held but
ata_dev_disable() should be called from EH context. ata_port_detach()
steals EH context by setting ATA_PFLAG_UNLOADAING and flushing EH.
Drop locking around ata_dev_disable() and note that ata_port_detach()
owns EH context at that point.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* No internal function uses const ata_port. Drop const from @ap.
* Make ata_acpi_stm() copy @stm before using it and change @stm to
const.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Improve the existing boot/load time warnings from sata_mv
for Highpoint RocketRAID 23xx cards, based on new knowledge
about where the BIOS likes to overwrite sectors with metadata.
Harmless to us, but very useful for end users.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Like ST380817AS / 3.42, ST3160023AS / 3.42 times out commands if NCQ
is used. Blacklist it. This is reported by Matheus Izvekov in the
following thread.
http://thread.gmane.org/gmane.linux.ide/24202
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Matheus Izvekov <mizvekov@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
link->eh_info.serror is used to cache SError for controllers which
need it cleared from interrupt handler to clear IRQ. It also should
be cleared after reset just like SError itself.
Make ata_std_postreset() clear link->eh_info.serror too and update
sata_sil such that it doesn't care about bookkeeping the value.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Interestingly, sata_sil raises spurious interrupts if it's coupled
with Sil SATA_PATA bridge. Currently, sata_sil interrupt handler is
strict about spurious interrupts and freezes the port when it occurs.
This patch makes it more forgiving.
* On SATA PHY event interrupt, serror value is checked to see whether
it really is PHYRDY CHG event. If not, SATA PHY event interrupt is
ignored.
* If ATA interrupt occurs while no command is in progress, it's
cleared and ignored.
This fixes bugzilla bug 9505.
http://bugzilla.kernel.org/show_bug.cgi?id=9505
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Spurious NCQ completion detection implemented in ahci was incorrect.
On AHCI receving and processing FISes and raising interrupts are not
interlocked and spurious interrupts are expected.
For example, if an interrupt occurs while interrupt handler is running
and the running interrupt handler handles the event the new IRQ
indicated, after IRQ handler finishes, it will be executed again
because IRQ pending bit is set by the new interrupt but there won't be
anything to process.
Please read the following message for more information.
http://article.gmane.org/gmane.linux.ide/26012
This patch...
* Removes all spurious IRQ whining from ahci. Spurious NCQ completion
detection was completely wrong. Spurious D2H Register FIS taught us
that some early drives send spurious D2H Register FIS with I bit set
while NCQ commands are in progress but none of recent drives does
that and even the ones which show such behavior can do NCQ fine.
* Kills all NCQ blacklist entries which were added because of spurious
NCQ completions. I tracked down each commit and verified all
removed ones are actually added because of spurious completions.
WD740ADFD-00NLR1 wasn't deleted but moved upward because the drive
not only had spurious NCQ completions but also is slow on sequential
data transfers if NCQ is enabled.
Maxtor 7V300F0 was added by 0e3dbc01d5
from Alan Cox. I can only find evidences that the drive only had
troubles with spuruious completions by searching the mailing list.
This entry needs to be verified and removed if it doesn't have other
NCQ related problems.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ICH6 R/Ms share PCI ID between piix and ahci modes and we've been
allowing ahci to attach regardless of how BIOS configured it.
However, enabling AHCI mode when the controller is in combined mode
can result in unexpected behavior. Don't attach if the controller is
in combined mode.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Bill Nottingham <notting@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add Toshiba Tecra M4 to broken suspend list. This is from OSDL
bugzilla bug 7780.
Signed-off-by: Peter Schwenke <peter@bluetoad.com.au>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There isn't much point in reporting -EOPNOTSUPP as failure. Also the
message was missing newline.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Don't program UDMA timings when programming PIO or MWDMA modes.
This has also a nice side-effect of fixing regression added by commit
681c80b5d9 ("libata: correct handling of
SRST reset sequences") (->set_piomode method for PIO0 is called before
->cable_detect method which checks UDMA timings to get the cable type).
* Bump driver version.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Tested-by: "Thomas Lindroth" <thomas.lindroth@gmail.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add the device IDs of legacy mode of MCP79 AHCI controller to ahci.c
Signed-off-by: Peer Chen <peerchen@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The Highpoint RocketRAID boards using Marvell 7042 chips
overwrite the 9th sector of attached drives at boot time,
when those drives are configured as "Legacy" (the default)
in the HighPoint BIOS.
This kills GRUB, and probably other stuff.
But it all happens *before* Linux is even loaded.
So, for now we'll log a WARNING when such boards are detected,
and advise users to configure BIOS "JBOD" volumes instead,
which don't appear to suffer from this problem.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
We need to run any DMA command with result taskfile requested in ADMA mode
when the port is in ADMA mode, otherwise it may try to use the legacy DMA engine
in ADMA mode which is not allowed. Enforce this with BUG_ON() since data
corruption could potentially result if this happened. Also, fail any attempt to
try and issue NCQ commands with result taskfile requested, since the hardware
doesn't allow this.
Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sata_mv: Fix broken Marvell 7042 support.
The Marvell 7042 chip is more or less the same as the 6042 internally,
but sports a PCIe bus. Despite having identical SATA cores, the 7042
does differ from its PCI bus counterparts in placment and layout of
certain bus related registers.
This patch fixes sata_mv to distinguish between the PCI bus registers
of earlier chips, and the PCIe bus registers of the 7042.
Specifically, move the offsets and bit patterns for the
PCI/PCIe interrupt cause/mask registers into the struct mv_host_priv,
as these values differ between the 6xxx and 7xxx series chips.
This fixes the driver to not access reserved PCI addresses,
and prevents the lockups reported in linux-2.6.24 with 7042 boards.
Also add a new PCI ID for the Highpoint 2300 7042-based board
that I'm using for testing this stuff here.
Tested with Marvell 6081 + 7042 chips, on x86 & x86_64.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
On Fri, 30 Nov 2007 14:34:11 +0200 (EET)
Meelis Roos <mroos@linux.ee> wrote:
> > Can you stick a stack trace in at that point ? That would help diagnose
> > it a great deal quicker.
>
> Finally done - found out hard way that BUG() is too bad and
> dump_st5ack() suits me better.
Thanks. This should fix the real cause, and also allow for port start to
fail politely with -ENODEV.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add more toshiba laptops to broken suspend list. This is from OSDL
bugzilla bug 7780.
tj: re-formatted patch and added description and SOB.
Signed-off-by: Peter Schwenke <peter@bluetoad.com.au>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata: Add more TSST (Samsung/Toshiba) IDE drives with broken
cable detection validation bits.
signed-off-by: Peter Missel (peter.missel@onlinehome.de)
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Protocol and CDB allocation size field are important in determining
what went wrong with ATAPI commands. Report them on failure.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Several fixes for the AVR32 PATA driver:
* Updated to use new AVR32 SMC timing API. This removes the need for "magic"
constants in signal timing.
* Removed the ATA_FLAG_PIO_POLLING, the driver should use interrupts.
* Removed .port_disable and .irq_ack as these are no longer needed.
* Improved some comments.
Signed-off-by: Kristoffer Nyborg Gregertsen <kngregertsen@norway.atmel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
None of the drives I have follows what the standard says about
transfer chunk size. Of the four SATA and six PATA ATAPI devices
tested, four ignore transfer chunk size completely and the ones which
honor it don't behave according to the spec when it's odd.
According to the spec, transfer chunk size can be odd if the amount of
data to transfer equals or is smaller than the chunk size and the
device can indicate the same odd number and transfer the whole thing
at one go with a pad byte appended. However, in reality, none of the
drives I have does that. They all indicate and transfer even number
of bytes one byte shorter than the chunk size first; then indicate and
transfer two bytes, which is clearly out of spec.
In addition to unnecessary second PIO data phase, this also creates a
weird problem when combined with SATA controllers which perform PIO
via DMA. Some of these controllers use actualy number of bytes
received to update DMA pointer so chunks which are sized 4n + 2 makes
DMA pointer off by two bytes. This causes data corruption and buffer
overruns.
This patch rounds nbytes up to the nearest even number such that ATAPI
devices don't split data transfer for the last odd byte. This
shouldn't confuse controllers which depend on transfer chunk size as
devices will report the rounded-up number, actually transfer that much
and padding buffer is there to receive them.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
If a card has no IRQ then pass no interrupt handler but allow polled
usage.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hopefully there is a better long term solution but for now lets favour
reliability.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sil24 unnecessarily used LIBATA_MAX_PRD and ATAPI sg table was short
by one entry which might cause very obscure problems. This patch
updates sg table sizing such that
* One full page is used for PRB + sg table. On 4k page,
this results in 253 sg's.
* Make ATAPI sg block properly sized.
* Make build fail if command block size doesn't equal PAGE_SIZE.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>