In most cases, when EH is scheduled, all in-flight commands are
aborted causing EH to kick in immediately. However, in some cases
(especially with PMP), it's unclear which commands are affected by the
error condition and, although aborting all in-flight commands works, it
isn't optimal and may cause unnecessary disruption. On the other
hand, waiting for in-flight commands to drain themselves can take up
to 30 seconds.
This patch implements EH fast drain to handle such situations. It
gives in-flight commands some time to finish up but doesn't wait for
too long. After EH is scheduled, a fast drain timer is started, and if
no completion occurs within ATA_EH_FASTDRAIN_INTERVAL, all in-flight
commands are aborted. If any completion occurs in the interval, the
port is given another interval to finish up.
Currently ATA_EH_FASTDRAIN_INTERVAL is 3 seconds, which should be enough
for finishing up most commands.
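The timer behavior described above can be sketched as a small user-space simulation. This is only an illustration under stated assumptions, not the libata code: fastdrain_abort_time() and FASTDRAIN_INTERVAL_MS are hypothetical names, and completion times are passed in as a sorted array of milliseconds counted from the moment EH was scheduled.

```c
#include <stddef.h>

/* Interval from the commit message (3 seconds), expressed in ms. */
#define FASTDRAIN_INTERVAL_MS 3000

/*
 * Hypothetical helper.  done_ms[] holds the sorted completion times of
 * in-flight commands, n_inflight is the number of commands in flight
 * when EH was scheduled.  Returns the time at which the remaining
 * commands would be aborted, or -1 if the port drained by itself.
 */
long fastdrain_abort_time(const long *done_ms, size_t n_done, size_t n_inflight)
{
    long window_end = FASTDRAIN_INTERVAL_MS; /* timer armed when EH is scheduled */
    size_t done = 0;

    while (done < n_inflight) {
        size_t before = done;

        /* count completions that landed inside the current interval */
        while (done < n_done && done < n_inflight &&
               done_ms[done] <= window_end)
            done++;

        if (done == before)
            return window_end;               /* no progress: abort the rest */
        window_end += FASTDRAIN_INTERVAL_MS; /* progress: grant another interval */
    }
    return -1;                               /* nothing left to abort */
}
```

With two commands in flight and a single completion at 1000 ms, the sketch reports an abort at 6000 ms: the first interval saw progress, the second did not.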
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
SCSI scan may fail due to memory allocation failure even if EH is not
in progress. Due to the use of GFP_ATOMIC in the SCSI scan path, allocation
failure isn't too rare, especially while probing multiple devices at
once, which is the case when a bunch of devices are connected to a PMP.
This patch moves SCSI scan failure detection logic from
ata_scsi_hotplug() to ata_scsi_scan_host() and implements synchronous
scan behavior. The synchronous path sleeps briefly and repeats SCSI
scan if some devices aren't attached properly. It contains a robust
retry loop to minimize the chance of device misdetection during boot
and falls back to async retry if everything fails.
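The retry structure can be sketched roughly as follows. Everything here is hypothetical and for illustration only: the retry budget SCAN_TRIES, the callback types, and the fake_host stand-in are assumptions, not values or interfaces taken from the patch.

```c
#include <stdbool.h>

#define SCAN_TRIES 5    /* assumed retry budget, not taken from the patch */

/* Hypothetical callbacks standing in for the real scan and sleep steps. */
typedef int  (*scan_fn)(void *ctx);   /* returns nonzero if devices are missing */
typedef void (*nap_fn)(void *ctx);    /* brief sleep between attempts */

/*
 * Returns true if every device attached during the synchronous retries,
 * false if the caller should fall back to an asynchronous retry.
 */
bool scan_with_retry(scan_fn scan, nap_fn nap, void *ctx)
{
    int tries;

    for (tries = 0; tries < SCAN_TRIES; tries++) {
        if (scan(ctx) == 0)
            return true;
        nap(ctx);
    }
    return false;
}

/* Stand-in host whose scan fails a fixed number of times. */
struct fake_host {
    int failures_left;
    int naps;
};

int fake_scan(void *ctx)
{
    struct fake_host *h = ctx;

    if (h->failures_left > 0) {
        h->failures_left--;
        return 1;               /* some device did not attach */
    }
    return 0;
}

void fake_nap(void *ctx)
{
    struct fake_host *h = ctx;

    h->naps++;                  /* a real caller would sleep here */
}
```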
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Debouncing failure is a good indicator of a basic link problem. Use
-EPIPE to indicate debouncing failure and make ata_eh_reset() invoke
sata_down_spd_limit() if the error occurs during reset.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sata_down_spd_limit() first reads the current SPD from SStatus and
limits the speed to the lower of one below the current limit or one
below the current SPD in SStatus. SPD may not be accessible or valid
when SPD down is requested making sata_down_spd_limit() fail when it's
most needed.
This patch makes the current SPD cached after each successful reset
and forces GEN I speed (1.5Gbps) if neither SStatus nor the cached
value is valid, so sata_down_spd_limit() is now guaranteed to lower
the speed limit if lower speed is available.
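A minimal sketch of the selection rule, assuming the speed limit is kept as a bitmask (bit 0 for Gen I, bit 1 for Gen II, and so on) and SPD values are 1-based generation numbers with 0 meaning "not available". The function name down_spd_limit() and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * spd_limit: bitmask of allowed speeds, bit 0 = Gen I (1.5Gbps).
 * cur_spd:   SPD read from SStatus, 0 if inaccessible or invalid.
 * cached_spd: SPD cached after the last successful reset, 0 if none.
 * SPD comes from a 4-bit field, so values are at most 15.
 * Returns the lowered mask, or 0 if no lower speed is available.
 */
uint32_t down_spd_limit(uint32_t spd_limit, unsigned cur_spd, unsigned cached_spd)
{
    unsigned spd = cur_spd ? cur_spd : cached_spd;
    uint32_t mask = spd_limit;
    int high;

    if (mask <= 1)
        return 0;                       /* already limited to Gen I */

    /* one below the current limit: clear the highest allowed speed */
    for (high = 31; !(mask & (1u << high)); high--)
        ;
    mask &= ~(1u << high);

    if (spd > 1)
        mask &= (1u << (spd - 1)) - 1;  /* one below the current SPD */
    else
        mask &= 1;                      /* SPD unknown or Gen I: force Gen I */

    return mask;
}
```

For example, with a limit of 0x7 and no valid SPD from either source, the sketch returns 0x1, which corresponds to forcing Gen I.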
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Convert ->scr_read/write callbacks to return error code to better
indicate failure. This will help handling of SCR_NOTIFICATION.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Requiring LLDs to format multiple error description messages properly
doesn't work too well. Help LLDs a bit by making ata_ehi_push_desc()
insert ", " on each invocation. __ata_ehi_push_desc() is the raw
version without the automatic separator.
While at it, make the ehi_desc interface proper functions instead of
macros.
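The split between the separator-inserting helper and the raw one can be illustrated with a user-space sketch. The struct, the buffer size, and the function names below are stand-ins chosen for this example, and the sketch assumes the separator is skipped while the buffer is still empty.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define EHI_DESC_LEN 80     /* assumed buffer size for this sketch */

struct ehi_sketch {
    char desc[EHI_DESC_LEN];
    size_t len;
};

/* Common formatter: appends into whatever space is left. */
static void push_desc_v(struct ehi_sketch *ehi, const char *fmt, va_list ap)
{
    int n = vsnprintf(ehi->desc + ehi->len, EHI_DESC_LEN - ehi->len, fmt, ap);

    if (n > 0) {
        ehi->len += (size_t)n;
        if (ehi->len >= EHI_DESC_LEN)       /* output was truncated */
            ehi->len = EHI_DESC_LEN - 1;
    }
}

/* Raw version: no automatic separator. */
void raw_push_desc(struct ehi_sketch *ehi, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    push_desc_v(ehi, fmt, ap);
    va_end(ap);
}

/* Separator version: inserts ", " before every message but the first. */
void push_desc(struct ehi_sketch *ehi, const char *fmt, ...)
{
    va_list ap;

    if (ehi->len)
        raw_push_desc(ehi, ", ");
    va_start(ap, fmt);
    push_desc_v(ehi, fmt, ap);
    va_end(ap);
}
```

A caller that pushes two messages no longer has to format the separator itself; the raw variant remains for cases where the caller wants full control over the text.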
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add @is_cmd to ata_tf_to_fis(). This controls bit 7 of the second
byte which tells the device whether this H2D FIS is for a command or
not. This cleans up ahci a bit and will be used by PMP.
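As an illustration of the bit in question, the first bytes of a Register Host to Device FIS could be filled in as sketched below. The struct tf_sketch type and the function name are simplified stand-ins, not the libata definitions; only the first four bytes are shown.

```c
#include <stdint.h>

/* Minimal stand-in for the taskfile; only the fields used here. */
struct tf_sketch {
    uint8_t command;
    uint8_t feature;
};

/* Fill the first four bytes of an H2D Register FIS. */
void tf_to_fis_sketch(const struct tf_sketch *tf, uint8_t pmp, int is_cmd,
                      uint8_t *fis)
{
    fis[0] = 0x27;              /* FIS type: Register, Host to Device */
    fis[1] = pmp & 0x0f;        /* port multiplier port number */
    if (is_cmd)
        fis[1] |= 1 << 7;       /* bit 7: this FIS carries a command */
    fis[2] = tf->command;
    fis[3] = tf->feature;
}
```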
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Yay, the first one from Seagate. 3.ALC firmware is okay. This was
reported by Sam Freed on bugzilla bug 8759.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Sam Freed <sam@freed.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It seems irq_on() in ata_bus_reset() and ata_std_postreset()
are leftovers of the EDD reset. Remove them.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add another Maxtor 6B200M0 drive with broken NCQ to the list.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add Hitachi HDS7250SASUN500G 0621KTAWSD to list of devices with broken NCQ.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Please warmly welcome the first member from FUJITSU to the prestigious
NCQ spurious completion club.
This is reported by Serge Van Thillo in bugzilla bug 8730.
http://bugzilla.kernel.org/show_bug.cgi?id=8730
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Serge van Thillo <nulleke@hotmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Horkage handling had the following problems.
* dev->horkage was positioned after ATA_DEVICE_CLEAR_OFFSET, so it was
cleared before the device was configured. This broke
HORKAGE_DIAGNOSTIC.
* Some used dev->horkage while others called ata_device_blacklisted()
directly. This was at best confusing.
This patch moves dev->horkage right after dev->flags and sets the field
according to the blacklist during device configuration. All users now
test against dev->horkage. As ata_device_blacklisted() now has only one
user, make it static. While at it, rename it to ata_dev_blacklisted()
for consistency.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The Zip 250 which chokes on MWDMA SET_XFERMODE sometimes has "Floppy"
appended to its model number. Quirk it too.
http://bugzilla.kernel.org/show_bug.cgi?id=8563
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Hans de Bruin <bruinjm@xs4all.nl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
With the PCI resource fixup for legacy hosts, we can use the same code
path to allocate IO resources and initialize the host for both legacy and
native SFF hosts. Only IRQ requesting needs to be different.
Rename ata_pci_*_native_host() to ata_pci_*_sff_host(), kill all
legacy specific functions and use the renamed functions instead. This
simplifies code a lot.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
We should not use cancel_work_sync(delayed_work->work). This works, but it is
not good. We can use cancel_rearming_delayed_work() instead, which also
simplifies the code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add ata_dumb_qc_prep and supporting logic so that a driver can just
specify it needs to be helped in this area. 64K entries are split
as with drivers/ide.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ap->cbl == ATA_CBL_SATA indicates SATA cable while ap->flags &
ATA_FLAG_SATA indicates SATA host port. Till now they always gave the
same result but SATA/PATA bridge handling will change that. Switch to
ATA_FLAG_SATA test if we're testing for host port type.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch reimplements ACPI invocation such that, instead of
exporting ACPI details to the rest of libata, ACPI event handlers -
ata_acpi_on_resume() and ata_acpi_on_devcfg() - are used. These two
functions are responsible for determining whether specific ACPI method
is used and when.
On resume, _GTF is scheduled by setting the ATA_DFLAG_ACPI_PENDING device
flag. This is done this way to avoid performing the action on the wrong
device (device swapping while suspended).
On every ata_dev_configure(), ata_acpi_on_devcfg() is called, which
performs _SDD and _GTF. _GTF is performed only after resuming and, if
SATA, hardreset as the ACPI spec specifies. As _GTF may contain
arbitrary commands, IDENTIFY page is re-read after _GTF taskfiles are
executed.
If one of ACPI methods fails, ata_acpi_on_devcfg() retries on the
first failure. If it fails again on the second try, ACPI is disabled
on the device. Note that successful configuration clears ACPI failed
status.
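The retry-then-disable policy can be sketched as a tiny state machine. The flag names, the function name, and the return convention below are hypothetical and only meant to show the two-strike logic.

```c
/* Hypothetical per-device flags for this sketch. */
enum {
    SK_ACPI_FAILED   = 1 << 0,  /* one failure has been seen */
    SK_ACPI_DISABLED = 1 << 1   /* ACPI is no longer used for the device */
};

/*
 * method_rc is the result of running the ACPI methods (0 on success).
 * Returns -1 to ask the caller to retry configuration, 0 otherwise.
 */
int acpi_on_devcfg_sketch(unsigned *flags, int method_rc)
{
    if (*flags & SK_ACPI_DISABLED)
        return 0;                       /* ACPI already given up on */

    if (method_rc == 0) {
        *flags &= ~SK_ACPI_FAILED;      /* success clears failed status */
        return 0;
    }

    if (!(*flags & SK_ACPI_FAILED)) {
        *flags |= SK_ACPI_FAILED;       /* first failure: retry */
        return -1;
    }

    *flags |= SK_ACPI_DISABLED;         /* second failure: disable ACPI */
    return 0;
}
```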
With all feature checks moved to the above two functions,
do_drive_set_taskfiles() is trivial and thus collapsed into
ata_acpi_exec_tfs(), which is now static and converted to return the
number of executed taskfiles to be used by ata_acpi_on_resume(). As
failures are handled properly, ata_acpi_push_id() now returns -errno
on errors instead of unconditional zero.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Add acpi_handle to ata_host and ata_port. Rename
ata_device->obj_handle to ->acpi_handle and move it above such that
it doesn't get cleared on reconfiguration.
* Replace ACPI node association with ata_acpi_associate(), which is
called once during host initialization. Unlike the previous
implementation, ata_acpi_associate() uses ATA_FLAG_ACPI_SATA to
choose between IDE or SATA ACPI hierarchy and uses simple child look
up instead of recursive walk to match the nodes. This is way safer
and simpler. Please read the following message for more info.
http://article.gmane.org/gmane.linux.ide/17554
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
host->irq and host->irq2 should be set before ata_host_register() for
IRQ reporting to work. Move up host->irq assignment in
ata_host_activate() and add it to ata_pci_init_one() native path and
pata_cs5520.
The port info printing in ata_host_register() doesn't fit all the
different controllers. It should probably be moved out to LLDs with
some helpers in the future.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Another member of HTS5416* family doing spurious NCQ completion.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Enrico Sardi <enricoss@tiscali.it>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
In ata_hsm_qc_complete():
Calling ata_altstatus() after the qc is completed might race with next qc. Remove it.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_HORKAGE_DMA_RW_ONLY for TORiSAN is verified to be a subset of using
DMA for ATAPI commands which aren't aligned to 16 bytes. As libata
now doesn't use DMA for unaligned ATAPI commands, the horkage is
redundant. Kill it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The IDE driver used DMA for ATAPI commands if the READ/WRITE command was a
multiple of the sector size or the sg command was a multiple of 16 bytes. For
libata, READ/WRITE sector alignment is guaranteed by the high level
driver (sr), so we only have to worry about the 16 byte alignment.
This patch makes ata_check_atapi_dma() always request PIO for all data
transfer commands which are not a multiple of 16 bytes.
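The alignment test itself is a single mask operation. A minimal sketch, with atapi_wants_pio() as a hypothetical helper name:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if a data transfer of nbytes should use PIO because its
 * length is not a multiple of 16 bytes.
 */
bool atapi_wants_pio(size_t nbytes)
{
    return (nbytes & 15) != 0;
}
```

A 2048 byte sector read passes the check and may use DMA, while an 18 byte transfer would be sent via PIO.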
The following reports are related to this problem.
http://bugzilla.kernel.org/show_bug.cgi?id=8605 (confirmed)
http://thread.gmane.org/gmane.linux.kernel/476620 (confirmed)
https://bugzilla.novell.com/show_bug.cgi?id=229260 (probably)
Albert first pointed out the difference between IDE and libata. Kudos
to him.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There's no reason to print out HPA-related messages when HPA is not
active. Kill the unconditional message and add a warning message
which is printed if HPA size is smaller than the current size.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix the parameter name of ata_dev_reread_id() in libata-core.c for kerneldoc.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
More for the NCQ blacklist. One hitachi and one raptor. Other
members of these families of drives are already on the list, so no
surprises.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
After SRST, libata used to wait for nsect/lbal to be set to 1/1 for
the slave device. However, some ATAPI devices don't set nsect/lbal
after SRST and the wait itself isn't too useful as we're gonna wait
for !BSY right after that anyway.
Before the reset-seq update, nsect/lbal wait failure used to be ignored
and caused a 30 second delay during detection. After reset-seq, all
timeouts are considered error conditions, making libata fail to detect
such ATAPI devices.
This patch limits nsect/lbal wait to around 100ms. This should give
acceptable behavior to such ATAPI devices while not disturbing the
heavily used code path too much.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
IOMEGA ZIP 250 ATAPI claims MWDMA0 support but fails SETXFERMODE if
asked to configure itself to MWDMA0. Force PIO.
This fixes bugzilla bug#8497.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Calvin Walton <calvin.walton@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The controller is not reporting an unlawful type, it is reporting an
invalid type. Illegal specifically means "prohibited by law".
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The ata IRQ ack functions are only used when debugging. Unfortunately
almost every controller that calls them can cause crashes in some
configurations as there are missing checks for bmdma presence.
In addition ata_port_start insists on installing DMA buffers and pad
buffers for controllers regardless. The SFF controllers actually need to
make that decision dynamically at controller setup time and all need the
same helper - so we add ata_sff_port_start. Future patches will switch
the SFF drivers to use this.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
hw_sata_spd_limit used to be incorrectly initialized to zero instead
of UINT_MAX if SPD is zero in SControl register. This breaks PHY
speed down. Fix it.
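The intended initialization can be sketched as below, assuming the limit is a bitmask with bit 0 for Gen I and that the SPD field occupies bits 7:4 of SControl. The function name is hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Derive the hardware speed limit mask from an SControl value. */
unsigned int hw_spd_limit_from_scontrol(uint32_t scontrol)
{
    unsigned int spd = (scontrol >> 4) & 0xf;   /* SPD field, bits 7:4 */

    if (spd == 0)
        return UINT_MAX;        /* no restriction: every speed allowed */
    return (1u << spd) - 1;     /* allow Gen I up to Gen <spd> */
}
```

Returning zero for the unrestricted case would leave no speed to step down to, which matches the breakage described above.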
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
For ATA/CFA devices, libata prints out the device model and firmware revision.
Do the same for ATAPI devices.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Most drivers don't seem to fill out the host->irq field, resulting in the
wrong (no) irq being reported at probe time. For example, sil24 on my system:
ata1: SATA max UDMA/100 cmd 0xd00008009001f000 ctl 0x0000000000000000 bmdma 0x0000000000000000 irq 0
ata2: SATA max UDMA/100 cmd 0xd000080090021000 ctl 0x0000000000000000 bmdma 0x0000000000000000 irq 0
Since they're allocated and set up in ata_host_activate(), just save
them away there.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Several people have reported LITE-ON LTR-48246S detection failed
because SETXFER fails. It seems the device raises IRQ too early after
SETXFER. This is controller independent. The same problem has been
reported for different controllers.
So, now we have pata_via where the controller raises IRQ before it's
ready after SETXFER and a device which does a similar thing. This patch
makes libata always execute SETXFER via polling. As this only happens
during EH, performance impact is nil. Setting ATA_TFLAG_POLLING is
also moved from issue hot path to ata_dev_set_xfermode() - the only
place where SETXFER can be issued.
Note that ATA_TFLAG_POLLING applies only to drivers which implement
SFF TF interface and use libata HSM. More advanced controllers ignore
the flag. This doesn't matter for this fix as SFF TF controllers are
the problematic ones.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
During prereset, -ENODEV return from ata_wait_ready() is not an error.
This causes an unnecessary bug message on controllers which use 0xff to
indicate an empty port. Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some SATA controllers (sata_sil) use 0xff to indicate port not ready
status, not port empty. As libata interprets 0xff as port empty, this
causes unnecessary reset failure and retry. Don't consider 0xff as
port empty if SStatus is available and indicates that port is online.
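A rough sketch of the decision, assuming the link is considered online when the DET field (low four bits of SStatus) reads 3. The function name and the sstatus_valid parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * status:        value read from the status register.
 * sstatus_valid: false when SStatus cannot be read on this controller.
 * sstatus:       SStatus value, only meaningful if sstatus_valid is true.
 */
bool port_is_empty(uint8_t status, bool sstatus_valid, uint32_t sstatus)
{
    if (status != 0xff)
        return false;

    /* DET == 3: device present and phy communication established */
    if (sstatus_valid && (sstatus & 0xf) == 0x3)
        return false;           /* online: 0xff only means "not ready" */

    return true;
}
```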
Signed-off-by: tejun Heo <htejun@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Indan Zupancic <indan@nul.nu>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
http://bugzilla.kernel.org/show_bug.cgi?id=1044 points out an
additional hard disk that doesn't handle DMA transfers correctly.
This patch is the libata variant of the earlier patch to drivers/ide/.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
With STANDBYDOWN tracking added, libata.spindown_compat isn't
necessary anymore. If userspace shutdown(8) issues STANDBYNOW, libata
warns. If userspace shutdown(8) doesn't issue STANDBYNOW, libata does
the right thing. Userspace can tell whether kernel supports spindown
by testing whether sysfs node manage_start_stop exists as before.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Device might be resized during ata_dev_configure() due to HPA or
(later) ACPI _GTF. Currently it's worked around by caching n_sectors
before turning off HPA. The cached original size is overwritten if
the device is reconfigured without being hardreset - which always
happens after configuring transfer mode. If the device gets hardreset
for some reason after that, revalidation fails with -ENODEV.
This patch makes size checking more robust by moving n_sectors check
from ata_dev_reread_id() to ata_dev_revalidate() after the device is
fully configured. No matter what happens during configuration, a
device must have the same n_sectors after being fully configured to be
treated as the same device.
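The size comparison reduces to a check like the one sketched here. The function name is hypothetical, and treating a recorded size of zero as "nothing to compare against" is an assumption of this sketch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Compare the size recorded before revalidation with the size seen
 * after the device has been fully reconfigured.
 */
int check_same_size(uint64_t old_n_sectors, uint64_t new_n_sectors)
{
    if (old_n_sectors && new_n_sectors != old_n_sectors)
        return -ENODEV;     /* size changed: treat as a different device */
    return 0;
}
```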
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out ata_dev_reread_id() from ata_dev_revalidate().
ata_dev_reread_id() reads IDENTIFY page and determines whether the
same device is still there. ata_dev_revalidate() reconfigures after
reread completes. This will be used by ACPI update.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch cleans up libata-acpi such that it looks similar to other
libata files. This patch doesn't introduce any behavior changes.
* make libata-acpi functions take ata_device instead of ata_port +
device index
* s/atadev/dev/
* de-indent local variable declarations
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It seems the world isn't as frank as we thought and some devices lie
about who they are. Fall back to the other IDENTIFY if IDENTIFY is
aborted by the device. As this is the strategy used by IDE for a long
time, it shouldn't cause too many problems.
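The fallback choice can be sketched as follows. The opcodes are the standard IDENTIFY DEVICE (0xEC) and IDENTIFY PACKET DEVICE (0xA1) commands; the function name, the tried_other flag, and the single-fallback policy are assumptions of this sketch.

```c
enum {
    SK_CMD_ID_ATA   = 0xEC,     /* IDENTIFY DEVICE */
    SK_CMD_ID_ATAPI = 0xA1      /* IDENTIFY PACKET DEVICE */
};

/*
 * Pick the command for the next IDENTIFY attempt after last_cmd.
 * Returns 0 when there is nothing left to try.  *tried_other records
 * that the fallback has been used, so it happens at most once.
 */
int next_identify_cmd(int last_cmd, int aborted, int *tried_other)
{
    if (!aborted || *tried_other)
        return 0;

    *tried_other = 1;
    return last_cmd == SK_CMD_ID_ATA ? SK_CMD_ID_ATAPI : SK_CMD_ID_ATA;
}
```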
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: William Thompson <wt@electro-mechanical.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>