Reset device operational notification flag when channel paths become
unavailable during path verification.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The grace period handling introduced needless complexity. It didn't
help the dasd driver (which can handle terminated I/O just well),
and it doesn't help for long running channel programs (which won't
complete during the grace period anyway). Terminating I/O using a
path that just disappeared immediately is much more consistent with
what the user expects.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the proper structures to identify device and subchannel. Change
get_disc_ccwdev_by_devno() to get_disc_ccwdev_by_dev_id().
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the last few places where a pointer to pt_regs gets passed.
Also make sure we call set_irq_regs() before irq_enter() and after
irq_exit(). This doesn't fix anything but makes sure s390 looks the
same like all other architectures.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In order to determine chpid validity, we need to check whether the
corresponding path is specified in the pim.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Retry internal operation after unit check instead of aborting them.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add timeout handler for common-I/O-layer-internal I/O operations.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
While the machine owns us an interrupt in these cases (and we should get
one), reality isn't always like that...
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Major cleanup of all s390 inline assemblies. They now have a common
coding style. Quite a few have been shortened, mainly by using register
asm variables. Use of the EX_TABLE macro helps as well. The atomic ops,
bit ops and locking inlines new use the Q-constraint if a newer gcc
is used. That results in slightly better code.
Thanks to Christian Borntraeger for proof reading the changes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
css_evaluate_subchannel() operates subchannel without lock which can
lead to erratic behavior caused by concurrent device access. Also
split evaluation function to make it more readable.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reappearing channel paths are sometimes not utilized by CCW devices
because path verification incorrectly relies on path-operational-mask
information which is not updated until a channel path has been used
again.
Modify path verification procedure to always query all available paths
to a device.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
CHPIDs that are logically varied off will not be removed from
a CCW device's path group because resign-from-pathgroup command is
issued with invalid path mask of 0 because internal CCW operations
are masked by the logical path mask after the relevant bits are
cleared by the vary operation.
Do not apply logical path mask to internal operations.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Subchannel may incorrectly remain in state no-path after channel paths
have reappeared. Currently the scan for subchannels which are using a
channel path ends at the first occurrence if a full link address was
provided by the channel subsystem. The scan needs to continue over
all subchannels.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add the MODALIAS environment variable for ccw bus uevents.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The last SLSB has to be set to STATE_PROCESSING if we really want to
use the PROCESSING feature.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Previous patch that was intended to reduce stack usage within common
i/o layer didn't consider implicit memset(..., 0, ...) used with the
initializations used before.
Add these missing memsets wherever it's not obvious that the
concerned memory region is zeroed. This should give the same semantics
as before.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Convert GET_IPL_DEVICE assembler macro to C function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
qdio_get_micros is supposed to return microseconds. The get_clock()
return value needs to be shifted by 12 to get to microseconds.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use do { } while (0) constructs instead of empty defines to avoid
subtle compile bugs.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It is now possible to specify a ccw/fcp dump device which is used to
automatically create a system dump in case of a kernel panic. The dump
device can be configured under /sys/firmware/dump.
In addition it is now possible to specify a ccw/fcp device which is used
for the next reboot of Linux. The reipl device can be configured under
/sys/firmware/reipl.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Calls to set a device online with path grouping may get stuck in
some cases because certain device conditions where discarded after
unsolicited interrupts.
Check subchannel activity after unsolicited interrupts and retry
the operation if the subchannel is idle.
Signed-off-by: Stefan Bader <shbader@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Devices enter no-path state after disabling a channel path
via the SE even though another path has been reenabled at the SE.
The devices are set into no-path state before triggering path
verification even though other paths may have become available.
To fix this trigger path verification before setting a device into
no-path state.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use different kind of assignment to make sure gcc doesn't create code
that creates temp variables on the stack, assigns values to it and
copies the content of the whole temp variable to the destination.
This reduces stack usage of e.g. ccwgroup_driver_register from 976
to 48 bytes instead.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
I/O on a CCW device may stall if a channel path to that device is
logicaly varied off/on. A user I/O interrupt can get misinterpreted
as interrupt for an internal path verification operation due to a
missing check and is therefore never reported to the device driver.
Correct check for pending interruptions before starting path
verification.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Do a retry of read device characteristics / read configuration
data when a deferred condition code 1 is encountered in
ccw_device_wake_up().
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fail to create a ccwgroup device if a ccw device is passed in twice.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In special conditions where a subchannel rejects the HALT I/O-
instruction with a busy indication (cc 2), I/O may stall.
I/O request termination logic retries HALT I/O indefinitely
because it expects HALT I/O to alter the subchannel status which
is not true when cc 2 is returned.
In case of a busy indication, try CLEAR I/O instruction immediately.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Display avg_sample_interval in nanoseconds, like it is documented.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
1. Multipath devices for which SetPGID is not supported are not handled well.
Use NOP ccws for path verification (sans path grouping) when SetPGID is not
supported.
2. Check for PGIDs already set with SensePGID on _all_ paths (not just the
first one) and try to find a common one. Moan if no common PGID can be
found (and use NOP verification). If no PGIDs have been set, use the css
global PGID (as before). (Rationale: SetPGID will get a command reject if
the PGID it tries to set does not match the already set PGID.)
3. Immediately before reboot, issue RESET CHANNEL PATH (rcp) on all chpids. This
will remove the old PGIDs. rcp will generate solicited CRWs which can be
savely ignored by the machine check handler (all other actions create
unsolicited CRWs).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a reg_mutex to prevent unregistering a subchannel before it has been
registered. Since 2.6.17, we've seen oopses in kslowcrw when a device is
found to be not operational during sense id when doing initial device
recognition; it is not clear yet why that particular problem was not (yet)
observed with earlier kernels...
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
irqtrace support for s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes for several channel measurement facility bugs:
* Blocks copied from the hardware might not be consistent. Solve this
by moving the copying into idle state and repeating the copying.
* avg_sample_interval changed with every read, even though no new block
was available. Solve this by storing a timestamp when the last new
block was received.
* Several locking issues.
* Measurements were not reenabled after a disconnected device became
available again.
* Remove #defines for ioctls that were never implemented.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After setting a path to a dasd offline at the SE, I/O hangs on that
dasd for 5 minutes, then continues.
I/O for which an interrupt will not be reported after the channel
path has been disabled was not terminated by the common I/O layer,
causing the dasd MIH to hit after 5 minutes.
Be more aggressive in terminating I/O after setting a channel path
offline. Also make sure to generate a fake irb if the device
driver issues an I/O request after being notified of the killed
I/O and clear residual information from the irb before trying to
start the delayed verification.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Changes in the DASD driver require an asynchronous implementation of the
subchannel reprobe loop. This loop was so far only used by the blacklisting
mechanism but is now available to all CCW device drivers.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Work around the problem that a device cannot be unregistered from
driver_for_each_device() because of klist node refcounting: Get device
after device owned by the driver to be unregistered with driver_find_device()
and then unregister it. This works because driver_get_device() gets us out of
the region of the elevated klist node refcount. driver_find_device() will
always get the next device in the list after the found one has been
unregistered.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Specify correct sizeof() in chp_measurement_read() and return
correct amount of read data.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Trying to set a DASD root device online can fail under some circumstances
with the message "Read configuration data returned error -5". The cause
is that read configuration data incorrectly aborts with -EIO when it
encounters a temporary busy condition at a storage server.
Perform retry when encountering temporary busy conditions.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
From: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
The path grouping can fail due to non-unique pathgroup-IDs. The source for
the CPU-ID part of the ID was incorrectly specified on 64 bit systems.
Additionally, the length of the ID was too large due to incorrect data packing
declaration. Fix CPU-ID lowcore address and add missing packing declaration.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With CONFIG_SLAB_DEBUG=y networking over qeth doesn't work. The problem is
that the qib structure embedded in the qeth_irq structure needs an alignment
of 256 but kmalloc only guarantees an alignment of 8. When using SLAB
debugging the alignment of qeth_irq is not sufficient for the embedded qib
structure which causes all users of qdio (qeth and zfcp) to stop working.
Allocate qeth_irq structure with __get_free_page. That wastes a small amount
of memory (~2500 bytes) per online adapter.
Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avoid memory allocation with GFP_KERNEL in qdio_establish/qdio_shutdown. Use
memory pool instead. (Otherwise this can lead to an I/O stall where qdio
waits for a free page and zfcp waits for end of error recovery in low memory
situations.)
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Interrupts can stay disabled if an error occurred in _chp_add(). Use
spin_unlock_irq on the error paths to reenable interrupts.
Signed-off-by: Stefan Bader <shbader@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a race condition in the I/O termination logic. The race can cause I/O to
a dasd device to fail with no retry left after turning one channel path to the
device off and on multiple times.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Debugging events in cio_trace/hex_ascii are truncated for some trace entries.
Increase trace event size to 16 bytes to cover longer text events, make
CIO_HEX_EVENT an inline function that loops to cover bigger hex events.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>