Calls to set a device online with path grouping may get stuck in
some cases because certain device conditions where discarded after
unsolicited interrupts.
Check subchannel activity after unsolicited interrupts and retry
the operation if the subchannel is idle.
Signed-off-by: Stefan Bader <shbader@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Devices enter no-path state after disabling a channel path
via the SE even though another path has been reenabled at the SE.
The devices are set into no-path state before triggering path
verification even though other paths may have become available.
To fix this trigger path verification before setting a device into
no-path state.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use different kind of assignment to make sure gcc doesn't create code
that creates temp variables on the stack, assigns values to it and
copies the content of the whole temp variable to the destination.
This reduces stack usage of e.g. ccwgroup_driver_register from 976
to 48 bytes instead.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
I/O on a CCW device may stall if a channel path to that device is
logicaly varied off/on. A user I/O interrupt can get misinterpreted
as interrupt for an internal path verification operation due to a
missing check and is therefore never reported to the device driver.
Correct check for pending interruptions before starting path
verification.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Do a retry of read device characteristics / read configuration
data when a deferred condition code 1 is encountered in
ccw_device_wake_up().
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fail to create a ccwgroup device if a ccw device is passed in twice.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In special conditions where a subchannel rejects the HALT I/O-
instruction with a busy indication (cc 2), I/O may stall.
I/O request termination logic retries HALT I/O indefinitely
because it expects HALT I/O to alter the subchannel status which
is not true when cc 2 is returned.
In case of a busy indication, try CLEAR I/O instruction immediately.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Display avg_sample_interval in nanoseconds, like it is documented.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
1. Multipath devices for which SetPGID is not supported are not handled well.
Use NOP ccws for path verification (sans path grouping) when SetPGID is not
supported.
2. Check for PGIDs already set with SensePGID on _all_ paths (not just the
first one) and try to find a common one. Moan if no common PGID can be
found (and use NOP verification). If no PGIDs have been set, use the css
global PGID (as before). (Rationale: SetPGID will get a command reject if
the PGID it tries to set does not match the already set PGID.)
3. Immediately before reboot, issue RESET CHANNEL PATH (rcp) on all chpids. This
will remove the old PGIDs. rcp will generate solicited CRWs which can be
savely ignored by the machine check handler (all other actions create
unsolicited CRWs).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a reg_mutex to prevent unregistering a subchannel before it has been
registered. Since 2.6.17, we've seen oopses in kslowcrw when a device is
found to be not operational during sense id when doing initial device
recognition; it is not clear yet why that particular problem was not (yet)
observed with earlier kernels...
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
irqtrace support for s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes for several channel measurement facility bugs:
* Blocks copied from the hardware might not be consistent. Solve this
by moving the copying into idle state and repeating the copying.
* avg_sample_interval changed with every read, even though no new block
was available. Solve this by storing a timestamp when the last new
block was received.
* Several locking issues.
* Measurements were not reenabled after a disconnected device became
available again.
* Remove #defines for ioctls that were never implemented.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After setting a path to a dasd offline at the SE, I/O hangs on that
dasd for 5 minutes, then continues.
I/O for which an interrupt will not be reported after the channel
path has been disabled was not terminated by the common I/O layer,
causing the dasd MIH to hit after 5 minutes.
Be more aggressive in terminating I/O after setting a channel path
offline. Also make sure to generate a fake irb if the device
driver issues an I/O request after being notified of the killed
I/O and clear residual information from the irb before trying to
start the delayed verification.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Changes in the DASD driver require an asynchronous implementation of the
subchannel reprobe loop. This loop was so far only used by the blacklisting
mechanism but is now available to all CCW device drivers.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Work around the problem that a device cannot be unregistered from
driver_for_each_device() because of klist node refcounting: Get device
after device owned by the driver to be unregistered with driver_find_device()
and then unregister it. This works because driver_get_device() gets us out of
the region of the elevated klist node refcount. driver_find_device() will
always get the next device in the list after the found one has been
unregistered.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Specify correct sizeof() in chp_measurement_read() and return
correct amount of read data.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Trying to set a DASD root device online can fail under some circumstances
with the message "Read configuration data returned error -5". The cause
is that read configuration data incorrectly aborts with -EIO when it
encounters a temporary busy condition at a storage server.
Perform retry when encountering temporary busy conditions.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
From: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
The path grouping can fail due to non-unique pathgroup-IDs. The source for
the CPU-ID part of the ID was incorrectly specified on 64 bit systems.
Additionally, the length of the ID was too large due to incorrect data packing
declaration. Fix CPU-ID lowcore address and add missing packing declaration.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With CONFIG_SLAB_DEBUG=y networking over qeth doesn't work. The problem is
that the qib structure embedded in the qeth_irq structure needs an alignment
of 256 but kmalloc only guarantees an alignment of 8. When using SLAB
debugging the alignment of qeth_irq is not sufficient for the embedded qib
structure which causes all users of qdio (qeth and zfcp) to stop working.
Allocate qeth_irq structure with __get_free_page. That wastes a small amount
of memory (~2500 bytes) per online adapter.
Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avoid memory allocation with GFP_KERNEL in qdio_establish/qdio_shutdown. Use
memory pool instead. (Otherwise this can lead to an I/O stall where qdio
waits for a free page and zfcp waits for end of error recovery in low memory
situations.)
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Interrupts can stay disabled if an error occurred in _chp_add(). Use
spin_unlock_irq on the error paths to reenable interrupts.
Signed-off-by: Stefan Bader <shbader@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a race condition in the I/O termination logic. The race can cause I/O to
a dasd device to fail with no retry left after turning one channel path to the
device off and on multiple times.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Debugging events in cio_trace/hex_ascii are truncated for some trace entries.
Increase trace event size to 16 bytes to cover longer text events, make
CIO_HEX_EVENT an inline function that loops to cover bigger hex events.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cio_ignore_proc_init() returns 1 in case of success and 0 in case of failure.
The caller tests for != 0, so better return 0 in case of success and -ENOENT
in case of failure.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert all kmalloc + memset sequences in drivers/s390 to kzalloc usage.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Gather extended measurements for channel paths from the channel subsystem and
expose them to userspace via a sysfs attribute.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When cio waits for the interrupt for a basic sense, interrupts for hsch() or
csch() issued in the meantime are wrongly counted as interrupts for the basic
sense and the accumulated irb is passed to the device driver. In
ccw_device_w4sense(), check for clear or halt function in the irb and pass the
irb for the csch() or hsch() to the device driver.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It seems this patch got dropped (it was in addition to the `s390:
improve response code handling in chsc_enable_facility()' patch).
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rather than checking for some known failures, check positively for the
success response code 0x0001 and return -EIO for unrecognized failure
response codes.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Greg Smith <gsmith@nc.rr.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Using FCP devices with V=V support, the input queue stalled when CCQ 97 had
been returned in qdio_do_eqbs. When this happen we have to reissue the eqbs
instruction.
Another bug was when V=V was enabled we checked if hardware has SIGA-sync
support. If not we returned with 0 from tiqdio_is_inbound_q_done. Thus qdio
lost initiative on FCP devices and input queue stalled. Running devices in
V=V there is no SIGA-sync support but nevertheless we have to process
tiqdio_is_inbound_q_done either.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If __ccw_device_disband_start() fails to initiate disbanding, it should finish
with ccw_device_disband_done() (which leaves the device in offline state)
instead of ccw_device_verify_done() (which leaves the device in online state).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix locking in __chp_add() and s390_subchannel_remove_chpid(): Need to
disable/enable because they are always called from a thread (and not
directly from a machine check...)
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix modalias for ccw devices: cu_type should be in capitals as well.
Signed-off-by: Cornelia Huck <cohuck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The return code of ccw_device_probe_console() is not properly handled. It
should only return a valid ccw device pointer or a error value converted by
ERR_PTR. Fix the console driver code to check with IS_ERR instead against
NULL.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Remove all CVS generated information like e.g. revision IDs from
drivers/s390 and include/asm-s390 (none present in arch/s390).
- Add newline at end of arch/s390/lib/Makefile to avoid diff message.
Acked-by: Andreas Herrmann <aherrman@de.ibm.com>
Acked-by: Frank Pavlic <pavlic@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The chps[] array in struct channel_subsystem is one too short; therefore the
code doesn't realize the chpid ff is already known. When several devices on
chpid ff become available, the message "new_channel_path: could not register
ff" is displayed for every device but the first one.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following patch converts css_bus_type and ccw_bus_type to use
the new bus_type methods.
Signed-off-by: Cornelia Huck <huckc@de.ibm.com>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X,
ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by
S390, 64BIT and COMPAT.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Use kzalloc() in blacklist.c.
- Kill unwanted casts in blacklist.c.
- Provide release function for struct channel_subsystem.
Signed-off-by: Cornelia Huck <huckc@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for multiple subchannel sets. Works with arbitrary devices in
subchannel set 1 and is transparent to device drivers. Although currently
only two subchannel sets are available, this will work with the architectured
maximum number of subchannel sets as well.
Signed-off-by: Cornelia Huck <cohuck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert /proc/cio_ignore to a sequential file. This makes multiple subchannel
sets support easier.
Signed-off-by: Cornelia Huck <cohuck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
struct channel_subsystem encapsulates several per channel subsystem
properties, like status of chpids or the global path group id.
Signed-off-by: Cornelia Huck <cohuck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
for_each_subchannel() is an iterator calling a function for every possible
subchannel id until non-zero is returned. Convert the current iterating
functions to it.
Signed-off-by: Cornelia Huck <cohuck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>