IO stall after deleting and path checker changes after reenabling zfcp device
Setting one zfcp device offline using chccwdev in a multipath
environment and waiting leads to an I/O stall on all paths.
After setting the zfcp device back online using chccwdev,
the devices with the I/O stall have a different path checker.
Devices corresponding to the deleted units are never freed.
This has the effect that 'slave_destroy' is never called and zfcp
still thinks that this unit is registered
(ZFCP_STATUS_UNIT_REGISTERED is still set). Hence the erp
routine is not called correctly and the unit is not enabled properly.
Do not delete the rport and the sdev. Just set the host to block on
'offline'. Setting the host online again then removes the blocked
status and normal operation resumes.
Signed-off-by: Michael Loehr <mloehr2@linux.vnet.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Simplify request ID management and make sure that frequently used
functions are inlined. Also fix a memory leak in zfcp_adapter_enqueue()
which only gets hit in error handling.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The SCSI stack requires low level drivers to register and
unregister devices. For zfcp this leads to the situation where
zfcp calls the SCSI stack, the SCSI tries to scan the new device
and the scan SCSI command fails. This would require the zfcp erp,
but the erp thread is already blocked in the register call.
The fix is to make sure that the calls from the ERP thread to
the SCSI stack do not block the ERP thread. In detail:
1) Use a workqueue to avoid blocking of the scsi_scan_target calls.
2) When removing a unit make sure that no scsi_scan_target call is
pending.
3) Replace scsi_flush_work with scsi_target_unblock. This avoids
blocking and has the same result.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
All on stack DECLARE_COMPLETIONs should be replaced by:
DECLARE_COMPLETION_ONSTACK
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the fix ... One of my previous fixes introduced removal of all fsf
requests in zfcp's eh_host_reset_handler. But this must not happen
before qdio queues are shut down. So, I revert the changes of
zfcp_scsi_eh_host_reset_handler.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This instance will be used whenever a timer is needed for
a request by zfcp.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
zfcp's eh_abort_handler used the wrong request ID to
identify the request to be aborted. The bug was introduced
with commit fea9d6c7bc
for improved management of request IDs. The bug is
fixed with this patch.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Create private slab caches in order to guarantee proper alignment of
data structures that get passed to hardware.
Sidenote: with this patch slab cache debugging will finally work on s390
(at least no known problems left).
Furthermore this patch does some minor cleanups:
- store ptr for transport template in struct zfcp_data
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Compile fix ups and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Bug fixes for zfcp's erp:
- trigger adapter reopen if do_QDIO fails
- avoid erp deadlock if registration of scsi target or remote port hangs
- do not treat as error if exchange port data fails
- decrease timeout for target reset and aborts
- mark unit failed if slave_destroy is called
Additionally some code cleanup was done:
- made some functions void when retval is not of interest
- shortened initialization of zfcp's host_template
- corrected some comments
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If zfcp's port erp fails we now call fc_remote_port_delete. This helps
to avoid offlined scsi devices if scsi commands time out due to path
failures. When an adapter erp fails we call fc_remote_port_delete for
all ports on that adapter.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com>
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Removed some macros, struct members and typedefs which were
unused or not necessary.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Replace kmalloc/memset by kzalloc or kcalloc.
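
The change can be pictured with a user-space analogue. This is only a sketch: malloc, memset and calloc stand in for kmalloc, memset and kzalloc/kcalloc, and struct alloc_demo and the helper names are hypothetical, not taken from zfcp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver structure; not a zfcp type. */
struct alloc_demo {
	unsigned long long lun;
	int status;
};

/* Old pattern: allocate, then clear by hand (kmalloc + memset). */
static struct alloc_demo *alloc_demo_old(void)
{
	struct alloc_demo *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	memset(p, 0, sizeof(*p));
	return p;
}

/* New pattern: a single call returning zeroed memory (kzalloc analogue). */
static struct alloc_demo *alloc_demo_new(void)
{
	return calloc(1, sizeof(struct alloc_demo));
}

/* Array case (kcalloc analogue): count and element size are passed
 * separately instead of being multiplied by the caller. */
static struct alloc_demo *alloc_demo_array(size_t n)
{
	assert(n > 0);	/* zero-sized requests are not meaningful here */
	return calloc(n, sizeof(struct alloc_demo));
}
```

Both variants hand back fully zeroed memory; the second form simply removes the chance of forgetting the memset or passing it the wrong size.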
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The patch fixes the following issues:
(1) Replace scsi_add_device with scsi_scan_target.
(Thus the rport instead of the scsi_host becomes parent of a
scsi_target again.)
(2) Avoid scsi_device allocation during registration of a remote port.
(Would be done during fc_scsi_scan_rport.)
(3) Fix queuecommand behaviour when a zfcp unit is blocked.
(Call scsi_done with DID_NO_CONNECT instead of returning
SCSI_MLQUEUE_DEVICE_BUSY, otherwise we might end up waiting
for completion in blk_execute_rq forever.)
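
Item (3) can be illustrated with a simplified user-space model. This is a sketch under assumptions: the two constants are local stand-ins mirroring the Linux SCSI definitions of the same names, while struct qc_cmd, struct qc_unit and all qc_* functions are hypothetical and much simpler than the real midlayer interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring the Linux SCSI constants of the same names. */
#define DID_NO_CONNECT           0x01
#define SCSI_MLQUEUE_DEVICE_BUSY 0x1056

/* Hypothetical, simplified command and unit; not the kernel structures. */
struct qc_cmd {
	int result;	/* host byte lives in bits 16..23 */
	bool done;	/* set once the command has been completed */
};

struct qc_unit {
	bool blocked;
};

/* Plays the role of the completion callback (scsi_done). */
static void qc_done(struct qc_cmd *cmd)
{
	assert(!cmd->done);	/* a command must complete only once */
	cmd->done = true;
}

/* Old behaviour: report "busy"; the command stays pending while the
 * unit is blocked, so a waiter on its completion never wakes up. */
static int qc_queuecommand_old(struct qc_unit *unit, struct qc_cmd *cmd)
{
	if (unit->blocked)
		return SCSI_MLQUEUE_DEVICE_BUSY;
	cmd->result = 0;
	qc_done(cmd);	/* sketch only: completes immediately */
	return 0;
}

/* New behaviour: complete the command right away with DID_NO_CONNECT. */
static int qc_queuecommand_new(struct qc_unit *unit, struct qc_cmd *cmd)
{
	if (unit->blocked)
		cmd->result = DID_NO_CONNECT << 16;
	else
		cmd->result = 0;
	qc_done(cmd);
	return 0;
}
```

With a blocked unit the old variant leaves the command incomplete, while the new variant completes it with an error the caller can act on.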
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This fixes a bug in zfcp that provokes a race in scsi_scan.c,
which can eventually lead to an Oops like:
kernel BUG at fs/sysfs/symlink.c:87!
Correctly set this_id for the host. Otherwise we provoke
a race between scsi_target_reap_work and concurrent
scsi_add_device.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Avoid access to old fsf_requests if device reset is logged.
Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com>
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Remove all CVS generated information, e.g. revision IDs, from
drivers/s390 and include/asm-s390 (none present in arch/s390).
- Add newline at end of arch/s390/lib/Makefile to avoid diff message.
Acked-by: Andreas Herrmann <aherrman@de.ibm.com>
Acked-by: Frank Pavlic <pavlic@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replaced zfcp adapter attributes with fc_host attributes:
fc_topology by port_type, physical_wwpn by permanent_port_name.
Make use of fc_host attribute supported_speeds.
Removed zfcp adapter attribute physical_s_id.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Added host stats, removed superfluous get_starget_ functions,
removed some attributes from zfcp specific sysfs tree (e.g.
scsi_host_no, scsi_lun, wwnn and d_id).
Host stats are given for the physical adapter port not for the
virtual adapter. Reset stats is implemented in the device driver.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Change return code in slave_alloc to avoid irritating error message from
scsi_alloc_sdev() when scsi stack tries target scan.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds some fc host attributes and removes their equivalents
from the zfcp_adapter structure and zfcp specific sysfs subtree.
Furthermore it removes superfluous calls to fc_remote_port_delete when
an adapter is set offline because rports will be removed by
fc_remove_host anyway.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Debug features (DBFs) els_dbf, cmd_dbf and abt_dbf were removed and
san_dbf, hba_dbf and scsi_dbf were introduced. The erp_dbf did not
change.
The new traces improve debugging of problems with zfcp, scsi-stack,
multipath and hardware in the SAN. san_dbf traces things like ELS and
CT commands, hba_dbf saves HBA specific information of requests, and
scsi_dbf saves FCP and SCSI specific information of requests. Common
to all new DBFs is that they provide a so called structured view. This
significantly improves readability of the traces.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
o union zfcp_req_data removed
o increment unit refcount when processing FCP commands
(This fixes a theoretical race: When all scsi commands of a unit
are aborted and the scsi_device is removed then the unit could be
removed before all fsf_requests of that unit are completely processed.)
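
The refcount rule can be sketched in user space as follows. All names (struct rc_unit and the rc_* helpers) are hypothetical, and a plain int replaces the atomic counter a driver would use, so this only models the ordering, not the concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical unit with a reference count; not the zfcp structure. */
struct rc_unit {
	int refcount;	/* number of references currently held */
	bool freed;	/* set when the last reference is dropped */
};

static void rc_unit_get(struct rc_unit *u)
{
	assert(!u->freed);	/* no new references to a dead unit */
	u->refcount++;
}

static void rc_unit_put(struct rc_unit *u)
{
	assert(u->refcount > 0);
	if (--u->refcount == 0)
		u->freed = true;	/* stands in for the real release */
}

/* Starting to process an FCP command pins the unit ... */
static void rc_cmd_start(struct rc_unit *u)
{
	rc_unit_get(u);
}

/* ... and the pin is dropped only once the request is fully processed. */
static void rc_cmd_finish(struct rc_unit *u)
{
	rc_unit_put(u);
}
```

If the owner drops its reference while a command is still in flight, the unit survives until rc_cmd_finish runs.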
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch fixes a severe problem with 2.6.13-rc7.
Due to recent SCSI changes it is not possible to add any LUNs to the zfcp
device driver anymore. With registration of remote ports this is fixed.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Acked-by: James Bottomley <jejb@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes a race between zfcp_fsf_req_dismiss_all and
zfcp_qdio_reqid_check. During adapter shutdown a request could be
cleaned up twice: first during its normal completion, and again
when dismiss_all was called. The fix is to
serialize access to fsf request list between zfcp_fsf_req_dismiss_all
and zfcp_qdio_reqid_check and delete a fsf request from the list if
its completion is triggered. (Additionally a rwlock was replaced by a
spinlock and fsf_req_cleanup was eliminated.)
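
The idea of unlinking a request as soon as its completion is triggered can be sketched as below. This is a single-threaded user-space model with hypothetical names (struct fl_req, struct fl_adapter, fl_*); in the driver the list accesses would additionally be serialized by the spinlock mentioned above, which the sketch only marks in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request and adapter; not the zfcp structures. */
struct fl_req {
	struct fl_req *next;
	int cleanups;	/* how often this request was cleaned up */
};

struct fl_adapter {
	struct fl_req *head;	/* list of outstanding requests */
};

static void fl_req_add(struct fl_adapter *a, struct fl_req *r)
{
	r->next = a->head;
	a->head = r;
}

/* Unlink r if it is still on the list; returns 1 if it was found. */
static int fl_req_unlink(struct fl_adapter *a, struct fl_req *r)
{
	struct fl_req **pp;

	for (pp = &a->head; *pp; pp = &(*pp)->next) {
		if (*pp == r) {
			*pp = r->next;
			r->next = NULL;
			return 1;
		}
	}
	return 0;
}

/* Normal completion: only the path that unlinks the request cleans it up. */
static void fl_req_complete(struct fl_adapter *a, struct fl_req *r)
{
	/* the list lock would be held around the unlink */
	if (fl_req_unlink(a, r)) {
		assert(r->cleanups == 0);
		r->cleanups++;
	}
}

/* Shutdown: dismiss whatever is still on the list. */
static void fl_req_dismiss_all(struct fl_adapter *a)
{
	/* the list lock would be held while the list is emptied */
	while (a->head) {
		struct fl_req *r = a->head;

		a->head = r->next;
		r->next = NULL;
		assert(r->cleanups == 0);
		r->cleanups++;
	}
}
```

Because a request leaves the list in whichever path reaches it first, the other path can no longer find it, and each request is cleaned up exactly once.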
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!