OpenCloudOS-Kernel/drivers/scsi/device_handler
Menny Hamburger db422318cb [SCSI] scsi_dh: propagate SCSI device deletion
Currently, when scsi_dh_activate() returns with an error
(e.g. SCSI_DH_NOSYS) the activate_complete callback is not called and
the error is not propagated to DM mpath.

When a SCSI device attached to a device handler is deleted, userland
processes currently performing I/O on the device will have their I/O
hang forever.

- Set SCSI_DH_NOSYS error when the handler is in the process of being
  deleted (e.g. the SCSI device is in an SDEV_CANCEL or SDEV_DEL state).

- Set SCSI_DH_DEV_OFFLINED error when the device is in SDEV_OFFLINE state.

- Call the activate_complete callback function directly from
  scsi_dh_activate if an error has been set (when either the scsi_dh
  internal data has already been deleted or is in the process of being
  deleted).
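
The three changes listed above can be modeled in user space roughly as
follows.  This is only a sketch, not the kernel code: every model_* and
MODEL_* name is hypothetical, the enum values are stand-ins rather than
the kernel's real constants, and locking and reference counting are left
out entirely.

```c
#include <stddef.h>

/* Stand-in device states; names echo the SDEV_* states, values are illustrative. */
enum model_sdev_state {
    MODEL_SDEV_RUNNING,
    MODEL_SDEV_CANCEL,
    MODEL_SDEV_DEL,
    MODEL_SDEV_OFFLINE
};

/* Stand-in error codes; names echo SCSI_DH_*, values are illustrative. */
enum model_dh_err {
    MODEL_DH_OK = 0,
    MODEL_DH_NOSYS,
    MODEL_DH_DEV_OFFLINED
};

/* Same shape as the activate_complete callback: caller data plus an error. */
typedef void (*model_activate_complete)(void *data, int err);

/* Hypothetical, heavily reduced view of a SCSI device. */
struct model_sdev {
    enum model_sdev_state state;
    int has_handler_data;   /* 0 once the handler data has been deleted */
};

/* Decide which error, if any, applies before activation is attempted. */
static int model_activate_precheck(const struct model_sdev *sdev)
{
    int err = MODEL_DH_OK;

    if (sdev == NULL)
        return MODEL_DH_NOSYS;

    /* Handler data gone, or device on its way out: nothing to activate. */
    if (!sdev->has_handler_data ||
        sdev->state == MODEL_SDEV_CANCEL ||
        sdev->state == MODEL_SDEV_DEL)
        err = MODEL_DH_NOSYS;

    /* An offlined device gets its own, more specific error. */
    if (sdev->state == MODEL_SDEV_OFFLINE)
        err = MODEL_DH_DEV_OFFLINED;

    return err;
}

/*
 * On error, invoke the completion callback directly so that the caller
 * (DM mpath in the commit message) sees the failure instead of waiting
 * forever.  On success this model returns without calling fn; a real
 * handler would start activation and complete it later.
 */
int model_dh_activate(const struct model_sdev *sdev,
                      model_activate_complete fn, void *data)
{
    int err = model_activate_precheck(sdev);

    if (err && fn)
        fn(data, err);
    return err;
}

/* Example callback: stores the reported error in the int that data points to. */
void model_record_err(void *data, int err)
{
    *(int *)data = err;
}
```

Calling model_dh_activate() on a device whose state is MODEL_SDEV_DEL
returns MODEL_DH_NOSYS and passes the same value to the callback, which
is the propagation the patch relies on to fail the path instead of
leaving I/O hung.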

The patch was tested in an iSCSI environment with the RDAC H/W handler
and multipath.  In the following reproduction process, dd's I/O will
hang forever and the only way to release it is to reboot the machine:
1) Perform I/O on a multipath device:
    dd if=/dev/dm-0 of=/dev/zero bs=8k count=1000000 &
2) Delete all slave SCSI devices contained in the mpath device:
   I)  In an iSCSI environment, the easiest way to do this is by
   stopping iSCSI:
       /etc/init.d/iscsi stop
   II) Another way to delete the devices is by applying the following
   bash scriptlet:
       dm_devs=$(ls /sys/block/ | grep dm- | xargs)
       for dm_dev in $dm_devs; do
         devices=$(ls /sys/block/$dm_dev/slaves)
         for device in $devices; do
            echo 1 > /sys/block/$device/device/delete
         done
       done

NOTE: when DM mpath's fail_path uses blk_abort_queue this scsi_dh change
isn't strictly required.  However, DM mpath's call to blk_abort_queue
will soon be reverted because it has proven to be unsafe due to a race
(between blk_abort_queue and scsi_request_fn) that can lead to list
corruption.  Therefore we cannot rely on blk_abort_queue via fail_path,
but even if we could, this scsi_dh change is still preferable.

Signed-off-by: Menny Hamburger <Menny_Hamburger@Dell.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Babu Moger <babu.moger@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-12-21 12:37:27 -06:00
Kconfig [SCSI] scsi_dh: add generic SPC-3 alua handler 2008-07-26 15:14:52 -04:00
Makefile [SCSI] scsi_dh: add generic SPC-3 alua handler 2008-07-26 15:14:52 -04:00
scsi_dh.c [SCSI] scsi_dh: propagate SCSI device deletion 2010-12-21 12:37:27 -06:00
scsi_dh_alua.c [SCSI] scsi_dh_alua: Handle all states correctly 2010-10-07 17:22:22 -05:00
scsi_dh_emc.c [SCSI] scsi_dh_emc: request flag cleanup 2010-04-11 14:04:02 -05:00
scsi_dh_hp_sw.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
scsi_dh_rdac.c [SCSI] scsi_dh_rdac: Add two new SUN devices to rdac_dev_list 2010-10-25 16:13:24 -05:00