OpenCloudOS-Kernel/drivers
Brian King 309bd27121 [SCSI] scsi: Device scanning oops for offlined devices (resend)
If a device gets offlined as a result of the Inquiry sent
during scanning, the following oops can occur. After the
disk gets put into the SDEV_OFFLINE state, the error handler
sends back the failed inquiry, which wakes the thread doing
the scan. This starts a race between the scanning thread
freeing the scsi device and the error handler calling
scsi_run_host_queues to restart the host. Since the disk
is in the SDEV_OFFLINE state, scsi_device_get will still
work, which results in __scsi_iterate_devices getting
a reference to the scsi disk when it shouldn't.

The following execution thread causes the oops:

CPU 0 (scan)				CPU 1 (eh)

---------------------------------------------------------
scsi_probe_and_add_lun
                        ....
                                        scsi_eh_offline_sdevs
                                        scsi_eh_flush_done_q
scsi_destroy_sdev
scsi_device_dev_release
                                        scsi_restart_operations
                                         scsi_run_host_queues
                                          __scsi_iterate_devices
                                           get_device
scsi_device_dev_release_usercontext
                                          scsi_run_queue
                                            <---OOPS--->

The patch fixes this by changing the state of the sdev to SDEV_DEL
before doing the final put_device, which should prevent the race
from occurring.

Original oops follows:

Badness in kref_get at lib/kref.c:32
Call Trace:
[C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
[C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
[C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
 Exception: 700 at .kref_get+0x10/0x28
    LR = .kobject_get+0x20/0x3c
[C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
[C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
[C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
[C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
[C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
[C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
[C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
data at address 0x000001b8
Faulting instruction address: 0xd0000000000698e4
sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi2 : sym-2.2.2
cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
    pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
    lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
    sp: c00000002f447cb0
   msr: 9000000000009032
   dar: 1b8
 dsisr: 40000000
  current = 0xc0000000045fecd0
  paca    = 0xc00000000048ee80
    pid   = 1123, comm = scsi_eh_1
enter ? for help
[c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
[c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
[c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-06-28 12:39:56 -04:00
..
acorn
acpi [PATCH] fix typo in acpi video brightness changes. 2006-06-23 21:37:34 -07:00
amba
atm [SPARC]: Kill __irq_itoa(). 2006-06-20 01:21:29 -07:00
base Enable minimal per-device resume tracing 2006-06-24 14:47:59 -07:00
block [PATCH] CCISS: tidy up product table indentation 2006-06-25 10:01:22 -07:00
bluetooth [PATCH] pcmcia: use bitfield instead of p_state and state 2006-03-31 17:26:33 +02:00
cdrom [PATCH] cdrom/mcdx: section fixes 2006-06-25 10:01:16 -07:00
char [PATCH] synclink_gt: add GT2 adapter support 2006-06-25 10:01:24 -07:00
connector [PATCH] connector-exports 2006-06-23 07:43:06 -07:00
cpufreq [PATCH] cpufreq build fix 2006-06-23 08:47:27 -07:00
crypto
dio [PATCH] hp300: fix driver_register() return handling, remove dio_module_init() 2006-03-25 08:22:53 -08:00
dma [I/OAT]: Do not use for_each_cpu(). 2006-06-17 21:25:58 -07:00
edac [PATCH] EDAC Coexistence with BIOS 2006-05-03 20:05:41 -07:00
eisa [PATCH] EISA: Ignore generated file drivers/eisa/devlist.h 2006-03-25 08:23:01 -08:00
fc4 [SPARC]: Kill __irq_itoa(). 2006-06-20 01:21:29 -07:00
firmware [PATCH] DMI: cleanup kernel-doc, add to DocBook 2006-06-25 10:01:24 -07:00
hwmon [PATCH] hwmon-vid: Add support for Intel Core and Conroe 2006-06-22 11:10:36 -07:00
i2c Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc 2006-06-22 22:11:30 -07:00
ide [PATCH] ide-floppy: fix debug-only syntax error 2006-06-25 10:01:20 -07:00
ieee1394 [PATCH] ieee1394: nodemgr: do not peek into struct semaphore 2006-06-25 10:00:54 -07:00
infiniband Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband 2006-06-25 16:07:58 -07:00
input [PATCH] random: remove redundant SA_SAMPLE_RANDOM from touchscreen drivers 2006-06-25 10:01:00 -07:00
isdn [PATCH] ISDN: correctly handle isdn_writebuf_stub() errors 2006-06-23 07:43:04 -07:00
leds [PATCH] LED: add LED heartbeat trigger 2006-06-25 10:01:23 -07:00
macintosh [PATCH] Rewritten backlight infrastructure for portable Apple computers 2006-06-25 10:00:59 -07:00
mca
md [PATCH] drivers/md/raid6algos.c: fix a NULL dereference 2006-06-23 07:43:08 -07:00
media Fixes some sync issues between V4L/DVB development and GIT 2006-06-25 02:05:24 -03:00
message Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2006-06-21 11:18:25 -07:00
mfd [PATCH] show MCP menu only on ARCH_SA1100 2006-03-24 07:33:28 -08:00
misc [PATCH] VFS: Permit filesystem to override root dentry on mount 2006-06-23 07:42:45 -07:00
mmc [ARM] 3565/1: AT91RM9200 MMC update 2006-06-19 13:06:05 +01:00
mtd [MTD] NAND: Fix breakage all over the place 2006-06-20 20:31:24 +01:00
net [PATCH] m68knommu: 532x FEC eth struct map 2006-06-25 17:43:33 -07:00
nubus
oprofile [PATCH] oprofile: convert from semaphores to mutexes 2006-06-25 10:01:04 -07:00
parisc [PARISC] Document that we tolerate "Relaxed Ordering" 2006-04-21 22:20:33 +00:00
parport [PATCH] parport: add to kernel-doc 2006-06-25 10:01:25 -07:00
pci [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible 2006-06-21 12:00:01 -07:00
pcmcia [ARM] Fix badge4 build error 2006-06-19 15:37:31 +01:00
pnp [PATCH] pnp: card_probe(): fix memory leak 2006-06-25 10:01:01 -07:00
rapidio
rtc [PATCH] RTC: add rtc-ds1742 driver 2006-06-25 10:01:14 -07:00
s390 [PATCH] kernel/sys.c: cleanups 2006-06-25 10:01:06 -07:00
sbus [PATCH] mm: remove VM_LOCKED before remap_pfn_range and drop VM_SHM 2006-06-25 10:00:55 -07:00
scsi [SCSI] scsi: Device scanning oops for offlined devices (resend) 2006-06-28 12:39:56 -04:00
serial [PATCH] m68knommu: 532x UART support 2006-06-25 17:43:33 -07:00
sh
sn [PATCH] SGI IOC4: Detect IO card variant 2006-06-23 07:43:07 -07:00
spi [PATCH] s3c24xx: fix spi driver with CONFIG_PM 2006-05-26 11:55:46 -07:00
tc [PATCH] kill _INLINE_ 2006-03-23 07:38:16 -08:00
telephony [PATCH] pcmcia: use bitfield instead of p_state and state 2006-03-31 17:26:33 +02:00
usb Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ 2006-06-25 06:44:44 -04:00
video Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb 2006-06-25 10:09:31 -07:00
w1 [PATCH] connector-exports 2006-06-23 07:43:06 -07:00
zorro [PATCH] amiga: fix driver_register() return handling, remove zorro_module_init() 2006-03-25 08:22:53 -08:00
Kconfig [I/OAT]: DMA memcpy subsystem 2006-06-17 21:18:43 -07:00
Makefile [I/OAT]: DMA memcpy subsystem 2006-06-17 21:18:43 -07:00