linux-sg2042/block
Tejun Heo 77ea887e43 implement in-kernel gendisk events handling
Currently, media presence polling for removable block devices is done
from userland.  There are several issues with this.

* Polling is done by periodically opening the device.  For SCSI
  devices, the command sequence generated by such action involves a
  few different commands including TEST_UNIT_READY.  This behavior,
  while perfectly legal, is different from Windows which only issues
  a single command, GET_EVENT_STATUS_NOTIFICATION.  Unfortunately,
  some ATAPI devices lock up after being periodically queried with
  such command sequences.

* There is no reliable and unintrusive way for a userland program to
  tell whether the target device is safe for media presence polling.
  For example, polling for media presence during an on-going burning
  session can make it fail.  The polling program can avoid this by
  opening the device with O_EXCL but then it risks making a valid
  exclusive user of the device fail w/ -EBUSY.

* Userland polling is unnecessarily heavy; an in-kernel implementation
  is lighter and better coordinated (workqueue, timer slack).

This patch implements a framework for in-kernel disk event handling,
which includes media presence polling.

* bdops->check_events() is added, which supersedes ->media_changed().
  It should check whether there's any pending event and return the
  pending events if so.  Currently, two events are defined -
  DISK_EVENT_MEDIA_CHANGE and DISK_EVENT_EJECT_REQUEST.
  ->check_events() is guaranteed not to be called in parallel.

* gendisk->events and ->async_events are added.  These should be
  initialized by the block driver before passing the device to add_disk().
  The former contains the mask of all supported events and the latter
  the mask of all events which the device can report without polling.
  /sys/block/*/events[_async] export these to userland.

* Kernel parameter block.events_dfl_poll_msecs controls the system
  polling interval (default is 0, which means disabled) and
  /sys/block/*/events_poll_msecs controls the polling interval for
  individual devices (default is -1, meaning use the system
  setting).  Note that if a device can report all supported events
  asynchronously and its polling interval isn't explicitly set, the
  device won't be polled regardless of the system polling interval.

* If a device is opened exclusively with write access, event checking
  is automatically disabled until all write exclusive accesses are
  released.

* There are event 'clearing' events.  For example, both currently
  defined events are cleared after the device has been successfully
  opened.  This information is passed to the ->check_events() callback
  using the @clearing argument as a hint.

* Event checking is always performed from system_nrt_wq and timer
  slack is set to 25% for polling.

* Nothing changes for drivers which implement ->media_changed() but
  not ->check_events().  Going forward, all drivers will be converted
  to ->check_events() and ->media_changed() will be dropped.
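The polling interval rules described above can be sketched as a small
userspace model.  This is only an illustration of the stated behavior,
not the code in genhd.c: struct disk_model and effective_poll_msecs()
are hypothetical names, and the numeric values of the two event bits
are assumed for the example.  A result of 0 means the device is not
polled.

```c
/* Event bits named in the commit message; the values are assumed here. */
enum {
	DISK_EVENT_MEDIA_CHANGE  = 1 << 0,
	DISK_EVENT_EJECT_REQUEST = 1 << 1,
};

/* Hypothetical userspace stand-in for the per-disk state. */
struct disk_model {
	unsigned int events;       /* mask of all supported events */
	unsigned int async_events; /* events reportable without polling */
	long poll_msecs;           /* per-device interval, -1 = use system setting */
};

/*
 * Resolve the effective polling interval in milliseconds.
 * dfl_poll_msecs models block.events_dfl_poll_msecs (0 = disabled).
 */
static long effective_poll_msecs(const struct disk_model *d,
				 long dfl_poll_msecs)
{
	/* An explicitly set per-device interval always wins. */
	if (d->poll_msecs >= 0)
		return d->poll_msecs;

	/* Otherwise poll only if some event cannot be reported asynchronously. */
	if (d->events & ~d->async_events)
		return dfl_poll_msecs;

	/* Everything is reported asynchronously: no polling needed. */
	return 0;
}
```

For instance, a device that supports both events, reports neither
asynchronously and keeps events_poll_msecs at -1 follows the system
interval.  Once async_events covers every supported event, the same
device is no longer polled unless its own interval is set explicitly.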

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-16 17:53:38 +01:00
Kconfig blkio: Core implementation of throttle policy 2010-09-16 08:42:52 +02:00
Kconfig.iosched blk-cgroup: config options re-arrangement 2010-04-26 19:27:56 +02:00
Makefile Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block 2010-10-22 17:07:18 -07:00
blk-cgroup.c blk-cgroup: Allow creation of hierarchical cgroups 2010-11-15 19:37:36 +01:00
blk-cgroup.h blkio-throttle: limit max iops value to UINT_MAX 2010-10-01 21:16:41 +02:00
blk-core.c block: Rename "block_remap" tracepoint to "block_bio_remap" to clarify the event. 2010-11-16 12:53:39 +01:00
blk-exec.c block: Prevent hang_check firing during long I/O 2010-09-24 15:52:09 +02:00
blk-flush.c block: remove BLKDEV_IFL_WAIT 2010-09-16 20:52:58 +02:00
blk-integrity.c block: Fix double free in blk_integrity_unregister 2010-10-15 15:49:18 +02:00
blk-ioc.c block: remove unused copy_io_context() 2010-11-11 13:40:11 +01:00
blk-iopoll.c tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
blk-lib.c block: remove BLKDEV_IFL_WAIT 2010-09-16 20:52:58 +02:00
blk-map.c block: check for proper length of iov entries in blk_rq_map_user_iov() 2010-11-10 14:40:42 +01:00
blk-merge.c Revert "block: fix accounting bug on cross partition merges" 2010-10-24 22:06:02 +02:00
blk-settings.c Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block 2010-10-22 17:07:18 -07:00
blk-softirq.c generic-ipi: remove CSD_FLAG_WAIT 2009-02-25 14:13:44 +01:00
blk-sysfs.c block: fix use-after-free bug in blk throttle code 2010-10-23 20:40:26 +02:00
blk-tag.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
blk-throttle.c blkio-throttle: Fix possible multiplication overflow in iops calculations 2010-10-01 21:16:42 +02:00
blk-timeout.c block: ensure jiffies wrap is handled correctly in blk_rq_timed_out_timer 2010-04-21 17:42:08 +02:00
blk.h Revert "block: fix accounting bug on cross partition merges" 2010-10-24 22:06:02 +02:00
bsg.c Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
cfq-iosched.c block cfq: select new workload if priority changed 2010-12-13 14:32:22 +01:00
cfq.h blk-cgroup: Prepare the base for supporting more than one IO control policies 2010-09-16 08:42:04 +02:00
compat_ioctl.c block: read i_size with i_size_read() 2010-11-10 14:40:53 +01:00
deadline-iosched.c block: convert to pos and nr_sectors accessors 2009-05-11 09:50:54 +02:00
elevator.c block: remove REQ_HARDBARRIER 2010-11-10 14:54:09 +01:00
genhd.c implement in-kernel gendisk events handling 2010-12-16 17:53:38 +01:00
ioctl.c Merge branch 'cleanup-bd_claim' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into for-2.6.38/core 2010-11-27 19:49:18 +01:00
noop-iosched.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
scsi_ioctl.c block: take care not to overflow when calculating total iov length 2010-11-10 14:40:42 +01:00