linux-sg2042/drivers
majianpeng 79ef3a8aa1 raid1: Rewrite the implementation of iobarrier.
There is an iobarrier in raid1 because of contention between normal IO and
resync IO.  It suspends all normal IO when resync/recovery happens.

However, if normal IO is outside the resync window, there is no contention.
So this patch changes the barrier mechanism to only block IO that
could contend with the resync that is currently happening.

We partition the whole space into five parts.
|---------|-----------|------------|----------------|-------|
        start   next_resync   start_next_window    end_window

start + RESYNC_WINDOW = next_resync
next_resync + NEXT_NORMALIO_DISTANCE = start_next_window
start_next_window + NEXT_NORMALIO_DISTANCE = end_window

Firstly we introduce some concepts:

1 - RESYNC_WINDOW: For resync, at most 32 resync requests are in flight at
      the same time. Each sync request is RESYNC_BLOCK_SIZE (64*1024 bytes),
      so the RESYNC_WINDOW is 32 * RESYNC_BLOCK_SIZE, that is 2MB.
2 - NEXT_NORMALIO_DISTANCE: the distance between next_resync
      and start_next_window.  It also indicates the distance between
      start_next_window and end_window.
      It is currently 3 * RESYNC_WINDOW but could be tuned if
      this turns out not to be optimal.
3 - next_resync: the next sector at which we will do sync IO.
4 - start: a position which is at most RESYNC_WINDOW before
      next_resync.
5 - start_next_window:  a position which is NEXT_NORMALIO_DISTANCE
      beyond next_resync.  Normal-io after this position doesn't need to
      wait for resync-io to complete.
6 - end_window:  a position which is 2 * NEXT_NORMALIO_DISTANCE beyond
      next_resync.  Normal-io after this position also doesn't need to
      wait, but is counted differently.
7 - current_window_requests:  the count of normalIO between
      start_next_window and end_window.
8 - next_window_requests: the count of normalIO after end_window.
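
The arithmetic behind these definitions can be sketched as follows. This is
only an illustration: the 512-byte sector size, the name RESYNC_DEPTH, the
struct resync_layout type and the layout_for() helper are assumptions made
for the example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Sizes from the text above; a 512-byte sector is an assumption. */
#define RESYNC_BLOCK_SIZE      (64 * 1024)   /* bytes per sync request */
#define RESYNC_DEPTH           32            /* hypothetical name: max sync requests in flight */
#define RESYNC_WINDOW          (RESYNC_DEPTH * RESYNC_BLOCK_SIZE)   /* 2MB */
#define RESYNC_WINDOW_SECTORS  (RESYNC_WINDOW >> 9)
#define NEXT_NORMALIO_DISTANCE (3 * RESYNC_WINDOW_SECTORS)

/* Hypothetical container for the four boundaries, in sectors. */
struct resync_layout {
	sector_t start;
	sector_t next_resync;
	sector_t start_next_window;
	sector_t end_window;
};

/* Derive the boundaries from next_resync using the equations above. */
static struct resync_layout layout_for(sector_t next_resync)
{
	struct resync_layout l;

	l.next_resync = next_resync;
	/* start is at most RESYNC_WINDOW before next_resync; clamp at sector 0 */
	l.start = next_resync > RESYNC_WINDOW_SECTORS ?
		  next_resync - RESYNC_WINDOW_SECTORS : 0;
	l.start_next_window = next_resync + NEXT_NORMALIO_DISTANCE;
	l.end_window = l.start_next_window + NEXT_NORMALIO_DISTANCE;
	assert(l.start <= l.next_resync && l.next_resync < l.start_next_window);
	return l;
}
```

With these assumptions RESYNC_WINDOW is 4096 sectors and
NEXT_NORMALIO_DISTANCE is 12288 sectors.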

NormalIO will be partitioned into four types:

NormIO1:  the end sector of the bio is smaller than or equal to start
NormIO2:  the start sector of the bio is larger than or equal to end_window
NormIO3:  the start sector of the bio is larger than or equal to
          start_next_window (and smaller than end_window)
NormIO4:  the bio lies between start and start_next_window

|---------|-------------|-------------------|---------------|---------|
        start      next_resync      start_next_window  end_window
  NormIO1     NormIO4          NormIO4           NormIO3      NormIO2

For NormIO1, we don't need any io barrier.
For NormIO4, we use a similar approach to the original iobarrier
    mechanism.  The normalIO and resyncIO must be kept separate.
For NormIO2/3, we add two fields to struct r1conf: "current_window_requests"
    and "next_window_requests". They indicate the count of active
    requests in the two windows.
    For these, we don't wait for resync io to complete.
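
A minimal sketch of this classification, following the diagram above. The
enum normio_class, struct resync_bounds and classify() names are hypothetical
and exist only for this example; the patch itself does not necessarily
classify bios with a single function.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical labels for the four types of normal IO. */
enum normio_class { NORMIO1 = 1, NORMIO2, NORMIO3, NORMIO4 };

/* Hypothetical snapshot of the window boundaries, in sectors. */
struct resync_bounds {
	sector_t start;
	sector_t start_next_window;
	sector_t end_window;
};

/* Classify a bio covering the sectors [bio_start, bio_end). */
static enum normio_class classify(sector_t bio_start, sector_t bio_end,
				  const struct resync_bounds *b)
{
	assert(b->start <= b->start_next_window &&
	       b->start_next_window <= b->end_window);

	if (bio_end <= b->start)
		return NORMIO1;	/* entirely behind the resync window: no barrier */
	if (bio_start >= b->end_window)
		return NORMIO2;	/* counted in next_window_requests */
	if (bio_start >= b->start_next_window)
		return NORMIO3;	/* counted in current_window_requests */
	return NORMIO4;		/* may contend with resync: must be kept separate */
}
```

A bio that straddles start (it begins before start but ends after it) falls
through to NormIO4 in this sketch, which is the conservative choice.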

For the resync action, if there are NormIO4s, we must wait for them.
If not, we can proceed.
But if the resync action reaches start_next_window and
current_window_requests > 0 (that is, there are NormIO3s), we must
wait until current_window_requests becomes zero.
When current_window_requests becomes zero, start_next_window also
moves forward. Then current_window_requests is replaced by
next_window_requests.
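
The wait condition for the resync side can be sketched as a predicate. This
is an illustration of the rule stated above, not the code in the patch:
struct barrier_model, its normio4_pending field and resync_must_wait() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical model of the state the resync side looks at. */
struct barrier_model {
	sector_t next_resync;
	sector_t start_next_window;
	int normio4_pending;		/* hypothetical: in-flight NormIO4 requests */
	int current_window_requests;	/* in-flight NormIO3 requests */
};

/* Return true if resync has to wait before issuing more sync IO. */
static bool resync_must_wait(const struct barrier_model *m)
{
	assert(m->normio4_pending >= 0 && m->current_window_requests >= 0);

	/* NormIO4 overlaps the region being resynced: always wait for it. */
	if (m->normio4_pending > 0)
		return true;
	/* Resync has reached the current window and NormIO3 is still active. */
	if (m->next_resync >= m->start_next_window &&
	    m->current_window_requests > 0)
		return true;
	return false;
}
```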

One problem is when and how to change from NormIO2 to NormIO3.
Only then can the sync action progress.

We add a field in struct r1conf "start_next_window".

A: if start_next_window == MaxSector, it means there are no NormIO2/3.
   So start_next_window = next_resync + NEXT_NORMALIO_DISTANCE
B: if current_window_requests == 0 && next_window_requests != 0, it
   means start_next_window moves to end_window
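
Rules A and B can be sketched as two small helpers. The struct window_model
type, the open_window() and roll_window() names and the value chosen for
NEXT_NORMALIO_DISTANCE (12288 sectors, assuming 512-byte sectors) are
assumptions for this example only.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

#define MaxSector              (~(sector_t)0)
#define NEXT_NORMALIO_DISTANCE ((sector_t)12288)	/* assumed: 3 * 2MB in 512-byte sectors */

/* Hypothetical model of the fields involved in rules A and B. */
struct window_model {
	sector_t next_resync;
	sector_t start_next_window;
	int current_window_requests;
	int next_window_requests;
};

/* Rule A: no NormIO2/3 is being tracked, so open a window past next_resync. */
static void open_window(struct window_model *w)
{
	if (w->start_next_window == MaxSector)
		w->start_next_window = w->next_resync + NEXT_NORMALIO_DISTANCE;
}

/* Rule B: the current window has drained, so the next window becomes current. */
static void roll_window(struct window_model *w)
{
	assert(w->current_window_requests >= 0 && w->next_window_requests >= 0);

	if (w->current_window_requests == 0 && w->next_window_requests != 0) {
		/* start_next_window moves to the old end_window */
		w->start_next_window += NEXT_NORMALIO_DISTANCE;
		w->current_window_requests = w->next_window_requests;
		w->next_window_requests = 0;
	}
	/* The case where both counters are zero is not covered by rule B
	 * and is left unchanged in this sketch. */
}
```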

Another problem is how to differentiate between an old NormIO2
(which is now NormIO3) and a current NormIO2.
For example, there are many bios which are NormIO2 and one bio which is
NormIO3. If the NormIO3 bio completes first, the NormIO2 bios become NormIO3.

We add a field in struct r1bio "start_next_window".
This is used to record the position conf->start_next_window when the call
to wait_barrier() is made in make_request().

In allow_barrier(), we check the conf->start_next_window.
If r1bio->start_next_window == conf->start_next_window, it means
there was no transition between NormIO2 and NormIO3.
If r1bio->start_next_window != conf->start_next_window, it means
there was a transition between NormIO2 and NormIO3.  There can only
have been one transition.  So it can only mean the bio is an old NormIO2.
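
The completion-side accounting described here can be sketched as follows.
This is a simplified model, not the allow_barrier() code from the patch:
struct conf_model, account_end() and the NEXT_NORMALIO_DISTANCE value used
in the example are assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

#define NEXT_NORMALIO_DISTANCE ((sector_t)4000)	/* illustrative value only */

/* Hypothetical model of the counters kept in struct r1conf. */
struct conf_model {
	sector_t start_next_window;
	int current_window_requests;
	int next_window_requests;
};

/*
 * Drop the count taken for a completed request.  "recorded" is the value of
 * conf->start_next_window saved in the r1bio when wait_barrier() was called.
 */
static void account_end(struct conf_model *c, sector_t recorded,
			sector_t bio_start)
{
	if (recorded == c->start_next_window) {
		/* No transition: the bio is still in the window it was counted in. */
		if (bio_start >= c->start_next_window + NEXT_NORMALIO_DISTANCE)
			c->next_window_requests--;	/* NormIO2 */
		else
			c->current_window_requests--;	/* NormIO3 */
	} else {
		/* One transition happened: an old NormIO2 is now counted as NormIO3. */
		c->current_window_requests--;
	}
	assert(c->current_window_requests >= 0 && c->next_window_requests >= 0);
}
```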

For one bio, there may be many r1bios, so we make sure that
r1bio->start_next_window has the same value in all of them.
If make_request() meets a blocked device, it must call allow_barrier()
and then wait_barrier() again, so the value of conf->start_next_window
before and after may differ.
If the r1bios of one bio had different start_next_window values,
the accounting for that bio would depend on the value in the last r1bio,
which would cause errors. To avoid this, we must wait for the previous
r1bios to complete.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-19 15:19:18 +11:00
accessibility
acpi Merge branch 'linus' into sched/core 2013-11-01 08:24:41 +01:00
amba
ata Merge branch 'for-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2013-10-22 08:21:34 +01:00
atm
auxdisplay
base devres: restore zeroing behavior of devres_alloc() 2013-10-25 05:46:27 +01:00
bcma Merge 3.12-rc6 into driver-core-next 2013-10-19 13:05:38 -07:00
block Via Paul Walmsley <paul@pwsan.com>: 2013-10-28 14:39:03 -07:00
bluetooth Bluetooth: btusb: Add support for Belkin F8065bf 2013-09-23 17:44:25 -03:00
bus ARM: driver updates for 3.13 2013-11-11 17:05:37 +09:00
cdrom
char Merge 3.12-rc6 into char-misc-next 2013-10-19 13:02:47 -07:00
clk ARM: SoC DT updates for 3.13 2013-11-11 17:34:56 +09:00
clocksource clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast 2013-09-26 02:31:04 +02:00
connector connector: use 'size' everywhere in cn_netlink_send() 2013-10-02 16:03:50 -04:00
cpufreq ARM: SoC platform changes for 3.13 2013-11-11 16:49:45 +09:00
cpuidle cpuidle: calxeda: add support to use PSCI calls 2013-10-01 16:30:56 -05:00
crypto
dca
devfreq
dio
dma ARM: driver updates for 3.13 2013-11-11 17:05:37 +09:00
edac
eisa
extcon Update extcon for 3.13 2013-09-26 20:47:25 -07:00
firewire
firmware
fmc
gpio ARM: SoC DT updates for 3.13 2013-11-11 17:34:56 +09:00
gpu i915: fix compiler warning 2013-10-31 15:28:23 -07:00
hid Staging driver update for 3.13-rc1 2013-11-07 15:07:58 +09:00
hsi hsi: convert bus code to use dev_groups 2013-10-16 18:36:04 -07:00
hv Drivers: hv: vmbus: Fix a bug in channel rescind code 2013-10-19 19:53:46 -07:00
hwmon hwmon: (applesmc) Always read until end of data 2013-10-09 09:48:55 -07:00
hwspinlock
i2c i2c: i2c-mux-pinctrl: use deferred probe when adapter not found 2013-10-10 10:22:35 +02:00
ide ARM: SoC cleanups for 3.13 2013-11-11 16:42:43 +09:00
idle sched, idle: Fix the idle polling state logic 2013-09-25 13:53:10 +02:00
iio iio: light: vcnl4000: Remove redundant code 2013-10-24 14:48:14 +01:00
infiniband Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-10-27 10:16:33 -07:00
input ARM: SoC cleanups for 3.13 2013-11-11 16:42:43 +09:00
iommu x86, build, pci: Fix PCI_MSI build on !SMP 2013-10-04 10:43:34 -07:00
ipack ipack: convert bus code to use dev_groups 2013-10-16 18:40:57 -07:00
irqchip Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-11-12 10:02:59 +09:00
isdn
leds
lguest
macintosh
mailbox
md raid1: Rewrite the implementation of iobarrier. 2013-11-19 15:19:18 +11:00
media ARM: SoC cleanups for 3.13 2013-11-11 16:42:43 +09:00
memory
memstick memstick: convert bus code to use dev_groups 2013-10-16 18:40:58 -07:00
message i2o: convert bus code to use dev_groups 2013-10-16 18:40:58 -07:00
mfd mfd: dbx500: Remove any mention of the BML8580CLK 2013-09-26 11:04:16 +02:00
misc Driver Core / sysfs patches for 3.13-rc1 2013-11-07 11:42:15 +09:00
mmc Merge 3.12-rc6 into driver-core-next 2013-10-19 13:05:38 -07:00
mtd Driver Core / sysfs patches for 3.13-rc1 2013-11-07 11:42:15 +09:00
net Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-11-08 08:32:58 +09:00
nfc
ntb
nubus
of Revert "drivers: of: add initialization code for dma reserved memory" 2013-10-15 09:26:07 +01:00
oprofile
parisc
parport
pci ARM: driver updates for 3.13 2013-11-11 17:05:37 +09:00
pcmcia Driver Core / sysfs patches for 3.13-rc1 2013-11-07 11:42:15 +09:00
phy usb: patches for v3.13 2013-10-24 16:18:40 +01:00
pinctrl pinctrl: single: Fix build when not built on ARM 2013-10-18 16:43:06 -07:00
platform platform/x86: fix asus-wmi build error 2013-10-23 07:57:57 +01:00
pnp PNP: convert bus code to use dev_groups 2013-10-16 18:36:02 -07:00
power
pps
ps3
ptp
pwm
rapidio rapidio: convert bus code to use dev_groups 2013-10-16 18:36:03 -07:00
regulator Merge remote-tracking branch 'regulator/fix/wm8350' into regulator-linus 2013-09-30 12:04:33 +01:00
remoteproc
reset
rpmsg
rtc HID RTC: Open sensor hub open close 2013-10-01 22:06:15 +01:00
s390 s390/scm_blk: fix endless loop for requests != REQ_TYPE_FS 2013-11-06 14:32:22 +01:00
sbus
scsi Driver Core / sysfs patches for 3.13-rc1 2013-11-07 11:42:15 +09:00
sfi
sh
sn
spi Merge remote-tracking branch 'spi/fix/s3c64xx' into spi-linus 2013-10-07 14:51:59 +01:00
ssb ssb: convert bus code to use dev_groups 2013-10-16 18:36:03 -07:00
staging Staging driver update for 3.13-rc1 2013-11-07 15:07:58 +09:00
target target/pscsi: fix return value check 2013-10-25 10:42:09 -07:00
tc
thermal Merge branch 'x86_pkg_temp' of .git into for-rc 2013-10-21 11:26:45 +08:00
tty Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2013-11-08 08:24:38 +09:00
uio Char/Misc patches for 3.13-rc1 2013-11-07 09:41:06 +09:00
usb ARM: SoC DT updates for 3.13 2013-11-11 17:34:56 +09:00
uwb Driver Core / sysfs patches for 3.13-rc1 2013-11-07 11:42:15 +09:00
vfio VFIO: vfio_iommu_type1: fix bug caused by break in nested loop 2013-10-11 10:40:46 -06:00
vhost vhost/scsi: Fix incorrect usage of get_user_pages_fast write parameter 2013-10-25 11:03:34 -07:00
video Merge branch 'parisc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2013-11-11 18:15:25 +09:00
virt
virtio virtio: convert bus code to use dev_groups 2013-10-16 18:40:57 -07:00
vlynq
vme
w1 w1-gpio: Use devm_* functions 2013-10-29 16:58:18 -07:00
watchdog watchdog: sunxi: Fix section mismatch 2013-10-13 20:02:03 +02:00
xen xenbus: convert bus code to use dev_groups 2013-10-16 18:36:03 -07:00
zorro
Kconfig drivers: phy: add generic PHY framework 2013-09-27 17:35:41 -07:00
Makefile drivers: phy: add generic PHY framework 2013-09-27 17:35:41 -07:00