We have found a race in the optimisation used in the dm cache
writethrough implementation. Currently, dm core sends the cache target
two bios, one for the origin device and one for the cache device and
these are processed in parallel. This patch avoids the race by
changing the code back to a simpler (slower) implementation which
processes the two writes in series, one after the other, until we can
develop a complete fix for the problem.
When the cache is in writethrough mode it needs to send WRITE bios to
both the origin and cache devices.
Previously we've been implementing this by having dm core query the
cache target on every write to find out how many copies of the bio it
wants. The cache will ask for two bios if the block is in the cache,
and one otherwise.
Then main problem with this is it's racey. At the time this check is
made the bio hasn't yet been submitted and so isn't being taken into
account when quiescing a block for migration (promotion or demotion).
This means a single bio may be submitted when two were needed because
the block has since been promoted to the cache (catastrophic), or two
bios where only one is needed (harmless).
I really don't want to start entering bios into the quiescing system
(deferred_set) in the get_num_write_bios callback. Instead this patch
simplifies things; only one bio is submitted by the core, this is
first written to the origin and then the cache device in series.
Obviously this will have a latency impact.
deferred_writethrough_bios is introduced to record bios that must be
later issued to the cache device from the worker thread. This deferred
submission, after the origin bio completes, is required given that we're
in interrupt context (writethrough_endio).
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>