464 Commits

Author SHA1 Message Date
Andreas Gruenbacher a6b32bc3ce drbd: Introduce "peer_device" object between "device" and "connection"
In a setup where a device (aka volume) can replicate to multiple peers and one
connection can be shared between multiple devices, we need separate objects to
represent devices on peer nodes and network connections.

As a first step to introduce multiple connections per device, give each
drbd_device object a single drbd_peer_device object which connects it to a
drbd_connection object.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:44:51 +01:00
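The object relationship introduced by a6b32bc3ce can be pictured with a minimal sketch. The three struct names come from the commit message; the fields shown and the helper `device_connection()` are assumptions made for illustration and are not the actual DRBD definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins only; the real DRBD structs carry many more fields. */
struct drbd_connection {
	const char *peer_name;                 /* hypothetical field */
};

struct drbd_device;

/* Ties one device (volume) to one network connection. */
struct drbd_peer_device {
	struct drbd_device *device;
	struct drbd_connection *connection;
};

struct drbd_device {
	int minor;                             /* hypothetical field */
	struct drbd_peer_device *peer_device;  /* a single peer, as a first step */
};

/* Hypothetical helper: walk device -> peer_device -> connection. */
static struct drbd_connection *device_connection(const struct drbd_device *device)
{
	assert(device != NULL);
	return device->peer_device ? device->peer_device->connection : NULL;
}
```

Going through the intermediate object means a later change could give a device several peer devices without touching the connection object itself.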
Andreas Gruenbacher bde89a9e15 drbd: Rename drbd_tconn -> drbd_connection
sed -i -e 's:all_tconn:connections:g' -e 's:tconn:connection:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:44:47 +01:00
Andreas Gruenbacher b30ab7913b drbd: Rename "mdev" to "device"
sed -i -e 's:mdev:device:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:42:24 +01:00
Andreas Gruenbacher 5476169793 drbd: Rename struct drbd_conf -> struct drbd_device
sed -i -e 's:\<drbd_conf\>:drbd_device:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:36:44 +01:00
Andreas Gruenbacher a3603a6e3b drbd: Split off on-the-wire protocol definitions
Keep the protocol definitions separate from the kernel code; they are useful in
their own right.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:27:49 +01:00
Rashika Kheria de0b2e69b6 drivers: block: Move prototype declaration to appropriate header file from drbd_main.c
Move prototype declaration of functions drbdd_init() and drbd_asender()
from drbd/drbd_main.c to header file drbd/drbd_int.h because these
functions are used by more than one file.

This eliminates the following warning in drbd/drbd_receiver.c:
drivers/block/drbd/drbd_receiver.c:4836:5: warning: no previous prototype for ‘drbdd_init’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_receiver.c:5245:5: warning: no previous prototype for ‘drbd_asender’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:19:39 +01:00
Rashika Kheria 16f4e743c8 drivers: block: Mark functions as static in drbd_main.c
Mark functions _drbd_send_uuids(), fill_bitmap_rle_bits() and
init_submitter() as static in drbd/drbd_main.c because they are
not used outside this file.

This eliminates the following warnings in drbd/drbd_main.c:
drivers/block/drbd/drbd_main.c:826:5: warning: no previous prototype for ‘_drbd_send_uuids’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_main.c:1070:5: warning: no previous prototype for ‘fill_bitmap_rle_bits’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_main.c:2592:5: warning: no previous prototype for ‘init_submitter’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:19:38 +01:00
Kent Overstreet 4550dd6c6b block: Immutable bio vecs
This adds a mechanism by which we can advance a bio by an arbitrary
number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done
indicates the number of bytes completed in the current bvec.

Various driver code still needs to be updated to not refer to the bvec
directly before we can use this for interesting things, like efficient
bio splitting.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
2013-11-23 22:33:49 -08:00
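The idea behind 4550dd6c6b, advancing by an arbitrary byte count while leaving the segment array untouched, can be sketched in plain C. The types `seg` and `seg_iter` and the function `seg_iter_advance()` are simplified stand-ins invented for this example; only the role of `bvec_done` follows the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins; the kernel's bio_vec and bvec_iter hold more fields. */
struct seg {
	size_t len;                /* length of one segment in bytes */
};

struct seg_iter {
	size_t size;               /* bytes remaining overall */
	size_t idx;                /* index of the current segment */
	size_t bvec_done;          /* bytes completed in the current segment */
};

/* Advance the iterator by `bytes`; the segment array is only read. */
static void seg_iter_advance(const struct seg *segs, struct seg_iter *it,
			     size_t bytes)
{
	assert(bytes <= it->size);
	it->size -= bytes;

	while (bytes) {
		size_t left = segs[it->idx].len - it->bvec_done;
		size_t step = bytes < left ? bytes : left;

		bytes -= step;
		it->bvec_done += step;
		if (it->bvec_done == segs[it->idx].len) {
			/* current segment fully consumed, move to the next */
			it->idx++;
			it->bvec_done = 0;
		}
	}
}
```

Because all progress lives in the iterator, two iterators over the same array can sit at different positions, which is what makes cheap splitting plausible.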
Kent Overstreet 7988613b0e block: Convert bio_for_each_segment() to bvec_iter
More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly, they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: support@lsi.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-m68k@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: DL-MPTFusionLinux@lsi.com
Cc: linux-scsi@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: linux-fsdevel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: linux-mm@kvack.org
Acked-by: Geoff Levand <geoff@infradead.org>
2013-11-23 22:33:49 -08:00
Lars Ellenberg 69babf05cb drbd: fix NULL pointer deref in module init error path
If we want to iterate over the (as of yet still empty) list in the
cleanup path, we need to initialize the list before the first goto fail.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00
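The ordering problem fixed by 69babf05cb can be reproduced in a few lines. This is a sketch under assumptions: `module_init_sketch()`, `cleanup()`, and the `resources` list are hypothetical names, and the `fail_early` flag merely simulates an early allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct node {
	struct node *next, *prev;
};

static void list_init(struct node *head)
{
	head->next = head->prev = head;
}

/* Walks the list, so it dereferences head->next: head must be initialized. */
static int cleanup(struct node *head)
{
	struct node *p;
	int n = 0;

	assert(head->next != NULL);
	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}

static struct node resources;          /* zero-initialized: next == NULL */

/* Hypothetical init function; `fail_early` simulates a failed allocation. */
static int module_init_sketch(int fail_early)
{
	list_init(&resources);         /* must come before the first goto fail */

	if (fail_early)
		goto fail;
	return 0;
fail:
	cleanup(&resources);           /* safe: iterates an empty list */
	return -ENOMEM;
}
```

If `list_init()` were moved below the first `goto fail`, the cleanup path would follow a NULL `next` pointer, which is the dereference the commit describes.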
Philipp Reisner d752b26960 drbd: Allow online change of al-stripes and al-stripe-size
Allow changing the AL layout with a resize operation. For that,
the resize command gets two new fields: al_stripes and al_stripe_size.

In order to make the operation crash safe:
1) Lock out all IO and MD-IO
2) Write the super block with MDF_PRIMARY_IND clear
3) write the bitmap to the new location (all zeros, since
   we allow only while connected)
4) Initialize the new AL-area
5) Write the super block with the restored MDF_PRIMARY_IND.
6) Unfreeze all IO

Since the AL-layout has no influence on the protocol, this operation
needs to be performed on both sides of a resource (if intended).

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-06-28 16:04:36 +02:00
Wei Yongjun 6110d70bdf drbd: fix error return code in drbd_init()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-06-28 16:04:36 +02:00
Linus Torvalds ebb3727779 Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "It might look big in volume, but when categorized, not a lot of
  drivers are touched.  The pull request contains:

   - mtip32xx fixes from Micron.

   - A slew of drbd updates, this time in a nicer series.

   - bcache, a flash/ssd caching framework from Kent.

   - Fixes for cciss"

* 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits)
  bcache: Use bd_link_disk_holder()
  bcache: Allocator cleanup/fixes
  cciss: bug fix to prevent cciss from loading in kdump crash kernel
  cciss: add cciss_allow_hpsa module parameter
  drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions
  mtip32xx: Workaround for unaligned writes
  bcache: Make sure blocksize isn't smaller than device blocksize
  bcache: Fix merge_bvec_fn usage for when it modifies the bvm
  bcache: Correctly check against BIO_MAX_PAGES
  bcache: Hack around stuff that clones up to bi_max_vecs
  bcache: Set ra_pages based on backing device's ra_pages
  bcache: Take data offset from the bdev superblock.
  mtip32xx: mtip32xx: Disable TRIM support
  mtip32xx: fix a smatch warning
  bcache: Disable broken btree fuzz tester
  bcache: Fix a format string overflow
  bcache: Fix a minor memory leak on device teardown
  bcache: Documentation updates
  bcache: Use WARN_ONCE() instead of __WARN()
  bcache: Add missing #include <linux/prefetch.h>
  ...
2013-05-08 11:51:05 -07:00
Al Viro db2a144bed block_device_operations->release() should return void
The value passed is 0 in all but "it can never happen" cases (and those
only in a couple of drivers) *and* it would've been lost on the way
out anyway, even if something tried to pass something meaningful.
Just don't bother.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-07 02:16:21 -04:00
Lars Ellenberg 94ad0a1014 drbd: fix memory leak
We forgot to free the disk_conf,
so for each attach/detach cycle we leaked 336 bytes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28 10:10:25 -06:00
Philipp Reisner bb45185de2 drbd: fix spurious warning about bitmap being locked from detach
Introduced in drbd: always write bitmap on detach,
the bitmap bulk writeout on detach was indicating
it expected exclusive bitmap access.

Where I meant to say: expect no more modifications,
but testing/counting is still allowed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28 10:10:25 -06:00
Lars Ellenberg 113fef9e20 drbd: prepare to queue write requests on a submit worker
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:14:40 -06:00
Lars Ellenberg c04ccaa669 drbd: read meta data early, base on-disk offsets on super block
We used to calculate all on-disk meta data offsets, and then compare
the stored offsets, basically treating them as magic numbers.

Now with the activity log striping, the activity log size is no longer
fixed.  We need to first read the super block, then base the activity
log and bitmap offsets on the stored offsets/al stripe settings.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg cccac9857d drbd: mechanically rename la_size to la_size_sect
Make it obvious that this value is in units of 512 Byte sectors.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg 3a4d4eb3cb drbd: prepare for new striped layout of activity log
Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k
The intended use case is activity log on RAID 0 or similar.
Logically consecutive transactions will advance their on-disk position
by al_stripe_size_4k 4kB (transaction sized) blocks.

Right now, these are still asserted to be the backward compatible
values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB).

Also introduce a caching member for meta_dev_idx in the in-core
structure: even though it is initially passed in in the rcu-protected
disk_conf structure, it cannot change without a detach/attach cycle.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
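The striping rule described in 3a4d4eb3cb, where consecutive transactions advance by al_stripe_size_4k blocks, can be written as a small mapping function. This is a sketch based only on the commit message: the function name `al_tr_to_4k_block()` and the assumption that the log wraps after `al_stripes * al_stripe_size_4k` blocks are not taken from the DRBD source.

```c
#include <assert.h>

/*
 * Map a transaction number to a 4 kB block index inside the activity log.
 * Transaction t goes to stripe (t % al_stripes), at slot (t / al_stripes)
 * within that stripe.
 */
static unsigned int al_tr_to_4k_block(unsigned int tr,
				      unsigned int al_stripes,
				      unsigned int al_stripe_size_4k)
{
	unsigned int al_size_4k, t;

	assert(al_stripes > 0 && al_stripe_size_4k > 0);
	al_size_4k = al_stripes * al_stripe_size_4k;

	t = tr % al_size_4k;           /* assumed ring buffer wrap-around */
	return (t % al_stripes) * al_stripe_size_4k + t / al_stripes;
}
```

With the backward compatible values al_stripes = 1 and al_stripe_size_4k = 8, the mapping reduces to `tr % 8`, a single ring of 8 blocks of 4 kB, which is the 32 kB mentioned in the message.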
Lars Ellenberg ae8bf312e9 drbd: cleanup ondisk meta data layout calculations and defines
Add a comment about our meta data layout variants,
and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT)
to make it clear that they are short hand for fixed constants,
and not arbitrarily to be redefined as one may see fit.

Properly pad struct meta_data_on_disk to 4kB,
and initialize to zero not only the first 512 Byte,
but all of it in drbd_md_sync().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg 9114d79569 drbd: cleanup bogus assert message
This fixes ASSERT( mdev->state.disk == D_FAILED ) in drivers/block/drbd/drbd_main.c

When we detach from local disk, we let the local refcount hit zero twice.

First, we transition to D_FAILED, so we won't give out new references
to incoming requests; we still may give out *internal* references, though.
Once the refcount hits zero [1] while in D_FAILED, we queue a transition
to D_DISKLESS to our worker.  We need to queue it, because we may be in
atomic context when putting the reference.
Once the transition to D_DISKLESS actually happened [2] from worker context,
we don't give out new internal references either.

Between hitting zero the first time [1] and the actual transition to
D_DISKLESS [2], there may be a few very short lived internal get/put,
so we may hit zero more than once while being in D_FAILED, or even see a
race where an internal get_ldev() happened while D_FAILED, but the
corresponding put_ldev() happens just after the transition to D_DISKLESS.

That's why we have the additional test_and_set_bit(GO_DISKLESS,);
and that's why the assert was placed wrong.
Since there was exactly one code path left to drbd_go_diskless(),
and that checks already for D_FAILED, drop that assert,
and fold in the drbd_queue_work().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
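The "hit zero more than once" situation from 9114d79569 can be modelled in user space with C11 atomics. This is a loose sketch: `dev_sketch`, `get_ldev_sketch()`, and `put_ldev_sketch()` are hypothetical names, an `atomic_flag` stands in for the GO_DISKLESS bit, and a counter stands in for queueing the transition work.

```c
#include <assert.h>
#include <stdatomic.h>

enum disk_state { D_DISKLESS, D_FAILED, D_UP_TO_DATE };

struct dev_sketch {
	atomic_int local_cnt;          /* local disk reference count */
	enum disk_state disk;
	atomic_flag go_diskless;       /* plays the role of GO_DISKLESS */
	int queued;                    /* how often the transition was queued */
};

static void dev_sketch_init(struct dev_sketch *d, int refs)
{
	atomic_init(&d->local_cnt, refs);
	atomic_flag_clear(&d->go_diskless);
	d->disk = D_UP_TO_DATE;
	d->queued = 0;
}

/* Internal reference; still possible while in D_FAILED. */
static void get_ldev_sketch(struct dev_sketch *d)
{
	atomic_fetch_add(&d->local_cnt, 1);
}

static void put_ldev_sketch(struct dev_sketch *d)
{
	int before = atomic_fetch_sub(&d->local_cnt, 1);

	assert(before > 0);
	/* Count reached zero while failed: queue the transition only once. */
	if (before == 1 && d->disk == D_FAILED &&
	    !atomic_flag_test_and_set(&d->go_diskless))
		d->queued++;
}
```

A short lived get/put pair after the first zero crossing reaches zero again, but the flag is already set, so nothing is queued a second time.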
Tejun Heo 56de210245 drbd: convert to idr_alloc()
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:15 -08:00
Philipp Reisner 298307ed1d drbd: Remove obsolete check
Smatch complained about this redundant check.

The check was introduced on 2006-09-13. On 2007-07-24 the body of the
function was enclosed by get_ldev()/put_ldev() reference counting.
Since then the check has been useless and misleading.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06 12:09:55 +01:00
Philipp Reisner 986836503e Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6 2012-11-09 14:20:23 +01:00
Philipp Reisner fd0017c124 drbd: fix regression: potential NULL pointer dereference
recent commit
    drbd: always write bitmap on detach
introduced a bitmap writeout during detach,
which obviously needs some meta data device to write to.

Unfortunately, that same error path may be taken if we fail to attach,
e.g. due to UUID mismatch, after we changed state to D_ATTACHING,
but before the lower level device pointer is even assigned.

We need to test for presence of mdev->ldev.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:42 +01:00
Lars Ellenberg edc9f5eb7a drbd: always write bitmap on detach
If we detach due to local read-error (which sets a bit in the bitmap),
stay Primary, and then re-attach (which re-reads the bitmap from disk),
we potentially lost the "out-of-sync" (or, "bad block") information in
the bitmap.

Always (try to) write out the changed bitmap pages before going diskless.

That way, we don't lose the bit for the bad block,
the next resync will fetch it from the peer, and rewrite
it locally, which may result in block reallocation in some
lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to)
mark the "we need a full sync" bit in our super block,
if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection),
and there actually are no bad blocks at all, so we don't need to
re-fetch from the peer, there is no "auto-healing" necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:41 +01:00
Philipp Reisner 19fffd7b03 drbd: Call drbd_md_sync() explicitly after a state change on the connection
Without this, the meta-data gets updated after 5 seconds by the
md_sync_timer. Better to do it immediately after a state change.

If the asender detects a network failure, it may take a bit until
the worker processes the corresponding after-conn-state-change work item.

  The worker might be blocked in sending something, i.e. it
  takes until it gets into its timeout. That is 6 seconds by
  default which is longer than the 5 seconds of the md_sync_timer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:08 +01:00
Lars Ellenberg b792b655cd drbd: fix potential list_add corruption
If the md_sync_timer triggers a second time,
while the work queued during the first time is still pending,
this could result in list_add() of an already added item,
and corrupt the work item list.

This likely only triggered because of the erroneous
batch-dequeueing of work items fixed with
  drbd: dequeue single work items in wait_for_work()

Still, skip queueing if md_sync_work is already queued.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:21 +01:00
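The guard described in b792b655cd, skipping the queueing when the work item is already linked, can be sketched with a minimal doubly linked list. The list helpers below imitate the style of the kernel's list API but are reimplemented here; `queue_work_once()` and `struct work` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	assert(list_empty(n));         /* adding a linked node corrupts the list */
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialize, so list_empty() on the node is true again. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

struct work {
	struct list_head list;
};

/* Returns true if the item was queued, false if it was already pending. */
static bool queue_work_once(struct list_head *queue, struct work *w)
{
	if (!list_empty(&w->list))
		return false;
	list_add_tail(&w->list, queue);
	return true;
}
```

The check only works if the dequeue side re-initializes the node, which is why the sketch pairs it with `list_del_init()`.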
Philipp Reisner 39a1aa7f49 drbd: Protect accesses to the uuid set with a spinlock
There is at least the worker context, the receiver context, the context of
receiving netlink packets.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:04 +01:00
Lars Ellenberg d4dabbe22d drbd: disambiguation, s/P_DISCARD_WRITE/P_SUPERSEDED/
To avoid confusion with REQ_DISCARD aka TRIM, rename our
"discard concurrent write acks" from P_DISCARD_WRITE to P_SUPERSEDED.

At the same time, rename the drbd request event DISCARD_WRITE
to CONFLICT_RESOLVED. It already triggers both successful completion
or restart of the request, depending on our RQ_POSTPONED flag.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:49 +01:00
Lars Ellenberg 81a3537a97 drbd: announce FLUSH/FUA capability to upper layers
In 8.4, we may have bios spanning two activity log extents.
Fixup drbd_al_begin_io() and drbd_al_complete_io() to deal with zero sized bios.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:44 +01:00
Lars Ellenberg 6f3465ed82 drbd: report congestion if we are waiting for some userland callback
If the drbd worker thread is synchronously waiting for some userland
callback, we don't want some casual pageout to block on us.
Have drbd_congested() report congestion in that case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:39 +01:00
Lars Ellenberg 0c84966601 drbd: differentiate between normal and forced detach
Aborting local requests (not waiting for completion from the lower level
disk) is dangerous: if the master bio has been completed to upper
layers, data pages may be re-used for other things already.
If local IO is still pending and later completes,
this may cause crashes or corrupt unrelated data.

Only abort local IO if explicitly requested.
Intended use case is a lower level device that turned into a tarpit,
not completing io requests, not even doing error completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:39 +01:00
Lars Ellenberg 9a278a7906 drbd: allow read requests to be retried after force-detach
Sometimes, a lower level block device turns into a tar-pit,
not completing requests at all, not even doing error completion.

We can force-detach from such a tar-pit block device,
either by disk-timeout, or by drbdadm detach --force.

Queueing for retry only from the request destruction path (kref hit 0)
makes it impossible to retry affected read requests from the peer,
until the local IO completion happened, as the locally submitted
bio holds a reference on the drbd request object.

If we can only complete READs when the local completion finally
happens, we would not need to force-detach in the first place.

Instead, queue for retry where we otherwise had done the error completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:37 +01:00
Lars Ellenberg a0d856dfae drbd: base completion and destruction of requests on ref counts
cherry-picked and adapted from drbd 9 devel branch

The logic for when to get or put a reference is in mod_rq_state().

To not get confused in the freeze/thaw respectively resend/restart
paths, or when cleaning up requests waiting for P_BARRIER_ACK, this
also introduces additional state flags:
RQ_COMPLETION_SUSP, and RQ_EXP_BARR_ACK.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:36 +01:00
Lars Ellenberg 5df69ece6e drbd: __drbd_make_request() is now void
The previous commit causes __drbd_make_request() to always return 0.
Change it to void.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:35 +01:00
Lars Ellenberg b6dd1a8976 drbd: remove struct drbd_tl_epoch objects (barrier works)
cherry-picked and adapted from drbd 9 devel branch

DRBD requests (struct drbd_request) are already on the per resource
transfer log list, and carry their epoch number. We do not need to
additionally link them on other ring lists in other structs.

The drbd sender thread can recognize itself when to send a P_BARRIER,
by tracking the currently processed epoch, and how many writes
have been processed for that epoch.

If the epoch of the request to be processed does not match the currently
processed epoch, and any writes have been processed in it, a P_BARRIER for
this last processed epoch is sent out first.
The new epoch then becomes the currently processed epoch.

To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK,
the sender thread also needs to handle the case when the current
epoch was closed already, but no new requests are queued yet,
and send out P_BARRIER as soon as possible.

This is done by comparing the per resource "current transfer log epoch"
(tconn->current_tle_nr) with the per connection "currently processed
epoch number" (tconn->send.current_epoch_nr), while waiting for
new requests to be processed in wait_for_work().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:35 +01:00
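The sender-side rule from b6dd1a8976 can be condensed into a small state machine. This is a sketch under assumptions: `send_state`, `maybe_send_barrier()`, and `process_write()` are hypothetical names, and "sending" a P_BARRIER is reduced to recording its epoch number.

```c
#include <assert.h>
#include <stddef.h>

struct send_state {
	int current_epoch_nr;              /* epoch currently being processed */
	unsigned int current_epoch_writes; /* writes processed in that epoch */
	int barriers_sent;                 /* sketch: number of P_BARRIERs */
	int last_barrier_nr;               /* sketch: epoch of the last one */
};

/*
 * Close the currently processed epoch if `epoch` differs from it.
 * A barrier goes out only if writes were processed in the old epoch.
 */
static void maybe_send_barrier(struct send_state *s, int epoch)
{
	assert(s != NULL);
	if (s->current_epoch_nr == epoch)
		return;
	if (s->current_epoch_writes) {
		s->barriers_sent++;
		s->last_barrier_nr = s->current_epoch_nr;
	}
	s->current_epoch_nr = epoch;
	s->current_epoch_writes = 0;
}

/* Process one write request that carries its own epoch number. */
static void process_write(struct send_state *s, int epoch)
{
	maybe_send_barrier(s, epoch);
	s->current_epoch_writes++;
}
```

Calling `maybe_send_barrier()` with the resource's current epoch number while idle corresponds to the case where an epoch was closed but no new request has been queued yet.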
Lars Ellenberg d5b27b01f1 drbd: move the drbd_work_queue from drbd_socket to drbd_connection
cherry-picked and adapted from drbd 9 devel branch
In 8.4, we don't distinguish between "resource work" and "connection
work" yet, we have one worker for both, as we still have only one connection.

We only ever used the "data.work",
no need to keep the "meta.work" around.

Move tconn->data.work to tconn->sender_work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:34 +01:00
Lars Ellenberg 8c0785a5c9 drbd: allow to dequeue batches of work at a time
cherry-picked and adapted from drbd 9 devel branch

In 8.4, we still use drbd_queue_work_front(),
so in normal operation, we can not dequeue batches,
but only single items.

Still, followup commits will wake the worker
without explicitly queueing a work item,
so up() is replaced by a simple wake_up().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:34 +01:00
Lars Ellenberg b379c41ed7 drbd: transfer log epoch numbers are now per resource
cherry-picked from drbd 9 devel branch.

In preparation of multiple connections, the "barrier number" or
"epoch number" needs to be tracked per-resource, not per connection.
The sequence number space will not be reset anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:33 +01:00
Lars Ellenberg 9d05e7c4e7 drbd: rename drbd_restart_write to drbd_restart_request
Meanwhile, this is used to restart failed READ requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:33 +01:00
Philipp Reisner c5b005ab70 drbd: use bitmap_parse instead of __bitmap_parse
The buffer 'sc.cpu_mask' is a kernel buffer.  If bitmap_parse is used
instead of __bitmap_parse the extra parameter that indicates a kernel
buffer is not needed.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:29 +01:00
Lars Ellenberg 9ed57dcbda drbd: ignore volume number for drbd barrier packet exchange
Transfer log epochs, and therefore P_BARRIER packets,
are per resource, not per volume.
We must not associate them with "some random volume".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:25 +01:00
Lars Ellenberg 2312f0b3c5 drbd: fix potential deadlock during "restart" of conflicting writes
w_restart_write(), run from worker context, calls __drbd_make_request()
and further drbd_al_begin_io(, delegate=true), which then
potentially deadlocks.  The previous patch moved a BUG_ON to expose
such call paths, which would now be triggered.

Also, if we call __drbd_make_request() from resource worker context,
like w_restart_write() did, and that should block for whatever reason
(!drbd_state_is_stable(), resource suspended, ...),
we potentially deadlock the whole resource, as the worker
is needed for state changes and other things.

Create a dedicated retry workqueue for this instead.

Also make sure that inc_ap_bio()/dec_ap_bio() are properly paired,
even if do_retry() needs to retry itself,
in case __drbd_make_request() returns != 0.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:21 +01:00
Lars Ellenberg f9916d61a4 drbd: don't pretend that barrier_nr == 0 was special
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:21 +01:00
Lars Ellenberg 5cdb0bf322 drbd: remove now unused seq_num member from struct drbd_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:19 +01:00
Lars Ellenberg 4b8514ee28 drbd: fix potential data corruption and protocol error
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:19 +01:00
Lars Ellenberg b17f33cb0a drbd: explicitly clear unused dp_flags in drbd_send_block
We send left-over garbage from the previous packet in P_DATA_REPLY and
P_RS_DATA_REPLY packets. That's bad behaviour.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:15 +01:00
Philipp Reisner 12038a3a71 drbd: Move list of epochs from mdev to tconn
This is necessary since the transfer_log on the sending side is also
per tconn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:08 +01:00
Philipp Reisner 4b0007c0e8 drbd: Move write_ordering from mdev to tconn
This is necessary in order to prepare the move of the (receiver side)
epoch list from the device (mdev) to the connection (tconn) objects.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:07 +01:00
Philipp Reisner 6936fcb49a drbd: Move the CREATE_BARRIER flag from connection to device
That is necessary since the whole transfer log is per connection(tconn)
and not per device(mdev).

This bug caused list corruption on the worker list. When a barrier is queued
for sending in the context of one device, another device did not see the
CREATE_BARRIER bit, and queued the same object again -> list corruption.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:06 +01:00
Philipp Reisner 43de7c852b drbd: Fixes from the drbd-8.3 branch
* drbd-8.3:
  drbd: O_SYNC gives EIO on ramdisks for some kernels (eg. RHEL6).
  drbd: send intermediate state change results to the peer

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:06 +01:00
Philipp Reisner 0cfac5dd90 drbd: Fixes from the drbd-8.3 branch
* drbd-8.3:
  drbd: fix spurious meta data IO "error"
  drbd: Fixed a race condition between detach and start of resync
  drbd: fix harmless race to not trigger an ASSERT
  drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero
  drbd: Fixed current UUID generation (regression introduced recently, after 8.3.11)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:05 +01:00
Lars Ellenberg 97ddb68790 drbd: detach must not try to abort non-local requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:00 +01:00
Andreas Gruenbacher f497609e4c drbd: Get rid of MR_{READ,WRITE}_SHIFT
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:00 +01:00
Andreas Gruenbacher 7d4c782cbd drbd: Fix the data-integrity-alg setting
The last data-integrity-alg fix made data integrity checking work when the
algorithm was changed for an established connection, but the common case of
configuring the algorithm before connecting was still broken.  Fix that.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:59 +01:00
Andreas Gruenbacher 71fc7eedb3 drbd: Turn tl_apply() into tl_abort_disk_io()
There is no need to overly generalize this function; it only makes the code
harder to understand.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:58 +01:00
Lars Ellenberg d5d7ebd422 drbd: on attach, enforce clean meta data
Detection of unclean shutdown has moved into user space.

The kernel code will, whenever it updates the meta data, mark it as
"unclean", and will refuse to attach to such unclean meta data.

"drbdadm up" now schedules "drbdmeta apply-al", which will apply
the activity log to the bitmap, and/or reinitialize it, if necessary,
as well as set a "clean" indicator flag.

This moves a bit of code out of kernel space.
As a side effect, it also prevents some 8.3 module from accidentally
ignoring the 8.4 style activity log, if someone should downgrade,
whether on purpose, or accidentally because he changed kernel versions
without providing an 8.4 for the new kernel, and the new kernel comes
with in-tree 8.3.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:51 +01:00
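
The clean/unclean handshake described in d5d7ebd422 above can be sketched as a small state model. This is only an illustration under assumptions: the struct, its fields, and the function names are hypothetical, and the real on-disk format and the real "drbdmeta apply-al" logic are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the on-disk meta data state. */
struct sketch_meta {
	bool clean;          /* the "clean" indicator flag */
	unsigned al_pending; /* activity log entries not yet applied */
};

/* Kernel side: any meta data update marks the meta data unclean. */
void sketch_md_update(struct sketch_meta *md)
{
	assert(md != NULL);
	md->clean = false;
	md->al_pending++;
}

/* User space side: apply the activity log, then set the clean flag. */
void sketch_apply_al(struct sketch_meta *md)
{
	assert(md != NULL);
	md->al_pending = 0; /* stands in for applying the log to the bitmap */
	md->clean = true;
}

/* Kernel side: refuse to attach to unclean meta data. */
int sketch_attach(const struct sketch_meta *md)
{
	assert(md != NULL);
	return md->clean ? 0 : -1;
}
```

In this model an attach only succeeds after the user space step has run, which mirrors the ordering the commit message describes for "drbdadm up".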
Philipp Reisner cdfda633d2 drbd: detach from frozen backing device
* drbd-8.3:
  documentation: Documented detach's --force and disk's --disk-timeout
  drbd: Implemented the disk-timeout option
  drbd: Force flag for the detach operation
  drbd: Allow new IOs while the local disk is in FAILED state
  drbd: Bitmap IO functions can not return prematurely if the disk breaks
  drbd: Added a kref to bm_aio_ctx
  drbd: Hold a reference to ldev while doing meta-data IO
  drbd: Keep a reference to the bio until the completion handler finished
  drbd: Implemented wait_until_done_or_disk_failure()
  drbd: Replaced md_io_mutex by an atomic: md_io_in_use
  drbd: moved md_io into mdev
  drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
  drbd: Keep a reference to barrier acked requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:50 +01:00
Philipp Reisner 2ffca4f3ee drbd: Improve compatibility with drbd's older than 8.3.7
Regression introduced with 8.3.11 commit:
drbd: Take a more conservative approach when deciding max_bio_size

Never ever tell an older drbd, that we support more than 32KiB
in a single data request (packet).
Never believe an older drbd, that it supports more than 32KiB
in a single data request (packet).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:49 +01:00
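
The two rules in 2ffca4f3ee above amount to clamping in both directions. The following is a minimal sketch, assuming the caller already knows whether the peer is older than 8.3.7; how that is detected is not shown, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32 KiB limit for old peers, as stated in the commit message. */
#define SKETCH_OLD_PEER_MAX (32u * 1024u)

static_assert(SKETCH_OLD_PEER_MAX == 32768u, "32 KiB");

static uint32_t sketch_min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Rule 1: never tell an old peer that we support more than 32 KiB. */
uint32_t sketch_size_to_announce(uint32_t local_max, bool peer_is_old)
{
	return peer_is_old ? sketch_min_u32(local_max, SKETCH_OLD_PEER_MAX)
			   : local_max;
}

/* Rule 2: never believe an old peer that claims more than 32 KiB. */
uint32_t sketch_size_to_trust(uint32_t peer_claimed, bool peer_is_old)
{
	return peer_is_old ? sketch_min_u32(peer_claimed, SKETCH_OLD_PEER_MAX)
			   : peer_claimed;
}
```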
Andreas Gruenbacher 6dff290220 drbd: Rename --dry-run to --tentative
drbdadm already has a --dry-run option, so this option cannot directly be
passed through to drbdsetup.  Rename the drbdsetup option to resolve this
conflict.

For backward compatibility, make --dry-run an alias of --tentative.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:47 +01:00
Andreas Gruenbacher afbbfa88bc drbd: Allow to pass resource options to the new-resource command
This is equivalent to how the attach and connect commands work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:46 +01:00
Andreas Gruenbacher 089c075d88 drbd: Convert the generic netlink interface to accept connection endpoints
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:46 +01:00
Andreas Gruenbacher 46530e859c drbd: Use DRBD_MINOR_COUNT_DEF in one more place
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:43 +01:00
Philipp Reisner d659f2aaea drbd: Send PROTOCOL_UPDATE packets when appropriate
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:54 +01:00
Philipp Reisner 036b17eaab drbd: Receiving part for the PROTOCOL_UPDATE packet
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:53 +01:00
Philipp Reisner 46e1ce4177 drbd: protect updates to integrity_tfm by tconn->data->mutex
Since we need to hold that mutex anyways to make sure the peer
gets that change in the right position in the data stream,
it makes a lot of sense to use the same mutex to ensure existence
of the tfm.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:52 +01:00
Andreas Gruenbacher 6394b9358e drbd: Refer to resync-rate consistently throughout the code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:50 +01:00
Andreas Gruenbacher 6139f60dc1 drbd: Rename the want_lose field/flag to discard_my_data
This is what it is called in config files and on the command line as
well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:49 +01:00
Philipp Reisner c141ebda03 drbd: Removing drbd_cfg_rwsem
 * Updates to all configuration items are done under genl_lock(),
   including removal of mdevs or tconns.
 * All non-sleeping read sides are protected by RCU.
 * All sleeping read sides keep reference counts to keep the
   objects alive.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:48 +01:00
Philipp Reisner ec0bddbc55 drbd: Use RCU for the drbd_tconns list
Preparing removal of drbd_cfg_rwsem

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:47 +01:00
Philipp Reisner 81fa2e675c drbd: Refcounting for mdev objects
Preparing removal of drbd_cfg_rwsem

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:47 +01:00
Philipp Reisner 9958c857c7 drbd: Made the fifo object a self contained object (preparing for RCU)
* Moved rs_planed into it, named total
* When having a pointer to the object the values can
  be embedded into the fifo object.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:43 +01:00
Philipp Reisner daeda1cca9 drbd: RCU for disk_conf
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:43 +01:00
Philipp Reisner a0095508ca drbd: Renamed the net_conf_update mutex to conf_update
Preparing to use the same mutex for disk_conf updates

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:41 +01:00
Andreas Gruenbacher b966b5dd8e drbd: Generate the drbd_set_*_defaults() functions from drbd_genl.h
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:38 +01:00
Lars Ellenberg 992d6e91d3 drbd: fix thread stop deadlock
There are races where the receiver may be exiting,
but still need the worker to process some stuff.

Do not wait for the receiver to die from an exiting worker.
The receiver must already be dead in case the worker decides to exit.
If the receiver was still alive, it may still want to queue work, and do
drbd_flush_workqueue() from its disconnect cleanup code,
which would no longer be processed by an exiting worker.

This also would deadlock,
if the worker was to synchronously wait for the receiver to die.

Do not implicitly stop the worker.
The worker will only be stopped from configuration context, from
conn_reconfig_done(), drbd_adm_down() or drbd_adm_delete_connection(),
after making sure the receiver is already stopped.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:53:00 +01:00
Andreas Gruenbacher 88104ca458 drbd: Allow to change data-integrity-alg on the fly
The main purpose of this is to allow to turn data integrity checking on
and off on demand without causing interruptions.

Implemented by allocating tconn->peer_integrity_tfm only when receiving
a P_PROTOCOL message.  All accesses to tconn->peer_integrity_tfm happen in
worker context, and no further synchronization is necessary.

On the sender side, tconn->integrity_tfm is modified under
tconn->data.mutex, and a P_PROTOCOL message is sent whenever it changes.  All
accesses to tconn->integrity_tfm already happen under this mutex.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:59 +01:00
Andreas Gruenbacher a7eb7bdf58 drbd: Introduce a "lockless" variant of drbd_send_protocoll()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:59 +01:00
Andreas Gruenbacher 5b614abe30 drbd: Rename integrity_r_tfm -> peer_integrity_tfm
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:58 +01:00
Andreas Gruenbacher 8d412fc6d5 drbd: Rename integrity_w_tfm -> integrity_tfm
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:58 +01:00
Lars Ellenberg 5979e36155 drbd: on reconfiguration requests, mind the SET_DEFAULTS flag
The DRBD_GENL_F_SET_DEFAULTS flag was ignored
for drbd_adm_disk_opts() and drbd_adm_net_opts().

Factor out drbd_set_*_defaults() helper functions,
and call them appropriately.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:50:38 +01:00
Philipp Reisner 0ace9dfabe drbd: Take a reference on tconn when finding a tconn by name
Rule #3 of kref.txt

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:06 +01:00
Philipp Reisner 9dc9fbb357 drbd: Basic refcounting for drbd_tconn
References held by:
 * Each (running) drbd thread has a reference on tconn
 * Each mdev has a reference on tconn
 * Being in the all_tconn list counts for one reference
 * Each after_conn_state_chg_work has a reference to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:06 +01:00
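
The reference rules listed in 9dc9fbb357 above follow the usual get/put pattern. The sketch below is a user space approximation using C11 atomics, not the kernel's kref API; the struct and function names are hypothetical. The initial reference stands for membership in the list, and each thread or mdev would take one more.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical connection object carrying only a reference count. */
struct sketch_conn {
	atomic_int refs;
};

/* Create the object holding one reference (e.g. for the global list). */
struct sketch_conn *sketch_conn_create(void)
{
	struct sketch_conn *c = malloc(sizeof(*c));

	if (c)
		atomic_init(&c->refs, 1);
	return c;
}

/* Take an extra reference; the caller must already hold one. */
void sketch_conn_get(struct sketch_conn *c)
{
	int old = atomic_fetch_add(&c->refs, 1);

	assert(old > 0);
	(void)old;
}

/* Drop a reference; frees the object and returns 1 on the last put. */
int sketch_conn_put(struct sketch_conn *c)
{
	if (atomic_fetch_sub(&c->refs, 1) != 1)
		return 0;
	free(c);
	return 1;
}
```

Returning whether the object was freed mirrors the convention of the kernel's kref_put(), which makes it easy to check in tests that the last holder is the one that releases the memory.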
Philipp Reisner 1d04122599 drbd: Eliminated drbd_free_resources(); it is superseded by conn_free_crypto()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:05 +01:00
Lars Ellenberg ae25b336e0 drbd: cmdname() enum to string conversion was missing a few constants
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:05 +01:00
Philipp Reisner 91fd4dad64 drbd: Proper locking for updates to net_conf under RCU
Removing the get_net_conf()/put_net_conf() functions

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:03 +01:00
Philipp Reisner 44ed167da7 drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf
Removing the get_net_conf()/put_net_conf() calls

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:48:59 +01:00
Philipp Reisner 303d1448a0 drbd: Runtime changeable wire protocol
The wire protocol is no longer a property that is negotiated
between the two peers. It is now expressed with two bits
(DP_SEND_WRITE_ACK and DP_SEND_RECEIVE_ACK) in each data
packet. Therefore the primary node is free to change the
wire protocol at any time without disconnect/reconnect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:18 +01:00
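
The per-packet signalling described in 303d1448a0 above can be sketched as follows. The two flag names come from the commit message, but the bit positions are placeholders, and the mapping onto DRBD's replication protocols A, B and C (no ack, ack on receipt, ack on write) is an assumption made for illustration. All other names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions; not the values used on the wire. */
#define SKETCH_DP_SEND_RECEIVE_ACK (1u << 0)
#define SKETCH_DP_SEND_WRITE_ACK   (1u << 1)

enum sketch_wire_protocol { SKETCH_PROT_A = 1, SKETCH_PROT_B, SKETCH_PROT_C };

enum sketch_ack_point { SKETCH_ACK_NONE, SKETCH_ACK_ON_RECEIVE, SKETCH_ACK_ON_WRITE };

/* Sender: encode the currently configured protocol into each data packet. */
uint32_t sketch_wire_flags(enum sketch_wire_protocol p)
{
	switch (p) {
	case SKETCH_PROT_B:
		return SKETCH_DP_SEND_RECEIVE_ACK;
	case SKETCH_PROT_C:
		return SKETCH_DP_SEND_WRITE_ACK;
	default:
		return 0; /* protocol A: no acknowledgement requested */
	}
}

/* Receiver: decide per packet when to acknowledge, from the flags alone. */
enum sketch_ack_point sketch_ack_point(uint32_t dp_flags)
{
	if (dp_flags & SKETCH_DP_SEND_WRITE_ACK)
		return SKETCH_ACK_ON_WRITE;
	if (dp_flags & SKETCH_DP_SEND_RECEIVE_ACK)
		return SKETCH_ACK_ON_RECEIVE;
	return SKETCH_ACK_NONE;
}
```

Because the receiver looks only at the flags in the packet it is handling, the sender can switch protocols between two packets without any renegotiation, which is the point the commit message makes.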
Philipp Reisner d3fcb4908d drbd: protect all idr accesses that might sleep with drbd_cfg_rwsem
With this commit the locking for all accesses to IDRs is complete:

 * Non sleeping read accesses are protected by RCU
 * sleeping read accesses are protected by a read lock on drbd_cfg_rwsem
 * accesses that add anything are protected by a write lock
 * accesses that remove an object are protected by a write lock
   and a call to synchronize_rcu() after it is removed from the IDR
   and before the object is actually free()ed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:17 +01:00
Philipp Reisner ef35626284 drbd: Converted drbd_cfg_mutex into drbd_cfg_rwsem
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:17 +01:00
Philipp Reisner 695d08fa94 drbd: rcu_read_[un]lock() for all idr accesses that do not sleep
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:16 +01:00
Philipp Reisner cd1d9950f6 drbd: Inlined drbd_free_mdev(); it got called only from one place
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:16 +01:00
Philipp Reisner ff370e5a9e drbd: drbd_delete_device() takes a struct drbd_conf * now
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:15 +01:00
Andreas Gruenbacher 7721f5675e drbd: Rename drbd_release_ee() to drbd_free_peer_reqs()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:13 +01:00
Andreas Gruenbacher e0ab6ad4bc drbd: drbd_init_ee() no longer exists
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:11 +01:00
Andreas Gruenbacher b55d84ba17 drbd: Removed outdated comments and code that envisioned VNRs in header 95
Since we now have header 100, which has space for 16 bit volume numbers,
the high byte of the length in header 95 is no longer reserved for
8 bit volume numbers.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:10 +01:00
Andreas Gruenbacher 0c8e36d9b8 drbd: Introduce protocol version 100 headers
The 8 byte header finally becomes too small. With the protocol 100 header we
have 16 bit for the volume number, proper 32 bit for the data length, and
32 bit for further extensions in the future.

Previous versions of drbd are using version 80 headers for all packets
short enough for protocol 80.  They support both header versions in
worker context, but only version 80 headers in asynchronous context.
For backwards compatibility, continue to use version 80 headers for
short packets before protocol version 100.

From protocol version 100 on, use the same header version for all
packets.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:10 +01:00
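
The field widths named in 0c8e36d9b8 above (16 bit volume number, 32 bit data length, 32 bit reserved) suggest a 16 byte header. The sketch below is one possible layout under assumptions: the magic and command fields, the field order, the big-endian byte order, the magic value, and all names are hypothetical and not taken from the DRBD source.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HEADER100_SIZE 16
#define SKETCH_MAGIC_100 0x534b3130u /* placeholder value, not DRBD's */

struct sketch_header100 {
	uint32_t magic;   /* assumed: identifies the header version */
	uint16_t volume;  /* 16 bit volume number */
	uint16_t command; /* assumed: packet type */
	uint32_t length;  /* full 32 bit data length */
	uint32_t pad;     /* 32 bit reserved for future extensions */
};

static_assert(sizeof(struct sketch_header100) == SKETCH_HEADER100_SIZE,
	      "sketch header is 16 bytes");

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
	put_be16(p, (uint16_t)(v >> 16));
	put_be16(p + 2, (uint16_t)(v & 0xffff));
}

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)get_be16(p) << 16) | get_be16(p + 2);
}

/* Serialize the header fields into buf in big-endian order. */
void sketch_encode_header100(uint8_t buf[SKETCH_HEADER100_SIZE],
			     uint16_t volume, uint16_t command, uint32_t length)
{
	put_be32(buf, SKETCH_MAGIC_100);
	put_be16(buf + 4, volume);
	put_be16(buf + 6, command);
	put_be32(buf + 8, length);
	put_be32(buf + 12, 0); /* reserved bits are sent as zero */
}

/* Parse buf into h; returns 0 on success, -1 if the magic does not match. */
int sketch_decode_header100(const uint8_t buf[SKETCH_HEADER100_SIZE],
			    struct sketch_header100 *h)
{
	h->magic = get_be32(buf);
	h->volume = get_be16(buf + 4);
	h->command = get_be16(buf + 6);
	h->length = get_be32(buf + 8);
	h->pad = get_be32(buf + 12);
	return h->magic == SKETCH_MAGIC_100 ? 0 : -1;
}
```

Encoding byte by byte, rather than copying the struct, keeps the wire format independent of the host's endianness and struct padding.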
Andreas Gruenbacher e658983af6 drbd: Remove headers from on-the-wire data structures (struct p_*)
Prepare for the introduction of the protocol 100 headers. The actual protocol
header is removed from the packet declarations, which allows us to use the
packets with different headers.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:09 +01:00