OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Kent Overstreet	7988613b0e	block: Convert bio_for_each_segment() to bvec_iter More prep work for immutable biovecs - with immutable bvecs drivers won't be able to use the biovec directly, they'll need to use helpers that take into account bio->bi_iter.bi_bvec_done. This updates callers for the new usage without changing the implementation yet. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: Jim Paris <jim@jtan.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com> Cc: support@lsi.com Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Cc: linux-m68k@lists.linux-m68k.org Cc: linuxppc-dev@lists.ozlabs.org Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net Cc: cbe-oss-dev@lists.ozlabs.org Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Cc: linux-raid@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: DL-MPTFusionLinux@lsi.com Cc: linux-scsi@vger.kernel.org Cc: devel@driverdev.osuosl.org Cc: linux-fsdevel@vger.kernel.org Cc: cluster-devel@redhat.com Cc: linux-mm@kvack.org Acked-by: Geoff Levand <geoff@infradead.org>	2013-11-23 22:33:49 -08:00
Kent Overstreet	4f024f3797	block: Abstract out bvec iterator Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: xfs@oss.sgi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchand@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Peng Tao <tao.peng@emc.com> Cc: Andy Adamson <andros@netapp.com> Cc: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Mel Gorman <mgorman@suse.de>6	2013-11-23 22:33:47 -08:00
Lars Ellenberg	d2da5b0cb5	drbd: fix decoding of bitmap vli rle for device sizes > 64 TB Symptoms: disconnect after bitmap exchange due to bitmap overflow (e:49731075554) while decoding bm RLE packet In the decoding step of the variable length integer run length encoding there was potentially an uncatched bitshift by wordsize (variable >> 64). The result of which is "undefined" :( (only "sometimes" the result is the desired 0) Fix: don't do any bit shift magic for shift == 64, just assign. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:10:28 -07:00
Philipp Reisner	b874d231e1	drbd: Fix an connection drop issue after enabling allow-two-primaries Since drbd-8.4.0 it is possible to change the allow-two-primaries network option while the connection is established. The sequence code used to partially order packets from the data socket with packets from the meta-data socket, still assued that the allow-two-primaries option is constant while the connection is established. I.e. On a node that has the RESOLVE_CONFLICTS bits set, after enabling allow-two-primaries, when receiving the next data packet it timed out while waiting for the necessary packets on the data socket to arrive (wait_for_and_update_peer_seq() function). Fixed that by always tracking the sequence number, but only waiting for it if allow-two-primaries is set. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:10:28 -07:00
Philipp Reisner	d752b26960	drbd: Allow online change of al-stripes and al-stripe-size Allow to change the AL layout with an resize operation. For that the reisze command gets two new fields: al_stripes and al_stripe_size. In order to make the operation crash save: 1) Lock out all IO and MD-IO 2) Write the super block with MDF_PRIMARY_IND clear 3) write the bitmap to the new location (all zeros, since we allow only while connected) 4) Initialize the new AL-area 5) Write the super block with the restored MDF_PRIMARY_IND. 6) Unfreeze all IO Since the AL-layout has no influence on the protocol, this operation needs to be beforemed on both sides of a resource (if intended). Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Philipp Reisner	e96c96333f	drbd: Constants should be UPPERCASE Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Andreas Gruenbacher	26ea8f9239	drbd: Do not sleep inside rcu Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Linus Torvalds	ebb3727779	Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-block Pull block driver updates from Jens Axboe: "It might look big in volume, but when categorized, not a lot of drivers are touched. The pull request contains: - mtip32xx fixes from Micron. - A slew of drbd updates, this time in a nicer series. - bcache, a flash/ssd caching framework from Kent. - Fixes for cciss" * 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits) bcache: Use bd_link_disk_holder() bcache: Allocator cleanup/fixes cciss: bug fix to prevent cciss from loading in kdump crash kernel cciss: add cciss_allow_hpsa module parameter drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions mtip32xx: Workaround for unaligned writes bcache: Make sure blocksize isn't smaller than device blocksize bcache: Fix merge_bvec_fn usage for when it modifies the bvm bcache: Correctly check against BIO_MAX_PAGES bcache: Hack around stuff that clones up to bi_max_vecs bcache: Set ra_pages based on backing device's ra_pages bcache: Take data offset from the bdev superblock. mtip32xx: mtip32xx: Disable TRIM support mtip32xx: fix a smatch warning bcache: Disable broken btree fuzz tester bcache: Fix a format string overflow bcache: Fix a minor memory leak on device teardown bcache: Documentation updates bcache: Use WARN_ONCE() instead of __WARN() bcache: Add missing #include <linux/prefetch.h> ...	2013-05-08 11:51:05 -07:00
Akinobu Mita	38b682b261	drbd: rename random32() to prandom_u32() Use preferable function name which implies using a pseudo-random number generator. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:42 -07:00
Philipp Reisner	3990e04df0	drbd: use sched_setscheduler() It was unnoticed for some time that assigning to current->policy is no longer sufficient to set a real time priority for a kernel thread. Reported-by: Charlie Suffin <Charlie.Suffin@stratus.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:25 -06:00
Philipp Reisner	7c689e63a8	drbd: fix for deadlock when using automatic split-brain-recovery With an automatic after split-brain recovery policy of "after-sb-1pri call-pri-lost-after-sb", when trying to drbd_set_role() to R_SECONDARY, we run into a deadlock. This was first recognized and supposedly fixed by 2009-06-10 "Fixed a deadlock when using automatic split brain recovery when both nodes are" replacing drbd_set_role() with drbd_change_state() in that code-path, but the first hunk of that patch forgets to remove the drbd_set_role(). We apparently only ever tested the "two primaries" case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:25 -06:00
Philipp Reisner	ef57f9e6bb	drbd: Fix build error when CONFIG_CRYPTO_HMAC is not set Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:25 -06:00
Philipp Reisner	2d56a974f3	drbd: reset ap_in_flight counter for new connections Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:24 -06:00
Lars Ellenberg	56392d2f40	drbd: Clarify when activity log I/O is delegated to the worker thread Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:14:00 -06:00
Lars Ellenberg	cccac9857d	drbd: mechanically rename la_size to la_size_sect Make it obvious that this value is in units of 512 Byte sectors. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:13:59 -06:00
Philipp Reisner	13c76aba78	drbd: close race between drbd_set_role and drbd_connect drbd_set_role(, R_PRIMARY, ) does the state change to Primary, some more housekeeping, and possibly generates a new UUID set. All of this holding the "state_mutex". The connection handshake involves sending of various state information, including the current data generation UUID set, and two connection state changes from C_WF_CONNECTION to C_WF_REPORT_PARAMS further to a number of different outcomes, resync being one of them. If the connection handshake happens between the state change to Primary and the generation of the new UUIDs, the resync decision based on the old UUID set may be confused, depending on circumstances. Make sure that, before we do the handshake, any promotion to Primary role will either be complete (including the housekeeping stuff), or can see, and serialize with, the ongoing handshake, based on the "STATE_SENT" bit, which is set when we start the handshake, and cleared only when we leave C_WF_REPORT_PARAMS again. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:00:33 +01:00
Lars Ellenberg	ed635cb067	drbd: if the replication link breaks during handshake, keep retrying The 8.3.12 commit drbd: Bugfix for the connection behavior fixes a "wasted established connection", if a former connection attempt failed during its early stages. However it opened a window for a regression, if a connection attempt fails during its last stages. The result was a terminated receiver thread, that left behind the supposedly transient "C_UNCONNECTED" state. Any later requests to change the connection state fail, as they wait for the connection state to "stabilize". Fix: short circuit and keep retrying to restablish a new connection, if we don't reach C_WF_REPORT_PARAMS. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:22:19 +01:00
Jing Wang	063eacf88c	drbd: check return of kmalloc in receive_uuids Signed-off-by: Jing Wang <windsdaemon@gmail.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:22:10 +01:00
Philipp Reisner	986836503e	Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6	2012-11-09 14:20:23 +01:00
Philipp Reisner	1393b59f8c	drbd: Remove duplicate code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:38 +01:00
Philipp Reisner	19fffd7b03	drbd: Call drbd_md_sync() explicitly after a state change on the connection Without this, the meta-data gets updates after 5 seconds by the md_sync_timer. Better to do it immeditaly after a state change. If the asender detects a network failure, it may take a bit until the worker processes the according after-conn-state-change work item. The worker might be blocked in sending something, i.e. it takes until it gets into its timeout. That is 6 seconds by default which is longer than the 5 seconds of the md_sync_timer. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:08 +01:00
Lars Ellenberg	08332d7325	drbd: properly call drbd_rs_cancel_all() in drbd_disconnected() drbd_disconnected() is supposed to clear the resync lru cache, by calling drbd_rs_cancel_all(). We must do so before we call drbd_flush_workqueue(), as at least the callback w_restart_disk_io() may wait for resync progres, and would otherwise deadlock. drbd_finish_peer_reqs() may again populate that cache, which will then potentially be stale after the next resync handshake and bitmap exchange, we have to do it again after that. A stale resync lru cache causes no harm but ugly messages like this: BAD! sector=196608s enr=6 rs_left=-256 rs_failed=0 count=256 cstate=SyncTarget Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:08:19 +01:00
Philipp Reisner	155522df5b	drbd: Remove dead code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:08:19 +01:00
Philipp Reisner	b66623e33e	drbd: Avoid NetworkFailure state during disconnect Disconnecting is a cluster wide state change. In case the peer node agrees to the state transition, it sends back the fact on the meta-data connection and closes both sockets. In case the node node that initiated the state transfer sees the closing action on the data-socket, before the P_STATE_CHG_REPLY packet, it was going into one of the network failure states. At least with the fencing option set to something else thatn "dont-care", the unclean shutdown of the connection causes a short IO freeze or a fence operation. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:08:18 +01:00
Philipp Reisner	39a1aa7f49	drbd: Protect accesses to the uuid set with a spinlock There is at least the worker context, the receiver context, the context of receiving netlink packts. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:08:04 +01:00
Andreas Gruenbacher	715306f69d	drbd: Don't unregister socket state_change callback from within the callback Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:50 +01:00
Lars Ellenberg	427c0434fc	drbd: disambiguation, s/DISCARD_CONCURRENT/RESOLVE_CONFLICTS/ We don't discard anything here, really. We resolve conflicting, concurrent writes to overlapping data blocks. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:49 +01:00
Lars Ellenberg	d4dabbe22d	drbd: disambiguation, s/P_DISCARD_WRITE/P_SUPERSEDED/ To avoid confusion with REQ_DISCARD aka TRIM, rename our "discard concurrent write acks" from P_DISCARD_WRITE to P_SUPERSEDED. At the same time, rename the drbd request event DISCARD_WRITE to CONFLICT_RESOLVED. It already triggers both successful completion or restart of the request, depending on our RQ_POSTPONED flag. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:49 +01:00
Lars Ellenberg	e959d08d3e	drbd: Fix a potential issue with the DISCARD_CONCURRENT flag The DISCARD_CONCURRENT flag should be set on one node and cleared on the other node. As the code was before it was theoretical possible that a node accepts the meta socket, but has to close it later on, and keeps the DISCARD_CONCURRENT flag. Correct this by moving the clear_bit(DISCARD_CONCURRENT) where the packet gets sent. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:47 +01:00
Philipp Reisner	80c6eed49d	drbd: More random to the connect logic Since the listening socket is open all the time, it was possible to get into stable "initial packet S crossed" loops. * when both sides realize in the drbd_socket_okay() call at the end of the loop that the other side closed the main socket you had the chance to get into a stable loop with repeated "packet S crossed" messages. * when both sides do not realize with the drbd_socket_okay() call at the end of the loop that the other side closed the main socket you had the chance to get into a stable loop with alternating "packet S crossed" "packet M crossed" messages. In order to break out these stable loops randomize the behaviour if such a crossing of P_INITIAL_DATA or P_INITIAL_META packets is detected. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:46 +01:00
Philipp Reisner	92f14951c0	drbd: Try to connec to peer only once per cycle Since now our listening socket is open all the time we will get connection tries of the peer always in. No need to try it three times. This is valid when connecting to older peers as well, it simply increases the probability that the new version DRBD will accept a connection instead that it will establish one. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:45 +01:00
Philipp Reisner	b666dbf819	drbd: Remove redundant and wrong test for NULL simplification in conn_connect() Since the drbd_socket_okay() function itself tests if the the socket is NULL, the explicit test "if (sock.socket && &msock.socket)" was redundent. Apart from that the address opperator ('&') before msock.socket rendered the test pointless. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:45 +01:00
Lars Ellenberg	81a3537a97	drbd: announce FLUSH/FUA capability to upper layers In 8.4, we may have bios spanning two activity log extents. Fixup drbd_al_begin_io() and drbd_al_complete_io() to deal with zero sized bios. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:44 +01:00
Lars Ellenberg	58ffa580a7	drbd: introduce stop-sector to online verify We now can schedule only a specific range of sectors for online verify, or interrupt a running verify without interrupting the connection. Had to bump the protocol version differently, we are now 101. Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; } Also, the return value convention for worker callbacks has changed, we returned "true/false" for "keep the connection up" in 8.3, we return 0 for success and <= for failure in 8.4. Affected: receive_state() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:32 +01:00
Lars Ellenberg	b6dd1a8976	drbd: remove struct drbd_tl_epoch objects (barrier works) cherry-picked and adapted from drbd 9 devel branch DRBD requests (struct drbd_request) are already on the per resource transfer log list, and carry their epoch number. We do not need to additionally link them on other ring lists in other structs. The drbd sender thread can recognize itself when to send a P_BARRIER, by tracking the currently processed epoch, and how many writes have been processed for that epoch. If the epoch of the request to be processed does not match the currently processed epoch, any writes have been processed in it, a P_BARRIER for this last processed epoch is send out first. The new epoch then becomes the currently processed epoch. To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK, the sender thread also needs to handle the case when the current epoch was closed already, but no new requests are queued yet, and send out P_BARRIER as soon as possible. This is done by comparing the per resource "current transfer log epoch" (tconn->current_tle_nr) with the per connection "currently processed epoch number" (tconn->send.current_epoch_nr), while waiting for new requests to be processed in wait_for_work(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:35 +01:00
Lars Ellenberg	d5b27b01f1	drbd: move the drbd_work_queue from drbd_socket to drbd_connection cherry-picked and adapted from drbd 9 devel branch In 8.4, we don't distinguish between "resource work" and "connection work" yet, we have one worker for both, as we still have only one connection. We only ever used the "data.work", no need to keep the "meta.work" around. Move tconn->data.work to tconn->sender_work. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:34 +01:00
Philipp Reisner	7a426fd8d5	drbd: Keep the listening socket open while trying to connect to the peer Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:31 +01:00
Philipp Reisner	1f3e509b76	drbd: pull prepare_listen_socket() out of drbd_wait_for_connect() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:31 +01:00
Philipp Reisner	7e0f096b8d	drbd: Remove drbd_accept() and use kernel_accept() instead Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:30 +01:00
Philipp Reisner	2820fd3969	drbd: Move the call to listen() out of drbd_accept() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:29 +01:00
Lars Ellenberg	1882e22df7	drbd: grammar fix in log message Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:29 +01:00
Philipp Reisner	3ea35df83f	drbd: fix spelling, remove boring development log message Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:27 +01:00
Philipp Reisner	e4bad1bcac	drbd: Ensure that data_size is not 0 before using data_size-1 as index This could be exploited by a peer which runs modified code. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:26 +01:00
Philipp Reisner	a1096a6e9d	drbd: Delay/reject other state changes while establishing a connection Changes to the role and disk state should be delayed or rejected while we establish a connection. This is necessary, since the peer will base its resync decision on the UUIDs and the state we sent in the drbd_connect() function. The most prominent example for this race is becoming primary after sending state and UUIDs and before the state changes to C_WF_CONNECTION. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:26 +01:00
Philipp Reisner	27eb13e99b	drbd: Fixed processing of disk-barrier, disk-flushes and disk-drain Since drbd_bump_write_ordering() is called in the attaching process while the disk state is D_ATTACHING, it was not considering these three flags during attach. A call to this function was missing form drbd_adm_disk_opts(). Fixed both issues. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:25 +01:00
Lars Ellenberg	9ed57dcbda	drbd: ignore volume number for drbd barrier packet exchange Transfer log epochs, and therefore P_BARRIER packets, are per resource, not per volume. We must not associate them with "some random volume". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:25 +01:00
Lars Ellenberg	2312f0b3c5	drbd: fix potential deadlock during "restart" of conflicting writes w_restart_write(), run from worker context, calls __drbd_make_request() and further drbd_al_begin_io(, delegate=true), which then potentially deadlocks. The previous patch moved a BUG_ON to expose such call paths, which would now be triggered. Also, if we call __drbd_make_request() from resource worker context, like w_restart_write() did, and that should block for whatever reason (!drbd_state_is_stable(), resource suspended, ...), we potentially deadlock the whole resource, as the worker is needed for state changes and other things. Create a dedicated retry workqueue for this instead. Also make sure that inc_ap_bio()/dec_ap_bio() are properly paired, even if do_retry() needs to retry itself, in case __drbd_make_request() returns != 0. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:21 +01:00
Lars Ellenberg	d93f63028f	drbd: Fix a potential write ordering issue on SyncTarget nodes Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:18 +01:00
Philipp Reisner	dd9b360475	drbd: Fix module refcount leak in drbd_accept() drbd_accept was modelled after kernel_accept with drbd commit 53eb779 in July 2008. Only, kernel_accept was then broken, and only fixed later with kernel commit `1b08534e` in Dec 2008: net: Fix module refcount leak in kernel_accept() Impact: protocol families provided as modules, e.g. ipv6 or ib_sdp, would soon have their reference count become negative, preventing them from being unloaded (likely), or worse, hit zero without actually being unused, allowing them to be unloaded while still in use (unlikely, but if triggered, causing a kernel crash). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:16 +01:00
Philipp Reisner	4d0fc3fdc3	drbd: Fixed compat issue with disconnecting 8.4 from a primary 8.3 For compatibility reasons 8.4 has to send P_STATE_CHG_REQ (instead of P_CONN_ST_CHG_REQ) when disconnecting. In the receiving code path we missed to convert the old answer (P_STATE_CHG_REPLY) back to 8.4 logic. Therefore the CL_ST_CHG_SUCCESS or CL_ST_CHG_FAIL bit in the flags word of mdev got set, while the state code was waiting for the CONN_WD_ST_CHG_OKAY or CONN_WD_ST_CHG_FAIL bits in tconn. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:14 +01:00

1 2 3 4 5 ...

431 Commits