OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Andreas Gruenbacher	70b1987663	drbd: Improve the drbd_find_overlap() documentation Describe how to reach any further overlapping intervals from the first overlap found. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:36 +02:00
Andreas Gruenbacher	43ae077d0a	drbd: Make the peer_seq updating code more obvious Make it more clear that update_peer_seq() is supposed to wake up the seq_wait queue whenever the sequence number changes. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:35 +02:00
Andreas Gruenbacher	6024fece73	drbd: Defer new writes when detecting conflicting writes Before submitting a new local write request, wait for any conflicting local or remote requests to complete. We could assume that the new request occurred first and that the conflicting requests overwrote it (and therefore discard the new request), but we know for sure that the new request occurred after the conflicting requests and so this behavior would be weird. We would also end up with the wrong result if the new request is not fully contained within the conflicting requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:34 +02:00
Andreas Gruenbacher	ddd8877d31	drbd: Remove unnecessary reference counting left-over Nothing in this function accesses mdev->tconn->net_conf, so there is no need for get_net_conf() / put_net_conf() anymore. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:33 +02:00
Andreas Gruenbacher	5e4722645a	drbd: _req_conflicts(): Get rid of the epoch_entries tree Instead of keeping a separate tree for local and remote write requests for finding requests and for conflict detection, use the same tree for both purposes. Introduce a flag to allow distinguishing the two possible types of entries in this tree. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:32 +02:00
Andreas Gruenbacher	53840641bb	drbd: Allow to wait for the completion of an epoch entry as well Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:31 +02:00
Andreas Gruenbacher	3e05146f0a	drbd: Remove redundant check from drbd_contains_interval() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:30 +02:00
Andreas Gruenbacher	a500c2efbb	drbd: struct drbd_request: Introduce a new collision flag This flag is set when a process puts itself to sleep to wait for a conflicting request to complete. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:29 +02:00
Andreas Gruenbacher	9e204cddaf	drbd: Move some functions to where they are used Move drbd_update_congested() to drbd_main.c, and drbd_req_new() and drbd_req_free() to drbd_req.c: those functions are not used anywhere else. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:28 +02:00
Andreas Gruenbacher	3e394da184	drbd: Move sequence number logic into drbd_receiver.c and simplify it These things are only used there. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:27 +02:00
Andreas Gruenbacher	cc378270e4	drbd: Initialize the sequence number sent over the network even when not used Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:26 +02:00
Andreas Gruenbacher	bdc7adb006	drbd: Remove redundant initialization packet_seq is initialized by both sides of a connection in drbd_connect(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:25 +02:00
Andreas Gruenbacher	d876302306	drbd: Rename "enum drbd_packets" to "enum drbd_packet" Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:24 +02:00
Andreas Gruenbacher	f2ad906379	drbd: Move cmdname() out of drbd_int.h There is no good reason for cmdname() to be an inline function. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:23 +02:00
Philipp Reisner	b42a70ad32	drbd: Do not access tconn after it was freed Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:22 +02:00
Philipp Reisner	257d0af689	drbd: Implemented receiving of new style packets on meta socket Now drbd communication with protocol 100 actually works. Replaced the remaining p_header80 with p_header where we no longer know which header it is. In the places where p_header80 is still in use, it is on purpose, because we know that it is an old style header there. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:26:18 +02:00
Philipp Reisner	fd340c12c9	drbd: Use new header layout The new header layout will only be used if the peer supports it of course. For the first packet and the handshake packet the old (h80) layout is used for compatibility reasons. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-09-28 10:23:03 +02:00
Philipp Reisner	c012949a40	drbd: Replaced all p_header80 with a generic p_header Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:30:26 +02:00
Philipp Reisner	c6d25cfe52	drbd: Preparing to use p_header96 for all packets recv_bm_rle_bits() should not make any assumptions about the layout of the packet header Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:30:25 +02:00
Philipp Reisner	191d3cc8d9	drbd: Made drbd_flush_workqueue() to take a tconn instead of an mdev Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:30:24 +02:00
Philipp Reisner	a0638456c6	drbd: moved crypto transformations and friends from mdev to tconn sed -i \ -e 's/mdev->cram_hmac_tfm/mdev->tconn->cram_hmac_tfm/g' \ -e 's/mdev->integrity_w_tfm/mdev->tconn->integrity_w_tfm/g' \ -e 's/mdev->integrity_r_tfm/mdev->tconn->integrity_r_tfm/g' \ -e 's/mdev->int_dig_out/mdev->tconn->int_dig_out/g' \ -e 's/mdev->int_dig_in/mdev->tconn->int_dig_in/g' \ -e 's/mdev->int_dig_vv/mdev->tconn->int_dig_vv/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:30:23 +02:00
Philipp Reisner	87eeee41f8	drbd: moved req_lock and transfer log from mdev to tconn sed -i \ -e 's/mdev->req_lock/mdev->tconn->req_lock/g' \ -e 's/mdev->unused_spare_tle/mdev->tconn->unused_spare_tle/g' \ -e 's/mdev->newest_tle/mdev->tconn->newest_tle/g' \ -e 's/mdev->oldest_tle/mdev->tconn->oldest_tle/g' \ -e 's/mdev->out_of_sequence_requests/mdev->tconn->out_of_sequence_requests/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:30:15 +02:00
Philipp Reisner	31890f4ab2	drbd: moved agreed_pro_version, last_received and ko_count to tconn sed -i \ -e 's/mdev->agreed_pro_version/mdev->tconn->agreed_pro_version/g' \ -e 's/mdev->last_received/mdev->tconn->last_received/g' \ -e 's/mdev->ko_count/mdev->tconn->ko_count/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:07 +02:00
Philipp Reisner	e6b3ea83bc	drbd: moved receiver, worker and asender from mdev to tconn Patch mostly: sed -i -e 's/mdev->receiver/mdev->tconn->receiver/g' \ -e 's/mdev->worker/mdev->tconn->worker/g' \ -e 's/mdev->asender/mdev->tconn->asender/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:06 +02:00
Philipp Reisner	e42325a576	drbd: moved data and meta from mdev to tconn Patch mostly: sed -i -e 's/mdev->data/mdev->tconn->data/g' \ -e 's/mdev->meta/mdev->tconn->meta/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:05 +02:00
Philipp Reisner	b2fb6dbe52	drbd: moved net_cont and net_cnt_wait from mdev to tconn Patch partly generated by: sed -i -e 's/get_net_conf(mdev)/get_net_conf(mdev->tconn)/g' \ -e 's/put_net_conf(mdev)/put_net_conf(mdev->tconn)/g' \ -e 's/get_net_conf(odev)/get_net_conf(odev->tconn)/g' \ -e 's/put_net_conf(odev)/put_net_conf(odev->tconn)/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:04 +02:00
Philipp Reisner	89e58e755e	drbd: moved net_conf from mdev to tconn Besides moving the struct member, everything else is generated by: sed -i -e 's/mdev->net_conf/mdev->tconn->net_conf/g' \ -e 's/odev->net_conf/odev->tconn->net_conf/g' \ *.[ch] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:03 +02:00
Philipp Reisner	2111438b30	drbd: Minimal struct drbd_tconn Starting to dissolve the network connection from the actual block devices. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:02 +02:00
Andreas Gruenbacher	6618bf1638	drbd: Interval tree bugfix Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:27:00 +02:00
Andreas Gruenbacher	e3cfa7b26a	drbd: Inline function overlaps() is now unused Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:59 +02:00
Andreas Gruenbacher	70dc65e1b3	drbd: Remove some useless paranoia code The open_cnt check is an open-coded D_ASSERT() check. In case the data.work queue is not empty, it does not really help to know which drbd_work elements remained on that list: they will be freed immediately afterwards, anyway. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:58 +02:00
Andreas Gruenbacher	841ce241fa	drbd: Replace the ERR_IF macro with an assert-like macro Remove the file name and line number from the syslog messages generated: we have no duplicate function names, and no function contains the same assertion more than once. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:57 +02:00
Andreas Gruenbacher	e77a0a5cc1	drbd: Convert all constants in enum drbd_thread_state to upper case Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:56 +02:00
Andreas Gruenbacher	8554df1c6d	drbd: Convert all constants in enum drbd_req_event to upper case Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:55 +02:00
Andreas Gruenbacher	bb3bfe9614	drbd: Remove the unused hash tables Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:54 +02:00
Andreas Gruenbacher	8b946255f8	drbd: Use interval tree for overlapping epoch entry detection Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:53 +02:00
Andreas Gruenbacher	010f6e678f	drbd: Put sector and size in struct drbd_epoch_entry into struct drbd_interval Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:52 +02:00
Andreas Gruenbacher	bc9c5c4118	drbd: Use the read and write request trees for request lookups Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:50 +02:00
Andreas Gruenbacher	dac1389ccc	drbd: Add read_requests tree We do not do collision detection for read requests, but we still need to look up the request objects when we receive a packet over the network. Using the same data structure for read and write requests results in simpler code once the tl_hash and app_reads_hash tables are removed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-29 11:26:31 +02:00
Andreas Gruenbacher	de696716e8	drbd: Use interval tree for overlapping write request detection Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:06 +02:00
Andreas Gruenbacher	ace652acf2	drbd: Put sector and size in struct drbd_request into struct drbd_interval Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:05 +02:00
Andreas Gruenbacher	0939b0e5cd	drbd: Add interval tree data structure Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:04 +02:00
Andreas Gruenbacher	c3afd8f568	drbd: Request lookup code cleanup (4) Factor out duplicate code in got_NegAck(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:03 +02:00
Andreas Gruenbacher	ae3388daae	drbd: Request lookup code cleanup (3) Get rid of the ar_id_to_req() and ack_id_to_req() wrappers. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:02 +02:00
Andreas Gruenbacher	668eebc6a1	drbd: Request lookup code cleanup (2) Unify the ar_id_to_req() and ack_id_to_req() functions: make both fail if the consistency check fails. Move the request lookup code now duplicated in both functions into its own function. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:01 +02:00
Andreas Gruenbacher	5162458564	drbd: Request lookup code cleanup (1) Move _ar_id_to_req() to drbd_receiver.c and mark it non-inline. Remove the leading underscores from _ar_id_to_req() and _ack_id_to_req(). Mark ar_hash_slot() inline. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:58:00 +02:00
Andreas Gruenbacher	9c50842a35	drbd: Update outdated comment Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:59 +02:00
Andreas Gruenbacher	d628769b3c	drbd: Move drbd_free_tl_hash() to drbd_main() This is the only place where this function is used. Make it static. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:58 +02:00
Andreas Gruenbacher	579b57ed73	drbd: Magic reserved block_id value cleanup The ID_VACANT definition has become entirely irrelevant by now. The is_syncer_block_id() macro does not improve the code, so eliminate it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:57 +02:00
Andreas Gruenbacher	e7fad8af75	drbd: Endianness convert the constants instead of the variables Converting the constants happens at compile time. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:57 +02:00
Andreas Gruenbacher	ca9bc12b90	drbd: Get rid of BE_DRBD_MAGIC and BE_DRBD_MAGIC_BIG Converting the constants happens at compile time. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:56 +02:00
Andreas Gruenbacher	9a8e77530f	drbd: Consistently use block_id == ID_SYNCER for checksum based resync and online verify DRBD_MAGIC has nothing to do with block ids and the funny values computed were not actually used, anyway. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:55 +02:00
Andreas Gruenbacher	3980485361	drbd: Remove superfluous declaration Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:43 +02:00
Andreas Gruenbacher	28c455ceb2	drbd: Get rid of req_validator_fn typedef Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-08-25 14:57:31 +02:00
Jens Axboe	7b28afe01a	Merge branch 'for-3.0-important' of git://git.drbd.org/linux-2.6-drbd into for-linus	2011-06-30 10:10:50 +02:00
Lars Ellenberg	86e1e98e5c	drbd: we should write meta data updates with FLUSH FUA We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless disabled in the configuration). The correct semantic now would be to write with FLUSH/FUA. For example, with activity log transactions, FUA alone is not enough, we need the corresponding bitmap update (and all related application updates) on stable storage as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:46 +02:00
Lars Ellenberg	cb6518cbef	drbd: when receive times out on meta socket, also check last receive time on data socket If we have an asymmetrically congested network, we may send P_PING, but due to congestion, the corresponding P_PING_ACK would time out, and we would drop a (congested, but otherwise) healthy connection ("PingAck did not arrive in time.") Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:44 +02:00
Lars Ellenberg	5a8b424276	drbd: account bitmap IO during resync as resync-(related-)-io If we have a good resync rate, we will frequently update the on-disk bitmap, which, if not accounted for as resync io, may let an otherwise idle device appear to be "busy", and cause us to throttle resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:43 +02:00
Lars Ellenberg	8ccee20e3e	drbd: don't cond_resched_lock with IRQs disabled The last commit, drbd: add missing spinlock to bitmap receive, introduced a cond_resched_lock(), where the lock in question is taken with irqs disabled. As we must not schedule with IRQs disabled, and cond_resched_lock_irq() does not exist, yet, we re-acquire the spin_lock_irq() for each bitmap page processed in turn. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:42 +02:00
Lars Ellenberg	829c608786	drbd: add missing spinlock to bitmap receive During bitmap exchange, when using the RLE bitmap compression scheme, we have a code path that can set the whole bitmap at once. To avoid holding spin_lock_irq() for too long, we used to lock out other bitmap modifications during bitmap exchange by other means, and then, knowing we have exclusive access to the bitmap, modify it without the spinlock, and with IRQs enabled. Since we now allow local IO to continue, potentially setting additional bits during the bitmap receive phase, this is no longer true, and we get uncoordinated updates of bitmap members, causing bm_set to no longer accurately reflect the total number of set bits. To actually see this, you'd need to have a large bitmap, use RLE bitmap compression, and have busy IO during sync handshake and bitmap exchange. Fix this by taking the spin_lock_irq() in this code path as well, but calling cond_resched_lock() after each page worth of bits processed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:41 +02:00
Philipp Reisner	0cfdd247d1	drbd: Use the correct max_bio_size when creating resync requests Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:40 +02:00
Linus Torvalds	929cfdd5d3	Merge branch 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block: (110 commits) loop: handle on-demand devices correctly loop: limit 'max_part' module param to DISK_MAX_PARTS drbd: fix warning drbd: fix warning drbd: Fix spelling drbd: fix schedule in atomic drbd: Take a more conservative approach when deciding max_bio_size drbd: Fixed state transitions after async outdate-peer-handler returned drbd: Disallow the peer_disk_state to be D_OUTDATED while connected drbd: Fix for the connection problems on high latency links drbd: fix potential activity log refcount imbalance in error path drbd: Only downgrade the disk state in case of disk failures drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int drbd: fix potential distributed deadlock lru_cache.h: fix comments referring to ts_ instead of lc_ drbd: Fix for application IO with the on-io-error=pass-on policy xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions. xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override. xen/blkback: don't fail empty barrier requests xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end() ...	2011-05-25 09:15:35 -07:00
Andrew Morton	0ddf72be4e	drbd: fix warning In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type Forward declarations of enums do not work. Fix it unpleasantly by moving the prototype. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2011-05-24 10:38:33 +02:00
Philipp Reisner	9b2f61aec7	drbd: fix warning Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2011-05-24 10:38:32 +02:00
Bart Van Assche	24c4830c8e	drbd: Fix spelling Found these with the help of ispell -l. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2011-05-24 10:21:29 +02:00
Lars Ellenberg	9a0d9d0389	drbd: fix schedule in atomic An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occurring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider to rename it D_DETACHING. Alternative approach would be to create yet another work item to be scheduled on the worker, do the destructor work from there, and get the timing right. manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:14:32 +02:00
Philipp Reisner	99432fcc52	drbd: Take a more conservative approach when deciding max_bio_size The old (optimistic) implementation could shrink the bio size on a primary device. Shrinking the bio size on a primary device is bad, since there we might get BIOs with the old (bigger) size shortly after we published the new size. The new implementation is more conservative, and eventually increases the max_bio_size on a primary device (which is valid). It does so, when it knows the local limit AND the remote limit. We cache the last seen max_bio_size of the peer in the meta data, and rely on that, to make the operation of single nodes more efficient. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:08:58 +02:00
Philipp Reisner	21423fa791	drbd: Fixed state transitions after async outdate-peer-handler returned Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:08:11 +02:00
Philipp Reisner	fa7d939663	drbd: Disallow the peer_disk_state to be D_OUTDATED while connected Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:07:50 +02:00
Philipp Reisner	a8e407925d	drbd: Fix for the connection problems on high latency links It seems that the real cause of all the issues was that we did not notice in drbd_try_connect() when the other guy closes one socket if the round trip time gets higher than 100ms. That 100ms was hard coded! Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:07:22 +02:00
Lars Ellenberg	76727f684a	drbd: fix potential activity log refcount imbalance in error path It is no longer sufficient to trigger on local WRITE, we need to check on (rq_state & RQ_IN_ACT_LOG) before calling drbd_al_complete_io also in the error path. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:06:44 +02:00
Philipp Reisner	d2e17807e3	drbd: Only downgrade the disk state in case of disk failures Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:05:48 +02:00
Lars Ellenberg	f36af18c7b	drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int If there is no replication traffic within the idle timeout (ping-int seconds), DRBD will send a P_PING, and adjust the timeout to ping-timeout. If there is no P_PING_ACK received within this ping-timeout, DRBD finally drops the connection, and tries to re-establish it. To decide which timeout was active, we compared the current timeout with the ping-timeout, and dropped the connection, if that was the case. By default, ping-int is 10 seconds, ping-timeout is 500 ms. Unfortunately, if you configure ping-timeout to be the same as ping-int, expiry of the idle-timeout had been mistaken for a missing ping ack, and caused an immediate reconnection attempt. Fix: Allow both timeouts to be equal, use a local variable to store which timeout is active. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:03:30 +02:00
Lars Ellenberg	53ea433145	drbd: fix potential distributed deadlock We limit ourselves to a configurable maximum number of pages used as temporary bio pages. If the configured "max_buffers" is not big enough to match the bandwidth of the respective deployment, a distributed deadlock could be triggered by e.g. fast online verify and heavy application IO. TCP connections would block on congestion, because both receivers would wait on pages to become available. Fortunately the respective senders in this case would be able to give back some pages already. So do that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:02:41 +02:00
Philipp Reisner	738a84b25c	drbd: Fix for application IO with the on-io-error=pass-on policy In case a write fails on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now. Since the application is still shielded from the error. (as long as we have only a single disk failing) The difference to detach is that we keep the disk. And therefore might keep all the other, still working sectors up to date. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 09:59:49 +02:00
Paul Gortmaker	70c7160619	Add appropriate <linux/prefetch.h> include for prefetch users After discovering that wide use of prefetch on modern CPUs could be a net loss instead of a win, net drivers which were relying on the implicit inclusion of prefetch.h via the list headers showed up in the resulting cleanup fallout. Give them an explicit include via the following $0.02 script. ========================================= #!/bin/bash MANUAL="" for i in `git grep -l 'prefetch(.*)' .` ; do grep -q '<linux/prefetch.h>' $i if [ $? = 0 ] ; then continue fi ( echo '?^#include <linux/?a' echo '#include <linux/prefetch.h>' echo . echo w echo q ) | ed -s $i > /dev/null 2>&1 if [ $? != 0 ]; then echo $i needs manual fixup MANUAL="$i $MANUAL" fi done echo ------------------- 8\<---------------------- echo vi $MANUAL ========================================= Signed-off-by: Paul <paul.gortmaker@windriver.com> [ Fixed up some incorrect #include placements, and added some non-network drivers and the fib_trie.c case - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-22 21:41:57 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Linus Torvalds	7e599e6e62	drbd: fix up merge error In commit `95a0f10cdd` ("drbd: store in-core bitmap little endian, regardless of architecture") drbd had made the sane choice to use little-endian bitmap functions everywhere. However, it used the horrible old functions names from <asm-generic/bitops/le.h>, that were never really meant to be exported. In the meantime, things got cleaned up, and in commit `c4945b9ed4` ("asm-generic: rename generic little-endian bitops functions") we renamed the LE bitops to something sane, exactly so that they could be used in random code without people gouging their eyes out when seeing the crazy jumble of letters that were the old internal names. As a result the drbd thing merged cleanly (commit 8d49a77568d1: "Merge branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block"), since there was no data conflict - but the end result obviously doesn't actually compile. Reported-and-tested-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <jaxboe@fusionio.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-28 07:42:58 -07:00
Linus Torvalds	8d49a77568	Merge branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block: (122 commits) cciss: fix lost command issue drbd: need include for bitops functions declarations Revert "cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation" cciss: fix missed command status value CMD_UNABORTABLE cciss: remove unnecessary casts cciss: Mask off error bits of c->busaddr in cmd_special_free when calling pci_free_consistent cciss: Inform controller we are using 32-bit tags. cciss: hoist tag masking out of loop cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation cciss: export resettable host attribute drbd: drop code present under #ifdef which is relevant to 2.6.28 and below drbd: Fixed handling of read errors on a 'VerifyS' node drbd: Fixed handling of read errors on a 'VerifyT' node drbd: Implemented real timeout checking for request processing time drbd: Remove unused function atodb_endio() drbd: improve log message if received sector offset exceeds local capacity drbd: kill dead code drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails drbd: Removed left over, now wrong comments drbd: serialize admin requests for new verify run with pending bitmap io ...	2011-03-27 20:02:07 -07:00
Linus Torvalds	6c51038900	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits) Documentation/iostats.txt: bit-size reference etc. cfq-iosched: removing unnecessary think time checking cfq-iosched: Don't clear queue stats when preempt. blk-throttle: Reset group slice when limits are changed blk-cgroup: Only give unaccounted_time under debug cfq-iosched: Don't set active queue in preempt block: fix non-atomic access to genhd inflight structures block: attempt to merge with existing requests on plug flush block: NULL dereference on error path in __blkdev_get() cfq-iosched: Don't update group weights when on service tree fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away block: Require subsystems to explicitly allocate bio_set integrity mempool jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging fs: make fsync_buffers_list() plug mm: make generic_writepages() use plugging blk-cgroup: Add unaccounted time to timeslice_used. block: fixup plugging stubs for !CONFIG_BLOCK block: remove obsolete comments for blkdev_issue_zeroout. blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. ... Fix up conflicts in fs/{aio.c,super.c}	2011-03-24 10:16:26 -07:00
Stephen Rothwell	f0ff1357ce	drbd: need include for bitops functions declarations Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-03-17 15:02:51 +01:00
Or Gerlitz	03567812d8	drbd: drop code present under #ifdef which is relevant to 2.6.28 and below Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:21 +01:00
Philipp Reisner	7961243b7b	drbd: Fixed handling of read errors on a 'VerifyS' node Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:20 +01:00
Philipp Reisner	8f21420ebd	drbd: Fixed handling of read errors on a 'VerifyT' node Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:18 +01:00
Philipp Reisner	7fde2be930	drbd: Implemented real timeout checking for request processing time Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:16 +01:00
Andreas Gruenbacher	c5a9161979	drbd: Remove unused function atodb_endio() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:15 +01:00
Lars Ellenberg	fdda6544ad	drbd: improve log message if received sector offset exceeds local capacity Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:13 +01:00
Lars Ellenberg	e99dc367b3	drbd: kill dead code This code became obsolete and unused last December with drbd: bitmap keep track of changes vs on-disk bitmap Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:12 +01:00
Lars Ellenberg	10f6d9926c	drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails Just deal with it more gracefully, if we fail to add even a single page to an empty bio. We used to BUG_ON() there, but it has been observed in some Xen deployment, so we need to handle that case more robustly now. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:10 +01:00
Philipp Reisner	039312b648	drbd: Removed left over, now wrong comments Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:09 +01:00
Lars Ellenberg	873b0d5f98	drbd: serialize admin requests for new verify run with pending bitmap io This is an addendum to drbd: serialize admin requests for new resync with pending bitmap io It avoids a race that could trigger "FIXME" assert log messages. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:07 +01:00
Lars Ellenberg	e636db5b95	drbd: fix potential imbalance of ap_in_flight When we receive a barrier ack, we walk the ring list of drbd requests in the transfer log of the respective epoch, do some housekeeping, and free those objects. We tried to keep epochs of mirrored and unmirrored drbd requests separate, and assert that no local-only requests are present in a barrier_acked epoch. It turns out that this has quite a number of corner cases and would add bloated code without functional benefit. We now revert the (insufficient) commits drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions drbd: Ensure that an epoch contains only requests of one kind and instead fix the processing of barrier acks to cope with a mix of local-only and mirrored requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:06 +01:00
Lars Ellenberg	0ddc5549f8	drbd: silence some noisy log messages during disconnect If we fail to send the information that we lost our disk, we have no connection, and no disk: no access to data anymore. That is either expected (deconfiguration), or there will be so much noise in the logs that "Sending state failed" is not useful at all. Drop it. If the reason for a shorter than expected receive was a signal, which we sent because we already decided to disconnect, these additional log messages are confusing and useless. This patch follows this pattern: - dev_warn(DEV, "short read expecting header on sock: r=%d\n", r); + if (!signal_pending(current)) + dev_warn(DEV, "short read expecting header on sock: r=%d\n", r); Also make them all dev_warn for consistency. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:04 +01:00
Lars Ellenberg	20ceb2b22e	drbd: describe bitmap locking for bulk operation in finer detail Now that we no longer endian-swap the bitmap in place, we allow selected bitmap operations (testing bits, sometimes even setting bits) during some bulk operations. This caused us to hit a lot of FIXME asserts similar to FIXME asender in drbd_bm_count_bits, bitmap locked for 'write from resync_finished' by worker Which now is nonsense: looking at the bitmap is perfectly legal as long as it is not being resized. This cosmetic patch defines some flags to describe expectations in finer detail, so the asserts in e.g. bm_change_bits_to() can be skipped if appropriate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:02 +01:00
Lars Ellenberg	62b0da3a24	drbd: log UUIDs whenever they change All decisions about sync, sync direction, and whether or not to allow a connect or attach are based on our set of UUIDs to tag a data generation. Log changes to the UUIDs whenever they occur, logging "new current UUID P:Q:R:S" is more useful than "Creating new current UUID". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:48:01 +01:00
Philipp Reisner	d07c9c10e5	drbd: We can not process BIOs with a size of 0 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:59 +01:00
Philipp Reisner	cd88d030d4	drbd: Provide hints with the error message when clearing the sync pause flag When the user clears the sync-pause flag, and sync stays in pause state, give hints to the user, why it still is in pause state. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:58 +01:00
Lars Ellenberg	79a30d2d71	drbd: queue bitmap writeout more intelligently The "lazy writeout" of cleared bitmap pages happens during resync, and should happen again once the resync finishes cleanly, or is aborted. If resync finished cleanly, or was aborted because of peer disk failure, we trigger the writeout from worker context in the after state change work. If resync was aborted because of connection failure, we should not immediately trigger bitmap writeout, but rather postpone the writeout to after the connection cleanup happened. We now do it in the receiver context from drbd_disconnect(). If resync was aborted because of local disk failure, well, there is nothing to write to anymore. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:56 +01:00
Lars Ellenberg	54b956abef	drbd: don't pointlessly queue bitmap send, if we lost connection This is a minor optimization and cleanup, and also considerably reduces some harmless (but noisy) race with the connection cleanup code. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:55 +01:00
Lars Ellenberg	194bfb32db	drbd: serialize admin requests for new resync with pending bitmap io Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:53 +01:00
Lars Ellenberg	6c922ed543	drbd: only generate and send a new sync uuid after a successful state change Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:52 +01:00
Philipp Reisner	20ee639024	drbd: cleaned up __set_current_state() followed by schedule_timeout() calls Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:47:42 +01:00
Philipp Reisner	6a35c45f89	drbd: Ensure that an epoch contains only requests of one kind The assert in drbd_req.c:755 forces us to have only requests of one kind in an epoch. The two kinds we distinguish here are: local-only or mirrored. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:42 +01:00
Philipp Reisner	2deb8336d0	drbd: Fixed P_NEG_ACK processing for protocol A and B Protocol A has no P_WRITE_ACKs, but has P_NEG_ACKs. The master bio might already be completed, therefore the request is no longer in the collision hash. => Do not try to validate block_id as request In Protocol B we might already have got a P_RECV_ACK but then get a P_NEG_ACK afterwards. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:40 +01:00
Philipp Reisner	94f2b05f03	drbd: Killed an assert that is no longer valid The point is that drbd_disconnect() can be called with a cstate of WFConnection. That happens if the user issues "drbdsetup disconnect" while the drbd_connect() function executes. Then drbdd_init() will call drbdd(), which in turn will return without receiving any packets. Then drbdd_init() will end up calling drbd_disconnect() with a cstate of WFConnection. Bottom line: This assertion is wrong as it is, and we do not see value in fixing it. => Removing it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:39 +01:00
Philipp Reisner	148efa165e	drbd: Do not drop net config if sending in drbd_send_protocol() fails Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:37 +01:00
Philipp Reisner	370a43e798	drbd: Work on the Ahead -> SyncSource transition The test if rs_pending_cnt == 0 was too weak. Using Test for unacked_cnt == 0 instead. Moved that into the worker. Since unacked_cnt gets already increased when an P_RS_DATA_REQ comes in. Also using a timer to make Ahead -> SyncSource -> Ahead cycles slower... Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:36 +01:00
Philipp Reisner	71c78cfba2	drbd: Nothing should stop SyncSource -> Ahead transitions Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:34 +01:00
Philipp Reisner	4a23f26496	drbd: Do not full sync if a P_SYNC_UUID packet gets lost See also commit from 2009-08-15 "drbd_uuid_compare(): Do not full sync in case a P_SYNC_UUID packet gets lost." We saw cases where the History UUIDs were not as expected. So the detection of the special case did not trigger. With the sync UUID no longer being a random number, but deducible from the previous bitmap UUID, the detection of this special case becomes more reliable. The SyncUUID now is the previous bitmap UUID + 0x1000000000000. Rule 5a: Cs = H1p & H1p + Offset = Bp Connection was lost before SyncUUID Packet came through. Correct (peer) UUIDs: Bp = H1p H1p = H2p H2p = 0 Become Sync target. Rule 7a: Cp = H1s & H1s + Offset = Bs Connection was lost before SyncUUID Packet came through. Correct (own) UUIDs: Bs = H1s H1s = H2s H2s = 0 Become Sync source. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:32 +01:00
Philipp Reisner	2b8a90b555	drbd: Corrected off-by-one error in DRBD_MINOR_COUNT_MAX Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:31 +01:00
Andreas Gruenbacher	110a204a35	drbd: Remove useless / wrong comments Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:29 +01:00
Philipp Reisner	794abb753e	drbd: Cleaned up the resync timer logic Besides removing a few lines of code, this moves the inspection of the state from before the queuing process to after the queuing, i.e. closer to the actual invocation of the work. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:28 +01:00
Philipp Reisner	da0a78161d	drbd: Be more careful with SyncSource -> Ahead transitions We may not get from SyncSource to Ahead if we have sent some P_RS_DATA_REPLY packets to the peer and are waiting for P_WRITE_ACK. Again, this is not relevant for proper tuned systems, but makes sure that the not-tuned system does not get diverging bitmaps. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:26 +01:00
Philipp Reisner	d612d309e4	drbd: No longer answer P_RS_DATA_REQUEST packets when in C_AHEAD mode When the sync source node replies to a P_RS_DATA_REQUEST packet when it is already in ahead mode. I.e. those two packets crossed each other on the wire, that may lead to diverging bitmaps. This never happens in a well-tuned-system. In a well-tuned- system the resync controller has reduced the resync speed to zero long before we got into ahead-mode. But we have to be prepared for the not-well-tuned-system of course as well. Because -> diverging bitmaps = non terminating resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:25 +01:00
Philipp Reisner	617049aa7d	drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions Create a new barrier when leaving the AHEAD mode. Otherwise we trigger the assertion in req_mod(, barrier_acked) D_ASSERT(req->rq_state & RQ_NET_SENT); The new barrier is created by recycling the newest existing one. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:23 +01:00
Lars Ellenberg	0719427278	drbd: ratelimit io error messages Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:21 +01:00
Philipp Reisner	3f98688afc	drbd: There might be a resync after unfreezing IO due to no disk [Bugz 332] When on-no-data-accessible is set to suspend-io, also consider that a Primary, SyncTarget node loses its connection. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:20 +01:00
Lars Ellenberg	725a97e43e	drbd: fix potential access of on-stack wait_queue_head_t after return I run into something declaring itself as "spinlock deadlock", BUG: spinlock lockup on CPU#1, kjournald/27816, ffff88000ad6bca0 Pid: 27816, comm: kjournald Tainted: G W 2.6.34.6 #2 Call Trace: <IRQ> [<ffffffff811ba0aa>] do_raw_spin_lock+0x11e/0x14d [<ffffffff81340fde>] _raw_spin_lock_irqsave+0x6a/0x81 [<ffffffff8103b694>] ? __wake_up+0x22/0x50 [<ffffffff8103b694>] __wake_up+0x22/0x50 [<ffffffffa07ff661>] bm_async_io_complete+0x258/0x299 [drbd] but the call traces do not fit at all, all other cpus are cpu_idle. I think it may be this race: drbd_bm_write_page wait_queue_head_t io_wait; atomic_t in_flight; bm_async_io submit_bio bm_async_io_complete if (atomic_dec_and_test(in_flight)) wait_event(io_wait, atomic_read(in_flight) == 0) return wake_up(io_wait) The wake_up now accesses the wait_queue_head_t spinlock, which is no longer valid, since the stack frame of drbd_bm_write_page has been clobbered now. Fix this by using struct completion, which does both the condition test as well as the wake_up inside its spinlock, so this race cannot happen. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:45:08 +01:00
Lars Ellenberg	06d33e968d	drbd: improve on bitmap write out timing Even though we now track the need for bitmap writeout per bitmap page, there is no need to trigger the writeout while a resync is going on. Once the resync is finished (or aborted), we trigger bitmap writeout anyways. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:40 +01:00
Lars Ellenberg	418e0a927d	drbd: spelling fix in log message Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:38 +01:00
Lars Ellenberg	7648cdfe52	drbd: be less noisy with some log messages We expect changes to a bitmap page in drbd_bm_write_page, that's why we submit a copy page. If a page changes during global writeout, that would be unexpected, and reason to warn, though. Also, often page writeout can be skipped (on activity log transactions during normal operation, for example), no need to log that every time. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:37 +01:00
Lars Ellenberg	5a22db8968	drbd: serialize sending of resync uuid with pending w_send_oos To improve the latency of IO requests during bitmap exchange, we recently allowed writes while waiting for the bitmap, sending "set out-of-sync" information packets for any newly dirtied bits. We have to make sure that the new resync-uuid does not overtake these "set oos" packets. Once the resync-uuid is received, the sync target starts the resync process, and expects the bitmap to only be cleared, not re-set. If we use this protocol extension, we queue the generation and sending of the resync-uuid on the worker, which naturally serializes with all previously queued packets. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:35 +01:00
Lars Ellenberg	f735e36354	drbd: add debugging assert to make sure the protocol is clean We expect to only receive the recently introduced "set out of sync" packets in specific states. If we receive them in different states, that may confuse the resync process to the point where it won't terminate, or think it made negative progress. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:34 +01:00
Philipp Reisner	c88d65e223	drbd: Documenting drbd_should_do_remote() and drbd_should_send_oos() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:32 +01:00
Lars Ellenberg	2265b473ae	drbd: fix potential dereference of NULL pointer If drbd used to have crypto digest algorithms configured, then is being unconfigured (but not unloaded), it frees the algorithms, but does not reset the config. If it then is reconfigured to use the very same algorithm, it "forgot" to re-allocate the algorithms, thinking that the config has not changed in that aspect. It will then Oops on the first attempt to actually use those algorithms. Fix this by resetting the config to defaults after cleanup. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:30 +01:00
Lars Ellenberg	02851e9f00	drbd: move bitmap write from resync_finished to after_state_change We must not call it directly from resync_finished, as we may be in either receiver or worker context there. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:29 +01:00
Lars Ellenberg	84e7c0f7d1	drbd: Removed a reference to debug macros removed long time ago Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:27 +01:00
Lars Ellenberg	6850c44214	drbd: get rid of unused debug code Long time ago, we had paranoia code in the bitmap that allocated one extra word, assigned a magic value, and checked on every occasion that the magic value was still unchanged. That debug code is unused, the extra long word complicates code a bit. Get rid of it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:26 +01:00
Lars Ellenberg	4b0715f096	drbd: allow petabyte storage on 64bit arch Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:24 +01:00
Lars Ellenberg	19f843aa08	drbd: bitmap keep track of changes vs on-disk bitmap When we set or clear bits in a bitmap page, also set a flag in the page->private pointer. This allows us to skip writes of unchanged pages. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:43:19 +01:00
Lars Ellenberg	95a0f10cdd	drbd: store in-core bitmap little endian, regardless of architecture Our on-disk bitmap is a little endian bitstream. Up to now, we have stored the in-core copy of that in native endian, applying byte order conversion when necessary. Instead, keep the bitmap pages little endian, as they are read from disk, and use the generic_*_le_bit family of functions. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:40 +01:00
Lars Ellenberg	7777a8ba1f	drbd: bitmap: don't count unused bits (fix non-terminating resync) We trusted the on-disk bitmap to have unused bits cleared. In case that is not true for whatever reason, and we take a code path where the unused bits don't get cleared elsewhere (bm_clear_surplus is not called), we may miscount the bits, and get confused during resync, waiting for bits to get cleared that we don't even use: the resync process would not terminate. Fix this by masking out unused bits in __bm_count_bits. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:38 +01:00
Andreas Gruenbacher	1b881ef775	drbd: Rename __inc_ap_bio_cond to may_inc_ap_bio The old name is confusing: the function does not increment anything. Also rename _inc_ap_bio_cond to inc_ap_bio_cond: there is no need for an underscore. Finally, make it clear that these functions return boolean values. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:37 +01:00
Andreas Gruenbacher	24dccabb39	drbd: Fix: drbd_bitmap_io does not return an enum determine_dev_size I guess bitmap I/O errors are supposed to cause drbd_determin_dev_size to return dev_size_error. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:35 +01:00
Andreas Gruenbacher	2c46407d24	drbd: receive_bitmap_plain: Get rid of ugly and useless enum Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:34 +01:00
Andreas Gruenbacher	f70af118e3	drbd: send_bitmap_rle_or_plain: Get rid of ugly and useless enum Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:32 +01:00
Andreas Gruenbacher	78fcbdae22	drbd: receive_bitmap: Missing free_page() on error path Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:30 +01:00
Andreas Gruenbacher	de1f8e4a0a	drbd: receive_bitmap: Avoid casting enum drbd_state_rv to int Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:29 +01:00
Andreas Gruenbacher	4114be815f	drbd: receive_bitmap: Fix the wrong return value Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:27 +01:00
Andreas Gruenbacher	f2024e7ce2	drbd: drbd_nl_disk_conf: Avoid a compiler warning Warning: comparison between ‘enum drbd_ret_code’ and ‘enum drbd_state_rv’ Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:26 +01:00
Andreas Gruenbacher	81e84650c2	drbd: Use the standard bool, true, and false keywords Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:24 +01:00
Andreas Gruenbacher	6184ea2145	drbd: This code is dead now Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:22 +01:00
Andreas Gruenbacher	bb4379464e	drbd: Another small enum drbd_state_rv cleanup Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:21 +01:00
Andreas Gruenbacher	bf885f8a67	drbd: Be more explicit about functions that return an enum drbd_state_rv Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:19 +01:00
Andreas Gruenbacher	c8b325632f	drbd: Rename enum drbd_state_ret_codes to enum drbd_state_rv Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:18 +01:00
Andreas Gruenbacher	116676ca62	drbd: Rename enum drbd_ret_codes to enum drbd_ret_code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:16 +01:00
Andreas Gruenbacher	0cf9d27e38	drbd: Get rid of unnecessary macros (2) The FAULT_ACTIVE macro just wraps the drbd_insert_fault macro for no apparent reason. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:15 +01:00
Andreas Gruenbacher	662d91a23a	drbd: Get rid of unnecessary macros (1) This macro doesn't save much code, but makes things a lot harder to read. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:13 +01:00
Andreas Gruenbacher	2f58dcfc85	drbd: Rename drbd_make_request_26 to drbd_make_request Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:11 +01:00
Andreas Gruenbacher	96756784a6	drbd: Remove left-over prototype Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-03-10 11:36:10 +01:00

512 Commits