ocfs2/dlm: fix possible convert=sion deadlock
We found there is a conversion deadlock when the owner of lockres happened to crash before send DLM_PROXY_AST_MSG for a downconverting lock. The situation is as follows: Node1 Node2 Node3 the owner of lockresA lock_1 granted at EX mode and call ocfs2_cluster_unlock to decrease ex_holders. converting lock_3 from NL to EX send DLM_PROXY_AST_MSG to Node1, asking Node 1 to downconvert. receiving DLM_PROXY_AST_MSG, thread ocfs2dc send DLM_CONVERT_LOCK_MSG to Node2 to downconvert lock_1(EX->NL). lock_1 can be granted and put it into pending_asts list, return DLM_NORMAL. then something happened and Node2 crashed. received DLM_NORMAL, waiting for DLM_PROXY_AST_MSG. selected as the recovery master, receving migrate lock from Node1, queue lock_1 to the tail of converting list. After dlm recovery, converting list in the master of lockresA(Node3) will be: converting list head <-> lock_3(NL->EX) <->lock_1(EX<->NL). Requested mode of lock_3 is not compatible with the granted mode of lock_1, so it can not be granted. and lock_1 can not downconvert because covnerting queue is strictly FIFO. So a deadlock is created. We think function dlm_process_recovery_data() should queue_ast for lock_1 or alter the order of lock_1 and lock_3, so dlm_thread can process lock_1 first. And if there are multiple downconverting locks, they must convert form PR to NL, so no need to sort them. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
55b465b668
commit
6718cb5e0e
|
@ -1986,6 +1986,14 @@ skip_lvb:
|
|||
}
|
||||
if (!bad) {
|
||||
dlm_lock_get(newlock);
|
||||
if (mres->flags & DLM_MRES_RECOVERY &&
|
||||
ml->list == DLM_CONVERTING_LIST &&
|
||||
newlock->ml.type >
|
||||
newlock->ml.convert_type) {
|
||||
/* newlock is doing downconvert, add it to the
|
||||
* head of converting list */
|
||||
list_add(&newlock->list, queue);
|
||||
} else
|
||||
list_add_tail(&newlock->list, queue);
|
||||
mlog(0, "%s:%.*s: added lock for node %u, "
|
||||
"setting refmap bit\n", dlm->name,
|
||||
|
|
Loading…
Reference in New Issue