OpenCloudOS-Kernel/net/tipc
Jon Maloy ec835f8912 tipc: fix lockdep warning during node delete
We see the following lockdep warning:

[ 2284.078521] ======================================================
[ 2284.078604] WARNING: possible circular locking dependency detected
[ 2284.078604] 4.19.0+ #42 Tainted: G            E
[ 2284.078604] ------------------------------------------------------
[ 2284.078604] rmmod/254 is trying to acquire lock:
[ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
[ 2284.078604]
[ 2284.078604] but task is already holding lock:
[ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
[ 2284.078604]
[ 2284.078604] which lock already depends on the new lock.
[ 2284.078604]
[ 2284.078604]
[ 2284.078604] the existing dependency chain (in reverse order) is:
[ 2284.078604]
[ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}:
[ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
[ 2284.078604]        call_timer_fn+0xa1/0x280
[ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
[ 2284.078604]        __do_softirq+0xfc/0x413
[ 2284.078604]        irq_exit+0xb5/0xc0
[ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
[ 2284.078604]        apic_timer_interrupt+0xf/0x20
[ 2284.078604]        default_idle+0x1c/0x140
[ 2284.078604]        do_idle+0x1bc/0x280
[ 2284.078604]        cpu_startup_entry+0x19/0x20
[ 2284.078604]        start_secondary+0x187/0x1c0
[ 2284.078604]        secondary_startup_64+0xa4/0xb0
[ 2284.078604]
[ 2284.078604] -> #0 ((&n->timer)#2){+.-.}:
[ 2284.078604]        del_timer_sync+0x34/0xa0
[ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
[ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
[ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
[ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
[ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
[ 2284.078604]        unregister_pernet_operations+0x87/0xd0
[ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
[ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
[ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
[ 2284.078604]        do_syscall_64+0x66/0x460
[ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 2284.078604]
[ 2284.078604] other info that might help us debug this:
[ 2284.078604]
[ 2284.078604]  Possible unsafe locking scenario:
[ 2284.078604]
[ 2284.078604]        CPU0                    CPU1
[ 2284.078604]        ----                    ----
[ 2284.078604]   lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]                                lock((&n->timer)#2);
[ 2284.078604]                                lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]   lock((&n->timer)#2);
[ 2284.078604]
[ 2284.078604]  *** DEADLOCK ***
[ 2284.078604]
[ 2284.078604] 3 locks held by rmmod/254:
[ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
[ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
[ 2284.078604]  #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19
[...}

The reason is that the node timer handler sometimes needs to delete a
node which has been disconnected for too long. To do this, it grabs
the lock 'node_list_lock', which may at the same time be held by the
generic node cleanup function, tipc_node_stop(), during module removal.
Since the latter is calling del_timer_sync() inside the same lock, we
have a potential deadlock.

We fix this letting the timer cleanup function use spin_trylock()
instead of just spin_lock(), and when it fails to grab the lock it
just returns so that the timer handler can terminate its execution.
This is safe to do, since tipc_node_stop() anyway is about to
delete both the timer and the node instance.

Fixes: 6a939f365b ("tipc: Auto removal of peer down node instance")
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-27 16:30:39 -08:00
..
Kconfig tipc: implement socket diagnostics for AF_TIPC 2018-03-22 14:43:35 -04:00
Makefile tipc: implement socket diagnostics for AF_TIPC 2018-03-22 14:43:35 -04:00
addr.c tipc: handle collisions of 32-bit node address hash values 2018-03-23 13:12:18 -04:00
addr.h tipc: add 128-bit node identifier 2018-03-23 13:12:18 -04:00
bcast.c tipc: correct spelling errors for struct tipc_bc_base's comment 2018-09-03 22:03:07 -07:00
bcast.h tipc: make replicast a user selectable option 2017-01-20 12:10:17 -05:00
bearer.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-03 21:00:17 -07:00
bearer.h tipc: implement configuration of UDP media MTU 2018-04-20 11:04:05 -04:00
core.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
core.h tipc: replace name table service range array with rb tree 2018-03-31 22:19:52 -04:00
diag.c tipc: switch to rhashtable iterator 2018-08-29 18:04:54 -07:00
discover.c tipc: fix lockdep warning when reinitilaizing sockets 2018-11-17 22:01:31 -08:00
discover.h tipc: some cleanups in the file discover.c 2018-03-23 13:12:17 -04:00
eth_media.c tipc: make media address offset a common define 2015-02-27 18:18:48 -05:00
group.c tipc: fix info leak from kernel tipc_event 2018-10-18 16:49:53 -07:00
group.h tipc: extend sock diag for group communication 2018-06-30 21:05:42 +09:00
ib_media.c tipc: rename media/msg related definitions 2015-02-27 18:18:48 -05:00
link.c tipc: fix link re-establish failure 2018-11-11 10:03:38 -08:00
link.h tipc: fix failover problem 2018-09-29 11:45:14 -07:00
monitor.c tipc: make some functions static 2018-07-21 16:23:22 -07:00
monitor.h tipc: dump monitor attributes 2016-07-26 14:26:42 -07:00
msg.c tipc: buffer overflow handling in listener socket 2018-09-29 11:24:22 -07:00
msg.h tipc: buffer overflow handling in listener socket 2018-09-29 11:24:22 -07:00
name_distr.c tipc: eliminate message disordering during binding table update 2018-10-22 19:29:12 -07:00
name_distr.h tipc: permit overlapping service ranges in name table 2018-03-31 22:19:52 -04:00
name_table.c tipc: eliminate message disordering during binding table update 2018-10-22 19:29:12 -07:00
name_table.h tipc: eliminate message disordering during binding table update 2018-10-22 19:29:12 -07:00
net.c tipc: fix lockdep warning when reinitilaizing sockets 2018-11-17 22:01:31 -08:00
net.h tipc: fix lockdep warning when reinitilaizing sockets 2018-11-17 22:01:31 -08:00
netlink.c tipc: switch to rhashtable iterator 2018-08-29 18:04:54 -07:00
netlink.h tipc: make cluster size threshold for monitoring configurable 2016-07-26 14:26:42 -07:00
netlink_compat.c tipc: check return value of __tipc_dump_start() 2018-09-12 13:15:04 -07:00
node.c tipc: fix lockdep warning during node delete 2018-11-27 16:30:39 -08:00
node.h tipc: add SYN bit to connection setup messages 2018-09-29 11:24:22 -07:00
socket.c tipc: don't assume linear buffer when reading ancillary data 2018-11-17 22:08:02 -08:00
socket.h tipc: call start and done ops directly in __tipc_nl_compat_dumpit() 2018-09-06 21:49:18 -07:00
subscr.c tipc: fix unbalanced reference counter 2018-04-12 21:46:10 -04:00
subscr.h tipc: replace name table service range array with rb tree 2018-03-31 22:19:52 -04:00
sysctl.c tipc: add name distributor resiliency queue 2014-09-01 17:51:48 -07:00
topsrv.c Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
topsrv.h tipc: rename tipc_server to tipc_topsrv 2018-02-16 15:26:34 -05:00
udp_media.c tipc: support binding to specific ip address when activating UDP bearer 2018-10-15 21:56:56 -07:00
udp_media.h tipc: implement configuration of UDP media MTU 2018-04-20 11:04:05 -04:00