OpenCloudOS-Kernel/drivers/infiniband/sw
Zhu Yanjun 8ac0e6641c RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq
When run stress tests with RXE, the following Call Traces often occur

  watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0]
  ...
  Call Trace:
  <IRQ>
  create_object+0x3f/0x3b0
  kmem_cache_alloc_node_trace+0x129/0x2d0
  __kmalloc_reserve.isra.52+0x2e/0x80
  __alloc_skb+0x83/0x270
  rxe_init_packet+0x99/0x150 [rdma_rxe]
  rxe_requester+0x34e/0x11a0 [rdma_rxe]
  rxe_do_task+0x85/0xf0 [rdma_rxe]
  tasklet_action_common.isra.21+0xeb/0x100
  __do_softirq+0xd0/0x298
  irq_exit+0xc5/0xd0
  smp_apic_timer_interrupt+0x68/0x120
  apic_timer_interrupt+0xf/0x20
  </IRQ>
  ...

The root cause is that tasklet is actually a softirq. In a tasklet
handler, another softirq handler is triggered. Usually these softirq
handlers run on the same cpu core. So this will cause "soft lockup Bug".

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200212072635.682689-8-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-13 10:51:09 -04:00
..
rdmavt IB/rdmavt: Reset all QPs when the device is shut down 2020-02-11 11:41:32 -04:00
rxe RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq 2020-02-13 10:51:09 -04:00
siw RDMA/siw: Remove unwanted WARN_ON in siw_cm_llp_data_ready() 2020-02-11 14:33:58 -04:00
Makefile rdma/siw: addition to kernel build environment 2019-07-02 17:03:41 -03:00