OpenCloudOS-Kernel/drivers/infiniband/core
Jack Morgenstein 770b7d96cf IB/mad: Fix use-after-free in ib mad completion handling
We encountered a use-after-free bug when unloading the driver:

[ 3562.116059] BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.117233] Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862
[ 3562.118385]
[ 3562.119519] CPU: 2 PID: 23862 Comm: kworker/u13:2 Tainted: G           OE     5.1.0-for-upstream-dbg-2019-05-19_16-44-30-13 #1
[ 3562.121806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 3562.123075] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
[ 3562.124383] Call Trace:
[ 3562.125640]  dump_stack+0x9a/0xeb
[ 3562.126911]  print_address_description+0xe3/0x2e0
[ 3562.128223]  ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.129545]  __kasan_report+0x15c/0x1df
[ 3562.130866]  ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.132174]  kasan_report+0xe/0x20
[ 3562.133514]  ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.134835]  ? find_mad_agent+0xa00/0xa00 [ib_core]
[ 3562.136158]  ? qlist_free_all+0x51/0xb0
[ 3562.137498]  ? mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib]
[ 3562.138833]  ? quarantine_reduce+0x1fa/0x270
[ 3562.140171]  ? kasan_unpoison_shadow+0x30/0x40
[ 3562.141522]  ib_mad_recv_done+0xdf6/0x3000 [ib_core]
[ 3562.142880]  ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 3562.144277]  ? ib_mad_send_done+0x1810/0x1810 [ib_core]
[ 3562.145649]  ? mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib]
[ 3562.147008]  ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 3562.148380]  ? debug_object_deactivate+0x2b9/0x4a0
[ 3562.149814]  __ib_process_cq+0xe2/0x1d0 [ib_core]
[ 3562.151195]  ib_cq_poll_work+0x45/0xf0 [ib_core]
[ 3562.152577]  process_one_work+0x90c/0x1860
[ 3562.153959]  ? pwq_dec_nr_in_flight+0x320/0x320
[ 3562.155320]  worker_thread+0x87/0xbb0
[ 3562.156687]  ? __kthread_parkme+0xb6/0x180
[ 3562.158058]  ? process_one_work+0x1860/0x1860
[ 3562.159429]  kthread+0x320/0x3e0
[ 3562.161391]  ? kthread_park+0x120/0x120
[ 3562.162744]  ret_from_fork+0x24/0x30
...
[ 3562.187615] Freed by task 31682:
[ 3562.188602]  save_stack+0x19/0x80
[ 3562.189586]  __kasan_slab_free+0x11d/0x160
[ 3562.190571]  kfree+0xf5/0x2f0
[ 3562.191552]  ib_mad_port_close+0x200/0x380 [ib_core]
[ 3562.192538]  ib_mad_remove_device+0xf0/0x230 [ib_core]
[ 3562.193538]  remove_client_context+0xa6/0xe0 [ib_core]
[ 3562.194514]  disable_device+0x14e/0x260 [ib_core]
[ 3562.195488]  __ib_unregister_device+0x79/0x150 [ib_core]
[ 3562.196462]  ib_unregister_device+0x21/0x30 [ib_core]
[ 3562.197439]  mlx4_ib_remove+0x162/0x690 [mlx4_ib]
[ 3562.198408]  mlx4_remove_device+0x204/0x2c0 [mlx4_core]
[ 3562.199381]  mlx4_unregister_interface+0x49/0x1d0 [mlx4_core]
[ 3562.200356]  mlx4_ib_cleanup+0xc/0x1d [mlx4_ib]
[ 3562.201329]  __x64_sys_delete_module+0x2d2/0x400
[ 3562.202288]  do_syscall_64+0x95/0x470
[ 3562.203277]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The problem was that the MAD PD was deallocated before the MAD CQ.
There was completion work pending for the CQ when the PD got deallocated.
When the mad completion handling reached procedure
ib_mad_post_receive_mads(), we got a use-after-free bug in the following
line of code in that procedure:
   sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey;
(the pd pointer in the above line is no longer valid, because the
pd has been deallocated).

We fix this by allocating the PD before the CQ in procedure
ib_mad_port_open(), and deallocating the PD after freeing the CQ
in procedure ib_mad_port_close().

Since the CQ completion work queue is flushed during ib_free_cq(),
no completions will be pending for that CQ when the PD is later
deallocated.

Note that freeing the CQ before deallocating the PD is the practice
in the ULPs.

Fixes: 4be90bc60d ("IB/mad: Remove ib_get_dma_mr calls")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190801121449.24973-1-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-01 11:58:54 -04:00
..
Makefile RDMA/counter: Add set/clear per-port auto mode support 2019-07-05 10:22:54 -03:00
addr.c RDMA/core: Fix race when resolving IP address 2019-07-09 16:27:04 -03:00
agent.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
agent.h IB/mad: Add final OPA MAD processing 2015-06-12 14:49:18 -04:00
cache.c IB/core, ipoib: Do not overreact to SM LID change event 2019-05-07 16:06:03 -03:00
cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
cm.c IB/cm: Reduce dependency on gid attribute ndev check 2019-05-03 11:10:02 -03:00
cm_msgs.h RDMA: Use __packed annotation instead of __attribute__ ((packed)) 2019-03-25 21:14:12 -03:00
cma.c RDMA/cma: Use rdma_read_gid_attr_ndev_rcu to access netdev 2019-05-03 11:10:03 -03:00
cma_configfs.c RDMA/cma: Move cma module specific functions to cma_priv.h 2018-11-22 11:57:33 -07:00
cma_priv.h IB/cma: Define option to set ack timeout and pack tos_set 2019-02-08 16:14:21 -07:00
core_priv.h RDMA/restrack: Track driver QP types in resource tracker 2019-08-01 11:54:13 -04:00
counters.c IB/counters: Always initialize the port counter object 2019-07-25 11:45:48 -03:00
cq.c RDMA/core: Fix -Wunused-const-variable warnings 2019-07-11 11:49:55 -03:00
device.c RDMA/devices: Remove the lock around remove_client_context 2019-08-01 11:44:48 -04:00
fmr_pool.c RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
iwcm.c RDMA: Get rid of iw_cm_verbs 2019-05-03 10:56:56 -03:00
iwcm.h iw_cm: free cm_id resources on the last deref 2016-08-02 13:15:18 -04:00
iwpm_msg.c RDMA/iwpm: move kdoc comments to functions 2019-02-05 15:40:41 -07:00
iwpm_util.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
iwpm_util.h RDMA/IWPM: Support no port mapping requirements 2019-02-04 16:26:02 -07:00
mad.c IB/mad: Fix use-after-free in ib mad completion handling 2019-08-01 11:58:54 -04:00
mad_priv.h RDMA: Use __packed annotation instead of __attribute__ ((packed)) 2019-03-25 21:14:12 -03:00
mad_rmpp.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
mad_rmpp.h
mr_pool.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
multicast.c IB/core, ipoib: Do not overreact to SM LID change event 2019-05-07 16:06:03 -03:00
netlink.c RDMA/cma: Remove CM_ID statistics provided by rdma-cm module 2019-02-05 15:30:33 -07:00
nldev.c IB/core: Work on the caller socket net namespace in nldev_newlink() 2019-07-08 17:03:35 -03:00
opa_smi.h RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
packer.c IB/core: trivial prink cleanup. 2016-03-03 10:20:25 -05:00
rdma_core.c uverbs: Convert idr to XArray 2019-04-25 12:27:11 -03:00
rdma_core.h RDMA/core: Clear out the udata before error unwind 2019-05-27 14:35:26 -03:00
restrack.c RDMA/restrack: Make is_visible_in_pid_ns() as an API 2019-07-05 10:22:54 -03:00
restrack.h RDMA/restrack: Make is_visible_in_pid_ns() as an API 2019-07-05 10:22:54 -03:00
roce_gid_mgmt.c drivers: use in_dev_for_each_ifa_rtnl/rcu 2019-06-02 18:06:26 -07:00
rw.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
sa.h RDMA/core: Annotate timeout as unsigned long 2018-10-16 13:34:01 -04:00
sa_query.c 5.2 Merge Window pull request 2019-05-09 09:02:46 -07:00
security.c RDMA/device: Consolidate ib_device per_port data into one place 2019-02-19 10:13:39 -07:00
smi.c IB: Add rdma_cap_ib_switch helper and use where appropriate 2015-07-14 13:20:08 -04:00
smi.h RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
sysfs.c RDMA/nldev: Allow get default counter statistics through RDMA netlink 2019-07-05 10:22:55 -03:00
ucma.c RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV 2019-06-18 22:44:08 -04:00
ud_header.c IB/core: trivial prink cleanup. 2016-03-03 10:20:25 -05:00
umem.c RDMA: Check umem pointer validity prior to release 2019-06-20 15:17:59 -04:00
umem_odp.c RDMA/odp: Do not leak dma maps when working with huge pages 2019-06-20 21:52:47 -04:00
user_mad.c IB/core: Add mitigation for Spectre V1 2019-08-01 11:34:11 -04:00
uverbs.h uverbs: Convert idr to XArray 2019-04-25 12:27:11 -03:00
uverbs_cmd.c RDMA/uverbs: remove redundant assignment to variable ret 2019-07-04 14:06:47 -03:00
uverbs_ioctl.c mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options 2019-07-12 11:05:46 -07:00
uverbs_main.c RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV 2019-06-18 22:44:08 -04:00
uverbs_marshall.c IB/cm: Replace members of sa_path_rec with 'struct sgid_attr *' 2018-06-25 14:19:57 -06:00
uverbs_std_types.c IB: Remove 'uobject->context' dependency in object destroy APIs 2019-04-01 14:59:35 -03:00
uverbs_std_types_counters.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_cq.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
uverbs_std_types_device.c IB/uverbs: Fix ioctl query port to consider device disassociation 2019-01-25 11:58:06 -07:00
uverbs_std_types_dm.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_flow_action.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_mr.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
uverbs_uapi.c RDMA: Move driver_id into struct ib_device_ops 2019-06-10 16:56:02 -03:00
verbs.c RDMA/counter: Add "auto" configuration mode support 2019-07-05 10:22:54 -03:00