OpenCloudOS-Kernel/drivers/net/ethernet/broadcom/bnxt
Edwin Peer 20d7d1c5c9 bnxt_en: reliably allocate IRQ table on reset to avoid crash
The following trace excerpt corresponds with a NULL pointer dereference
of 'bp->irq_tbl' in bnxt_setup_inta() on an Aarch64 system after many
device resets:

    Unable to handle kernel NULL pointer dereference at ... 000000d
    ...
    pc : string+0x3c/0x80
    lr : vsnprintf+0x294/0x7e0
    sp : ffff00000f61ba70 pstate : 20000145
    x29: ffff00000f61ba70 x28: 000000000000000d
    x27: ffff0000009c8b5a x26: ffff00000f61bb80
    x25: ffff0000009c8b5a x24: 0000000000000012
    x23: 00000000ffffffe0 x22: ffff000008990428
    x21: ffff00000f61bb80 x20: 000000000000000d
    x19: 000000000000001f x18: 0000000000000000
    x17: 0000000000000000 x16: ffff800b6d0fb400
    x15: 0000000000000000 x14: ffff800b7fe31ae8
    x13: 00001ed16472c920 x12: ffff000008c6b1c9
    x11: ffff000008cf0580 x10: ffff00000f61bb80
    x9 : 00000000ffffffd8 x8 : 000000000000000c
    x7 : ffff800b684b8000 x6 : 0000000000000000
    x5 : 0000000000000065 x4 : 0000000000000001
    x3 : ffff0a00ffffff04 x2 : 000000000000001f
    x1 : 0000000000000000 x0 : 000000000000000d
    Call trace:
    string+0x3c/0x80
    vsnprintf+0x294/0x7e0
    snprintf+0x44/0x50
    __bnxt_open_nic+0x34c/0x928 [bnxt_en]
    bnxt_open+0xe8/0x238 [bnxt_en]
    __dev_open+0xbc/0x130
    __dev_change_flags+0x12c/0x168
    dev_change_flags+0x20/0x60
    ...

Ordinarily, a call to bnxt_setup_inta() (not in trace due to inlining)
would not be expected on a system supporting MSIX at all. However, if
bnxt_init_int_mode() does not end up being called after the call to
bnxt_clear_int_mode() in bnxt_fw_reset_close(), then the driver will
think that only INTA is supported and bp->irq_tbl will be NULL,
causing the above crash.

In the error recovery scenario, we call bnxt_clear_int_mode() in
bnxt_fw_reset_close() early in the sequence. Ordinarily, we will
call bnxt_init_int_mode() in bnxt_hwrm_if_change() after we
reestablish communication with the firmware after reset.  However,
if the sequence has to abort before we call bnxt_init_int_mode() and
if the user later attempts to re-open the device, then it will cause
the crash above.

We fix it in 2 ways:

1. Check for bp->irq_tbl in bnxt_setup_int_mode(). If it is NULL, call
bnxt_init_int_mode().

2. If we need to abort in bnxt_hwrm_if_change() and cannot complete
the error recovery sequence, set the BNXT_STATE_ABORT_ERR flag.  This
will cause more drastic recovery at the next attempt to re-open the
device, including a call to bnxt_init_int_mode().
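
The two fixes above can be illustrated with a small user-space model. This is
only a sketch and not the driver code: every mock_* name, the struct layout,
and the abort_err field (standing in for BNXT_STATE_ABORT_ERR) are hypothetical
simplifications, assuming that clearing the interrupt mode frees the IRQ table
and that an aborted recovery skips the re-initialisation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; not the real bnxt structures. */
struct mock_irq {
	char name[32];
};

struct mock_bp {
	struct mock_irq *irq_tbl; /* NULL once the interrupt mode is cleared */
	bool abort_err;           /* models the BNXT_STATE_ABORT_ERR flag */
	size_t nr_irqs;           /* must be non-zero in this model */
};

/* Models bnxt_init_int_mode(): (re)allocate the IRQ table. */
static int mock_init_int_mode(struct mock_bp *bp)
{
	free(bp->irq_tbl);
	bp->irq_tbl = calloc(bp->nr_irqs, sizeof(*bp->irq_tbl));
	return bp->irq_tbl ? 0 : -1;
}

/* Models bnxt_clear_int_mode(): release the IRQ table. */
static void mock_clear_int_mode(struct mock_bp *bp)
{
	free(bp->irq_tbl);
	bp->irq_tbl = NULL;
}

/* Fix 1: allocate the table on demand instead of dereferencing NULL. */
static int mock_setup_int_mode(struct mock_bp *bp)
{
	if (!bp->irq_tbl) {
		int rc = mock_init_int_mode(bp);

		if (rc)
			return rc;
	}
	snprintf(bp->irq_tbl[0].name, sizeof(bp->irq_tbl[0].name),
		 "%s-%d", "mock", 0);
	return 0;
}

/* Fix 2: if recovery cannot complete, record that fact for the next open. */
static int mock_if_change(struct mock_bp *bp, bool fw_ok)
{
	if (!fw_ok) {
		bp->abort_err = true;
		return -1;
	}
	return mock_init_int_mode(bp);
}

/* Models the re-open path: a pending abort forces a full re-init first. */
static int mock_open(struct mock_bp *bp)
{
	if (bp->abort_err) {
		int rc = mock_init_int_mode(bp);

		if (rc)
			return rc;
		bp->abort_err = false;
	}
	return mock_setup_int_mode(bp);
}
```

In this model, calling mock_clear_int_mode(), then a failing
mock_if_change(), then mock_open() leaves a valid irq_tbl. Either guard alone
is enough to avoid the NULL dereference here; the commit applies both.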

Fixes: 3bc7d4a352 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-26 15:50:23 -08:00
Makefile treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
bnxt.c bnxt_en: reliably allocate IRQ table on reset to avoid crash 2021-02-26 15:50:23 -08:00
bnxt.h bnxt_en: Reply to firmware's echo request async message. 2021-02-14 17:27:51 -08:00
bnxt_coredump.h bnxt_en: Add support for ethtool get dump. 2018-08-05 17:08:26 -07:00
bnxt_dcb.c bnxt_en: Refactor statistics code and structures. 2020-07-27 11:47:33 -07:00
bnxt_dcb.h bnxt_en: Do not use the CNP CoS queue for networking traffic. 2018-08-05 17:08:26 -07:00
bnxt_debugfs.c bnxt: no need to check return value of debugfs_create functions 2019-08-10 15:25:47 -07:00
bnxt_debugfs.h
bnxt_devlink.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-16 17:51:13 -08:00
bnxt_devlink.h bnxt_en: Refactor bnxt_dl_info_get(). 2020-10-12 14:27:03 -07:00
bnxt_dim.c linux/dim: Move implementation to .c files 2019-06-25 13:46:39 -07:00
bnxt_ethtool.c bnxt_en: Clear DEFRAG flag in firmware message when retry flashing. 2021-01-12 20:05:35 -08:00
bnxt_ethtool.h devlink: move request_firmware out of driver 2020-11-19 21:40:57 -08:00
bnxt_fw_hdr.h
bnxt_hsi.h bnxt_en: Update firmware interface spec to 1.10.2.16. 2021-02-14 17:27:50 -08:00
bnxt_nvm_defs.h
bnxt_sriov.c net: don't include ethtool.h from netdevice.h 2020-11-23 17:27:04 -08:00
bnxt_sriov.h bnxt_en: Retain user settings on a VF after RESET_NOTIFY event. 2019-08-30 14:02:19 -07:00
bnxt_tc.c net: bnxt: don't complain if TC flower can't be supported 2020-07-17 18:26:20 -07:00
bnxt_tc.h bnxt_en: Fix array overrun in bnxt_fill_l2_rewrite_fields(). 2019-11-13 14:28:30 -08:00
bnxt_ulp.c bnxt_en: Improve stats context resource accounting with RDMA driver loaded. 2021-01-12 20:05:35 -08:00
bnxt_ulp.h bnxt_en: Add doorbell information to bnxt_en_dev struct. 2020-05-04 10:44:11 -07:00
bnxt_vfr.c net/broadcom: Clean broadcom code from driver versions 2020-03-03 17:54:53 -08:00
bnxt_vfr.h devlink: Add extack for eswitch operations 2018-10-03 16:17:58 -07:00
bnxt_xdp.c net, xdp: Introduce xdp_prepare_buff utility routine 2021-01-08 13:39:24 -08:00
bnxt_xdp.h bnxt_en: optimized XDP_REDIRECT support 2019-07-08 15:15:24 -07:00