OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Uma Krishnan	e0f76ad130	scsi: cxlflash: Yield to active send threads The following Oops may be encountered if the device is reset, i.e. EEH recovery, while there is heavy I/O traffic: 59:mon> t [c000200db64bb680] c008000009264c40 cxlflash_queuecommand+0x3b8/0x500 [cxlflash] [c000200db64bb770] c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0 [c000200db64bb7f0] c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0 [c000200db64bb900] c00000000067f528 __blk_run_queue+0x68/0xb0 [c000200db64bb930] c00000000067ab80 __elv_add_request+0x140/0x3c0 [c000200db64bb9b0] c00000000068daac blk_execute_rq_nowait+0xec/0x1a0 [c000200db64bba00] c00000000068dbb0 blk_execute_rq+0x50/0xe0 [c000200db64bba50] c0000000006b2040 sg_io+0x1f0/0x520 [c000200db64bbaf0] c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610 [c000200db64bbc20] c000000000926208 sd_ioctl+0x118/0x280 [c000200db64bbcc0] c00000000069f7ac blkdev_ioctl+0x7fc/0xe30 [c000200db64bbd20] c000000000439204 block_ioctl+0x84/0xa0 [c000200db64bbd40] c0000000003f8514 do_vfs_ioctl+0xd4/0xa00 [c000200db64bbde0] c0000000003f8f04 SyS_ioctl+0xc4/0x130 [c000200db64bbe30] c00000000000b184 system_call+0x58/0x6c When there is no room to send the I/O request, the cached room is refreshed by reading the memory mapped command room value from the AFU. The AFU register mapping is refreshed during a reset, creating a race condition that can lead to the Oops above. During a device reset, the AFU should not be unmapped until all the active send threads quiesce. An atomic counter, cmds_active, is currently used to track internal AFU commands and quiesce during reset. This same counter can also be used for the active send threads. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Xiaofei Tan	2f6bca202b	scsi: hisi_sas: add check of device in hisi_sas_task_exec() Currently we don't check that device is not gone before dereferencing its elements in the function hisi_sas_task_exec() (specifically, the DQ pointer). This patch fixes this issue by filling in the DQ pointer in hisi_sas_task_prep() after we check that the device pointer is still safe to reference. [mkp: typo] Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Xiang Chen	e85d93b212	scsi: hisi_sas: Use device lock to protect slot alloc/free The IPTT of a slot is unique, and we currently use hisi_hba lock to protect it. Now slot is managed on hisi_sas_device.list, so use DQ lock to protect for allocating and freeing the slot. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Xiang Chen	fa222db0b0	scsi: hisi_sas: Don't lock DQ for complete task sending Currently we lock the DQ to protect whole delivery process. So this stops us building slots for the same queue in parallel, and can affect performance. To optimise it, only lock the DQ during special periods, specifically when allocating a slot from the DQ and when delivering a slot to the HW. This approach is now safe, thanks to the previous patches to ensure that we always deliver a slot to the HW once allocated. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Xiang Chen	3de0026dad	scsi: hisi_sas: allocate slot buffer earlier Currently we allocate the slot's memory buffer after allocating the DQ slot. To aid DQ lockout reduction, and allow slots to be built in parallel, move this step (which can fail) prior to allocating the slot. Also a stray spin_unlock_irqrestore() is removed from internal task exec function. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Xiang Chen	a2b3820bdd	scsi: hisi_sas: make return type of prep functions void Since the task prep functions now should not fail, adjust the return types to void. In addition, some checks in the task prep functions are relocated to the main module; this is specifically the check for the number of elements in an sg list exceeded the HW SGE limit. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Xiang Chen	7eee4b9218	scsi: hisi_sas: relocate smp sg map Currently we use DQ lock to protect delivery of DQ entry one by one. To optimise to allow more than one slot to be built for a single DQ in parallel, we need to remove the DQ lock when preparing slots, prior to delivery. To achieve this, we rearrange the slot build order to ensure that once we allocate a slot for a task, we do cannot fail to deliver the task. In this patch, we rearrange the slot building for SMP tasks to ensure that sg mapping part (which can fail) happens before we allocate the slot in the DQ. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 11:22:09 -04:00
Alim Akhtar	0d846e703d	scsi: ufs: make ufshcd_config_pwr_mode of non-static func This makes ufshcd_config_pwr_mode non-static so that other vendors like exynos can use it. Signed-off-by: Seungwon Jeon <essuuj@gmail.com> Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:50:59 -04:00
Alim Akhtar	4404c5de77	scsi: ufs: add quirk to enable host controller without hce Some host controllers don't support host controller enable via HCE. Signed-off-by: Seungwon Jeon <essuuj@gmail.com> Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:47:10 -04:00
Alim Akhtar	5ac6abc941	scsi: ufs: add quirk to disallow reset of interrupt aggregation Some host controllers support interrupt aggregation but don't allow resetting counter and timer in software. Signed-off-by: Seungwon Jeon <essuuj@gmail.com> Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:44:52 -04:00
Alim Akhtar	1399c5b02c	scsi: ufs: add quirk to fix mishandling utrlclr/utmrlclr In the right behavior, setting the bit to '0' indicates clear and '1' indicates no change. If host controller handles this the other way, UFSHCI_QUIRK_BROKEN_REQ_LIST_CLR can be used. [mkp: typo] Signed-off-by: Seungwon Jeon <essuuj@gmail.com> Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Reviewed-by: "Asutosh Das (asd)" <asutoshd@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:43:19 -04:00
Kees Cook	bbe21d7a97	scsi: ufs: ufshcd: Remove VLA usage On the quest to remove all VLAs from the kernel[1] this moves buffers off the stack. In the second instance, this collapses two separately allocated buffers into a single buffer, since they are used consecutively, which saves 256 bytes (QUERY_DESC_MAX_SIZE + 1) of stack space. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:36:09 -04:00
Alexander Potapenko	a45b599ad8	scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() This shall help avoid copying uninitialized memory to the userspace when calling ioctl(fd, SG_IO) with an empty command. Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:26:01 -04:00
Christoph Hellwig	b8b1483d79	sg: simplify procfs code Use remove_proc_subtree to remove the whole subtree on cleanup, and unwind the registration loop into individual calls. Switch to use proc_create_seq where applicable. Also don't bother handling proc_create* failures - the driver works perfectly fine without the proc files, and the cleanup will handle missing files gracefully. Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-05-16 07:24:30 +02:00
Christoph Hellwig	f7680bec04	megaraid: simplify procfs code Use remove_proc_subtree to remove the whole subtree on cleanup, and unwind the registration loop into individual calls. Switch to use proc_create_single. Also don't bother handling proc_create* failures - the driver works perfectly fine without the proc files, and the cleanup will handle missing files gracefully. Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-05-16 07:24:30 +02:00
Randy Dunlap	a406b0a069	scsi: core: clean up generated file scsi_devinfo_tbl.c "make clean" should remove the generated file "scsi_devinfo_tbl.c", so list it in the clean-files variable so that the file gets cleaned up. Fixes: `345e29608b` ("scsi: scsi: Export blacklist flags to sysfs") Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:48:03 -04:00
Wen Xiong	81471b07b7	scsi: ipr: new IOASC update This patch adds new adapter error log for P9 system with the new AZ SAS cable. Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:42:53 -04:00
Colin Ian King	eaa6d0df67	scsi: esas2r: fix spelling mistake: "requestss" -> "requests" Trivial fix to spelling mistake in esas2r_debug message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:40:30 -04:00
Kees Cook	97e066184b	scsi: libosd: Remove VLA usage On the quest to remove all VLAs from the kernel[1] this rearranges the code to avoid a VLA warning under -Wvla (gcc doesn't recognize "const" variables as not triggering VLA creation). Additionally cleans up variable naming to avoid 80 character column limit. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Boaz Harrosh <ooo@electrozaur.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:32:45 -04:00
Christoph Hellwig	0eb0b63c1d	block: consistently use GFP_NOIO instead of __GFP_NORECLAIM Same numerical value (for now at least), but a much better documentation of intent. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:18 -06:00
Christoph Hellwig	4accf5fc79	block: pass an explicit gfp_t to get_request blk_old_get_request already has it at hand, and in blk_queue_bio, which is the fast path, it is constant. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:14 -06:00
Christoph Hellwig	ff005a0662	block: sanitize blk_get_request calling conventions Switch everyone to blk_get_request_flags, and then rename blk_get_request_flags to blk_get_request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:12 -06:00
Christoph Hellwig	ac613e4566	scsi/osd: remove the gfp argument to osd_start_request Always GFP_KERNEL, and keeping it would cause serious complications for the next change. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:09 -06:00
Wenwen Wang	9899e4d352	scsi: 3w-xxxx: fix a missing-check bug In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from the userspace pointer 'argp' and saved to the kernel object 'data_buffer_length'. Then a security check is performed on it to make sure that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, an error code -EINVAL is returned. If the security check is passed, the entire ioctl command is copied again from the 'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various operations are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in userspace, a malicious userspace process can race to change the buffer length between the two copies. This way, the user can bypass the security check and inject invalid data buffer length. This can cause potential security issues in the following execution. This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to avoid the above issues. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:32:43 -04:00
Wenwen Wang	c9318a3e02	scsi: 3w-9xxx: fix a missing-check bug In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the userspace pointer 'argp' and saved to the kernel object 'driver_command'. Then a security check is performed on the data buffer size indicated by 'driver_command', which is 'driver_command.buffer_length'. If the security check is passed, the entire ioctl command is copied again from the 'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various operations are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in userspace, a malicious userspace process can race to change the buffer size between the two copies. This way, the user can bypass the security check and inject invalid data buffer size. This can cause potential security issues in the following execution. This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open()t o avoid the above issues. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:32:18 -04:00
Dan Carpenter	27e833daba	scsi: megaraid: silence a static checker bug If we had more than 32 megaraid cards then it would cause memory corruption. That's not likely, of course, but it's handy to enforce it and make the static checker happy. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:29:00 -04:00
Colin Ian King	7afc0ce912	scsi: lpfc: fix spelling mistakes: "mabilbox" and "maibox" Trivial fix to spelling mistakes in lpfc_printf_log log message "mabilbox" -> "mailbox" "maibox" -> "mailbox" Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:26:44 -04:00
Andrei Vagin	4b83cb8b06	scsi: qla2xxx: remove the unused tcm_qla2xxx_cmd_wq Signed-off-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:24:25 -04:00
Xiaofei Tan	f70c1251de	scsi: hisi_sas: workaround a v3 hw hilink bug There is an SoC bug of v3 hw development version. When hot- unplugging a directly attached disk, the PHY down interrupt may not happen. It is very easy to appear on some boards. When this issue occurs, the controller will receive many invalid dword frames, and the "alos" fields of register HILINK_ERR_DFX can indicate that disk was unplugged. As an workaround solution, this patch detects this issue in the channel interrupt, and workaround it by following steps: - Disable the PHY - Clear error code and interrupt - Enable the PHY Then the HW will reissue PHY down interrupt. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
John Garry	9b8addf302	scsi: hisi_sas: add readl poll timeout helper wrappers It is common to use readl poll timeout helpers in the driver, so create custom wrappers. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiaofei Tan	bf081d5da4	scsi: hisi_sas: remove redundant handling to event95 for v3 Event95 is used for DFX purpose. The relevant bit for this interrupt in the ENT_INT_SRC_MSK3 register has been disabled, so remove the processing. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	9413532788	scsi: hisi_sas: config ATA de-reset as an constrained command for v3 hw As a unconstrained command, a command can be sent to SATA disk even if SATA disk status is BUSY, ERR or DRQ. If an ATA reset assert is successful but ATA reset de-assert fails, then it will retry the reset de-assert. If reset de- assert retry is successful, we think it is okay to probe the device but actually it still has Err status. Apparently we need to retry the ATA reset assertion and de- assertion instead for this mentioned scenario. As such, we config ATA reset assert as a constrained command, if ATA reset de-assert fails, then ATA reset de-assert retry will also fail. Then we will retry the proper process of ATA reset assert and de-assert again. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	c2c1d9ded0	scsi: hisi_sas: update PHY linkrate after a controller reset After the controller is reset, we currently may not honour the PHY max linkrate set via sysfs, in that after a reset we always revert to max linkrate of 12Gbps, ignoring the value set via sysfs. This patch modifies to policy to set the programmed PHY linkrate, honouring the max linkrate programmed via sysfs. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
John Garry	6f7c32d605	scsi: hisi_sas: stop controller timer for reset We should only have the timer enabled after PHY up after controller reset, so disable prior to reset. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	c6ef895472	scsi: hisi_sas: check sas_dev gone earlier in hisi_sas_abort_task() It is possible to dereference a NULL-pointer in hisi_sas_abort_task() in special scenario when the device has been removed. If an SMP task times-out, it will call hisi_sas_abort_task() to recover. And currently there is a check in hisi_sas_abort_task() to avoid the situation of processing the abort for the removed device. However we have an ordering problem, in that we may reference a task for the removed device before checking if the device has been removed. Fix this by only referencing the sas_dev after we know it is still present. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	a14da7a20d	scsi: hisi_sas: fix PI memory size There are 28 bytes of protection information record of SSP for v3 hw, 16 bytes for v2 hw, and probably 24 for v1 hw (forgotten now). So use a value big enough in hisi_sas_command_table_ssp.prot to cover all cases. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	cd938e535e	scsi: hisi_sas: check host frozen before calling "done" function When the host is frozen in SCSI EH state, at any point after the LLDD sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free the task; see sas_scsi_find_task(). This puts the LLDD in a difficult position, in that once it sets SAS_TASK_STATE_DONE for the task state it should not reference the sas_task again. But the LLDD needs will check the sas_task indirectly in calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done() (to check if the host is frozen state actually). And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after task->task_done() is called (as the sas_task is free'd at this point). This situation would seem to be a problem made by libsas. To work around, check in the LLDD whether the host is in frozen state to ensure it is ok to call task->task_done() function. If in the frozen state, we rely on SCSI EH and libsas to free the sas_task directly. We do not do this for the following IO types: - SMP - they are managed in libsas directly, outside SCSI EH - Any internally originated IO, for similar reason Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	b81b6cce58	scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice If the SCSI host enters EH, any pending IO will be processed by SCSI EH. However it is possible that SCSI EH will try to abort the IO and also at the same time the IO completes in the driver. In this situation there is a small chance of freeing the sas_task twice. Then if another IO re-uses freed sas_task before the second time of free'ing sas_task, it is possible to free incorrect sas_task. To avoid this situation, add some checks to increase reliability. The sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually protect the LLDD and libsas freeing the task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:43 -04:00
Xiang Chen	24cf43612d	scsi: hisi_sas: optimise the usage of DQ locking In the DQ tasklet processing it is not necessary to take the DQ lock, as there is no contention between adding slots to the CQ and removing slots from the matching DQ. In addition, since we run each DQ in a separate tasklet context, there would be no possible contention between DQ processing running for the same queue in parallel. It is still necessary to take hisi_hba lock when free'ing slots. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:43 -04:00
James Smart	3e21d1cb0f	scsi: lpfc: Comment cleanup regarding Broadcom copyright header Fix small formatting and wording nits in Broadcom copyright header Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	e65d8c3411	scsi: lpfc: update driver version to 12.0.0.3 Update the driver version to 12.0.0.3 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	11f0e34ff4	scsi: lpfc: Enhance log messages when reporting CQE errors Enhance log messages for CQEs as they were not reporting certain fields. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	44c2757b76	scsi: lpfc: Fix up log messages and stats counters in IO submit code path Fix up log messages and add an fcp error stat counter in the IO submit code path to make diagnosing problems easier Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	d38f33b304	scsi: lpfc: Driver NVME load fails when CPU cnt > WQ resource cnt If the cpu count is larger than the number of WQ resources available, adapter attachment eventually failes due to a WQ_CREATE failure. Calculate the number of WQs desired (which initializes to cpu count) after accounting for the number of queues the adapter supports and the number allocated to SCSI and the control/ELS path, and scale down if necessary. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	23288b78a1	scsi: lpfc: Handle new link fault code returned by adapter firmware. The driver encounters a link event ACQE with a fault code it doesn't recognize, it logs an "Invalid" fault type and futher treats the unknown value as a mailbox command failure. First off, there is no "invalid" value, only values that are unknown. Secondly, the fault code doesn't indicate status - the rest of the ACQE contains that status so there is no reason to "fail the commands". Change the "Invalid" to "Unknown". There is no "invalid" code value. Separate fault code parsing and message genaration from any mbx handling status. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	a72d56b2a6	scsi: lpfc: Correct fw download error message In situations when the firmware image in inappropriate for the chip type, initial validation checks were light, allowing the checks to pass, thus allowing the firmware to be downloaded. Eventually, after the download, the chip rejects the firmware but it is logged as a generic firmware download error. Revise the initial checks to validate the image vs asic type so that the correct message is displayed and the download process is avoided. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	48f8fdb4b4	scsi: lpfc: enhance LE data structure copies to hardware The driver builds the control structures in host memory using definitions that are based on 32-bit words. After building the structure it is then written to the adapter. This patch slightly optimizes LE hosts by copying the structures via 64-bit copies. This is doable as the adapter interface is LE thus there is no byteswapping as the copy is performed. The same optimization would be nice on BE systems, but when byteswapping occurs, it swaps 32-bit words as well, thus trashing the control structure. Given amount of code that is dependent upon the 32-bit word definition, it was decided to not change things for the minor optimization. Thus PPC 64-bit systems sticks with doing 32-bit copies. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:15 -04:00
James Smart	cd2400715c	scsi: lpfc: Change IO submit return to EBUSY if remote port is recovering I/O submission paths in the lpfc nvme path are rejecting the io with an error code that reflects back to the callee as a hard io failure. Many of these conditions are transient and would likely resolve if retried. Correct by returning -EBUSY, which the FC transport triggers off of to return busy status codes to the blk-mq layer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:15 -04:00
Chad Dupuis	ab7ad49d01	scsi: qedf: Update version number to 8.33.16.20 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	5d1c8b5ba0	scsi: qedf: Update copyright for 2018 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	f3690a89f9	scsi: qedf: Add more defensive checks for concurrent error conditions During an uplink toggle test all error handling is done via timeout and firmware error conditions which can occur concurrently: - SCSI layer timeouts - Error detect CQEs - Firmware detected underruns - ABTS timeouts All these concurrent events require more defensive checks in the driver including: - Check both internally and externally generated aborts to make sure the xid is not already been aborted in another context or in cleanup. - Check back pointers in qedf_cmd_timeout to verify the context of the io_req, fcport and qedf_ctx - Check rport state in host reset handler to not reset the whole host if the rport is already uploaded or in the process of relogin - Check to state for an fcport before initiating a middle path ELS request Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	4f4616ceeb	scsi: qedf: Set the UNLOADING flag when removing a vport Similar to what we do when we remove a PCI function, set the QEDF_UNLOADING flag to prevent any requests from being queued while a vport is being deleted. This prevents any requests from getting stuck in limbo when the vport is unloaded or deleted. Fixes the crash: PID: 106676 TASK: ffff9a436aa90000 CPU: 12 COMMAND: "multipathd" #0 [ffff9a43567d3550] machine_kexec+522 at ffffffffaca60b2a #1 [ffff9a43567d35b0] __crash_kexec+114 at ffffffffacb13512 #2 [ffff9a43567d3680] crash_kexec+48 at ffffffffacb13600 #3 [ffff9a43567d3698] oops_end+168 at ffffffffad117768 #4 [ffff9a43567d36c0] no_context+645 at ffffffffad106f52 #5 [ffff9a43567d3710] __bad_area_nosemaphore+116 at ffffffffad106fe9 #6 [ffff9a43567d3760] bad_area+70 at ffffffffad107379 #7 [ffff9a43567d3788] __do_page_fault+1247 at ffffffffad11a8cf #8 [ffff9a43567d37f0] do_page_fault+53 at ffffffffad11a915 #9 [ffff9a43567d3820] page_fault+40 at ffffffffad116768 [exception RIP: qedf_init_task+61] RIP: ffffffffc0e13c2d RSP: ffff9a43567d38d0 RFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffbe920472c738 RCX: ffff9a434fa0e3e8 RDX: ffff9a434f695280 RSI: ffffbe920472c738 RDI: ffff9a43aa359c80 RBP: ffff9a43567d3950 R8: 0000000000000c15 R9: ffff9a3fb09b9880 R10: ffff9a434fa0e3e8 R11: ffff9a43567d35ce R12: 0000000000000000 R13: ffff9a434f695280 R14: ffff9a43aa359c80 R15: ffff9a3fb9e005c0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	92bbccdf72	scsi: qedf: Add additional checks when restarting an rport due to ABTS timeout There are a couple of kernel cases when we restart a remote port due to ABTS timeout that we need to handle: 1. Flush any outstanding ABTS requests when flushing I/Os so that we do not hold up the eh_abort handler indefinitely causing process hangs. 2. Check if we are currently uploading a connection before issuing an ABTS. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	96673e1e22	scsi: qedf: If qed fails to enable MSI-X fail PCI probe Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	65b7beca42	scsi: qedf: Honor default_prio module parameter even if DCBX does not converge Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	4b9b7fabb3	scsi: qedf: Improve firmware debug dump handling Get all firmware debug data instead of just a grc dump. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Saurav Kashyap	f9a4a7f2c0	scsi: qedf: Remove setting DCBX pending during soft context reset PROBLEM DESCRIPTION: According to the logs, STAG was changing and it was triggering soft reset. In soft reset we used to virtual link down and up and also we were disabling DCBx flag. Since this was virtual link flap, DCBx never used to converge again. SOLUTION: Code change is to remove disabling DCBx flag from soft reset. Signed-off-by: Saurav Kashyap <saurav.kashyap@cavium.com> Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	8025c84208	scsi: qedf: Add task id to kref_get_unless_zero() debug messages when flushing requests Helps to corroborate which requests we can't get reference on and if it's real bug or not. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	3f9de7f041	scsi: qedf: Check if link is already up when receiving a link up event from qed [mkp: typo] Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	a8f192bce1	scsi: qedf: Return request as DID_NO_CONNECT if MSI-X is not enabled Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	adf4884252	scsi: qedf: Release RRQ reference correctly when RRQ command times out When an RRQ request times out the reference is not getting decremented correctly as there are still ELS commands leftover when we flush any pending I/Os during offload: [ 281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a. ... [ 281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a. [ 281.788772] [0000:21:00.3]:[qedf_rrq_compl:182]:4: Entered. [ 281.788774] [0000:21:00.3]:[qedf_rrq_compl:200]:4: rrq_compl: orig io = ffffc90004c556f8, orig xid = 0x81b, rrq_xid = 0x96a, refcount=1 ... [ 331.448032] [0000:21:00.3]:[qedf_flush_els_req:1512]:4: Flushing ELS request xid=0x96a refcount=2. The fix is to call kref_put on the rrq_req in case of timeout as the timeout handler will call rrq_compl directly vs. a normal completion where it is call from els_compl. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	84b2ba6e42	scsi: qedf: Honor priority from DCBX FCoE App tag We currently hard code the priority in the 8021q tag to 3 for FCoE traffic. The vast majority of the time this is fine but if the priority is something else besides 3, any VLAN ID comparison either in the non-offload path or offload path will fail and cause dropped frames where none are expected. Change the behavior so that the driver default is 3 if we do not get any DCBX convergence. If DCBX does converge, then set the FIP/FCoE priority in the following manner: 1. If the qedf_default_prio modparam is set use that 2. If the DCBX FCoE priority is not in range (0..7) use 3 3. Use the DCBX FCoE priority we get in the driver's DCBX handler Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	ba17d379c2	scsi: qedf: Add dcbx_not_wait module parameter so we won't wait for DCBX convergence to start discovery This module parameter is to work around cases where we do not receive the DCBX handler notification from qed but discovery is still possible if we send out a FIP VLAN request irregardless of the DCBX state. [mkp: zeroday warning] Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	a93755cf7e	scsi: qedf: Sanity check FCoE/FIP priority value to make sure it's between 0 and 7 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	766639cab0	scsi: qedf: Add check for offload before flushing I/Os for target We need to check that a fcport is offloaded before we try to flush any requests. No doing so could lead to undefined results and most likely a crash. Fixes the oops: [ 343.971886] [0000:42:00.3]:[qedf_execute_tmf:2070]:8: wait for tm_cmpl timeout! [ 343.971933] BUG: unable to handle kernel paging request at 00000000000024a8 [ 343.971949] IP: [<ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf] [ 343.971952] PGD 42c569067 PUD 4160fe067 PMD 0 [ 343.971954] Oops: 0000 [#1] SMP [ 343.972008] Modules linked in: qedf(OEX) qed(OEX) bnx2i cnic fuse af_packet iscsi_ibft msr xfs intel_rapl sb_edac edac_core x86_pkg_temp_thermal bnx2x geneve intel_powerclamp vxlan coretemp ipmi_ssif ipmi_devintf kvm_intel kvm libiscsi joydev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel tg3 ip6_udp_tunnel udp_tunnel mdio libcrc32c iTCO_wdt scsi_transport_iscsi uio drbg iTCO_vendor_support iscsi_boot_sysfs dcdbas(X) ipmi_si ansi_cprng aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper ptp pps_core pcspkr libphy lpc_ich mfd_core cryptd fjes wmi ipmi_msghandler button crc8 libfcoe libfc scsi_transport_fc mei_me mei shpchp processor acpi_pad btrfs xor hid_generic usbhid raid6_pq sd_mod sr_mod cdrom mgag200 crc32c_intel i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt [ 343.972020] fb_sys_fops ttm ahci ehci_pci libahci ehci_hcd drm libata usbcore megaraid_sas usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: qedf] [ 343.972022] Supported: Yes, External [ 343.972026] CPU: 30 PID: 12777 Comm: sg_reset Tainted: G W OE X 4.4.73-5-default #1 [ 343.972027] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.1.3 11/20/2013 [ 343.972029] task: ffff88018dfc0e80 ti: ffff88042bd7c000 task.ti: ffff88042bd7c000 [ 343.972036] RIP: 0010:[<ffffffffa06b8cc6>] [<ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf] [ 343.972038] RSP: 0018:ffff88042bd7fbe0 EFLAGS: 00010286 [ 343.972039] RAX: 0000000000000000 RBX: ffff88042ce37800 RCX: 0000000000000400 [ 343.972040] RDX: 000000000000060e RSI: ffffffffa06be830 RDI: ffff8807e5072cc0 [ 343.972041] RBP: 0000000000001000 R08: ffffffffa06bff4d R09: ffff88018dd84580 [ 343.972042] R10: 000000000000018b R11: 0000000000000002 R12: 0000000000002003 [ 343.972043] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8807e5072cc0 [ 343.972046] FS: 00007fc1c8809700(0000) GS:ffff88042fbc0000(0000) knlGS:0000000000000000 [ 343.972048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.972049] CR2: 00000000000024a8 CR3: 00000004236ec000 CR4: 00000000001406e0 [ 343.972050] Stack: [ 343.972053] 504c78750607e154 ffffffff810a7d10 ffff88042ce37800 0000000000000010 [ 343.972055] 0000000000002003 ffff8807ff480c48 ffff8807e5072cc0 ffffc90004ec4ff8 [ 343.972057] ffffffffa06b9b86 ffff880800000010 0000000000000282 ffff88042ce37800 [ 343.972058] Call Trace: [ 343.972094] [<ffffffffa06b9b86>] qedf_initiate_tmf+0x346/0x3e0 [qedf] [ 343.972120] [<ffffffffa000fa06>] scsi_try_bus_device_reset+0x26/0x40 [scsi_mod] [ 343.972133] [<ffffffffa001038e>] scsi_ioctl_reset+0x13e/0x260 [scsi_mod] [ 343.972145] [<ffffffffa000f416>] scsi_ioctl+0x136/0x3d0 [scsi_mod] [ 343.972154] [<ffffffff812ff6eb>] blkdev_ioctl+0x6bb/0x950 [ 343.972164] [<ffffffff8123cfed>] block_ioctl+0x3d/0x40 [ 343.972170] [<ffffffff81217e2d>] do_vfs_ioctl+0x2cd/0x4a0 [ 343.972186] [<ffffffff81218074>] SyS_ioctl+0x74/0x80 [ 343.972193] [<ffffffff8160916e>] entry_SYSCALL_64_fastpath+0x12/0x6d [ 343.975285] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x6d Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	15a93de7e9	scsi: qedf: Fix VLAN display when printing sent FIP frames Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	f32803bb45	scsi: qedf: Add missing skb frees in error path Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:09 -04:00
Chad Dupuis	c3ef86f3ec	scsi: qedf: Increase the number of default FIP VLAN request retries to 60 Some configurations need more than 30 seconds to respond to a FIP VLAN request so increase the default to 60 seconds. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:09 -04:00
Chad Dupuis	44c7c85911	scsi: qedf: Synchronize rport restarts when multiple ELS commands time out If multiple ELS commands time out, such as aborts, they could all try to restart the same rport and the same time. This could mean multiple multiple processes trying to clean up any outstanding commands or trying to upload the same port. Add a new flag (QEDF_RPORT_IN_RESET) and check other fcport state flags before trying to reset the port. Fixes the crash: [17501.824701] ------------[ cut here ]------------ [17501.824733] kernel BUG at include/asm-generic/dma-mapping-common.h:65! [17501.824760] invalid opcode: 0000 [#1] SMP [17501.824781] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ses enclosure dm_service_time vfat fat sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass joydev btrfs hpilo raid6_pq iTCO_wdt iTCO_vendor_support xor hpwdt ipmi_ssif sg crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul ioatdma lpc_ich glue_helper ablk_helper i2c_i801 shpchp cryptd ipmi_si pcspkr acpi_power_meter ipmi_devintf pcc_cpufreq dca wmi ipmi_msghandler dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod [17501.825119] crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qedf(OE) drm libfcoe ahci qedi(OE) crct10dif_pclmul libfc libahci uio crct10dif_common crc32c_intel libiscsi libata scsi_transport_iscsi scsi_transport_fc tg3 qede(OE) scsi_tgt hpsa qed(OE) i2c_core ptp scsi_transport_sas pps_core iscsi_boot_sysfs dm_mirror dm_region_hash dm_log dm_mod [17501.825292] CPU: 8 PID: 10531 Comm: kworker/u96:1 Tainted: G OE ------------ 3.10.0-693.el7.x86_64 #1 [17501.825330] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 06/02/2016 [17501.825372] Workqueue: fc_rport_eq fc_rport_work [libfc] [17501.825395] task: ffff88101bca8000 ti: ffff881025278000 task.ti: ffff881025278000 [17501.825424] RIP: 0010:[<ffffffffc042def9>] [<ffffffffc042def9>] qedf_unmap_sg_list.isra.15+0x89/0x90 [qedf] [17501.825471] RSP: 0018:ffff88102527bb98 EFLAGS: 00010212 [17501.825493] RAX: ffff8800224eac00 RBX: ffffc9000cd05210 RCX: 0000000000001000 [17501.825520] RDX: 000000007e655e40 RSI: 0000000000001000 RDI: ffff88107fe3b098 [17501.826683] RBP: ffff88102527bba0 R08: ffffffff81a13200 R09: 0000000000000286 [17501.827747] R10: 0000000000000004 R11: 0000000000000005 R12: ffffc9000cd051b8 [17501.828804] R13: ffff881037640c28 R14: 0000000000000007 R15: ffffc9000cd05200 [17501.829850] FS: 0000000000000000(0000) GS:ffff88103fa00000(0000) knlGS:0000000000000000 [17501.830910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [17501.831966] CR2: 00007f9b94005f38 CR3: 00000000019f2000 CR4: 00000000003407e0 [17501.833027] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [17501.834087] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [17501.835142] Stack: [17501.836201] ffff881033ddbb80 ffff88102527bc30 ffffffffc042f834 0000000000002710 [17501.837264] ffff88102527bbd0 ffffffff8133d9dd ffffc9000cd052a0 ffff88102527bc30 [17501.838325] ffffffff816a9c65 0000000000000001 ffff88101bca8000 ffffffff810c4810 [17501.839388] Call Trace: [17501.840446] [<ffffffffc042f834>] qedf_scsi_done+0x54/0x1d0 [qedf] [17501.841504] [<ffffffff8133d9dd>] ? list_del+0xd/0x30 [17501.842537] [<ffffffff816a9c65>] ? wait_for_completion_timeout+0x125/0x140 [17501.843560] [<ffffffff810c4810>] ? wake_up_state+0x20/0x20 [17501.844577] [<ffffffffc0430311>] qedf_initiate_cleanup+0x2e1/0x310 [qedf] [17501.845587] [<ffffffffc04305fe>] qedf_flush_active_ios+0x10e/0x260 [qedf] [17501.846612] [<ffffffffc042892f>] qedf_cleanup_fcport+0x5f/0x370 [qedf] [17501.847613] [<ffffffffc04292d8>] qedf_rport_event_handler+0x398/0x950 [qedf] [17501.848602] [<ffffffff810cdc7c>] ? dequeue_entity+0x11c/0x5d0 [17501.849581] [<ffffffff81098a2b>] ? __internal_add_timer+0xab/0x130 [17501.850555] [<ffffffff810ce54e>] ? dequeue_task_fair+0x41e/0x660 [17501.851528] [<ffffffffc03241a4>] fc_rport_work+0xf4/0x6c0 [libfc] [17501.852490] [<ffffffff810a881a>] process_one_work+0x17a/0x440 [17501.853446] [<ffffffff810a94e6>] worker_thread+0x126/0x3c0 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:09 -04:00
himanshu.madhani@cavium.com	3f9da25602	scsi: qla2xxx: Update driver version to 10.00.00.07-k Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:12 -04:00
Quinn Tran	84905dfe78	scsi: qla2xxx: Fix TMF and Multi-Queue config For target mode, task management command is queued to specific cpu base on where the SCSI command is residing. This prevent race condition of task management command getting ahead of regular scsi command. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:12 -04:00
himanshu.madhani@cavium.com	fc31b7a803	scsi: qla2xxx: Prevent relogin loop by removing stale code Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:12 -04:00
Quinn Tran	36d49c92ef	scsi: qla2xxx: Remove stale debug value for login_retry flag Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:12 -04:00
Quinn Tran	e25f76549b	scsi: qla2xxx: Use predefined get_datalen_for_atio() inline function - Uses predefine inline function to access add_cdb_len field in ATIO. - Return SS_RESIDUAL_UNDER status when sending BUSY Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Quinn Tran	8ea4faf829	scsi: qla2xxx: Fix Inquiry command being dropped in Target mode When a connection is established, the target core session may not be created immediately. Current code will drop/terminate the command based on the session state. This patch will return BUSY status for any commands arriving on wire before the session is created. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Quinn Tran	cc28e0ace9	scsi: qla2xxx: Move GPSC and GFPNID out of session management Move GPSC & GFPNID commands out of session management to reduce time lag in reporting the session state to remote port. These commands are not essential when it comes to maintaining the rport state. Delay sending these commands after rport state is set to Online. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Quinn Tran	bee8b84686	scsi: qla2xxx: Reduce redundant ADISC command for RSCNs For each RSCN that triggers a rescan of the fabric, ADISC is used to revalidate an existing session. If the RSCN is not affecting all existing sessions, then driver should not send redundant ADISC for all existing sessions. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Quinn Tran	1d317b2123	scsi: qla2xxx: Delete session for nport id change This patch fixes regression introduced by commit `a4239945b8` ("scsi: qla2xxx: Add switch command to simplify fabric discovery") by scheduling session deletion when Nport ID changes. [mkp: clarified commit] Fixes: `a4239945b8` ("scsi: qla2xxx: Add switch command to simplify fabric discovery") Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Quinn Tran	29528491cc	scsi: qla2xxx: Fix Rport and session state getting out of sync This patch fixes rport state and session state getting out of sync. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Quinn Tran	625a1caefe	scsi: qla2xxx: Fix sending ADISC command for login This patch fixes login_retry login for ADISC command. when login_retry count reaches 0, further attempt to send ADISC command is ignored by the code. Remove this redundant login_retry count check from qla24xx_fcport_handle_login() [mkp: fix typo] Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:46:11 -04:00
Chaitra P B	f6972d7180	scsi: mpt3sas: Update driver version "25.100.00.00" Update driver version to match OOB/internal driver version. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:40:05 -04:00
Chaitra P B	87b3576e9e	scsi: mpt3sas: fix possible memory leak. In ioctl exit path driver refers ioc_list to free memory associated with diag buffers and event_log pointer used to save events by driver. If ctl_exit() func is called after unregistering driver, then ioc_list will be empty and hence driver will not be able to free the allocated memory which in turn causes memory leak. So call ctl_exit() function before unregistering mpt3sas driver. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:40:05 -04:00
Chaitra P B	c1a6c5ac42	scsi: mpt3sas: For NVME device, issue a protocol level reset 1) Manufacturing Page 11 contains parameters to control internal firmware behavior. Based on AddlFlags2 field FW/Driver behaviour can be changed, (flag tm_custom_handling is used for this) a) For PCIe device, protocol level reset should be used if flag tm_custom_handling is 0. Since Abort Task Set, LUN reset and Target reset will result in a protocol level reset. Drivers should issue only one type of this reset, if that fails then it should escalate to a controller reset (diag reset/OCR). b) If the driver has control over the TM reset timeout value, then driver should use the value exposed in PCIe Device Page 2 for pcie device (field ControllerResetTO). Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:39:49 -04:00
Chaitra P B	65928d1f41	scsi: mpt3sas: Update MPI Headers Update MPI Files to support protocol level reset for NVMe device. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:20 -04:00
Chaitra P B	3d29ed85fc	scsi: mpt3sas: Report Firmware Package Version from HBA Driver. Added function _base_display_fwpkg_version, which sends FWUpload request to pull FW package version from FW Image Header. Now driver prints FW package version in addition to FW version if the PackageVersion is valid. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:20 -04:00
Chaitra P B	22a923c315	scsi: mpt3sas: Cache enclosure pages during enclosure add. In function _scsih_add_device, for each device connected to an enclosure, driver reads the enclosure page(To get details like enclosure handle, enclosure logical ID, enclosure level etc.) With this patch, instead of reading enclosure page everytime, driver maintains a list for enclosure device(During enclosure add event, enclosure device is added to the list and removed from the list on delete events) and uses the enclosure page from the list. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:20 -04:00
Chaitra P B	79eb96d6ca	scsi: mpt3sas: Allow processing of events during driver unload. Events were not processed during driver unload, hence unloading of driver doesn't complete when drives are disconnected while unloading of driver. So don't block events in ISR path, i,e., remove the flag ioc->remove_host so that events are getting processed during driver unload. Thus allowing driver unload to complete by processing drive removal events during driver unload. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:20 -04:00
Chaitra P B	1537d1bfc5	scsi: mpt3sas: Increase event log buffer to support 24 port HBA's. For 24 port HBA's events generated by IOC are more in certain cases and the current circular buffer may be overwritten.Hence increased the event log buffer to accommodate more events. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:20 -04:00
Chaitra P B	95540b8eaf	scsi: mpt3sas: Added support for SAS Device Discovery Error Event. The SAS Device Discovery Error Event is sent to the host when discovery for a particular device is failed during discovery, even after maximum retries by the IOC. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:20 -04:00
Chaitra P B	e21fef6f33	scsi: mpt3sas: Enhanced handling of Sense Buffer. Enhanced DMA allocation for Sense Buffer, if the allocation does not fit within same 4GB.Introduced is_MSB_are_same function to check if allocted buffer within 4GB range or not. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:19 -04:00
Chaitra P B	74522a92bb	scsi: mpt3sas: Optimize I/O memory consumption in driver. For every IO, memory of PAGE size is allocated for handling NVMe native PRPS. And in addition to that for every IO (chains need per IO * chain buffer size, e.g. 38 * 128byte) amount of memory is allocated for chain buffers. However, at any point of time; the IO request can be for NVMe target device (where PRP's page is used for framing PRP's) or can be for SCSI target device (where chain buffers are used for framing chain SGE's). This patch modifies the driver to reuse same pre-allocated PRP page buffers as a chain buffer for IO's targeted for SCSI target devices. No need to allocate separate buffers for chain SGE's buffers. Suppose if the number of chain buffers need for IO doesn't fit in the PRP Page size then driver maintain's separate buffers for those extra chain buffers that exceeds the PRP page size. For example consider PRP page size as 4K and chain buffer size as 128 bytes, then number of chain buffers that can fit in PRP page is 4096/128 => 32. if the number of chain buffer need per IO exceeds 32; for example consider number of chains need per IO is 36 then for remaining 4 chain buffer's driver allocates them individual. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:19 -04:00
Chaitra P B	93204b782a	scsi: mpt3sas: Lockless access for chain buffers. Introduces Chain lookup table/tracker and implements accessing chain buffer using smid. Removed link list based access of chain buffer which requires lock and allocated as many chains needed. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:19 -04:00
Chaitra P B	cd33223b59	scsi: mpt3sas: Pre-allocate RDPQ Array at driver boot time. Instead of allocating RDPQ array (This stores the address's of each RDPQ pools) at run time, now it will be allocated once during driver load time and same will be reused during host reset operation also (instead of allocating & freeing this buffer on the fly during every host reset operation) and then freed during driver unload. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:19 -04:00
Chaitra P B	cf6bf9710c	scsi: mpt3sas: Bug fix for big endian systems. This patch fixes sparse warnings and bugs on big endian systems. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:34:19 -04:00
Sudarsana Reddy Kalluru	cac6f69154	qed: Add support for Unified Fabric Port. This patch adds driver changes for supporting the Unified Fabric Port (UFP). This is a new paritioning mode wherein MFW provides the set of parameters to be used by the device such as traffic class, outer-vlan tag value, priority type etc. Drivers receives this info via notifications from mfw and configures the hardware accordingly. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-07 23:46:10 -04:00
Christoph Hellwig	21e07dba9f	scsi: reduce use of block bounce buffers We can rely on the dma-mapping code to handle any DMA limits that is bigger than the ISA DMA mask for us (either using an iommu or swiotlb), so remove setting the block layer bounce limit for anything but the unchecked_isa_dma case, or the bouncing for highmem pages. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk>	2018-05-07 07:15:41 +02:00
Colin Ian King	23b389c231	scsi: mpt3sas: fix spelling mistake: "disbale" -> "disable" Trivial fix to spelling mistake in module parameter description text [mkp: applied by hand] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:35:51 -04:00
Colin Ian King	55d9a1d241	scsi: megaraid_sas: fix spelling mistake: "disbale" -> "disable" Trivial fix to spelling mistake in module parameter description text [mkp: applied by hand] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:35:43 -04:00
Colin Ian King	1895bd7bce	scsi: esas2r: fix spelling mistake: "asynchromous" -> "asynchronous" Trivial fix to spelling mistake in module description text Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:32:31 -04:00
Colin Ian King	35dc0b07b3	scsi: isci: remove redundant check on in_connection_align_insertion_frequency The sanity check on u->in_connection_align_insertion_frequency is being performed twice and hence the first check can be removed since it is redundant. Cleans up cppcheck warning: drivers/scsi/ibmvscsi/ibmvscsi.c:1711: (warning) Identical inner 'if' condition is always true. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:31:45 -04:00
YueHaibing	9407253f4a	scsi: a100u2w: Use module_pci_driver Remove boilerplate code by using macro module_pci_driver. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:31:40 -04:00
YueHaibing	3e1bbc5685	scsi: wd719x: Use module_pci_driver Remove boilerplate code by using macro module_pci_driver. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:30:12 -04:00
YueHaibing	5f1c721196	scsi: am53c974: Use module_pci_driver Remove boilerplate code by using macro module_pci_driver. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:29:41 -04:00
Dave Carroll	7d3af7d96a	scsi: aacraid: Correct hba_send to include iu_type commit `b60710ec7d` ("scsi: aacraid: enable sending of TMFs from aac_hba_send()") allows aac_hba_send() to send scsi commands, and TMF requests, but the existing code only updates the iu_type for scsi commands. For TMF requests we are sending an unknown iu_type to firmware, which causes a fault. Include iu_type prior to determining the validity of the command Reported-by: Noah Misner <nmisner@us.ibm.com> Fixes: `b60710ec7d` ("aacraid: enable sending of TMFs from aac_hba_send()") Fixes: `423400e64d` ("aacraid: Include HBA direct interface") Tested-by: Noah Misner <nmisner@us.ibm.com> cc: stable@vger.kernel.org Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:27:18 -04:00
Jim Gill	f4b024271a	scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts The vmw_pvscsi driver returns DID_ABORT for commands aborted internally by the adapter, leading to the filesystem going read-only. Change the result to DID_BUS_BUSY, causing the kernel to retry the command. Signed-off-by: Jim Gill <jgill@vmware.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:15:40 -04:00
Christoph Hellwig	63aed100e2	scsi: scsi_transport_sas: don't bounce highmem pages for the smp handler All three instance of ->smp_handler deal with highmem backed requests just fine. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-01 23:11:15 -04:00
Arnd Bergmann	f990bee3f1	scsi: ips: fix firmware timestamps for 32-bit do_gettimeofday() is deprecated since it will stop working in 2038 on 32-bit platforms, leading to incorrect times passed to the firmware. On 64-bit platforms the current code appears to be fine, as the calculation passes an 8-bit century number into the firmware that can represent times long in the future (possibly until 25599). Using ktime_get_real_seconds() to get a 64-bit seconds value and time64_to_tm() to convert it into the firmware format greatly simplifies the ips timekeeping code, makes 32-bit and 64-bit behave the same way here, and gets us closer to removing the deprecated interfaces. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:40:17 -04:00
Arnd Bergmann	feeeca4ce2	scsi: esas2r: use ktime_get_real_seconds() do_gettimeofday() is deprecated because of the y2038 overflow. Here, we use the result to pass into a 32-bit field in the firmware, which still risks an overflow, but if the firmware is written to expect unsigned values, it can at least last until y2106, and there is not much we can do about it. This changes do_gettimeofday() to ktime_get_real_seconds(), which at least simplifies the code a bit, and avoids the deprecated interface. I'm adding a comment about the overflow to document what happens. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:40:16 -04:00
YueHaibing	f9c25ccfc1	scsi: mvumi: Using module_pci_driver Remove boilerplate code by using macro module_pci_driver. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:40:11 -04:00
Colin Ian King	4bc83b3f27	scsi: isci: Fix infinite loop in while loop In the case when the phy_mask is bitwise anded with the phy_index bit is zero the continue statement currently jumps to the next iteration of the while loop and phy_index is never actually incremented, potentially causing an infinite loop if phy_index is less than SCI_MAX_PHS. Fix this by turning the while loop into a for loop. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:23:32 -04:00
Jia-Ju Bai	4011f07660	scsi: st: Replace GFP_ATOMIC with GFP_KERNEL in new_tape_buffer new_tape_buffer() is never called in atomic context. new_tape_buffer() is only called by st_probe(), which is only set as ".probe" in struct scsi_driver. Despite never getting called from atomic context, new_tape_buffer() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:37 -04:00
Jia-Ju Bai	1f618aac2f	scsi: st: Replace GFP_ATOMIC with GFP_KERNEL in st_probe st_probe() is never called in atomic context. st_probe() is only set as ".probe" in struct scsi_driver. Despite never getting called from atomic context, st_probe() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:37 -04:00
Martin Wilck	c360652006	scsi: devinfo: BLIST_RETRY_ASC_C1 for Fujitsu ETERNUS On Fujitsu ETERNUS systems, sense code ABORTED COMMAND with ASC/Q C1/01 is used to indicate temporary condition where the storage-internal path to a target is switched from one controller to another. SCSI commands that return with this error code must be retried unconditionally (i.e. without the "maybe_retry" logic in scsi_decide_disposition); otherwise dm-multipath might initiate a failover from a healthy path e.g. for REQ_FAILFAST_DEV commands. Introduce a new blist flag for this case. [mkp: applied by hand] Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:36 -04:00
Martin Wilck	29cfc2ab71	scsi: devinfo: add BLIST_RETRY_ITF for EMC Symmetrix EMC Symmetrix returns 'internal target error' for a variety of conditions, most of which will be transient. So we should always retry it, even with failfast set. Otherwise we'd get spurious path flaps with multipath. Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:36 -04:00
Martin Wilck	358fda5ff4	scsi: devinfo: warn on undefined blist flags Warn if a device (or the user) sets blist flags which are unknown or have been removed. This should enable us to reuse freed blist bits in later releases. Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:35 -04:00
Martin Wilck	1409880357	scsi: devinfo: change blist_flag_t to 64bit Space for SCSI blist flags is gradually running out. Change the type to __u64 and fix a checkpatch complaint about symbolic mode flags in scsi_devinfo.c. Make checkpatch happy by replacing simple_strtoul() with kstrtoull(). Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:35 -04:00
Martin Wilck	659c1c1b29	scsi: devinfo: use const_ilog2 for array indices Use the just introduced const_ilog2() macro to avoid sparse errors. Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 19:14:28 -04:00
Long Li	2217a47de4	scsi: storvsc: Select channel based on available percentage of ring buffer to write This is a best effort for estimating on how busy the ring buffer is for that channel, based on available buffer to write in percentage. It is still possible that at the time of actual ring buffer write, the space may not be available due to other processes may be writing at the time. Selecting a channel based on how full it is can reduce the possibility that a ring buffer write will fail, and avoid the situation a channel is over busy. Now it's possible that storvsc can use a smaller ring buffer size (e.g. 40k bytes) to take advantage of cache locality. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 15:38:38 -04:00
Long Li	f286299c1d	scsi: storvsc: Set up correct queue depth values for IDE devices Unlike SCSI and FC, we don't use multiple channels for IDE. Also fix the calculation for sub-channels. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-20 15:36:02 -04:00
Bart Van Assche	ccce20fc79	scsi: sd_zbc: Avoid that resetting a zone fails sporadically Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is called from sd_probe_async() and since sd_revalidate_disk() calls sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called concurrently with blkdev_report_zones() and/or blkdev_reset_zones(). That can cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets q->nr_zones to zero before restoring it to the actual value, even if no drive characteristics have changed. Avoid that this can happen by making the following changes: - Protect the code that updates zone information with blk_queue_enter() and blk_queue_exit(). - Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that these functions do not modify struct scsi_disk before all zone information has been obtained. Note: since commit `055f6e18e0` ("block: Make q_usage_counter also track legacy requests"; kernel v4.15) the request queue freezing mechanism also affects legacy request queues. Fixes: `89d9475610` ("sd: Implement support for ZBC devices") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: stable@vger.kernel.org # v4.16 Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:04:10 -04:00
Bart Van Assche	c976562162	scsi: sd_zbc: Let the SCSI core handle ILLEGAL REQUEST / ASC 0x21 scsi_io_completion() translates the sense key ILLEGAL REQUEST / ASC 0x21 into ACTION_FAIL. That means that setting cmd->allowed to zero in sd_zbc_complete() for this sense code / ASC combination is not necessary. Hence remove the code that resets cmd->allowed from sd_zbc_complete(). Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:00:44 -04:00
Bart Van Assche	354f113205	scsi: sd_zbc: Change the type of the ZBC fields into u32 This patch does not change any functionality but makes it clear that it is on purpose that these fields are 32 bits wide. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:00:44 -04:00
Christoph Hellwig	9027b15d51	scsi: storsvc: don't set a bounce limit The default already is to never bounce, so the call is a no-op. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:00:44 -04:00
Christoph Hellwig	3e58c5cf16	scsi: iscsi_tcp: don't set a bounce limit The default already is to never bounce, so the call is a no-op. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:00:44 -04:00
Souptick Joarder	0cac8e1bba	scsi: sg: Change return type to vm_fault_t Use new return type vm_fault_t for fault handler in struct vm_operations_struct. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:00:44 -04:00
Michael Schmitz	3109e5ae03	scsi: zorro_esp: New driver for Amiga Zorro NCR53C9x boards New combined SCSI driver for all ESP based Zorro SCSI boards for m68k Amiga. Code largely based on board specific parts of the old drivers (blz1230.c, blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed after the 2.6 kernel series for lack of maintenance) with contributions by Tuomas Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ bugfix by use of PIO in extended message in transfer). New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI driver included in this patch. Use DMA transfers wherever possible, with board-specific DMA set-up functions copied from the old driver code. Three byte reselection messages do appear to cause DMA timeouts. So wire up a PIO transfer routine for these instead. esp_reselect_with_tag explicitly sets esp->cmd_block_dma as target address for the message bytes but PIO requires a virtual address. Substiute kernel virtual address esp->cmd_block in PIO transfer call if DMA address is esp->cmd_block_dma and phase is message in. PIO code taken from mac_esp.c where the reselection timeout issue was debugged and fixed first, with minor macro and function rename. Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Christian T. Steigies <cts@debian.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-19 00:00:44 -04:00
Mahesh Rajashekhara	505aa4b6a8	scsi: sd: Defer spinning up drive while SANITIZE is in progress A drive being sanitized will return NOT READY / ASC 0x4 / ASCQ 0x1b ("LOGICAL UNIT NOT READY. SANITIZE IN PROGRESS"). Prevent spinning up the drive until this condition clears. [mkp: tweaked commit message] Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 23:37:40 -04:00
Vinson Lee	fb1633d56b	scsi: megaraid_sas: Do not log an error if FW successfully initializes. Fixes: `2d2c233167` ("scsi: megaraid_sas: modified few prints in OCR and IOC INIT path") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 23:37:40 -04:00
Ohad Sharabi	6667e6d91c	scsi: ufs: add trace event for ufs upiu Add UFS Protocol Information Units(upiu) trace events for ufs driver, used to trace various ufs transaction types- command, task-management and device management. The trace-point format is generic and can be easily adapted to trace other upius if needed. Currently tracing ufs transaction of type 'device management', which this patch introduce, cannot be obtained from any other trace. Device management transactions are used for communication with the device such as reading and writing descriptor or attributes etc. Signed-off-by: Ohad Sharabi <ohad.sharabi@sandisk.com> Reviewed-by: Stanislav Nijnikov <stanislav.nijnikov@wdc.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 23:37:39 -04:00
Colin Ian King	0cfce53a78	scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort" Trivial fix to spelling mistake in fnic stats message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 22:58:49 -04:00
Douglas Gilbert	4f2c8bf6bd	scsi: scsi_debug: IMMED related delay adjustments A patch titled: "[PATCH v2] scsi_debug: implement IMMED bit" introduced long delays to the Start stop unit (SSU) and Synchronize cache (SC) commands when the IMMED bit is clear. This patch makes those delays more realistic. It causes SSU to only delay when the start stop state is changed; SC only delays when there's been a write since the previous SC. It also reduced the SC delay from 1 second to 50 milliseconds. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reported-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 22:58:49 -04:00
Chris Leech	af17092810	scsi: iscsi: respond to netlink with unicast when appropriate Instead of always multicasting responses, send a unicast netlink message directed at the correct pid. This will be needed if we ever want to support multiple userspace processes interacting with the kernel over iSCSI netlink simultaneously. Limitations can currently be seen if you attempt to run multiple iscsistart commands in parallel. We've fixed up the userspace issues in iscsistart that prevented multiple instances from running, so now attempts to speed up booting by bringing up multiple iscsi sessions at once in the initramfs are just running into misrouted responses that this fixes. Signed-off-by: Chris Leech <cleech@redhat.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 22:58:49 -04:00
Xose Vazquez Perez	37b37d2609	scsi: scsi_dh: replace too broad "TP9" string with the exact models SGI/TP9100 is not an RDAC array: ^^^ https://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/hwtable.c;h=88b4700beb1d8940008020fbe4c3cd97d62f4a56;hb=HEAD#l235 This partially reverts commit `35204772ea` ("[SCSI] scsi_dh_rdac : Consolidate rdac strings together") [mkp: fixed up the new entries to align with rest of struct] Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com> Cc: Hannes Reinecke <hare@suse.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Cc: DM ML <dm-devel@redhat.com> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:08 -04:00
Xose Vazquez Perez	b15578ddab	scsi: devinfo: delete duplicate "Generic"/"USB Storage-SMC" device The revision field is currently unused by the devinfo pattern matching code. Combine two blacklist entries into one. $ egrep "Generic.*Storage-SMC" /proc/scsi/device_info 'Generic' 'USB Storage-SMC' 0x402 'Generic' 'USB Storage-SMC' 0x402 [mkp: tweaked commit desc] Cc: Hannes Reinecke <hare@suse.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:08 -04:00
James Smart	40e4a2e15c	scsi: lpfc: update driver version to 12.0.0.2 Update the driver version to 12.0.0.2 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:07 -04:00
James Smart	b0a00d8d2b	scsi: lpfc: Correct missing remoteport registration during link bounces Remote port disappearance/reappearances would cause a series of RSCN events to be delivered to the driver. During the resulting GID_FT handling, the driver clears the fc4 settings on the remote port, which makes it skip registration. As such, the nvme associations eventually fail and return io errors to the applications. Correct by not clearng the nlp_fc4_types for all nodes in lpfc_issue_gidft. Instead, when the GID_FT response is handled, clear the nlp_fc4_types of FCP and NVME prior to evaluating the fc4_type returned by the GID_FT response. This approach leaves "skipped" nodes with their nlp_fc4_types intacted. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:06 -04:00
James Smart	66a85155d4	scsi: lpfc: Fix NULL pointer reference when resetting adapter Points referencing local port structures didn't accommodate cases where the localport may not be registered yet. Add NULL pointer checks to logic. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:06 -04:00
James Smart	b15bd3e621	scsi: lpfc: Fix nvme remoteport registration race conditions On tests adding and removing a remote port, calls to nvme_info would eventually show fewer target ports discovered than were present in the san. Additionally, the following error messages were seen: 6031 RemotePort Registration failed err: -116, DID x471301 There is a race condition that exists between the driver and the nvme transport on remote port unregister vs the confirmed deletion. It's possible that the driver may rediscover the remote port and reregister the remote port before a prior unregister delete callback was made (as it rebinded to the prior remoteport structure). However, the driver was coded to expect the callback before seeing the remote port again thus a new registration. The logic results in the driver having an invalid remoteport pointer set. Correct by tracking when waiting for the delete callback. In cases where the ndlp remoteport pointer is updated, it is only cleared when the wait has not been superceded by a prior registration. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:05 -04:00
James Smart	b04744ce52	scsi: lpfc: Fix driver not recovering NVME rports during target link faults During target-side port faults, the driver would not recover all target port logins. This resulted in a loss of nvme device discovery. The driver is coded to wait for all GID_FT requests to complete before restarting discovery. A fault is seen where the outstanding GIT_FT counts are not properly decremented, thus discovery would never start. Another fault was found in the clearing of the gidft_inp counter that would be skipped in this condition. And a third fault found with lpfc_nvme_register_port that would remove a reverence on the ndlp which then allows a node swap on a port address change to prematurely remove the reference and release the ndlp. The following changes are made: - Correct the decrementing of the outstanding GID_FT counters. - In RSCN handling, no longer zero the counter before calling to issue another GID_FT. - No longer remove the reference on the dlp when the ndlp->nrport value is not yet null. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:05 -04:00
James Smart	bf316c7851	scsi: lpfc: Fix WQ/CQ creation for older asic's. The patch to enlarge WQ/CQ creation keys off of an adapter response that indicates support for the larger values. Older adapters return an incorrect response and are limited in size. Thus the adapters fail the WQ creation steps. Augment the WQ sizing checks with a check on the older adapter types and limit them to the restricted sizes. Fixes: `c176ffa084` ("scsi: lpfc: Increase CQ and WQ sizes for SCSI") Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:04 -04:00
James Smart	01466024d2	scsi: lpfc: Fix NULL pointer access in lpfc_nvme_info_show After making remoteport unregister requests, the ndlp nrport pointer was stale. Track when waiting for waiting for unregister completion callback and adjust nldp pointer assignment. Add a few safety checks for NULL pointer values. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:04 -04:00
James Smart	0cdb84ec26	scsi: lpfc: Fix lingering lpfc_wq resource after driver unload After driver unloads, lpfc_wq remains active. The destroy_workqueue calls were not being made in driver unload. Additionally, SLI3 is allocating lpfc_wq resources, but never uses it. Make the destroy_workqueue calls on driver unload. Modify the SLI3 code path no longer allocate lpfc_wq resources. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:03 -04:00
James Smart	59c68eaad7	scsi: lpfc: Fix Abort request WQ selection When running loads that generated aborts, io errors where seen. Turns out the abort requests where not placed on the proper WQ resulting in the errors. Closer inspection inspection of this error also showed improper spinlock api use. Correct the WQ selection policy for the abort requests. Correct spin_lock/spin_lock_irq/spin_lock_irqsave usage. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:02 -04:00
James Smart	2448e48425	scsi: lpfc: Enlarge nvmet asynchronous receive buffer counts Under large io load, the current sizing of asynchronous buffer counts could be exceeded, indicated by a 2885 log message: 2885 Port Status Event: port status reg 0x81800000, port smphr reg 0xc000, error 1=0x52004a01, error 2=0x0 Enlarge the async receive queue size. Allow for a configurable number of buffers to be posted to each RQ, using the new attribute lpfc_nvmet_mrq_post. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:02 -04:00
James Smart	66a210ffb8	scsi: lpfc: Add per io channel NVME IO statistics When debugging various issues, per IO channel IO statistics were useful to understand what was happening. However, many of the stats were on a port basis rather than an io channel basis. Move statistics to an io channel basis. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:01 -04:00
James Smart	f91bc594ba	scsi: lpfc: Correct target queue depth application changes The max_scsicmpl_time parameter can be used to perform scsi cmd queue depth mgmt based on io completion time: the queue depth is reduced to make completion time shorter. However, as soon as an io completes and the completion time is within limits, the code immediately bumps the queue depth limit back up to the target queue depth. Thus the procedure restarts, effectively limiting the usefulness of adjusting queue depth to help completion time. This patch makes the following changes: - Removes the code at io completion that resets the queue depth as soon as within limits. - As the code removed was where the target queue depth was first applied, change target queue depth application so that it occurs when the parameter is changed. - Makes target queue depth a standard parameter: both a module parameter and a sysfs parameter. - Optimizes the command pending count by using atomics rather than locks. - Updates the debugfs nodelist stats to allow better debugging of pending command counts. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:01 -04:00
James Smart	118c0415ee	scsi: lpfc: Fix multiple PRLI completion error path Nodelist entry for SCSI array ends up in UNMAPPED state. This is due to illegal discovery State machine transition because of two PRLIs and the first one failing with LS_RJT. Also, the error path was designed assuming the PRLIs complete in the order they were sent, FCP first, then NVME. In a failing case, the array thinks about the first PRLI (FCP), but issues LS_RJT for the 2nd PRLI immediately. Fix PRLI completion error path for the ordering expectation. Ensure the discovery state machine update is not set until all outstanding PRLIs are complete. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:00 -04:00
Shivasharan S	67c5490ace	scsi: megaraid_sas: driver version upgrade Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:33:59 -04:00
Shivasharan S	3239b8cd28	scsi: megaraid_sas: Increase timeout by 1 sec for non-RAID fastpath IOs Hardware could time out Fastpath IOs one second earlier than the timeout provided by the host. For non-RAID devices, driver provides timeout value based on OS provided timeout value. Under certain scenarios, if the OS provides a timeout value of 1 second, due to above behavior hardware will timeout immediately. Increase timeout value for non-RAID fastpath IOs by 1 second. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:33:59 -04:00
Himanshu Jha	3c6c122cfc	scsi: megaraid_sas: Use zeroing memory allocator than allocator/memset Use pci_zalloc_consistent for allocating zeroed memory and remove unnecessary memset function. Done using Coccinelle. Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:33:58 -04:00

1 2 3 4 5 ...

16643 Commits