When executing SMP task failed, the smp_execute_task_sg() calls del_timer()
to delete "slow_task->timer". However, if the timer handler
sas_task_internal_timedout() is running, the del_timer() in
smp_execute_task_sg() will not stop it and a UAF will happen. The process
is shown below:
(thread 1) | (thread 2)
smp_execute_task_sg() | sas_task_internal_timedout()
... |
del_timer() |
... | ...
sas_free_task(task) |
kfree(task->slow_task) //FREE|
| task->slow_task->... //USE
Fix by calling del_timer_sync() in smp_execute_task_sg(), which makes sure
the timer handler have finished before the "task->slow_task" is
deallocated.
Link: https://lore.kernel.org/r/20220920144213.10536-1-duoming@zju.edu.cn
Fixes: 2908d778ab ("[SCSI] aic94xx: new driver")
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In order to help the compiler reason about the destination buffer in struct
fc_nl_event, add a flexible array member for this purpose. However, since
the header is UAPI, it must not change size or layout, so a union is used.
The allocation size calculations are also corrected (it was potentially
allocating an extra 8 bytes), and the padding is zeroed to avoid leaking
kernel heap memory contents.
Detected at run-time by the recently added memcpy() bounds checking:
memcpy: detected field-spanning write (size 8) of single field "&event->event_data" at drivers/scsi/scsi_transport_fc.c:581 (size 4)
Link: https://lore.kernel.org/linux-next/42404B5E-198B-4FD3-94D6-5E16CF579EF3@linux.ibm.com/
Link: https://lore.kernel.org/r/20220921205155.1451649-1-keescook@chromium.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SCSI_MOD is a helper config symbol for configuring RAID_ATTRS properly,
i.e., RAID_ATTRS needs to be m when SCSI=m.
This helper config symbol SCSI_MOD still shows up even in kernel
configurations that do not select the block subsystem and where SCSI is not
even a configuration option mentioned and selectable.
Make this SCSI_MOD depend on BLOCK, so that it only shows up when it is
slightly relevant in the kernel configuration.
Link: https://lore.kernel.org/r/20220919060112.24802-1-lukas.bulwahn@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the vmbus_on_event loop, 2 jiffies timer will not serve the purpose if
callback_fn takes longer. For effective use move this check inside of
callback functions where needed. Out of all the VMbus drivers using
vmbus_on_event, only storvsc has a high packet volume, thus add this limit
only in storvsc callback for now.
There is no apparent benefit of loop itself because this tasklet will be
scheduled anyway again if there are packets left in ring buffer. This
patch removes this unnecessary loop as well.
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1658741848-4210-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
dev_loss_tmo is an unsigned value. Using "%d" as output format causes
irritating negative values to be shown in sysfs.
Link: https://lore.kernel.org/r/20220902131519.16513-1-mwilck@suse.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the value
to be returned to user space.
Link: https://lore.kernel.org/r/20220901015130.419307-1-zhangxuezhi3@gmail.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix up sysfs show entries to use sysfs_emit()
Link: https://lore.kernel.org/r/20220831140325.396295-1-zhangxuezhi3@gmail.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the incorrect return value check of dma_get_required_mask(). Due to
this incorrect check, the driver was always setting the DMA mask to 63 bit.
Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com
Fixes: ba27c5cf28 ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.2.0.7
Link: https://lore.kernel.org/r/20220911221505.117655-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch fixes below Smatch reported issues:
1. lpfc_hbadisc.c:3020 lpfc_mbx_cmpl_fcf_rr_read_fcf_rec()
error: uninitialized symbol 'vlan_id'.
2. lpfc_hbadisc.c:3121 lpfc_mbx_cmpl_read_fcf_rec()
error: uninitialized symbol 'vlan_id'.
3. lpfc_init.c:335 lpfc_dump_wakeup_param_cmpl()
warn: always true condition '(prg->dist < 4) => (0-3 < 4)'
4. lpfc_init.c:2419 lpfc_parse_vpd()
warn: inconsistent indenting.
5. lpfc_init.c:13248 lpfc_sli4_enable_msi()
warn: 'phba->pcidev->irq' 2147483648 can't fit into 65535
'eqhdl->irq'
6. lpfc_debugfs.c:5300 lpfc_idiag_extacc_avail_get()
error: uninitialized symbol 'ext_cnt'
7. lpfc_debugfs.c:5300 lpfc_idiag_extacc_avail_get()
error: uninitialized symbol 'ext_size'
8. lpfc_vmid.c:248 lpfc_vmid_get_appid()
warn: sleeping in atomic context.
9. lpfc_init.c:8342 lpfc_sli4_driver_resource_setup()
warn: missing error code 'rc'.
10. lpfc_init.c:13573 lpfc_sli4_hba_unset()
warn: variable dereferenced before check 'phba->pport' (see
line 13546)
11. lpfc_auth.c:1923 lpfc_auth_handle_dhchap_reply()
error: double free of 'hash_value'
Fixes:
1. Initialize vlan_id to LPFC_FCOE_NULL_VID.
2. Initialize vlan_id to LPFC_FCOE_NULL_VID.
3. prg->dist is a 2 bit field. Its value can only be between 0-3.
Remove redundent check 'if (prg->dist < 4)'.
4. Fix inconsistent indenting. Moved logic into helper function
lpfc_fill_vpd().
5. Define 'eqhdl->irq' as int value as pci_irq_vector() returns int.
Also, check for return value of pci_irq_vector() and log message in
case of failure.
6. Initialize 'ext_cnt' to 0.
7. Initialize 'ext_size' to 0.
8. Use alloc_percpu_gfp() with GFP_ATOMIC flag.
9. 'rc' was not updated when dma_pool_create() fails. Update 'rc =
-ENOMEM' when dma_pool_create() fails before calling goto statement.
10. Add check for 'phba->pport' in lpfc_cpuhp_remove().
11. Initialize 'hash_value' to NULL, same like 'aug_chal' variable.
Link: https://lore.kernel.org/r/20220911221505.117655-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Firmware reports link degrade signaling via ACQES.
Handlers and new additions to the SET_FEATURES mbox command are implemented
so that link degrade parameters for 64GB capable links are reported through
EDC ELS frames.
Link: https://lore.kernel.org/r/20220911221505.117655-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Removed the lpfc_fdmi_attr_entry and lpfc_fdmi_attr_def structures that had
a union causing unintentional zero padding, which required the usage of
__packed. They are replaced with explicit lpfc_fdmi_attr_u32,
lpfc_fdmi_attr_wwn, lpfc_fdmi_attr_fc4types, and lpfc_fdmi_attr_string
structure defines instead of living in a union. This rids of ambiguous
compiler zero padding, and entailed cleaning up bitwise endian
declarations.
As such, all FDMI attribute registration routines are replaced with generic
void *arg and handlers for each of the newly defined attribute structure
types.
Link: https://lore.kernel.org/r/20220911221505.117655-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch case logics are reworked so they appear more similar and
consistent. This eliminates compiler errors indicating unaligned pointer
values and packed members.
Added comments to explain previous size offset accumulations.
Link: https://lore.kernel.org/r/20220911221505.117655-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Clarify naming of the mp/bmp dma buffers:
- Rename mp to rq as it is the request buffer
- Rename bmp to rsp as it is the response buffer
This reduces confusion about what the buffer content is based on their
name.
Link: https://lore.kernel.org/r/20220911221505.117655-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If there is a congestion or automated congestion response mode change, then
log the reported change to kmsg.
Link: https://lore.kernel.org/r/20220911221505.117655-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On a PCI hotplug capable system, it is possible for scsi_device_put() to
happen after lpfc_pci_remove_one() is called. As a result, the
sdev->host->hostt->module dereference is for a previously freed memory
location because the phba structure containing the hostt template was
already freed when lpfc_pci_remove_one() returned.
Since the lpfc module is still loaded during power slot disable, all
scsi_host_templates should be declared as part of the global data segment
instead of inside the heap allocated phba structure. This way the
sdev->host->hostt memory area is always valid as long as the module is
loaded regardless if PCI hotplug dynamically allocates or frees phba
structures.
Move all scsi_host_templates in the phba structure to global variables.
Create a small helper routine to determine appropriate sg_tablesize during
shost allocation.
Link: https://lore.kernel.org/r/20220911221505.117655-7-jsmart2021@gmail.com
Co-developed-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Co-developed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a target makes the mistake of registering a FC4 type with the fabric,
but then rejects a PRLI of that type, the lpfc driver incorrectly retries
the PRLI causing multiple registrations with the transport. The driver
needs to detect the reject reason data and stop any retry.
Rework the PRLI reject scenarios.
Link: https://lore.kernel.org/r/20220911221505.117655-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sometimes VMID targets are not getting rediscovered after a port reset.
The iocb is not freed in lpfc_cmpl_ct_cmd_vmid(), which is the completion
function for the appid CT commands. So after a port reset, the count of
sges is less than the expected count of 250. This causes post reset
operation logic to fail and keep the port offline.
Fix by freeing the iocb and kref put for the lpfc_cmpl_ct_cmd_vmid() early
return cases.
Link: https://lore.kernel.org/r/20220911221505.117655-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In a situation where the node state changes while a REG_LOGIN is in
progress, the LPFC_MBOXQ_t structure is cleared and reused for an
UNREG_LOGIN command to release RPI resources without first freeing the mbuf
pool resource allocated for REG_LOGIN.
Release mbuf pool resource prior to repurposing of the mailbox command
structure from REG_LOGIN to UNREG_LOGIN.
Link: https://lore.kernel.org/r/20220911221505.117655-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a FLOGI is received before we have issued our FLOGI, the ACC response
to the received FLOGI is issued with SID 2 instead of the expected fabric
controller SID. Certain target vendors ignore the malformed ACC with SID 2
and wait for a properly filled ACC with a fabric controller SID.
The lpfc_sli_prep_wqe() routine depends on the FC_PT2PT flag to fill in the
fabric controller SID when in PT2PT mode, but due to a previous commit the
flag was getting cleared. Fix by adding a check for the defer_flogi_acc
flag to know whether or not to clear the FC_PT2PT flag on link up.
Link: https://lore.kernel.org/r/20220911221505.117655-3-jsmart2021@gmail.com
Fixes: 439b93293f ("scsi: lpfc: Fix unsolicited FLOGI receive handling during PT2PT discovery")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The if statment check (prli_fc4_req & PRLI_NVME_TYPE) evaluates to true
when receiving a PRLI request for bogus FC4 type codes that happen to have
the 3rd or 5th bit set because PRLI_NVME_TYPE is 0x28. This leads to
sending a PRLI_NVME_ACC even for bogus FC4 type codes.
Change the bitwise & check to an exact == type code check to ensure we send
PRLI_NVME_ACC only for NVME type coded PRLI requests.
Link: https://lore.kernel.org/r/20220911221505.117655-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The error code from mpi3mr_post_transport_req() is supposed to be passed to
bsg_job_done(job, rc, reslen), but it isn't.
Link: https://lore.kernel.org/r/YyMISJzVDARpVwrr@kili
Fixes: 176d4aa69c ("scsi: mpi3mr: Support SAS transport class callbacks")
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are three error paths which return success:
1) Propagate the error code from mpi3mr_post_transport_req() if it fails.
2) Return -EINVAL if "ioc_status != MPI3_IOCSTATUS_SUCCESS".
3) Return -EINVAL if "le16_to_cpu(mpi_reply.response_data_length) !=
sizeof(struct rep_manu_reply)"
Link: https://lore.kernel.org/r/YyMIJh1HU2Qz9+Rs@kili
Fixes: 2bd37e2849 ("scsi: mpi3mr: Add framework to issue MPT transport cmds")
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ahd_linux_setup_iocell_info() intentionally writes to the const-marked
aic79xx_iocell_info array, but is called during __init, so the location is
actually writable at this point on most architectures. Annotate this
explicitly with __ro_after_init to avoid static analysis confusion.
Link: https://lpc.events/event/16/contributions/1175/attachments/1109/2128/2022-LPC-analyzer-talk.pdf
Link: https://lore.kernel.org/r/20220914115953.3854029-1-keescook@chromium.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reported-by: David Malcolm <dmalcolm@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 8f394da36a ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
made the __qlt_24xx_handle_abts() function return early if
tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it missed to clean
up the allocated memory for the management command.
Link: https://lore.kernel.org/r/20220914024924.695604-1-rafaelmendsr@gmail.com
Fixes: 8f394da36a ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
qla2x00_get_fw_version_str() has been removed since commit abbd8870b9
("[SCSI] qla2xxx: Factor-out ISP specific functions to method-based call
tables.").
qla2x00_release_nvram_protection() has been removed since commit
459c537807 ("[SCSI] qla2xxx: Add ISP24xx flash-manipulation routines.").
qla82xx_rdmem() and qla82xx_wrmem() have been removed since commit
3711333dfb ("[SCSI] qla2xxx: Updates for ISP82xx.").
qla25xx_rd_req_reg(), qla24xx_rd_req_reg(), qla25xx_wrt_rsp_reg(),
qla24xx_wrt_rsp_reg(), qla25xx_wrt_req_reg() and qla24xx_wrt_req_reg() have
been removed since commit 08029990b2 ("[SCSI] qla2xxx: Refactor
request/response-queue register handling.").
qla2x00_async_login_done() has been removed since commit 726b854870
("qla2xxx: Add framework for async fabric discovery").
qlt_24xx_process_response_error() has been removed since commit
c5419e2618 ("scsi: qla2xxx: Combine Active command arrays.").
Remove the declarations for them from header file.
Link: https://lore.kernel.org/r/20220913023722.547249-2-cuigaosheng1@huawei.com
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In __qedf_probe(), if qedf->cdev is NULL which means
qed_ops->common->probe() failed, then the program will goto label err1, and
scsi_host_put() will free lport->host pointer. Because the memory qedf
points to is allocated by libfc_host_alloc(), it will be freed by
scsi_host_put(). However, the if statement below label err0 only checks
whether qedf is NULL but doesn't check whether the memory has been freed.
So a UAF bug can occur.
There are two ways to reach the statements below err0. The first one is
described as before, "qedf" should be set to NULL. The second one is goto
"err0" directly. In the latter scenario qedf hasn't been changed and it has
the initial value NULL. As a result the if statement is not reachable in
any situation.
The KASAN logs are as follows:
[ 2.312969] BUG: KASAN: use-after-free in __qedf_probe+0x5dcf/0x6bc0
[ 2.312969]
[ 2.312969] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 2.312969] Call Trace:
[ 2.312969] dump_stack_lvl+0x59/0x7b
[ 2.312969] print_address_description+0x7c/0x3b0
[ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0
[ 2.312969] __kasan_report+0x160/0x1c0
[ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0
[ 2.312969] kasan_report+0x4b/0x70
[ 2.312969] ? kobject_put+0x25d/0x290
[ 2.312969] kasan_check_range+0x2ca/0x310
[ 2.312969] __qedf_probe+0x5dcf/0x6bc0
[ 2.312969] ? selinux_kernfs_init_security+0xdc/0x5f0
[ 2.312969] ? trace_rpm_return_int_rcuidle+0x18/0x120
[ 2.312969] ? rpm_resume+0xa5c/0x16e0
[ 2.312969] ? qedf_get_generic_tlv_data+0x160/0x160
[ 2.312969] local_pci_probe+0x13c/0x1f0
[ 2.312969] pci_device_probe+0x37e/0x6c0
Link: https://lore.kernel.org/r/20211112120641.16073-1-fantasquex@gmail.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Co-developed-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Rafael explained that the reason for having both PF_NOFREEZE and
PF_FREEZER_SKIP is that {,un}lock_system_sleep() is callable from
kthread context that has previously called set_freezable().
In preparation of merging the flags, have {,un}lock_system_slee() save
and restore current->flags.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114648.725003428@infradead.org
Fix the following use-after-free warning which is observed during
controller reset:
refcount_t: underflow; use-after-free.
WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remote devices may go missing from the per-device nexus reset part of the
HA nexus, i.e after the controller reset. This is because libsas may find
the devices to be gone as the phy may be temporarily down when processing
the bcast event generated from the nexus reset. Filter out bcast events
during this time to stop the devices being lost.
Link: https://lore.kernel.org/r/1662378529-101489-6-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In resetting the controller, SATA devices may be lost.
The issue is that when we insert the bcast events to rescan the topology in
hisi_sas_rescan_topology(), when we subsequently nexus reset the SATA
devices in hisi_sas_async_I_T_nexus_reset(), there is a small timing window
in which the remote phy is down and we process the bcast event (meaning
that libsas judges that the disk is lost).
Ensure that all bcast events are processed prior to the nexus reset to
close this window.
Link: https://lore.kernel.org/r/1662378529-101489-4-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Once the controller HW has been reset then we can unset flag
HISI_SAS_HW_FAULT_BIT. In clearing this flag earlier we can now
successfully execute commands in hisi_sas_controller_reset_done(), like
bcast processing.
Link: https://lore.kernel.org/r/1662378529-101489-3-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Now that libsas and the SCSI core code limits the default sectors from
commit 4cbfca5f77 ("scsi: scsi_transport_sas: cap shost opt_sectors
according to DMA optimal limit") and commit 608128d391 ("scsi: sd: allow
max_sectors be capped at DMA optimal size limit"), there is no need for
the hack to limit the max HW sectors.
Link: https://lore.kernel.org/r/1662378529-101489-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation for FORTIFY_SOURCE performing run-time destination buffer
bounds checking for memcpy(), specify the destination output buffer
explicitly, instead of asking memcpy() to write past the end of what looked
like a fixed-size object. Silences future run-time warning:
memcpy: detected field-spanning write (size 80) of single field "trc + 1" (size 64)
There is no binary code output differences from this change.
Link: https://lore.kernel.org/r/20220901205729.2260982-1-keescook@chromium.org
Cc: Bradley Grove <linuxdrivers@attotech.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The original code will "goto out_disable_device" and call
pci_disable_device() if pci_enable_device() fails. The kernel will generate
a warning message like "3w-9xxx 0000:00:05.0: disabling already-disabled
device".
We shouldn't disable a device that failed to be enabled. A simple return is
fine.
Link: https://lore.kernel.org/r/20220829110115.38789-1-fantasquex@gmail.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a driver returns:
- DID_TARGET_FAILURE
- DID_NEXUS_FAILURE
- DID_ALLOC_FAILURE
- DID_MEDIUM_ERROR
we hit a couple bugs:
1. The SCSI error handler runs because scsi_decide_disposition() has no
case statements for them and we return FAILED.
2. For SG IO the userspace app gets a success status instead of failed,
because scsi_result_to_blk_status() clears those errors.
This patch adds a new internal error code byte for use by the SCSI
midlayer. This will be used instead of the above error codes, so we don't
have to play that clearing the host code game in
scsi_result_to_blk_status() and drivers cannot accidentally use them.
A subsequent commit will then remove the internal users of the above codes
and convert us to use the new ones.
Link: https://lore.kernel.org/r/20220812010027.8251-9-michael.christie@oracle.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_ALLOC_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in entering SCSI error handling.
By the code comment, it looks like the driver wanted a retryable error
code, so this has it use DID_ERROR.
Link: https://lore.kernel.org/r/20220812010027.8251-8-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it
results in entering SCSI error handling.
This has qla2xxx use DID_NO_CONNECT because it looks like we hit this error
when we can't find a port. It will give us the same hard error behavior and
it seems to match the error where we can't find the endpoint.
Link: https://lore.kernel.org/r/20220812010027.8251-7-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_NEXUS_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in entering SCSI error handling.
virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_NEXUS_FAILURE. It looks like qemu returns that error code if
host OS returns DID_NEXUS_FAILURE (qemu's internal
SCSI_HOST_RESERVATION_ERROR maps to DID_NEXUS_FAILURE). This shouldn't
happen for Linux since we don't propagate that error code to userspace.
This has us convert VIRTIO_SCSI_S_NEXUS_FAILURE to a
SAM_STAT_RESERVATION_CONFLICT in case some other virt layer is returning
it. In that case we will still get the reservation confict failure we
expect.
Link: https://lore.kernel.org/r/20220812010027.8251-6-michael.christie@oracle.com
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in entering SCSI error handling.
virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_TARGET_FAILURE. It looks like qemu returns that error code
if a host OS returns it, but this shouldn't happen for Linux since we never
propagate that error to userspace.
This has us use DID_BAD_TARGET in case some other virt layer is returning
it. In that case we will still get a hard error like before and it conveys
something unexpected happened.
Link: https://lore.kernel.org/r/20220812010027.8251-5-michael.christie@oracle.com
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in the SCSI error handling running.
It looks like the driver wanted a hard failure so swap it with
DID_BAD_TARGET.
Link: https://lore.kernel.org/r/20220812010027.8251-3-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The error codes:
- DID_TARGET_FAILURE
- DID_NEXUS_FAILURE
- DID_ALLOC_FAILURE
- DID_MEDIUM_ERROR
are internal to the SCSI layer. Drivers must not use them because:
1. They are not propagated upwards, so SG IO/passthrough users will not
see an error and think a command was successful.
xen-scsiback will never see this error and should not try to send it.
2. There is no handling for them in scsi_decide_disposition() so if
xen-scsifront were to return the error to the SCSI midlayer then it
kicks off the error handler which is definitely not what we want.
Remove the use from xen-scsifront/back.
Link: https://lore.kernel.org/r/20220812010027.8251-2-michael.christie@oracle.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are two .exit_cmd_priv implementations. Both implementations use
resources associated with the SCSI host. Make sure that these resources are
still available when .exit_cmd_priv is called by waiting inside
scsi_remove_host() until the tag set has been freed.
This commit fixes the following use-after-free:
==================================================================
BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
Read of size 8 at addr ffff888100337000 by task multipathd/16727
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0x5e/0x5db
kasan_report+0xab/0x120
srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
scsi_mq_exit_request+0x4d/0x70
blk_mq_free_rqs+0x143/0x410
__blk_mq_free_map_and_rqs+0x6e/0x100
blk_mq_free_tag_set+0x2b/0x160
scsi_host_dev_release+0xf3/0x1a0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_device_dev_release_usercontext+0x4c1/0x4e0
execute_in_process_context+0x23/0x90
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_disk_release+0x3f/0x50
device_release+0x54/0xe0
kobject_put+0xa5/0x120
disk_release+0x17f/0x1b0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
dm_put_table_device+0xa3/0x160 [dm_mod]
dm_put_device+0xd0/0x140 [dm_mod]
free_priority_group+0xd8/0x110 [dm_multipath]
free_multipath+0x94/0xe0 [dm_multipath]
dm_table_destroy+0xa2/0x1e0 [dm_mod]
__dm_destroy+0x196/0x350 [dm_mod]
dev_remove+0x10c/0x160 [dm_mod]
ctl_ioctl+0x2c2/0x590 [dm_mod]
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lore.kernel.org/r/20220826002635.919423-1-bvanassche@acm.org
Fixes: 65ca846a53 ("scsi: core: Introduce {init,exit}_cmd_priv()")
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-5-bvanassche@acm.org
Fixes: fe44260419 ("scsi: core: Make sure that targets outlive devices")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-4-bvanassche@acm.org
Fixes: 16728aaba6 ("scsi: core: Make sure that hosts outlive targets")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-3-bvanassche@acm.org
Fixes: 1a9283782d ("scsi: core: Simplify LLD module reference counting")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-2-bvanassche@acm.org
Fixes: f323896fe6 ("scsi: core: Call blk_mq_free_tag_set() earlier")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the missing destroy_workqueue() before return from
lpfc_sli4_driver_resource_setup() in the error path.
Link: https://lore.kernel.org/r/20220823044237.285643-1-yangyingliang@huawei.com
Fixes: 3cee98db26 ("scsi: lpfc: Fix crash on driver unload in wq free")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
{clear|set}_bit() can take an almost arbitrarily large bit number, so there
is no need to manually compute addresses. This is just redundant.
Link: https://lore.kernel.org/r/c3429a22023f58e5e5cc65d6cd7e83fb2bd9b870.1658340442.git.christophe.jaillet@wanadoo.fr
Tested-by: Don Brace <don.brace@microchip.com>
Acked-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less
verbose and it improves the semantic.
Link: https://lore.kernel.org/r/5f975ef43f8b7306e4ac4e2e8ce4bcd53f6092bb.1658340441.git.christophe.jaillet@wanadoo.fr
Tested-by: Don Brace <don.brace@microchip.com>
Acked-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Return the value from lpfc_issue_reg_vfi() directly instead of storing it
in another redundant variable.
Link: https://lore.kernel.org/r/20220824075123.221316-1-ye.xingchen@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Return the value from lpfc_sli4_issue_wqe() directly instead of storing it
in another redundant variable.
Link: https://lore.kernel.org/r/20220824075017.221244-1-ye.xingchen@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/qla2xxx/qla_os.c:40:20: warning: symbol 'qla_trc_array'
was not declared. Should it be static?
drivers/scsi/qla2xxx/qla_os.c:345:5: warning: symbol
'ql2xdelay_before_pci_error_handling' was not declared.
Should it be static?
Define qla_trc_array and ql2xdelay_before_pci_error_handling as static to
fix sparse warnings.
Link: https://lore.kernel.org/r/20220826102559.17474-7-njavali@marvell.com
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Older tracing of driver messages was to:
- log only debug messages to kernel main trace buffer; and
- log only if extended logging bits corresponding to this message is
off
This has been modified and extended as follows:
- Tracing is now controlled via ql2xextended_error_logging_ktrace
module parameter. Bit usages same as ql2xextended_error_logging.
- Tracing uses "qla2xxx" trace instance, unless instance creation have
issues.
- Tracing is enabled (compile time tunable).
- All driver messages, include debug and log messages are now traced in
kernel trace buffer.
Trace messages can be viewed by looking at the qla2xxx instance at:
/sys/kernel/tracing/instances/qla2xxx/trace
Trace tunable that takes the same bit mask as ql2xextended_error_logging
is:
ql2xextended_error_logging_ktrace (default=1)
Link: https://lore.kernel.org/r/20220826102559.17474-6-njavali@marvell.com
Suggested-by: Daniel Wagner <dwagner@suse.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add new API to obtain the NVMe Parameters region status from the Auxiliary
Image Status bitmap.
Link: https://lore.kernel.org/r/20220826102559.17474-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On some platforms, the current logic of relying on finding new packet
solely based on signature pattern can lead to driver reading stale
packets. Though this is a bug in those platforms, reduce such exposures by
limiting reading packets until the IN pointer.
Link: https://lore.kernel.org/r/20220826102559.17474-3-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This message is helpful to troubleshoot missing LUNs/SAN boot errors. It'd
be nice to log it by default instead of only being enabled with debug.
This user had an accidental/forgotten file modprobe.d/qla2xxx.conf w/
option qlini_mode=disabled from experiments with FC target mode, and their
boot LUN didn't come up, as it skips SCSI scan, of course.
However, their boot log didn't provide any clues to help understand that.
The issue/message could be figured out w/ ql2xextended_error_logging, but
it would have been simpler (or even deflected/addressed by user) if it had
been there by default. And it also would help support/triage/deflection
tooling.
Expected change:
scsi host15: qla2xxx
+qla2xxx [0000:3b:00.0]-00fb:15: skipping scsi_scan_host() for non-initiator port
qla2xxx [0000:3b:00.0]-00fb:15: QLogic QLE2692 - QLE2692 Dual Port 16Gb FC to PCIe Gen3 x8 Adapter.
According to:
qla2x00_probe_one()
...
ret = scsi_add_host(...);
...
ql_log(ql_log_info, ...
"skipping scsi_scan_host() for non-initiator port\n");
...
ql_log(ql_log_info, ...
"QLogic %s - %s.\n", ha->model_number, ha->model_desc);
Link: https://lore.kernel.org/r/20220825120159.275051-1-mfo@canonical.com
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the driver hits an internal error condition returning DID_REQUEUE the
I/O will be retried on the same ITL nexus. This will inhibit multipathing,
resulting in endless retries even if the error could have been resolved by
using a different ITL nexus. Return DID_TRANSPORT_DISRUPTED to allow for
multipath to engage and route I/O to another ITL nexus.
Link: https://lore.kernel.org/r/20220824060033.138661-1-hare@suse.de
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With cmd_per_lun value 7, a higher number of cache lines (map_nr) are
needed while allocating sdev->budget_map which is not reasonable and hence
increase the cmd_per_lun value to 128.
Link: https://lore.kernel.org/r/20220825075457.16422-4-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The ExtendedType field was set to 1 in the diag buffer register command and
hence MPT Endpoint firmware is failing the request with Invalid Field
IOCStatus.
memset the request frame to zero before framing the diag buffer register
command.
Link: https://lore.kernel.org/r/20220825075457.16422-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a pool crosses the 4GB boundary region then before reallocating pools
change the coherent DMA mask to 32 bits and keep the normal DMA mask set to
63/64 bits.
Link: https://lore.kernel.org/r/20220825075457.16422-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If SCSI error handling is taking place for timed out I/Os on a drive and
the corresponding drive is removed, then stop escalating to higher level of
reset by returning the TUR with "I_T NEXUS LOSS OCCURRED" sense key.
Link: https://lore.kernel.org/r/20220816080801.13929-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct MR_DRV_RAID_MAP and refactor the code accordingly.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Link: https://lore.kernel.org/r/1448f387821833726b99f0ce13069ada89164eb5.1660592640.git.gustavoars@kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enhanced-by: Kees Cook <keescook@chromium.org> # Change in struct MR_DRV_RAID_MAP_ALL
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct MR_DRV_RAID_MAP and refactor the the rest of the code
accordingly.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Link: https://lore.kernel.org/r/4495ce170c8ef088a10f1abe0e7c227368f43242.1660592640.git.gustavoars@kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enhanced-by: Kees Cook <keescook@chromium.org> # Change in struct MR_FW_RAID_MAP_ALL
Although qla2xxx driver is calling schedule_{,delayed}_work() from 10
locations, I assume that flush_scheduled_work() from qlt_stop_phase1()
needs to flush only works scheduled by qlt_sched_sess_work(), for this loop
continues while "struct qla_tgt"->sess_works_list is not empty.
Link: https://lore.kernel.org/r/133c6723-90b6-5c8b-72b4-cc01eeb3a8c0@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently qlt_stop_phase1() may fail to call flush_scheduled_work(), for
list_empty() may return true as soon as qlt_sess_work_fn() called
list_del(). In order to close this race window, check list_empty() after
calling flush_scheduled_work().
If this patch causes problems, please check commit c4f135d643
("workqueue: Wrap flush_workqueue() using a macro"). We are on the way to
remove all flush_scheduled_work() calls from the kernel.
Link: https://lore.kernel.org/r/7f24469d-9e39-3398-d851-329b54c0b923@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
"struct qla_tgt"->del_sess_list is no longer used since commit
726b854870 ("qla2xxx: Add framework for async fabric discovery").
Link: https://lore.kernel.org/r/0c335e86-5624-b599-5137-f1377419fb0c@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2022 for files modified in the 14.2.0.6 patch set.
Link: https://lore.kernel.org/r/20220819011736.14141-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.2.0.6.
Link: https://lore.kernel.org/r/20220819011736.14141-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SANDiags feature is unused, and related code is removed.
Link: https://lore.kernel.org/r/20220819011736.14141-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add capability to specify warning notification period to help firmware
adjust to congestion accordingly.
Link: https://lore.kernel.org/r/20220819011736.14141-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kernel test robot reported the following sparse warning:
arch/arm64/include/asm/cmpxchg.h:88:1: sparse: sparse: cast truncates
bits from constant value (369 becomes 69)
On arm64, atomic_xchg only works on 8-bit byte fields. Thus, the macro
usage of LPFC_RXMONITOR_TABLE_IN_USE can be unintentionally truncated
leading to all logic involving the LPFC_RXMONITOR_TABLE_IN_USE macro to not
work properly.
Replace the Rx Table atomic_t indexing logic with a new
lpfc_rx_info_monitor structure that holds a circular ring buffer. For
locking semantics, a spinlock_t is used.
Link: https://lore.kernel.org/r/20220819011736.14141-4-jsmart2021@gmail.com
Fixes: 17b27ac592 ("scsi: lpfc: Add rx monitoring statistics")
Cc: <stable@vger.kernel.org> # v5.15+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
An error case exit from lpfc_cmpl_ct_cmd_gft_id() results in a call to
lpfc_nlp_put() with a null pointer to a nodelist structure.
Changed lpfc_cmpl_ct_cmd_gft_id() to initialize nodelist pointer upon
entry.
Link: https://lore.kernel.org/r/20220819011736.14141-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During a stress offline/online test in PT2PT topology, target rediscovery
can fail with a specific target vendor array.
When the HBA transitions to online mode it is possible to receive an
unsolicited FLOGI before processing the Link Up event. The received FLOGI
will set the defer_flogi_acc_flag, which instructs the driver to wait until
it transmits its own FLOGI before ACKing the received FLOGI. In this
failure scenario, the link up processing clears the set
defer_flogi_acc_flag before we have sent out the FLOGI. As the target has
the higher WWPN and is responsible for sending the PLOGI, the target is
stuck waiting for its FLOGI_ACC that the driver will never send.
Remove the clear of defer_flogi_acc_flag from Link Up event processing. In
this stress test case, the defer_flogi_acc_flag is cleared during the Link
Down event processing anyways.
Link: https://lore.kernel.org/r/20220819011736.14141-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Variable stp is assigned a value that is never read, the assignment and the
variable stp are redundant and can be removed.
Cleans up clang scan build warning:
drivers/scsi/st.c:4253:7: warning: Although the value stored to 'stp'
is used in the enclosing expression, the value is never actually
read from 'stp' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220805115652.2340991-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable mfiStatus is assigned a value but it is never read. The
assignment is redundant and can be removed. Also remove { } as the return
statement does not need to be in its own code block.
Cleans up clang scan build warning:
drivers/scsi/megaraid/megaraid_sas_base.c:4026:7: warning: Although the
value stored to 'mfiStatus' is used in the enclosing expression, the
value is never actually read from 'mfiStatus' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220805115042.2340400-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable scb is assigned a value but it is never read. The assignment
is redundant and can be removed. Also replace the != NULL check with the
more usual non-null check idiom.
Cleans up clang scan build warning:
drivers/scsi/initio.c:1169:9: warning: Although the value stored to 'scb'
is used in the enclosing expression, the value is never actually read
from 'scb' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220805114100.2339637-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Disable firmware download for ATTO devices where it is not supported.
Link: https://lore.kernel.org/r/20220805174609.14830-2-bgrove@attotech.com
Co-developed-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Bradley Grove <bgrove@attotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add ATTO's PCI IDs and modify the driver to handle the unique NVRAM
structure used by ATTO's devices.
Link: https://lore.kernel.org/r/20220805174609.14830-1-bgrove@attotech.com
Co-developed-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Bradley Grove <bgrove@attotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Block the I/Os on the target devices until corresponding target device's
target dev objects are refreshed as part of post controller reset
operation.
Link: https://lore.kernel.org/r/20220804131226.16653-16-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update the host's SAS ports if there is change in port id or phys. If the
port id is changed, then the driver updates it. If some phys are
enabled/disabled during reset, then driver updates them in STL.
Check for the responding expander devices and update the device handle if
it got changed. Register the expander with STL if it got added during reset
and unregister the expander device if it got removed during reset.
[mkp: include fix for zeroday warning]
Link: https://lore.kernel.org/r/20220804131226.16653-15-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add support for the following SAS transport class callbacks:
- get_linkerrors
- get_enclosure_identifier
- get_bay_identifier
- phy_reset
- phy_enable
- set_phy_speed
- smp_handler
Link: https://lore.kernel.org/r/20220804131226.16653-14-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add framework to issue MPT transport commands to controllers. Also issue
the MPT transport commands to get the manufacturing info of SAS expander
device.
Link: https://lore.kernel.org/r/20220804131226.16653-13-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register/unregister the SAS, SATA devices to SCSI Transport Layer(STL)
whenever the corresponding device is added/removed from topology.
Link: https://lore.kernel.org/r/20220804131226.16653-12-sreekanth.reddy@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When device is registered with the STL then get the corresponding device's
target object using the rphy in below callback functions:
- mpi3mr_target_alloc()
- mpi3mr_slave_alloc()
- mpi3mr_slave_configure()
- mpi3mr_slave_destroy()
Link: https://lore.kernel.org/r/20220804131226.16653-11-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register/unregister the expander devices to SCSI Transport Layer(STL)
whenever the corresponding expander is added/removed from topology.
Link: https://lore.kernel.org/r/20220804131226.16653-10-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register the SAS, SATA devices to SCSI Transport Layer (STL) only if
multipath capability is disabled in the controller's firmware.
Link: https://lore.kernel.org/r/20220804131226.16653-9-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the following helper functions:
- Update the host phys with STL
- Remove the device's SAS port with STL
Link: https://lore.kernel.org/r/20220804131226.16653-8-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the following helper functions:
- Get the device's sas address by reading corresponding device's Device
page0
- Get the expander object from expander list based on expander's handle
- Get the target device object from target device list based on device's
sas address
- Get the expander device object from expander list based on expanders's
sas address
- Get hba port object from hba port table list based on port's port id
Link: https://lore.kernel.org/r/20220804131226.16653-7-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add framework to register and unregister the host and expander phys with
SCSI Transport Layer (STL).
Link: https://lore.kernel.org/r/20220804131226.16653-6-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable and process the Enclosure device add event.
Link: https://lore.kernel.org/r/20220804131226.16653-5-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper functions to retrieve below controller's config pages:
- SAS IOUnit Page0
- SAS IOUnit Page1
- Driver Page1
- Device Page0
- SAS Phy Page0
- SAS Phy Page1
- SAS Expander Page0
- SAS Expander Page1
- Enclosure Page0
Also add the helper function to set SAS IOUnit Page1.
Link: https://lore.kernel.org/r/20220804131226.16653-4-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add config and transport request related error & info debug flags and
functions.
Link: https://lore.kernel.org/r/20220804131226.16653-2-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Although commit 88f1669019 ("scsi: sd: Rework asynchronous resume support")
eliminates a delay for some ATA disks after resume, it causes resume of ATA
disks to fail on other setups. See also:
* "Resume process hangs for 5-6 seconds starting sometime in 5.16"
(https://bugzilla.kernel.org/show_bug.cgi?id=215880).
* Geert's regression report
(https://lore.kernel.org/linux-scsi/alpine.DEB.2.22.394.2207191125130.1006766@ramsan.of.borg/).
This is what I understand about this issue:
* During resume, ata_port_pm_resume() starts the SCSI error handler. This
changes the SCSI host state into SHOST_RECOVERY and causes
scsi_queue_rq() to return BLK_STS_RESOURCE.
* sd_resume() calls sd_start_stop_device() for ATA devices. That function
in turn calls sd_submit_start() which tries to submit a START STOP UNIT
command. That command can only be submitted after the SCSI error handler
has changed the SCSI host state back to SHOST_RUNNING.
* The SCSI error handler runs on its own thread and calls
schedule_work(&(ap->scsi_rescan_task)). That causes
ata_scsi_dev_rescan() to be called from the context of a kernel
workqueue. That call hangs in blk_mq_get_tag(). I'm not sure why - maybe
because all available tags have been allocated by sd_submit_start()
calls (this is a guess).
Link: https://lore.kernel.org/r/20220816172638.538734-1-bvanassche@acm.org
Fixes: 88f1669019 ("scsi: sd: Rework asynchronous resume support")
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: gzhqyz@gmail.com
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: gzhqyz@gmail.com
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: John Garry <john.garry@huawei.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since blk_mq_map_queues() and the .map_queues() callbacks always return 0,
change their return type into void. Most callers ignore the returned value
anyway.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20220815170043.19489-3-bvanassche@acm.org
[axboe: fold in fix from Bart]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Passthrough users will set the scsi_cmnd->allowed value and were expecting
up to $allowed retries. The problem is that before:
commit 6aded12b10 ("scsi: core: Remove struct scsi_request")
we used to set the retries on the scsi_request then copy them over to
scsi_cmnd->allowed in scsi_setup_scsi_cmnd. With that patch we now set
scsi_cmnd->allowed to 0 in scsi_prepare_cmd and overwrite what the
passthrough user set.
This moves the allowed initialization to after the blk_rq_is_passthrough()
check so it's only done for the non-passthrough path where the ULD
init_command will normally set an allowed value it prefers.
Link: https://lore.kernel.org/r/20220812011206.9157-1-michael.christie@oracle.com
Fixes: 6aded12b10 ("scsi: core: Remove struct scsi_request")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mostly small bug fixes and trivial updates. The major new core update
is a change to the way device, target and host reference counting is
done to try to make it more robust (this change has soaked for a while
to try to winkle out any bugs).
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYveemSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisheaiAP9DPQ70
7hEdak6evXUwlWwlVBvEFAfZlzpHN+uNzUCvpgD+N61RSPhHV1hXu12sVMmjMNEb
pow+ee47Mc0t/OO68jA=
=1yQh
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes and trivial updates.
The major new core update is a change to the way device, target and
host reference counting is done to try to make it more robust (this
change has soaked for a while to try to winkle out any bugs)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: pm8001: Fix typo 'the the' in comment
scsi: megaraid_sas: Remove redundant variable cmd_type
scsi: FlashPoint: Remove redundant variable bm_int_st
scsi: zfcp: Fix missing auto port scan and thus missing target ports
scsi: core: Call blk_mq_free_tag_set() earlier
scsi: core: Simplify LLD module reference counting
scsi: core: Make sure that hosts outlive targets
scsi: core: Make sure that targets outlive devices
scsi: ufs: ufs-pci: Correct check for RESET DSM
scsi: target: core: De-RCU of se_lun and se_lun acl
scsi: target: core: Fix race during ACL removal
scsi: ufs: core: Correct ufshcd_shutdown() flow
scsi: ufs: core: Increase the maximum data buffer size
scsi: lpfc: Check the return value of alloc_workqueue()
When alloc ctrl mem fails, the reply_map will subsequently be freed in
megasas_free_ctrl_mem(). No need to free it in megasas_alloc_ctrl_mem().
Link: https://lore.kernel.org/r/1659424740-46918-1-git-send-email-kanie@linux.alibaba.com
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When allocating log_to_span fails, kfree(instance->ctrl_context) is called
twice. Remove redundant call.
Link: https://lore.kernel.org/r/1659424729-46502-1-git-send-email-kanie@linux.alibaba.com
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The error path for the SCSI check condition of not ready, target in ALUA
state transition, will result in the failure of that path after the retries
are exhausted. In most cases that is well ahead of the transition timeout
established in the SCSI ALUA device handler.
Instead, reprep the command and re-add it to the queue after a 1 second
delay. This will allow the handler to take care of the timeout and only
fail the path if the target has exceeded the transition expiry timeout
(default 60 seconds). If the expiry timeout is exceeded, the handler will
change the path state from transitioning to standby leading to a path
failure eliminating the potential of this re-prep to continue endlessly. In
most cases the target will exit the transitioning state well before the
expiry timeout but after the retries are exhausted as mentioned.
Additionally remove the scsi_io_completion_reprep() function which provides
little value.
Link: https://lore.kernel.org/r/20220729214110.58576-1-brian@purestorage.com
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Krishna Kant <krishna.kant@purestorage.com>
Acked-by: Seamus Connor <sconnor@purestorage.com>
Signed-off-by: Brian Bunker <brian@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This partially reverts commit d2b292c3f6 ("scsi: qla2xxx: Enable ATIO
interrupt handshake for ISP27XX")
For some workloads where the host sends a batch of commands and then
pauses, ATIO interrupt coalesce can cause some incoming ATIO entries to be
ignored for extended periods of time, resulting in slow performance,
timeouts, and aborted commands.
Disable interrupt coalesce and re-enable the dedicated ATIO MSI-X
interrupt.
Link: https://lore.kernel.org/r/97dcf365-89ff-014d-a3e5-1404c6af511c@cybernetics.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- Add support for syscall stack randomization.
- Add support for atomic operations to the 32 & 64-bit BPF JIT.
- Full support for KASAN on 64-bit Book3E.
- Add a watchdog driver for the new PowerVM hypervisor watchdog.
- Add a number of new selftests for the Power10 PMU support.
- Add a driver for the PowerVM Platform KeyStore.
- Increase the NMI watchdog timeout during live partition migration, to avoid timeouts
due to increased memory access latency.
- Add support for using the 'linux,pci-domain' device tree property for PCI domain
assignment.
- Many other small features and fixes.
Thanks to: Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas
Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz,
Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg
Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna
Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant,
Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, Zhouyi Zhou.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmLuAPgTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBPpD/9kY/T0qlOXABxlZCgtqeAjPX+2xpnY
BF+TlsN1TS1auFcEZL2BapmVacsvOeGEFDVuZHZvZJc69Hx+gSjnjFCnZjp6n+Yz
wt6y9w9Pu0t/sjD5vNQ46O15/dXqm6RoVI7um12j/WLMN8Ko5+x3gKAyQONjQd2/
1kPcxVH6FUosAdnCuvIcqCX4e4IIHl2ZkitHOTXoQUvUy9oAK/mOBnwqZ6zLGUKC
E5M+Zyt4RFGxhPs48FkX6Nq6crDGU/P0VJpDKkR/t7GHnE67Bm70gZougAPrzrgP
nx8zoTWgDKpqDeuqK7pFcyKgNS3dKbxsN3sAfKHOWu/YnV4wMyy+7fmwagMauki7
lXccKN6F/r+8JcMNx80Jp/dAw3ZdLceP38M3Ryf8IL6lTfkNySumUvrKJn6r1Cu1
wvzhgyEuDawss9KHdEmXcA2i3+XVZvitaipO7JWUC8pblrP1SJMoPfIIe9zh3y3M
pyZj0TcGJ8XaK+badvI+PW/K/KeRgXEY8HpC3wDHSoIkli3OE4jDwXn6TiZgvm3n
k0sKL8YSmQZ8hP8QAkR+r8NQKYqLlfyPxdslK5omDPxfub5Uzk9ZV2Ep7svkaiQn
Wqjq27Dpz8+w0XPjsQ0Tkv+ByTkOhrawOH7x9SpFLHpv9g5otcYmS79NkO/htx8C
6LyPNx1VYn5IRA==
=tRkm
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add support for syscall stack randomization
- Add support for atomic operations to the 32 & 64-bit BPF JIT
- Full support for KASAN on 64-bit Book3E
- Add a watchdog driver for the new PowerVM hypervisor watchdog
- Add a number of new selftests for the Power10 PMU support
- Add a driver for the PowerVM Platform KeyStore
- Increase the NMI watchdog timeout during live partition migration, to
avoid timeouts due to increased memory access latency
- Add support for using the 'linux,pci-domain' device tree property for
PCI domain assignment
- Many other small features and fixes
Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira
Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas,
Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A.
Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch,
Naveen N. Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár,
Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher
Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, and Zhouyi Zhou.
* tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits)
powerpc/64e: Fix kexec build error
EDAC/ppc_4xx: Include required of_irq header directly
powerpc/pci: Fix PHB numbering when using opal-phbid
powerpc/64: Init jump labels before parse_early_param()
selftests/powerpc: Avoid GCC 12 uninitialised variable warning
powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
powerpc/xive: Fix refcount leak in xive_get_max_prio
powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
powerpc/perf: Include caps feature for power10 DD1 version
powerpc: add support for syscall stack randomization
powerpc: Move system_call_exception() to syscall.c
powerpc/powernv: rename remaining rng powernv_ functions to pnv_
powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
powerpc/powernv: Avoid crashing if rng is NULL
selftests/powerpc: Fix matrix multiply assist test
powerpc/signal: Update comment for clarity
powerpc: make facility_unavailable_exception 64s
powerpc/platforms/83xx/suspend: Remove write-only global variable
powerpc/platforms/83xx/suspend: Prevent unloading the driver
powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration
...
- convert arm32 to the common dma-direct code (Arnd Bergmann, Robin Murphy,
Christoph Hellwig)
- restructure the PCIe peer to peer mapping support (Logan Gunthorpe)
- allow the IOMMU code to communicate an optional DMA mapping length
and use that in scsi and libata (John Garry)
- split the global swiotlb lock (Tianyu Lan)
- various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang,
Lukas Bulwahn, Robin Murphy)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmLuIYULHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPS5A//Ty1ZNyXExmwZ6J6g7/oIvQlpAHilDr22mCd8tR8Y
Ne7TgLa/X+usFvJTxJfkvg/LNMDjD7qx0J/mhDGm4reOFcEL4/PBy0rDSOgnmntV
k/fPhgwnpuztiAQ+s+WkJ3pkrmG1HaEId7GGj2JaoYdas6RX2mGX7vL8uvUFepjw
lYPAqWMtJHkOfsDK0PqqyQsr7dcC6lyFLqnn/wqvHtTJeKCfGs6W/SIrlWme2SZY
3dNx84ZR1uPjaazAmtf2IWfjh/TBmd0ETRYycgUUKRP9iwsCkBQDBwsBGSIYXiWj
BUKQ5oMvjAlUGRF0jYz9e77KuedE6GxWiXNQstitBmid142M37DHA5tvZRf65MPS
THHcjTDmmoaO4YfFhhXOcFOrjG4/V8bF7fgHB6XkHDjhVVTcnIx8zuOAXIVBZvIV
VAALmamBqEfIZZrCqgr7hzFssK2bip+TIMkdoD46Wcr+D7bAlujhuzWxubn9+ulT
23v/pAvC80ut6LvKj6EA+GpRm/pejfOtEbjXPoO2hguNxvuUKvPQqNh9hy0q+v1e
8n2Y/4lhy5bv02S7wKooNkfCoV753jBY1TIru45UmEYc3EkTQPii6okYe0DvW4QX
VCnKgo156wSBfE+9eWdxCROv2SZqJFMV/wL3vw54dpJQMbDy7VkNsh4mGREdUkU1
uek=
=Bv19
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- convert arm32 to the common dma-direct code (Arnd Bergmann, Robin
Murphy, Christoph Hellwig)
- restructure the PCIe peer to peer mapping support (Logan Gunthorpe)
- allow the IOMMU code to communicate an optional DMA mapping length
and use that in scsi and libata (John Garry)
- split the global swiotlb lock (Tianyu Lan)
- various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang,
Lukas Bulwahn, Robin Murphy)
* tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping: (45 commits)
swiotlb: fix passing local variable to debugfs_create_ulong()
dma-mapping: reformat comment to suppress htmldoc warning
PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg()
RDMA/rw: drop pci_p2pdma_[un]map_sg()
RDMA/core: introduce ib_dma_pci_p2p_dma_supported()
nvme-pci: convert to using dma_map_sgtable()
nvme-pci: check DMA ops when indicating support for PCI P2PDMA
iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg
iommu: Explicitly skip bus address marked segments in __iommu_map_sg()
dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support
dma-direct: support PCI P2PDMA pages in dma-direct map_sg
dma-mapping: allow EREMOTEIO return code for P2PDMA transfers
PCI/P2PDMA: Introduce helpers for dma_map_sg implementations
PCI/P2PDMA: Attempt to set map_type if it has not been set
lib/scatterlist: add flag for indicating P2PDMA segments in an SGL
swiotlb: clean up some coding style and minor issues
dma-mapping: update comment after dmabounce removal
scsi: sd: Add a comment about limiting max_sectors to shost optimal limit
ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors
scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit
...
Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi,
mpi3mr). The main driver change that might cause issues on down the
road is the conversion of some of our oldest surviving drivers to the
DMA API (should only affect m68k). The only major core change is the
rework of async resume; the rest are either completely trivial or for
updating deprecated APIs.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYuvakyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfvOAP4m0N6b
e3JwoBtB1c0JMKv6G4gka8suEG8p5f4khDu8wwD+LfGUCzG49Y5Ts7rByXfEiGgO
krSdwsAZiV6yKg/HuPw=
=Ak9L
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi,
mpi3mr).
The main driver change that might cause issues on down the road is the
conversion of some of our oldest surviving drivers to the DMA API
(should only affect m68k).
The only major core change is the rework of async resume; the rest are
either completely trivial or for updating deprecated APIs"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (195 commits)
scsi: target: Remove XDWRITEREAD emulated support
scsi: megaraid: Remove the static variable initialisation
scsi: ch: Do not initialise statics to 0
scsi: ufs: core: Fix spelling mistake "Cannnot" -> "Cannot"
scsi: target: iscsi: Do not require target authentication
scsi: target: iscsi: Allow AuthMethod=None
scsi: target: iscsi: Support base64 in CHAP
scsi: target: iscsi: Add support for extended CDB AHS
scsi: ufs: dt-bindings: Add SC8280XP binding
scsi: target: iscsi: Fix clang -Wformat warnings
scsi: ufs: core: Read device property for ref clock
scsi: libsas: Resume SAS host for phy reset or enable via sysfs
scsi: hisi_sas: Modify v3 HW SATA completion error processing
scsi: hisi_sas: Relocate DMA unmap of SMP task
scsi: hisi_sas: Remove unnecessary variable to hold DMA map elements
scsi: hisi_sas: Call hisi_sas_slave_configure() from slave_configure_v3_hw()
scsi: mpi3mr: Delete a stray tab
scsi: mpi3mr: Unlock on error path
scsi: mpi3mr: Reduce VD queue depth on detecting throttling
scsi: mpi3mr: Resource Based Metering
...
Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text. Also included in here are a few other minor updates,
2 USB files, and one Documentation file update to get the SPDX lines
correct.
All of these have been in the linux-next tree for a very long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYupz3g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynPUgCgslaf2ssCgW5IeuXbhla+ZBRAzisAnjVgOvLN
4AKdqbiBNlFbCroQwmeQ
=v1sg
-----END PGP SIGNATURE-----
Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text.
Also included in here are a few other minor updates, two USB files,
and one Documentation file update to get the SPDX lines correct.
All of these have been in the linux-next tree for a very long time"
* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
Documentation: samsung-s3c24xx: Add blank line after SPDX directive
x86/crypto: Remove stray comment terminator
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLko3gQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmQaD/90NKFj4v8I456TUQyg1jimXEsL+e84E6o2
ALWVb6JzQvlPVQXNLnK5YKIunMWOTtTMz0nyB8sVRwVJVJO0P5d7QopAkZM8fkyU
MK5OCzoryENw4DTc2wJS4in6cSbGylIuN74wMzlf7+M67JTImfoZQhbTMcjwzZfn
b3OlL6sID7zMXwGcuOJPZyUJICCpDhzdSF9JXqKma5PQuG2SBmQyvFxJAcsoFBPc
YetnoRIOIN6yBvsIZaPaYq7XI9MIvF0e67EQtyCEHj4tHpyVnyDWkeObVFULsISU
gGEKbkYPvNUzRAU5Q1NBBHh1tTfkf/MaUxTuZwoEwZ/s04IGBGMmrZGyfvdfzYo6
M7NwSEg/TrUSNfTwn65mQi7uOXu1pGkJrqz84Flm8u9Qid9Vd7LExLG5p/ggnWdH
5th93MDEmtEg29e9DXpEAuS5d0t3TtSvosflaKpyfNNfr+P0rWCN6GM/uW62VUTK
ls69SQh/AQJRbg64jU4xper6WhaYtSXK7TKEnxJycoEn9gYNyCcdot2uekth0xRH
ChHGmRlteiqe/y4uFWn/2dcxWjoleiHbFjTaiRL75WVl8wIDEjw02LGuoZ61Ss9H
WOV+MT7KqNjBGe6lreUY+O/PO02dzmoR6heJXN19p8zr/pBuLCTGX7UpO7rzgaBR
4N1HEozvIw==
=celk
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- Improve the type checking of request flags (Bart)
- Ensure queue mapping for a single queues always picks the right queue
(Bart)
- Sanitize the io priority handling (Jan)
- rq-qos race fix (Jinke)
- Reserved tags handling improvements (John)
- Separate memory alignment from file/disk offset aligment for O_DIRECT
(Keith)
- Add new ublk driver, userspace block driver using io_uring for
communication with the userspace backend (Ming)
- Use try_cmpxchg() to cleanup the code in various spots (Uros)
- Finally remove bdevname() (Christoph)
- Clean up the zoned device handling (Christoph)
- Clean up independent access range support (Christoph)
- Clean up and improve block sysfs handling (Christoph)
- Clean up and improve teardown of block devices.
This turns the usual two step process into something that is simpler
to implement and handle in block drivers (Christoph)
- Clean up chunk size handling (Christoph)
- Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
Ming, Sebastian, Yang, Ying)
* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
ublk_drv: fix double shift bug
ublk_drv: make sure that correct flags(features) returned to userspace
ublk_drv: fix error handling of ublk_add_dev
ublk_drv: fix lockdep warning
block: remove __blk_get_queue
block: call blk_mq_exit_queue from disk_release for never added disks
blk-mq: fix error handling in __blk_mq_alloc_disk
ublk: defer disk allocation
ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
ublk: cleanup ublk_ctrl_uring_cmd
ublk: simplify ublk_ch_open and ublk_ch_release
ublk: remove the empty open and release block device operations
ublk: remove UBLK_IO_F_PREFLUSH
ublk: add a MAINTAINERS entry
block: don't allow the same type rq_qos add more than once
mmc: fix disk/queue leak in case of adding disk failure
ublk_drv: fix an IS_ERR() vs NULL check
ublk: remove UBLK_IO_F_INTEGRITY
ublk_drv: remove unneeded semicolon
...
Replace 'the the' with 'the' in the comment.
Link: https://lore.kernel.org/r/20220722094612.78583-1-slark_xiao@163.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Slark Xiao <slark_xiao@163.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable cmd_type is assigned a value but it is never read. The
variable and the assignment are redundant and can be removed.
Cleans up clang scan build warning:
drivers/scsi/megaraid/megaraid_sas_fusion.c:3228:10: warning: Although
the value stored to 'cmd_type' is used in the enclosing expression, the
value is never actually read from 'cmd_type' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220730124509.148457-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable bm_int_st is assigned a value but it is never read. The
variable and the assignment are redundant and can be removed.
Cleans up clang scan build warning:
drivers/scsi/FlashPoint.c:1726:7: warning: Although the value stored
to 'bm_int_st' is used in the enclosing expression, the value is never
actually read from 'bm_int_st' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220730123736.147758-1-colin.i.king@gmail.com
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are two .exit_cmd_priv implementations. Both implementations use
resources associated with the SCSI host. Make sure that these resources are
still available when .exit_cmd_priv is called by moving the .exit_cmd_priv
calls from scsi_host_dev_release() to scsi_forget_host(). Moving
blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_remove_host() is
safe because scsi_forget_host() waits until all SCSI devices associated
with the host have been removed.
Fix the following use-after-free:
==================================================================
BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
Read of size 8 at addr ffff888100337000 by task multipathd/16727
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0x5e/0x5db
kasan_report+0xab/0x120
srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
scsi_mq_exit_request+0x4d/0x70
blk_mq_free_rqs+0x143/0x410
__blk_mq_free_map_and_rqs+0x6e/0x100
blk_mq_free_tag_set+0x2b/0x160
scsi_host_dev_release+0xf3/0x1a0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_device_dev_release_usercontext+0x4c1/0x4e0
execute_in_process_context+0x23/0x90
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_disk_release+0x3f/0x50
device_release+0x54/0xe0
kobject_put+0xa5/0x120
disk_release+0x17f/0x1b0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
dm_put_table_device+0xa3/0x160 [dm_mod]
dm_put_device+0xd0/0x140 [dm_mod]
free_priority_group+0xd8/0x110 [dm_multipath]
free_multipath+0x94/0xe0 [dm_multipath]
dm_table_destroy+0xa2/0x1e0 [dm_mod]
__dm_destroy+0x196/0x350 [dm_mod]
dev_remove+0x10c/0x160 [dm_mod]
ctl_ioctl+0x2c2/0x590 [dm_mod]
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lore.kernel.org/r/20220728221851.1822295-5-bvanassche@acm.org
Fixes: 65ca846a53 ("scsi: core: Introduce {init,exit}_cmd_priv()")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Swap two statements in scsi_device_put() now that it is guaranteed that
SCSI hosts outlive SCSI devices. Remove the reference counting code from
scsi_sysfs.c that became superfluous because SCSI hosts now outlive SCSI
devices.
Link: https://lore.kernel.org/r/20220728221851.1822295-4-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ bvanassche: Extracted this patch from a larger patch ]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the race conditions between SCSI LLD kernel module unloading and SCSI
device and target removal by making sure that SCSI hosts are destroyed
after all associated target and device objects have been freed.
Link: https://lore.kernel.org/r/20220728221851.1822295-3-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ bvanassche: Reworked Ming's patch and split it ]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This commit prevents that the following sequence triggers a kernel crash:
- Deletion of a SCSI device is requested via sysfs. Device removal takes
some time because blk_cleanup_queue() is waiting for the SCSI error
handler.
- The SCSI target associated with that SCSI device is removed.
- scsi_remove_target() returns and its caller frees the resources
associated with the SCSI target.
- The error handler makes progress and invokes an LLD callback that
dereferences the SCSI target pointer.
Link: https://lore.kernel.org/r/20220728221851.1822295-2-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The function alloc_workqueue() in lpfc_sli4_driver_resource_setup() can
fail, but there is no check of its return value. The return value should be
checked.
Link: https://lore.kernel.org/r/20220723064027.2956623-1-williamsukatube@163.com
Fixes: 3cee98db26 ("scsi: lpfc: Fix crash on driver unload in wq free")
Reported-by: Hacash Robot <hacashRobot@santino.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: William Dean <williamsukatube@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
powerpc's asm/prom.h brings some headers that it doesn't need itself.
Once those headers are removed from asm/prom.h, the following
errors occur:
CC [M] drivers/scsi/cxlflash/ocxl_hw.o
drivers/scsi/cxlflash/ocxl_hw.c: In function 'afu_map_irq':
drivers/scsi/cxlflash/ocxl_hw.c:195:16: error: implicit declaration of function 'irq_create_mapping' [-Werror=implicit-function-declaration]
195 | virq = irq_create_mapping(NULL, irq->hwirq);
| ^~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/ocxl_hw.c:222:9: error: implicit declaration of function 'irq_dispose_mapping' [-Werror=implicit-function-declaration]
222 | irq_dispose_mapping(virq);
| ^~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/ocxl_hw.c: In function 'afu_unmap_irq':
drivers/scsi/cxlflash/ocxl_hw.c:264:13: error: implicit declaration of function 'irq_find_mapping'; did you mean 'is_cow_mapping'? [-Werror=implicit-function-declaration]
264 | if (irq_find_mapping(NULL, irq->hwirq)) {
| ^~~~~~~~~~~~~~~~
| is_cow_mapping
cc1: some warnings being treated as errors
Fix it by including linux/irqdomain.h
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c6c0cc5e9179a642370a61439f95158271a78c03.1657264228.git.christophe.leroy@csgroup.eu
Initialising global and static variables to 0 is unnecessary. Remove the
initialisation.
Link: https://lore.kernel.org/r/20220723091620.5463-1-wangborong@cdjrlc.com
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During system shutdown or reboot, mpt3sas will reset the firmware back to
ready state. However, the driver leaves running a watchdog work item
intended to keep the firmware in operational state. This causes a second,
unneeded reset on shutdown and moves the firmware back to operational
instead of in ready state as intended. And if the mpt3sas_fwfault_debug
module parameter is set, this extra reset also panics the system.
mpt3sas's scsih_shutdown needs to stop the watchdog before resetting the
firmware back to ready state.
Link: https://lore.kernel.org/r/20220722142448.6289-1-djeffery@redhat.com
Fixes: fae21608c3 ("scsi: mpt3sas: Transition IOC to Ready state during shutdown")
Tested-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a comment about limiting the default the SCSI disk request_queue
max_sectors initial value to that of the SCSI host optimal sectors limit.
Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and free an IOVA for each mapping, which may
affect performance.
For performance reasons set the request queue max_sectors from
dma_opt_mapping_size(), which knows this mapping limit.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and free an IOVA for each mapping, which may
affect performance.
New member Scsi_Host.opt_sectors is added, which is the optimal host
max_sectors, and use this value to cap the request queue max_sectors when
set.
It could be considered to have request queues io_opt value initially
set at Scsi_Host.opt_sectors in __scsi_init_queue(), but that is not
really the purpose of io_opt.
Finally, even though Scsi_Host.opt_sectors value should never be greater
than the request queue max_hw_sectors value, continue to limit to this
value for safety.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The shost->max_sectors is repeatedly capped according to the host DMA
mapping limit for each sdev in __scsi_init_queue(). This is unnecessary, so
set only once when adding the host.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently if a phy reset or enable phy is issued via sysfs when controller
is suspended, those operations will be ignored as SAS_HA_REGISTERED is
cleared. If RPM is enabled then we may aggressively suspend automatically.
In this case it may be difficult to enable or reset a phy via sysfs, so
resume the host in these scenarios.
Link: https://lore.kernel.org/r/1657823002-139010-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the I/O completion response frame returned by the target device has been
written to the host memory and the err bit in the status field of the
received fis is 1, ts->stat should set to SAS_PROTO_RESPONSE, and this will
let EH analyze and further determine cause of failure.
Link: https://lore.kernel.org/r/1657823002-139010-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently SMP tasks are DMA unmapped only when cq of SMP I/O is returned
normally. If the cq of SMP I/O is returned with exception actually SMP TAS
is never unmapped. Relocate DMA unmap of SMP task to fix the issue.
Link: https://lore.kernel.org/r/1657823002-139010-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use slot->n_elem to store the return value of dma_map_sg() for SSP and SMP
IOs, and remove unnecessary variable n_elem_req.
Link: https://lore.kernel.org/r/1657823002-139010-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is duplicated code between slave_configure_v3_hw() and
hisi_sas_slave_configure(), so call common function
hisi_sas_slave_configure() from slave_configure_v3_hw().
Link: https://lore.kernel.org/r/1657823002-139010-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This code is indented one more tab than it should be.
Link: https://lore.kernel.org/r/YtVCFshEJNC7ELid@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is some clean up necessary before returning. Smatch complains:
drivers/scsi/mpi3mr/mpi3mr_fw.c:4786 mpi3mr_soft_reset_handler()
warn: inconsistent returns '&mrioc->reset_mutex'.
Locked on : 4730
Unlocked on: 4786
Link: https://lore.kernel.org/r/YtVCEsxMU8buuMjP@kili
Fixes: f10af05732 ("scsi: mpi3mr: Resource Based Metering")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reduce the VD queue depth on detecting the throttling condition.
[mkp: incorporate fix for pointer cast issue reported by the test
robot and Guenter Roeck]
Link: https://lore.kernel.org/r/20220708195020.8323-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update driver to track cumulative pending large data size at the controller
level and at the throttle group level. When one of the values meet or
exceed the controller's firmware-determined high threshold value, then the
driver will divert future selective I/O to the firmware. Once both
controller level and at the throttle group level cumulative pending large
data size reach controller's firmware determined low threshold value, then
the driver will stop diverting I/Os to the firmware.
Link: https://lore.kernel.org/r/20220708195020.8323-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a SCSI device is removed while in active use, currently sg will
immediately return -ENODEV on any attempt to wait for active commands that
were sent before the removal. This is problematic for commands that use
SG_FLAG_DIRECT_IO since the data buffer may still be in use by the kernel
when userspace frees or reuses it after getting ENODEV, leading to
corrupted userspace memory (in the case of READ-type commands) or corrupted
data being sent to the device (in the case of WRITE-type commands). This
has been seen in practice when logging out of a iscsi_tcp session, where
the iSCSI driver may still be processing commands after the device has been
marked for removal.
Change the policy to allow userspace to wait for active sg commands even
when the device is being removed. Return -ENODEV only when there are no
more responses to read.
Link: https://lore.kernel.org/r/5ebea46f-fe83-2d0b-233d-d0dcb362dd0a@cybernetics.com
Cc: <stable@vger.kernel.org>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
vref_count took an extra decrement in the task management path. Add an
extra ref count to compensate the imbalance.
Link: https://lore.kernel.org/r/20220713052045.10683-7-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch fixes IKE message being dropped due to error in processing Purex
IOCB and Continuation IOCBs.
Link: https://lore.kernel.org/r/20220713052045.10683-6-njavali@marvell.com
Fixes: fac2807946 ("scsi: qla2xxx: edif: Add extraction of auth_els from the wire")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On some platforms, the current logic of relying on finding new packet
solely based on signature pattern can lead to driver reading stale
packets. Though this is a bug in those platforms, reduce such exposures by
limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for
response packets only until response queue in pointer.
When reset, response packets are read based on a signature pattern
logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no
queue pointer shadowing.
Link: https://lore.kernel.org/r/20220713052045.10683-5-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While requesting a new mailbox command, driver does not write any data to
unused registers. Initialize the unused register value to zero while
requesting a new mailbox command to prevent stale entry access by firmware.
Link: https://lore.kernel.org/r/20220713052045.10683-4-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Improve static type checking by using the new blk_opf_t type for variables
that represent request flags.
Cc: Hannes Reinecke <hare@suse.com>
Cc: Martin Wilck <mwilck@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-43-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use the new blk_opf_t type for arguments and variables that represent
request flags. Use the !! operator in scsi_noretry_cmd() to convert the
blk_opf_t type into a boolean. This patch does not change any functionality.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-42-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch prepares for introducing the new blk_opf_t type in the SCSI core.
Since the value returned by scsi_noretry_cmd() is only used in boolean
expressions, this patch does not change any functionality.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-41-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Improve static type checking by using the new blk_opf_t type for the
combination of a request operation and its flags.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-40-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The type name enum req_opf is misleading since it suggests that values of
this type include both an operation type and flags. Since values of this
type represent an operation only, change the type name into enum req_op.
Convert the enum req_op documentation into kernel-doc format. Move a few
definitions such that the enum req_op documentation occurs just above
the enum req_op definition.
The name "req_opf" was introduced by commit ef295ecf09 ("block: better op
and flags encoding").
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-2-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/165730608687.177165.11815510982277242966.stgit@brunhilda
Reviewed-by: Gerry Morong <gerry.morong@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyright to current year.
Link: https://lore.kernel.org/r/165730608177.177165.13184715486635363193.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow user to override the default driver timeout for controller ready.
There are some rare configurations which require the driver to wait longer
than the normal 3 minutes for the controller to complete its bootup
sequence and be ready to accept commands from the driver.
The module parameter is:
ctrl_ready_timeout= { 0 | 30-1800 }
and specifies the timeout in seconds for the driver to wait for controller
ready. The valid range is 0 or 30-1800. The default value is 0, which
causes the driver to use a timeout of 180 seconds (3 minutes).
Link: https://lore.kernel.org/r/165730607666.177165.9221211345284471213.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change removing a LUN using sysfs from an internal driver function
pqi_remove_all_scsi_devices() to using the .slave_destroy entry in the
scsi_host_template.
A LUN can be deleted via sysfs using this syntax:
echo 1 > /sys/block/sdX/device/delete
Link: https://lore.kernel.org/r/165730607154.177165.9723066932202995774.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow SMP affinity to be changeable by disabling managed interrupts.
On distributions where the driver is enabled for multi-queue support the
driver utilizes kernel managed interrupts, which automatically distributes
interrupts to all available CPUs and assigns SMP affinity.
On most distributions, the affinity can not be changed by the user.
This change will allow managed interrupts to be disabled by the user via a
module parameter while still allowing multi-queue support to function
properly.
Use the module parameter disable_managed_interrupts=1
Link: https://lore.kernel.org/r/165730606638.177165.12846020942931640329.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct a rare stale RAID map access when performing AIO during a RAID
configuration change.
A race condition in the driver could cause it to access a stale RAID map
when a logical volume is reconfigured.
Modify the driver logic to invalidate a RAID map very early when a RAID
configuration change is detected and only switch to a new RAID map after
the driver detects that the RAID map has changed.
Link: https://lore.kernel.org/r/165730606128.177165.7671413443814750829.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct a SOP READ and WRITE DMA flags for some requests.
This update corrects DMA direction issues with SCSI commands removed from
the controller's internal lookup table.
Currently, SCSI READ BLOCK LIMITS (0x5) was removed from the controller
lookup table and exposed a DMA direction flag issue.
SCSI READ BLOCK LIMITS was recently removed from our controller lookup
table so the controller uses the respective IU flag field to set the DMA
data direction. Since the DMA direction is incorrect the FW never completes
the request causing a hang.
Some SCSI commands which use SCSI READ BLOCK LIMITS
* sg_map
* mt -f /dev/stX status
After updating controller firmware, users may notice their tape units
failing. This patch resolves the issue.
Also, the AIO path DMA direction is correct.
The DMA direction flag is a day-one bug with no reported BZ.
Fixes: 6c223761eb ("smartpqi: initial commit of Microsemi smartpqi driver")
Link: https://lore.kernel.org/r/165730605618.177165.9054223644512926624.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mahesh Rajashekhara <Mahesh.Rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change method used to detect controller firmware crash during PQI reset.
PQI reset can fail with error -6 if firmware takes > 100ms to complete
reset.
Method used by driver to detect controller firmware crash during PQI was
incorrect in some cases.
Link: https://lore.kernel.org/r/165730605108.177165.1132931838384767071.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the PCI ID for (values in hex):
VID / DID / SVID / SDID
---- ---- ---- ----
Adaptec SmartHBA 2100-8i-o 9005 / 0285 / 9005 / 0659
Link: https://lore.kernel.org/r/165730604089.177165.17257514581321583667.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fail all outstanding requests after a PCI linkdown.
Block access to device SCSI attributes during the following conditions:
"Cable pull" is called PQI_CTRL_SURPRISE_REMOVAL.
"PCIe Link Down" is called PQI_CTRL_GRACEFUL_REMOVAL.
Block access to device SCSI attributes during and in rare instances when
the controller goes offline.
Either outstanding requests or the access of SCSI attributes post linkdown
can lead to a hang.
Post linkdown, driver does not fail the outstanding requests leading to
long wait time before all the IOs eventually fail.
Also access of the SCSI attributes by host applications can lead to a
system hang.
Link: https://lore.kernel.org/r/165730603578.177165.4699352086827187263.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add driver support for up to 256 LUNs per device.
Update AIO path to pass the appropriate LUN number for base-code to target
the correct LUN.
Update RAID IO path to pass the appropriate LUN number for FW to target the
correct LUN.
Pass the correct LUN number while doing a LUN reset.
Count the outstanding commands based on LUN number. While removing a
Multi-LUN device, wait for all outstanding commands to complete for all
LUNs.
Add Feature bit support.
Link: https://lore.kernel.org/r/165730603067.177165.14016422176841798336.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kumar Meiyappan <Kumar.Meiyappan@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Insert a minimum 1 millisecond delay after writing to a register before
reading from it.
SIS and PQI registers that can be both written to and read from can return
stale data if read from too soon after having been written to.
There is no read/write ordering or hazard detection on the inbound path to
the MSGU from the PCIe bus, therefore reads could pass writes.
Link: https://lore.kernel.org/r/165730602555.177165.11181012469428348394.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com>
Co-developed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Print controller firmware version to OS message log during driver
initialization or after OFA.
Link: https://lore.kernel.org/r/165730601536.177165.17698744242908911822.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check the response code returned from the LUN reset task management
function and if it indicates the LUN is not valid, do not retry.
Reduce rescan worker delay to 5 seconds for the event handler only.
The removal of a drive from the OS could have been delayed up to 30 seconds
after being physically pulled.
The driver was retrying a LUN reset 3 times even though the return code
indiciated the LUN was no longer valid. There was a 10 second delay between
each retry. Additionally, the rescan worker was scheduled to run 10 seconds
after the driver received the event.
Link: https://lore.kernel.org/r/165730601025.177165.9416869335174437006.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Negotiated link rate needs to be updated to 'Disabled' when phy is stopped.
Link: https://lore.kernel.org/r/20220708205026.969161-1-changyuanl@google.com
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, the data flow of the max/min linkrate in the driver is
* in pm8001_get_lrate_mode():
hardcoded value ==> struct sas_phy
* in pm8001_bytes_dmaed():
struct pm8001_phy ==> struct sas_phy
* in pm8001_phy_control():
libsas data ==> struct pm8001_phy
Since pm8001_bytes_dmaed() follows pm8001_get_lrate_mode(), and the fields
in struct pm8001_phy are not initialized, sysfs
`/sys/class/sas_phy/phy-*/maximum_linkrate` always shows `Unknown`.
To fix the issue, change the dataflow to the following:
* in pm8001_phy_init():
initial value ==> struct pm8001_phy
* in pm8001_get_lrate_mode():
struct pm8001_phy ==> struct sas_phy
* in pm8001_phy_control():
libsas data ==> struct pm8001_phy
For negotiated linkrate, the current dataflow is:
* in pm8001_get_lrate_mode():
iomb data ==> struct asd_sas_phy ==> struct sas_phy
* in pm8001_bytes_dmaed():
struct asd_sas_phy ==> struct sas_phy
Since pm8001_bytes_dmaed() follows pm8001_get_lrate_mode(), the assignment
statements in pm8001_bytes_dmaed() are unnecessary and cleaned up.
Link: https://lore.kernel.org/r/20220707175210.528858-1-changyuanl@google.com
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Refactor code from fnic probe into a different function so that
scsi layer initialization code is grouped together.
Also, add log messages for better debugging.
Link: https://lore.kernel.org/r/20220707205155.692688-1-kartilak@cisco.com
Co-developed-by: Gian Carlo Boffa <gcboffa@cisco.com>
Signed-off-by: Gian Carlo Boffa <gcboffa@cisco.com>
Co-developed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Signed-off-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The megaraid SCSI driver sets set->nr_maps as 3 if poll_queues is > 0, and
blk-mq actually initializes each map's nr_queues as nr_hw_queues.
Consequently the driver has to clear READ queue map's nr_queues, otherwise
the queue map becomes broken if poll_queues is set as non-zero.
Link: https://lore.kernel.org/r/20220706125942.528533-1-ming.lei@redhat.com
Fixes: 9e4bec5b2a ("scsi: megaraid_sas: mq_poll support")
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: sumit.saxena@broadcom.com
Cc: chandrakanth.patil@broadcom.com
Cc: linux-block@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2022 for files modified in the 14.2.0.5 patch set.
Link: https://lore.kernel.org/r/20220701211425.2708-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.2.0.5
Link: https://lore.kernel.org/r/20220701211425.2708-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The Menlo/Hornet adapter was never released to the field. As such, driver
code specific to the adapter is unnecessary and should be removed.
Link: https://lore.kernel.org/r/20220701211425.2708-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_nvmet_prep_abort_wqe() has a lot of common code with
lpfc_sli_prep_abort_xri().
Delete lpfc_nvmet_prep_abort_wqe() as the wqe can be filled out using the
generic lpfc_sli_prep_abort_xri routine(). Add the wqec option to
lpfc_sli_prep_abort_xri() for lpfc_nvmet_prep_abort_wqe().
Link: https://lore.kernel.org/r/20220701211425.2708-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The RSCN_MEMENTO logic was to workaround a target that does not register
both FCP and NVMe FC4 types at the same time. This caused the
configuration to not produce a second RSCN for the NVMe FC4 type
registration in a timely manner. The intention of the RSCN_MEMENTO flag
was to always signal to try NVMe PRLI.
However, there are other FCP-only target arrays in correctly behaved
configurations that reject the NVMe PRLI followed by a LOGO leading to
never rediscovering the target after an issue_lip (as LOGO causes a repeat
of PLOGI/PRLIs).
Revert the RSCN_MEMENTO patch as it is causing correctly behaved configs to
fail while it exists only to succeed on a misbehaved config.
Link: https://lore.kernel.org/r/20220701211425.2708-9-jsmart2021@gmail.com
Fixes: 1045592fc9 ("scsi: lpfc: Introduce FC_RSCN_MEMENTO flag for tracking post RSCN completion")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During a target link bounce test, the driver sees a mismatch between the
NPortId and the WWPN on the node structures (ndlps) involved. When this
occurs, the driver "swaps" the ndlp and new_ndlp node parameters to restore
WWPN/DID uniqueness in the fc_nodes list per vport. However, the driver
neglected to swap the nlp_fc4_type in the ndlp passed to
lpfc_plogi_confirm_nport causing a failure to recover the NVMe PLOGI/PRLI
and ultimately the NVMe paths.
Correct confirm_nport to preserve the fc4 types from the new-ndlp when the
data is moved over ot the ndlp structure.
Link: https://lore.kernel.org/r/20220701211425.2708-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Disabling FA-PWWN should be effective after port reset, but in some cases
it was found to be impossible to clear FA-PWWN usage without a driver
reload.
Clean up FA-PWWN flag management to make enable and disable of the feature
more robust.
Link: https://lore.kernel.org/r/20220701211425.2708-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is no corresponding free routine if lpfc_sli4_issue_wqe fails to
issue the CMF WQE in lpfc_issue_cmf_sync_wqe.
If ret_val is non-zero, then free the iocbq request structure.
Link: https://lore.kernel.org/r/20220701211425.2708-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
VMID introduced an extra increment of cmd_pending, causing double-counting
of the I/O. The normal increment ios performed in lpfc_get_scsi_buf.
Link: https://lore.kernel.org/r/20220701211425.2708-5-jsmart2021@gmail.com
Fixes: 33c79741de ("scsi: lpfc: vmid: Introduce VMID in I/O path")
Cc: <stable@vger.kernel.org> # v5.14+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When providing a D_ID in XMIT_ELS_RSP64_CX iocb the PU field should
be set to 3 to describe the parameter being passed to firmware.
Link: https://lore.kernel.org/r/20220701211425.2708-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Malformed user input to debugfs results in buffer overflow crashes. Adapt
input string lengths to fit within internal buffers, leaving space for NULL
terminators.
Link: https://lore.kernel.org/r/20220701211425.2708-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In lpfc_nvme_cancel_iocb(), a cqe is created locally from stack storage.
The code didn't initialize the total_data_placed word, inheriting stack
content.
Initialize the total_data_placed word.
Link: https://lore.kernel.org/r/20220701211425.2708-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For some technologies, e.g. an ATA bus, resuming can take multiple
seconds. Waiting for resume to finish can cause a very noticeable delay.
Hence this commit that restores the behavior from before "scsi: core: pm:
Rely on the device driver core for async power management" for most SCSI
devices.
This commit introduces a behavior change: if the START command fails, do
not consider this as a SCSI disk resume failure.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880
Link: https://lore.kernel.org/r/20220630195703.10155-3-bvanassche@acm.org
Fixes: a19a93e4c6 ("scsi: core: pm: Rely on the device driver core for async power management")
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: ericspero@icloud.com
Cc: jason600.groome@gmail.com
Tested-by: jason600.groome@gmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the definition of SCSI_QUEUE_DELAY to just above the function that
uses it.
Link: https://lore.kernel.org/r/20220630195703.10155-2-bvanassche@acm.org
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This was found by coccicheck:
drivers/scsi/megaraid/megaraid_sas_base.c:3950 process_fw_state_change_wq() warn: inconsistent indenting.
Link: https://lore.kernel.org/r/20220630074152.29171-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sdev_printk() will only accept messages up to 128 bytes.
Shorten strings exceeding 128 bytes avoid printing an incomplete sentence
like:
[ 475.156955] sd 9:0:0:0: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Link: https://lore.kernel.org/r/20220630024516.1571209-1-lizhijian@fujitsu.com
Suggested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable shared host tagset to make sure that total outstanding I/O count can
not exceed controller's can_queue setting.
Link: https://lore.kernel.org/r/20220628074848.5036-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a copy and paste bug here. It should check ".rsp" instead of
".req". The error message is copy and pasted as well so update that too.
Link: https://lore.kernel.org/r/YrK1A/t3L6HKnswO@kili
Fixes: 9c40c36e75 ("scsi: qla2xxx: edif: Reduce Initiator-Initiator thrashing")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the zone related fields that are currently stored in
struct request_queue to struct gendisk as these are part of the highlevel
block layer API and are only used for non-passthrough I/O that requires
the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220706070350.1703384-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch to a gendisk based API in preparation for moving all zone related
fields from the request_queue to the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220706070350.1703384-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Prepare for storing the zone related field in struct gendisk instead
of struct request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220706070350.1703384-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We no longer use the 'reserved' arg in busy_tag_iter_fn for any iter
function so it may be dropped.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me> #nvme
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/1657109034-206040-6-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The SCSI core code does not support reserved requests, so drop the
handling in fnic_pending_aborts_iter().
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/1657109034-206040-5-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With new API blk_mq_is_reserved_rq() we can tell if a request is from
the reserved pool, so stop passing 'reserved' arg. There is actually
only a single user of that arg for all the callback implementations, which
can use blk_mq_is_reserved_rq() instead.
This will also allow us to stop passing the same 'reserved' around the
blk-mq iter functions next.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/1657109034-206040-4-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The SCSI core code does not currently support reserved commands. As such,
requests which time-out would never be reserved, and scsi_timeout()
'reserved' arg should never be set.
Remove handling for reserved requests, drop the wrapper scsi_timeout()
as it now just calls scsi_times_out() always, and finally rename
scsi_times_out() -> scsi_timeout() to match the blk_mq_ops method name.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/1657109034-206040-2-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for
all disks that do not have separately allocated queues, and thus remove
the need to call blk_cleanup_queue for them.
Rename blk_cleanup_disk to blk_mq_destroy_queue to make it clear that
this function is intended only for separately allocated blk-mq queues.
This saves an extra queue freeze for devices without a separately
allocated queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The dpt_i2o driver was fixed to stop using virt_to_bus() in 2008, but it
still has a stale reference in an error handling code path that could never
work. I submitted a patch to fix this reference earlier, but Hannes
Reinecke suggested that removing the driver may be just as good here.
The i2o driver layer was removed in 2015 with commit 4a72a7af46
("staging: remove i2o subsystem"), but the even older dpt_i2o scsi driver
stayed around.
The last non-cleanup patches I could find were from Miquel van Smoorenburg
and Mark Salyzyn back in 2008, they might know if there is any chance of
the hardware still being used anywhere.
Link: https://lore.kernel.org/linux-scsi/CAK8P3a1XfwkTOV7qOs1fTxf4vthNBRXKNu8A5V7TWnHT081NGA@mail.gmail.com/T/
Link: https://lore.kernel.org/r/20220624155226.2889613-3-arnd@kernel.org
Cc: Miquel van Smoorenburg <mikevs@xs4all.net>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The BusLogic driver is the last remaining driver that relies on the
deprecated bus_to_virt() function, which in turn only works on a few
architectures, and is incompatible with both swiotlb and iommu support.
Before commit 391e2f2560 ("[SCSI] BusLogic: Port driver to 64-bit."), the
driver had a dependency on x86-32, presumably because of this
problem. However, the change introduced another bug that made it still
impossible to use the driver on any 64-bit machine.
This was in turn fixed in commit 56f396146a ("scsi: BusLogic: Fix 64-bit
system enumeration error for Buslogic"), 8 years later, which shows that
there are not a lot of users.
Maciej is still using the driver on 32-bit hardware, and Khalid mentioned
that the driver works with the device emulation used in VirtualBox and
VMware. Both of those only emulate it for Windows 2000 and older operating
systems that did not ship with the better LSI logic driver.
Do a minimum fix that searches through the list of descriptors to find one
that matches the bus address. This is clearly as inefficient as was
indicated in the code comment about the lack of a bus_to_virt()
replacement. A better fix would likely involve changing out the entire
descriptor allocation for a simpler one, but that would be much more
invasive.
Link: https://lore.kernel.org/r/20220624155226.2889613-2-arnd@kernel.org
Cc: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: Matt Wang <wwentao@vmware.com>
Tested-by: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Variable wlen is being assigned a value that is never read, it is being
re-assigned with a different value later on. The assignment is redundant
and can be removed.
Cleans up clang scan build warning:
drivers/scsi/fcoe/fcoe.c:1491:2: warning: Value stored to 'wlen'
is never read [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220623164710.76831-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It seems to me that mpt3sas driver is using dedicated workqueues and is not
calling schedule{,_delayed}_work{,_on}(). Then, there will be no work to
flush using flush_scheduled_work().
Link: https://lore.kernel.org/r/f3b97c7c-1094-4e46-20d8-4321716d6f3f@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a spelling mistake in _base_sas_ioc_info(). Change 'cant' to
'can't'.
Also fix up whitespace.
Link: https://lore.kernel.org/r/20220617101103.3162-1-jiaming@nfschina.com
Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The conn_send_pdu API is evil in that it returns a pointer to an
iscsi_task, but that task might have been freed already so you can't touch
it. This patch splits the task allocation and transmission, so functions
like iscsi_send_nopout() can access the task before its sent and do
whatever bookkeeping is needed before it is sent.
Link: https://lore.kernel.org/r/20220616224557.115234-10-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We need the back lock when freeing a task, so we hold it when calling
__iscsi_put_task() from the completion path to make it easier and to avoid
having to retake it in that path. For iscsi_put_task() we just grabbed it
while also doing the decrement on the refcount but it's only really needed
if the refcount is zero and we free the task. This modifies
iscsi_put_task() to just take the lock when needed then has the xmit path
use it. Normally we will then not take the back lock from the xmit path. It
will only be rare cases where the network is so fast that we get a response
right after we send the header/data.
Link: https://lore.kernel.org/r/20220616224557.115234-9-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We currently require that the back_lock is held when calling the functions
that manipulate the iscsi_task refcount. The only reason for this is to
handle races where we are handling SCSI-ml EH callbacks and the cmd is
completing at the same time the normal completion path is running, and we
can't return from the EH callback until the driver has stopped accessing
the cmd. Holding the back_lock while also accessing the task->state made it
simple to check that a cmd is completing and also get/put a refcount at the
same time, and at the time we were not as concerned about performance.
The problem is that we don't want to take the back_lock from the xmit path
for normal I/O since it causes contention with the completion path if the
user has chosen to try and split those paths on different CPUs (in this
case abusing the CPUs and ignoring caching improves perf for some uses).
Begins to remove the back_lock requirement for iscsi_get/put_task by
removing the requirement for the get path. Instead of always holding the
back_lock we detect if something has done the last put and is about to call
iscsi_free_task(). A subsequent commit will then allow iSCSI code to do the
last put on a task and only grab the back_lock if the refcount is now zero
and it's going to call iscsi_free_task().
Link: https://lore.kernel.org/r/20220616224557.115234-8-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 5923d64b7a ("scsi: libiscsi: Drop taskqueuelock") added an extra
task->state because for commit 6f8830f5bb ("scsi: libiscsi: add lock
around task lists to fix list corruption regression") we didn't know why we
ended up with cmds on the list and thought it might have been a bad target
sending a response while we were still sending the cmd. We were never able
to get a target to send us a response early, because it turns out the bug
was just a race in libiscsi/libiscsi_tcp where:
1. iscsi_tcp_r2t_rsp() queues a r2t to tcp_task->r2tqueue.
2. iscsi_tcp_task_xmit() runs iscsi_tcp_get_curr_r2t() and sees we have a
r2t. It dequeues it and iscsi_tcp_task_xmit() starts to process it.
3. iscsi_tcp_r2t_rsp() runs iscsi_requeue_task() and puts the task on the
requeue list.
4. iscsi_tcp_task_xmit() sends the data for r2t. This is the final chunk
of data, so the cmd is done.
5. target sends the response.
6. On a different CPU from #3, iscsi_complete_task() processes the
response. Since there was no common lock for the list, the lists/tasks
pointers are not fully in sync, so could end up with list corruption.
Since it was just a race on our side, remove the extra check and fix up the
comments.
Link: https://lore.kernel.org/r/20220616224557.115234-7-michael.christie@oracle.com
Reviewed-by: Wu Bo <wubo40@huawei.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For software iSCSI, we do a session per host so there is no need to set the
target's can_queue since it's the same as the host one. Setting it just
results in extra atomic checks in the main I/O path.
Link: https://lore.kernel.org/r/20220616224557.115234-6-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If we have more data, set the MSG_SENDPAGE_NOTLAST in case we go down the
sendpage path.
Link: https://lore.kernel.org/r/20220616224557.115234-5-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We don't always want to run the recv path from the network softirq because
when we have to have multiple sessions sharing the same CPUs, some sessions
can eat up the NAPI softirq budget and affect other sessions or users.
Allow us to queue the recv handling to the iscsi workqueue so we can have
the scheduler/wq code try to balance the work and CPU use across all
sessions' worker threads.
Note: It wasn't the original intent of the change but a nice side effect is
that for some workloads/configs we get a nice performance boost. For a
simple read heavy test:
fio --direct=1 --filename=/dev/dm-0 --rw=randread --bs=256K
--ioengine=libaio --iodepth=128 --numjobs=4
where the iscsi threads, fio jobs, and rps_cpus share CPUs we see a 32%
throughput boost. We also see increases for small I/O IOPs tests but it's
not as high.
Link: https://lore.kernel.org/r/20220616224557.115234-4-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helpers to allow the drivers to run their recv paths from libiscsi's
workqueue.
Link: https://lore.kernel.org/r/20220616224557.115234-3-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Rename iscsi_conn_queue_work() to iscsi_conn_queue_xmit() to reflect that
it handles queueing of xmits only.
Link: https://lore.kernel.org/r/20220616224557.115234-2-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the system is shutting down, iscsid is not running so we will not get
a response to the ISCSI_ERR_INVALID_HOST error event. The system shutdown
will then hang waiting on userspace to remove the session.
This has libiscsi force the destruction of the session from the kernel when
iscsi_host_remove() is called from a driver's shutdown callout.
This fixes a regression added in qedi boot with commit d1f2ce7763 ("scsi:
qedi: Fix host removal with running sessions") which made qedi use the
common session removal function that waits on userspace instead of rolling
its own kernel based removal.
Link: https://lore.kernel.org/r/20220616222738.5722-7-michael.christie@oracle.com
Fixes: d1f2ce7763 ("scsi: qedi: Fix host removal with running sessions")
Tested-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When handling errors that lead to host removal use QEDI_MODE_NORMAL. There
is currently no difference in behavior between NORMAL and SHUTDOWN, but in
a subsequent commit we will want to know when we are called from the
pci_driver shutdown callout vs remove/err_handler so we know when userspace
is up.
Link: https://lore.kernel.org/r/20220616222738.5722-6-michael.christie@oracle.com
Tested-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During qedi shutdown we need to stop the iSCSI layer from sending new nops
as pings and from responding to target ones and make sure there is no
running connection cleanups. Commit d1f2ce7763 ("scsi: qedi: Fix host
removal with running sessions") converted the driver to use the libicsi
helper to drive session removal, so the above issues could be handled. The
problem is that during system shutdown iscsid will not be running so when
we try to remove the root session we will hang waiting for userspace to
reply.
Add a helper that will drive the destruction of sessions like these during
system shutdown.
Link: https://lore.kernel.org/r/20220616222738.5722-5-michael.christie@oracle.com
Tested-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the next patch we allow drivers to drive session removal during
shutdown. In this case iscsid will not be running, so we need to detect
bound endpoints and disconnect them. This moves the bound ep check so we
now always check.
Link: https://lore.kernel.org/r/20220616222738.5722-4-michael.christie@oracle.com
Tested-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
iscsi_if_stop_conn() is only called from the userspace interface but in a
subsequent commit we will want to call it from the kernel interface to
allow drivers like qedi to remove sessions from inside the kernel during
shutdown. This removes the iscsi_uevent code from iscsi_if_stop_conn() so we
can call it in a new helper.
Link: https://lore.kernel.org/r/20220616222738.5722-3-michael.christie@oracle.com
Tested-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If qla4xxx doesn't remove the connection before the session, the iSCSI
class tries to remove the connection for it. We were doing a
iscsi_put_conn() in the iter function which is not needed and will result
in a use after free because iscsi_remove_conn() will free the connection.
Link: https://lore.kernel.org/r/20220616222738.5722-2-michael.christie@oracle.com
Tested-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This function always returns 0. We can make it return void to simplify the
code. Also, no caller ever checks the return value of this function.
Link: https://lore.kernel.org/r/20220616080210.18531-1-mgurtovoy@nvidia.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Clear wait for mailbox interrupt flag to prevent stale mailbox:
Feb 22 05:22:56 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-500a:4: LOOP UP detected (16 Gbps).
Feb 22 05:22:59 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-d04c:4: MBX Command timeout for cmd 69, ...
To fix the issue, driver needs to clear the MBX_INTR_WAIT flag on purging
the mailbox. When the stale mailbox completion does arrive, it will be
dropped.
Link: https://lore.kernel.org/r/20220616053508.27186-11-njavali@marvell.com
Fixes: b6faaaf796 ("scsi: qla2xxx: Serialize mailbox request")
Cc: Naresh Bannoth <nbannoth@in.ibm.com>
Cc: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Cc: stable@vger.kernel.org
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Naresh Bannoth <nbannoth@in.ibm.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FCP-2 devices were not coming back online once they were lost, login
retries exhausted, and then came back up. Fix this by accepting RSCN when
the device is not online.
Link: https://lore.kernel.org/r/20220616053508.27186-10-njavali@marvell.com
Fixes: 44c57f2058 ("scsi: qla2xxx: Changes to support FCP2 Target")
Cc: stable@vger.kernel.org
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>