Removed the lpfc_fdmi_attr_entry and lpfc_fdmi_attr_def structures that had
a union causing unintentional zero padding, which required the usage of
__packed. They are replaced with explicit lpfc_fdmi_attr_u32,
lpfc_fdmi_attr_wwn, lpfc_fdmi_attr_fc4types, and lpfc_fdmi_attr_string
structure defines instead of living in a union. This rids of ambiguous
compiler zero padding, and entailed cleaning up bitwise endian
declarations.
As such, all FDMI attribute registration routines are replaced with generic
void *arg and handlers for each of the newly defined attribute structure
types.
Link: https://lore.kernel.org/r/20220911221505.117655-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch case logics are reworked so they appear more similar and
consistent. This eliminates compiler errors indicating unaligned pointer
values and packed members.
Added comments to explain previous size offset accumulations.
Link: https://lore.kernel.org/r/20220911221505.117655-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Clarify naming of the mp/bmp dma buffers:
- Rename mp to rq as it is the request buffer
- Rename bmp to rsp as it is the response buffer
This reduces confusion about what the buffer content is based on their
name.
Link: https://lore.kernel.org/r/20220911221505.117655-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If there is a congestion or automated congestion response mode change, then
log the reported change to kmsg.
Link: https://lore.kernel.org/r/20220911221505.117655-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On a PCI hotplug capable system, it is possible for scsi_device_put() to
happen after lpfc_pci_remove_one() is called. As a result, the
sdev->host->hostt->module dereference is for a previously freed memory
location because the phba structure containing the hostt template was
already freed when lpfc_pci_remove_one() returned.
Since the lpfc module is still loaded during power slot disable, all
scsi_host_templates should be declared as part of the global data segment
instead of inside the heap allocated phba structure. This way the
sdev->host->hostt memory area is always valid as long as the module is
loaded regardless if PCI hotplug dynamically allocates or frees phba
structures.
Move all scsi_host_templates in the phba structure to global variables.
Create a small helper routine to determine appropriate sg_tablesize during
shost allocation.
Link: https://lore.kernel.org/r/20220911221505.117655-7-jsmart2021@gmail.com
Co-developed-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Co-developed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a target makes the mistake of registering a FC4 type with the fabric,
but then rejects a PRLI of that type, the lpfc driver incorrectly retries
the PRLI causing multiple registrations with the transport. The driver
needs to detect the reject reason data and stop any retry.
Rework the PRLI reject scenarios.
Link: https://lore.kernel.org/r/20220911221505.117655-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sometimes VMID targets are not getting rediscovered after a port reset.
The iocb is not freed in lpfc_cmpl_ct_cmd_vmid(), which is the completion
function for the appid CT commands. So after a port reset, the count of
sges is less than the expected count of 250. This causes post reset
operation logic to fail and keep the port offline.
Fix by freeing the iocb and kref put for the lpfc_cmpl_ct_cmd_vmid() early
return cases.
Link: https://lore.kernel.org/r/20220911221505.117655-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In a situation where the node state changes while a REG_LOGIN is in
progress, the LPFC_MBOXQ_t structure is cleared and reused for an
UNREG_LOGIN command to release RPI resources without first freeing the mbuf
pool resource allocated for REG_LOGIN.
Release mbuf pool resource prior to repurposing of the mailbox command
structure from REG_LOGIN to UNREG_LOGIN.
Link: https://lore.kernel.org/r/20220911221505.117655-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a FLOGI is received before we have issued our FLOGI, the ACC response
to the received FLOGI is issued with SID 2 instead of the expected fabric
controller SID. Certain target vendors ignore the malformed ACC with SID 2
and wait for a properly filled ACC with a fabric controller SID.
The lpfc_sli_prep_wqe() routine depends on the FC_PT2PT flag to fill in the
fabric controller SID when in PT2PT mode, but due to a previous commit the
flag was getting cleared. Fix by adding a check for the defer_flogi_acc
flag to know whether or not to clear the FC_PT2PT flag on link up.
Link: https://lore.kernel.org/r/20220911221505.117655-3-jsmart2021@gmail.com
Fixes: 439b93293f ("scsi: lpfc: Fix unsolicited FLOGI receive handling during PT2PT discovery")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The if statment check (prli_fc4_req & PRLI_NVME_TYPE) evaluates to true
when receiving a PRLI request for bogus FC4 type codes that happen to have
the 3rd or 5th bit set because PRLI_NVME_TYPE is 0x28. This leads to
sending a PRLI_NVME_ACC even for bogus FC4 type codes.
Change the bitwise & check to an exact == type code check to ensure we send
PRLI_NVME_ACC only for NVME type coded PRLI requests.
Link: https://lore.kernel.org/r/20220911221505.117655-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The error code from mpi3mr_post_transport_req() is supposed to be passed to
bsg_job_done(job, rc, reslen), but it isn't.
Link: https://lore.kernel.org/r/YyMISJzVDARpVwrr@kili
Fixes: 176d4aa69c ("scsi: mpi3mr: Support SAS transport class callbacks")
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are three error paths which return success:
1) Propagate the error code from mpi3mr_post_transport_req() if it fails.
2) Return -EINVAL if "ioc_status != MPI3_IOCSTATUS_SUCCESS".
3) Return -EINVAL if "le16_to_cpu(mpi_reply.response_data_length) !=
sizeof(struct rep_manu_reply)"
Link: https://lore.kernel.org/r/YyMIJh1HU2Qz9+Rs@kili
Fixes: 2bd37e2849 ("scsi: mpi3mr: Add framework to issue MPT transport cmds")
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ahd_linux_setup_iocell_info() intentionally writes to the const-marked
aic79xx_iocell_info array, but is called during __init, so the location is
actually writable at this point on most architectures. Annotate this
explicitly with __ro_after_init to avoid static analysis confusion.
Link: https://lpc.events/event/16/contributions/1175/attachments/1109/2128/2022-LPC-analyzer-talk.pdf
Link: https://lore.kernel.org/r/20220914115953.3854029-1-keescook@chromium.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reported-by: David Malcolm <dmalcolm@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 8f394da36a ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
made the __qlt_24xx_handle_abts() function return early if
tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it missed to clean
up the allocated memory for the management command.
Link: https://lore.kernel.org/r/20220914024924.695604-1-rafaelmendsr@gmail.com
Fixes: 8f394da36a ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
qla2x00_get_fw_version_str() has been removed since commit abbd8870b9
("[SCSI] qla2xxx: Factor-out ISP specific functions to method-based call
tables.").
qla2x00_release_nvram_protection() has been removed since commit
459c537807 ("[SCSI] qla2xxx: Add ISP24xx flash-manipulation routines.").
qla82xx_rdmem() and qla82xx_wrmem() have been removed since commit
3711333dfb ("[SCSI] qla2xxx: Updates for ISP82xx.").
qla25xx_rd_req_reg(), qla24xx_rd_req_reg(), qla25xx_wrt_rsp_reg(),
qla24xx_wrt_rsp_reg(), qla25xx_wrt_req_reg() and qla24xx_wrt_req_reg() have
been removed since commit 08029990b2 ("[SCSI] qla2xxx: Refactor
request/response-queue register handling.").
qla2x00_async_login_done() has been removed since commit 726b854870
("qla2xxx: Add framework for async fabric discovery").
qlt_24xx_process_response_error() has been removed since commit
c5419e2618 ("scsi: qla2xxx: Combine Active command arrays.").
Remove the declarations for them from header file.
Link: https://lore.kernel.org/r/20220913023722.547249-2-cuigaosheng1@huawei.com
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In __qedf_probe(), if qedf->cdev is NULL which means
qed_ops->common->probe() failed, then the program will goto label err1, and
scsi_host_put() will free lport->host pointer. Because the memory qedf
points to is allocated by libfc_host_alloc(), it will be freed by
scsi_host_put(). However, the if statement below label err0 only checks
whether qedf is NULL but doesn't check whether the memory has been freed.
So a UAF bug can occur.
There are two ways to reach the statements below err0. The first one is
described as before, "qedf" should be set to NULL. The second one is goto
"err0" directly. In the latter scenario qedf hasn't been changed and it has
the initial value NULL. As a result the if statement is not reachable in
any situation.
The KASAN logs are as follows:
[ 2.312969] BUG: KASAN: use-after-free in __qedf_probe+0x5dcf/0x6bc0
[ 2.312969]
[ 2.312969] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 2.312969] Call Trace:
[ 2.312969] dump_stack_lvl+0x59/0x7b
[ 2.312969] print_address_description+0x7c/0x3b0
[ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0
[ 2.312969] __kasan_report+0x160/0x1c0
[ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0
[ 2.312969] kasan_report+0x4b/0x70
[ 2.312969] ? kobject_put+0x25d/0x290
[ 2.312969] kasan_check_range+0x2ca/0x310
[ 2.312969] __qedf_probe+0x5dcf/0x6bc0
[ 2.312969] ? selinux_kernfs_init_security+0xdc/0x5f0
[ 2.312969] ? trace_rpm_return_int_rcuidle+0x18/0x120
[ 2.312969] ? rpm_resume+0xa5c/0x16e0
[ 2.312969] ? qedf_get_generic_tlv_data+0x160/0x160
[ 2.312969] local_pci_probe+0x13c/0x1f0
[ 2.312969] pci_device_probe+0x37e/0x6c0
Link: https://lore.kernel.org/r/20211112120641.16073-1-fantasquex@gmail.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Co-developed-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Rafael explained that the reason for having both PF_NOFREEZE and
PF_FREEZER_SKIP is that {,un}lock_system_sleep() is callable from
kthread context that has previously called set_freezable().
In preparation of merging the flags, have {,un}lock_system_slee() save
and restore current->flags.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114648.725003428@infradead.org
Fix the following use-after-free warning which is observed during
controller reset:
refcount_t: underflow; use-after-free.
WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remote devices may go missing from the per-device nexus reset part of the
HA nexus, i.e after the controller reset. This is because libsas may find
the devices to be gone as the phy may be temporarily down when processing
the bcast event generated from the nexus reset. Filter out bcast events
during this time to stop the devices being lost.
Link: https://lore.kernel.org/r/1662378529-101489-6-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In resetting the controller, SATA devices may be lost.
The issue is that when we insert the bcast events to rescan the topology in
hisi_sas_rescan_topology(), when we subsequently nexus reset the SATA
devices in hisi_sas_async_I_T_nexus_reset(), there is a small timing window
in which the remote phy is down and we process the bcast event (meaning
that libsas judges that the disk is lost).
Ensure that all bcast events are processed prior to the nexus reset to
close this window.
Link: https://lore.kernel.org/r/1662378529-101489-4-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Once the controller HW has been reset then we can unset flag
HISI_SAS_HW_FAULT_BIT. In clearing this flag earlier we can now
successfully execute commands in hisi_sas_controller_reset_done(), like
bcast processing.
Link: https://lore.kernel.org/r/1662378529-101489-3-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Now that libsas and the SCSI core code limits the default sectors from
commit 4cbfca5f77 ("scsi: scsi_transport_sas: cap shost opt_sectors
according to DMA optimal limit") and commit 608128d391 ("scsi: sd: allow
max_sectors be capped at DMA optimal size limit"), there is no need for
the hack to limit the max HW sectors.
Link: https://lore.kernel.org/r/1662378529-101489-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation for FORTIFY_SOURCE performing run-time destination buffer
bounds checking for memcpy(), specify the destination output buffer
explicitly, instead of asking memcpy() to write past the end of what looked
like a fixed-size object. Silences future run-time warning:
memcpy: detected field-spanning write (size 80) of single field "trc + 1" (size 64)
There is no binary code output differences from this change.
Link: https://lore.kernel.org/r/20220901205729.2260982-1-keescook@chromium.org
Cc: Bradley Grove <linuxdrivers@attotech.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The original code will "goto out_disable_device" and call
pci_disable_device() if pci_enable_device() fails. The kernel will generate
a warning message like "3w-9xxx 0000:00:05.0: disabling already-disabled
device".
We shouldn't disable a device that failed to be enabled. A simple return is
fine.
Link: https://lore.kernel.org/r/20220829110115.38789-1-fantasquex@gmail.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a driver returns:
- DID_TARGET_FAILURE
- DID_NEXUS_FAILURE
- DID_ALLOC_FAILURE
- DID_MEDIUM_ERROR
we hit a couple bugs:
1. The SCSI error handler runs because scsi_decide_disposition() has no
case statements for them and we return FAILED.
2. For SG IO the userspace app gets a success status instead of failed,
because scsi_result_to_blk_status() clears those errors.
This patch adds a new internal error code byte for use by the SCSI
midlayer. This will be used instead of the above error codes, so we don't
have to play that clearing the host code game in
scsi_result_to_blk_status() and drivers cannot accidentally use them.
A subsequent commit will then remove the internal users of the above codes
and convert us to use the new ones.
Link: https://lore.kernel.org/r/20220812010027.8251-9-michael.christie@oracle.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_ALLOC_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in entering SCSI error handling.
By the code comment, it looks like the driver wanted a retryable error
code, so this has it use DID_ERROR.
Link: https://lore.kernel.org/r/20220812010027.8251-8-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it
results in entering SCSI error handling.
This has qla2xxx use DID_NO_CONNECT because it looks like we hit this error
when we can't find a port. It will give us the same hard error behavior and
it seems to match the error where we can't find the endpoint.
Link: https://lore.kernel.org/r/20220812010027.8251-7-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_NEXUS_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in entering SCSI error handling.
virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_NEXUS_FAILURE. It looks like qemu returns that error code if
host OS returns DID_NEXUS_FAILURE (qemu's internal
SCSI_HOST_RESERVATION_ERROR maps to DID_NEXUS_FAILURE). This shouldn't
happen for Linux since we don't propagate that error code to userspace.
This has us convert VIRTIO_SCSI_S_NEXUS_FAILURE to a
SAM_STAT_RESERVATION_CONFLICT in case some other virt layer is returning
it. In that case we will still get the reservation confict failure we
expect.
Link: https://lore.kernel.org/r/20220812010027.8251-6-michael.christie@oracle.com
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in entering SCSI error handling.
virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_TARGET_FAILURE. It looks like qemu returns that error code
if a host OS returns it, but this shouldn't happen for Linux since we never
propagate that error to userspace.
This has us use DID_BAD_TARGET in case some other virt layer is returning
it. In that case we will still get a hard error like before and it conveys
something unexpected happened.
Link: https://lore.kernel.org/r/20220812010027.8251-5-michael.christie@oracle.com
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it results
in the SCSI error handling running.
It looks like the driver wanted a hard failure so swap it with
DID_BAD_TARGET.
Link: https://lore.kernel.org/r/20220812010027.8251-3-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The error codes:
- DID_TARGET_FAILURE
- DID_NEXUS_FAILURE
- DID_ALLOC_FAILURE
- DID_MEDIUM_ERROR
are internal to the SCSI layer. Drivers must not use them because:
1. They are not propagated upwards, so SG IO/passthrough users will not
see an error and think a command was successful.
xen-scsiback will never see this error and should not try to send it.
2. There is no handling for them in scsi_decide_disposition() so if
xen-scsifront were to return the error to the SCSI midlayer then it
kicks off the error handler which is definitely not what we want.
Remove the use from xen-scsifront/back.
Link: https://lore.kernel.org/r/20220812010027.8251-2-michael.christie@oracle.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are two .exit_cmd_priv implementations. Both implementations use
resources associated with the SCSI host. Make sure that these resources are
still available when .exit_cmd_priv is called by waiting inside
scsi_remove_host() until the tag set has been freed.
This commit fixes the following use-after-free:
==================================================================
BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
Read of size 8 at addr ffff888100337000 by task multipathd/16727
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0x5e/0x5db
kasan_report+0xab/0x120
srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
scsi_mq_exit_request+0x4d/0x70
blk_mq_free_rqs+0x143/0x410
__blk_mq_free_map_and_rqs+0x6e/0x100
blk_mq_free_tag_set+0x2b/0x160
scsi_host_dev_release+0xf3/0x1a0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_device_dev_release_usercontext+0x4c1/0x4e0
execute_in_process_context+0x23/0x90
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_disk_release+0x3f/0x50
device_release+0x54/0xe0
kobject_put+0xa5/0x120
disk_release+0x17f/0x1b0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
dm_put_table_device+0xa3/0x160 [dm_mod]
dm_put_device+0xd0/0x140 [dm_mod]
free_priority_group+0xd8/0x110 [dm_multipath]
free_multipath+0x94/0xe0 [dm_multipath]
dm_table_destroy+0xa2/0x1e0 [dm_mod]
__dm_destroy+0x196/0x350 [dm_mod]
dev_remove+0x10c/0x160 [dm_mod]
ctl_ioctl+0x2c2/0x590 [dm_mod]
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lore.kernel.org/r/20220826002635.919423-1-bvanassche@acm.org
Fixes: 65ca846a53 ("scsi: core: Introduce {init,exit}_cmd_priv()")
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-5-bvanassche@acm.org
Fixes: fe44260419 ("scsi: core: Make sure that targets outlive devices")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-4-bvanassche@acm.org
Fixes: 16728aaba6 ("scsi: core: Make sure that hosts outlive targets")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-3-bvanassche@acm.org
Fixes: 1a9283782d ("scsi: core: Simplify LLD module reference counting")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Revert the patch series "Call blk_mq_free_tag_set() earlier" because it
introduces a deadlock if the scsi_remove_host() caller holds a reference on
a device, target or host.
Link: https://lore.kernel.org/r/20220821220502.13685-2-bvanassche@acm.org
Fixes: f323896fe6 ("scsi: core: Call blk_mq_free_tag_set() earlier")
Reported-by: syzbot+bafeb834708b1bb750bc@syzkaller.appspotmail.com
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the missing destroy_workqueue() before return from
lpfc_sli4_driver_resource_setup() in the error path.
Link: https://lore.kernel.org/r/20220823044237.285643-1-yangyingliang@huawei.com
Fixes: 3cee98db26 ("scsi: lpfc: Fix crash on driver unload in wq free")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
{clear|set}_bit() can take an almost arbitrarily large bit number, so there
is no need to manually compute addresses. This is just redundant.
Link: https://lore.kernel.org/r/c3429a22023f58e5e5cc65d6cd7e83fb2bd9b870.1658340442.git.christophe.jaillet@wanadoo.fr
Tested-by: Don Brace <don.brace@microchip.com>
Acked-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less
verbose and it improves the semantic.
Link: https://lore.kernel.org/r/5f975ef43f8b7306e4ac4e2e8ce4bcd53f6092bb.1658340441.git.christophe.jaillet@wanadoo.fr
Tested-by: Don Brace <don.brace@microchip.com>
Acked-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Return the value from lpfc_issue_reg_vfi() directly instead of storing it
in another redundant variable.
Link: https://lore.kernel.org/r/20220824075123.221316-1-ye.xingchen@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Return the value from lpfc_sli4_issue_wqe() directly instead of storing it
in another redundant variable.
Link: https://lore.kernel.org/r/20220824075017.221244-1-ye.xingchen@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/qla2xxx/qla_os.c:40:20: warning: symbol 'qla_trc_array'
was not declared. Should it be static?
drivers/scsi/qla2xxx/qla_os.c:345:5: warning: symbol
'ql2xdelay_before_pci_error_handling' was not declared.
Should it be static?
Define qla_trc_array and ql2xdelay_before_pci_error_handling as static to
fix sparse warnings.
Link: https://lore.kernel.org/r/20220826102559.17474-7-njavali@marvell.com
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Older tracing of driver messages was to:
- log only debug messages to kernel main trace buffer; and
- log only if extended logging bits corresponding to this message is
off
This has been modified and extended as follows:
- Tracing is now controlled via ql2xextended_error_logging_ktrace
module parameter. Bit usages same as ql2xextended_error_logging.
- Tracing uses "qla2xxx" trace instance, unless instance creation have
issues.
- Tracing is enabled (compile time tunable).
- All driver messages, include debug and log messages are now traced in
kernel trace buffer.
Trace messages can be viewed by looking at the qla2xxx instance at:
/sys/kernel/tracing/instances/qla2xxx/trace
Trace tunable that takes the same bit mask as ql2xextended_error_logging
is:
ql2xextended_error_logging_ktrace (default=1)
Link: https://lore.kernel.org/r/20220826102559.17474-6-njavali@marvell.com
Suggested-by: Daniel Wagner <dwagner@suse.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add new API to obtain the NVMe Parameters region status from the Auxiliary
Image Status bitmap.
Link: https://lore.kernel.org/r/20220826102559.17474-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On some platforms, the current logic of relying on finding new packet
solely based on signature pattern can lead to driver reading stale
packets. Though this is a bug in those platforms, reduce such exposures by
limiting reading packets until the IN pointer.
Link: https://lore.kernel.org/r/20220826102559.17474-3-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This message is helpful to troubleshoot missing LUNs/SAN boot errors. It'd
be nice to log it by default instead of only being enabled with debug.
This user had an accidental/forgotten file modprobe.d/qla2xxx.conf w/
option qlini_mode=disabled from experiments with FC target mode, and their
boot LUN didn't come up, as it skips SCSI scan, of course.
However, their boot log didn't provide any clues to help understand that.
The issue/message could be figured out w/ ql2xextended_error_logging, but
it would have been simpler (or even deflected/addressed by user) if it had
been there by default. And it also would help support/triage/deflection
tooling.
Expected change:
scsi host15: qla2xxx
+qla2xxx [0000:3b:00.0]-00fb:15: skipping scsi_scan_host() for non-initiator port
qla2xxx [0000:3b:00.0]-00fb:15: QLogic QLE2692 - QLE2692 Dual Port 16Gb FC to PCIe Gen3 x8 Adapter.
According to:
qla2x00_probe_one()
...
ret = scsi_add_host(...);
...
ql_log(ql_log_info, ...
"skipping scsi_scan_host() for non-initiator port\n");
...
ql_log(ql_log_info, ...
"QLogic %s - %s.\n", ha->model_number, ha->model_desc);
Link: https://lore.kernel.org/r/20220825120159.275051-1-mfo@canonical.com
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the driver hits an internal error condition returning DID_REQUEUE the
I/O will be retried on the same ITL nexus. This will inhibit multipathing,
resulting in endless retries even if the error could have been resolved by
using a different ITL nexus. Return DID_TRANSPORT_DISRUPTED to allow for
multipath to engage and route I/O to another ITL nexus.
Link: https://lore.kernel.org/r/20220824060033.138661-1-hare@suse.de
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With cmd_per_lun value 7, a higher number of cache lines (map_nr) are
needed while allocating sdev->budget_map which is not reasonable and hence
increase the cmd_per_lun value to 128.
Link: https://lore.kernel.org/r/20220825075457.16422-4-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The ExtendedType field was set to 1 in the diag buffer register command and
hence MPT Endpoint firmware is failing the request with Invalid Field
IOCStatus.
memset the request frame to zero before framing the diag buffer register
command.
Link: https://lore.kernel.org/r/20220825075457.16422-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a pool crosses the 4GB boundary region then before reallocating pools
change the coherent DMA mask to 32 bits and keep the normal DMA mask set to
63/64 bits.
Link: https://lore.kernel.org/r/20220825075457.16422-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If SCSI error handling is taking place for timed out I/Os on a drive and
the corresponding drive is removed, then stop escalating to higher level of
reset by returning the TUR with "I_T NEXUS LOSS OCCURRED" sense key.
Link: https://lore.kernel.org/r/20220816080801.13929-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct MR_DRV_RAID_MAP and refactor the code accordingly.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Link: https://lore.kernel.org/r/1448f387821833726b99f0ce13069ada89164eb5.1660592640.git.gustavoars@kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enhanced-by: Kees Cook <keescook@chromium.org> # Change in struct MR_DRV_RAID_MAP_ALL
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct MR_DRV_RAID_MAP and refactor the the rest of the code
accordingly.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Link: https://lore.kernel.org/r/4495ce170c8ef088a10f1abe0e7c227368f43242.1660592640.git.gustavoars@kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enhanced-by: Kees Cook <keescook@chromium.org> # Change in struct MR_FW_RAID_MAP_ALL
Although qla2xxx driver is calling schedule_{,delayed}_work() from 10
locations, I assume that flush_scheduled_work() from qlt_stop_phase1()
needs to flush only works scheduled by qlt_sched_sess_work(), for this loop
continues while "struct qla_tgt"->sess_works_list is not empty.
Link: https://lore.kernel.org/r/133c6723-90b6-5c8b-72b4-cc01eeb3a8c0@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently qlt_stop_phase1() may fail to call flush_scheduled_work(), for
list_empty() may return true as soon as qlt_sess_work_fn() called
list_del(). In order to close this race window, check list_empty() after
calling flush_scheduled_work().
If this patch causes problems, please check commit c4f135d643
("workqueue: Wrap flush_workqueue() using a macro"). We are on the way to
remove all flush_scheduled_work() calls from the kernel.
Link: https://lore.kernel.org/r/7f24469d-9e39-3398-d851-329b54c0b923@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
"struct qla_tgt"->del_sess_list is no longer used since commit
726b854870 ("qla2xxx: Add framework for async fabric discovery").
Link: https://lore.kernel.org/r/0c335e86-5624-b599-5137-f1377419fb0c@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2022 for files modified in the 14.2.0.6 patch set.
Link: https://lore.kernel.org/r/20220819011736.14141-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.2.0.6.
Link: https://lore.kernel.org/r/20220819011736.14141-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SANDiags feature is unused, and related code is removed.
Link: https://lore.kernel.org/r/20220819011736.14141-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add capability to specify warning notification period to help firmware
adjust to congestion accordingly.
Link: https://lore.kernel.org/r/20220819011736.14141-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kernel test robot reported the following sparse warning:
arch/arm64/include/asm/cmpxchg.h:88:1: sparse: sparse: cast truncates
bits from constant value (369 becomes 69)
On arm64, atomic_xchg only works on 8-bit byte fields. Thus, the macro
usage of LPFC_RXMONITOR_TABLE_IN_USE can be unintentionally truncated
leading to all logic involving the LPFC_RXMONITOR_TABLE_IN_USE macro to not
work properly.
Replace the Rx Table atomic_t indexing logic with a new
lpfc_rx_info_monitor structure that holds a circular ring buffer. For
locking semantics, a spinlock_t is used.
Link: https://lore.kernel.org/r/20220819011736.14141-4-jsmart2021@gmail.com
Fixes: 17b27ac592 ("scsi: lpfc: Add rx monitoring statistics")
Cc: <stable@vger.kernel.org> # v5.15+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
An error case exit from lpfc_cmpl_ct_cmd_gft_id() results in a call to
lpfc_nlp_put() with a null pointer to a nodelist structure.
Changed lpfc_cmpl_ct_cmd_gft_id() to initialize nodelist pointer upon
entry.
Link: https://lore.kernel.org/r/20220819011736.14141-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During a stress offline/online test in PT2PT topology, target rediscovery
can fail with a specific target vendor array.
When the HBA transitions to online mode it is possible to receive an
unsolicited FLOGI before processing the Link Up event. The received FLOGI
will set the defer_flogi_acc_flag, which instructs the driver to wait until
it transmits its own FLOGI before ACKing the received FLOGI. In this
failure scenario, the link up processing clears the set
defer_flogi_acc_flag before we have sent out the FLOGI. As the target has
the higher WWPN and is responsible for sending the PLOGI, the target is
stuck waiting for its FLOGI_ACC that the driver will never send.
Remove the clear of defer_flogi_acc_flag from Link Up event processing. In
this stress test case, the defer_flogi_acc_flag is cleared during the Link
Down event processing anyways.
Link: https://lore.kernel.org/r/20220819011736.14141-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Variable stp is assigned a value that is never read, the assignment and the
variable stp are redundant and can be removed.
Cleans up clang scan build warning:
drivers/scsi/st.c:4253:7: warning: Although the value stored to 'stp'
is used in the enclosing expression, the value is never actually
read from 'stp' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220805115652.2340991-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable mfiStatus is assigned a value but it is never read. The
assignment is redundant and can be removed. Also remove { } as the return
statement does not need to be in its own code block.
Cleans up clang scan build warning:
drivers/scsi/megaraid/megaraid_sas_base.c:4026:7: warning: Although the
value stored to 'mfiStatus' is used in the enclosing expression, the
value is never actually read from 'mfiStatus' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220805115042.2340400-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable scb is assigned a value but it is never read. The assignment
is redundant and can be removed. Also replace the != NULL check with the
more usual non-null check idiom.
Cleans up clang scan build warning:
drivers/scsi/initio.c:1169:9: warning: Although the value stored to 'scb'
is used in the enclosing expression, the value is never actually read
from 'scb' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220805114100.2339637-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Disable firmware download for ATTO devices where it is not supported.
Link: https://lore.kernel.org/r/20220805174609.14830-2-bgrove@attotech.com
Co-developed-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Bradley Grove <bgrove@attotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add ATTO's PCI IDs and modify the driver to handle the unique NVRAM
structure used by ATTO's devices.
Link: https://lore.kernel.org/r/20220805174609.14830-1-bgrove@attotech.com
Co-developed-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Rob Crispo <rcrispo@attotech.com>
Signed-off-by: Bradley Grove <bgrove@attotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Block the I/Os on the target devices until corresponding target device's
target dev objects are refreshed as part of post controller reset
operation.
Link: https://lore.kernel.org/r/20220804131226.16653-16-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update the host's SAS ports if there is change in port id or phys. If the
port id is changed, then the driver updates it. If some phys are
enabled/disabled during reset, then driver updates them in STL.
Check for the responding expander devices and update the device handle if
it got changed. Register the expander with STL if it got added during reset
and unregister the expander device if it got removed during reset.
[mkp: include fix for zeroday warning]
Link: https://lore.kernel.org/r/20220804131226.16653-15-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add support for the following SAS transport class callbacks:
- get_linkerrors
- get_enclosure_identifier
- get_bay_identifier
- phy_reset
- phy_enable
- set_phy_speed
- smp_handler
Link: https://lore.kernel.org/r/20220804131226.16653-14-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add framework to issue MPT transport commands to controllers. Also issue
the MPT transport commands to get the manufacturing info of SAS expander
device.
Link: https://lore.kernel.org/r/20220804131226.16653-13-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register/unregister the SAS, SATA devices to SCSI Transport Layer(STL)
whenever the corresponding device is added/removed from topology.
Link: https://lore.kernel.org/r/20220804131226.16653-12-sreekanth.reddy@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When device is registered with the STL then get the corresponding device's
target object using the rphy in below callback functions:
- mpi3mr_target_alloc()
- mpi3mr_slave_alloc()
- mpi3mr_slave_configure()
- mpi3mr_slave_destroy()
Link: https://lore.kernel.org/r/20220804131226.16653-11-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register/unregister the expander devices to SCSI Transport Layer(STL)
whenever the corresponding expander is added/removed from topology.
Link: https://lore.kernel.org/r/20220804131226.16653-10-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register the SAS, SATA devices to SCSI Transport Layer (STL) only if
multipath capability is disabled in the controller's firmware.
Link: https://lore.kernel.org/r/20220804131226.16653-9-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the following helper functions:
- Update the host phys with STL
- Remove the device's SAS port with STL
Link: https://lore.kernel.org/r/20220804131226.16653-8-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the following helper functions:
- Get the device's sas address by reading corresponding device's Device
page0
- Get the expander object from expander list based on expander's handle
- Get the target device object from target device list based on device's
sas address
- Get the expander device object from expander list based on expanders's
sas address
- Get hba port object from hba port table list based on port's port id
Link: https://lore.kernel.org/r/20220804131226.16653-7-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add framework to register and unregister the host and expander phys with
SCSI Transport Layer (STL).
Link: https://lore.kernel.org/r/20220804131226.16653-6-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable and process the Enclosure device add event.
Link: https://lore.kernel.org/r/20220804131226.16653-5-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper functions to retrieve below controller's config pages:
- SAS IOUnit Page0
- SAS IOUnit Page1
- Driver Page1
- Device Page0
- SAS Phy Page0
- SAS Phy Page1
- SAS Expander Page0
- SAS Expander Page1
- Enclosure Page0
Also add the helper function to set SAS IOUnit Page1.
Link: https://lore.kernel.org/r/20220804131226.16653-4-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add config and transport request related error & info debug flags and
functions.
Link: https://lore.kernel.org/r/20220804131226.16653-2-sreekanth.reddy@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Although commit 88f1669019 ("scsi: sd: Rework asynchronous resume support")
eliminates a delay for some ATA disks after resume, it causes resume of ATA
disks to fail on other setups. See also:
* "Resume process hangs for 5-6 seconds starting sometime in 5.16"
(https://bugzilla.kernel.org/show_bug.cgi?id=215880).
* Geert's regression report
(https://lore.kernel.org/linux-scsi/alpine.DEB.2.22.394.2207191125130.1006766@ramsan.of.borg/).
This is what I understand about this issue:
* During resume, ata_port_pm_resume() starts the SCSI error handler. This
changes the SCSI host state into SHOST_RECOVERY and causes
scsi_queue_rq() to return BLK_STS_RESOURCE.
* sd_resume() calls sd_start_stop_device() for ATA devices. That function
in turn calls sd_submit_start() which tries to submit a START STOP UNIT
command. That command can only be submitted after the SCSI error handler
has changed the SCSI host state back to SHOST_RUNNING.
* The SCSI error handler runs on its own thread and calls
schedule_work(&(ap->scsi_rescan_task)). That causes
ata_scsi_dev_rescan() to be called from the context of a kernel
workqueue. That call hangs in blk_mq_get_tag(). I'm not sure why - maybe
because all available tags have been allocated by sd_submit_start()
calls (this is a guess).
Link: https://lore.kernel.org/r/20220816172638.538734-1-bvanassche@acm.org
Fixes: 88f1669019 ("scsi: sd: Rework asynchronous resume support")
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: gzhqyz@gmail.com
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: gzhqyz@gmail.com
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: John Garry <john.garry@huawei.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since blk_mq_map_queues() and the .map_queues() callbacks always return 0,
change their return type into void. Most callers ignore the returned value
anyway.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20220815170043.19489-3-bvanassche@acm.org
[axboe: fold in fix from Bart]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Passthrough users will set the scsi_cmnd->allowed value and were expecting
up to $allowed retries. The problem is that before:
commit 6aded12b10 ("scsi: core: Remove struct scsi_request")
we used to set the retries on the scsi_request then copy them over to
scsi_cmnd->allowed in scsi_setup_scsi_cmnd. With that patch we now set
scsi_cmnd->allowed to 0 in scsi_prepare_cmd and overwrite what the
passthrough user set.
This moves the allowed initialization to after the blk_rq_is_passthrough()
check so it's only done for the non-passthrough path where the ULD
init_command will normally set an allowed value it prefers.
Link: https://lore.kernel.org/r/20220812011206.9157-1-michael.christie@oracle.com
Fixes: 6aded12b10 ("scsi: core: Remove struct scsi_request")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mostly small bug fixes and trivial updates. The major new core update
is a change to the way device, target and host reference counting is
done to try to make it more robust (this change has soaked for a while
to try to winkle out any bugs).
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYveemSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisheaiAP9DPQ70
7hEdak6evXUwlWwlVBvEFAfZlzpHN+uNzUCvpgD+N61RSPhHV1hXu12sVMmjMNEb
pow+ee47Mc0t/OO68jA=
=1yQh
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes and trivial updates.
The major new core update is a change to the way device, target and
host reference counting is done to try to make it more robust (this
change has soaked for a while to try to winkle out any bugs)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: pm8001: Fix typo 'the the' in comment
scsi: megaraid_sas: Remove redundant variable cmd_type
scsi: FlashPoint: Remove redundant variable bm_int_st
scsi: zfcp: Fix missing auto port scan and thus missing target ports
scsi: core: Call blk_mq_free_tag_set() earlier
scsi: core: Simplify LLD module reference counting
scsi: core: Make sure that hosts outlive targets
scsi: core: Make sure that targets outlive devices
scsi: ufs: ufs-pci: Correct check for RESET DSM
scsi: target: core: De-RCU of se_lun and se_lun acl
scsi: target: core: Fix race during ACL removal
scsi: ufs: core: Correct ufshcd_shutdown() flow
scsi: ufs: core: Increase the maximum data buffer size
scsi: lpfc: Check the return value of alloc_workqueue()
When alloc ctrl mem fails, the reply_map will subsequently be freed in
megasas_free_ctrl_mem(). No need to free it in megasas_alloc_ctrl_mem().
Link: https://lore.kernel.org/r/1659424740-46918-1-git-send-email-kanie@linux.alibaba.com
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When allocating log_to_span fails, kfree(instance->ctrl_context) is called
twice. Remove redundant call.
Link: https://lore.kernel.org/r/1659424729-46502-1-git-send-email-kanie@linux.alibaba.com
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The error path for the SCSI check condition of not ready, target in ALUA
state transition, will result in the failure of that path after the retries
are exhausted. In most cases that is well ahead of the transition timeout
established in the SCSI ALUA device handler.
Instead, reprep the command and re-add it to the queue after a 1 second
delay. This will allow the handler to take care of the timeout and only
fail the path if the target has exceeded the transition expiry timeout
(default 60 seconds). If the expiry timeout is exceeded, the handler will
change the path state from transitioning to standby leading to a path
failure eliminating the potential of this re-prep to continue endlessly. In
most cases the target will exit the transitioning state well before the
expiry timeout but after the retries are exhausted as mentioned.
Additionally remove the scsi_io_completion_reprep() function which provides
little value.
Link: https://lore.kernel.org/r/20220729214110.58576-1-brian@purestorage.com
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Krishna Kant <krishna.kant@purestorage.com>
Acked-by: Seamus Connor <sconnor@purestorage.com>
Signed-off-by: Brian Bunker <brian@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This partially reverts commit d2b292c3f6 ("scsi: qla2xxx: Enable ATIO
interrupt handshake for ISP27XX")
For some workloads where the host sends a batch of commands and then
pauses, ATIO interrupt coalesce can cause some incoming ATIO entries to be
ignored for extended periods of time, resulting in slow performance,
timeouts, and aborted commands.
Disable interrupt coalesce and re-enable the dedicated ATIO MSI-X
interrupt.
Link: https://lore.kernel.org/r/97dcf365-89ff-014d-a3e5-1404c6af511c@cybernetics.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- Add support for syscall stack randomization.
- Add support for atomic operations to the 32 & 64-bit BPF JIT.
- Full support for KASAN on 64-bit Book3E.
- Add a watchdog driver for the new PowerVM hypervisor watchdog.
- Add a number of new selftests for the Power10 PMU support.
- Add a driver for the PowerVM Platform KeyStore.
- Increase the NMI watchdog timeout during live partition migration, to avoid timeouts
due to increased memory access latency.
- Add support for using the 'linux,pci-domain' device tree property for PCI domain
assignment.
- Many other small features and fixes.
Thanks to: Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas
Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz,
Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg
Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna
Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant,
Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, Zhouyi Zhou.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmLuAPgTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBPpD/9kY/T0qlOXABxlZCgtqeAjPX+2xpnY
BF+TlsN1TS1auFcEZL2BapmVacsvOeGEFDVuZHZvZJc69Hx+gSjnjFCnZjp6n+Yz
wt6y9w9Pu0t/sjD5vNQ46O15/dXqm6RoVI7um12j/WLMN8Ko5+x3gKAyQONjQd2/
1kPcxVH6FUosAdnCuvIcqCX4e4IIHl2ZkitHOTXoQUvUy9oAK/mOBnwqZ6zLGUKC
E5M+Zyt4RFGxhPs48FkX6Nq6crDGU/P0VJpDKkR/t7GHnE67Bm70gZougAPrzrgP
nx8zoTWgDKpqDeuqK7pFcyKgNS3dKbxsN3sAfKHOWu/YnV4wMyy+7fmwagMauki7
lXccKN6F/r+8JcMNx80Jp/dAw3ZdLceP38M3Ryf8IL6lTfkNySumUvrKJn6r1Cu1
wvzhgyEuDawss9KHdEmXcA2i3+XVZvitaipO7JWUC8pblrP1SJMoPfIIe9zh3y3M
pyZj0TcGJ8XaK+badvI+PW/K/KeRgXEY8HpC3wDHSoIkli3OE4jDwXn6TiZgvm3n
k0sKL8YSmQZ8hP8QAkR+r8NQKYqLlfyPxdslK5omDPxfub5Uzk9ZV2Ep7svkaiQn
Wqjq27Dpz8+w0XPjsQ0Tkv+ByTkOhrawOH7x9SpFLHpv9g5otcYmS79NkO/htx8C
6LyPNx1VYn5IRA==
=tRkm
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add support for syscall stack randomization
- Add support for atomic operations to the 32 & 64-bit BPF JIT
- Full support for KASAN on 64-bit Book3E
- Add a watchdog driver for the new PowerVM hypervisor watchdog
- Add a number of new selftests for the Power10 PMU support
- Add a driver for the PowerVM Platform KeyStore
- Increase the NMI watchdog timeout during live partition migration, to
avoid timeouts due to increased memory access latency
- Add support for using the 'linux,pci-domain' device tree property for
PCI domain assignment
- Many other small features and fixes
Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira
Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas,
Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A.
Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch,
Naveen N. Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár,
Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher
Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, and Zhouyi Zhou.
* tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits)
powerpc/64e: Fix kexec build error
EDAC/ppc_4xx: Include required of_irq header directly
powerpc/pci: Fix PHB numbering when using opal-phbid
powerpc/64: Init jump labels before parse_early_param()
selftests/powerpc: Avoid GCC 12 uninitialised variable warning
powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
powerpc/xive: Fix refcount leak in xive_get_max_prio
powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
powerpc/perf: Include caps feature for power10 DD1 version
powerpc: add support for syscall stack randomization
powerpc: Move system_call_exception() to syscall.c
powerpc/powernv: rename remaining rng powernv_ functions to pnv_
powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
powerpc/powernv: Avoid crashing if rng is NULL
selftests/powerpc: Fix matrix multiply assist test
powerpc/signal: Update comment for clarity
powerpc: make facility_unavailable_exception 64s
powerpc/platforms/83xx/suspend: Remove write-only global variable
powerpc/platforms/83xx/suspend: Prevent unloading the driver
powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration
...
- convert arm32 to the common dma-direct code (Arnd Bergmann, Robin Murphy,
Christoph Hellwig)
- restructure the PCIe peer to peer mapping support (Logan Gunthorpe)
- allow the IOMMU code to communicate an optional DMA mapping length
and use that in scsi and libata (John Garry)
- split the global swiotlb lock (Tianyu Lan)
- various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang,
Lukas Bulwahn, Robin Murphy)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmLuIYULHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPS5A//Ty1ZNyXExmwZ6J6g7/oIvQlpAHilDr22mCd8tR8Y
Ne7TgLa/X+usFvJTxJfkvg/LNMDjD7qx0J/mhDGm4reOFcEL4/PBy0rDSOgnmntV
k/fPhgwnpuztiAQ+s+WkJ3pkrmG1HaEId7GGj2JaoYdas6RX2mGX7vL8uvUFepjw
lYPAqWMtJHkOfsDK0PqqyQsr7dcC6lyFLqnn/wqvHtTJeKCfGs6W/SIrlWme2SZY
3dNx84ZR1uPjaazAmtf2IWfjh/TBmd0ETRYycgUUKRP9iwsCkBQDBwsBGSIYXiWj
BUKQ5oMvjAlUGRF0jYz9e77KuedE6GxWiXNQstitBmid142M37DHA5tvZRf65MPS
THHcjTDmmoaO4YfFhhXOcFOrjG4/V8bF7fgHB6XkHDjhVVTcnIx8zuOAXIVBZvIV
VAALmamBqEfIZZrCqgr7hzFssK2bip+TIMkdoD46Wcr+D7bAlujhuzWxubn9+ulT
23v/pAvC80ut6LvKj6EA+GpRm/pejfOtEbjXPoO2hguNxvuUKvPQqNh9hy0q+v1e
8n2Y/4lhy5bv02S7wKooNkfCoV753jBY1TIru45UmEYc3EkTQPii6okYe0DvW4QX
VCnKgo156wSBfE+9eWdxCROv2SZqJFMV/wL3vw54dpJQMbDy7VkNsh4mGREdUkU1
uek=
=Bv19
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- convert arm32 to the common dma-direct code (Arnd Bergmann, Robin
Murphy, Christoph Hellwig)
- restructure the PCIe peer to peer mapping support (Logan Gunthorpe)
- allow the IOMMU code to communicate an optional DMA mapping length
and use that in scsi and libata (John Garry)
- split the global swiotlb lock (Tianyu Lan)
- various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang,
Lukas Bulwahn, Robin Murphy)
* tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping: (45 commits)
swiotlb: fix passing local variable to debugfs_create_ulong()
dma-mapping: reformat comment to suppress htmldoc warning
PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg()
RDMA/rw: drop pci_p2pdma_[un]map_sg()
RDMA/core: introduce ib_dma_pci_p2p_dma_supported()
nvme-pci: convert to using dma_map_sgtable()
nvme-pci: check DMA ops when indicating support for PCI P2PDMA
iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg
iommu: Explicitly skip bus address marked segments in __iommu_map_sg()
dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support
dma-direct: support PCI P2PDMA pages in dma-direct map_sg
dma-mapping: allow EREMOTEIO return code for P2PDMA transfers
PCI/P2PDMA: Introduce helpers for dma_map_sg implementations
PCI/P2PDMA: Attempt to set map_type if it has not been set
lib/scatterlist: add flag for indicating P2PDMA segments in an SGL
swiotlb: clean up some coding style and minor issues
dma-mapping: update comment after dmabounce removal
scsi: sd: Add a comment about limiting max_sectors to shost optimal limit
ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors
scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit
...
Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi,
mpi3mr). The main driver change that might cause issues on down the
road is the conversion of some of our oldest surviving drivers to the
DMA API (should only affect m68k). The only major core change is the
rework of async resume; the rest are either completely trivial or for
updating deprecated APIs.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYuvakyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfvOAP4m0N6b
e3JwoBtB1c0JMKv6G4gka8suEG8p5f4khDu8wwD+LfGUCzG49Y5Ts7rByXfEiGgO
krSdwsAZiV6yKg/HuPw=
=Ak9L
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi,
mpi3mr).
The main driver change that might cause issues on down the road is the
conversion of some of our oldest surviving drivers to the DMA API
(should only affect m68k).
The only major core change is the rework of async resume; the rest are
either completely trivial or for updating deprecated APIs"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (195 commits)
scsi: target: Remove XDWRITEREAD emulated support
scsi: megaraid: Remove the static variable initialisation
scsi: ch: Do not initialise statics to 0
scsi: ufs: core: Fix spelling mistake "Cannnot" -> "Cannot"
scsi: target: iscsi: Do not require target authentication
scsi: target: iscsi: Allow AuthMethod=None
scsi: target: iscsi: Support base64 in CHAP
scsi: target: iscsi: Add support for extended CDB AHS
scsi: ufs: dt-bindings: Add SC8280XP binding
scsi: target: iscsi: Fix clang -Wformat warnings
scsi: ufs: core: Read device property for ref clock
scsi: libsas: Resume SAS host for phy reset or enable via sysfs
scsi: hisi_sas: Modify v3 HW SATA completion error processing
scsi: hisi_sas: Relocate DMA unmap of SMP task
scsi: hisi_sas: Remove unnecessary variable to hold DMA map elements
scsi: hisi_sas: Call hisi_sas_slave_configure() from slave_configure_v3_hw()
scsi: mpi3mr: Delete a stray tab
scsi: mpi3mr: Unlock on error path
scsi: mpi3mr: Reduce VD queue depth on detecting throttling
scsi: mpi3mr: Resource Based Metering
...
Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text. Also included in here are a few other minor updates,
2 USB files, and one Documentation file update to get the SPDX lines
correct.
All of these have been in the linux-next tree for a very long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYupz3g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynPUgCgslaf2ssCgW5IeuXbhla+ZBRAzisAnjVgOvLN
4AKdqbiBNlFbCroQwmeQ
=v1sg
-----END PGP SIGNATURE-----
Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text.
Also included in here are a few other minor updates, two USB files,
and one Documentation file update to get the SPDX lines correct.
All of these have been in the linux-next tree for a very long time"
* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
Documentation: samsung-s3c24xx: Add blank line after SPDX directive
x86/crypto: Remove stray comment terminator
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLko3gQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmQaD/90NKFj4v8I456TUQyg1jimXEsL+e84E6o2
ALWVb6JzQvlPVQXNLnK5YKIunMWOTtTMz0nyB8sVRwVJVJO0P5d7QopAkZM8fkyU
MK5OCzoryENw4DTc2wJS4in6cSbGylIuN74wMzlf7+M67JTImfoZQhbTMcjwzZfn
b3OlL6sID7zMXwGcuOJPZyUJICCpDhzdSF9JXqKma5PQuG2SBmQyvFxJAcsoFBPc
YetnoRIOIN6yBvsIZaPaYq7XI9MIvF0e67EQtyCEHj4tHpyVnyDWkeObVFULsISU
gGEKbkYPvNUzRAU5Q1NBBHh1tTfkf/MaUxTuZwoEwZ/s04IGBGMmrZGyfvdfzYo6
M7NwSEg/TrUSNfTwn65mQi7uOXu1pGkJrqz84Flm8u9Qid9Vd7LExLG5p/ggnWdH
5th93MDEmtEg29e9DXpEAuS5d0t3TtSvosflaKpyfNNfr+P0rWCN6GM/uW62VUTK
ls69SQh/AQJRbg64jU4xper6WhaYtSXK7TKEnxJycoEn9gYNyCcdot2uekth0xRH
ChHGmRlteiqe/y4uFWn/2dcxWjoleiHbFjTaiRL75WVl8wIDEjw02LGuoZ61Ss9H
WOV+MT7KqNjBGe6lreUY+O/PO02dzmoR6heJXN19p8zr/pBuLCTGX7UpO7rzgaBR
4N1HEozvIw==
=celk
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- Improve the type checking of request flags (Bart)
- Ensure queue mapping for a single queues always picks the right queue
(Bart)
- Sanitize the io priority handling (Jan)
- rq-qos race fix (Jinke)
- Reserved tags handling improvements (John)
- Separate memory alignment from file/disk offset aligment for O_DIRECT
(Keith)
- Add new ublk driver, userspace block driver using io_uring for
communication with the userspace backend (Ming)
- Use try_cmpxchg() to cleanup the code in various spots (Uros)
- Finally remove bdevname() (Christoph)
- Clean up the zoned device handling (Christoph)
- Clean up independent access range support (Christoph)
- Clean up and improve block sysfs handling (Christoph)
- Clean up and improve teardown of block devices.
This turns the usual two step process into something that is simpler
to implement and handle in block drivers (Christoph)
- Clean up chunk size handling (Christoph)
- Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
Ming, Sebastian, Yang, Ying)
* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
ublk_drv: fix double shift bug
ublk_drv: make sure that correct flags(features) returned to userspace
ublk_drv: fix error handling of ublk_add_dev
ublk_drv: fix lockdep warning
block: remove __blk_get_queue
block: call blk_mq_exit_queue from disk_release for never added disks
blk-mq: fix error handling in __blk_mq_alloc_disk
ublk: defer disk allocation
ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
ublk: cleanup ublk_ctrl_uring_cmd
ublk: simplify ublk_ch_open and ublk_ch_release
ublk: remove the empty open and release block device operations
ublk: remove UBLK_IO_F_PREFLUSH
ublk: add a MAINTAINERS entry
block: don't allow the same type rq_qos add more than once
mmc: fix disk/queue leak in case of adding disk failure
ublk_drv: fix an IS_ERR() vs NULL check
ublk: remove UBLK_IO_F_INTEGRITY
ublk_drv: remove unneeded semicolon
...
Replace 'the the' with 'the' in the comment.
Link: https://lore.kernel.org/r/20220722094612.78583-1-slark_xiao@163.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Slark Xiao <slark_xiao@163.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable cmd_type is assigned a value but it is never read. The
variable and the assignment are redundant and can be removed.
Cleans up clang scan build warning:
drivers/scsi/megaraid/megaraid_sas_fusion.c:3228:10: warning: Although
the value stored to 'cmd_type' is used in the enclosing expression, the
value is never actually read from 'cmd_type' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220730124509.148457-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable bm_int_st is assigned a value but it is never read. The
variable and the assignment are redundant and can be removed.
Cleans up clang scan build warning:
drivers/scsi/FlashPoint.c:1726:7: warning: Although the value stored
to 'bm_int_st' is used in the enclosing expression, the value is never
actually read from 'bm_int_st' [deadcode.DeadStores]
Link: https://lore.kernel.org/r/20220730123736.147758-1-colin.i.king@gmail.com
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are two .exit_cmd_priv implementations. Both implementations use
resources associated with the SCSI host. Make sure that these resources are
still available when .exit_cmd_priv is called by moving the .exit_cmd_priv
calls from scsi_host_dev_release() to scsi_forget_host(). Moving
blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_remove_host() is
safe because scsi_forget_host() waits until all SCSI devices associated
with the host have been removed.
Fix the following use-after-free:
==================================================================
BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
Read of size 8 at addr ffff888100337000 by task multipathd/16727
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0x5e/0x5db
kasan_report+0xab/0x120
srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
scsi_mq_exit_request+0x4d/0x70
blk_mq_free_rqs+0x143/0x410
__blk_mq_free_map_and_rqs+0x6e/0x100
blk_mq_free_tag_set+0x2b/0x160
scsi_host_dev_release+0xf3/0x1a0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_device_dev_release_usercontext+0x4c1/0x4e0
execute_in_process_context+0x23/0x90
device_release+0x54/0xe0
kobject_put+0xa5/0x120
scsi_disk_release+0x3f/0x50
device_release+0x54/0xe0
kobject_put+0xa5/0x120
disk_release+0x17f/0x1b0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
dm_put_table_device+0xa3/0x160 [dm_mod]
dm_put_device+0xd0/0x140 [dm_mod]
free_priority_group+0xd8/0x110 [dm_multipath]
free_multipath+0x94/0xe0 [dm_multipath]
dm_table_destroy+0xa2/0x1e0 [dm_mod]
__dm_destroy+0x196/0x350 [dm_mod]
dev_remove+0x10c/0x160 [dm_mod]
ctl_ioctl+0x2c2/0x590 [dm_mod]
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
dm_ctl_ioctl+0x5/0x10 [dm_mod]
__x64_sys_ioctl+0xb4/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lore.kernel.org/r/20220728221851.1822295-5-bvanassche@acm.org
Fixes: 65ca846a53 ("scsi: core: Introduce {init,exit}_cmd_priv()")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Swap two statements in scsi_device_put() now that it is guaranteed that
SCSI hosts outlive SCSI devices. Remove the reference counting code from
scsi_sysfs.c that became superfluous because SCSI hosts now outlive SCSI
devices.
Link: https://lore.kernel.org/r/20220728221851.1822295-4-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ bvanassche: Extracted this patch from a larger patch ]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the race conditions between SCSI LLD kernel module unloading and SCSI
device and target removal by making sure that SCSI hosts are destroyed
after all associated target and device objects have been freed.
Link: https://lore.kernel.org/r/20220728221851.1822295-3-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ bvanassche: Reworked Ming's patch and split it ]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This commit prevents that the following sequence triggers a kernel crash:
- Deletion of a SCSI device is requested via sysfs. Device removal takes
some time because blk_cleanup_queue() is waiting for the SCSI error
handler.
- The SCSI target associated with that SCSI device is removed.
- scsi_remove_target() returns and its caller frees the resources
associated with the SCSI target.
- The error handler makes progress and invokes an LLD callback that
dereferences the SCSI target pointer.
Link: https://lore.kernel.org/r/20220728221851.1822295-2-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The function alloc_workqueue() in lpfc_sli4_driver_resource_setup() can
fail, but there is no check of its return value. The return value should be
checked.
Link: https://lore.kernel.org/r/20220723064027.2956623-1-williamsukatube@163.com
Fixes: 3cee98db26 ("scsi: lpfc: Fix crash on driver unload in wq free")
Reported-by: Hacash Robot <hacashRobot@santino.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: William Dean <williamsukatube@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
powerpc's asm/prom.h brings some headers that it doesn't need itself.
Once those headers are removed from asm/prom.h, the following
errors occur:
CC [M] drivers/scsi/cxlflash/ocxl_hw.o
drivers/scsi/cxlflash/ocxl_hw.c: In function 'afu_map_irq':
drivers/scsi/cxlflash/ocxl_hw.c:195:16: error: implicit declaration of function 'irq_create_mapping' [-Werror=implicit-function-declaration]
195 | virq = irq_create_mapping(NULL, irq->hwirq);
| ^~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/ocxl_hw.c:222:9: error: implicit declaration of function 'irq_dispose_mapping' [-Werror=implicit-function-declaration]
222 | irq_dispose_mapping(virq);
| ^~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/ocxl_hw.c: In function 'afu_unmap_irq':
drivers/scsi/cxlflash/ocxl_hw.c:264:13: error: implicit declaration of function 'irq_find_mapping'; did you mean 'is_cow_mapping'? [-Werror=implicit-function-declaration]
264 | if (irq_find_mapping(NULL, irq->hwirq)) {
| ^~~~~~~~~~~~~~~~
| is_cow_mapping
cc1: some warnings being treated as errors
Fix it by including linux/irqdomain.h
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c6c0cc5e9179a642370a61439f95158271a78c03.1657264228.git.christophe.leroy@csgroup.eu
Initialising global and static variables to 0 is unnecessary. Remove the
initialisation.
Link: https://lore.kernel.org/r/20220723091620.5463-1-wangborong@cdjrlc.com
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During system shutdown or reboot, mpt3sas will reset the firmware back to
ready state. However, the driver leaves running a watchdog work item
intended to keep the firmware in operational state. This causes a second,
unneeded reset on shutdown and moves the firmware back to operational
instead of in ready state as intended. And if the mpt3sas_fwfault_debug
module parameter is set, this extra reset also panics the system.
mpt3sas's scsih_shutdown needs to stop the watchdog before resetting the
firmware back to ready state.
Link: https://lore.kernel.org/r/20220722142448.6289-1-djeffery@redhat.com
Fixes: fae21608c3 ("scsi: mpt3sas: Transition IOC to Ready state during shutdown")
Tested-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a comment about limiting the default the SCSI disk request_queue
max_sectors initial value to that of the SCSI host optimal sectors limit.
Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and free an IOVA for each mapping, which may
affect performance.
For performance reasons set the request queue max_sectors from
dma_opt_mapping_size(), which knows this mapping limit.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and free an IOVA for each mapping, which may
affect performance.
New member Scsi_Host.opt_sectors is added, which is the optimal host
max_sectors, and use this value to cap the request queue max_sectors when
set.
It could be considered to have request queues io_opt value initially
set at Scsi_Host.opt_sectors in __scsi_init_queue(), but that is not
really the purpose of io_opt.
Finally, even though Scsi_Host.opt_sectors value should never be greater
than the request queue max_hw_sectors value, continue to limit to this
value for safety.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The shost->max_sectors is repeatedly capped according to the host DMA
mapping limit for each sdev in __scsi_init_queue(). This is unnecessary, so
set only once when adding the host.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently if a phy reset or enable phy is issued via sysfs when controller
is suspended, those operations will be ignored as SAS_HA_REGISTERED is
cleared. If RPM is enabled then we may aggressively suspend automatically.
In this case it may be difficult to enable or reset a phy via sysfs, so
resume the host in these scenarios.
Link: https://lore.kernel.org/r/1657823002-139010-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the I/O completion response frame returned by the target device has been
written to the host memory and the err bit in the status field of the
received fis is 1, ts->stat should set to SAS_PROTO_RESPONSE, and this will
let EH analyze and further determine cause of failure.
Link: https://lore.kernel.org/r/1657823002-139010-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently SMP tasks are DMA unmapped only when cq of SMP I/O is returned
normally. If the cq of SMP I/O is returned with exception actually SMP TAS
is never unmapped. Relocate DMA unmap of SMP task to fix the issue.
Link: https://lore.kernel.org/r/1657823002-139010-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use slot->n_elem to store the return value of dma_map_sg() for SSP and SMP
IOs, and remove unnecessary variable n_elem_req.
Link: https://lore.kernel.org/r/1657823002-139010-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is duplicated code between slave_configure_v3_hw() and
hisi_sas_slave_configure(), so call common function
hisi_sas_slave_configure() from slave_configure_v3_hw().
Link: https://lore.kernel.org/r/1657823002-139010-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This code is indented one more tab than it should be.
Link: https://lore.kernel.org/r/YtVCFshEJNC7ELid@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is some clean up necessary before returning. Smatch complains:
drivers/scsi/mpi3mr/mpi3mr_fw.c:4786 mpi3mr_soft_reset_handler()
warn: inconsistent returns '&mrioc->reset_mutex'.
Locked on : 4730
Unlocked on: 4786
Link: https://lore.kernel.org/r/YtVCEsxMU8buuMjP@kili
Fixes: f10af05732 ("scsi: mpi3mr: Resource Based Metering")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reduce the VD queue depth on detecting the throttling condition.
[mkp: incorporate fix for pointer cast issue reported by the test
robot and Guenter Roeck]
Link: https://lore.kernel.org/r/20220708195020.8323-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update driver to track cumulative pending large data size at the controller
level and at the throttle group level. When one of the values meet or
exceed the controller's firmware-determined high threshold value, then the
driver will divert future selective I/O to the firmware. Once both
controller level and at the throttle group level cumulative pending large
data size reach controller's firmware determined low threshold value, then
the driver will stop diverting I/Os to the firmware.
Link: https://lore.kernel.org/r/20220708195020.8323-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a SCSI device is removed while in active use, currently sg will
immediately return -ENODEV on any attempt to wait for active commands that
were sent before the removal. This is problematic for commands that use
SG_FLAG_DIRECT_IO since the data buffer may still be in use by the kernel
when userspace frees or reuses it after getting ENODEV, leading to
corrupted userspace memory (in the case of READ-type commands) or corrupted
data being sent to the device (in the case of WRITE-type commands). This
has been seen in practice when logging out of a iscsi_tcp session, where
the iSCSI driver may still be processing commands after the device has been
marked for removal.
Change the policy to allow userspace to wait for active sg commands even
when the device is being removed. Return -ENODEV only when there are no
more responses to read.
Link: https://lore.kernel.org/r/5ebea46f-fe83-2d0b-233d-d0dcb362dd0a@cybernetics.com
Cc: <stable@vger.kernel.org>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
vref_count took an extra decrement in the task management path. Add an
extra ref count to compensate the imbalance.
Link: https://lore.kernel.org/r/20220713052045.10683-7-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch fixes IKE message being dropped due to error in processing Purex
IOCB and Continuation IOCBs.
Link: https://lore.kernel.org/r/20220713052045.10683-6-njavali@marvell.com
Fixes: fac2807946 ("scsi: qla2xxx: edif: Add extraction of auth_els from the wire")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On some platforms, the current logic of relying on finding new packet
solely based on signature pattern can lead to driver reading stale
packets. Though this is a bug in those platforms, reduce such exposures by
limiting reading packets until the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that has queue pointer shadowing, look for
response packets only until response queue in pointer.
When reset, response packets are read based on a signature pattern
logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no
queue pointer shadowing.
Link: https://lore.kernel.org/r/20220713052045.10683-5-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While requesting a new mailbox command, driver does not write any data to
unused registers. Initialize the unused register value to zero while
requesting a new mailbox command to prevent stale entry access by firmware.
Link: https://lore.kernel.org/r/20220713052045.10683-4-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Improve static type checking by using the new blk_opf_t type for variables
that represent request flags.
Cc: Hannes Reinecke <hare@suse.com>
Cc: Martin Wilck <mwilck@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-43-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use the new blk_opf_t type for arguments and variables that represent
request flags. Use the !! operator in scsi_noretry_cmd() to convert the
blk_opf_t type into a boolean. This patch does not change any functionality.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-42-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch prepares for introducing the new blk_opf_t type in the SCSI core.
Since the value returned by scsi_noretry_cmd() is only used in boolean
expressions, this patch does not change any functionality.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-41-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Improve static type checking by using the new blk_opf_t type for the
combination of a request operation and its flags.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-40-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The type name enum req_opf is misleading since it suggests that values of
this type include both an operation type and flags. Since values of this
type represent an operation only, change the type name into enum req_op.
Convert the enum req_op documentation into kernel-doc format. Move a few
definitions such that the enum req_op documentation occurs just above
the enum req_op definition.
The name "req_opf" was introduced by commit ef295ecf09 ("block: better op
and flags encoding").
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-2-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/165730608687.177165.11815510982277242966.stgit@brunhilda
Reviewed-by: Gerry Morong <gerry.morong@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyright to current year.
Link: https://lore.kernel.org/r/165730608177.177165.13184715486635363193.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow user to override the default driver timeout for controller ready.
There are some rare configurations which require the driver to wait longer
than the normal 3 minutes for the controller to complete its bootup
sequence and be ready to accept commands from the driver.
The module parameter is:
ctrl_ready_timeout= { 0 | 30-1800 }
and specifies the timeout in seconds for the driver to wait for controller
ready. The valid range is 0 or 30-1800. The default value is 0, which
causes the driver to use a timeout of 180 seconds (3 minutes).
Link: https://lore.kernel.org/r/165730607666.177165.9221211345284471213.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change removing a LUN using sysfs from an internal driver function
pqi_remove_all_scsi_devices() to using the .slave_destroy entry in the
scsi_host_template.
A LUN can be deleted via sysfs using this syntax:
echo 1 > /sys/block/sdX/device/delete
Link: https://lore.kernel.org/r/165730607154.177165.9723066932202995774.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow SMP affinity to be changeable by disabling managed interrupts.
On distributions where the driver is enabled for multi-queue support the
driver utilizes kernel managed interrupts, which automatically distributes
interrupts to all available CPUs and assigns SMP affinity.
On most distributions, the affinity can not be changed by the user.
This change will allow managed interrupts to be disabled by the user via a
module parameter while still allowing multi-queue support to function
properly.
Use the module parameter disable_managed_interrupts=1
Link: https://lore.kernel.org/r/165730606638.177165.12846020942931640329.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct a rare stale RAID map access when performing AIO during a RAID
configuration change.
A race condition in the driver could cause it to access a stale RAID map
when a logical volume is reconfigured.
Modify the driver logic to invalidate a RAID map very early when a RAID
configuration change is detected and only switch to a new RAID map after
the driver detects that the RAID map has changed.
Link: https://lore.kernel.org/r/165730606128.177165.7671413443814750829.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct a SOP READ and WRITE DMA flags for some requests.
This update corrects DMA direction issues with SCSI commands removed from
the controller's internal lookup table.
Currently, SCSI READ BLOCK LIMITS (0x5) was removed from the controller
lookup table and exposed a DMA direction flag issue.
SCSI READ BLOCK LIMITS was recently removed from our controller lookup
table so the controller uses the respective IU flag field to set the DMA
data direction. Since the DMA direction is incorrect the FW never completes
the request causing a hang.
Some SCSI commands which use SCSI READ BLOCK LIMITS
* sg_map
* mt -f /dev/stX status
After updating controller firmware, users may notice their tape units
failing. This patch resolves the issue.
Also, the AIO path DMA direction is correct.
The DMA direction flag is a day-one bug with no reported BZ.
Fixes: 6c223761eb ("smartpqi: initial commit of Microsemi smartpqi driver")
Link: https://lore.kernel.org/r/165730605618.177165.9054223644512926624.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mahesh Rajashekhara <Mahesh.Rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change method used to detect controller firmware crash during PQI reset.
PQI reset can fail with error -6 if firmware takes > 100ms to complete
reset.
Method used by driver to detect controller firmware crash during PQI was
incorrect in some cases.
Link: https://lore.kernel.org/r/165730605108.177165.1132931838384767071.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the PCI ID for (values in hex):
VID / DID / SVID / SDID
---- ---- ---- ----
Adaptec SmartHBA 2100-8i-o 9005 / 0285 / 9005 / 0659
Link: https://lore.kernel.org/r/165730604089.177165.17257514581321583667.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fail all outstanding requests after a PCI linkdown.
Block access to device SCSI attributes during the following conditions:
"Cable pull" is called PQI_CTRL_SURPRISE_REMOVAL.
"PCIe Link Down" is called PQI_CTRL_GRACEFUL_REMOVAL.
Block access to device SCSI attributes during and in rare instances when
the controller goes offline.
Either outstanding requests or the access of SCSI attributes post linkdown
can lead to a hang.
Post linkdown, driver does not fail the outstanding requests leading to
long wait time before all the IOs eventually fail.
Also access of the SCSI attributes by host applications can lead to a
system hang.
Link: https://lore.kernel.org/r/165730603578.177165.4699352086827187263.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add driver support for up to 256 LUNs per device.
Update AIO path to pass the appropriate LUN number for base-code to target
the correct LUN.
Update RAID IO path to pass the appropriate LUN number for FW to target the
correct LUN.
Pass the correct LUN number while doing a LUN reset.
Count the outstanding commands based on LUN number. While removing a
Multi-LUN device, wait for all outstanding commands to complete for all
LUNs.
Add Feature bit support.
Link: https://lore.kernel.org/r/165730603067.177165.14016422176841798336.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kumar Meiyappan <Kumar.Meiyappan@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Insert a minimum 1 millisecond delay after writing to a register before
reading from it.
SIS and PQI registers that can be both written to and read from can return
stale data if read from too soon after having been written to.
There is no read/write ordering or hazard detection on the inbound path to
the MSGU from the PCIe bus, therefore reads could pass writes.
Link: https://lore.kernel.org/r/165730602555.177165.11181012469428348394.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com>
Co-developed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Print controller firmware version to OS message log during driver
initialization or after OFA.
Link: https://lore.kernel.org/r/165730601536.177165.17698744242908911822.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check the response code returned from the LUN reset task management
function and if it indicates the LUN is not valid, do not retry.
Reduce rescan worker delay to 5 seconds for the event handler only.
The removal of a drive from the OS could have been delayed up to 30 seconds
after being physically pulled.
The driver was retrying a LUN reset 3 times even though the return code
indiciated the LUN was no longer valid. There was a 10 second delay between
each retry. Additionally, the rescan worker was scheduled to run 10 seconds
after the driver received the event.
Link: https://lore.kernel.org/r/165730601025.177165.9416869335174437006.stgit@brunhilda
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Negotiated link rate needs to be updated to 'Disabled' when phy is stopped.
Link: https://lore.kernel.org/r/20220708205026.969161-1-changyuanl@google.com
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, the data flow of the max/min linkrate in the driver is
* in pm8001_get_lrate_mode():
hardcoded value ==> struct sas_phy
* in pm8001_bytes_dmaed():
struct pm8001_phy ==> struct sas_phy
* in pm8001_phy_control():
libsas data ==> struct pm8001_phy
Since pm8001_bytes_dmaed() follows pm8001_get_lrate_mode(), and the fields
in struct pm8001_phy are not initialized, sysfs
`/sys/class/sas_phy/phy-*/maximum_linkrate` always shows `Unknown`.
To fix the issue, change the dataflow to the following:
* in pm8001_phy_init():
initial value ==> struct pm8001_phy
* in pm8001_get_lrate_mode():
struct pm8001_phy ==> struct sas_phy
* in pm8001_phy_control():
libsas data ==> struct pm8001_phy
For negotiated linkrate, the current dataflow is:
* in pm8001_get_lrate_mode():
iomb data ==> struct asd_sas_phy ==> struct sas_phy
* in pm8001_bytes_dmaed():
struct asd_sas_phy ==> struct sas_phy
Since pm8001_bytes_dmaed() follows pm8001_get_lrate_mode(), the assignment
statements in pm8001_bytes_dmaed() are unnecessary and cleaned up.
Link: https://lore.kernel.org/r/20220707175210.528858-1-changyuanl@google.com
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Refactor code from fnic probe into a different function so that
scsi layer initialization code is grouped together.
Also, add log messages for better debugging.
Link: https://lore.kernel.org/r/20220707205155.692688-1-kartilak@cisco.com
Co-developed-by: Gian Carlo Boffa <gcboffa@cisco.com>
Signed-off-by: Gian Carlo Boffa <gcboffa@cisco.com>
Co-developed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Signed-off-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The megaraid SCSI driver sets set->nr_maps as 3 if poll_queues is > 0, and
blk-mq actually initializes each map's nr_queues as nr_hw_queues.
Consequently the driver has to clear READ queue map's nr_queues, otherwise
the queue map becomes broken if poll_queues is set as non-zero.
Link: https://lore.kernel.org/r/20220706125942.528533-1-ming.lei@redhat.com
Fixes: 9e4bec5b2a ("scsi: megaraid_sas: mq_poll support")
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: sumit.saxena@broadcom.com
Cc: chandrakanth.patil@broadcom.com
Cc: linux-block@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2022 for files modified in the 14.2.0.5 patch set.
Link: https://lore.kernel.org/r/20220701211425.2708-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.2.0.5
Link: https://lore.kernel.org/r/20220701211425.2708-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The Menlo/Hornet adapter was never released to the field. As such, driver
code specific to the adapter is unnecessary and should be removed.
Link: https://lore.kernel.org/r/20220701211425.2708-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_nvmet_prep_abort_wqe() has a lot of common code with
lpfc_sli_prep_abort_xri().
Delete lpfc_nvmet_prep_abort_wqe() as the wqe can be filled out using the
generic lpfc_sli_prep_abort_xri routine(). Add the wqec option to
lpfc_sli_prep_abort_xri() for lpfc_nvmet_prep_abort_wqe().
Link: https://lore.kernel.org/r/20220701211425.2708-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The RSCN_MEMENTO logic was to workaround a target that does not register
both FCP and NVMe FC4 types at the same time. This caused the
configuration to not produce a second RSCN for the NVMe FC4 type
registration in a timely manner. The intention of the RSCN_MEMENTO flag
was to always signal to try NVMe PRLI.
However, there are other FCP-only target arrays in correctly behaved
configurations that reject the NVMe PRLI followed by a LOGO leading to
never rediscovering the target after an issue_lip (as LOGO causes a repeat
of PLOGI/PRLIs).
Revert the RSCN_MEMENTO patch as it is causing correctly behaved configs to
fail while it exists only to succeed on a misbehaved config.
Link: https://lore.kernel.org/r/20220701211425.2708-9-jsmart2021@gmail.com
Fixes: 1045592fc9 ("scsi: lpfc: Introduce FC_RSCN_MEMENTO flag for tracking post RSCN completion")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During a target link bounce test, the driver sees a mismatch between the
NPortId and the WWPN on the node structures (ndlps) involved. When this
occurs, the driver "swaps" the ndlp and new_ndlp node parameters to restore
WWPN/DID uniqueness in the fc_nodes list per vport. However, the driver
neglected to swap the nlp_fc4_type in the ndlp passed to
lpfc_plogi_confirm_nport causing a failure to recover the NVMe PLOGI/PRLI
and ultimately the NVMe paths.
Correct confirm_nport to preserve the fc4 types from the new-ndlp when the
data is moved over ot the ndlp structure.
Link: https://lore.kernel.org/r/20220701211425.2708-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Disabling FA-PWWN should be effective after port reset, but in some cases
it was found to be impossible to clear FA-PWWN usage without a driver
reload.
Clean up FA-PWWN flag management to make enable and disable of the feature
more robust.
Link: https://lore.kernel.org/r/20220701211425.2708-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is no corresponding free routine if lpfc_sli4_issue_wqe fails to
issue the CMF WQE in lpfc_issue_cmf_sync_wqe.
If ret_val is non-zero, then free the iocbq request structure.
Link: https://lore.kernel.org/r/20220701211425.2708-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
VMID introduced an extra increment of cmd_pending, causing double-counting
of the I/O. The normal increment ios performed in lpfc_get_scsi_buf.
Link: https://lore.kernel.org/r/20220701211425.2708-5-jsmart2021@gmail.com
Fixes: 33c79741de ("scsi: lpfc: vmid: Introduce VMID in I/O path")
Cc: <stable@vger.kernel.org> # v5.14+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When providing a D_ID in XMIT_ELS_RSP64_CX iocb the PU field should
be set to 3 to describe the parameter being passed to firmware.
Link: https://lore.kernel.org/r/20220701211425.2708-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Malformed user input to debugfs results in buffer overflow crashes. Adapt
input string lengths to fit within internal buffers, leaving space for NULL
terminators.
Link: https://lore.kernel.org/r/20220701211425.2708-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In lpfc_nvme_cancel_iocb(), a cqe is created locally from stack storage.
The code didn't initialize the total_data_placed word, inheriting stack
content.
Initialize the total_data_placed word.
Link: https://lore.kernel.org/r/20220701211425.2708-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For some technologies, e.g. an ATA bus, resuming can take multiple
seconds. Waiting for resume to finish can cause a very noticeable delay.
Hence this commit that restores the behavior from before "scsi: core: pm:
Rely on the device driver core for async power management" for most SCSI
devices.
This commit introduces a behavior change: if the START command fails, do
not consider this as a SCSI disk resume failure.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880
Link: https://lore.kernel.org/r/20220630195703.10155-3-bvanassche@acm.org
Fixes: a19a93e4c6 ("scsi: core: pm: Rely on the device driver core for async power management")
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: ericspero@icloud.com
Cc: jason600.groome@gmail.com
Tested-by: jason600.groome@gmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the definition of SCSI_QUEUE_DELAY to just above the function that
uses it.
Link: https://lore.kernel.org/r/20220630195703.10155-2-bvanassche@acm.org
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This was found by coccicheck:
drivers/scsi/megaraid/megaraid_sas_base.c:3950 process_fw_state_change_wq() warn: inconsistent indenting.
Link: https://lore.kernel.org/r/20220630074152.29171-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sdev_printk() will only accept messages up to 128 bytes.
Shorten strings exceeding 128 bytes avoid printing an incomplete sentence
like:
[ 475.156955] sd 9:0:0:0: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Link: https://lore.kernel.org/r/20220630024516.1571209-1-lizhijian@fujitsu.com
Suggested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable shared host tagset to make sure that total outstanding I/O count can
not exceed controller's can_queue setting.
Link: https://lore.kernel.org/r/20220628074848.5036-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a copy and paste bug here. It should check ".rsp" instead of
".req". The error message is copy and pasted as well so update that too.
Link: https://lore.kernel.org/r/YrK1A/t3L6HKnswO@kili
Fixes: 9c40c36e75 ("scsi: qla2xxx: edif: Reduce Initiator-Initiator thrashing")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the zone related fields that are currently stored in
struct request_queue to struct gendisk as these are part of the highlevel
block layer API and are only used for non-passthrough I/O that requires
the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220706070350.1703384-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch to a gendisk based API in preparation for moving all zone related
fields from the request_queue to the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220706070350.1703384-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Prepare for storing the zone related field in struct gendisk instead
of struct request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220706070350.1703384-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>