struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.
Link: https://lore.kernel.org/r/20211012233558.4066756-19-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-19-bvanassche@acm.org
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
All callers pass 0 as offset. Therefore remove the parameter and use a
fixed offset 0 in pci_vpd_find_tag().
Link: https://lore.kernel.org/r/f62e6e19-5423-2ead-b2bd-62844b23ef8f@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/cxlflash/main.c:1369: warning: Function parameter or member 'hwq' not described in 'process_hrrq'
drivers/scsi/cxlflash/main.c:1369: warning: Excess function parameter 'afu' description in 'process_hrrq'
drivers/scsi/cxlflash/main.c:2005: warning: Function parameter or member 'index' not described in 'init_mc'
drivers/scsi/cxlflash/main.c:3303: warning: Function parameter or member 'lunprov' not described in 'cxlflash_lun_provision'
drivers/scsi/cxlflash/main.c:3303: warning: Excess function parameter 'arg' description in 'cxlflash_lun_provision'
drivers/scsi/cxlflash/main.c:3397: warning: Function parameter or member 'afu_dbg' not described in 'cxlflash_afu_debug'
drivers/scsi/cxlflash/main.c:3397: warning: Excess function parameter 'arg' description in 'cxlflash_afu_debug'
Link: https://lore.kernel.org/r/20210317091230.2912389-32-lee.jones@linaro.org
Cc: "Manoj N. Kumar" <manoj@linux.ibm.com>
Cc: "Matthew R. Ochs" <mrochs@linux.ibm.com>
Cc: Uma Krishnan <ukrishn@linux.ibm.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The "cmd" pointer was already dereferenced a couple lines earlier so this
NULL check is too late. Fortunately, the pointer can never be NULL and the
check can be removed.
Link: https://lore.kernel.org/r/20200605110258.GD978434@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix to return negative error code -ENOMEM from create_afu error handling
case instead of 0, as done elsewhere in this function.
Link: https://lore.kernel.org/r/20200428141855.88704-1-weiyongjun1@huawei.com
Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly update of the usual drivers: aacraid, ufs, zfcp,
NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
plus a whole load of minor updates and fixes. The two major core
changes are Al Viro's reworking of sg's handling of copy to/from user,
Ming Lei's removal of the host busy counter to avoid contention in the
multiqueue case and Damien Le Moal's fixing of residual tracking
across error handling.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeKvHCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQJMAQDAjlAi
SNfbyndMqyf+rZGWufDI+43Up1VvW9GeWJHeDwEAxfO5XZsCks2uT8UxXhpEp9L7
HkiUww3zbcgl0FWFkUM=
=cdVU
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: aacraid, ufs, zfcp,
NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
plus a whole load of minor updates and fixes.
The major core changes are Al Viro's reworking of sg's handling of
copy to/from user, Ming Lei's removal of the host busy counter to
avoid contention in the multiqueue case and Damien Le Moal's fixing of
residual tracking across error handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits)
scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort()
scsi: target: core: Fix a pr_debug() argument
scsi: iscsi: Don't send data to unbound connection
scsi: target: iscsi: Wait for all commands to finish before freeing a session
scsi: target: core: Release SPC-2 reservations when closing a session
scsi: target: core: Document target_cmd_size_check()
scsi: bnx2i: fix potential use after free
Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
scsi: NCR5380: Add disconnect_mask module parameter
scsi: NCR5380: Unconditionally clear ICR after do_abort()
scsi: NCR5380: Call scsi_set_resid() on command completion
scsi: scsi_debug: num_tgts must be >= 0
scsi: lpfc: use hdwq assigned cpu for allocation
scsi: arcmsr: fix indentation issues
scsi: qla4xxx: fix double free bug
scsi: pm80xx: Modified the logic to collect fatal dump
scsi: pm80xx: Tie the interrupt name to the module instance
scsi: pm80xx: Controller fatal error through sysfs
scsi: pm80xx: Do not request 12G sas speeds
scsi: pm80xx: Cleanup command when a reset times out
...
The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.
One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.
I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/cxlflash/main.c:47:22: warning:
variable ioarcb set but not used [-Wunused-but-set-variable]
It is never used, so can be removed.
Link: https://lore.kernel.org/r/20191021141957.18828-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings:
drivers/scsi/cxlflash/main.c: In function 'send_afu_cmd':
drivers/scsi/cxlflash/main.c:2347:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (rc) {
^
drivers/scsi/cxlflash/main.c:2357:2: note: here
case -EAGAIN:
^~~~
drivers/scsi/cxlflash/main.c: In function 'term_intr':
drivers/scsi/cxlflash/main.c:754:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (index == PRIMARY_HWQ)
^
drivers/scsi/cxlflash/main.c:756:2: note: here
case UNMAP_TWO:
^~~~
drivers/scsi/cxlflash/main.c:757:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
cfg->ops->unmap_afu_irq(hwq->ctx_cookie, 2, hwq);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/main.c:758:2: note: here
case UNMAP_ONE:
^~~~
drivers/scsi/cxlflash/main.c:759:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
cfg->ops->unmap_afu_irq(hwq->ctx_cookie, 1, hwq);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/main.c:760:2: note: here
case FREE_IRQ:
^~~~
drivers/scsi/cxlflash/main.c: In function 'cxlflash_remove':
drivers/scsi/cxlflash/main.c:975:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
cxlflash_release_chrdev(cfg);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/main.c:976:2: note: here
case INIT_STATE_SCSI:
^~~~
drivers/scsi/cxlflash/main.c:978:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
scsi_remove_host(cfg->host);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/main.c:979:2: note: here
case INIT_STATE_AFU:
^~~~
drivers/scsi/cxlflash/main.c:980:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
term_afu(cfg);
^~~~~~~~~~~~~
drivers/scsi/cxlflash/main.c:981:2: note: here
case INIT_STATE_PCI:
^~~~
drivers/scsi/cxlflash/main.c:983:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
pci_disable_device(pdev);
^~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/cxlflash/main.c:984:2: note: here
case INIT_STATE_NONE:
^~~~
drivers/scsi/cxlflash/main.c: In function 'num_hwqs_store':
drivers/scsi/cxlflash/main.c:3018:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (cfg->state == STATE_NORMAL)
^
drivers/scsi/cxlflash/main.c:3020:2: note: here
default:
^~~~~~~
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
hisi_sas, target/iscsi and target/core. Additionally Christoph
refactored gdth as part of the dma changes. The major mid-layer
change this time is the removal of bidi commands and with them the
whole of the osd/exofs driver and filesystem.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIC54SYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishT1GAPwJEV23
ExPiPsnuVgKj49nLTagZ3rILRQcYNbL+MNYqxQEA0cT8FHzSDBfWY5OKPNE+RQ8z
f69LpXGmMpuagKGvvd4=
=Fhy1
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
hisi_sas, target/iscsi and target/core.
Additionally Christoph refactored gdth as part of the dma changes. The
major mid-layer change this time is the removal of bidi commands and
with them the whole of the osd/exofs driver and filesystem. This is a
major simplification for block and mq in particular"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
scsi: cxgb4i: validate tcp sequence number only if chip version <= T5
scsi: cxgb4i: get pf number from lldi->pf
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
scsi: mpt3sas: Add missing breaks in switch statements
scsi: aacraid: Fix missing break in switch statement
scsi: kill command serial number
scsi: csiostor: drop serial_number usage
scsi: mvumi: use request tag instead of serial_number
scsi: dpt_i2o: remove serial number usage
scsi: st: osst: Remove negative constant left-shifts
scsi: ufs-bsg: Allow reading descriptors
scsi: ufs: Allow reading descriptor via raw upiu
scsi: ufs-bsg: Change the calling convention for write descriptor
scsi: ufs: Remove unused device quirks
Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
scsi: megaraid_sas: Remove a bunch of set but not used variables
scsi: clean obsolete return values of eh_timed_out
scsi: sd: Optimal I/O size should be a multiple of physical block size
scsi: MAINTAINERS: SCSI initiator and target tweaks
scsi: fcoe: make use of fip_mode enum complete
...
Clang warns several times in the scsi subsystem (trimmed for brevity):
drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
case CCISS_GETBUSTYPES:
^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
case CCISS_GETHEARTBEAT:
^
The root cause is that the _IOC macro can generate really large numbers,
which don't fit into type 'int', which is used for the cmd parameter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.
Make that change because none of the ioctls expect a negative value for
any command, it brings the ioctls inline with the reset of the kernel,
and it removes ambiguity, which is never good when dealing with compilers.
Link: https://github.com/ClangBuiltLinux/linux/issues/85
Link: https://github.com/ClangBuiltLinux/linux/issues/154
Link: https://github.com/ClangBuiltLinux/linux/issues/157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Bradley Grove <bgrove@attotech.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Presently when an error is encountered during probe of the cxlflash
adapter, a deadlock is seen with cpu thread stuck inside
cxlflash_remove(). Below is the trace of the deadlock as logged by
khungtaskd:
cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
INFO: task kworker/80:1:890 blocked for more than 120 seconds.
Not tainted 5.0.0-rc4-capi2-kexec+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/80:1 D 0 890 2 0x00000808
Workqueue: events work_for_cpu_fn
Call Trace:
0x4d72136320 (unreliable)
__switch_to+0x2cc/0x460
__schedule+0x2bc/0xac0
schedule+0x40/0xb0
cxlflash_remove+0xec/0x640 [cxlflash]
cxlflash_probe+0x370/0x8f0 [cxlflash]
local_pci_probe+0x6c/0x140
work_for_cpu_fn+0x38/0x60
process_one_work+0x260/0x530
worker_thread+0x280/0x5d0
kthread+0x1a8/0x1b0
ret_from_kernel_thread+0x5c/0x80
INFO: task systemd-udevd:5160 blocked for more than 120 seconds.
The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
successfully probed the 'cxlflash_cfg->state' never changes from
STATE_PROBING hence the deadlock occurs.
We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
STATE_PROBED in case an error occurs during cxlflash_probe() and just
before calling cxlflash_remove().
Cc: stable@vger.kernel.org
Fixes: c21e0bbfc485("cxlflash: Base support for IBM CXL Flash Adapter")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have
a pile of annotation, unused variable and minor updates. The big API
change is the updates for Christoph's DMA rework which include
removing the DISABLE_CLUSTERING flag. And finally there are a couple
of target tree updates.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv
qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj
u1j3H7tha9j1it+4pUw=
=GDa+
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.
Additionally, we have a pile of annotation, unused variable and minor
updates.
The big API change is the updates for Christoph's DMA rework which
include removing the DISABLE_CLUSTERING flag.
And finally there are a couple of target tree updates"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
scsi: isci: request: mark expected switch fall-through
scsi: isci: remote_node_context: mark expected switch fall-throughs
scsi: isci: remote_device: Mark expected switch fall-throughs
scsi: isci: phy: Mark expected switch fall-through
scsi: iscsi: Capture iscsi debug messages using tracepoints
scsi: myrb: Mark expected switch fall-throughs
scsi: megaraid: fix out-of-bound array accesses
scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
scsi: fcoe: remove set but not used variable 'port'
scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
scsi: smartpqi: fix build warnings
scsi: smartpqi: update driver version
scsi: smartpqi: add ofa support
scsi: smartpqi: increase fw status register read timeout
scsi: smartpqi: bump driver version
scsi: smartpqi: add smp_utils support
scsi: smartpqi: correct lun reset issues
scsi: smartpqi: correct volume status
scsi: smartpqi: do not offline disks for transient did no connect conditions
scsi: smartpqi: allow for larger raid maps
...
Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page. Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This removes the legacy (non-mq) IO path for SCSI.
Cc: linux-scsi@vger.kernel.org
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Depending on the underlying transport, cxlflash has a dependency on either
the CXL or OCXL drivers, which are enabled via their Kconfig option.
Instead of having a module wide dependency on these config options, it is
better to isolate the object modules that are dependent on the CXL and OCXL
drivers and adjust the module dependencies accordingly.
This commit isolates the object files that are dependent on CXL and/or
OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef to
avoid compilation errors.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As a staging cleanup to support transport specific builds of the cxlflash
module, relocate device dependent assignments to header files. This will
avoid littering the core driver with conditional compilation logic.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
AFUs can only process a single AFU command at a time. This is enforced with
a global mutex situated within the AFU send routine. As this mutex has a
global scope, it has the potential to unnecessarily block commands destined
for other AFUs.
Instead of using a global mutex, transition the mutex to be per-AFU. This
will allow commands to only be blocked by siblings of the same AFU.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kernel log can get filled with debug messages from send_cmd_ioarrin()
when dynamic debug is enabled for the cxlflash module and there is a lot of
legacy I/O traffic.
While these messages are necessary to debug issues that involve command
tracking, the abundance of data can overwrite other useful data in the
log. The best option available is to limit the messages that should serve
most of the common use cases.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following Oops may be encountered if the device is reset, i.e. EEH
recovery, while there is heavy I/O traffic:
59:mon> t
[c000200db64bb680] c008000009264c40 cxlflash_queuecommand+0x3b8/0x500
[cxlflash]
[c000200db64bb770] c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0
[c000200db64bb7f0] c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0
[c000200db64bb900] c00000000067f528 __blk_run_queue+0x68/0xb0
[c000200db64bb930] c00000000067ab80 __elv_add_request+0x140/0x3c0
[c000200db64bb9b0] c00000000068daac blk_execute_rq_nowait+0xec/0x1a0
[c000200db64bba00] c00000000068dbb0 blk_execute_rq+0x50/0xe0
[c000200db64bba50] c0000000006b2040 sg_io+0x1f0/0x520
[c000200db64bbaf0] c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610
[c000200db64bbc20] c000000000926208 sd_ioctl+0x118/0x280
[c000200db64bbcc0] c00000000069f7ac blkdev_ioctl+0x7fc/0xe30
[c000200db64bbd20] c000000000439204 block_ioctl+0x84/0xa0
[c000200db64bbd40] c0000000003f8514 do_vfs_ioctl+0xd4/0xa00
[c000200db64bbde0] c0000000003f8f04 SyS_ioctl+0xc4/0x130
[c000200db64bbe30] c00000000000b184 system_call+0x58/0x6c
When there is no room to send the I/O request, the cached room is refreshed
by reading the memory mapped command room value from the AFU. The AFU
register mapping is refreshed during a reset, creating a race condition that
can lead to the Oops above.
During a device reset, the AFU should not be unmapped until all the active
send threads quiesce. An atomic counter, cmds_active, is currently used to
track internal AFU commands and quiesce during reset. This same counter can
also be used for the active send threads.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following Oops can occur when there is heavy I/O traffic and the host is
reset by a tool such as sg_reset.
[c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500
[cxlflash] (unreliable)
[c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash]
[c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310
[c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90
[c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0
[c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230
[c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70
[c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0
[c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24
[c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130
[c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120
When a context is reset, the pending commands are flushed and the AFU is
notified. Before the AFU handles this request there could be command
completion interrupts queued to PHB which are yet to be delivered to the
context. In this scenario, a context could receive an interrupt for a command
that has been flushed, leading to a possible crash when the memory for the
flushed command is accessed.
To resolve this problem, a boolean will indicate if the hardware queue is
ready to process interrupts or not. This can be evaluated in the interrupt
handler before proessing an interrupt.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following Oops can occur if an internal command sent to the AFU does not
complete within the timeout:
[c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
[c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
[c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230
[cxlflash]
[c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150[cxl]
[c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
[c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
[c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
[c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
When an internal command times out, the command buffer is freed while it is
still in the pending commands list of the context. This corrupts the list and
when the context is cleaned up, a crash is encountered.
To resolve this issue, when an AFU command or TMF command times out, the
command should be deleted from the hardware queue pending command list before
freeing the buffer.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following Oops can be encountered if a device removal or system shutdown
is initiated while an EEH recovery is in process:
[c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100
[cxlflash]
[c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl]
[c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170
[c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580
[c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
The remove handler frees AFU memory while the EEH recovery is in progress,
leading to a race condition. This can result in a crash if the recovery thread
tries to access this memory.
To resolve this issue, the cxlflash remove handler will evaluate the device
state and yield to any active reset or probing threads.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This commit enables the OCXL operations for the OCXL devices.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Similar to user contexts, master contexts also require that the per-context
LISN registers be programmed for certain AFUs. The mapped trigger page is
obtained from underlying transport and registered with AFU for each master
context.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When an adapter is initialized, transport specific configuration and MMIO
mapping details need to be saved. For CXL, this data is managed by the
underlying kernel module. To maintain a separation between the cxlflash core
and underlying transports, introduce a new structure to store data specific to
the OCXL AFU.
Initially only the pointers to underlying PCI and generic devices are added to
this new structure - it will be expanded further in future commits. Services
to create and destroy this hardware AFU are added and integrated in the probe
and exit paths of the driver.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SISLite specification originally defined the context control register with
a single field of bits to represent the LISN and also stipulated that the
register reset value be 0. The cxlflash driver took advantage of this when
programming the LISN for the master contexts via an unconditional write - no
other bits were preserved.
When unmap support was added, SISLite was updated to define bit 0 of the
context control register as a way for the AFU to notify the context owner that
unmap operations were supported. Thus the assumptions under which the register
is setup changed and the existing unconditional write is clobbering the unmap
state for master contexts. This is presently not an issue due to the order in
which the context control register is programmed in relation to the unmap bit
being queried but should be addressed to avoid a future regression in the
event this code is moved elsewhere.
To remedy this issue, preserve the bits when programming the LISN field in the
context control register. Since the LISN will now be programmed using a read
value, assert that the initial state of the LISN field is as described in
SISLite (0).
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The number of interrupts requested for user contexts are stored in the context
specific structures and utilized to manage the interrupts. For the master
contexts, this number is only used once and therefore not saved.
To prepare for future commits where the number of interrupts will be required
in more than one place, preserve the value in the master context structure.
[mkp: typo in comment]
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As staging to support future accelerator transports, add a shim layer
such that the underlying services the cxlflash driver requires can be
conditional upon the accelerator infrastructure.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Adapter context creation can return either NULL or an error pointer.
Updating the check condition to reflect this.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The CXL-specific work structure used to request the number of interrupts
currently resides as a nested member of both the context information and
hardware queue structures. It is used to cache values (specifically the
number of interrupts) required by the CXL layer when starting a context.
To facilitate staging that will ultimately allow the cxlflash core to
become agnostic of the underlying accelerator transport, remove these
embedded work structures.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Convert cxl-specific pointers to generic cookies to facilitate future
enhancements.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the event of a command failure, cxlflash returns the failure to the upper
layers to process. After processing the error, when the command is queued
again, the private command structure will not be zeroed and the ioasc could be
stale. Per the SISLite specification, the AFU only sets the ioasc in the
presence of a failure. Thus, even though the original command succeeds the
second time, the command is considered a failure due to stale ioasc. This
cycle repeats indefinitely and can cause a hang or IO failure.
To fix the issue, clear the ioasc before queuing any command.
[mkp: added Cc: stable per request]
Fixes: 479ad8e9d4 ("scsi: cxlflash: Remove zeroing of private command data")
Cc: <stable@vger.kernel.org>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull compat and uaccess updates from Al Viro:
- {get,put}_compat_sigset() series
- assorted compat ioctl stuff
- more set_fs() elimination
- a few more timespec64 conversions
- several removals of pointless access_ok() in places where it was
followed only by non-__ variants of primitives
* 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits)
coredump: call do_unlinkat directly instead of sys_unlink
fs: expose do_unlinkat for built-in callers
ext4: take handling of EXT4_IOC_GROUP_ADD into a helper, get rid of set_fs()
ipmi: get rid of pointless access_ok()
pi433: sanitize ioctl
cxlflash: get rid of pointless access_ok()
mtdchar: get rid of pointless access_ok()
r128: switch compat ioctls to drm_ioctl_kernel()
selection: get rid of field-by-field copyin
VT_RESIZEX: get rid of field-by-field copyin
i2c compat ioctls: move to ->compat_ioctl()
sched_rr_get_interval(): move compat to native, get rid of set_fs()
mips: switch to {get,put}_compat_sigset()
sparc: switch to {get,put}_compat_sigset()
s390: switch to {get,put}_compat_sigset()
ppc: switch to {get,put}_compat_sigset()
parisc: switch to {get,put}_compat_sigset()
get_compat_sigset()
get rid of {get,put}_compat_itimerspec()
io_getevents: Use timespec64 to represent timeouts
...
Currently, all adapters that cxlflash supports must have WWPN VPD
keywords to complete configuration. This was required as cards with
external FC ports needed to be programmed with WWPNs.
Newer supported cards do not have an external FC interface and therefore
do not require WWPN. To support backwards compatibility, these devices
have included 'dummy' WWPN VPD with WWPN values of zero. This however
places a dependency that all future cards have WWPN VPD, which may not
always be the case.
Allow for cards to not have WWPN, designating which cards are expected
to have it in order to configure properly.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The AFU termination sequence has been refactored over time such that the
main tear down routine, term_afu(), can no longer can be invoked with a
NULL AFU pointer. Remove the unnecessary existence check from
term_afu().
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The copy_from/to_user() functions return the number of bytes remaining
to be copied but we had intended to return -EFAULT here.
Fixes: bc88ac47d5 ("scsi: cxlflash: Support AFU debug")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The device and host reset handler contain debug prints to help identify
the entities being reset. Today these reset handlers are based on a SCSI
EH design that uses a SCSI command reference as a means of identifying
the target entity. As such, the debug trace includes the SCSI command
pointer and associated CDB. This is not necessary as the SCSI command is
simply the messenger in these scenarios.
Refactor the debug prints in the host and reset handlers to only present
information that is applicable given the function scope.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The current send_tmf() implementation is based on the caller providing a
SCSI command reference. In reality all that is needed is a SCSI device
reference as the routine uses a private command.
Refactor send_tmf() to pass the private adapter configuration reference
and a SCSI device reference. As a nice side effect, this will ease the
burden of converting caller routines to be based solely off of a SCSI
device reference.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The device_unregister() service used when cleaning up the character
device is already responsible for the internal state associated with the
device upon successful creation. As the cxlflash driver does not obtain
a second reference to the character device, the explicit call to
put_device() is not required and can lead to an inconsistent sysfs among
other issues as the reference is no longer valid after the first
put_device() is performed.
Remove the unnecessary put_device() to remedy this issue.
Fixes: a834a36b57 ("scsi: cxlflash: Create character device to provide host management interface")
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, the SCSI command presented to the device reset handler is used
to send TMFs to the AFU for a device reset. This behavior is incorrect as
the command presented is an actual command and not a special notification.
As such, it should only be used for reference and not be acted upon.
Additionally, the existing TMF transmission routine does not account for
actual errors from the hardware, only reflecting failure when a timeout
occurs. This can lead to a condition where the device reset handler is
presented with a false 'success'.
Update send_tmf() to dynamically allocate a private command for sending
the TMF command and properly reflect failure when the completed command
indicates an error or was aborted. Detect TMF commands during response
processing and avoid scsi_done() for these types of commands. Lastly,
update comments in the TMF processing paths to describe the new behavior.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SCSI core now zeroes the per-command private data area prior to
calling into the LLD. Replace the clearing operation that takes place
when the private command data reference is obtained with a routine that
performs common initializations. The zeroing that takes place in the
device reset path remains intact as the private command data associated
with the specified SCSI command is not guaranteed to be cleared.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>