Remove the AAC_STAT_GOOD definition and open code it in the places it was
used.
This will make subsequent refactoring in this area easier.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Dave Carroll <david.carroll@microsemi.com>
Cc: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver fails to set the correct queue depth for native devices, due to
failing to set the device type prior to calling aac_set_safw_target_qd().
This results in slave configure setting the queue depth to 1.
This causes around 30% performance degradation. Fixed by setting the dev
type before trying to set queue depth.
Reported-by: Steve Best <sbest@redhat.com>
Fixes: 0bcb45fb20 ("scsi: aacraid: Add helper function to set queue depth")
cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
get_seconds() can overflow on 32-bit architectures and is deprecated
because of that. The use in the aacraid driver has the same problem due to
a limited firmware interface, it also overflows in the year 2106.
This changes all calls to get_seconds() to the non-deprecated
ktime_get_real_seconds(), which unfortunately doesn't solve that problem
but gets rid of one user of the deprecated interface.
[mkp: checkpatch]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is a set of minor (and safe changes) that didn't make the initial
pull request plus some bug fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWyHBVCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishTkjAPoDF71y
5+w0pim7HQvyo02GxKRWyYzkibZsTfNQ49Yo6wD9EhKp1OD4TIrO1ey3fHpCcYry
CHfUIClnev6hiqDBDrI=
=xJ+K
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of minor (and safe changes) that didn't make the initial
pull request plus some bug fixes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Mask off Scope bits in retry delay
scsi: qla2xxx: Fix crash on qla2x00_mailbox_command
scsi: aic7xxx: aic79xx: fix potential null pointer dereference on ahd
scsi: mpt3sas: Add an I/O barrier
scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
scsi: hpsa: disable device during shutdown
scsi: sd_zbc: Fix sd_zbc_check_zone_size() error path
scsi: aacraid: remove bogus GFP_DMA32 specifies
For one GFP_DMA32 does not actually work with kmalloc, as we only have
GFP_DMA and GFP_KERNEL caches, but not GFP_DMA32. Second the memory
is mapped using the proper DMA API anyway, which would include proper
bounce buffering if needed by the device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
commit b60710ec7d ("scsi: aacraid: enable sending of TMFs from
aac_hba_send()") allows aac_hba_send() to send scsi commands, and TMF
requests, but the existing code only updates the iu_type for scsi
commands. For TMF requests we are sending an unknown iu_type to
firmware, which causes a fault.
Include iu_type prior to determining the validity of the command
Reported-by: Noah Misner <nmisner@us.ibm.com>
Fixes: b60710ec7d ("aacraid: enable sending of TMFs from aac_hba_send()")
Fixes: 423400e64d ("aacraid: Include HBA direct interface")
Tested-by: Noah Misner <nmisner@us.ibm.com>
cc: stable@vger.kernel.org
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly updates of the usual drivers: arcmsr, qla2xx, lpfc,
ufs, mpt3sas, hisi_sas. In addition we have removed several really
old drivers: sym53c416, NCR53c406a, fdomain, fdomain_cs and removed
the old scsi_module.c initialization from all remaining drivers. Plus
an assortment of bug fixes, initialization errors and other minor
fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWsVSnSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbvbAP9ErpTZ
OR5iJ5HIz4W3Bd8aTfEpJrDyeYwSUC+sra5SKQD/ZWyVB3fYFSg+ZROyT26pmtmd
SdImhG7hLaHgVvF5qRQ=
=SQ/n
-----END PGP SIGNATURE-----
Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly updates of the usual drivers: arcmsr, qla2xx, lpfc,
ufs, mpt3sas, hisi_sas.
In addition we have removed several really old drivers: sym53c416,
NCR53c406a, fdomain, fdomain_cs and removed the old scsi_module.c
initialization from all remaining drivers.
Plus an assortment of bug fixes, initialization errors and other minor
fixes"
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (168 commits)
scsi: ufs: Add support for Auto-Hibernate Idle Timer
scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries
scsi: qla2xxx: fx00 copypaste typo
scsi: qla2xxx: fix error message on <qla2400
scsi: smartpqi: update driver version
scsi: smartpqi: workaround fw bug for oq deletion
scsi: arcmsr: Change driver version to v1.40.00.05-20180309
scsi: arcmsr: Sleep to avoid CPU stuck too long for waiting adapter ready
scsi: arcmsr: Handle adapter removed due to thunderbolt cable disconnection.
scsi: arcmsr: Rename ACB_F_BUS_HANG_ON to ACB_F_ADAPTER_REMOVED for adapter hot-plug
scsi: qla2xxx: Update driver version to 10.00.00.06-k
scsi: qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan
scsi: qla2xxx: Cleanup code to improve FC-NVMe error handling
scsi: qla2xxx: Fix FC-NVMe IO abort during driver reset
scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY
scsi: qla2xxx: Remove nvme_done_list
scsi: qla2xxx: Return busy if rport going away
scsi: qla2xxx: Fix n2n_ae flag to prevent dev_loss on PDB change
scsi: qla2xxx: Add FC-NVMe abort processing
scsi: qla2xxx: Add changes for devloss timeout in driver
...
Somewhat nasty merge due to conflicts between "33b28357dd00 scsi:
qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan" and "2b5b96473efc
scsi: qla2xxx: Fix FC-NVMe LUN discovery"
Merge is non-trivial and has been verified by Qlogic (Cavium)
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
This patch fixes spelling typos found in printk.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
These are mostly fixes for problems with merge window code. In
addition we have one doc update (alua) and two dead code removals
(aiclib and octogon) a spurious assignment removal (csiostor) and a
performance improvement for storvsc involving better interrupt
spreading and increasing the command per lun handling.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWo+H2yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishe2eAQDyWfoK
Mfjbrl6cdPop+JIoED0VtBzAQyeXceJt8GYDQwEApXTIZon2HTdJqGawfUhaapBA
JnO6iOiC13/nZjl7C28=
=K3Pk
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"These are mostly fixes for problems with merge window code.
In addition we have one doc update (alua) and two dead code removals
(aiclib and octogon) a spurious assignment removal (csiostor) and a
performance improvement for storvsc involving better interrupt
spreading and increasing the command per lun handling"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla4xxx: skip error recovery in case of register disconnect.
scsi: aacraid: fix shutdown crash when init fails
scsi: qedi: Cleanup local str variable
scsi: qedi: Fix truncation of CHAP name and secret
scsi: qla2xxx: Fix incorrect handle for abort IOCB
scsi: qla2xxx: Fix double free bug after firmware timeout
scsi: storvsc: Increase cmd_per_lun for higher speed devices
scsi: qla2xxx: Fix a locking imbalance in qlt_24xx_handle_els()
scsi: scsi_dh: Document alua_rtpg_queue() arguments
scsi: Remove Makefile entry for oktagon files
scsi: aic7xxx: remove aiclib.c
scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
scsi: sym53c8xx_2: iterator underflow in sym_getsync()
scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
scsi: csiostor: remove redundant assignment to pointer 'ln'
scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info
scsi: qla2xxx: Fix memory corruption during hba reset test
scsi: mpt3sas: fix an out of bound write
During sync command processing, if legacy INTx status indicates command
is not completed, sample the MSIx register and check if it indicates
command completion, set controller MSIx enabled flag.
Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Preserve the current MSIX mode value in the OMR before rewriting the OMR
to initiate the IOP or Soft Reset.
Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
IOP_RESET takes a long time to complete. If controller is in a state
where we can bring it back with init struct, send a DropIO sync command
instead.
- If controller is faulted perform standard IOP_RESET in aac_srcv_init.
- If controller is not faulted get adapter properties and extended
properties.
- Update the sa_firmware variable and determine if DropIO request is
supported.
- Issue DropIO request, and get the number of outstanding commands.
- If all commands are complete with success (CT_OK), consider IOP_RESET
is complete.
- If any commands timeout, Perform the IOP_RESET.
Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When aacraid init fails with "AAC0: adapter self-test failed.", shutdown
leads to UBSAN warning and then oops:
[154316.118423] ================================================================================
[154316.118508] UBSAN: Undefined behaviour in drivers/scsi/scsi_lib.c:2328:27
[154316.118566] member access within null pointer of type 'struct Scsi_Host'
[154316.118631] CPU: 2 PID: 14530 Comm: reboot Tainted: G W 4.15.0-dirty #89
[154316.118701] Hardware name: Hewlett Packard HP NetServer/HP System Board, BIOS 4.06.46 PW 06/25/2003
[154316.118774] Call Trace:
[154316.118848] dump_stack+0x48/0x65
[154316.118916] ubsan_epilogue+0xe/0x40
[154316.118976] __ubsan_handle_type_mismatch+0xfb/0x180
[154316.119043] scsi_block_requests+0x20/0x30
[154316.119135] aac_shutdown+0x18/0x40 [aacraid]
[154316.119196] pci_device_shutdown+0x33/0x50
[154316.119269] device_shutdown+0x18a/0x390
[...]
[154316.123435] BUG: unable to handle kernel NULL pointer dereference at 000000f4
[154316.123515] IP: scsi_block_requests+0xa/0x30
This is because aac_shutdown() does
struct Scsi_Host *shost = pci_get_drvdata(dev);
scsi_block_requests(shost);
and that assumes shost has been assigned with pci_set_drvdata().
However, pci_set_drvdata(pdev, shost) is done in aac_probe_one() far
after bailing out with error from calling the init function
((*aac_drivers[index].init)(aac)), and when the init function fails, no
error is returned from aac_probe_one() so PCI layer assumes there is
driver attached, and tries to shut it down later.
Fix it by returning error from aac_probe_one() when card-specific init
function fails.
This fixes reboot on my HP NetRAID-4M with dead battery.
Signed-off-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly updates of the usual driver suspects: arcmsr,
scsi_debug, mpt3sas, lpfc, cxlflash, qla2xxx, aacraid, megaraid_sas,
hisi_sas. We also have a rework of the libsas hotplug handling to
make it more robust, a slew of 32 bit time conversions and fixes, and
a host of the usual minor updates and style changes. The biggest
potential for regressions is the libsas hotplug changes, but so far
they seem stable under testing.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWnH+5SYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWxuAP0UvuJp
MNR/yU/wv/emSzOc48Ldwd7I0xD2XxSnloGUgwD+IGZZT5yNUQA1THCbm+en4hkB
WvyBieQs9qRit+2czd4=
=gJMf
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly updates of the usual driver suspects: arcmsr,
scsi_debug, mpt3sas, lpfc, cxlflash, qla2xxx, aacraid, megaraid_sas,
hisi_sas.
We also have a rework of the libsas hotplug handling to make it more
robust, a slew of 32 bit time conversions and fixes, and a host of the
usual minor updates and style changes. The biggest potential for
regressions is the libsas hotplug changes, but so far they seem stable
under testing"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (313 commits)
scsi: qla2xxx: Fix logo flag for qlt_free_session_done()
scsi: arcmsr: avoid do_gettimeofday
scsi: core: Add VENDOR_SPECIFIC sense code definitions
scsi: qedi: Drop cqe response during connection recovery
scsi: fas216: fix sense buffer initialization
scsi: ibmvfc: Remove unneeded semicolons
scsi: hisi_sas: fix a bug in hisi_sas_dev_gone()
scsi: hisi_sas: directly attached disk LED feature for v2 hw
scsi: hisi_sas: devicetree: bindings: add LED feature for v2 hw
scsi: megaraid_sas: NVMe passthrough command support
scsi: megaraid: use ktime_get_real for firmware time
scsi: fnic: use 64-bit timestamps
scsi: qedf: Fix error return code in __qedf_probe()
scsi: devinfo: fix format of the device list
scsi: qla2xxx: Update driver version to 10.00.00.05-k
scsi: qla2xxx: Add XCB counters to debugfs
scsi: qla2xxx: Fix queue ID for async abort with Multiqueue
scsi: qla2xxx: Fix warning for code intentation in __qla24xx_handle_gpdb_event()
scsi: qla2xxx: Fix warning during port_name debug print
scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
...
The delay for the rescan worker needs to 10 seconds, missed the HZ in
there.
Fixes: a1367e4ade (scsi: aacraid: Reschedule host scan in case of failure)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The correct lun count needs to be divided by 24, missed it in the
previous patch set.
Fixes: 4b00022753 (scsi: aacraid: Create helper functions to get lun info)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A previous commit no longer stores the contents of c, so we now have a
situation where c is being updated but the value is never read. Clean up
the code by removing the now redundant setting of variable c.
Cleans up clang warning:
drivers/scsi/aacraid/aachba.c:943:3: warning: Value stored to 'c' is
never read
Fixes: f4e8708d31 ("scsi: aacraid: Fix udev inquiry race condition")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The battery in my HP NetRAID-4M died of old age, and the aacraid driver
started oopsing with NULL pointer dereference on startup after that.
Fix it by reordering the init sequence to fill in function pointers
before ioremapping memory, or dev->a_ops.adapter_ioremap pointer will be
NULL.
Other subtypes of aacraid seem to have the order already correct.
This was the call trace:
? aac_probe_one+0x7a5/0xb30 [aacraid]
pci_device_probe+0xc0/0x1a0
driver_probe_device+0x1df/0x3b0
__driver_attach+0xa9/0xe0
? driver_probe_device+0x3b0/0x3b0
bus_for_each_dev+0x4c/0x90
driver_attach+0x1d/0x40
? driver_probe_device+0x3b0/0x3b0
bus_add_driver+0x1a7/0x2a0
driver_register+0x6e/0x130
__pci_register_driver+0x54/0x90
? 0xf81f4000
aac_init+0x2b/0x1000 [aacraid]
do_one_initcall+0x45/0x1e0
? kfree_skbmem+0x74/0xa0
? kfree+0x16d/0x240
? kvfree+0x45/0x50
? kvfree+0x45/0x50
? __vunmap+0x99/0x120
? do_init_module+0x1a/0x245
do_init_module+0x83/0x245
load_module+0x2764/0x34a0
? kernel_read_file+0x150/0x320
SyS_finit_module+0x82/0xa0
do_fast_syscall_32+0xba/0x340
Signed-off-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update driver Version to 50877
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Earlier driver would scan throgh all supported buses and targets and add
devices that responded. It would add devices that were _hidden_ by the fw.
Driver would invalidate commands sent to _hidden_ devices via the
AAC_HIDE_DISK check.
Since the driver now adds only the devices that are supposed to be
exposed, this code can be removed.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a chance of the driver to be stuck in kdump if drives start
acting up in kdump discovery process and the kernel decides to send eh
resets, which would prompt rescan to be scheduled.
Do not perform a rescan in kdump context, since we do not expect a hotplug
event during kdump and all the devices are going to go away anyway.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add back the ability to scan for hotplug changes while eh was in progress.
Schedule a rescan for a later time in the eh recovery code and wait for
eh to complete in the rescan worker.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the driver fails to retrieve information from the fw (could happen when
the fw is not fully in its senses), the driver does nothing and change is
not processed correctly by the driver
Schedule host rescan in case of failure. This is only for SAFW, since
the information retrieval failure will happen on SAFW devices.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver uses scsi_scan_host to add new devices in the driver init path,
which adds all the fw exposed devices. The drivers resorts to queue
command checks to block out commands to _hidden_ devices.
Use the hotplug handler code to add new devices during driver init and
other areas, this is only for safw. For ARC scsi_scan_host will still
apply.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently driver will attempt to process hotplug events concurrently based
on the FW interrupt.
Protect safw update function with a scan mutex.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The device hotplug events are processed only after retrieving the updated
lun information from the fw. Does not make sense to keep them separate.
Merge both the hotplug handling and safw adapter setup code into single
function.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Resolve luns checks the if a sdev is already present in the os to figure
out if it needs to be removed. Internally the driver exposes HBA on bus
2 even though its bus 1 in the fw. Its mildly confusing.
Refactor out the sdev lookup into its function to check if sdev has been
added to the kernel or not. Add helper functions to add, remove and put
devices based on their fw bus and target number.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added macros to loop through the MAX SUPPORTED Buses and Targets. This
will make the code a bit easier to read.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The hotplug handler code is duplicated for hba handling and container
handling.
Merged function to handle hba and container hot plug events into the
resolve luns functions. Added a bunch of helper functions to check the
validity of a given target and to check if bus, target is container
device.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge aac_get_containers to setup target function, so that information
about all the present devices can be retrieved in one shot.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper function to set queue depth from information retrieved from
the bmic phy structure.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Save the bmic information for each phy, so that it can processed in
target setup function.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Created inline function to retrieve lun info for each device from the
phy luns structure.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the function to get phy luns information to the top of function
to set target information
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove function call to process targets from the report phy luns function
and make it a function in its own right. This will help understand the
flow of the code.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper function to setup targets devices and create the base for the
upcoming patches
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Rename variables and functions to make bmic identify, report phy luns
to make them consistent across code internal existing code bases
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Edit function that retrieves phy lun information to use common
bmic function
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
safw command submission is duplicated across many functions.
Move the safw submission code from bmic identify into its own function
for common use
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ideally driver needs to wait for IO to be submitted or responded to before
shutdown.
Move code to wait for IO completion into shutdown path
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Refactored the reset_host store function to make consistent across code
bases
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is possible to restart the controller via the use of the reset_host
sysfs variable. This does work for controllers that can no longer respond,
since driver will attempt to send down a shutdown in this path.
Check if the controller is able to receive commands before sending down
a shutdown
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver would hang when attempting to send reset from the ioctl interface,
since it would wait to retrieve the ioctl mutex at send shutdown.
Set adapter shutdown and unlock mutex before sending down reset request.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As part of the recovery process, the drivers removes offline devices (
done by the kernel) and then tries to add them back in the rescan code.
Removing the device is like taking a sledgehammer to a nail.
Set the device as running if it is marked offline.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver attempts to perform a device scan and device add after coming out
of reset. At times when the kdump kernel loads and it tries to perform
eh recovery, the device scan hangs since its commands are blocked because
of the eh recovery. This should have shown up in normal eh recovery path
(Should have been obvious)
Remove the code that performs scanning.I can live without the rescanning
support in the stable kernels but a hanging kdump/eh recovery needs to be
fixed.
Fixes: a2d0321dd5 (scsi: aacraid: Reload offlined drives after controller reset)
Cc: <stable@vger.kernel.org>
Reported-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Fixes: a2d0321dd5 (scsi: aacraid: Reload offlined drives after controller reset)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check if the adapter can receive abort requests, before sending aborts
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When udev requests for a devices inquiry string, it might create multiple
threads causing a race condition on the shared inquiry resource string.
Created a buffer with the string for each thread.
Cc: <stable@vger.kernel.org>
Fixes: 3bc8070fb7 ([SCSI] aacraid: SMC vendor identification)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
"FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to indicate
command timeout. Using the same flag in reset handler causes the command
to time out and the I/Os were dropped.
Define a new flag "FIB_CONTEXT_FLAG_EH_RESET" to make sure I/O is
properly handled in eh_reset handler.
[mkp: tweaked commit message]
Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Variable managed_request_id is being assigned but it is never read,
hence it is redundant and can be removed. Cleans up clang warning:
drivers/scsi/aacraid/linit.c:706:5: warning: Value stored to
'managed_request_id' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As reported by Meelis Roos, my previous patch causes an incorrect
calculation of the timeout, through an undefined signed integer
overflow:
[ 12.228155] UBSAN: Undefined behaviour in drivers/scsi/aacraid/commsup.c:2514:49
[ 12.228229] signed integer overflow:
[ 12.228283] 964297611 * 250 cannot be represented in type 'long int'
The problem is that doing a multiplication with HZ first and then
dividing by USEC_PER_SEC worked correctly for 32-bit microseconds,
but not for 32-bit nanoseconds, which would require up to 41 bits.
This reworks the calculation to first convert the nanoseconds into
jiffies, which should give us the same result as before and not overflow.
Unfortunately I did not understand the exact intention of the algorithm,
in particular the part where we add half a second, so it's possible that
there is still a preexisting problem in this function. I added a comment
that this would be handled more nicely using usleep_range(), which
generally works better for waking up at a particular time than the
current schedule_timeout() based implementation. I did not feel
comfortable trying to implement that without being sure what the
intent is here though.
Fixes: 820f188659 ("scsi: aacraid: use timespec64 instead of timeval")
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As part of the scsi EH path, aacraid performs a reinitialization of the
adapter, which encompass freeing resources and IRQs, NULLifying lots of
pointers, and then initialize it all over again. We've identified a
problem during the free IRQ portion of this path if CONFIG_DEBUG_SHIRQ
is enabled on kernel config file.
Happens that, in case this flag was set, right after free_irq()
effectively clears the interrupt, it checks if it was requested as
IRQF_SHARED. In positive case, it performs another call to the IRQ
handler on driver. Problem is: since aacraid currently free some
resources *before* freeing the IRQ, once free_irq() path calls the
handler again (due to CONFIG_DEBUG_SHIRQ), aacraid crashes due to NULL
pointer dereference with the following trace:
aac_src_intr_message+0xf8/0x740 [aacraid]
__free_irq+0x33c/0x4a0
free_irq+0x78/0xb0
aac_free_irq+0x13c/0x150 [aacraid]
aac_reset_adapter+0x2e8/0x970 [aacraid]
aac_eh_reset+0x3a8/0x5d0 [aacraid]
scsi_try_host_reset+0x74/0x180
scsi_eh_ready_devs+0xc70/0x1510
scsi_error_handler+0x624/0xa20
This patch prevents the crash by changing the order of the
deinitialization in this path of aacraid: first we clear the IRQ, then
we free other resources. No functional change intended.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the driver accepts two ways of requesting an initialization
reset on the adapter: by passing aac_reset_devices module parameter,
or the generic kernel parameter reset_devices.
It's working as intended...but if we end up reaching a scsi hang and
the scsi EH mechanism takes place, aacraid performs resets as part of
the scsi error recovery procedure. These EH routines might reinitialize
the device, and if we have provided some of the reset parameters in the
kernel command-line, we again perform an "initialization" reset.
So, to avoid this duplication of resets in case of scsi EH path, this
patch adds a field to aac_dev struct to keep per-adapter track of the
init reset request - once it's done, we set it to false and don't
proactively reset anymore in case of reinitializations.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 16ae9dd35d ("scsi: aacraid: Fix for excessive prints on EEH")
introduced checks about the state of device before any PCI operations in
the driver. Basically, this prevents it to perform PCI accesses when
device is in the process of recover from a PCI error. In PowerPC, such
mechanism is called EEH, and the aforementioned commit introduced checks
that are based on EEH-specific primitives for that.
The potential problems with this approach are three: first, these checks
are "locked" to powerpc only - another archs could have error recovery
methods too, like AER in Intel. Also, the powerpc primitives perform
expensive FW accesses to validate the precise PCI state of a device.
Finally, code becomes more complicated and needs ifdef validation based
on arch config being set.
So, this patch makes use of generic PCI state checks, which are
lightweight and non-dependent of arch configs - also, it makes the code
cleaner.
Fixes: 16ae9dd35d ("scsi: aacraid: Fix for excessive prints on EEH")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
updates.
There's no major behaviour change or additions to the core in all of
this, so the potential for regressions should be small (biggest
potential being in the scsi error handler changes).
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJaCxtCAAoJEAVr7HOZEZN4d9EQAI+OHP6ss6zjKKC21c9jNPcH
NhLrNv37gHg/LA2VXeUEL9RGUjCGLIUrI4HsrxzkFAMLKP4TkshMs8/2RvczY+Sa
VpayPqVybEKLIS6ipQyM1SLIQff2nvtDVcN/T+8z1lkk45TrbA6ZGuwUwd2aJyEA
2V2wtg51ObnL0Nr9QPPll0JrtL1AnCZyRlu9XrwTZuuSBZwk93opIuuvbZm/3dVg
Ir4GSS4Y+PuHIfu4cxqdsPMdzRdY9I2me1YiE4jeFSn1/VTAjL4HBz7fO9eITT42
VhXSpDz1XvFsa9dJ0ubkqoALpJzCfOcBw+EuGvSydLEvOBoEVwMccdfaD9lT1zc5
L9e1Z5qqJoq7hTA6xTXCYfWG73I9HYvljtmc8yudKHhADOdnSTUXhaO6uBF0RNqD
OxPSA1RZwRx3c6lDOcK6BTtvLAkTEuYKdrWSKJi0w+QXJAyQ6etqbmsKpmPdRim7
Z4ZSpJFro2gyo9gcdJO0ykTG+z3U7Z/ay1sNgnuprsv+eU/QjUdlAPl18o79EkRf
H54zZggZ4wC6q/cFVVt4Vx+V+oqIeu38s7NDXS9UltLoTZPm2EzDW6pXd/38Z4Tf
a1oBAUET8kYLC90P8sVZxUIHZjITlpgDbyE2Lq00PMYXhk8S4IxF0aMN5RvVqzUv
+7N2HrHkSSgG1nhw1t+E
=3O85
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
updates.
There's no major behaviour change or additions to the core in all of
this, so the potential for regressions should be small (biggest
potential being in the scsi error handler changes)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
scsi: lpfc: Fix hard lock up NMI in els timeout handling.
scsi: mpt3sas: remove a stray KERN_INFO
scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event()
scsi: aacraid: use timespec64 instead of timeval
scsi: scsi_transport_fc: add 64GBIT and 128GBIT port speed definitions
scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair()
scsi: mpt3sas: fix dma_addr_t casts
scsi: be2iscsi: Use kasprintf
scsi: storvsc: Avoid excessive host scan on controller change
scsi: lpfc: fix kzalloc-simple.cocci warnings
scsi: mpt3sas: Update mpt3sas driver version.
scsi: mpt3sas: Fix sparse warnings
scsi: mpt3sas: Fix nvme drives checking for tlr.
scsi: mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info
scsi: mpt3sas: Add-Task-management-debug-info-for-NVMe-drives.
scsi: mpt3sas: scan and add nvme device after controller reset
scsi: mpt3sas: Set NVMe device queue depth as 128
scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.
scsi: mpt3sas: API's to remove nvme drive from sml
scsi: mpt3sas: API 's to support NVMe drive addition to SML
...
aacraid passes the current time to the firmware in one of two ways,
either as year/month/day/... or as 32-bit unsigned seconds.
The first one is broken on 32-bit architectures as it cannot go past
year 2038. Using timespec64 here makes it behave properly on both 32-bit
and 64-bit architectures, and avoids relying on signed integer overflow
to pass times into the second interface.
The interface used in aac_send_hosttime() however is still problematic
in year 2106 when 32-bit seconds overflow. Hopefully we don't have to
worry about aacraid by that time.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is a fix to an issue where the driver sends its periodic WELLNESS
command to the controller after the driver shut it down.This causes the
controller to crash. The window where this can happen is small, but it
can be hit at around 4 hours of constant resets.
Cc: <stable@vger.kernel.org>
Fixes: fbd185986e (aacraid: Fix AIF triggered IOP_RESET)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 0e9973ed33 ("scsi: aacraid: Add periodic checks to see IOP reset
status") changed the way driver checks if a reset succeeded. Now, after an
IOP reset, aacraid immediately start polling a register to verify the reset
is complete.
This behavior cause regressions on the reset path in PowerPC (at least).
Since the delay after the IOP reset was removed by the aforementioned patch,
the fact driver just starts to read a register instantly after the reset
was issued (by writing in another register) "corrupts" the reset procedure,
which ends up failing all the time.
The issue highly impacted kdump on PowerPC, since on kdump path we
proactively issue a reset in adapter (through the reset_devices kernel
parameter).
This patch (re-)adds a delay right after IOP reset is issued. Empirically
we measured that 3 seconds is enough, but for safety reasons we delay
for 5s (and since it was 30s before, 5s is still a small amount).
For reference, without this patch we observe the following messages
on kdump kernel boot process:
[ 76.294] aacraid 0003:01:00.0: IOP reset failed
[ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed
[ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff.
[ 86.524] aacraid 0003:01:00.0: Controller reset type is 3
[ 86.524] aacraid 0003:01:00.0: Issuing IOP reset
[146.534] aacraid 0003:01:00.0: IOP reset failed
[146.534] aacraid 0003:01:00.0: ARC Reset attempt failed
Fixes: 0e9973ed33 ("scsi: aacraid: Add periodic checks to see IOP reset status")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix possible indexing array of bound for &aac->hba_map[bus][cid], where
bus and cid boundary check happens later.
Fixes: 0d643ff3c3 ("scsi: aacraid: use aac_tmf_callback for reset fib")
Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The logic for supporting large drives was previously tied to 4Kn support
for SmartIOC-2000. As SmartIOC-2000 does not support volumes using 4Kn
drives, use the intended option flag AAC_OPT_NEW_COMM_64 to determine
support for volumes greater than 2T.
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_convert_sgraw2() kmalloc memory and return -1 on error, which should
be -ENOMEM. However, nobody is checking return value, so with this
change, -ENOMEM is propagated to upper layer.
Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
unsigned long byte_count = 0;
nseg = scsi_dma_map(scsicmd);
if (nseg < 0)
return nseg;
if (nseg) {
...
}
return byte_count;
is equal to
unsigned long byte_count = 0;
nseg = scsi_dma_map(scsicmd);
if (nseg <= 0)
return nseg;
...
return byte_count;
No other code has changed.
[mkp: fix checkpatch complaints]
Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This fixes a potential race condition observed on Power systems.
Several places throughout the aacraid driver call aac_fib_send or
similar to send a command to the aacraid adapter, then check the return
code to determine if the command was actually sent to the adapter, then
update the phase field in the scsi command scratch pad area to track
that the firmware now owns this command. However, there is nothing that
ensures that by the time the aac_fib_send function returns and we go to
write to the scsi command, that the command hasn't already completed and
the scsi command has been freed. This was causing random crashes in the
TCP stack which was tracked down to be caused by memory that had been a
struct request + scsi_cmnd being now used for an skbuff. Memory
poisoning was enabled in the kernel to debug this which showed that the
last owner of the memory that had been freed was aacraid and that it was
a struct request. The memory that was corrupted was the exact data
pattern of AAC_OWNER_FIRMWARE and it was at the same offset that aacraid
writes, which is scsicmd->SCp.phase. The patch below resolves this
issue.
Cc: <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We terminate the aac_get_name_resp on a byte that is outside the bounds
of the structure. Extend the return response by one byte to remove the
out of bounds reference.
Fixes: b836439faf ("aacraid: 4KB sector support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When issuing a bus reset we should complete all commands, not
just the command triggering the reset.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To correctly identify which fib has a scsi command callback this
patch implements a flag FIB_CONTEXT_FLAG_SCSI_CMD.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_hba_send() will return FAILED for any non-SCSI command requests,
failing any TMFs. This patch updates the check to allow TMFs.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When sending a reset fib we shouldn't rely on the scsi command,
but rather set the TMF status in the map_info->reset_state variable.
That allows us to send a TMF independent on a scsi command.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Split off device, target, and bus reset functionality into
individual functions.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Split off the host reset parts of aac_eh_reset() into a separate
host reset function.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Split off reset FIB generation into separate functions.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
"qd.id" comes directly from the copy_from_user() on the line before so
we should verify that it's within bounds.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Both aac_send_raw_srb() and aac_get_hba_info() may copy stack allocated
structs to userspace without initializing all members of these
structs. Clear out this memory to prevent information leaks.
Fixes: 423400e64d ("scsi: aacraid: Include HBA direct interface")
Fixes: c799d519bf ("scsi: aacraid: Retrieve HBA host information ioctl")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fields sense_data_size and sense_data are unitialized garbage from
the stack and are being copied back to userspace. Fix this leak of
stack information by ensuring they are zero'd.
Detected by CoverityScan, CID#1435473 ("Uninitialized scalar variable")
Fixes: 423400e64d ("scsi: aacraid: Include HBA direct interface")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update the driver version to 50834
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove reference to Series-9 HBA and created arc ctrl check function.
Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added info and error messages in controller reset function to log
information about the status of the IOP/SOFT reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make sure that IOP and SOFT reset are enabled for both for both arc and
hba1000 controllers.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Made sure that ioctl commands return in case of a controller reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The command thread checks the ctrl health periodically before sending
updates to the controller. The function that it uses is aac_check_health
which does more than get the health status.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Removed switch case and replaced with if mask checks. Moved KERNEL_PANIC
check to when bled is less than 0.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Now the driver issues a soft reset and waits for the controller to be up
and running by periodically checking on the status of the controller
health registers. Also prevents ARC adapters from issuing soft reset if
IOP resets failed.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added function that waits with a timeout for the ctrl to be up and running
after triggering an IOP reset. Also removed 30 sec sleep as it is not
needed.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reworked IOP reset to remove unneeded variable and created a helper
function to notify fw of an imminent IOP reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver can now trigger IOP reset with a single reset mask. Removed
code that retrieves a reset_mask from the firmware.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Log the status of the controller before issuing a reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Log the location of the scsi cmds before triggering a reset. This
information is useful for debugging.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change the completion wait time for the fibs in the reset and abort
callback from 2 minutes to 15 seconds.
2 minutes is too long for waiting for completion.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Check health does not need to reset the ctrl but just return the
controller health status.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The default queue depth for non NATIVE RAW disks is calculated from the
number of fibs and number of disks or a max of 256. This causes poor disk
IO performance.
The fix is to set default qd based on the type of disks
(SATA -32 and SAS -64)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The qd for ARC Native disks is calculated by dividing the max IO 1024
by the number of disks or 256 which ever is lower. This causes poor
disk IO performance.
The fix is set the qd based on the type of disk (SAS - 64 and SATA -
32).
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver changed the DMA consistent map after consistent memory was
allocated, this invalidated the IOMMU identity mapping. The fix was to
make sure that we set the DMA consistent mask setting once depending on
the controller card.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The raw srb commands do not requires memory that in the ZONE_DMA memory
space. For 32bit srb commands use GFP_DMA32 to limit the memory to 32bit
memory range (4GB).
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There were pci_alloc_consistent() failures on ARM64 platform. Use
dma_alloc_coherent() with GFP_KERNEL flag DMA memory allocations.
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
[hch: tweaked indentation, removed memsets]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During a PCI error recovery, if aac_check_health() is not aware that a
PCI error happened and we have an offline PCI channel, it might trigger
some errors (like NULL pointer dereference) and inhibit the error
recovery process to complete.
This patch makes the health check procedure aware of PCI channel issues,
and in case of error recovery process, the function
aac_adapter_check_health() returns -1 and let the recovery process to
complete successfully. This patch was tested on upstream kernel
v4.11-rc5 in PowerPC ppc64le architecture with adapter 9005:028d
(VID:DID) - the error recovery procedure was able to recover fine.
Fixes: 5c63f7f710 ("aacraid: Added EEH support")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, command threads fails to return ioctls commands for older
controller versions, since it returns when all the fibs have been
allocated. Another issue is even all the fibs have not been allocated,
the correct allocated fibs is not updated nor freed.
Fixes: 113156bcea (scsi: aacraid: Reworked aac_command_thread)
Reported-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The return status of the adapter check on KERNEL_PANIC is supposed to be
the upper 16 bits of the OMR status register.
Fixes: c421530bf8 (scsi: aacraid: Reorder Adpater status check)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is the set of stuff that didn't quite make the initial pull and a
set of fixes for stuff which did. The new stuff is basically lpfc
(nvme), qedi and aacraid. The fixes cover a lot of previously
submitted stuff, the most important of which probably covers some of
the failing irq vectors allocation and other fallout from having the
SCSI command allocated as part of the block allocation functions.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYug/3AAoJEAVr7HOZEZN4bRwP/RvAW/NWlvs812gUeHNahUHx
HF3EtY5YHneZev1L4fa1Myd8cNcB3gic53e1WvQhN5f+LczeI7iXHH8xIQK/B4GT
OI5W7ZCGlhIamgsU8tUFGrtIOM6N/LIMZPMsaE62dcev0jC0Db4p/5yv5bUg1yuL
JOLWqIaTYTx7b5FwR2cXKv7KGYpXonvcoVUDV6s8aMYToJhJiQ6JtWmjaAA+/OJ7
HTcqNX/KSWQBG3ERjE3A/rGBFxPR9KPq8kpqVJlw39KEN9tMjXb0QAGvAeauNye2
xaIjV41tXH1Ys3aoh6ZU+TpzN3HLlH/gegRe8671fUwqu/RnSzFC8Eidoddzgqne
z5BF0Dy36FA5ol2HTFEwS5WM1gRM+z6YGnbhpE/XfyTL2sLHRynSXKxb02IdjRNi
gCdV89AL8xdfPPFJjsx4hGWUJhalW/RwuPayAupePKGQHePabHa19McI2ssg0WBg
OC6uuEk7vT95xuTZVJlrwXTJPnc9dxWJwzh21oEnvMfWM0VMW06EKT4AcX8FOsjL
RswiYp9wl8JDmfNxyJnQJj0+QXeIE/gh35vBQCGsLVKY4+xzbi3yZfb1sZB0XOum
yLK2LFQ+Ltk1TaPPSZsgXQZfodot0dFJPTnWr+whOfY1kepNlFRwQHltXnCWzgRs
QG//WHsk0u1Mj2gehwmQ
=e+6v
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is the set of stuff that didn't quite make the initial pull and a
set of fixes for stuff which did.
The new stuff is basically lpfc (nvme), qedi and aacraid. The fixes
cover a lot of previously submitted stuff, the most important of which
probably covers some of the failing irq vectors allocation and other
fallout from having the SCSI command allocated as part of the block
allocation functions"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (59 commits)
scsi: qedi: Fix memory leak in tmf response processing.
scsi: aacraid: remove redundant zero check on ret
scsi: lpfc: use proper format string for dma_addr_t
scsi: lpfc: use div_u64 for 64-bit division
scsi: mac_scsi: Fix MAC_SCSI=m option when SCSI=m
scsi: cciss: correct check map error.
scsi: qla2xxx: fix spelling mistake: "seperator" -> "separator"
scsi: aacraid: Fixed expander hotplug for SMART family
scsi: mpt3sas: switch to pci_alloc_irq_vectors
scsi: qedf: fixup compilation warning about atomic_t usage
scsi: remove scsi_execute_req_flags
scsi: merge __scsi_execute into scsi_execute
scsi: simplify scsi_execute_req_flags
scsi: make the sense header argument to scsi_test_unit_ready mandatory
scsi: sd: improve TUR handling in sd_check_events
scsi: always zero sshdr in scsi_normalize_sense
scsi: scsi_dh_emc: return success in clariion_std_inquiry()
scsi: fix memory leak of sdpk on when gd fails to allocate
scsi: sd: make sd_devt_release() static
scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.
...
The check for ret being zero is redundant as a few statements earlier we
break out of the while loop if ret is non-zero. Thus we can remove the
zero check and also the dead-code non-zero case too.
Detected by CoverityScan, CID#1411632 ("Logically Dead Code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix typos and add the following to the scripts/spelling.txt:
therfore||therefore
Besides, tidy up comment blocks for 80-col wrapping.
Link: http://lkml.kernel.org/r/1481573103-11329-31-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current driver Hotplug processing code skips over Enclosure channel,
therefore any addition/removal of expander enclosure is not processed.
Additionally device addition code relies on older device type, which
prevents the hotplug of adapter expanders.
Fixed by removing code that skips over Enclosure channels and using the
latest device type for addition or removal or enclosure expanders.
Fixes: 6223a39fe6 (scsi: aacraid: Added support for hotplug)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Updated driver version to 50792
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver does not unlock the reply queue spin lock after handling SMART
adapter events. Instead it might attempt to unlock an already unlocked
spin lock.
Fixed by making sure the driver locks the spin lock before freeing it.
Thank you dan for finding this issue out.
Fixes: 6223a39fe6 (scsi: aacraid: Added support for hotplug)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the adapter firmware does not save outstanding I/O's log
information when an IOP reset is triggered. This is problematic when
trying to root cause and debug issues.
Fixed by adding sync command to trigger I/O log file save in the adapter
firmware before issuing an IOP reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver currently checks the SELF_TEST_FAILED first and then
KERNEL_PANIC next. Under error conditions(boot code failure) both
SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time.
The driver has the capability to reset the controller on an KERNEL_PANIC,
but not on SELF_TEST_FAILED.
Fixed by first checking KERNEL_PANIC and then the others.
Cc: stable@vger.kernel.org
Fixes: e8b12f0fb8 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the SMART family of controller panic (KERNEL_PANIC) , they do not
honor IOP resets. So better to skip it and directly perform a IWBR reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently driver checks the health status of the adapter once every 24
hours. When that happens the driver becomes dependent on the kernel to
figure out if the adapter is misbehaving. This might take some time
(when the adapter is idle). The driver currently has support to
restart/recover the controller when it fails, and decreasing the time
interval will help.
Fixed by decreasing check interval from 24 hours to 1 minute
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During the IOP reset stress testing, it was found that the drives can be
marked offline when the adapter controller crashes and IO's are running
in parallel. When the controller does come back from the reset, the drive
that is marked offline is not exposed.
Fixed by removing and adding drives that are marked offline. In addition
invoke a scsi host bus rescan to capture any additional configuration
changes.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_command_thread checks on the health of controller periodically,
using aac_check_health. If the status is an error state KERNEL_PANIC or
anything else. The driver will attempt to restart the adapter, but the
response is not checked in aac_command_thread. This allows the periodic
sync to go thru and lead the driver to a hung state.
Fixed by terminating the periodic loop(intended per original design),
if the controller is not restored to a healthy state.
Cc: stable@vger.kernel.org
Fixes: 3d77d84044 (scsi: aacraid: Added support for periodic wellness sync)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After controller shutdown, all sync fibs time out due to not knowing
about the switch to INT-x mode
Fixed by replacing aac_src_access_devreg() to aac_set_intx_mode() call.
Cc: stable@vger.kernel.org
Fixes: 495c021767 (aacraid: MSI-x support)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added support to retrieve driver version from a new sysfs variable called
driver_version. It makes it easier for the user to figure out the driver
version that is currently running.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_fib_map_free frees misaligned fib dma memory, additionally it does not
free up the whole memory.
Fixed by changing the code to free up the correct and full memory
allocation.
Cc: stable@vger.kernel.org
Fixes: e8b12f0fb8 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC based controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arrconf management utility at times sends fibs with AdapterProcessed set
in its fibs. This causes the controller to panic and lockup.
Fixed by failing the commands that have AdapterProcessed set in its flag.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This issue showed up on a kdump debug(single CPU on powerkvm), when EEH
errors rendered the adapter unusable. The driver correctly detected the
issue and attempted to restart the controller, in doing so the driver
attempted to read the status registers of the controller. This triggered
additional eeh errors which continued for a good 6 minutes.
Fixed by returning without waiting when EEH error is reported.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The channel being used for raw srb commands is retrieved from the utility
sent fibs and is converted into physical channel id. The driver does not
need to to do this since the management utility sends the correct channel
id in the first place and in addition the driver sets inaccurate
information in the cmd sent to the firmware and gets an invalid response.
Fixed by using channel id from srb command.
Cc: stable@vger.kernel.org
Fixes: 423400e64d ("scsi: aacraid: Include HBA direct interface")
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replaced camel case with snake case for init supported options.
Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This update includes the usual round of major driver updates (ncr5380,
ufs, lpfc, be2iscsi, hisi_sas, storvsc, cxlflash, aacraid,
megaraid_sas, ). There's also an assortment of minor fixes and the
major update of switching a bunch of drivers to pci_alloc_irq_vectors
from Christoph.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYq5adAAoJEAVr7HOZEZN4bjUP/Atk7CSZVnC75pcYmncbEGCx
ysOlEHK4uW2HhiAYk3PlYMk+pKrMHet2zsbbM9PHJfopdOHZ7Sq1+UZZVeqE1Zun
8pe0NhON+fZx7XAnevdEvnSSULQZ+AGfjZO72iUwkJiN3ozYaFtCITOyn49l4GpR
ra9emskBh7CQOFW2voGn1AKeDijPYGx3+TO4AUrWjVMiByR06gb1bmImx+ljiUrs
jzRJPfrt90ORcTdpMateyN2EXxudcASMhX03SJ6fRI84hPAhMCROMbTv8RnzOTE4
DPbnvbYUowlHt43iUhJHSwGdkRRaRBnkzQENBp1fNrNzZgF6vB7+kShxbonrYB2p
gC4ewaJr0BNj+HsUnvTpe3WseiPOcfsnBsKilPLKBlm2dCKEXqFox/dj/T1uexxg
HoyFrl3u8fyEqVHrzRS4M9t/njWh0NFmXxb0wBdj+lkVFTRErGSKQ8SfOqshuSGs
P8NN88jy8vC7uqgzKBJ+UH3ehzn3qfBxasFHIC/e2awY9FqKjHGTxKMmSVpjXVxy
wCvE2FQ3k/qEj2XSM6f7/NGytlSOlju5q1rFtHPW2M+TFSh0LJWCnmVjR/Zle9em
pBWmtIgCv8W5b41zL2H94nLWAZbfdrrNU/XnX88l47LKnmorte/PGhpxu36NEsMS
VCgreQmFMdMRY+WzDWl1
=cBQx
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This update includes the usual round of major driver updates (ncr5380,
ufs, lpfc, be2iscsi, hisi_sas, storvsc, cxlflash, aacraid,
megaraid_sas, ...).
There's also an assortment of minor fixes and the major update of
switching a bunch of drivers to pci_alloc_irq_vectors from Christoph"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (188 commits)
scsi: megaraid_sas: handle dma_addr_t right on 32-bit
scsi: megaraid_sas: array overflow in megasas_dump_frame()
scsi: snic: switch to pci_irq_alloc_vectors
scsi: megaraid_sas: driver version upgrade
scsi: megaraid_sas: Change RAID_1_10_RMW_CMDS to RAID_1_PEER_CMDS and set value to 2
scsi: megaraid_sas: Indentation and smatch warning fixes
scsi: megaraid_sas: Cleanup VD_EXT_DEBUG and SPAN_DEBUG related debug prints
scsi: megaraid_sas: Increase internal command pool
scsi: megaraid_sas: Use synchronize_irq to wait for IRQs to complete
scsi: megaraid_sas: Bail out the driver load if ld_list_query fails
scsi: megaraid_sas: Change build_mpt_mfi_pass_thru to return void
scsi: megaraid_sas: During OCR, if get_ctrl_info fails do not continue with OCR
scsi: megaraid_sas: Do not set fp_possible if TM capable for non-RW syspdIO, change fp_possible to bool
scsi: megaraid_sas: Remove unused pd_index from megasas_build_ld_nonrw_fusion
scsi: megaraid_sas: megasas_return_cmd does not memset IO frame to zero
scsi: megaraid_sas: max_fw_cmds are decremented twice, remove duplicate
scsi: megaraid_sas: update can_queue only if the new value is less
scsi: megaraid_sas: Change max_cmd from u32 to u16 in all functions
scsi: megaraid_sas: set pd_after_lb from MR_BuildRaidContext and initialize pDevHandle to MR_DEVHANDLE_INVALID
scsi: megaraid_sas: latest controller OCR capability from FW before sending shutdown DCMD
...
commit 78cbccd3bd ("aacraid: Fix for KDUMP driver hang")
caused a problem on older controllers which do not support MSI-x (namely
ASR3405,ASR3805). This patch conditionalizes the previous patch to
controllers which support MSI-x
Cc: <stable@vger.kernel.org> # v4.7+
Fixes: 78cbccd3bd ("aacraid: Fix for KDUMP driver hang")
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Shifting a dma_addr_t right by 32 bits causes a compile-time warning
when that type is only 32 bit wide:
drivers/scsi/aacraid/src.c: In function 'aac_src_start_adapter':
drivers/scsi/aacraid/src.c:414:29: error: right shift count >= width of type [-Werror=shift-count-overflow]
This changes the driver to use the predefined macros consistently,
including one correct but open-coded upper_32_bits() instance.
Fixes: d1ef4da848 ("scsi: aacraid: added support for init_struct_8")
Fixes: 423400e64d ("scsi: aacraid: Include HBA direct interface")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_fib_send can return -ve error returns and hence rcode should be
signed. Currently the rcode >= 0 check is always true and -ve errors are
not being checked.
Thanks to Dan Carpenter for spotting my original broken fix to this
issue.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update the driver version to 50740
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change the aacraid driver prefix from 1.2-1 to 1.2.1
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added new copyright messages
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added a new ioctl interface to retrieve the host device information.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added a new ioctl interface to trigger an IOP or IWBR reset from ioctl.
Primary used by management utility to trigger resets.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added a new IWBR soft reset type, reworked the IOP reset interface for
a bit.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds support to retrieve the unique identifier data (VPD page
83 type3) for Logical drives created on SmartIOC 2000 products. In
addition added a sysfs device structure to expose the id information.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added task management command support to abort any timed out commands
in case of a eh_abort call and to reset lun's in case of eh_reset call.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added support to send out task management commands.
[mkp: removed // fibsize... ]
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added support to send direct pasthru srb commands from management utilty
to the controller.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added support for drive hotplug add and removal
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added support to set qd of drives in slave_configure.This only works for
HBA1000 attached drives.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Retrieved queue depth from fw and saved it for future use.
Only applicable for HBA1000 drives.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds a new functions that periodically sync the time of host
to the adapter. In addition also informs the adapter that the driver is
alive and kicking. Only applicable to the HBA1000 and SMARTIOC2000.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reworked aac_command_thread into aac_process_events
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch processes Raw IO read medium errors.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch enables the driver to actually process the I/O, or srb replies
from adapter. In addition to any HBA1000 or SmartIOC2000 adapter events.
Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make sure that the driver processes error conditions even in the fast
response path for response from the adapter.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Moved the READ and WRITE switch cases to the top. Added a default
case to the switch case and replaced duplicate scsi result value with a
macro.
The idea is that since most of scsi commands we care about performance
wise are read or write, we need to process them first.
Internally the compiler (GCC) converts a switch case into either a jump
table or a bunch of if else conditions, so placing the often used read,
write cases at the top is an effort in optimization.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Dave Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>