Sync code to the same with tk4 pub/lts/0017-kabi, except deleted rue
and wujing. Partners can submit pull requests to this branch, and we
can pick the commits to tk4 pub/lts/0017-kabi easly.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Gitee limit the repo's size to 3GB, to reduce the size of the code,
sync codes to ock 5.4.119-20.0009.21 in one commit.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Sync kernel codes to the same with 590eaf1fec ("Init Repo base on
linux 5.4.32 long term, and add base tlinux kernel interfaces."), which
is from tk4, and it is the base of tk4.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates.
The only core change this time around is the addition of request
batching for virtio. Since batching requires an additional flag to
use, it should be invisible to the rest of the drivers.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXYQE/yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXs9AP4usPY5
OpMlF6OiKFNeJrCdhCScVghf9uHbc7UA6cP+EgD/bCtRgcDe1ZjOTYWdeTwvwWqA
ltWYonnv6Lg3b1f9yqI=
=jRC/
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates. The
only core change this time around is the addition of request batching
for virtio. Since batching requires an additional flag to use, it
should be invisible to the rest of the drivers"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (264 commits)
scsi: hisi_sas: Fix the conflict between device gone and host reset
scsi: hisi_sas: Add BIST support for phy loopback
scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation
scsi: hisi_sas: Remove some unused function arguments
scsi: hisi_sas: Remove redundant work declaration
scsi: hisi_sas: Remove hisi_sas_hw.slot_complete
scsi: hisi_sas: Assign NCQ tag for all NCQ commands
scsi: hisi_sas: Update all the registers after suspend and resume
scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device
scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails
scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled
scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset()
scsi: hisi_sas: add debugfs auto-trigger for internal abort time out
scsi: virtio_scsi: unplug LUNs when events missed
scsi: scsi_dh_rdac: zero cdb in send_mode_select()
scsi: fcoe: fix null-ptr-deref Read in fc_release_transport
scsi: ufs-hisi: use devm_platform_ioremap_resource() to simplify code
scsi: ufshcd: use devm_platform_ioremap_resource() to simplify code
scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code
scsi: ufs: Use kmemdup in ufshcd_read_string_desc()
...
This patch provides a module parameter and sysfs interface to select
whether the queue depth for each device should be based on the value
suggested by firmware (the default) or the maximum supported by the
controller (can_queue).
Although we have a sysfs interface per sdev to change the queue depth of
individual scsi devices, this implementation provides a single sysfs entry
per shost to switch between the controller max and the value reported by
firmware. The module parameter can provide an interface for one time grub
settings and provides persistent settings across the boot.
[mkp: tweaked commit desc]
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While loading fw crashdump in function fw_crash_buffer_show(), left bytes
in one dma chunk was not checked, if copying size over it, overflow access
will cause kernel panic.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix spelling mistake in kernel warning message and replace printk with with
pr_warn.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is the final round of mostly small fixes in our initial submit.
It's mostly minor fixes and driver updates. The only change of note
is adding a virt_boundary_mask to the SCSI host and host template to
parametrise this for NVMe devices instead of having them do a call in
slave_alloc. It's a fairly straightforward conversion except in the
two NVMe handling drivers that didn't set it who now have a virtual
infinity parameter added.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXTJS/yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQTNAQCsTdkA
IN1BvDBbE+KO8mvL5DuRxLtnDU6Pq5K6fkrE3gD/a1GkqyPPaJIuspq7fQY87DH/
o7VsJd/5uGphIE2Ls+M=
=38XV
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is the final round of mostly small fixes in our initial submit.
It's mostly minor fixes and driver updates. The only change of note is
adding a virt_boundary_mask to the SCSI host and host template to
parametrise this for NVMe devices instead of having them do a call in
slave_alloc. It's a fairly straightforward conversion except in the
two NVMe handling drivers that didn't set it who now have a virtual
infinity parameter added"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
scsi: megaraid_sas: set an unlimited max_segment_size
scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs
scsi: IB/srp: set virt_boundary_mask in the scsi host
scsi: IB/iser: set virt_boundary_mask in the scsi host
scsi: storvsc: set virt_boundary_mask in the scsi host template
scsi: ufshcd: set max_segment_size in the scsi host template
scsi: core: take the DMA max mapping size into account
scsi: core: add a host / host template field for the virt boundary
scsi: core: Fix race on creating sense cache
scsi: sd_zbc: Fix compilation warning
scsi: libfc: fix null pointer dereference on a null lport
scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
scsi: zfcp: fix request object use-after-free in send path causing wrong traces
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
scsi: megaraid_sas: Update driver version to 07.710.50.00
scsi: megaraid_sas: Add module parameter for FW Async event logging
scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers
scsi: megaraid_sas: Fix calculation of target ID
scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE
scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade
...
When using a virt_boundary_mask, as done for NVMe devices attached to
megaraid_sas controllers, we require an unlimited max_segment_size as the
virt boundary merging code assumes that. But we also need to propagate
that to the DMA mapping layer to make dma-debug happy. The SCSI layer
takes care of that when using the per-host virt_boundary setting, but
given that megaraid_sas only wants to set the virt_boundary for actual
NVMe devices, we can't rely on that. The DMA layer maximum segment is
global to the HBA however, so we have to set it explicitly. This patch
assumes that megaraid_sas does not have a segment size limitation, which
seems true based on the SGL format, but will need to be verified.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add module parameter to control logging levels of async event
notifications from firmware that get logged to system log. Also, allow
changing the value from sysfs after driver load.
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Load balancing IO completions across all available MSI-X vectors should be
enabled for Invader and later generation controllers only. This needs to
be disabled for older controllers. Add an adapter type check before
setting msix_load_balance flag.
Fixes: 1d15d9098a ("scsi: megaraid_sas: Load balance completions across all MSI-X")
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In megasas_get_target_prop(), driver is incorrectly calculating the target
ID for devices with channel 1 and 3. Due to this, firmware will either
fail the command (if there is no device with the target id sent from
driver) or could return the properties for a target which was not
intended. Devices could end up with the wrong queue depth due to this.
Fix target id calculation for channel 1 and 3.
Fixes: 96188a89cc ("scsi: megaraid_sas: NVME interface target prop added")
Cc: stable@vger.kernel.org
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix sparse warnings:
drivers/scsi/megaraid/megaraid_sas_base.c:271:1: warning: symbol 'megasas_issue_dcmd' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_base.c:2227:6: warning: symbol 'megasas_do_ocr' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_base.c:3194:25: warning: symbol 'megaraid_host_attrs' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
removal of the osst driver (I heard from Willem privately that he
would like the driver removed because all his test hardware has
failed). Plus number of minor changes, spelling fixes and other
trivia.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd
fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE
fjy17dQLvsJ4GsidHy8=
=aS5B
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
removal of the osst driver (I heard from Willem privately that he
would like the driver removed because all his test hardware has
failed). Plus number of minor changes, spelling fixes and other
trivia.
The big merge conflict this time around is the SPDX licence tags.
Following discussion on linux-next, we believe our version to be more
accurate than the one in the tree, so the resolution is to take our
version for all the SPDX conflicts"
Note on the SPDX license tag conversion conflicts: the SCSI tree had
done its own SPDX conversion, which in some cases conflicted with the
treewide ones done by Thomas & co.
In almost all cases, the conflicts were purely syntactic: the SCSI tree
used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the
treewide conversion had used the new-style ones ("GPL-2.0-only" and
"GPL-2.0-or-later").
In these cases I picked the new-style one.
In a few cases, the SPDX conversion was actually different, though. As
explained by James above, and in more detail in a pre-pull-request
thread:
"The other problem is actually substantive: In the libsas code Luben
Tuikov originally specified gpl 2.0 only by dint of stating:
* This file is licensed under GPLv2.
In all the libsas files, but then muddied the water by quoting GPLv2
verbatim (which includes the or later than language). So for these
files Christoph did the conversion to v2 only SPDX tags and Thomas
converted to v2 or later tags"
So in those cases, where the spdx tag substantially mattered, I took the
SCSI tree conversion of it, but then also took the opportunity to turn
the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag.
Similarly, when there were whitespace differences or other differences
to the comments around the copyright notices, I took the version from
the SCSI tree as being the more specific conversion.
Finally, in the spdx conversions that had no conflicts (because the
treewide ones hadn't been done for those files), I just took the SCSI
tree version as-is, even if it was old-style. The old-style conversions
are perfectly valid, even if the "-only" and "-or-later" versions are
perhaps more descriptive.
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits)
scsi: qla2xxx: move IO flush to the front of NVME rport unregistration
scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
scsi: qla2xxx: on session delete, return nvme cmd
scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices
scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1
scsi: megaraid_sas: Introduce various Aero performance modes
scsi: megaraid_sas: Use high IOPS queues based on IO workload
scsi: megaraid_sas: Set affinity for high IOPS reply queues
scsi: megaraid_sas: Enable coalescing for high IOPS queues
scsi: megaraid_sas: Add support for High IOPS queues
scsi: megaraid_sas: Add support for MPI toolbox commands
scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD
scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout
scsi: megaraid_sas: Call disable_irq from process IRQ poll
scsi: megaraid_sas: Remove few debug counters from IO path
...
For Aero adapters, driver provides three different performance modes
controlled through module parameter named 'perf_mode'. Below are those
performance modes:
0: Balanced - Additional high IOPS reply queues will be enabled along with
low latency queues. Interrupt coalescing will be enabled only for these
high IOPS reply queues.
1: IOPS - No additional high IOPS queues are enabled. Interrupt coalescing
will be enabled on all reply queues.
2: Latency - No additional high IOPS queues are enabled. Interrupt
coalescing will be disabled on all reply queues. This is a legacy
behavior similar to Ventura & Invader Series.
Default performance mode settings:
- Performance mode set to 'Balanced', if Aero controller is working in
16GT/s PCIe speed.
- Performance mode will be set to 'Latency' mode for all other cases.
Through module parameter 'perf_mode', user can override default performance
mode to desired one.
Captured some performance numbers with these performance modes. 4k Random
Read IO performance numbers on 24 SAS SSD drives for above three
performance modes. Performance data is from Intel Skylake and HGST SS300
(drive model SDLL1DLR400GCCA1).
IOPS:
-----------------------------------------------------------------------
|perf_mode | qd = 1 | qd = 64 | note |
|-------------|--------|---------|-------------------------------------
|balanced | 259K | 3061k | Provides max performance numbers |
| | | | both on lower QD workload & |
| | | | also on higher QD workload |
|-------------|--------|---------|-------------------------------------
|iops | 220K | 3100k | Provides max performance numbers |
| | | | only on higher QD workload. |
|-------------|--------|---------|-------------------------------------
|latency | 246k | 2226k | Provides good performance numbers |
| | | | only on lower QD worklaod. |
-----------------------------------------------------------------------
Average Latency:
-----------------------------------------------------
|perf_mode | qd = 1 | qd = 64 |
|-------------|--------------|----------------------|
|balanced | 92.05 usec | 501.12 usec |
|-------------|--------------|----------------------|
|iops | 108.40 usec | 498.10 usec |
|-------------|--------------|----------------------|
|latency | 97.10 usec | 689.26 usec |
-----------------------------------------------------
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
High iops queues are mapped to non-managed IRQs. Set affinity of
non-managed irqs to local numa node. Low latency queues are mapped to
managed IRQs.
Driver reserves some reply queues for high IOPS queues (through
pci_alloc_irq_vectors_affinity and .pre_vectors interface). The rest of
queues are for low latency.
Based on IO workload, driver will decide which group of reply queues
(either high IOPS queues or low latency queues) to be used.
High IOPS queues will be mapped to local numa node of controller and
low latency queues will be mapped to CPUs across numa nodes. In general,
high IOPS and low latency queues should fit into 128 reply queues
which is the max number of reply queues supported by Aero adapters.
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Aero controllers support balanced performance mode through the ability to
configure queues with different properties.
Reply queues with interrupt coalescing enabled are called "high iops reply
queues" and reply queues with interrupt coalescing disabled are called "low
latency reply queues".
The driver configures a combination of high iops and low latency reply
queues if:
- HBA is an AERO controller;
- MSI-X vectors supported by the HBA is 128;
- Total CPU count in the system more than high iops queue count;
- Driver is loaded with default max_msix_vectors module parameter; and
- System booted in non-kdump mode.
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added driver support to allow passthrough MPI toolbox type MFI commands to
firmware based on firmware capability.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For RAID5/RAID6 volumes configured behind Aero, driver will be doing 64bit
division operations on behalf of firmware as controller's ARM CPU is very
slow in this division. Later, driver calculates Q-ARM, P-ARM and Log-ARM and
passes those values to firmware by writing these values to RAID_CONTEXT.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
RAID1 PCI bandwidth limit algorithm is not applicable to Aero as it's PCIe
Gen4 adapter.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Issue: This issue is applicable to scenario when JBOD sequence map is
unavailable (memory allocation for JBOD sequence map failed) to driver but
feature is supported by firmware. If the driver sends a JBOD IO by not
adding 255 (MAX_PHYSICAL_DEVICES - 1) to device ID when underlying firmware
supports JBOD sequence map, it will lead to the IO failure.
Fix: For JBOD IOs, driver will not use the RAID map to fetch the devhandle
if JBOD sequence map is unavailable. Driver will set Devhandle to 0xffff
and Target ID to 'device ID + 255 (MAX_PHYSICAL_DEVICES - 1)'.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Issue: There is possibility of few DCMDs timing out with 'reset_mutex' lock
held. As part of DCMD timeout handling, driver calls function
megasas_reset_fusion which also tries to acquire same lock 'reset_mutex'
and end up with deadlock.
Fix: Upon timeout of DCMDs (which are fired with 'reset_mutex' lock held),
driver will release 'reset_mutex' before calling OCR function and will
acquire lock again after OCR function returns.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch will add support for non-secure Aero adapter PCI IDs. Driver
will throw an error message when a non-secure type controller is
detected. Purpose of this interface is to avoid interacting with any
firmware which is not secured/signed by Broadcom. Any tampering on Firmware
component will be detected by hardware and it will be communicated to the
driver to avoid any further interaction with that component.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use existing macros. No functional change.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Checkpatch emits a warning when using symbolic permissions. Use octal
permissions instead. No functional change.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Support is easier with all driver parameters visible in sysfs.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes gcc '-Wunused-but-set-variable' warnings:
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_fw_crash_buffer_show:
drivers/scsi/megaraid/megaraid_sas_base.c:3138:16: warning: variable buff_addr set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_get_pd_list:
drivers/scsi/megaraid/megaraid_sas_base.c:4426:13: warning: variable ci_h set but not used [-Wunused-but-set-variable]
'buff_addr' is never used since inroduction in commit fc62b3fc90
("megaraid_sas : Firmware crash dump feature support")
'ci_h' is not used since commit 9b3d028f34 ("scsi: megaraid_sas:
Pre-allocate frequently used DMA buffers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_create_frame_pool:
drivers/scsi/megaraid/megaraid_sas_base.c:4124:6: warning: variable sge_sz set but not used [-Wunused-but-set-variable]
It's not used any more since commit 200aed582d ("megaraid_sas: endianness
related bug fixes and code optimization")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes gcc '-Wunused-but-set-variable' warnings:
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_suspend:
drivers/scsi/megaraid/megaraid_sas_base.c:7269:20: warning: variable host set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_aen_polling:
drivers/scsi/megaraid/megaraid_sas_base.c:8397:15: warning: variable wait_time set but not used [-Wunused-but-set-variable]
'host' never used since introduction in commit 31ea708897 ("[SCSI]
megaraid_sas: add hibernation support")
'wait_time' never used since commit 11c71cb4ab ("megaraid_sas: Do
not allow PCI access during OCR")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_transition_to_ready:
drivers/scsi/megaraid/megaraid_sas_base.c:3900:6: warning: variable cur_state set but not used [-Wunused-but-set-variable]
Never used since commit 7218df69e3 ("[SCSI] megaraid_sas: use the
firmware boot timeout when waiting for commands")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Create a debugfs interface for megaraid_sas driver. Provide interface to
dump driver RAID map in debugfs.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Print FW supported MSI-X vector count only if FW supports
MSI-X.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add debug prints related to device list being returned by firmware. The a
debug flag to activate these prints.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add prints in resume/suspend path to help in debugging hibernation
issues. The print gives an indication when the driver entry points are
called.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When driver detects a firmware fault during load, dump additional
information on fault code and subcode that will help in debugging.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a sysfs interface to get the raid map index that is being used by
driver.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add prints for BAR address information during driver load. This helps in
debugging issues with BAR address changing during OS boot.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When controller fails to transition to READY state during driver probe,
dump the system interface register set. This will give snapshot of the
firmware status for debugging driver load issues.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a sysfs interface to dump the controller's system interface registers.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add option to format the buffer that is being dumped. Currently, the IO
frame and chain frame dumped in the syslog is getting split across multiple
lines based on the formatting. Fix this by using KERN_CONT in printk.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add prints to identify the internal DCMD opcode that has timed out.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch enhances the existing debug prints in reset and task management
path.
These debug prints in adapter reset path helps with debugging issues
related to IO timeouts that are seen frequently in the field. Add
additional debug prints to dump the pending command frames before
initiating an adapter reset. Also, print FastPath IOs that are
outstanding.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver will use "reply descriptor post queues" in round robin fashion when
the combined MSI-X mode is not enabled. With this IO completions are
distributed and load balanced across all the available reply descriptor
post queues equally.
This is enabled only if combined MSI-X mode is not enabled in firmware.
This improves performance and also fixes soft lockups.
When load balancing is enabled, IRQ affinity from driver needs to be
disabled.
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Issue Description:
We have seen cpu lock up issues from field if system has a large (more than
96) logical cpu count. SAS3.0 controller (Invader series) supports max 96
MSI-X vector and SAS3.5 product (Ventura) supports max 128 MSI-X vectors.
This may be a generic issue (if PCI device support completion on multiple
reply queues).
Let me explain it w.r.t megaraid_sas supported h/w just to simplify the
problem and possible changes to handle such issues. MegaRAID controller
supports multiple reply queues in completion path. Driver creates MSI-X
vectors for controller as "minimum of (FW supported Reply queues, Logical
CPUs)". If submitter is not interrupted via completion on same CPU, there
is a loop in the IO path. This behavior can cause hard/soft CPU lockups, IO
timeout, system sluggish etc.
Example - one CPU (e.g. CPU A) is busy submitting the IOs and another CPU
(e.g. CPU B) is busy with processing the corresponding IO's reply
descriptors from reply descriptor queue upon receiving the interrupts from
HBA. If CPU A is continuously pumping the IOs then always CPU B (which is
executing the ISR) will see the valid reply descriptors in the reply
descriptor queue and it will be continuously processing those reply
descriptor in a loop without quitting the ISR handler.
megaraid_sas driver will exit ISR handler if it finds unused reply
descriptor in the reply descriptor queue. Since CPU A will be continuously
sending the IOs, CPU B may always see a valid reply descriptor (posted by
HBA Firmware after processing the IO) in the reply descriptor queue. In
worst case, driver will not quit from this loop in the ISR handler.
Eventually, CPU lockup will be detected by watchdog.
Above mentioned behavior is not common if "rq_affinity" set to 2 or
affinity_hint is honored by irqbalancer as "exact". If rq_affinity is set
to 2, submitter will be always interrupted via completion on same CPU. If
irqbalancer is using "exact" policy, interrupt will be delivered to
submitter CPU.
Problem statement:
If CPU count to MSI-X vectors (reply descriptor Queues) count ratio is not
1:1, we still have exposure of issue explained above and for that we don't
have any solution.
Exposure of soft/hard lockup is seen if CPU count is more than MSI-X
supported by device.
If CPUs count to MSI-X vectors count ratio is not 1:1, (Other way, if
CPU counts to MSI-X vector count ratio is something like X:1, where X > 1)
then 'exact' irqbalance policy OR rq_affinity = 2 won't help to avoid CPU
hard/soft lockups. There won't be any one to one mapping between
CPU to MSI-X vector instead one MSI-X interrupt (or reply descriptor queue)
is shared with group/set of CPUs and there is a possibility of having a
loop in the IO path within that CPU group and may observe lockups.
For example: Consider a system having two NUMA nodes and each node having
four logical CPUs and also consider that number of MSI-X vectors enabled on
the HBA is two, then CPUs count to MSI-X vector count ratio as 4:1.
e.g.
MSI-X vector 0 is affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0 and
MSI-X vector 1 is affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA node 1.
numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 --> MSI-X 0
node 0 size: 65536 MB
node 0 free: 63176 MB
node 1 cpus: 4 5 6 7 --> MSI-X 1
node 1 size: 65536 MB
node 1 free: 63176 MB
Assume that user started an application which uses all the CPUs of NUMA
node 0 for issuing the IOs. Only one CPU from affinity list (it can be any
cpu since this behavior depends upon irqbalance) CPU0 will receive the
interrupts from MSI-X 0 for all the IOs. Eventually, CPU 0 IO submission
percentage will be decreasing and ISR processing percentage will be
increasing as it is more busy with processing the interrupts. Gradually IO
submission percentage on CPU 0 will be zero and it's ISR processing
percentage will be 100% as IO loop has already formed within the
NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 will be continuously busy with
submitting the heavy IOs and only CPU 0 is busy in the ISR path as it
always find the valid reply descriptor in the reply descriptor queue.
Eventually, we will observe the hard lockup here.
Chances of occurring of hard/soft lockups are directly proportional to
value of X. If value of X is high, then chances of observing CPU lockups is
high.
Solution:
Use IRQ poll interface defined in "irq_poll.c".
megaraid_sas driver will execute ISR routine in softirq context and it will
always quit the loop based on budget provided in IRQ poll interface.
Driver will switch to IRQ poll only when more than a threshold number of
reply descriptors are handled in one ISR. Currently threshold is set as
1/4th of HBA queue depth.
In these scenarios (i.e. where CPUs count to MSI-X vectors count ratio is
X:1 (where X > 1)), IRQ poll interface will avoid CPU hard lockups due to
voluntary exit from the reply queue processing based on budget.
Note - Only one MSI-X vector is busy doing processing.
Select CONFIG_IRQ_POLL from driver Kconfig for driver compilation.
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While an online controller reset(OCR) is in progress, there is short
duration where all access to controller's PCI config space from the host
needs to be blocked. This is due to a hardware limitation of MegaRAID
controllers.
With this patch, driver will block all access to controller's config space
from userland applications by calling pci_cfg_access_lock() while OCR is in
progress and unlocking after controller comes back to ready state.
Added helper function which locks the config space before initiating OCR
and wait for controller to become READY.
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
fw_reset_no_pci_access is only applicable for MFI controllers and is not
used for Fusion controllers.
For all Fusion controllers, driver can check reset adapter bit in
status register before performing a chip reset without
setting "fw_reset_no_pci_access".
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
No functional change. Remove set but unused variable in
megasas_set_static_target_properties.
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not see http www gnu org licenses
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details [based]
[from] [clk] [highbank] [c] you should have received a copy of the
gnu general public license along with this program if not see http
www gnu org licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 355 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is mostly update of the usual drivers: qla2xxx, qedf, smartpqi,
hpsa, lpfc, ufs, mpt3sas, ibmvfc and hisi_sas. Plus number of minor
changes, spelling fixes and other trivia.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXNIK0yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbeFAP4wsOm6
BNqDvLbF7gHAH+oSVAeENd9tepXG2xKbUHmLRgEA1wPaUxon8L8v/SSsqwewqMXP
y6CnU6aO2iaViTNBsPs=
=RX74
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: qla2xxx, qedf, smartpqi,
hpsa, lpfc, ufs, mpt3sas, ibmvfc and hisi_sas. Plus number of minor
changes, spelling fixes and other trivia"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (298 commits)
scsi: qla2xxx: Avoid that lockdep complains about unsafe locking in tcm_qla2xxx_close_session()
scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory
scsi: qla2xxx: Fix hardirq-unsafe locking
scsi: qla2xxx: Complain loudly about reference count underflow
scsi: qla2xxx: Use __le64 instead of uint32_t[2] for sending DMA addresses to firmware
scsi: qla2xxx: Introduce the dsd32 and dsd64 data structures
scsi: qla2xxx: Check the size of firmware data structures at compile time
scsi: qla2xxx: Pass little-endian values to the firmware
scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands
scsi: qla2xxx: Use an on-stack completion in qla24xx_control_vp()
scsi: qla2xxx: Make qla24xx_async_abort_cmd() static
scsi: qla2xxx: Remove unnecessary locking from the target code
scsi: qla2xxx: Remove qla_tgt_cmd.released
scsi: qla2xxx: Complain if a command is released that is owned by the firmware
scsi: qla2xxx: target: Fix offline port handling and host reset handling
scsi: qla2xxx: Fix abort handling in tcm_qla2xxx_write_pending()
scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd()
scsi: qla2xxx: Simplify qlt_send_term_imm_notif()
scsi: qla2xxx: Fix use-after-free issues in qla2xxx_qpair_sp_free_dma()
scsi: qla2xxx: Fix a qla24xx_enable_msix() error path
...