NAND:
Updates from Boris:
"""
This pull request contains only one notable change:
* Addition of the MTK NAND controller driver
And a bunch of specific NAND driver improvements/fixes. Here are the
changes that are worth mentioning:
* A few fixes/improvements for the xway NAND controller driver
* A few fixes for the sunxi NAND controller driver
* Support for DMA in the sunxi NAND driver
* Support for the sunxi NAND controller IP embedded in A23/A33 SoCs
* Addition for bitflips detection in erased pages to the brcmnand driver
* Support for new brcmnand IPs
* Update of the OMAP-GPMC binding to support DMA channel description
"""
In addition, some small fixes around error handling, etc., as well as one
long-standing corner case issue (2.6.20, I think?) with writing 1 byte less
than a page.
NOR:
* Rework some error handling on reads and writes, so we can better handle (for
instance) SPI controllers which have limitations on their maximum transfer size
* Add new Cadence Quad SPI flash controller driver
* Add new Atmel QSPI flash controller driver
* Add new Hisilicon SPI flash controller driver
* Support a few new flash, and update supported features on others
* Fix the logic used for detecting a fully-unlocked flash
And other miscellaneous small fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXn/j0AAoJEFySrpd9RFgto58P/j1huB0d21zFen3teo8YKKr1
dLi65mFbqtpU1BLqD07uc2gsH17kvezFJCtx8H8Jp/yCjLF2kYIKL7wDTf6OJPtn
aYGS5dG5jhMIq+6CD2olKqy+IVLfL9GvCf44Z3fpVta5lOn09y7Jm0AkBjmJcH45
SdJi+iUkSKhqRY3O2udyauGyL4JWG7eHUonrG9g8ROrO0GWQkT4Ijm1ZwIBlFDkJ
e1N960OqPg5ISOzuTeM14Ok9IUyeb7wiXhqRfOJgzNk6iBcN1YC2ono3C7RH2sN7
wiyCqqUpDDCDBDiFdGOdpc9cjzNysrt02ypWRsZIpQVCm89nPLSutqQEWLuo0qzq
/eIzdwbk1AxX96CeQohOezqL+n6+RHP9AIvwzL9GeWjipD1LBvfM1l3CmuSKK5jb
bQ4CA/FVz1tO/25q8tuLJfpFzhFE2PC3pphVf8tREL/U6OR/97NgDMuQIuqiQpRc
4nJtu79yacAzEiztZh0bsx+t94QFE+kfs/6d8m+llLEyx2sI8HKZeDvNRwEB0OsD
wQ5bjyd54m7+i4H8njrnOTP+K2YrwNjGlbTo7qrRSpFMDr4mD0VQLap03Srvo5xV
OYB6uGZhGV2L3k6nG5wywM2z6Hw0QfHxjrpcLyG51xABmni2QF3lOdYSllRtRTXp
aYLfPUqgIgSUI2GTDHyW
=HQr0
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20160801' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"NAND:
Quoting Boris:
'This pull request contains only one notable change:
- Addition of the MTK NAND controller driver
And a bunch of specific NAND driver improvements/fixes. Here are the
changes that are worth mentioning:
- A few fixes/improvements for the xway NAND controller driver
- A few fixes for the sunxi NAND controller driver
- Support for DMA in the sunxi NAND driver
- Support for the sunxi NAND controller IP embedded in A23/A33 SoCs
- Addition for bitflips detection in erased pages to the brcmnand driver
- Support for new brcmnand IPs
- Update of the OMAP-GPMC binding to support DMA channel description'
In addition, some small fixes around error handling, etc., as well
as one long-standing corner case issue (2.6.20, I think?) with
writing 1 byte less than a page.
NOR:
- rework some error handling on reads and writes, so we can better
handle (for instance) SPI controllers which have limitations on
their maximum transfer size
- add new Cadence Quad SPI flash controller driver
- add new Atmel QSPI flash controller driver
- add new Hisilicon SPI flash controller driver
- support a few new flash, and update supported features on others
- fix the logic used for detecting a fully-unlocked flash
And other miscellaneous small fixes"
* tag 'for-linus-20160801' of git://git.infradead.org/linux-mtd: (60 commits)
mtd: spi-nor: don't build Cadence QuadSPI on non-ARM
mtd: mtk-nor: remove duplicated include from mtk-quadspi.c
mtd: nand: fix bug writing 1 byte less than page size
mtd: update description of MTD_BCM47XXSFLASH symbol
mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller
mtd: spi-nor: Bindings for Cadence Quad SPI Flash Controller driver
mtd: nand: brcmnand: Change BUG_ON in brcmnand_send_cmd
mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
mtd: maps: sa1100-flash: potential NULL dereference
mtd: atmel-quadspi: add driver for Atmel QSPI controller
mtd: nand: omap2: fix return value check in omap_nand_probe()
Documentation: atmel-quadspi: add binding file for Atmel QSPI driver
mtd: spi-nor: add hisilicon spi-nor flash controller driver
mtd: spi-nor: support dual, quad, and WP for Gigadevice
mtd: spi-nor: Added support for n25q00a.
memory: Update dependency of IFC for Layerscape
mtd: nand: jz4780: Update MODULE_AUTHOR email address
mtd: nand: sunxi: prevent a small memory leak
mtd: nand: sunxi: add reset line support
mtd: nand: sunxi: update DT bindings
...
Pull misc kbuild updates from Michal Marek:
- coccicheck script improvements by Luis Rodriguez and Deepa Dinamani
- new coccinelle patches by Yann Droneaud and Vaishali Thakkar
- debian packaging fixes by Wilfried Klaebe, Henning Schild and Marcin
Mielniczuk
* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Fix the Debian packaging script on systems with no codename
builddeb: fix file permissions before packaging
scripts/coccinelle: require coccinelle >= 1.0.4 on device_node_continue.cocci
coccicheck: refer to Documentation/coccinelle.txt and wiki
coccicheck: add support for requring a coccinelle version
scripts: add Linux .cocciconfig for coccinelle
coccicheck: replace --very-quiet with --quiet when debugging
coccicheck: add support for DEBUG_FILE
coccicheck: enable parmap support
coccicheck: make SPFLAGS more useful
coccicheck: move spatch binary check up
builddeb: really include objtool binary in headers package
coccinelle: catch krealloc() on devm_*() allocated memory
coccinelle: recognize more devm_* memory allocation functions
coccinelle: also catch kzfree() issues
coccicheck: Allow for overriding spatch flags
Coccinelle: noderef: Add new rules and correct the old rule
Pull kbuild updates from Michal Marek:
- GCC plugin support by Emese Revfy from grsecurity, with a fixup from
Kees Cook. The plugins are meant to be used for static analysis of
the kernel code. Two plugins are provided already.
- reduction of the gcc commandline by Arnd Bergmann.
- IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada
- bin2c fix by Michael Tautschnig
- setlocalversion fix by Wolfram Sang
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
gcc-plugins: disable under COMPILE_TEST
kbuild: Abort build on bad stack protector flag
scripts: Fix size mismatch of kexec_purgatory_size
kbuild: make samples depend on headers_install
Kbuild: don't add obj tree in additional includes
Kbuild: arch: look for generated headers in obtree
Kbuild: always prefix objtree in LINUXINCLUDE
Kbuild: avoid duplicate include path
Kbuild: don't add ../../ to include path
vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
kconfig.h: use already defined macros for IS_REACHABLE() define
export.h: use __is_defined() to check if __KSYM_* is defined
kconfig.h: use __is_defined() to check if MODULE is defined
kbuild: setlocalversion: print error to STDERR
Add sancov plugin
Add Cyclomatic complexity GCC plugin
GCC plugin infrastructure
Shared library support
Otherwise, there is potential for both DMF_SUSPENDED* and
DMF_NOFLUSH_SUSPENDING to not be set during dm_suspend() -- which is
definitely _not_ a valid state.
This fix, in conjuction with "dm rq: fix the starting and stopping of
blk-mq queues", addresses the potential for request-based DM multipath's
__multipath_map() to see !dm_noflush_suspending() during suspend.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Improve dm_stop_queue() to cancel any requeue_work. Also, have
dm_start_queue() and dm_stop_queue() clear/set the QUEUE_FLAG_STOPPED
for the blk-mq request_queue.
On suspend dm_stop_queue() handles stopping the blk-mq request_queue
BUT: even though the hw_queues are marked BLK_MQ_S_STOPPED at that point
there is still a race that is allowing block/blk-mq.c to call ->queue_rq
against a hctx that it really shouldn't. Add a check to
dm_mq_queue_rq() that guards against this rarity (albeit _not_
race-free).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # must patch dm.c on < 4.8 kernels
Multiple flags were being tested without locking. Protect against
non-atomic bit changes in m->flags by holding m->lock (while testing or
setting the queue_if_no_path related flags).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
VGIC implementation.
- s390: support for trapping software breakpoints, nested virtualization
(vSIE), the STHYI opcode, initial extensions for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
preliminary to this and the upcoming support for hardware virtualization
extensions.
- x86: support for execute-only mappings in nested EPT; reduced vmexit
latency for TSC deadline timer (by about 30%) on Intel hosts; support for
more than 255 vCPUs.
- PPC: bugfixes.
The ugly bit is the conflicts. A couple of them are simple conflicts due
to 4.7 fixes, but most of them are with other trees. There was definitely
too much reliance on Acked-by here. Some conflicts are for KVM patches
where _I_ gave my Acked-by, but the worst are for this pull request's
patches that touch files outside arch/*/kvm. KVM submaintainers should
probably learn to synchronize better with arch maintainers, with the
latter providing topic branches whenever possible instead of Acked-by.
This is what we do with arch/x86. And I should learn to refuse pull
requests when linux-next sends scary signals, even if that means that
submaintainers have to rebase their branches.
Anyhow, here's the list:
- arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
by the nvdimm tree. This tree adds handle_preemption_timer and
EXIT_REASON_PREEMPTION_TIMER at the same place. In general all mentions
of pcommit have to go.
There is also a conflict between a stable fix and this patch, where the
stable fix removed the vmx_create_pml_buffer function and its call.
- virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
This tree adds kvm_io_bus_get_dev at the same place.
- virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
file was completely removed for 4.8.
- include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
this is a change that should have gone in through the irqchip tree and
pulled by kvm-arm. I think I would have rejected this kvm-arm pull
request. The KVM version is the right one, except that it lacks
GITS_BASER_PAGES_SHIFT.
- arch/powerpc: what a mess. For the idle_book3s.S conflict, the KVM
tree is the right one; everything else is trivial. In this case I am
not quite sure what went wrong. The commit that is causing the mess
(fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
and arch/powerpc/kvm/. It's large, but at 396 insertions/5 deletions
I guessed that it wasn't really possible to split it and that the 5
deletions wouldn't conflict. That wasn't the case.
- arch/s390: also messy. First is hypfs_diag.c where the KVM tree
moved some code and the s390 tree patched it. You have to reapply the
relevant part of commits 6c22c98637, plus all of e030c1125e, to
arch/s390/kernel/diag.c. Or pick the linux-next conflict
resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
Second, there is a conflict in gmap.c between a stable fix and 4.8.
The KVM version here is the correct one.
I have pushed my resolution at refs/heads/merge-20160802 (commit
3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
=42ti
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.
- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.
- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.
- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
When the corrupt_bio_byte feature was introduced it caused READ bios to
no longer be errored with -EIO during the down_interval. This had to do
with the complexity of needing to submit READs if the corrupt_bio_byte
feature was used.
Fix it so READ bios are properly errored with -EIO; doing so early in
flakey_map() as long as there isn't a match for the corrupt_bio_byte
feature.
Fixes: a3998799fb ("dm flakey: add corrupt_bio_byte feature")
Reported-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Instead of copying the actual GRH of type struct ib_grh, existing code
copies the struct ib_global_route into the sge. This patch fixes that
and constructs the actual GRH from ib_global_route and copies the GRH
into the sge.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The interface is used to compute the 5-bit SC field from the
LRH and the RHF bits. Modify code to use the interface instead.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cleanup hfi1_ud_rcv to not have to look at the packet
header fields multiple times. The fields are looked up
once and used throughout the function. Also fix sc
computation when validating MAD packets.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
hfi1_pio_header should really be called hfi1_sdma_header
as it is only used for sdma transmits.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
struct ahg_ib_header has no header specific information.
Rename it to struct hfi1_ahg_info
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
sde and hfi1_ib_header are not used anymore.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Active QSFP cables were reset only every alternate iteration of the
channel tuning algorithm instead of every iteration due to incorrect
reset of the flag that controlled QSFP reset, resulting in using stale
QSFP status in the channel tuning algorithm.
Fixes: 8ebd4cf185 ("Add active and optical cable support")
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some QSFP cables assert the interrupt line as a side effect of module
plug-in and power up. This causes the SerDes and QSFP tuning algorithm
to begin cable initialization by reading the QSFP memory map over I2C,
which fails. This patch ignores any interrupt line assertion until
the module has completed power up and voltage rails have stabilized,
which can take a maximum of 500 ms per the SFF-8679 specification.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
QSFP CDR enablement is now controlled by determining power class
and the configuration file. We disable the DC 8051 from requesting
enablement or disabling of TX and RX CDRs by removing the code
that allowed the DC 8051 to request changes.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Hanging has been observed while writing a file over NFSoRDMA. Dmesg on
the server contains messages like these:
[ 931.992501] svcrdma: Error -22 posting RDMA_READ
[ 952.076879] svcrdma: Error -22 posting RDMA_READ
[ 982.154127] svcrdma: Error -22 posting RDMA_READ
[ 1012.235884] svcrdma: Error -22 posting RDMA_READ
[ 1042.319194] svcrdma: Error -22 posting RDMA_READ
Here is why:
With the base memory management extension enabled, FRMR is used instead
of FMR. The xprtrdma server issues each RDMA read request as the following
bundle:
(1)IB_WR_REG_MR, signaled;
(2)IB_WR_RDMA_READ, signaled;
(3)IB_WR_LOCAL_INV, signaled & fencing.
These requests are signaled. In order to generate completion, the fast
register work request is processed by the hfi1 send engine after being
posted to the work queue, and the corresponding lkey is not valid until
the request is processed. However, the rdmavt driver validates lkey when
the RDMA read request is posted and thus it fails immediately with error
-EINVAL (-22).
This patch changes the work flow of local operations (fast register and
local invalidate) so that fast register work requests are always
processed immediately to ensure that the corresponding lkey is valid
when subsequent work requests are posted. Local invalidate requests are
processed immediately if fencing is not required and no previous local
invalidate request is pending.
To allow completion generation for signaled local operations that have
been processed before posting to the work queue, an internal send flag
RVT_SEND_COMPLETION_ONLY is added. The hfi1 send engine checks this flag
and only generates completion for such requests.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This fix allows for support of in-kernel reserved operations
without impacting the ULP user.
The low level driver can register a non-zero value which
will be transparently added to the send queue size and hidden
from the ULP in every respect.
ULP post sends will never see a full queue due to a reserved
post send and reserved operations will never exceed that
registered value.
The s_avail will continue to track the ULP swqe availability
and the difference between the reserved value and the reserved
in use will track reserved availabity.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Trace shows incorrect amount of allocated memory.
Fix trace to display memory in KB.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Heldt <grzegorz.heldt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add sysfs entry to allow user to override affinity for SDMA
engine interrupts.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Enhance the PCIe Gen3 recipe to support static CTLE tuning,
and add a switch to choose between static and dynamic
approaches. Make discrete chips default to static CTLE
tuning.
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This fixes the following warnings with PROVE_LOCKING and PROVE_RCU
enabled in the kernel:
case (1):
[ INFO: suspicious RCU usage. ]
drivers/infiniband/hw/hfi1/init.c:532
suspicious rcu_dereference_check() usage!
case (2):
[ INFO: suspicious RCU usage. ]
drivers/infiniband/hw/hfi1/hfi.h:1624
suspicious rcu_dereference_check() usage!
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Read the version of the SBus, PCIe SerDes, and Fabric Serdes
firmwares at driver load time.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When link up fails in LNI, the local and peer state complete
frames are reported as numbers. Explain what the values mean
so the operator can better diagnose the problem.
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently, the default number of kernel receive contexts is set to the
number of NUMA nodes on the system plus one for control context. However,
the systems that have a single socket and/or have NUMA disabled in the BIOS
will have only one receive context by default. This patch would ensure that
by default there will be at least two kernel receive contexts plus one for
control context regardless of the number of NUMA nodes on the system. The
user can override the default number of kernel receive contexts with the
krcvqs module parameter.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Advertise and add the capability of handing all aspects of IBTA extended
memory management support in post send.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to support extended memory management support, add send side
processing of work requests of type IB_WR_REG_MR, IB_WR_LOCAL_INV, and
IB_WR_SEND_WITH_INV. The first two are local operations and are supported
for both RC and UC. Send with invalidate is only supported for RC because
the corresponding IB opcodes are not defined for UC.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
As part of enabling extended memory management support, add the processing
of the RC send with invalidate.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some work requests are local operations, such as IB_WR_REG_MR and
IB_WR_LOCAL_INV. They differ from non-local operations in that:
(1) Local operations can be processed immediately without being posted
to the send queue if neither fencing nor completion generation is needed.
However, to ensure correct ordering, once a local operation is posted to
the work queue due to fencing or completion requiement, all subsequent
local operations must also be posted to the work queue until all the
local operations on the work queue have completed.
(2) Local operations don't send packets over the wire and thus don't
need (and shouldn't update) the packet sequence numbers.
Define a new a flag bit for the post send table to identify local
operations.
Add a new field to the QP structure to track the number of local
operations on the send queue to determine if direct processing of new
local operations should be enabled/disabled.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to support extended memory management, add the mechanism to
invalidate MR keys. This includes a flag "lkey_invalid" in the MR data
structure that is to be checked when validating access to the MR via
the associated key, and two utility functions to perform fast memory
registration and memory key invalidate operations.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This implements the device specific function needed by the verbs
API function ib_map_mr_sg().
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
There were multiple places where FECN/BECN processing was
being done for the different types of QPs. All of that code
was very similar, which meant that it could be pulled into
a single function used by the different QP types.
To retain the performance in the fastpath, the common code
starts with an inline function, which only calls the slow
path if the packet has any of the [FB]ECN bits set.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
While handling buffer control MAD, partially initialized
dd->kernel_send_context area may cause potential dereference
of uninitialized pointers. Fix by using kzalloc_node()
instead of kmalloc_node().
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
PMA should not sum TX and RX replay counts when reporting
local link integrity errors. Fixed by removing C_DC_TX_REPLAY
counter from calculation of the link integrity errors counter
value.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Change rvt_post_one_wr to use the new table mechanism for
post send.
Validate that each low level driver specifies the table.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add flexibility for driver dependent operations in post send
because different drivers will have differing post send
operation support.
This includes data structure definitions to support a table
driven scheme along with the necessary validation routine
using the new table.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Prevent processing receive packet in case when opcode is
accepted by QP but handler for this type of packet is not
defined.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently each user context is assigned a single SDMA engine
based on the VL, context id, and subcontext id. That means for
MPI applications, each rank can only use one SDMA engine for
all messages. This may create unwanted backup for independent
messages going to different destinations upon congestion at one
destination.
This patch adds the packet "dlid" to the formula of SDMA engine
selection for user SDMA requests. A simple hash table is used
to maintain even distribution among the available SDMA engines
regardless how the "dlid" values are distributed.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Remove the TWSI code. The driver now uses the kernel's built-in
i2c bit bus module.
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Use built-in i2c bit-shift bus adapter to control the
i2c busses on the chip.
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When performing process affinity recommendations for MPI ranks, the current
algorithm doesn't take into account multiple HFI units. Also, real
cores and HT cores are not distinguished from one another. Therefore,
all HT cores are recommended to be assigned first within the local NUMA
node before recommending the assignments of cores in other NUMA nodes.
It's ideal to assign all real cores across all NUMA nodes first, then all
HT 1 cores, then all HT 2 cores, and so on to balance CPU workload. CPU
cores in other NUMA nodes could be running interrupt handlers, and this is
not taken into account.
To balance the CPU workload for user processes, the following
recommendation algorithm is used:
For each user process that is opening a context on HFI Y:
a) If all cores are assigned to user processes, start assignments all
over from the first core
b) Assign real cores first, then HT cores (First set of HT cores on
all physical cores, then second set of HT cores, and, so on) in the
following order:
1. Same NUMA node as HFI Y and not running an IRQ handler
2. Same NUMA node as HFI Y and running an IRQ handler
3. Different NUMA node to HFI Y and not running an IRQ handler
4. Different NUMA node to HFI Y and running an IRQ handler
c) Mark core as assigned in the global affinity structure. As user
processes are done, remove core assignments from global affinity
structure.
This implementation allows an arbitrary number of HT cores and provides
support for multiple HFIs.
This is being included in the kernel rather than user space due to the
fact that user space has no way of knowing the CPU recommendations for
contexts running as part of other jobs.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kernel receive queues oversubscribe CPU cores on multi-HFI systems.
To prevent this, the kernel receive queues are separated onto
different cores, and the SDMA engine interrupts are constrained to
a lesser number of cores.
hfi1s_on_numa_node*krcvqs is the number of CPU cores that are
reserved for kernel receive queues for all HFIs. Each HFI initializes
its kernel receive queues to one of the reserved CPU cores. If there
ends up being 0 CPU cores leftover for SDMA engines, use the same
CPU cores as receive contexts.
In addition, general and control contexts are assigned to their own
CPU core, however, both types of contexts tend to have low traffic.
To save CPU cores, collapse general and control contexts to one CPU
core for all HFI units. This change prevents SDMA engine interrupts
from wrapping around general contexts.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When HFI units get initialized, they each use their own mask copy for
affinity assignments. On a multi-HFI system, affinity assignments
overbook CPU cores as each HFI doesn't have knowledge of affinity
assignments for other HFI units. Therefore, some CPU cores are never
used for interrupt handlers in systems with high number of CPU cores
per NUMA node.
For multi-HFI systems, SDMA engine interrupt assignments start all over
from the first CPU in the local NUMA node after the first HFI
initialization. This change allows assignments to continue where the
last HFI unit left off.
Add global structure for affinity assignments for multiple HFIs to share
affinity mask.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
objdump's raw insn output can vary across architectures on the number of
bytes per chunk (bpc) displayed and their endianness.
The code-reading test relied on reading objdump output as 1 bpc. Kaixu
Xia reported test failure on ARM64, where objdump displays 4 bpc:
70c48: f90027bf str xzr, [x29,#72]
70c4c: 91224000 add x0, x0, #0x890
70c50: f90023a0 str x0, [x29,#64]
This patch adds support to read raw insn output for any bpc length.
In case of 2+ bpc it also guesses objdump's display endian.
Reported-and-Tested-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/07f0f7bcbda78deb423298708ef9b6a54d6b92bd.1452592712.git.jstancek@redhat.com
[ Fix up pr_fmt() call to use %zd for size_t variables, fixing the build on Ubuntu cross-compiling to armhf and ppc64 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Describe use of jiffy-based timeout values involved in inode maintenance.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>