Commit Graph

243 Commits

Author SHA1 Message Date
Michael S. Tsirkin 4e56ea794e IB/mthca: memfree completion with error FW bug workaround
Memfree firmware is in rare cases reporting WQE index == base - 1 in
receive completion with error, instead of (rq size - 1); base is 0 in
mthca.  Here is a patch to avoid kernel crash and report a correct WR
id in this case.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:20 -07:00
Michael S. Tsirkin 13aa6ecb47 IB/mthca: restore missing PCI registers after reset
mthca does not restore the following PCI-X/PCI Express registers after reset:
  PCI-X device: PCI-X command register
  PCI-X bridge: upstream and downstream split transaction registers
  PCI Express : PCI Express device control and link control registers

This causes instability and/or bad performance on systems where one of
these registers is set to a non-default value by BIOS.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:00 -07:00
Michael S. Tsirkin ab28b171ea IB/mthca: Fix posting lists of 256 receive requests to SRQ for Tavor
If we post a list of length exactly a multiple of 256, nreq in
doorbell gets set to 256 which is wrong: it should be encoded by 0.
This is because we only zero it out on the next WR, which may not be
there.  The solution is to ring the doorbell after posting a WQE, not
before posting the next one.

This is the same bug that we just fixed for QPs with non-shared RQ.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-24 13:43:37 -07:00
Bryan O'Sullivan 09b74de9ff IB/ipath: deref correct pointer when using kernel SMA
At this point, the core QP structure hasn't been initialized, so what's
in there isn't valid.  Get the same information elsewhere.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:29:35 -07:00
Bryan O'Sullivan 3977026462 IB/ipath: fix null deref during rdma ops
The problem was that node A's sending thread, which handles sending RDMA
read response data, would write the trigger word, the last packet would
be sent, node B would send a new RDMA read request, node A's interrupt
handler would initialize s_rdma_sge, then node A's sending thread would
update s_rdma_sge.  This didn't happen very often naturally but was more
frequent with 1 byte RDMA reads.  Rather than adding more locking or
increasing the QP structure size and copying sge data, I modified the
copy routine to update the pointers before writing the trigger word to
avoid the update race.

Signed-off-by: Ralph Campbell <ralphc@pathscale.com>
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:29:35 -07:00
Bryan O'Sullivan 41c75a19bf IB/ipath: register as IB device owner
This fixes an oops.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:29:35 -07:00
Bryan O'Sullivan 9dcc0e58e2 IB/ipath: enable PE800 receive interrupts on user ports
Fixed so it works on the PE-800.  It had not previously been updated to
match PE-800 receive interrupt differences from HT-400.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:29:35 -07:00
Bryan O'Sullivan f2080fa3c6 IB/ipath: enable GPIO interrupt on HT-460
This is required for even semi-decent performance on OpenIB.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:29:34 -07:00
Bryan O'Sullivan b0ff7c2005 IB/ipath: fix NULL dereference during cleanup
Fix NULL deref due to pcidev being clobbered before dd->ipath_f_cleanup()
was called.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:27:06 -07:00
Bryan O'Sullivan 94b8d9f98d IB/ipath: replace uses of LIST_POISON
Per Andrew's request.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:27:06 -07:00
Bryan O'Sullivan eaf6733bc1 IB/ipath: fix reporting of driver version to userspace
Fix the interface version that gets exported to userspace.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:27:06 -07:00
Bryan O'Sullivan b228b43c49 IB/ipath: don't modify QP if changes fail
Make sure modify_qp won't modify the QP if any of the changes failed.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:27:06 -07:00
Bryan O'Sullivan ebac3800e5 IB/ipath: fix spinlock recursion bug
The local loopback path for RC can lock the rkey table lock without
blocking interrupts.  The receive interrupt path can then call
ipath_rkey_ok() and deadlock.  Remove the redundant lock.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-23 13:27:06 -07:00
Michael S. Tsirkin 23f3bc0f2c IB/mthca: Fix posting lists of 256 receive requests for Tavor
If we post a list of length 256 exactly, nreq in doorbell gets set to
256 which is wrong: it should be encoded by 0.  This is because we
only zero it out on the next WR, which may not be there.  The solution
is to ring the doorbell after posting a WQE, not before posting the
next one.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-18 11:37:03 -07:00
Roland Dreier 1db76c14d2 IB/mthca: Make fw_cmd_doorbell default to 0
Setting fw_cmd_doorbell allows FW command to be queued using posted
writes instead of requiring polling on a "go" bit, so it should be a
performance boost.  However, the option causes problems with at least
some device/firmware combinations, so set the default to 0 until we
understand what's going on better.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-17 07:48:07 -07:00
Roland Dreier 6f4bb3d820 IB/ipath: Properly terminate PCI ID table
The ipath driver's table of PCI IDs needs a { 0, } entry at the end.
This makes all of the device aliases visible to userspace so hotplug
loads the module for all supported devices.  Without the patch,
modinfo ipath_core only shows:

    alias:          pci:v00001FC1d0000000Dsv*sd*bc*sc*i*

instead of the correct:

    alias:          pci:v00001FC1d00000010sv*sd*bc*sc*i*
    alias:          pci:v00001FC1d0000000Dsv*sd*bc*sc*i*

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
2006-05-12 14:57:52 -07:00
Michael S. Tsirkin ce477ae4f8 IB/mthca: FMR ioremap fix
Addresses for ioremap must be calculated off of pci_resource_start;
we can't directly use the bus address as seen by the HCA.  Fix the
code that remaps device memory for FMR access.

Based on patch by Klaus Smolin.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-10 15:16:57 -07:00
Roland Dreier a3285aa4ee IB/mthca: Fix race in reference counting
Fix races in in destroying various objects.  If a destroy routine
waits for an object to become free by doing

	wait_event(&obj->wait, !atomic_read(&obj->refcount));
	/* now clean up and destroy the object */

and another place drops a reference to the object by doing

	if (atomic_dec_and_test(&obj->refcount))
		wake_up(&obj->wait);

then this is susceptible to a race where the wait_event() and final
freeing of the object occur between the atomic_dec_and_test() and the
wake_up().  And this is a use-after-free, since wake_up() will be
called on part of the already-freed object.

Fix this in mthca by replacing the atomic_t refcounts with plain old
integers protected by a spinlock.  This makes it possible to do the
decrement of the reference count and the wake_up() so that it appears
as a single atomic operation to the code waiting on the wait queue.

While touching this code, also simplify mthca_cq_clean(): the CQ being
cleaned cannot go away, because it still has a QP attached to it.  So
there's no reason to be paranoid and look up the CQ by number; it's
perfectly safe to use the pointer that the callers already have.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-09 10:50:29 -07:00
Bryan O'Sullivan f6f0413e10 IB/ipath: tidy up white space in a few files
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:23 -07:00
Bryan O'Sullivan d562a5ae69 IB/ipath: fix label name in interrupt handler
Names that are the opposite of their intended meanings are not so helpful.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:22 -07:00
Bryan O'Sullivan ae15f540cc IB/ipath: improve sparse annotation
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:21 -07:00
Bryan O'Sullivan 9b2017f1e1 IB/ipath: simplify IB timer usage
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:21 -07:00
Bryan O'Sullivan 76f0dd141b IB/ipath: simplify RC send posting
Remove some unnecessarily complicated tests.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:20 -07:00
Bryan O'Sullivan c71c30dcba IB/ipath: prevent hardware from being accessed during reset
The reset code now turns off the PRESENT flag during a reset, so that
other code won't attempt to access a device that's in mid-reset.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:16 -07:00
Bryan O'Sullivan fccea66364 IB/ipath: fix verbs registration
Remember when the verbs layer unregisters from the lower-level code.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:15 -07:00
Bryan O'Sullivan 52e7fad825 IB/ipath: change handling of PIO buffers
Different ipath hardware types have different numbers of buffers
available, so we decide on the counts ourselves unless we are specifically
overridden with a module parameter.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:14 -07:00
Bryan O'Sullivan 23e86a4584 IB/ipath: iterate over correct number of ports during reset
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:13 -07:00
Bryan O'Sullivan 68dd43a162 IB/ipath: set up 32-bit DMA mask if 64-bit setup fails
Some systems do not set up 64-bit maps on systems with 2GB or less of
memory installed, so we have to fall back to trying a 32-bit setup.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:12 -07:00
Bryan O'Sullivan 755e4ca4a9 IB/ipath: fix race with exposing reset file
We were accidentally exposing the "reset" sysfs file more than once
per device.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 12:14:11 -07:00
Roland Dreier 254abfd33a IB/mthca: Fix offset in query_gid method
GuidInfo records have 8 byte GUIDs in them, so an index should be
multiplied by 8 to get an offset.  mthca_query_gid() was incorrectly
multiplying by 16.

Noticed by Leonid Keller <leonid@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 10:40:23 -07:00
Adrian Bunk 415dcd95b2 IB/mthca: make a function static
This patch makes the needlessly global mthca_update_rate() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-19 11:40:12 -07:00
Roland Dreier 5494c22ba2 IB/ipath: Fix whitespace
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-19 11:40:12 -07:00
Roland Dreier ac2ae4c977 IB/ipath: Make more names static
Make symbols that are only used in a single source file static.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-19 11:40:12 -07:00
Jack Morgenstein 59fef3b1e9 IB/mthca: Fix max_srq_sge returned by ib_query_device for Tavor devices
The driver allocates SRQ WQEs size with a power of 2 size both for
Tavor and for memfree. For Tavor, however, the hardware only requires
the WQE size to be a multiple of 16, not a power of 2, and the max
number of scatter-gather allowed is reported accordingly by the
firmware (and this is the value currently returned by
ib_query_device() and ibv_query_device()).

If the max number of scatter/gather entries reported by the FW is used
when creating an SRQ, the creation will fail for Tavor, since the
required WQE size will be increased to the next power of 2, which
turns out to be larger than the device permitted max WQE size (which
is not a power of 2).

This patch reduces the reported SRQ max wqe size so that it can be used
successfully in creating an SRQ on Tavor HCAs.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-12 11:42:30 -07:00
Michael S. Tsirkin abf45dbb5b IB/mthca: Disable tuning PCI read burst size
The PCI spec recommends against drivers playing with a device's PCI
read burst size, and says that systems software should configure it.
And we actually have users that report that changing it from the
default set by BIOS hurts performance and/or stability for them.  On
the other hand, the Mellanox Programmer's Reference Manual recommends
turning it up all the way to the maximum value.  Some tests conducted
here in the lab do not show performance improvement from this tuning,
but this might be just me.

As a work-around, make this tuning an option, off by default (safe
value), with an eye towards removing it completely one day if no one
complains.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:58 -07:00
Jack Morgenstein bf6a9e31cf IB: simplify static rate encoding
Push translation of static rate to HCA format into low-level drivers,
where it belongs.  For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage.  The changes are:

 - Add enum ib_rate to midlayer includes.
 - Get rid of static rate translation in IPoIB; just use static rate
   directly from Path and MulticastGroup records.
 - Update mthca driver to translate absolute static rate into the
   format used by hardware.  This also fixes mthca's static rate
   handling for HCAs that are capable of 4X DDR.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:47 -07:00
Roland Dreier 227c939b00 IB/mthca: Always build debugging code unless CONFIG_EMBEDDED=y
Change the mthca debugging trace output code so that it can enabled
and disabled at runtime with the debug_level module parameter in
sysfs.  Also, don't allow CONFIG_INFINIBAND_MTHCA_DEBUG to be disabled
unless CONFIG_EMBEDDED is selected.  We want users (and especially
distros) to have this turned on unless they really need to save space,
because by the time we want debugging output, it's usually too late to
rebuild a kernel.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-02 14:39:20 -07:00
Bryan O'Sullivan 77d8798b55 IB/ipath: kbuild infrastructure
Integrate the ipath core and OpenIB drivers into the kernel build
infrastructure.  Add entry to MAINTAINERS.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:21 -08:00
Bryan O'Sullivan 6522108f19 IB/ipath: infiniband verbs support
The ipath_verbs.c file implements the driver-specific components of the
kernel's Infiniband verbs layer.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:21 -08:00
Bryan O'Sullivan e28c00ad67 IB/ipath: misc infiniband code, part 2
Management datagram support, queue pairs, and reliable and unreliable
connections.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:21 -08:00
Bryan O'Sullivan cef1cce5c8 IB/ipath: misc infiniband code, part 1
Completion queues, local and remote memory keys, and memory region
support.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:20 -08:00
Bryan O'Sullivan 97f9efbc47 IB/ipath: infiniband RC protocol support
This is an implementation of the Infiniband RC ("reliable connection")
protocol.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:20 -08:00
Bryan O'Sullivan 74ed6b5eb1 IB/ipath: infiniband UC and UD protocol support
These files implement the Infiniband UC ("unreliable connection") and UD
("unreliable datagram") protocols.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:20 -08:00
Bryan O'Sullivan aa735edf5d IB/ipath: infiniband header files
These header files are used by the layered Infiniband driver.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:20 -08:00
Bryan O'Sullivan 889ab795a3 IB/ipath: layering interfaces used by higher-level driver code
The layering interfaces are used to implement the Infiniband protocols
and the ethernet emulation driver.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:20 -08:00
Bryan O'Sullivan 7f510b46e4 IB/ipath: support for userspace apps using core driver
These files introduce a char device that userspace apps use to gain
direct memory-mapped access to the InfiniPath hardware, and routines for
pinning and unpinning user memory in cases where the hardware needs to
DMA into the user address space.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:19 -08:00
Bryan O'Sullivan 3e9b4a5eb4 IB/ipath: sysfs and ipathfs support for core driver
The ipathfs filesystem contains files that are not appropriate for
sysfs, because they contain binary data.  The hierarchy is simple; the
top-level directory contains driver-wide attribute files, while numbered
subdirectories contain per-device attribute files.

Our userspace code currently expects this filesystem to be mounted on
/ipathfs, but a final location has not yet been chosen.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:19 -08:00
Bryan O'Sullivan 108ecf0d90 IB/ipath: misc driver support code
EEPROM support, interrupt handling, statistics gathering, and write
combining management for x86_64.

A note regarding i2c: The Atmel EEPROM hardware we use looks like an
i2c device electrically, but is not i2c compliant at all from a
functional perspective.  We tried using the kernel's i2c support to
talk to it, but failed.

Normal i2c devices have a single 7-bit or 10-bit i2c address that they
respond to.  Valid 7-bit addresses range from 0x03 to 0x77.  Addresses
0x00 to 0x02 and 0x78 to 0x7F are special reserved addresses
(e.g. 0x00 is the "general call" address.)  The Atmel device, on the
other hand, responds to ALL addresses.  It's designed to be the only
device on a given i2c bus.  A given i2c device address corresponds to
the memory address within the i2c device itself.

At least one reason why the linux core i2c stuff won't work for this
is that it prohibits access to reserved addresses like 0x00, which are
really valid addresses on the Atmel devices.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:19 -08:00
Bryan O'Sullivan 097709fea0 IB/ipath: chip initialisation code, and diag support
ipath_init_chip.c sets up an InfiniPath device for use.

ipath_diag.c permits userspace diagnostic tools to read and write a
chip's registers.  It is different in purpose from the mmap interfaces
to the /sys/bus/pci resource files.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:19 -08:00
Bryan O'Sullivan dc741bbd4f IB/ipath: support for PCI Express devices
This file contains routines and definitions specific to InfiniPath
devices that have PCI Express interfaces.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-31 13:14:19 -08:00