Using the new registration mechanism, define a flag that indicates the
user wishes to process RMPP messages in user space rather than have
the kernel process them.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
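
As a rough illustration of the flag described above, the sketch below
models a registration request that carries a flags word, plus helpers
that validate the flags and report whether RMPP is left to user space.
EXAMPLE_MAD_USER_RMPP, struct example_reg_req and both helpers are
hypothetical names chosen for this sketch, not the identifiers used by
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit: user space processes RMPP itself. */
#define EXAMPLE_MAD_USER_RMPP   (1u << 0)
/* Every flag this sketch understands. */
#define EXAMPLE_MAD_FLAGS_CAP   (EXAMPLE_MAD_USER_RMPP)

/* Hypothetical registration request; only the fields the sketch needs. */
struct example_reg_req {
	uint8_t  mgmt_class;
	uint32_t flags;
};

/* Reject requests that set flag bits the sketch does not know about. */
static bool example_flags_valid(const struct example_reg_req *req)
{
	return (req->flags & ~EXAMPLE_MAD_FLAGS_CAP) == 0;
}

/* True when the kernel should run RMPP on behalf of this registration. */
static bool example_kernel_handles_rmpp(const struct example_reg_req *req)
{
	return !(req->flags & EXAMPLE_MAD_USER_RMPP);
}
```

Rejecting unknown bits up front keeps room for later flags without
silently accepting requests the kernel cannot honor.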
Registration options are specified through flags. Definitions of flags will
be in subsequent patches.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Registration failures can be difficult to debug from userspace. This
gives more visibility.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Use dev_* style print when struct device is available.
Also combine previously line broken user-visible strings as per
Documentation/CodingStyle:
"However, never break user-visible strings such as printk messages,
because that breaks the ability to grep for them."
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
[ Remove PFX so the patch actually builds. - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>
Running with DMA_API_DEBUG enabled and not checking for DMA mapping
errors triggers a kernel stack trace with "DMA-API: device driver
failed to check map error" message. Add these checks to the MAD
module, both to be more robust and also eliminate these
false-positive stack traces.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Currently, QP1 is created using pkey index 0. This patch simply looks
for the index containing the default pkey, rather than hard-coding
pkey index 0.
This change will have no effect in native mode, since QP0 and QP1 are
created before the SM configures the port, so pkey table will still be
the default table defined by the IB Spec, in C10-123: "If non-volatile
storage is not used to hold P_Key Table contents, then if a PM
(Partition Manager) is not present, and prior to PM initialization of
the P_Key Table, the P_Key Table must act as if it contains a single
valid entry, at P_Key_ix = 0, containing the default partition
key. All other entries in the P_Key Table must be invalid."
Thus, in the native mode case, the driver will find the default pkey
at index 0 (so it will be no different than the hard-coding).
However, in SR-IOV mode, for VFs, the pkey table may be
paravirtualized, so that the VF's pkey index zero may not necessarily
be mapped to the real pkey index 0. For VFs, therefore, it is
important to find the virtual index which maps to the real default
pkey.
This commit does the following for QP1 creation:
1. Find the pkey index containing the default pkey, and use that index
if found. ib_find_pkey() returns the index of the
limited-membership default pkey (0x7FFF) if the full-member default
pkey is not in the table.
2. If neither form of the default pkey is found, use pkey index 0
(previous behavior).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
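
The two-step lookup for QP1 described above can be sketched in plain C
as follows. This is a user-space model under stated assumptions, not
the kernel code: the pkey table is a simple array, and
example_qp1_pkey_index() is a hypothetical helper that folds the
"prefer full member, else limited member, else index 0" policy into
one function.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_DEFAULT_PKEY_FULL  0xFFFFu  /* full-member default pkey */
#define EXAMPLE_PKEY_BASE_MASK     0x7FFFu  /* membership bit cleared */

/*
 * Return the table index to use for QP1: the full-member default pkey
 * if present, else the limited-member default pkey (0x7FFF), else 0.
 */
static size_t example_qp1_pkey_index(const uint16_t *table, size_t len)
{
	size_t i, limited = len;   /* 'len' means "not found yet" */

	for (i = 0; i < len; i++) {
		/* Skip entries whose base value is not the default pkey. */
		if ((table[i] & EXAMPLE_PKEY_BASE_MASK) != EXAMPLE_PKEY_BASE_MASK)
			continue;
		if (table[i] == EXAMPLE_DEFAULT_PKEY_FULL)
			return i;          /* full member wins outright */
		if (limited == len)
			limited = i;       /* remember first limited match */
	}
	return limited < len ? limited : 0;  /* fall back to index 0 */
}
```

With a native-mode table that holds only the default pkey at index 0,
the helper returns 0, which matches the old hard-coded behavior.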
Now that cancel_delayed_work() can be safely called from IRQ handlers,
there's no reason to use __cancel_delayed_work(). Use
cancel_delayed_work() instead of __cancel_delayed_work() and mark the
latter deprecated.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Roland Dreier <roland@kernel.org>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Now that mod_delayed_work() is safe to call from IRQ handlers,
__cancel_delayed_work() followed by queue_delayed_work() can be
replaced with mod_delayed_work().
Most conversions are straightforward except for the following.
* net/core/link_watch.c: linkwatch_schedule_work() was doing quite an
elaborate dance around its delayed_work. Collapse it such that
linkwatch_work is queued for immediate execution if LW_URGENT and
existing timer is kept otherwise.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Commit 0b30704304 ("IB/mad: Return error response for unsupported
MADs") does not handle failed MADs (e.g. those that return
IB_MAD_RESULT_FAILURE) properly -- these MADs should be silently
discarded. (We should not force the lower-layer drivers to return
SUCCESS | CONSUMED in this case, since the MAD is NOT successful).
Unsupported MADs are not failures -- they return SUCCESS, but with an
"unsupported error" status value inside the response MAD.
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 0b30704304 ("IB/mad: Return error response for unsupported
MADs") does not handle directed-route MADs properly -- it fails to set
the 'D' bit in the response MAD status field. This is a problem for
SmInfo MADs when the receiver does not have an SM running.
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
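
A minimal sketch of the status computation this fix is about, assuming
the constant values below mirror the kernel headers (they are written
from memory and should be checked against ib_mad.h and ib_smi.h). The
EX_ names and ex_unsupported_status() are hypothetical, and the status
is kept in host byte order for simplicity.

```c
#include <stdint.h>

/* Values assumed to mirror the kernel's ib_mad.h / ib_smi.h. */
#define EX_CLASS_SUBN_LID_ROUTED       0x01u
#define EX_CLASS_SUBN_DIRECTED_ROUTE   0x81u
#define EX_STATUS_UNSUP_METHOD_ATTRIB  0x000Cu
#define EX_SMP_DIRECTION               0x8000u  /* the 'D' bit */

/* Host-order status word for an "unsupported" response to this class. */
static uint16_t ex_unsupported_status(uint8_t mgmt_class)
{
	uint16_t status = EX_STATUS_UNSUP_METHOD_ATTRIB;

	/* Directed-route responses must also carry the 'D' bit. */
	if (mgmt_class == EX_CLASS_SUBN_DIRECTED_ROUTE)
		status |= EX_SMP_DIRECTION;
	return status;
}
```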
Set up a response with appropriate error status and send it for MADs
that are not supported by a specific class/version.
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Swapna Thete <swapna.thete@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include <linux/module.h>
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include <linux/module.h>
net: sch_generic remove redundant use of <linux/module.h>
net: inet_timewait_sock doesnt need <linux/module.h>
...
Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h
They had been getting it implicitly via device.h but we can't
rely on that for the future, due to a pending cleanup so fix
it now.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
If a received MAD contains an invalid or reserved mgmt class, we will
attempt to access method_table outside of its range. Add a check to
ensure that mgmt class falls within the handled range.
Found by code inspection.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
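
The range check can be pictured with the user-space sketch below. The
table size, the folding of the directed-route class onto slot 0, and
all ex_ names are assumptions made for illustration; the point is only
that the converted class is compared against the table bound before it
is used as an index.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_MGMT_CLASS            80     /* assumed table size */
#define EX_CLASS_SUBN_DIRECTED_ROUTE 0x81u

struct ex_method_table { int dummy; };

static struct ex_method_table *ex_tables[EX_MAX_MGMT_CLASS];

/* Fold the directed-route class onto slot 0; leave other values alone. */
static unsigned int ex_convert_mgmt_class(uint8_t mgmt_class)
{
	return mgmt_class == EX_CLASS_SUBN_DIRECTED_ROUTE ? 0 : mgmt_class;
}

/* Return the table for this class, or NULL if the class is out of range. */
static struct ex_method_table *ex_lookup(uint8_t mgmt_class)
{
	unsigned int idx = ex_convert_mgmt_class(mgmt_class);

	if (idx >= EX_MAX_MGMT_CLASS)
		return NULL;    /* invalid or reserved class: do not index */
	return ex_tables[idx];
}
```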
We had a script which was looping through the devices returned from
ibstat and attempted to register a SMI agent on an ethernet device.
This caused a kernel panic for IBoE devices that don't have QP0.
Fix this by checking if the QP exists before using it.
Signed-off-by: Ira Weiny <weiny2@llnl.gov>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Since IBoE is using Ethernet as its link layer, there is no central
management entity so there is no need for QP0. QP1 is still needed since
it handles communications between CM agents. This patch will skip QP0
and create only QP1 for IBoE ports.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Use kmemdup when some other buffer is immediately copied into the
allocated region.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression from,to,size,flag;
statement S;
@@
- to = \(kmalloc\|kzalloc\)(size,flag);
+ to = kmemdup(from,size,flag);
if (to==NULL || ...) S
- memcpy(to, from, size);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
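
For readers outside the kernel, the effect of the transformation above
can be approximated in user space. ex_memdup() below is a hypothetical
stand-in for kmemdup() built on malloc() and memcpy(); it takes no GFP
flags and is only meant to show the allocate-and-copy pattern that the
semantic patch collapses into a single call.

```c
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for kmemdup(): allocate and copy in one step. */
static void *ex_memdup(const void *src, size_t len)
{
	void *dst = malloc(len);

	if (dst)
		memcpy(dst, src, len);
	return dst;     /* NULL on allocation failure; caller frees */
}
```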
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
When an iWARP device is unloaded, the ib_mad module logs errors. It
should be ignoring iWARP devices on device removal just like it does
on device add.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Replace open-coded loop with for_each_set_bit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
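
To show what the iterator replaces, here is a single-word, user-space
analogue. ex_next_set_bit() and ex_for_each_set_bit() are hypothetical
names for this sketch; the kernel's for_each_set_bit() works on
multi-word bitmaps, which this simplified version does not attempt.
It assumes nbits does not exceed the width of unsigned long.

```c
#include <stddef.h>

/* Index of the first set bit at or after 'start', or 'nbits' if none. */
static size_t ex_next_set_bit(unsigned long word, size_t nbits, size_t start)
{
	size_t i;

	for (i = start; i < nbits; i++)
		if (word & (1UL << i))
			return i;
	return nbits;
}

/* Single-word analogue of the kernel's for_each_set_bit() iterator. */
#define ex_for_each_set_bit(bit, word, nbits)                  \
	for ((bit) = ex_next_set_bit((word), (nbits), 0);      \
	     (bit) < (nbits);                                  \
	     (bit) = ex_next_set_bit((word), (nbits), (bit) + 1))

/* Example user: count the set bits among the low 'nbits' bits. */
static size_t ex_count_set_bits(unsigned long word, size_t nbits)
{
	size_t bit, count = 0;

	ex_for_each_set_bit(bit, word, nbits)
		count++;
	return count;
}
```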
MADs are UD and can be dropped if there are no receives posted, so
allow receive queue size to be set with a module parameter in case the
queue needs to be lengthened. Send side tuning is done for symmetry
with receive.
Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Lockdep reported a possible deadlock with cm_id_priv->lock,
mad_agent_priv->lock and mad_agent_priv->timed_work.timer; this
happens because the mad module does
cancel_delayed_work(&mad_agent_priv->timed_work);
while holding mad_agent_priv->lock. cancel_delayed_work() internally
does del_timer_sync(&mad_agent_priv->timed_work.timer).
This can turn into a deadlock because mad_agent_priv->lock is taken
inside cm_id_priv->lock, so we can get the following set of contexts
that deadlock each other:
A: holding cm_id_priv->lock, waiting for mad_agent_priv->lock
B: holding mad_agent_priv->lock, waiting for del_timer_sync()
C: interrupt during mad_agent_priv->timed_work.timer that takes
cm_id_priv->lock
Fix this by using the new __cancel_delayed_work() interface (which
internally does del_timer() instead of del_timer_sync()) in all the
places where we are holding a lock.
Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757
Reported-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Rather than just defining static spinlock_t variables and then
initializing them later in init functions, simply define them with
DEFINE_SPINLOCK() and remove the calls to spin_lock_init(). This cleans
up the source a tad and also shrinks the compiled code; eg on x86-64:
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-40 (-40)
function old new delta
ib_uverbs_init 336 326 -10
ib_mad_init_module 147 137 -10
ib_sa_init 123 103 -20
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If ib_post_send_mad() returns 0, the API guarantees that there will be
a callback to send_buf->mad_agent->send_handler() so that the sender
can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf will be
leaked and the mad_agent reference count will never go to zero and the
IB device module cannot be unloaded. The above can happen without
this patch if process_mad() returns (IB_MAD_RESULT_SUCCESS |
IB_MAD_RESULT_CONSUMED).
If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
registered to receive the mad being sent, handle_outgoing_dr_smp()
returns zero which causes a MAD packet which is at the end of the
directed route to be incorrectly sent on the wire but doesn't cause a
hang since the HCA generates a send completion.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There is a potential race in ib_register_mad_agent() where the struct
ib_mad_agent_private is not fully initialized before it is added to
the list of agents per IB port. This means the ib_mad_agent_private
could be seen before the refcount, spin locks, and linked lists are
initialized. The fix is to initialize the structure earlier.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
handle_outgoing_dr_smp() can queue a struct ib_mad_local_private
*local on the mad_agent_priv->local_work work queue with
local->mad_priv == NULL if device->process_mad() returns
IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
(!ib_response_mad(&mad_priv->mad.mad) ||
!mad_agent_priv->agent.recv_handler).
In this case, local_completions() will be called with local->mad_priv
== NULL. The code does check for this case and skips calling
recv_mad_agent->agent.recv_handler() but recv == 0 so
kmem_cache_free() is called with a NULL pointer.
Also, since recv isn't reinitialized each time through the loop, it
can cause a memory leak if recv should have been zero.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Use krealloc() instead of kmalloc() followed by memcpy() when resizing
the MAD module's snoop table.
Also put parentheses around the new table size to avoid calculating
the wrong size to allocate, which fixes a bug pointed out by Haven
Hash <haven.hash@isilon.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
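
The resize pattern, including the parenthesization issue, can be
sketched in user space with realloc() standing in for krealloc(). The
struct and function names are hypothetical. The detail to notice is
that the element count is parenthesized before it is multiplied by the
element size; without the parentheses, operator precedence yields the
wrong byte count.

```c
#include <stdlib.h>

struct ex_snoop;   /* opaque; only pointers are stored */

/* Grow a table of pointers by one slot, preserving existing entries. */
static struct ex_snoop **ex_grow_table(struct ex_snoop **table, size_t old_len)
{
	struct ex_snoop **bigger;

	/* (old_len + 1) elements, each sizeof(*table) bytes. */
	bigger = realloc(table, (old_len + 1) * sizeof(*table));
	if (!bigger)
		return NULL;           /* original table is still valid */
	bigger[old_len] = NULL;        /* new slot starts empty */
	return bigger;
}
```

On failure the caller still owns the old table, which is the same
contract realloc() and krealloc() provide.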
This fixes the problem of incoming BMA responses being dropped due to
a bad "is response" check. Fix the test to use the ib_response_mad()
predicate, which correctly handles BMA MADs.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=988>.
Signed-off-by: Michael Brooks <michael.brooks@qlogic.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
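
A sketch of what a class-aware "is response" predicate looks like. The
constants are assumed to mirror the kernel's ib_mad.h values and
should be verified there; struct ex_mad_hdr is a cut-down, host-order
stand-in for the real MAD header, and ex_response_mad() is a
hypothetical name, not the kernel function itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants assumed to mirror the kernel's ib_mad.h definitions. */
#define EX_METHOD_RESP          0x80u  /* response bit in the method field */
#define EX_METHOD_TRAP_REPRESS  0x07u
#define EX_CLASS_BM             0x05u  /* baseboard management class */
#define EX_BM_ATTR_MOD_RESP     0x1u   /* BM marks responses in attr_mod */

struct ex_mad_hdr {
	uint8_t  mgmt_class;
	uint8_t  method;
	uint32_t attr_mod;   /* host byte order in this sketch */
};

/* True if the header describes a response, including BM responses. */
static bool ex_response_mad(const struct ex_mad_hdr *hdr)
{
	return (hdr->method & EX_METHOD_RESP) ||
	       hdr->method == EX_METHOD_TRAP_REPRESS ||
	       (hdr->mgmt_class == EX_CLASS_BM &&
		(hdr->attr_mod & EX_BM_ATTR_MOD_RESP));
}
```

A test that looks only at the method's response bit would miss the
last case, which is how BMA responses can end up being dropped.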
If a low-level driver returns IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED,
handle_outgoing_dr_smp() doesn't clean up properly. The fix is to
kfree the local data and break, rather than falling through. This was
observed with the ipath driver, but could happen with any driver.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1027>.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In cancel_mads(), MADs are moved from the wait_list and local_list
to a cancel_list for processing. However, the structures on these two
lists are not the same. The wait_list references struct
ib_mad_send_wr_private, but local_list references struct
ib_mad_local_private. Cancel_mads() treats all items moved to the
cancel_list as struct ib_mad_send_wr_private. This leads to a system
crash when requests are moved from the local_list to the cancel_list.
Fix this by leaving local_list alone. All requests on the local_list
have completed and are just awaiting processing by a queued worker thread.
Bug (crash) reported by Dotan Barak <dotanb@dev.mellanox.co.il>.
Problem with local_list access reported by Robert Reynolds
<rreynolds@opengridcomputing.com>.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
To allow ULPs to tune timeout values and capture retry statistics,
report the number of times that a mad send operation was retried.
For RMPP mads, report the total number of times that any portion
(send window) of the send operation was retried.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The local loopback of an outgoing DR SMP response is limited to those
that originate at the driver specific SMA implementation during the
driver specific process_mad() function. This patch enables a
returning DR SMP originating in userspace (or elsewhere) to be
delivered to the local management stack. In this specific case the
driver process_mad() function does not consume or process the MAD, so
a response MAD has not been created and the original MAD must manually be
copied to the MAD buffer that is to be handed off to the local agent.
Signed-off-by: Steve Welch <swelch@systemfabricworks.com>
Acked-by: Hal Rosenstock <hal@xsigo.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In ib_mad_recv_done_handler(), the response pointer is checked for
NULL after allocating it. It is then checked again in the local
process_mad() path but there is no possibility of it changing in
between.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Acked-by: Hal Rosenstock <hal@xsigo.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If agent_send_response() returns an error, we shouldn't do anything
differently than if it succeeds; setting response to NULL just means
that the response buffer gets leaked.
Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com>
Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If ib_mad_recv_done_handler() fails to allocate response, then it just
prints a warning and continues, which leads to an oops if the MAD is
being handled for a switch device, because the switch code uses
response without checking for NULL. Fix this by bailing out of the
function if the allocation fails.
Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com>
Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Extend the SMI with switch (intermediate hop) support. Care has been
taken to ensure that the CA (and router) code paths are changed as
little as possible.
Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com>
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a num_comp_vectors member to struct ib_device and extend
ib_create_cq() to pass in a comp_vector parameter -- this parallels
the userspace libibverbs API. Update all hardware drivers to set
num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector
value. Pass the value of num_comp_vectors to userspace rather than
hard-coding a value of 1.
We want multiple CQ event vector support (via MSI-X or similar for
adapters that can generate multiple interrupts), but it's not clear
how many vectors we want, or how we want to deal with policy issues
such as how to decide which vector to use or how to set up interrupt
affinity. This patch is useful for experimenting, since no core
changes will be necessary when updating a driver to support multiple
vectors, and we know that we want to make at least these changes
anyway.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Clarify code by changing return values from SMI functions to named
enum values instead of magic 0/1 values.
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
struct ib_wc currently only includes the local QP number: this matches
the IB spec, but seems mostly useless. The following patch replaces
this with the pointer to qp itself, and updates all low level drivers
and all users.
This has the following advantages:
- Ability to get a per-qp context through wc->qp->qp_context
- Existing drivers already have the qp pointer ready in poll cq, so
this change actually saves a tiny bit (extra memory read) on data path
(for ehca it would actually be expensive to find the QP pointer when
polling a CQ, but ehca does not support SRQ so we can leave wc->qp as
NULL for ehca)
- Users that need the QP number can still get it through wc->qp->qp_num
Use case:
In IPoIB connected mode code, I have a common CQ shared by multiple
QPs. To track connection usage, I need a way to get at some per-QP
context upon the completion, and I would like to avoid allocating
context object per work request just to stick a QP pointer into it.
With this code, I can just use wc->qp->qp_context.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
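
The use case above might look roughly like this. The structures are
cut-down stand-ins for struct ib_qp and struct ib_wc, and struct
ex_conn plus ex_handle_wc() are hypothetical; only the
wc->qp->qp_context and wc->qp->qp_num access paths come from the
commit text.

```c
#include <stdint.h>

/* Cut-down stand-ins for the verbs structures named in the commit. */
struct ex_qp {
	uint32_t qp_num;
	void    *qp_context;   /* owner's per-QP data */
};

struct ex_wc {
	uint64_t      wr_id;
	struct ex_qp *qp;      /* replaces the bare local QP number */
};

/* Hypothetical per-connection state a ULP might hang off qp_context. */
struct ex_conn {
	unsigned long completions;
};

/* Completion handler for a CQ shared by several QPs. */
static void ex_handle_wc(const struct ex_wc *wc)
{
	struct ex_conn *conn = wc->qp->qp_context;

	conn->completions++;   /* no per-work-request context needed */
}
```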
Convert code in core/ to use the new DMA mapping functions for kernel
verbs consumers.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
When ib_cancel_mad() is called, it puts the canceled send on a list
and schedules a "flushed" callback from process context. However,
this leaves a window where a receive completion could be processed
before the send is fully flushed.
This is fine, except that ib_find_send_mad() will find the MAD and
return it to the receive processing, which results in the sender
getting both a successful receive and a "flushed" send completion for
the same request. Understandably, this confuses the sender, which is
expecting only one of these two callbacks, and leads to grief such as
a use-after-free in IPoIB.
Fix this by changing ib_find_send_mad() to return a send struct only
if the status is still successful (and not "flushed"). The search of
the send_list already had this check, so this patch just adds the same
check to the search of the wait_list.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
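
The added check can be modeled with a simple singly linked wait list.
This is a sketch, not the kernel code: the list type, the status enum
and ex_find_send() are hypothetical, and matching is reduced to the
transaction ID alone. A matching entry that has already been marked
flushed is treated as not found.

```c
#include <stddef.h>
#include <stdint.h>

enum ex_wc_status { EX_WC_SUCCESS, EX_WC_WR_FLUSH_ERR };

struct ex_send_wr {
	uint64_t           tid;      /* transaction ID to match on */
	enum ex_wc_status  status;   /* set to flushed once canceled */
	struct ex_send_wr *next;
};

/* Return the send matching 'tid' only if it has not been canceled. */
static struct ex_send_wr *ex_find_send(struct ex_send_wr *wait_list,
				       uint64_t tid)
{
	struct ex_send_wr *wr;

	for (wr = wait_list; wr; wr = wr->next) {
		if (wr->tid != tid)
			continue;
		/* A canceled send must not be handed to receive processing. */
		return wr->status == EX_WC_SUCCESS ? wr : NULL;
	}
	return NULL;
}
```

With this check the sender sees either the receive or the "flushed"
send completion for a given request, never both.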
* Roughly half of callers already do it by not checking return value
* Code in drivers/acpi/osl.c does the following to be sure:
(void)kmem_cache_destroy(cache);
* Those who check it printk something, however, slab_error already printed
the name of failed cache.
* XFS BUGs on failed kmem_cache_destroy which is not the decision
low-level filesystem driver should make. Converted to ignore.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Modifications to the existing rdma header files, core files, drivers,
and ulp files to support iWARP, including:
- Hook iWARP CM into the build system and use it in rdma_cm.
- Convert enum ib_node_type to enum rdma_node_type, which includes
the possibility of RDMA_NODE_RNIC, and update everything for this.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Remove some trailing whitespace that has snuck in despite the best
efforts of whitespace=error-all. Also fix a few other whitespace
bogosities.
Signed-off-by: Roland Dreier <rolandd@cisco.com>