2743 Commits

Author SHA1 Message Date
Joe Perches f4f01b542c infiniband: Remove duplicated KERN_<LEVEL> from pr_<level> uses
These KERN_<LEVEL> uses are unnecessary with pr_<level> and cause
bad logging output so remove them.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-12 15:52:37 -04:00
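The duplication described in f4f01b542c can be sketched in userspace. This is a minimal illustration, not kernel code: DEMO_KERN_INFO and DEMO_PR_INFO_FMT are hypothetical stand-ins for KERN_INFO and the format handling inside pr_info(), and the "<6>" prefix is only an assumed placeholder for the real level marker.

```c
#include <string.h>

/* Hypothetical stand-in for a kernel log-level prefix such as KERN_INFO. */
#define DEMO_KERN_INFO "<6>"

/* Like pr_info(), this macro already prepends the level to the format. */
#define DEMO_PR_INFO_FMT(fmt) DEMO_KERN_INFO fmt

/* Returns 1 if the level prefix appears twice at the start of the string. */
static int has_duplicate_level(const char *s)
{
    size_t n = strlen(DEMO_KERN_INFO);

    return strncmp(s, DEMO_KERN_INFO, n) == 0 &&
           strncmp(s + n, DEMO_KERN_INFO, n) == 0;
}

/* Passing the level again yields "<6><6>link up": the bad output. */
static const char *demo_bad = DEMO_PR_INFO_FMT(DEMO_KERN_INFO "link up\n");

/* Passing only the message yields a single prefix. */
static const char *demo_good = DEMO_PR_INFO_FMT("link up\n");
```

Calling has_duplicate_level() on demo_bad and demo_good shows which of the two formats carries the stray second prefix.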
Mike Marciniszyn ec40f925e0 IB/qib: fix test of unsigned variable
Commit d4988623cc ("IB/qib: use arch_phys_wc_add()")
adjusted mtrr initialization to use the new interface.

Unfortunately, the new interface returns a signed
value and the patch tested the unsigned wc_cookie.

Fix the issue by changing the type of wc_cookie to int.  For
the success case the ret is left at zero to avoid
a warning from the caller.  For failure wc_cookie
is used as the ret.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-12 13:55:41 -04:00
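The signed/unsigned problem fixed in ec40f925e0 can be reproduced outside the kernel. The sketch below is an assumption-laden illustration: demo_wc_add() is a hypothetical stand-in for an interface that returns a non-negative cookie or a negative errno, and it does not model arch_phys_wc_add() itself.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in: returns a cookie (>= 0) or a negative errno. */
static int demo_wc_add(int should_fail)
{
    return should_fail ? -EINVAL : 1;
}

/* Buggy pattern: the result lands in an unsigned variable, so a negative
 * errno turns into a very large positive value and "< 0" can never fire. */
static unsigned int demo_store_unsigned(int should_fail)
{
    return (unsigned int)demo_wc_add(should_fail);
}

/* Fixed pattern: keep the cookie signed so the failure test works. */
static int demo_failed_signed(int should_fail)
{
    int cookie = demo_wc_add(should_fail);

    return cookie < 0;
}
```

With a failing call, demo_store_unsigned() returns a value above INT_MAX, while demo_failed_signed() reports the failure as expected.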
Steve Wise 940fd304d2 iw_cxgb4: use wildcard mapping for getting remote addr info
For listening endpoints bound to the wildcard address, we need to pass
the wildcard address mapping to iwpm_get_remote_info() instead of the
mapped address of the new child connection.

Without this fix, and with iwarp port mapping enabled, each iw_cxgb4
connection that is spawned from a listening endpoint bound to the wildcard
address, will generate an annoying dmesg entry about failing to find
the remote address mapping info, and the connection state displayed in
debugfs under /sys/kernel/debug/iw_cxgb4/<pci-slot-no>/eps will not have
the peer's address/port mapping info.  The connection still works though.

Fixes: 5b6b8fe ("RDMA/cxgb4: Report the actual address of the remote connecting peer")

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-11 17:16:49 -04:00
Nicholas Mc Guire 94634e9861 IB/ehca: use correct destination for memcpy
Using an element of a struct as the address for the memcpy of the whole
struct may introduce a buffer overflow and does not help readability either;
simply pass the real thing as the first argument to memcpy.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-11 17:14:37 -04:00
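A small sketch of the memcpy pattern from 94634e9861. The struct demo_attr and both helpers are hypothetical and only illustrate why a member address combined with the size of the whole struct can overrun the destination; the actual ehca structures are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; "values" is deliberately not the first member. */
struct demo_attr {
    int header;
    int values[3];
};

/* Bytes that would be written past the end of *dst if memcpy() were given
 * &dst->values as the destination together with sizeof(*dst) as the size. */
static size_t demo_overrun_bytes(void)
{
    return offsetof(struct demo_attr, values);
}

/* Correct form: the destination is the struct itself. */
static void demo_copy(struct demo_attr *dst, const struct demo_attr *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

When the member happens to be the first one the overrun is zero, which is why the original code could appear to work while still being fragile.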
Hariprasad S 179d03bbfd iw_cxgb4: Remove negative advice dmesg warnings
Remove these log messages in favor of per-endpoint counters as well as
device-global counters that can be inspected via debugfs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 13:21:27 -04:00
Luis R. Rodriguez d4988623cc IB/qib: use arch_phys_wc_add()
This driver already makes use of ioremap_wc() on PIO buffers,
so convert it to use arch_phys_wc_add().

The qib driver uses a mmap() special case for when PAT is
not used. This behaviour used to be determined with a
module parameter, but since we have been asked to just
remove that module parameter, this checks for the WC cookie:
if it is not set we can assume PAT was used. If it's set we do
what we used to do for the mmap when MTRR was enabled.

The removal of the module parameter is OK given that Andy
notes that even if users of module parameter are still around
it will not prevent loading of the module on recent kernels.

Cc: Doug Ledford <dledford@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: konrad.wilk@oracle.com
Cc: ville.syrjala@linux.intel.com
Cc: david.vrabel@citrix.com
Cc: jbeulich@suse.com
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: infinipath@intel.com
Cc: linux-rdma@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:02 -04:00
Luis R. Rodriguez 87a26e976c IB/qib: add acounting for MTRR
There is no good reason not to, we eventually delete it as well.

Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Mike Marciniszyn <infinipath@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:02 -04:00
Steve Wise 5b6b8fe640 RDMA/cxgb4: Report the actual address of the remote connecting peer
Get the actual (non-mapped) ip/tcp address of the connecting peer from
the port mapper.

Also set up the passive side endpoint to correctly display the actual
and mapped addresses for the new connection.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Tatyana Nikolova 230da36ae9 RDMA/nes: Report the actual address of the remote connecting peer
Get the actual (non-mapped) ip/tcp address of the connecting peer from
the port mapper and report the address info to the user space application
at the time of connection establishment.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Hariprasad S 4a75a86c8d iw_cxgb4: enforce qp/cq id requirements
Currently the iw_cxgb4 implementation requires the qp and cq qid densities
to match as well as the qp and cq id ranges.  So fail a device open if
the device configuration doesn't meet the requirements.

The reason for these restrictions has to do with the fact that IQ qid X
has a UGTS register in the same bar2 page as EQ qid X.  Thus both qids
need to be allocated to the same user process for security reasons.
The logic that does this (the qpid allocator in iw_cxgb4/resource.c)
handles this but requires the above restrictions.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Hariprasad S 09ece8b9e9 iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
For T5, we must not use the kdb/kgts registers, in order to avoid db drops
under extreme loads.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Hariprasad S 6198dd8d7a iw_cxgb4: 32b platform fixes
- get_dma_mr() was using ~0UL which should be ~0ULL.  This causes the
  DMA MR to get set up incorrectly in hardware.

- wr_log_show() needed a 64b divide function div64_u64() instead of
  doing division directly.

- fixed warnings about recasting a pointer to a u64

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
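The ~0UL versus ~0ULL point in 6198dd8d7a can be shown portably. This is a sketch under one stated assumption: uint32_t is used to emulate "unsigned long" as it behaves on a 32-bit platform, so the example gives the same result on any host. The function names are hypothetical.

```c
#include <stdint.h>

/* Emulates ~0UL on a 32-bit platform: only the low 32 bits are set
 * once the value is widened to 64 bits. */
static uint64_t demo_mask_ul32(void)
{
    return (uint64_t)(uint32_t)~(uint32_t)0;
}

/* ~0ULL: all 64 bits are set, which is what a 64-bit length needs. */
static uint64_t demo_mask_ull(void)
{
    return ~(uint64_t)0;
}
```

The first helper yields 0x00000000FFFFFFFF and the second 0xFFFFFFFFFFFFFFFF, which is the difference between a 4 GiB limit and the full 64-bit range.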
Hariprasad S 0b7410471d iw_cxgb4: Cleanup register defines/MACROS
Cleanup macros and register defines for consistency

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Linus Torvalds 9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00
Linus Torvalds 7c034dfd58 InfiniBand/RDMA updates for 4.1:
  - IPoIB fixes from Doug Ledford and Erez Shitrit
  - iSER updates from Sagi Grimberg
  - mlx4 GUID handling changes from Yishai Hadas
  - other misc fixes

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA updates from Roland Dreier:

 - IPoIB fixes from Doug Ledford and Erez Shitrit

 - iSER updates from Sagi Grimberg

 - mlx4 GUID handling changes from Yishai Hadas

 - other misc fixes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (51 commits)
  mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures
  IB/iser: Rewrite bounce buffer code path
  IB/iser: Bump version to 1.6
  IB/iser: Remove code duplication for a single DMA entry
  IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and iser_reg_sig_mr
  IB/iser: Modify struct iser_mem_reg members
  IB/iser: Make fastreg pool cache friendly
  IB/iser: Move PI context alloc/free to routines
  IB/iser: Move fastreg descriptor pool get/put to helper functions
  IB/iser: Merge build page-vec into register page-vec
  IB/iser: Get rid of struct iser_rdma_regd
  IB/iser: Remove redundant assignments in iser_reg_page_vec
  IB/iser: Move memory reg/dereg routines to iser_memory.c
  IB/iser: Don't pass ib_device to fall_to_bounce_buff routine
  IB/iser: Remove a redundant struct iser_data_buf
  IB/iser: Remove redundant cmd_data_len calculation
  IB/iser: Fix wrong calculation of protection buffer length
  IB/iser: Handle fastreg/local_inv completion errors
  IB/iser: Fix unload during ep_poll wrong dereference
  ib_srpt: convert printk's to pr_* functions
  ...
2015-04-22 11:50:05 -07:00
Linus Torvalds 388f997620 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix verifier memory corruption and other bugs in BPF layer, from
    Alexei Starovoitov.

 2) Add a conservative fix for doing BPF properly in the BPF classifier
    of the packet scheduler on ingress.  Also from Alexei.

 3) The SKB scrubber should not clear out the packet MARK and security
    label, from Herbert Xu.

 4) Fix oops on rmmod in stmmac driver, from Bryan O'Donoghue.

 5) Pause handling is not correct in the stmmac driver because it
    doesn't take into consideration the RX and TX fifo sizes.  From
    Vince Bridgers.

 6) Failure path missing unlock in FOU driver, from Wang Cong.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  net: dsa: use DEVICE_ATTR_RW to declare temp1_max
  netns: remove BUG_ONs from net_generic()
  IB/ipoib: Fix ndo_get_iflink
  sfc: Fix memcpy() with const destination compiler warning.
  altera tse: Fix network-delays and -retransmissions after high throughput.
  net: remove unused 'dev' argument from netif_needs_gso()
  act_mirred: Fix bogus header when redirecting from VLAN
  inet_diag: fix access to tcp cc information
  tcp: tcp_get_info() should fetch socket fields once
  net: dsa: mv88e6xxx: Add missing initialization in mv88e6xxx_set_port_state()
  skbuff: Do not scrub skb mark within the same name space
  Revert "net: Reset secmark when scrubbing packet"
  bpf: fix two bugs in verification logic when accessing 'ctx' pointer
  bpf: fix bpf helpers to use skb->mac_header relative offsets
  stmmac: Configure Flow Control to work correctly based on rxfifo size
  stmmac: Enable unicast pause frame detect in GMAC Register 6
  stmmac: Read tx-fifo-depth and rx-fifo-depth from the devicetree
  stmmac: Add defines and documentation for enabling flow control
  stmmac: Add properties for transmit and receive fifo sizes
  stmmac: fix oops on rmmod after assigning ip addr
  ...
2015-04-17 16:31:08 -04:00
Michal Hocko f72f116a2a cxgb4: drop __GFP_NOFAIL allocation
set_filter_wr is requesting __GFP_NOFAIL allocation although it can return
ENOMEM without any problems obviously (t4_l2t_set_switching does that
already).  So the non-failing requirement is too strong without any
obvious reason.  Drop __GFP_NOFAIL and reorganize the code to make the
failure paths simpler.

The same applies to _c4iw_write_mem_dma_aligned which uses __GFP_NOFAIL
and then checks the return value and returns -ENOMEM on failure.  This
doesn't make any sense whatsoever.  Either the allocation cannot fail or
it can.

del_filter_wr seems to be safe as well because the filter entry is not
marked as pending and the return value is propagated up the stack up to
c4iw_destroy_listen.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-16 12:03:01 -04:00
Doug Ledford c1c2fef6cf Merge branches 'cve-fixup', 'ipoib', 'iser', 'misc-4.1', 'or-mlx4' and 'srp' into for-4.1 2015-04-15 16:24:49 -04:00
Sebastian Ott cc47d369b5 infiniband/mlx4: check for mapping error
Since ib_dma_map_single can fail use ib_dma_mapping_error to check
for errors.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:39 -04:00
Erez Shitrit ca9b590caa IB/mlx4: Fix WQE LSO segment calculation
The current code decreases from the mss size (which is the gso_size
from the kernel skb) the size of the packet headers.

It shouldn't do that because the mss that comes from the stack
(e.g. IPoIB) includes only the tcp payload without the headers.

The result is an indication to the HW that each packet that the HW sends
is smaller than what it could be, and too many packets will be sent
for big messages.

An easy way to demonstrate one more aspect of the problem is by
configuring the ipoib mtu to be less than 2*hlen (2*56) and then
running an app that sends big TCP messages. This will tell the HW to send
packets with a giant length (a negative value which under unsigned
arithmetic becomes a huge positive one), and the QP moves to SQE state.

Fixes: b832be1e40 ('IB/mlx4: Add IPoIB LSO support')
Reported-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:19 -04:00
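The unsigned wrap-around mentioned in ca9b590caa can be sketched as follows. Both helpers are hypothetical simplifications; they only model the arithmetic (subtracting a header length from an mss that already excludes headers), not the mlx4 WQE layout.

```c
#include <stdint.h>

/* Buggy pattern: the header length is subtracted from an mss that
 * already excludes the headers; a small mss wraps to a huge value. */
static uint32_t demo_seg_size_buggy(uint32_t mss, uint32_t hlen)
{
    return mss - hlen;
}

/* Fixed pattern: the mss from the stack is used as the payload size. */
static uint32_t demo_seg_size_fixed(uint32_t mss, uint32_t hlen)
{
    (void)hlen; /* headers are not part of the mss */
    return mss;
}
```

With mss = 40 and hlen = 56 the buggy form returns 4294967280 (that is, 2^32 - 16) instead of a negative number, which matches the "giant length" described in the commit message.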
Yishai Hadas 56c1d2335b IB/mlx4: Change alias guids default to be host assigned
Change the default mode to be HOST assigned instead of SM assigned. This is
the expected operational mode, because it doesn't depend on SM availability.

As the PF generates random GUIDs as the initial admin values, this gives
an out-of-the-box experience.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas ee59fa0d7e IB/mlx4: Request alias GUID on demand
Request GIDs from the SM on demand, i.e., when a VF actually needs them,
and release them when the GIDs are no longer in use.

In cloud environments, this is useful for GID migrations, in which a
GID is assigned to a VF on the destination HCA, while the VF on the
source HCA is shutdown (but the GID was not administratively released).

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas f547960128 IB/mlx4: Change init flow to request alias GUIDs for active VFs
Change the init flow to request GUIDs only for active VFs. This is done for
both SM & HOST modes so that there is no longer any need to maintain the
ownership record type.

In SM mode the initial value will be 0, which asks the SM to assign a GUID;
in HOST mode the initial value will be the HOST generated GUID.

This will enable an out-of-the-box experience for both probed and attached VFs.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas 2350f24774 IB/mlx4: Manage admin alias GUID upon admin request
Set the admin alias GUID per the administrator's request via the sysfs
mechanism into the core layer.

The "get" request returns the current value. However, if the administrator
requests the SM to assign a new value by requesting 0, the SM assigned
GUID is returned.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas 99ee4df6aa IB/mlx4: Alias GUID adding persistency support
If the SM rejects an alias GUID request the PF driver keeps trying to acquire
the specified GUID indefinitely, utilizing an exponential backoff scheme.

Retrying is managed per GUID entry. Each entry that wasn't applied holds its
next retry information. Retry requests to the SM consist of records of 8
consecutive GUIDs. Each record that contains GUIDs requiring retries holds its
next time-to-run based on the retry information of all its GUID entries. The
record having the lowest retry time will run first when that retry time
arrives.

Since the method (SET or DELETE) as sent to the SM applies to all the GUIDs in
the record, we must handle SET requests and DELETE requests in separate SM
messages (one for SETs and the other for DELETEs).

To avoid race conditions where a GUID entry request (set or delete) was
modified after the SM request was sent, we save the method and the requested
indices as part of the callback's context -- thus, only the requested indexes
are evaluated when the response is received.

When a GUID entry is approved we turn off its retry-required bit; this
prevents redundant SM retries from occurring on that record.

The port down event should be sent only when previously it was up. Likewise,
the port up event should be sent only if previously the port was down.

Synchronization was added around the flows that change entries and record state
to prevent race conditions.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:49 -04:00
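The retry scheme described in 99ee4df6aa (per-entry exponential backoff, with each record running at the earliest retry time of its entries) might be sketched like this. All names, the starting delay of one second, and the 60 second cap are assumptions for illustration; the commit message does not state the driver's actual constants or data layout.

```c
#include <stdint.h>

#define DEMO_GUIDS_PER_RECORD 8   /* a record holds 8 consecutive GUIDs */
#define DEMO_MAX_DELAY_SEC 60u    /* assumed cap, not from the source */

struct demo_guid_entry {
    int needs_retry;      /* cleared once the SM approves the entry */
    uint32_t delay_sec;   /* current backoff delay */
    uint64_t next_retry;  /* absolute time of the next attempt */
};

/* On rejection, double the delay (up to the cap) and schedule a retry. */
static void demo_on_reject(struct demo_guid_entry *e, uint64_t now)
{
    e->delay_sec = e->delay_sec ? e->delay_sec * 2 : 1;
    if (e->delay_sec > DEMO_MAX_DELAY_SEC)
        e->delay_sec = DEMO_MAX_DELAY_SEC;
    e->needs_retry = 1;
    e->next_retry = now + e->delay_sec;
}

/* A record's time-to-run is the earliest next_retry among its entries
 * that still need a retry; UINT64_MAX means nothing is pending. */
static uint64_t demo_record_time_to_run(const struct demo_guid_entry *entries)
{
    uint64_t best = UINT64_MAX;
    int i;

    for (i = 0; i < DEMO_GUIDS_PER_RECORD; i++) {
        if (entries[i].needs_retry && entries[i].next_retry < best)
            best = entries[i].next_retry;
    }
    return best;
}
```

A scheduler would then pick the record with the lowest demo_record_time_to_run() value, which mirrors the "lowest retry time runs first" rule in the text.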
David Howells 75c3cfa855 VFS: assorted weird filesystems: d_inode() annotations
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:58 -04:00
Al Viro 4961772560 infinibad: weird APIs switched to ->write_iter()
Things Not To Do When Writing A Driver, part 1001st:
have writev() and write() on the same file doing completely
different things.  As in, "interpret very different sets of
commands".

	We _can_ handle that, but it's a bloody bad idea.
Don't do that in new drivers.  Ever.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:29:42 -04:00
Al Viro 237dae8890 Merge branch 'iocb' into for-davem
trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:01:38 -04:00
Saeed Mahameed 64613d9499 net/mlx5_core: Extend struct mlx5_interface to support multiple protocols
Preparation for ethernet driver.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:43 -04:00
Saeed Mahameed ce0f750932 net/mlx5_core: Modify arm CQ in preparation for upcoming Ethernet driver
Pass consumer index as a parameter to arm CQ

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:43 -04:00
Saeed Mahameed 233d05d28a net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core
Preparation for ethernet driver.
These functions will be used in drivers other than mlx5_ib.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:42 -04:00
Saeed Mahameed 6cf0a15f07 IB/mlx5: Fix Mellanox copyright note
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:42 -04:00
Saeed Mahameed b812b5441e net/mlx5_core: Clear doorbell record inside mlx5_db_alloc()
Do it in one place instead of everywhere the function is invoked.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:41 -04:00
Eli Cohen 7bef7ad24b net/mlx5_core: Coding style fix
Put a line of space before return and next statement.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:40 -04:00
Haggai Abramonvsky c3c6c9c810 net/mlx5_core: Fix call to mlx5_core_qp_modify
Pass 0 in the sqd_event parameter.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:40 -04:00
Ido Shamay a130b59057 net/mlx4: Add SET_PORT opcode modifiers enumeration
The calls to SET_PORT used hard-coded numbers when supplying the command's
opcode modifiers; fix that to use well-defined constants.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Christoph Hellwig e2e40f2c1e fs: move struct kiocb to fs.h
struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-25 20:28:11 -04:00
Majd Dibbiny 61a3855bb7 IB/mlx4: Saturate RoCE port PMA counters in case of overflow
For RoCE ports, we set the u32 PMA values based on u64 HCA counters. In case of
overflow, according to the IB spec, we have to saturate a counter to its
max value; do that.

Fixes: c37791349c ('IB/mlx4: Support PMA counters for IBoE')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 15:17:11 -04:00
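Saturating a 64-bit counter into a 32-bit field, as 61a3855bb7 describes, can be sketched in a few lines. The helper name is hypothetical and the driver's own implementation may differ.

```c
#include <stdint.h>

/* Clamp a 64-bit counter to the largest value a 32-bit field can hold,
 * instead of letting the high bits be silently truncated. */
static uint32_t demo_saturate_u32(uint64_t v)
{
    return v > UINT32_MAX ? UINT32_MAX : (uint32_t)v;
}
```

A plain cast would turn 0x100000000 into 0, which looks like a counter reset; saturation reports the maximum instead.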
Moni Shoua 217e8b16a4 IB/mlx4: Verify net device validity on port change event
Processing an event is done in a different context from the one in which
the event was dispatched. This requires a check that the slave
net device is still valid when the event is being processed. The check is done
under the iboe lock, which ensures correctness.

Fixes: a575009030 ('IB/mlx4: Add port aggregation support')
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 15:17:11 -04:00
Linus Torvalds be5e6616dd Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted stuff from this cycle.  The big ones here are multilayer
  overlayfs from Miklos and beginning of sorting ->d_inode accesses out
  from David"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
  autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
  procfs: fix race between symlink removals and traversals
  debugfs: leave freeing a symlink body until inode eviction
  Documentation/filesystems/Locking: ->get_sb() is long gone
  trylock_super(): replacement for grab_super_passive()
  fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
  SELinux: Use d_is_positive() rather than testing dentry->d_inode
  Smack: Use d_is_positive() rather than testing dentry->d_inode
  TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
  Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
  Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
  VFS: Split DCACHE_FILE_TYPE into regular and special types
  VFS: Add a fallthrough flag for marking virtual dentries
  VFS: Add a whiteout dentry type
  VFS: Introduce inode-getting helpers for layered/unioned fs environments
  Infiniband: Fix potential NULL d_inode dereference
  posix_acl: fix reference leaks in posix_acl_create
  autofs4: Wrong format for printing dentry
  ...
2015-02-22 17:42:14 -08:00
Linus Torvalds b5ccb078c8 InfiniBand/RDMA changes for 3.20 merge window:
  - Re-enable on-demand paging changes with stable ABI
  - Fairly large set of ocrdma HW driver fixes
  - Some qib HW driver fixes
  - Other miscellaneous changes

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA updates from Roland Dreier:
 - Re-enable on-demand paging changes with stable ABI
 - Fairly large set of ocrdma HW driver fixes
 - Some qib HW driver fixes
 - Other miscellaneous changes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (43 commits)
  IB/qib: Add blank line after declaration
  IB/qib: Fix checkpatch warnings
  IB/mlx5: Enable the ODP capability query verb
  IB/core: Add on demand paging caps to ib_uverbs_ex_query_device
  IB/core: Add support for extended query device caps
  RDMA/cxgb4: Don't hang threads forever waiting on WR replies
  RDMA/ocrdma: Fix off by one in ocrdma_query_gid()
  RDMA/ocrdma: Use unsigned for bit index
  RDMA/ocrdma: Help gcc generate better code for ocrdma_srq_toggle_bit
  RDMA/ocrdma: Update the ocrdma module version string
  RDMA/ocrdma: set vlan present bit for user AH
  RDMA/ocrdma: remove reference of ocrdma_dev out of ocrdma_qp structure
  RDMA/ocrdma: Add support for interrupt moderation
  RDMA/ocrdma: Honor return value of ocrdma_resolve_dmac
  RDMA/ocrdma: Allow expansion of the SQ CQEs via buddy CQ expansion of the QP
  RDMA/ocrdma: Discontinue support of RDMA-READ-WITH-INVALIDATE
  RDMA/ocrdma: Host crash on destroying device resources
  RDMA/ocrdma: Report correct state in ibv_query_qp
  RDMA/ocrdma: Debugfs enhancments for ocrdma driver
  RDMA/ocrdma: Report correct count of interrupt vectors while registering ocrdma device
  ...
2015-02-21 12:53:21 -08:00
Roland Dreier 147d1da951 Merge branches 'core', 'cxgb4', 'iser', 'mlx4', 'mlx5', 'ocrdma', 'odp', 'qib' and 'srp' into for-next 2015-02-20 09:04:40 -08:00
Mike Marciniszyn da12c1f685 IB/qib: Add blank line after declaration
Upstream checkpatch now requires this.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-20 09:04:12 -08:00
Mike Marciniszyn a46a2802f7 IB/qib: Fix checkpatch warnings
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-20 09:04:09 -08:00
David Howells a95104fd33 Infiniband: Fix potential NULL d_inode dereference
Code that does this:

	if (!(d_unhashed(tmp) && tmp->d_inode)) {
		...
		simple_unlink(parent->d_inode, tmp);
	}

is broken because:

	!(d_unhashed(tmp) && tmp->d_inode)

is equivalent to:

	!d_unhashed(tmp) || !tmp->d_inode

so it is possible to get into simple_unlink() with tmp->d_inode == NULL.

simple_unlink(), however, assumes tmp->d_inode cannot be NULL.

I think that what was meant is this:

	!d_unhashed(tmp) && tmp->d_inode

and that the logical-not operator or the final close-bracket was misplaced.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Bryan O'Sullivan <bos@pathscale.com>
cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-20 04:56:45 -05:00
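The boolean reasoning in a95104fd33 can be checked mechanically. In this sketch the two int parameters are hypothetical stand-ins for d_unhashed(tmp) and for tmp->d_inode being non-NULL; no dentry structures are involved.

```c
/* Original condition: !(unhashed && has_inode), which by De Morgan's law
 * equals !unhashed || !has_inode, so it is true even without an inode. */
static int demo_cond_original(int unhashed, int has_inode)
{
    return !(unhashed && has_inode);
}

/* Condition the commit message suggests was meant: the dentry is still
 * hashed and it has an inode. */
static int demo_cond_intended(int unhashed, int has_inode)
{
    return !unhashed && has_inode;
}
```

For a hashed dentry without an inode (0, 0), the original condition is true and would reach simple_unlink(), while the intended one is false.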
Haggai Eran 1707cb4ab7 IB/mlx5: Enable the ODP capability query verb
Re-enable the on-demand paging capability query through the
extended query device verb.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-18 08:36:26 -08:00
Hariprasad S 1fc8190dd6 RDMA/cxgb4: Don't hang threads forever waiting on WR replies
In c4iw_wait_for_reply(), if a FW6_MSG WR reply is not received after
C4IW_WR_TO seconds, fail the WR operation and mark the device as fatally
dead.  Further, if the device is marked fatally dead, then fail the WR
wait immediately.

Also change the timeout to 60 seconds.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-18 08:33:15 -08:00
Dan Carpenter 59a39ca3f7 RDMA/ocrdma: Fix off by one in ocrdma_query_gid()
The ->sgid_tbl[] array has OCRDMA_MAX_SGID number of elements so this
test is off by one.  ->sgid_tbl is allocated in ocrdma_alloc_resources().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-18 08:31:06 -08:00
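The off-by-one in 59a39ca3f7 is the classic "greater than" versus "greater than or equal" bounds test. The sketch below assumes an array of DEMO_MAX_SGID elements; the constant value and function names are hypothetical, and the exact form of the driver's check is not quoted in the commit message.

```c
#define DEMO_MAX_SGID 8 /* illustrative size; valid indexes are 0..7 */

/* Buggy test: rejects only index > MAX, so index == MAX slips through
 * and would read one element past the end of the table. */
static int demo_index_ok_buggy(int index)
{
    return !(index > DEMO_MAX_SGID);
}

/* Fixed test: index == MAX is rejected as well. */
static int demo_index_ok_fixed(int index)
{
    return !(index >= DEMO_MAX_SGID);
}
```

Only the boundary value distinguishes the two: index 8 passes the buggy test but fails the fixed one.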
Rasmus Villemoes f3070e7efd RDMA/ocrdma: Use unsigned for bit index
In the expressions idx/32 and idx%32, both idx and 32 have signed
type, and unfortunately the C standard prescribes rounding to 0, so
unless gcc can prove that idx is non-negative, these cannot be
implemented as simple shift and mask operations, respectively. Help gcc by
changing the type of idx to unsigned - this cuts another few
instructions from the generated code.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-18 08:31:06 -08:00
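The rounding issue behind f3070e7efd is easy to see with a negative index. This sketch uses hypothetical helper names; it shows that signed division and remainder truncate toward zero, while for an unsigned index they coincide with a shift by 5 and a mask with 31.

```c
/* Signed forms: C truncates toward zero, so -1 / 32 is 0 and
 * -1 % 32 is -1, which a plain shift or mask would not produce. */
static int demo_word_signed(int idx)
{
    return idx / 32;
}

static int demo_bit_signed(int idx)
{
    return idx % 32;
}

/* Unsigned forms: division by 32 and remainder by 32 are exactly a
 * right shift by 5 and a mask with 31. */
static unsigned int demo_word_unsigned(unsigned int idx)
{
    return idx >> 5;
}

static unsigned int demo_bit_unsigned(unsigned int idx)
{
    return idx & 31u;
}
```

Because the compiler must preserve the signed results for negative inputs, it emits extra fix-up instructions unless the index type is unsigned.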
Rasmus Villemoes ba64fdca63 RDMA/ocrdma: Help gcc generate better code for ocrdma_srq_toggle_bit
gcc emits a surprising amount of code in order to flip a bit. One
would think that a single instruction is enough.

$ scripts/bloat-o-meter /tmp/ocrdma_verbs.o drivers/infiniband/hw/ocrdma/ocrdma_verbs.o
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-142 (-142)
function                                     old     new   delta
ocrdma_post_srq_recv                         498     460     -38
ocrdma_poll_cq                              2010    1962     -48
ocrdma_discard_cqes                          495     439     -56

All three calls of ocrdma_srq_toggle_bit happen within spinlocks, so
saving a few useless instructions might be worthwhile.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-18 08:31:05 -08:00