Commit Graph

1464 Commits

Author SHA1 Message Date
Elad Raz fc1273afb2 mlxsw: Remember untagged VLANs
When a VLAN is being configured, remember the untagged mode of the VLAN.
When displaying the list of configured VLANs, show the untagged attribute.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:42 -05:00
Elad Raz 26a4ea0f45 mlxsw: Disable vlan_filtering for non .1D bridge
When a port is bridged, the bridge must be a VLAN-aware bridge (.1Q)
or the bridging should be on top of VLAN interfaces (.1D bridge).

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:41 -05:00
Elad Raz e4a1305507 mlxsw: Renaming local variable names for consistency
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:41 -05:00
Elad Raz 29edf44f85 mlxsw: Fixing vlans init range
Initialize VLANs 0..4095 (Remove init for VID 4096).
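
A minimal sketch of the corrected bound, assuming a plain boolean table.
VLAN_N_VID is defined locally here and vlans_init() is a hypothetical
helper, not the driver's code. A 12-bit VID gives 4096 values, so the
valid range is 0..4095:

```c
#include <stdbool.h>
#include <stddef.h>

#define VLAN_N_VID 4096 /* number of 12-bit VLAN IDs: 0..4095 */

/* Hypothetical helper: marks every valid VID as initialized and
 * returns how many entries were touched. */
static size_t vlans_init(bool table[VLAN_N_VID])
{
	size_t count = 0;

	/* Strict '<' keeps the loop inside 0..4095; '<=' would also
	 * touch VID 4096, which does not exist. */
	for (unsigned int vid = 0; vid < VLAN_N_VID; vid++) {
		table[vid] = true;
		count++;
	}
	return count;
}
```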

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:41 -05:00
Ido Schimmel f0138e2596 mlxsw: pci: Adjust value of CPU egress traffic class
During initialization, when creating the send descriptor queues (SDQs),
we specify the CPU egress traffic class of each SDQ. The maximum number
of classes of this type is different in the two ASICs supported by this
PCI driver.

New firmware versions check that this value is set correctly, which causes
errors on the Spectrum ASIC, as its max exposed egress traffic class is
lower than 7.

Solve this by setting this field to 3, which is an acceptable value for
both ASICs.

Note that we currently do not expose the QoS capabilities of the ASICs,
so setting this to a hardcoded value is OK for now.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 00:48:50 -05:00
Eran Ben Elisha 3d8c38af14 net/mlx5e: Add PTP Hardware Clock (PHC) support
Add PHC support to the mlx5_en driver. Use reader/writer spinlocks to
protect the timecounter since every packet received needs to call
timecounter_cycle2time() when timestamping is enabled.  This can become
a performance bottleneck with RSS and multiple receive queues if normal
spinlocks are used.

The driver has been tested with both Documentation/ptp/testptp and the
linuxptp project (http://linuxptp.sourceforge.net/) on a Mellanox
ConnectX-4 card.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 14:11:50 -05:00
Eran Ben Elisha ef9814deaf net/mlx5e: Add HW timestamping (TS) support
Add support for enabling/disabling HW timestamping for incoming and/or
outgoing packets. To enable/disable HW timestamping, the appropriate
ioctl should be used. Currently only HWTSTAMP_FILTER_ALL/NONE and
HWTSTAMP_TX_ON/OFF are supported. Make all relevant changes in
RX/TX flows to consider TS request and plant HW timestamps into
relevant structures.

Add internal clock for converting hardware timestamp to nanoseconds. In
addition, add a service task to catch internal clock overflow, to make
sure timestamping is accurate.
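
The conversion and overflow handling described above can be sketched as
follows. This is a simplified illustration, not the driver's code:
struct hw_clock and hw_clock_update() are hypothetical names, and the
mult/shift scaling is an assumed fixed-point scheme. The masked
subtraction tolerates a single counter wrap, which is why a periodic
task has to run the update at least once per wrap period.

```c
#include <stdint.h>

/* Hypothetical, simplified model of a free-running hardware counter. */
struct hw_clock {
	uint64_t mask;       /* valid counter bits, e.g. 0xFF for 8 bits */
	uint32_t mult;       /* cycles -> ns multiplier */
	uint32_t shift;      /* cycles -> ns right shift */
	uint64_t cycle_last; /* counter value at the last update */
	uint64_t nsec;       /* nanoseconds accumulated up to cycle_last */
};

/* Fold the cycles elapsed since the last update into the ns total.
 * The fractional remainder is dropped, and delta * mult is assumed
 * not to overflow 64 bits. */
static uint64_t hw_clock_update(struct hw_clock *clk, uint64_t cycle_now)
{
	/* Masked subtraction stays correct across one counter wrap. */
	uint64_t delta = (cycle_now - clk->cycle_last) & clk->mask;

	clk->nsec += (delta * clk->mult) >> clk->shift;
	clk->cycle_last = cycle_now;
	return clk->nsec;
}
```

With an 8-bit counter, a reading of 5 after a previous reading of 250
yields a delta of 11 cycles, even though the raw value went backwards.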

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 14:11:50 -05:00
Eran Ben Elisha b084444459 net/mlx5_core: Introduce access function to read internal timer
A preparation step which adds support for reading the hardware
internal timer and the hardware timestamping from the CQE.
In addition, advertise device_frequency_khz HCA capability.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 14:11:50 -05:00
Achiad Shochat 34802a42b3 net/mlx5e: Do not modify the TX SKB
If the SKB is cloned, or has an elevated users count, someone else
can be looking at it at the same time.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 14:11:50 -05:00
Ido Schimmel 6c72a3d0d3 mlxsw: spectrum: Change bridge port attributes only when bridged
Bridge port attributes are offloaded to hardware when invoked with the SELF
flag set, but it really makes no sense to reflect them when the port is not
bridged.

Allow a user to change these attributes only when the port is bridged and
initialize them correctly when joining or leaving a bridge.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:07:58 -05:00
Ido Schimmel 5a8f45258e mlxsw: spectrum: Set bridge status in appropriate functions
Set the bridge status of physical ports in the appropriate functions, to
be consistent with LAG join/leave and vPorts joining/leaving bridge.

Also, remove the error messages in these two functions, as we already
emit errors in the functions they call.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:07:58 -05:00
Ido Schimmel 78124078c4 mlxsw: spectrum: Return NOTIFY_BAD on bridge failure
It is possible for us to fail when joining or leaving a bridge, so let
the user know about that by returning NOTIFY_BAD, as already done for
LAG join/leave and 802.1D bridges.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:07:58 -05:00
Ido Schimmel 7b31abe70b mlxsw: spectrum: Initialize PVID only once
We set PVID to 1 in mlxsw_sp_port_vlan_init(), so we can remove this
statement.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:07:58 -05:00
David S. Miller c07f30ad68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-31 18:20:10 -05:00
Eran Ben Elisha f91e6d8941 net/mlx5_core: Add setting ATOMIC endian mode
HW is capable of 2 requestor endianness modes for standard 8 Bytes
atomic: BE (0x0) and host endianness (0x1). Read the supported modes
from hca atomic capabilities and configure HW to host endianness mode if
supported.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:31 -05:00
Achiad Shochat 9efa752545 net/mlx5_core: Introduce access functions to query vport RoCE fields
Introduce access functions to query NIC vport system_image_guid,
node_guid and qkey_viol_cntr.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23 12:07:37 -05:00
Achiad Shochat 0de60af649 net/mlx5_core: Introduce access functions to enable/disable RoCE
An mlx5 Ethernet port must be explicitly enabled for RoCE.
When RoCE is not enabled on the port, the NIC will refuse to create
QPs attached to it and incoming RoCE packets will be considered by the
NIC as plain Ethernet packets.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23 12:07:36 -05:00
Achiad Shochat e5f6175c5b net/mlx5_core: Break down the vport mac address query function
Introduce a new function called mlx5_query_nic_vport_context().
This function gets all the NIC vport attributes from the device.

The MAC address is just one of the NIC vport attributes, so
mlx5_query_nic_vport_mac_address() is now just a wrapper function
above mlx5_query_nic_vport_context().

More NIC vport attributes will be used in following commits.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23 12:07:36 -05:00
Jiri Pirko f4cee3af0d mlxsw: core: Use devm_kzalloc to allocate mlxsw_hwmon structure
KASan reported use-after-free for the hwmon structure. So fix this by
using devm_kzalloc and let the core take care of freeing the memory
during device detach.

Reported-by: Ido Schimmel <idosch@mellanox.com>
Fixes: 89309da39 ("mlxsw: core: Implement temperature hwmon interface")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-22 16:25:09 -05:00
Jiri Pirko e7bc73cbb5 mlxsw: core: Allow to reset temperature history via hwmon interface
Add another sysfs hwmon attribute that exposes the ability to reset
the temperature sensor history.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-22 16:00:04 -05:00
Eugenia Emantayev 90683061dd net/mlx4_en: Fix HW timestamp init issue upon system startup
mlx4_en_init_timestamp was called before creation of netdev and port
init, thus used uninitialized values.  Specifically - NIC frequency was
incorrect causing wrong calculations and later wrong HW timestamps.

Fixes: 1ec4864b10 ('net/mlx4_en: Fixed crash when port type is changed')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-18 14:48:04 -05:00
Eugenia Emantayev fc9f5ea9b4 net/mlx4_en: Remove dependency between timestamping capability and service_task
Service task is responsible for other tasks in addition to timestamping
overflow check. Launch it even if timestamping is not supported by device.

Fixes: 07841f9d94 ('net/mlx4_en: Schedule napi when RX buffers allocation fails')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-18 14:48:03 -05:00
David S. Miller b3e0d3d7ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/geneve.c

Here we had an overlapping change, where in 'net' the extraneous stats
bump was being removed whilst in 'net-next' the final argument to
udp_tunnel6_xmit_skb() was being changed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 22:08:28 -05:00
Linus Torvalds 73796d8bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
    people reported this...  From Arnd Bergmann.

 2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.

 3) Fix spurious EBUSY in rhashtable, from Herbert Xu.

 4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.

 5) Fix race with work structure access in pppoe driver causing
    corruptions, from Guillaume Nault.

 6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
    actually succeeded or not, from Sergei Shtylyov.

 7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from
    Bjørn Mork.

 8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.

 9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
    Leitner.

10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.

11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
    properly as well, from Jiri Benc.

12) Handle request sockets properly in xfrm layer, from Eric Dumazet.

13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
    Shelar.

14) sk->sk_policy[] needs RCU protection, and as a result
    xfrm_policy_destroy() needs to free policies using an RCU grace
    period, from Eric Dumazet.

15) SCTP needs to clone ipv6 tx options in order to avoid use after
    free, from Eric Dumazet.

16) Missing kbuild export of ila.h, from Stephen Hemminger.

17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
    Tobias Klauser.

18) Validate protocol value range in ->create() methods, from Hannes
    Frederic Sowa.

19) Fix early socket demux races that result in illegal dst reuse, from
    Eric Dumazet.

20) Validate socket address length in pptp code, from WANG Cong.

21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
    packets, from Vlad Yasevich.

22) Fix memory leaks in nl80211 registry code, from Ola Olsson.

23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and
    qlcnic.  From Dan Carpenter.

24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
    example, AF_ALG will interpret it as an async call.  From Tadeusz
    Struk.

25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
    Eric Dumazet.

26) rhashtable does not enforce the minimum table size early enough,
    breaking how we calculate the per-cpu lock allocations.  From
    Herbert Xu.

27) Fix FCC port lockup in 82xx driver, from Martin Roth.

28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.

29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
    sock_setsockopt() wrt.  timestamp handling.  From WANG Cong.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
  net: check both type and procotol for tcp sockets
  drivers: net: xgene: fix Tx flow control
  tcp: restore fastopen with no data in SYN packet
  af_unix: Revert 'lock_interruptible' in stream receive code
  fou: clean up socket with kfree_rcu
  82xx: FCC: Fixing a bug causing to FCC port lock-up
  gianfar: Don't enable RX Filer if not supported
  net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
  rhashtable: Fix walker list corruption
  rhashtable: Enforce minimum size on initial hash table
  inet: tcp: fix inetpeer_set_addr_v4()
  ipv6: automatically enable stable privacy mode if stable_secret set
  net: fix uninitialized variable issue
  bluetooth: Validate socket address length in sco_sock_bind().
  net_sched: make qdisc_tree_decrease_qlen() work for non mq
  ser_gigaset: remove unnecessary kfree() calls from release method
  ser_gigaset: fix deallocation of platform device structure
  ser_gigaset: turn nonsense checks into WARN_ON
  ser_gigaset: fix up NULL checks
  qlcnic: fix a timeout loop
  ...
2015-12-17 14:05:22 -08:00
Ido Schimmel 272c447017 mlxsw: spectrum: Add support for VLAN devices on top of LAG
When creating a VLAN device on top of LAG, we are basically creating a
vPort on top of each of the member port netdevs in the LAG. Therefore,
these vPorts should inherit both the LAG status and LAG ID from the
underlying port netdevs.

In addition, when the VLAN device joins or leaves a bridge each of the
underlying vPorts should know about it and act accordingly. This is
achieved by propagating the VLAN event down to the lower devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:24 -05:00
Ido Schimmel 64771e31ce mlxsw: spectrum: Enable FDB records for VLAN devices on top of LAG
When adding or removing FDB records of VLAN devices on top of LAG we
should set the lag_vid parameter to the VLAN ID of the VLAN device. It
is reserved otherwise.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:24 -05:00
Ido Schimmel afd7f979b2 mlxsw: reg: Add lag_vid field to SFD register
Unicast LAG records in the Switch Filtering Database (SFD) register have
a lag_vid field indicating the VLAN ID in case of vFIDs. This field is
no longer reserved since we are going to add support for VLAN devices on
top of LAG.

Add the lag_vid field to be used by VLAN devices on top of LAG.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:24 -05:00
Ido Schimmel 26f0e7fb15 mlxsw: spectrum: Add support for VLAN devices bridging
All the member VLAN devices in a bridge need to share the same vFID.

To achieve that, expand the vFID struct to include the associated bridge
device (or lack thereof) and allow one to look up a vFID based on a bridge
device.

When joining a bridge, look up the relevant vFID or create one if none
exists. Next, make the VLAN device use the vFID.

Leaving a bridge can either occur because a user removed the VLAN device
from a bridge or because the VLAN device was deleted by the user. In the
latter case the bridge's teardown sequence is invoked after the hardware
vPort is already gone. Therefore, when unlinking the VLAN device from
the real device, check if the associated vPort is bridged and act
accordingly. The bridge's notification will be ignored in this case.

Note that bridging a VLAN interface with an ordinary port netdev is
currently not supported, but not forbidden. This will be addressed in a
follow-up patchset.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:23 -05:00
Ido Schimmel 9589a7b5d7 mlxsw: spectrum: Handle VLAN devices linking / unlinking
When a VLAN interface is configured on top of a physical port we should
associate the VLAN device with the matching vPort. Likewise, when it's
removed, we should revert back to the underlying port netdev.

While not a must, this is consistent with port netdevs and also provides
a more accurate error printing via netdev_err() and friends.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:23 -05:00
Ido Schimmel aac78a4408 mlxsw: spectrum: Adjust FDB notifications for VLAN devices
FDB notifications contain the FID and port (or LAG ID) on which the MAC
was learned. In the case of the 802.1Q bridge one can easily derive the
matching VID - as FID equals VID - and generate the appropriate
notification for the software bridge. With VLAN devices this is no
longer the case, as these are associated with a vFID.

Solve that by converting the FID to a vFID and looking up the matching VLAN
device. From that derive the VID and whether learning (and learning
sync) should occur.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:22 -05:00
Ido Schimmel 54a732018d mlxsw: spectrum: Adjust switchdev ops for VLAN devices
switchdev ops can now be called for VLAN devices and we need to be
prepared for it. Until now they were only called for the port netdev.

Use the newly propagated orig_dev passed as part of the switchdev
attr/obj and determine whether the original device is a VLAN device. If
so, act accordingly, otherwise continue as usual.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:22 -05:00
Ido Schimmel 9de6a80e06 mlxsw: spectrum: Use FID instead of VID when accessing FDB
In the Spectrum ASIC - unlike SwitchX-2 - FDB access is done by
specifying FID as parameter and not VID.

Change the relevant variable and parameter names to reflect that.

Note that this was OK up until now, since FID was always equal to VID,
but with the introduction of VLAN interfaces this is no longer the case.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:22 -05:00
Ido Schimmel 19ae612414 mlxsw: spectrum: Add another flood table for vFIDs
We previously used only one flood table for packets classified to vFIDs.
However, since we are going to add support for bridges between VLAN
interfaces (mapped to vFIDs) we need to add one more flood table.

That way we can separate the flooding domain of unknown unicast traffic
from all the rest and support flood control (as we do with the 802.1Q
bridge).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:21 -05:00
Ido Schimmel c06a94ef61 mlxsw: spectrum: Use appropriate parameter name
The __mlxsw_sp_port_flood_set function is now used to configure flooding
for both FIDs and vFIDs, so change the parameter name to 'idx' instead
of 'fid'. This is also consistent with hardware documentation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:21 -05:00
Ido Schimmel 7f71eb46a4 mlxsw: spectrum: Split vFID range in two
Up until now we used a 1:1 mapping - based on VID - to map a VLAN
interface to a vFID. However, a different scheme is needed in order to
support bridges between VLAN interfaces, as all the member interfaces -
which can have different VIDs - need to share the same vFID.

Solve that by splitting the vFID range in two:
 1. Non-bridged VLAN interfaces
 2. Bridged VLAN interfaces

When a VLAN interface is created, assign it the next available vFID in
the first range, unless one already exists for that VID or the number of
vFIDs in the range was exceeded. When an interface is removed, free the
vFID, unless other interfaces are mapped to it.

To accomplish the above:
 1. Store the VID to vFID mapping in a new struct (mlxsw_sp_vfid), which
    has a global context and holds a reference count.
 2. Create a vPort (dummy in case of bridge SELF invocation) on top of
    the physical port and hold a reference to the associated vFID.

	     vfid                    vfid
	+-------------+	        +-------------+
	| vfid        |         | vfid        |
	| vid         +---> ... | vid         |
	| nr_vports   |         | nr_vports   |
	+------+------+         +------+------+
				       |
	       +-----------------------+-------+
	       |			       |
	     vport			     vport
	+-------------+         	+-------------+
	| ...	      |         	| ...	      |
	| *vfid	      +---> ... 	| *vfid	      +---> ...
	| ...	      |         	| ...	      |
	+------+------+         	+------+------+
	       |                               |
	     port			     port
	+-------------+         	+-------------+
	| ...         |         	| ...         |
	| vports_list |         	| vports_list |
	| ...         |         	| ...         |
	+-------------+         	+-------------+
	     swXpY			     swXpZ

Next patches in the series will add the missing infrastructure for the
second range and transfer vPorts between the two ranges according to the
received notifications.
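
The allocation scheme for the first range can be sketched as below. This
is an illustration under stated assumptions, not the driver's code:
VFID_MAX, struct vfid_entry, vfid_get() and vfid_put() are hypothetical
names, the table is a fixed array, and locking is omitted. A VID that
already has a vFID reuses it and bumps the reference count; otherwise
the lowest free slot is taken, and the slot is released only when the
last reference is dropped.

```c
#include <stdbool.h>
#include <stdint.h>

#define VFID_MAX 8 /* hypothetical size of the non-bridged vFID range */

struct vfid_entry {
	bool in_use;
	uint16_t vid;           /* VLAN ID mapped to this vFID */
	unsigned int nr_vports; /* reference count */
};

static struct vfid_entry vfids[VFID_MAX];

/* Return the vFID for a VID and take a reference on it.
 * Returns -1 when the range is exhausted. */
static int vfid_get(uint16_t vid)
{
	int free_idx = -1;

	for (int i = 0; i < VFID_MAX; i++) {
		if (vfids[i].in_use && vfids[i].vid == vid) {
			vfids[i].nr_vports++; /* VID already mapped */
			return i;
		}
		if (!vfids[i].in_use && free_idx < 0)
			free_idx = i; /* remember the lowest free slot */
	}
	if (free_idx < 0)
		return -1;

	vfids[free_idx].in_use = true;
	vfids[free_idx].vid = vid;
	vfids[free_idx].nr_vports = 1;
	return free_idx;
}

/* Drop a reference; the vFID is freed when the last user goes away. */
static void vfid_put(int vfid)
{
	if (--vfids[vfid].nr_vports == 0)
		vfids[vfid].in_use = false;
}
```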

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:21 -05:00
Ido Schimmel bd40e9d6d5 mlxsw: spectrum: Allocate active VLANs only for port netdevs
When adding support for bridges between VLAN interfaces, we'll introduce
a new entity called a vPort, which is a representation of the VLAN
interface in the hardware.

The main difference between a vPort and a physical port is that several
FIDs can be bound to the latter, whereas only one (called a vFID) can be
bound to the former.

Therefore, it makes sense to use the same struct to represent the two,
but to only allocate the 'active_vlans' bitmap in case of a physical
port.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:21 -05:00
Andrzej Hajda 2b2b31c845 net/mlx4_core: fix handling return value of mlx4_slave_convert_port
The function can return negative values, so its result should
be assigned to a signed variable.
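
A small sketch of why this matters, using hypothetical stand-ins rather
than the mlx4 functions: convert_port() returns either a port number or
a negative errno, and the two callers differ only in the type of the
variable that receives the result. Once the error is stored in an
unsigned byte it becomes a positive number, so a later "< 0" check can
never fire.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in: returns the port, or a negative errno. */
static int convert_port(int port)
{
	return (port < 1 || port > 2) ? -EINVAL : port;
}

/* Buggy pattern: the sign is lost, so an error looks like a port. */
static int store_unsigned(int port)
{
	uint8_t converted = (uint8_t)convert_port(port);

	return converted;
}

/* Fixed pattern: a signed variable keeps the error visible. */
static int store_signed(int port)
{
	int converted = convert_port(port);

	return converted;
}
```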

The problem has been detected using the proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107

Fixes: fc48866f7 ('net/mlx4: Adapt code for N-Port VF')
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:54:43 -05:00
Jiri Pirko b626f2cb75 mlxsw: core: Fix temperature sensor index during initialization
Sensor index should be passed instead of 0. For now, this does not make
a difference, since there is so far only one temperature sensor
exposed by HW.

Fixes: 89309da39 ("mlxsw: core: Implement temperature hwmon interface")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:45:37 -05:00
Jiri Pirko acf35a4ec6 mlxsw: reg: Fix max temperature getting
Fix copy & paste error in MTPM unpack helper.

Fixes: 85926f8770 ("mlxsw: reg: Add definition of temperature management registers")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:45:37 -05:00
Maor Gottlieb 7cb21b794b net/mlx5e: Rename en_flow_table.c to en_fs.c
Rename en_flow_table.c to en_fs.c in order to be aligned
with the new flow steering files.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:24 -05:00
Maor Gottlieb 86d722ad2c net/mlx5: Use flow steering infrastructure for mlx5_en
Expose the new flow steering API and remove the old
one.

A few changes are required:

1. The Ethernet flow steering follows the existing implementation, but uses
the new steering API. The old flow steering implementation is removed.

2. Move the E-switch FDB management to use the new API.

3. When the driver is loaded, call mlx5_init_fs, which initializes
the flow steering tree structure and opens namespaces for NIC receive
and for E-switch FDB.

4. Call mlx5_cleanup_fs when the driver is unloaded.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:24 -05:00
Maor Gottlieb 2530236303 net/mlx5_core: Flow steering tree initialization
Flow steering initialization is based on a static tree that
describes the flow steering tree when the driver is loaded. The
initialization considers the max supported flow table level of the device;
a minimum of 2 kernel flow tables (VLAN and MAC) is required to have
kernel flow table functionality.

The tree structure when the driver is loaded:

		root_namespace(receive nic)
			  |
		priority-0 (kernel priority)
			  |
		namespace(kernel namespace)
			  |
		priority-0 (flow tables priority)

In the following patches, when the EN driver uses the flow steering
API, it creates two flow tables and their flow groups under
priority-0 (flow tables priority).

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:24 -05:00
Maor Gottlieb 0c56b97503 net/mlx5_core: Introduce flow steering API
Introducing the following objects:

mlx5_flow_root_namespace: represents the root of a specific flow table
type tree (e.g. NIC receive, FDB, etc.)

mlx5_flow_group: defines the mask of the flow specification.

fs_fte (flow steering flow table entry): defines the value of the
flow specification.

The following describes the relationships between the tree objects:
root_namespace --> priorities -->namespaces -->
priorities -->flow-tables --> flow-groups -->
flow-entries --> destinations

When we create a new object (flow table/flow group/flow table entry), we
call the FW command and then add the related SW object to the tree.

When we destroy an object, e.g. by calling mlx5_destroy_flow_table, we use
the tree node destructor to destroy the FW object and remove the
node from the tree.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:24 -05:00
Maor Gottlieb 5e1626c09c net/mlx5_core: Add flow steering lookup algorithms
Introduce the flow steering mlx5_flow_namespace (Namespace)
and fs_prio (Flow Steering Priority) tree nodes.

Namespaces are used in order to isolate different usages or types
of steering (for example, downstream patches will add different
namespaces for the NIC driver and for E-Switch FDB usages).

Flow Steering Priorities are objects that describe priority
ranges between different flow objects under the same namespace.

For example, entries in priority i are matched before entries
in priority i+1.

This patch adds the following algorithms:

1) Calculate level:
Each flow table has a level (the priority between the flow tables).
When we initialize the flow steering tree, we assign a range of levels
to each priority; therefore the level of a new flow table is
its location within the priority, relative to the range of the priority.

2) Match between match criteria. This function is used
for searching for a flow group when a new flow rule is added.

3) Match between match values. This function is used
for searching for a flow table entry when a new flow rule is added.

4) Add essential macros for traversing a node's children,
e.g. traversing all the flow tables of some priority.
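
One plausible reading of the value-matching step in item 3 is a masked
comparison: two entries match when they agree on every bit that the
flow group's mask selects. The sketch below illustrates that idea only;
masked_equal() is a hypothetical name and the byte-array layout is an
assumption, not the driver's data format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when two buffers agree on every bit that the mask selects.
 * Bits outside the mask are ignored. */
static bool masked_equal(const uint8_t *mask, const uint8_t *a,
			 const uint8_t *b, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		/* XOR exposes differing bits; AND keeps only masked ones. */
		if ((a[i] ^ b[i]) & mask[i])
			return false;
	}
	return true;
}
```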

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:24 -05:00
Maor Gottlieb de8575e014 net/mlx5_core: Add flow steering base data structures
Introduce the base data structure, and its operations, that will
represent ConnectX-4 Flow Steering. This data structure is basically
a tree, and all flow steering objects (Flow Table/Flow Group/FTE/etc.)
are represented as fs_node(s).

fs_node is the base object which describes a basic tree node, with the
following extra info:
    type: describes the runtime type of the node (Object).
    lock: lock this node sub-tree.
    ref_count: number of children + current references.
    remove_func: a generic destructor.

fs_node types will be used and explained once the usage is added in the
following patches.
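
A rough sketch of such a node, to illustrate how a reference count that
covers both children and current users can drive a generic destructor.
The names (struct tree_node, tree_node_init(), tree_node_put(),
count_removal()) are hypothetical, the lock field is left out, and the
real fs_node may differ in every detail.

```c
#include <stddef.h>

struct tree_node {
	int type;                 /* runtime type of the object */
	struct tree_node *parent;
	unsigned int ref_count;   /* children + current references */
	void (*remove_func)(struct tree_node *node); /* destructor */
};

/* Demo destructor: only counts how many nodes were removed. */
static int removed_count;

static void count_removal(struct tree_node *node)
{
	(void)node;
	removed_count++;
}

/* A new node starts with one reference and pins its parent. */
static void tree_node_init(struct tree_node *node, int type,
			   struct tree_node *parent,
			   void (*remove_func)(struct tree_node *))
{
	node->type = type;
	node->parent = parent;
	node->ref_count = 1;
	node->remove_func = remove_func;
	if (parent)
		parent->ref_count++;
}

/* Drop a reference. A node that reaches zero is destroyed and
 * releases the reference it held on its parent. */
static void tree_node_put(struct tree_node *node)
{
	while (node && --node->ref_count == 0) {
		struct tree_node *parent = node->parent;

		if (node->remove_func)
			node->remove_func(node);
		node = parent;
	}
}
```

In this sketch a parent cannot be destroyed while a child exists,
because each child holds one of the parent's references.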

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:23 -05:00
Maor Gottlieb 26a8145390 net/mlx5_core: Introduce flow steering firmware commands
Introduce new Flow Steering (FS) firmware commands,
in order to support the new flow steering infrastructure.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:23 -05:00
Saeed Mahameed 108805fc19 net/mlx5e: Assign random MAC address if needed
Under SRIOV there might be a case where VFs are loaded
without pre-assigned MAC address. In this case, the VF
will randomize its own MAC.  This will address the case
of administrator not assigning MAC to the VF through
the PF OS APIs and keep udev happy.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:23 -05:00
Saeed Mahameed 9bd0a185c2 net/mlx5: Fix query E-Switch capabilities
E-Switch capabilities should be queried only if the E-Switch flow table
is supported, and not merely when the function is the vport group manager.

Fixes: d6666753c6 ("net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:15:23 -05:00
Dan Carpenter 515123e286 mlxsw: core: remove an unneeded condition
We already know "err" is zero so there is no need to check.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 20:10:55 -05:00
Dan Carpenter 82a06429ae mlxsw: spectrum: fix some error handling
The "err = " assignment is missing here.

Fixes: 0d65fc1304 ('mlxsw: spectrum: Implement LAG port join/leave')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 20:10:55 -05:00
Wengang Wang 73d4da7b9f IB/mlx4: Use correct order of variables in log message
There is a mis-order in mlx4 log. Fix it.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08 16:45:51 -05:00
Moni Shoua e57968a10b net/mlx4_core: Support the HA mode for SRIOV VFs too
When the mlx4 driver runs in HA mode, and all VFs are single ported
ones, we make their single port Highly-Available.

This is done by taking advantage of the HA mode properties (following
bonding changes with programming the port V2P map, etc) and adding
the missing parts which are unique to SRIOV such as mirroring VF
steering rules on both ports.

Due to limits on the MAC and VLAN tables, this mode is enabled only when
the total number of VFs is under 64.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:40:46 -05:00
Moni Shoua 5f61385d2e net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode
Due to HW limitations, indexes to MAC and VLAN tables are always taken
from the table of the actual port. So, if a resource holds an index to
a table, it may refer to different values during the lifetime of the
resource, unless the tables are mirrored. Also, even when the
driver is not in HA mode, the index allocation policy for these
tables tries to ensure, as far as possible, that mirroring will succeed
when the time comes. This means that in multifunction
mode the allocation of a free index in a port's table tries to make sure
that the same index in the other's port table is also free.
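
A rough sketch of such an allocation policy follows. The table size, the port_table type and the function name are hypothetical, and the fallback to an index that is free only in the port's own table is an assumption about what "tries to make sure" means, not the driver's actual logic.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_SIZE 8	/* hypothetical table size */

struct port_table {
	bool used[TABLE_SIZE];
};

/*
 * Prefer an index that is free in both ports' tables so that a later
 * mirroring step can succeed. If none exists, fall back to the first
 * index free in the port's own table. Returns -1 when the table is full.
 */
static int alloc_mirror_friendly_index(struct port_table *own,
				       const struct port_table *other)
{
	int fallback = -1;

	assert(own && other);
	for (int i = 0; i < TABLE_SIZE; i++) {
		if (own->used[i])
			continue;
		if (!other->used[i]) {
			own->used[i] = true;
			return i;
		}
		if (fallback < 0)
			fallback = i;
	}
	if (fallback >= 0)
		own->used[fallback] = true;
	return fallback;
}
```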

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:40:45 -05:00
Moni Shoua 78efed2751 net/mlx4_core: Support mirroring VF DMFS rules on both ports
Under HA mode, steering rules set by VFs should be mirrored on both
ports of the device so packets will be accepted no matter on which
port they arrived.

Since getting into HA mode is done dynamically when the user bonds mlx4
Ethernet netdevs, we keep hold of the VF DMFS rule mbox with the port
value flipped (1->2,2->1) and execute the mirroring when getting into
HA mode. Later, when going out of HA mode, we unset the mirrored rules.
In that context note that mirrored rules cannot be removed explicitly.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:40:44 -05:00
Moni Shoua 8d80d04a52 net/mlx4_core: Use both physical ports to dispatch link state events to VF
Under HA mode, the link down event should be sent to VFs only if both
ports are down.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:40:44 -05:00
Or Gerlitz e34305c85f net/mlx4_core: Use both physical ports to set the VF link state
In HA mode, the link state for VFs for which the policy is "auto"
(i.e. follow the physical link state) should be ORed from both ports.
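
As a small illustration of the rule above, assuming a hypothetical three-valued policy enum and helper name (neither is taken from the driver):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy values; only "auto" is named in the commit message. */
enum vf_link_policy { VF_LINK_AUTO, VF_LINK_ENABLE, VF_LINK_DISABLE };

/* Link state reported to a VF while the device is in HA mode. */
static bool vf_link_up_ha(enum vf_link_policy policy, bool port1_up, bool port2_up)
{
	switch (policy) {
	case VF_LINK_ENABLE:
		return true;
	case VF_LINK_DISABLE:
		return false;
	case VF_LINK_AUTO:
		/* Follow the physical link: up if either port is up. */
		return port1_up || port2_up;
	}
	assert(!"unknown policy");
	return false;
}
```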

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:40:44 -05:00
Jiri Pirko 6b20da4d8f mlxsw: core: Change BUG to WARN in hwmon code
Better to just warn the user that something really odd is going on and
continue to run.

Suggested-by: Or Gerlitz <gerlitz.or@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:26:39 -05:00
Saeed Mahameed 66e49dedad net/mlx5e: Add support for SR-IOV ndos
Implement and enable SR-IOV ndos to manage SR-IOV configuration via
netdev netlink API.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:47 -05:00
Saeed Mahameed 3b751a2a41 net/mlx5: E-Switch, Introduce get vf statistics
Add support to get VF statistics using query vport
counter command.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:47 -05:00
Saeed Mahameed 9e7ea3524a net/mlx5: E-Switch, Introduce set vport vlan (VST mode)
Add query and modify functions to control client vlan and qos
stripping or insertion, in E-Switch vports contexts.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:47 -05:00
Saeed Mahameed d6666753c6 net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context
E-Switch vport context is unlike NIC vport context, managed by the
E-Switch manager or vport_group_manager and not by the NIC(VF) driver.

The E-Switch manager can access (read/modify) any of its vports
E-Switch context.

Currently E-Switch vport context includes only client and server
vlan insertion and stripping data (for later support of VST mode).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed 77256579c6 net/mlx5: E-Switch, Introduce Vport administration functions
Implement set VF mac/link state and query VF config
to be used later in netdev VF ndos or any other management API.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed 81848731ff net/mlx5: E-Switch, Add SR-IOV (FDB) support
Enabling E-Switch SRIOV for nvfs+1 vports.

Create E-Switch FDB for L2 UC/MC mac steering between VFs/PF and
external vport (Uplink).

FDB contains forwarding rules such as:
	UC MAC0 -> vport0(PF).
	UC MAC1 -> vport1.
	UC MAC2 -> vport2.
	MC MACX -> vport0, vport2, Uplink.
	MC MACY -> vport1, Uplink.

For unmatched traffic FDB has the following default rules:
	Unmatched Traffic (src vport != Uplink) -> Uplink.
	Unmatched Traffic (src vport == Uplink) -> vport0(PF).

FDB rules population:
Each NIC vport (VF) will notify E-Switch manager of its UC/MC vport
context changes via modify vport context command, which will be
translated to an event that will be handled by E-Switch manager (PF)
which will update FDB table accordingly.
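
The unicast rules and the two default rules can be modelled with a short lookup sketch. It covers only the UC case (multicast fan-out is left out), and the rule structure, the UPLINK_VPORT value and the function name are assumptions for illustration rather than the driver's data layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PF_VPORT     0		/* vport0 is the PF, as in the rules above */
#define UPLINK_VPORT 0xffff	/* hypothetical id for the uplink vport */

struct fdb_uc_rule {
	uint8_t mac[6];
	uint16_t vport;
};

/*
 * Return the destination vport for a unicast frame. A matching MAC wins;
 * otherwise apply the defaults: traffic from a local vport goes to the
 * uplink, and traffic from the uplink goes to the PF.
 */
static uint16_t fdb_lookup_uc(const struct fdb_uc_rule *rules, size_t n,
			      const uint8_t dmac[6], uint16_t src_vport)
{
	assert(rules != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (memcmp(rules[i].mac, dmac, 6) == 0)
			return rules[i].vport;
	}
	return src_vport != UPLINK_VPORT ? UPLINK_VPORT : PF_VPORT;
}
```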

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed 495716b191 net/mlx5: E-Switch, Introduce FDB hardware capabilities
Define needed hardware structures and capabilities needed
for E-Switch FDB flow tables and read them on driver load.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed 073bb189a4 net/mlx5: Introducing E-Switch and l2 table
E-Switch is the software entity that represents and manages ConnectX4
inter-HCA ethernet l2 switching.

E-Switch has its own Virtual Ports, each Vport/vNIC/VF can be
connected to the device through a vport of an e-switch.

Each e-switch is managed by one vNIC identified by
HCA_CAP.vport_group_manager (usually it is the PF/vport[0]),
and its main responsibility is to forward each packet to the
right vport.

e-Switch needs to manage its own l2-table and FDB tables.

L2 table is a flow table that is managed by FW, it is needed for
Multi-host (Multi PF) configuration for inter HCA switching between
PFs.

FDB table is a flow table that is totally managed by e-Switch driver,
its main responsibility is to switch packets between e-Switch internal
vports and uplink vport that belong to the same.

This patch introduces only e-Switch l2 table management, FDB management
will come later when ethernet SRIOV/VFs will be enabled.

preparation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed aad9e6e41e net/mlx5e: Write vlan list into vport context
Each Vport/vNIC must notify underlying e-Switch layer
for vlan table changes in-order to update SR-IOV FDB tables.

We do that at vlan_rx_add_vid and vlan_rx_kill_vid ndos.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:45 -05:00
Saeed Mahameed 5e55da1d5a net/mlx5e: Write UC/MC list and promisc mode into vport context
Each Vport/vNIC must notify underlying e-Switch layer
for UC/MC list and promisc mode updates, in-order to update
l2 tables and SR-IOV FDB tables.

We do that at set_rx_mode ndo.

preparation for ethernet-SRIOV and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:45 -05:00
Saeed Mahameed c0046cf7b8 net/mlx5: Introduce access functions to modify/query vport vlans
Those functions are needed to notify the upcoming L2 table and SR-IOV
E-Switch(FDB) manager(PF), of the NIC vport (vf) vlan table changes.

preparation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:45 -05:00
Saeed Mahameed d82b73186d net/mlx5: Introduce access functions to modify/query vport promisc mode
Those functions are needed to notify the upcoming SR-IOV
E-Switch(FDB) manager(PF), of the NIC vport (vf) promisc mode changes.

Preparation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:45 -05:00
Saeed Mahameed e75465148b net/mlx5: Introduce access functions to modify/query vport state
In preparation for SR-IOV we add here an API to enable each e-switch
manager (PF) to configure its VFs link states in e-switch

preparation for ethernet sriov.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Saeed Mahameed e16aea2744 net/mlx5: Introduce access functions to modify/query vport mac lists
Those functions are needed to notify the upcoming L2 table and SR-IOV
E-Switch(FDB) manager(PF), of the NIC vport (vf) UC/MC mac lists
changes.

preparation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Saeed Mahameed e1d7d349c6 net/mlx5: Update access functions to Query/Modify vport MAC address
In preparation for SR-IOV we add here an API to enable each e-switch
client (PF/VF) to configure its L2 MAC addresses and for the e-switch
manager (usually the PF) to access them in order to be able to
configure them into the e-switch.
Therefore we now pass vport num parameter to
mlx5_query_nic_vport_context, so PF can access other vports contexts.

preparation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Eli Cohen fc50db98ff net/mlx5_core: Add base sriov support
This patch adds SRIOV base support for mlx5 supported devices. The same
driver is used for both PFs and VFs; VFs are identified by the driver
through the flag MLX5_PCI_DEV_IS_VF added to the pci table entries.
Virtual functions are created as usual through writing a value to the
sriov_numvfs sysfs file of the PF device. Upon instantiating VFs, they will
all be probed by the driver on the hypervisor. One can gracefully unbind
them through /sys/bus/pci/drivers/mlx5_core/unbind.

mlx5_wait_for_vf_pages() was added to ensure that when a VF dies without
executing proper teardown, the hypervisor driver waits till all of the
pages that were allocated at the hypervisor to maintain its operation
are returned.

In order for the VF to be operational, the PF needs to call enable_hca
for it. This can be done before the VFs are created through a call to
pci_enable_sriov.

If there are VFs assigned to VMs when the driver of the PF is
unloaded, all the VFs will experience a system error and the PF driver
unloads cleanly; in this case pci_disable_sriov is not called and the devices
will show when running lspci. Once the PF driver is reloaded, it will
sync its data structures which maintain state on its VFs.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:43 -05:00
Eli Cohen 0b10710603 net/mlx5_core: Modify enable/disable hca functions
Modify these functions to have func_id argument to state which device we
are referring to. This is done as a preparation for SRIOV support where
a PF driver needs to control its virtual functions.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:43 -05:00
Jiri Pirko 745812065c mlxsw: spectrum: Implement LAG tx enabled lower state change
Enabling/disabling TX on a LAG port means enabling/disabling distribution
in our HW.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:29 -05:00
Jiri Pirko 8a1ab5d766 mlxsw: spectrum: Implement FDB add/remove/dump for LAG
Implement FDB offloading for lagged ports, including learning LAG FDB
entries, adding/removing static FDB entries and dumping existing LAG FDB
entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:29 -05:00
Jiri Pirko 0d65fc1304 mlxsw: spectrum: Implement LAG port join/leave
Implement basic procedures for joining/leaving port to/from LAG. That
includes HW setup of collector, core LAG mapping setup.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:29 -05:00
Jiri Pirko 3b71571c01 mlxsw: reg: Add definition of LAG unicast record for SFN register
LAG-related records have specific format in SFN register.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:28 -05:00
Jiri Pirko e4bfbae29a mlxsw: reg: Add definition of LAG unicast record for SFD register
LAG-related records have specific format in SFD register.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:28 -05:00
Jiri Pirko d1d40be084 mlxsw: reg: Add link aggregation configuration registers definitions
Add definitions of SLDR, SLCR2, SLCOR registers that are used to
configure LAG.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:28 -05:00
Jiri Pirko d2292e8761 mlxsw: pci: Implement LAG processing for received packets
Completion queue element for receive queue provides information if the
packet was received via LAG port. Extract this info and pass it along
to core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:28 -05:00
Jiri Pirko 8060646a0f mlxsw: core: Add support for packets received from LAG port
Lower layer (pci) has information if the packet is received via LAG port.
If that is the case, it fills up rx_info accordingly. However upper
layer does not care about lag_id/port_index for received packets so
convert it to local_port before passing it up. For that conversion, lag
mapping array is introduced. Upper layer is responsible for setting up
the mapping according to what is set in HW.
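
A minimal sketch of such a mapping array is shown below. The array bounds, the rx_info layout and the helper names are hypothetical stand-ins, not the definitions used in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LAGS        4	/* hypothetical limits */
#define MAX_LAG_MEMBERS 4

/* Filled in by the upper layer to mirror what is configured in HW. */
struct lag_mapping {
	uint8_t local_port[MAX_LAGS][MAX_LAG_MEMBERS];
};

/* Hypothetical per-packet info as filled in by the lower (pci) layer. */
struct rx_info {
	bool is_lag;
	uint16_t lag_id;	/* valid when is_lag is true */
	uint8_t port_index;	/* valid when is_lag is true */
	uint8_t local_port;	/* valid when is_lag is false */
};

static void lag_mapping_set(struct lag_mapping *m, uint16_t lag_id,
			    uint8_t port_index, uint8_t local_port)
{
	assert(lag_id < MAX_LAGS && port_index < MAX_LAG_MEMBERS);
	m->local_port[lag_id][port_index] = local_port;
}

/* Convert (lag_id, port_index) to local_port before passing the packet up. */
static uint8_t rx_local_port(const struct lag_mapping *m, const struct rx_info *info)
{
	if (!info->is_lag)
		return info->local_port;
	assert(info->lag_id < MAX_LAGS && info->port_index < MAX_LAG_MEMBERS);
	return m->local_port[info->lag_id][info->port_index];
}
```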

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:28 -05:00
Jiri Pirko c5b9b518ad mlxsw: spectrum: Add set_rx_mode ndo stub
Add just a stub for now. This allows to pass check in dev_ifsioc,
SIOCADDMULTI and SIOCDELMULTI cases. Teamd is using these to add LACP
slow MAC.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:27 -05:00
Jiri Pirko 52581961d8 mlxsw: core: Implement fan control using hwmon
ASIC provides access to fans. Implement their exposure to userspace
using hwmon.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:05:41 -05:00
Jiri Pirko 5246f2e29a mlxsw: reg: Add definition of fan management registers
Add definition of MFCR, MFSC and MFSM which provide possibility to
control and monitor fans.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:05:40 -05:00
Jiri Pirko 89309da39f mlxsw: core: Implement temperature hwmon interface
ASIC provides access to temperature sensors. Implement their exposure to
userspace using hwmon.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:05:40 -05:00
Jiri Pirko 85926f8770 mlxsw: reg: Add definition of temperature management registers
Add definition of MTCAP and MTMP registers which provide access to
temperature sensors.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:05:40 -05:00
Ido Schimmel 3a66ee38dc mlxsw: spectrum: Add support for port identification
Allow a user to flash the port's LED in order to identify it. This is
achieved by setting the Management LED Control Register (MLCR).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:05:40 -05:00
Ido Schimmel 3161c15900 mlxsw: reg: Add Management LED Control register definition
Add the MLCR register, which controls physical port identification LEDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:05:40 -05:00
Ido Schimmel b07a966c70 mlxsw: spectrum: Add error paths to __mlxsw_sp_port_vlans_add
The operation of adding VLANs on a port via switchdev ops can fail and
we need to be prepared for it. If we do not rollback hardware operations
following a failure, hardware and software will remain in an
inconsistent state.

Solve that by adding suitable error paths to __mlxsw_sp_port_vlans_add.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-20 11:06:03 -05:00
Ido Schimmel 3b7ad5ece4 mlxsw: spectrum: Unify setting of HW VLAN filters
When adding or deleting VLANs from a bridged port, HW VLAN filters must be
set accordingly. Instead of having the same code in both add and delete
functions, just wrap it in a function and call it with the appropriate
parameters.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-20 11:06:03 -05:00
Ido Schimmel 06c071f68d mlxsw: spectrum: Use correct PVID value when removing VLANs
When removing a range of VLANs in which PVID is a member we should use
the correct PVID value instead of some VLAN in the range.

Also, change two print statements to use 'dev' instead of
'mlxsw_sp_port->dev', as it's already used in other print statements in
the function.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-20 11:06:02 -05:00
Eric Dumazet 93d05d4a32 net: provide generic busy polling to all NAPI drivers
NAPI drivers no longer need to observe a particular protocol
to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)

napi_hash_add() and napi_hash_del() are automatically called
from core networking stack, respectively from
netif_napi_add() and netif_napi_del()

This patch depends on free_netdev() and netif_napi_del() being
called from process context, which seems to be the norm.

Drivers might still prefer to call napi_hash_del() on their
own, since they might combine all the rcu grace periods into
a single one, knowing their NAPI structures lifetime, while
core networking stack has no idea of a possible combining.

Once this patch proves to not bring serious regressions,
we will cleanup drivers to either remove napi_hash_del()
or provide appropriate rcu grace periods combining.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:42 -05:00
Eric Dumazet d64b5e85bf net: add netif_tx_napi_add()
netif_tx_napi_add() is a variant of netif_napi_add()

It should be used by drivers that use a napi structure
to exclusively poll TX.

We do not want to add this kind of napi in napi_hash[] in following
patches, adding generic busy polling to all NAPI drivers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:41 -05:00
Eric Dumazet 93f93a4404 net: move skb_mark_napi_id() into core networking stack
We would like to automatically provide busy polling support
to all NAPI drivers, without them having to implement anything.

skb_mark_napi_id() can be called from napi_gro_receive() and
napi_get_frags().

Few drivers are still calling skb_mark_napi_id() because
they use netif_receive_skb(). They should eventually call
napi_gro_receive() instead. I will leave this to drivers
maintainers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:41 -05:00
Eric Dumazet 868fdb0606 mlx4: remove mlx4_en_low_latency_recv()
Busy polling can now be handled in generic NAPI poll infrastructure.
This removes complexity and fast path overhead :

mlx4 used two spin_lock()/spin_unlock() pair per napi->poll() call
in mlx4_en_cq_lock_napi()/mlx4_en_cq_unlock_napi()

Tested:

Without busy polling :

lpaa23:~# echo 0 >/proc/sys/net/core/busy_read
lpaa24:~# echo 0 >/proc/sys/net/core/busy_read
lpaa23:~# ./netperf -H lpaa24 -t TCP_RR
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpaa24.prod.google.com () port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    47330.78

With busy polling :

lpaa23:~# echo 70 >/proc/sys/net/core/busy_read
lpaa24:~# echo 70 >/proc/sys/net/core/busy_read
lpaa23:~# ./netperf -H lpaa24 -t TCP_RR
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpaa24.prod.google.com () port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    97643.55

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:40 -05:00
Eric Dumazet 44fb6fbbac mlx5: support napi_complete_done()
A NAPI poll handler should return number of RX packets processed,
instead of 0 / budget.

This allows proper busy poll accounting through LINUX_MIB_BUSYPOLLRXPACKETS
SNMP counter.

napi_complete_done() allows /sys/class/net/ethX/gro_flush_timeout
to be used for finer GRO aggregation control.

Tested:

Enabled busy polling, and checked TcpExtBusyPollRxPackets counter is increasing.

echo 70 >/proc/sys/net/core/busy_read
nstat >/dev/null
netperf -H target -t TCP_RR >/dev/null
nstat | grep TcpExtBusyPollRxPackets
TcpExtBusyPollRxPackets         490958             0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:39 -05:00
Eric Dumazet 7ae92ae588 mlx5: add busy polling support
It is now easy to add busy polling support to a NAPI driver,
with very little impact on normal input path.

This patch serves as a reference implementation.

Note:

A followup patch will add proper napi_complete_done() in mlx5,
so that LINUX_MIB_BUSYPOLLRXPACKETS snmp counter is properly handled.

Tested:

Normal TCP_RR results without busy polling :

lpk51:~# echo 0 >/proc/sys/net/core/busy_read
lpk52:~# echo 0 >/proc/sys/net/core/busy_read

lpk51:~# ./netperf -H 192.168.4.52 -t TCP_RR -l 10
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.4.52 () port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    53509.49
16384  87380

Now enable busy polling :

lpk51:~# echo 70 >/proc/sys/net/core/busy_read
lpk52:~# echo 70 >/proc/sys/net/core/busy_read

lpk51:~# ./netperf -H 192.168.4.52 -t TCP_RR -l 10
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.4.52 () port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    97530.92
16384  87380

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:39 -05:00
Eric Dumazet 5865316c9d mlx4: mlx4_en_low_latency_recv() called with BH disabled
mlx4_en_low_latency_recv() is called with BH disabled,
as other ndo_busy_poll() methods.

No need for spin_lock_bh()/spin_unlock_bh()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:38 -05:00
Noa Osherovich d49c2197fd net/mlx4_core: Avoid returning success in case of an error flow
The err variable wasn't set with the correct error value in some cases.

Fixes: 47605df953 ('mlx4: Modify proxy/tunnel QP mechanism [..]')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:41 -05:00
Eran Ben Elisha f5adbfee72 net/mlx4_core: Fix sleeping while holding spinlock at rem_slave_counters
When cleaning slave's counter resources, we hold a spinlock that
protects the slave's counters list. As part of the clean, we call
__mlx4_clear_if_stat which calls mlx4_alloc_cmd_mailbox which is a
sleepable function.

In order to fix this issue, hold the spinlock, and copy all counter
indices into a temporary array, and release the spinlock. Afterwards,
iterate over this array and free every counter. Repeat this scenario
until the original list is empty (a new counter might have been added
while releasing the counters from the temporary array).
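
The pattern can be sketched in userspace with a pthread mutex standing in for the spinlock. All names here are hypothetical, and unlike the description above, which copies every index at once, this sketch drains the list in fixed-size batches so the temporary array can live on the stack.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_COUNTERS 16
#define BATCH        4

struct counter_list {
	pthread_mutex_t lock;	/* stand-in for the spinlock */
	int idx[MAX_COUNTERS];
	size_t len;
};

static int freed_total;

/* Stand-in for an operation that may sleep (e.g. a mailbox allocation). */
static void free_counter_may_sleep(int index)
{
	(void)index;
	freed_total++;
}

static void rem_all_counters(struct counter_list *list)
{
	int tmp[BATCH];
	size_t n;

	assert(list != NULL);
	do {
		/* Under the lock: only move indices out of the shared list. */
		pthread_mutex_lock(&list->lock);
		n = list->len < BATCH ? list->len : BATCH;
		for (size_t i = 0; i < n; i++)
			tmp[i] = list->idx[--list->len];
		pthread_mutex_unlock(&list->lock);

		/* Lock released: the sleepable work is now safe. */
		for (size_t i = 0; i < n; i++)
			free_counter_may_sleep(tmp[i]);
	} while (n > 0);	/* repeat, since entries may have been added meanwhile */
}
```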

Fixes: b72ca7e96a ("net/mlx4_core: Reset counters data when freed")
Reported-by: Moni Shoua <monis@mellanox.com>
Tested-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:41 -05:00
Achiad Shochat d4e28cbd24 net/mlx5e: Use the right DMA free function on TX path
On xmit path we use skb_frag_dma_map() which is using dma_map_page(),
while upon completion we dma-unmap the skb fragments using
dma_unmap_single() rather than dma_unmap_page().

To fix this, we now save the dma map type on xmit path and use this
info to call the right dma unmap method upon TX completion.
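
The idea of recording the mapping type at xmit time can be illustrated with a userspace mock. The enum, the entry structure and the mock_unmap_* functions are hypothetical stand-ins for the kernel DMA API, which is not called here.

```c
#include <assert.h>

/* How a buffer was mapped on the xmit path. */
enum dma_map_type { DMA_MAP_SINGLE, DMA_MAP_PAGE };

struct tx_dma_entry {
	unsigned long addr;
	unsigned int size;
	enum dma_map_type type;	/* saved at map time, consulted at unmap time */
};

static int unmap_single_calls, unmap_page_calls;

/* Mocks standing in for dma_unmap_single() and dma_unmap_page(). */
static void mock_unmap_single(unsigned long addr, unsigned int size)
{
	(void)addr; (void)size;
	unmap_single_calls++;
}

static void mock_unmap_page(unsigned long addr, unsigned int size)
{
	(void)addr; (void)size;
	unmap_page_calls++;
}

/* On TX completion, pick the unmap routine that matches the mapping. */
static void tx_unmap(const struct tx_dma_entry *e)
{
	switch (e->type) {
	case DMA_MAP_SINGLE:
		mock_unmap_single(e->addr, e->size);
		break;
	case DMA_MAP_PAGE:
		mock_unmap_page(e->addr, e->size);
		break;
	default:
		assert(!"unknown map type");
	}
}
```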

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:40 -05:00
Doron Tsur 50a9eea694 net/mlx5e: Max mtu comparison fix
On change mtu the driver compares between hardware queried mtu and
software requested mtu. We need to compare between software
representation of the queried mtu and the requested mtu.

Fixes: facc9699f0 ('net/mlx5e: Fix HW MTU settings')
Signed-off-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:40 -05:00
Tariq Toukan 66189961e9 net/mlx5e: Added self loopback prevention
Prevent outgoing multicast frames from looping back to the RX queue.

By introducing new HW capability self_lb_en_modifiable, which indicates
the support to modify self_lb_en bit in modify_tir command.

When this capability is set we can prevent TIRs from sending back
loopback multicast traffic to their own RQs, by "refreshing TIRs" with
modify_tir command, on every time new channels (SQs/RQs) are created at
device open.
This is needed since TIRs are static and only allocated once on driver
load, and the loopback decision is under their responsibility.

Fixes issues of the kind:
"IPv6: eth2: IPv6 duplicate address fe80::e61d:2dff:fe5c:f2e9 detected!"
The issue is seen since the IPv6 solicitation multicast messages are
looped back and the network stack thinks they are coming from another host.

Fixes: 5c50368f38 ("net/mlx5e: Light-weight netdev open/stop")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:40 -05:00
Saeed Mahameed ba6c4c0944 net/mlx5e: Fix inline header size calculation
mlx5e_get_inline_hdr_size didn't take into account the vlan insertion
into the inline WQE segment.
This could lead to max inline violation in cases where
skb_headlen(skb) + VLAN_HLEN >= sq->max_inline.

Fixes: 3ea4891db8 ("net/mlx5e: Fix LSO vlan insertion")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:40 -05:00
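
The inline header size rule described in ba6c4c0944 can be sketched in user space as follows. This is a minimal illustration, not the driver's code: the function name, the `SKETCH_` constants, and the exact boundary test (`<=`) are assumptions made for the example. The point is only that the VLAN tag length must be counted before comparing against the inline limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ETH_HLEN  14  /* Ethernet header length in bytes */
#define SKETCH_VLAN_HLEN 4   /* 802.1Q tag length in bytes */

/*
 * Hypothetical helper: decide how many header bytes to copy inline.
 * The whole linear part is inlined only if it still fits once the
 * VLAN tag that will be inserted is accounted for; otherwise fall
 * back to the minimum (the Ethernet header).
 */
uint16_t sketch_inline_hdr_size(uint16_t headlen, bool vlan_present,
                                uint16_t max_inline)
{
    assert(max_inline >= SKETCH_ETH_HLEN);

    uint32_t needed = headlen;
    if (vlan_present)
        needed += SKETCH_VLAN_HLEN;  /* the tag occupies inline space too */

    if (needed <= max_inline)
        return headlen;
    return SKETCH_ETH_HLEN;
}
```

With a 102-byte limit, a 100-byte linear part fits without a VLAN tag but not with one, which is the case the original calculation missed.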
Linus Torvalds ab9f2faf8f Initial 4.4 merge window submission
- "Checksum offload support in user space" enablement
 - Misc cxgb4 fixes, add T6 support
 - Misc usnic fixes
 - 32 bit build warning fixes
 - Misc ocrdma fixes
 - Multicast loopback prevention extension
 - Extend the GID cache to store and return attributes of GIDs
 - Misc iSER updates
 - iSER clustering update
 - Network NameSpace support for rdma CM
 - Work Request cleanup series
 - New Memory Registration API

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "This is my initial round of 4.4 merge window patches.  There are a few
  other things I wish to get in for 4.4 that aren't in this pull, as
  this represents what has gone through merge/build/run testing and not
  what is the last few items for which testing is not yet complete.

   - "Checksum offload support in user space" enablement
   - Misc cxgb4 fixes, add T6 support
   - Misc usnic fixes
   - 32 bit build warning fixes
   - Misc ocrdma fixes
   - Multicast loopback prevention extension
   - Extend the GID cache to store and return attributes of GIDs
   - Misc iSER updates
   - iSER clustering update
   - Network NameSpace support for rdma CM
   - Work Request cleanup series
   - New Memory Registration API"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (76 commits)
  IB/core, cma: Make __attribute_const__ declarations sparse-friendly
  IB/core: Remove old fast registration API
  IB/ipath: Remove fast registration from the code
  IB/hfi1: Remove fast registration from the code
  RDMA/nes: Remove old FRWR API
  IB/qib: Remove old FRWR API
  iw_cxgb4: Remove old FRWR API
  RDMA/cxgb3: Remove old FRWR API
  RDMA/ocrdma: Remove old FRWR API
  IB/mlx4: Remove old FRWR API support
  IB/mlx5: Remove old FRWR API support
  IB/srp: Dont allocate a page vector when using fast_reg
  IB/srp: Remove srp_finish_mapping
  IB/srp: Convert to new registration API
  IB/srp: Split srp_map_sg
  RDS/IW: Convert to new memory registration API
  svcrdma: Port to new memory registration API
  xprtrdma: Port to new memory registration API
  iser-target: Port to new memory registration API
  IB/iser: Port to new fast registration API
  ...
2015-11-07 13:33:07 -08:00
Linus Torvalds 9cf5c095b6 asm-generic cleanups
The asm-generic changes for 4.4 are mostly a series from Christoph Hellwig
 to clean up various abuses of headers in there. The patch to rename the
 io-64-nonatomic-*.h headers caused some conflicts with new users, so I
 added a workaround that we can remove in the next merge window.
 
 The only other patch is a warning fix from Marek Vasut

Merge tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic cleanups from Arnd Bergmann:
 "The asm-generic changes for 4.4 are mostly a series from Christoph
  Hellwig to clean up various abuses of headers in there.  The patch to
  rename the io-64-nonatomic-*.h headers caused some conflicts with new
  users, so I added a workaround that we can remove in the next merge
  window.

  The only other patch is a warning fix from Marek Vasut"

* tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: temporarily add back asm-generic/io-64-nonatomic*.h
  asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
  gpio-mxc: stop including <asm-generic/bug>
  n_tracesink: stop including <asm-generic/bug>
  n_tracerouter: stop including <asm-generic/bug>
  mlx5: stop including <asm-generic/kmap_types.h>
  hifn_795x: stop including <asm-generic/kmap_types.h>
  drbd: stop including <asm-generic/kmap_types.h>
  move count_zeroes.h out of asm-generic
  move io-64-nonatomic*.h out of asm-generic
2015-11-06 14:22:15 -08:00
Achiad Shochat 3ea4891db8 net/mlx5e: Fix LSO vlan insertion
Consider vlan insertion impact on headers copy size also for LSO
packets.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:51 -05:00
Achiad Shochat e4cf27bd9c net/mlx5e: Re-enable client vlan TX acceleration
This reverts commit cd58c714ac "net/mlx5e: Disable client vlan TX acceleration".

Bring back client vlan insertion offload; the original
performance issue was found and is fixed in the next patch.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:51 -05:00
Achiad Shochat fe9f4fe58d net/mlx5e: Return error in case mlx5e_set_features() fails
In case mlx5e_set_features() fails, return the failure status rather
than 0.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:50 -05:00
Achiad Shochat 3435ab59d3 net/mlx5e: Don't allow more than max supported channels
Take MLX5E_MAX_NUM_CHANNELS into account in the ethtool set/get_channels handlers.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:50 -05:00
Achiad Shochat 61d0e73e0a net/mlx5_core: Use the real irqn in eq->irqn
Instead of storing the msix array index in eq->irqn (vecidx),
store the real irq number.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:50 -05:00
Achiad Shochat 01c196a2d3 net/mlx5e: Wait for RX buffers initialization in a more proper manner
Use jiffies rather than wait loop with msleep().

The wait loop didn't take into consideration time when the
process was not executing.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:50 -05:00
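
The change in 01c196a2d3 replaces "sleep N times" with "wait until a deadline". A user-space analogue of that idea is sketched below, using a monotonic clock in place of jiffies. All names here (`sketch_now_ms`, `sketch_wait_for`, `sketch_ready_after_n_polls`) are hypothetical, and the 1 ms polling interval is an arbitrary choice for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in milliseconds; plays the role jiffies plays in the kernel. */
int64_t sketch_now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll ready(ctx) until it returns true or timeout_ms of wall time has
 * elapsed. Because the deadline is absolute, time spent descheduled
 * counts against the timeout, unlike a loop that counts sleeps.
 */
bool sketch_wait_for(bool (*ready)(void *), void *ctx, int64_t timeout_ms)
{
    assert(ready != NULL);

    const int64_t deadline = sketch_now_ms() + timeout_ms;
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };

    while (!ready(ctx)) {
        if (sketch_now_ms() >= deadline)
            return false;
        nanosleep(&nap, NULL);
    }
    return true;
}

/* Example predicate: becomes true once the counter in ctx reaches zero. */
bool sketch_ready_after_n_polls(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0)
        (*remaining)--;
    return *remaining == 0;
}
```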
Achiad Shochat a198574090 net/mlx5e: Avoid NULL pointer access in case of configuration failure
In case a configuration operation that involves closing and re-opening
resources (e.g RX/TX queue size change) fails at the re-opening stage
these resources will remain closed.
So when executing (following) configuration operations (e.g ifconfig
down) we cannot assume that these resources are available.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:50 -05:00
David S. Miller b75ec3af27 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-01 00:15:30 -04:00
Jiri Pirko c7070fc4ec mlxsw: spectrum: Make mlxsw_sp_port_switchdev_ops static
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:58 +09:00
Or Gerlitz d9324f68ee mlxsw: Put braces on all arms of branch statement
Fix a place where checkpatch complains that braces should be used
on all arms of this statement.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:57 +09:00
Or Gerlitz ef743fddb3 mlxsw: Put constant on the right side of comparisons
Fixes those places where checkpatch complains that comparisons
should place the constant on the right side of the test.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:54 +09:00
Jiri Pirko 135f9eceb7 mlxsw: spectrum: Fix ageing time value
The value passed through switchdev attr set is not in jiffies, but in
clock_t, so fix the conversion.

Reported-by: Sagi Rotem <sagir@mellanox.com>
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:52 +09:00
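
To see why the unit matters in 135f9eceb7, the sketch below converts an ageing time expressed in clock_t ticks to seconds, and contrasts it with misreading the same number as jiffies. The tick rates are assumptions for the example (100 clock_t ticks per second, 1000 jiffies per second); the `sketch_` helpers are stand-ins and not the kernel's conversion functions.

```c
#include <assert.h>

#define SKETCH_USER_HZ   100UL   /* assumed clock_t ticks per second */
#define SKETCH_KERNEL_HZ 1000UL  /* assumed jiffies per second */

static_assert(SKETCH_KERNEL_HZ % SKETCH_USER_HZ == 0,
              "sketch assumes HZ is a multiple of USER_HZ");

/* Convert clock_t ticks to jiffies under the assumed rates. */
unsigned long sketch_clock_t_to_jiffies(unsigned long ticks)
{
    return ticks * (SKETCH_KERNEL_HZ / SKETCH_USER_HZ);
}

/* Convert jiffies to whole seconds under the assumed rate. */
unsigned long sketch_jiffies_to_secs(unsigned long jiffies)
{
    return jiffies / SKETCH_KERNEL_HZ;
}
```

Under these assumptions, 30000 clock_t ticks is 300 seconds, whereas treating 30000 as jiffies would give 30 seconds, a tenfold error.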
Jiri Pirko 75c09280fe mlxsw: reg: Avoid unnecessary line wrap for mlxsw_reg_sfd_uc_unpack
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:50 +09:00
Jiri Pirko 8316f087f7 mlxsw: reg: Fix description typos of a couple of SFN items
Fix copy-paste errors.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:50 +09:00
Jiri Pirko 4e9ec0839b mlxsw: reg: Fix description for reg_sfd_uc_sub_port
The original description was for LAG, so fix it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:48 +09:00
Ido Schimmel 0293038e0c mlxsw: spectrum: Add support for flood control
Add or remove a bridged port from the flooding domain of unknown unicast
packets according to user configuration.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:47 +09:00
Ido Schimmel 1b3433a942 mlxsw: spectrum: Add support for VLAN ranges in flooding configuration
When enabling a range of VLANs on a bridged port we can configure
flooding for these VLANs by one register access instead of calling the
same register for each VLAN. This is accomplished by using the 'range'
field of the Switch Flooding Table Register (SFTR).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:45 +09:00
Jiri Pirko 0d9b970cee mlxsw: spectrum: move "bridged" bool to u8 flags
It is a flag anyway, so move it to existing u8 flag and don't waste mem.
Fix the flags to be in single u8 on the way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:42 +09:00
Carol L Soto c02b05011f net/mlx4: Copy/set only sizeof struct mlx4_eqe bytes
When doing memcpy/memset of EQEs, we should use sizeof struct
mlx4_eqe as the base size and not caps.eqe_size which could be bigger.

If caps.eqe_size is bigger than the struct mlx4_eqe then we corrupt
data in the master context.

When using a 64 byte stride, the memcpy copied over 63 bytes to the
slave_eq structure.  This resulted in copying over the entire eqe of
interest, including its ownership bit -- and also 31 bytes of garbage
into the next WQE in the slave EQ -- which did NOT include the ownership
bit (and therefore had no impact).

However, once the stride is increased to 128, we are overwriting the
ownership bits of *three* eqes in the slave_eq struct.  This results
in an incorrect ownership bit for those eqes, which causes the eq to
seem to be full. The issue therefore surfaced only once 128-byte EQEs
started being used in SRIOV (and on architectures that have 128/256-byte
cache lines, such as PPC), e.g. after commit 77507aa249
"net/mlx4_core: Enable CQE/EQE stride support".

Fixes: 08ff32352d ('mlx4: 64-byte CQE/EQE support')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:27:11 -07:00
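
The bug class fixed in c02b05011f can be illustrated with a small user-space sketch: entries live in a ring with a stride that may exceed the entry size, and the copy must be bounded by the entry's own size rather than by the stride. The 32-byte `struct sketch_eqe` layout and the function name are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-byte event queue entry; the last byte carries ownership. */
struct sketch_eqe {
    uint8_t payload[31];
    uint8_t owner;
};

/*
 * Copy entry idx out of a ring whose entries are spaced stride bytes
 * apart. Only sizeof(struct sketch_eqe) bytes are copied, so a larger
 * stride cannot spill into the destination's neighbouring entries.
 */
void sketch_copy_eqe(struct sketch_eqe *dst, const uint8_t *ring,
                     size_t idx, size_t stride)
{
    assert(stride >= sizeof(struct sketch_eqe));
    memcpy(dst, ring + idx * stride, sizeof(*dst));
}
```

Had the copy length been derived from a 128-byte stride instead, it would have overwritten the ownership bytes of the following entries in a destination array of 32-byte entries, which is the corruption the commit message describes.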
Jack Morgenstein 092bf0fc80 net/mlx4_en: Explicitly set no vlan tags in WQE ctrl segment when no vlan is present
We do not set the ins_vlan field to zero when no vlan id is present in the packet.

Since WQEs in the TX ring are not zeroed out between uses, this oversight
could result in having vlan flags present in the WQE ctrl segment when no
vlan is present.

Fixes: e38af4faf0 ('net/mlx4_en: Add support for hardware accelerated 802.1ad vlan')
Reported-by: Gideon Naim <gideonn@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:27:09 -07:00
Maor Gottlieb 74194fb9c8 net/mlx4_en: Implement mcast loopback prevention for ETH qps
Set the mcast loopback prevention bit in the QPC for ETH MLX QPs (not
RSS QPs), when the firmware supports this feature. In addition, all rx
ring QPs need to be updated in order not to enforce loopback checks.
This prevents getting packets we sent both from the network stack and
the HCA. Loopback prevention is done by comparing the counter indices of
the sent and receiving QPs. If they're equal, packets aren't
loopback-ed.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 23:16:47 -04:00
Maor Gottlieb 9a89283597 net/mlx4_core: Add support for filtering multicast loopback
Update device capabilities regarding HW filtering multicast loopback support.

Add MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB attribute to mlx4_update_qp to
enable changing QP context to support filtering incoming multicast
loopback traffic according to the sender's counter index.

Set the corresponding bits in QP context to force the loopback source
checks if attribute is given and HW supports it.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 23:16:47 -04:00
Doug Ledford fc81a06965 Merge branch 'k.o/for-4.3-v1' into k.o/for-4.4
Pick up the late fixes from the 4.3 cycle so we have them in our
next branch.
2015-10-21 16:40:21 -04:00
David S. Miller 26440c835f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	net/ipv4/inet_connection_sock.c
	net/switchdev/switchdev.c

In the inet_connection_sock.c case the request socket hashing scheme
is completely different in net-next.

The other two conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-20 06:08:27 -07:00
Jiri Pirko 56ade8fe3f mlxsw: spectrum: Add initial support for Spectrum ASIC
Add support for new generation Mellanox Spectrum ASIC, 10/25/40/50 and
100Gb/s Ethernet Switch.

The initial driver implements bridge forwarding offload including
bridge internal VLAN support, FDB static entries, FDB learning and
HW ageing including their setup.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:23 -07:00
Ido Schimmel a4feea74cd mlxsw: reg: Add Switch Port VLAN MAC Learning register definition
Since we currently do not support the offloading of 802.1D bridges, we
need to be able to let the device know it should not learn MAC addresses
on specific {Port, VID} pairs.

Add the SPVMLR register, which controls the learning enablement of
{Port, VID} pairs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:22 -07:00
Jiri Pirko e534a56a31 mlxsw: reg: Add Switch Filtering Database Aging Time register definition
Add SFDAT which is used to control switch ageing time.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:22 -07:00
Ido Schimmel 1f65da742d mlxsw: reg: Add Switch Virtual-Port Enabling register definition
In order for a port to support {Port, VID} to FID mapping it needs to be
configured to a virtual port mode (as opposed to VLAN mode).

Add the SVPE register, which enables port virtualization.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:20 -07:00
Ido Schimmel 6479023976 mlxsw: reg: Add Switch VID to FID Allocation register definition
An incoming packet can be classified into a filtering identifier (FID)
based on its VID or incoming port and VID ({Port, VID}).

Add the SVFA register, which controls this mapping.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:19 -07:00
Ido Schimmel f1fb693a08 mlxsw: reg: Add Switch FID Management register definition
Filtering identifiers (FIDs) are unique identifiers of bridge instances
in the hardware.

Add the SFMR register, which is responsible for the creation and
configuration of these FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:18 -07:00
Jiri Pirko e059436999 mlxsw: reg: Add shared buffer configuration registers definitions
Add definitions of SBPR, SBCM, SBPM, SBMM and PBMC registers that are
used to configure shared buffers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:18 -07:00
Elad Raz b2e345f9a4 mlxsw: reg: Add Switch Port VID and Switch Port VLAN Membership registers definitions
Add SPVID and SPVM registers responsible for default port VID
configuration and VLAN membership of a port.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:17 -07:00
Jiri Pirko f5d88f5892 mlxsw: reg: Add Switch FDB Notification register definition
Add SFN register which is used to poll for newly added and aged-out FDB
entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:17 -07:00
Jiri Pirko 236033b33c mlxsw: reg: Add Switch Filtering Database register definition
Add the SFD register which is responsible for filtering database
manipulation, including static and dynamic FDB entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:15 -07:00
Jiri Pirko d64b159253 mlxsw: item: Add MLXSW_ITEM_BUF_INDEXED helper
Add a missing item helper which allows accessing char bufs at multiple
offsets. This is needed by SFD and SFN register definitions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:14 -07:00
Jiri Pirko 7b0989b5bc mlxsw: item: Make src arg of memcpy_to helper const
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:12 -07:00
Ido Schimmel 12fd35ab8a mlxsw: cmd: Introduce FID-offset flooding tables
Packets destined to offloaded netdevs will be classified to FIDs in the
device and flooded in case of BUM.

The flooding table used is of type FID-offset, which allows one to
create different flooding domains for different FIDs and specify the
offset in the flooding table for each FID (not necessarily equal to FID
or VID).

Add support for this flooding table type, by exposing the configuration
of the number of tables from this type and their size.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:10 -07:00
Ido Schimmel 453b6a8dd8 mlxsw: cmd: Introduce per-FID flooding tables
In the newly introduced Spectrum switch ASIC, packets destined to
non-offloaded netdevs will be classified to special FIDs (vFIDs) in the
device and flooded to the CPU port.

The flooding table used is of type per-FID, which allows one to create
different flooding domains for different vFIDs.

While using a simple single-entry flood table is certainly sufficient at
this point, we do plan to offload 802.1D bridges involving VLAN
interfaces, thus making this change necessary.

Add support for this flooding table type, by exposing the configuration
of the number of tables from this type and their size.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:08 -07:00
Ido Schimmel bc2055f878 mlxsw: Enable configuration of flooding domains
As part of the introduction of L2 offloads, allow different ports to
join/leave the flooding domain, according to user configuration.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:08 -07:00
Ivan Vecera 47ea032533 drivers/net: get rid of unnecessary initializations in .get_drvinfo()
Many drivers needlessly initialize the n_priv_flags, n_stats, testinfo_len,
eedump_len & regdump_len fields in their .get_drvinfo() ethtool op.
It's not necessary as these fields are filled in ethtool_get_drvinfo().

v2: removed unused variable
v3: removed another unused variable

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:24:10 -07:00
Insu Yun 175f8d6746 mlx4: correctly check failed allocation
When allocation fails, mlx4_alloc_cmd_mailbox returns -ENOMEM encoded as
an error pointer. Since mlx4_alloc_cmd_mailbox never returns NULL,
the result needs to be checked with IS_ERR, not IS_ERR_OR_NULL.

Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:31:38 -07:00
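
The error-pointer convention behind 175f8d6746 can be mimicked in user space as below. The `sketch_` helpers are simplified stand-ins written for this example, not the kernel's macros, and `sketch_alloc_mailbox` is a hypothetical allocator that, like the function in the commit, reports failure through an encoded error pointer and never through NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095  /* assumed upper bound on encoded errno values */

/* Encode a negative errno value in a pointer. */
void *sketch_err_ptr(long err)
{
    assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True if p holds an encoded error rather than a real address. */
bool sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-(intptr_t)SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an encoded pointer. */
long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Hypothetical allocator: failure is always an error pointer, never NULL. */
void *sketch_alloc_mailbox(bool simulate_failure)
{
    void *buf = simulate_failure ? NULL : malloc(64);
    if (buf == NULL)
        return sketch_err_ptr(-ENOMEM);
    return buf;
}
```

Because such an allocator cannot return NULL, a caller only needs the `sketch_is_err` style check; a combined "error or NULL" test would cover a case that cannot occur.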
Ido Schimmel 5cd16d8c78 mlxsw: cmd: Update CONFIG_PROFILE command documentation
The meaning of certain parameters in the profile passed to the device
during initialization has changed, so update their documentation
accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:57 -07:00
Ido Schimmel 801bd3defb mlxsw: Add trap group for control packets
Previously, we trapped flooded and control packets using the same trap
group. This can cause flooded packets to overflow the PCI bus and
prevent control packets (e.g. STP, LACP) from getting to the CPU.

Solve this by splitting the RX trap group to RX and control, which allows
us to configure a policer on the first, thereby preventing it from
overflowing the PCI bus.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:56 -07:00
Ido Schimmel f24af33015 mlxsw: Simplify traps creation
The Host Trap Group Table (HTGT) register configures trap groups, which
are populated with trap IDs using the Host PacKet Trap (HPKT) register.
However, a trap ID can only be present inside one trap group (the last
configured).

Instead of passing both the trap group and ID for the function that
packs HPKT, pass only the trap ID and derive from it the trap group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:55 -07:00
Jiri Pirko ebb7963f9b mlxsw: Introduce mlxsw_reg_spms_vid_pack helper and use it
Introduce separate helper for packing SPMS VIDs, as it can be used for
multiple VIDs and not only for one as previous SPMS pack function
provided.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:55 -07:00
Ido Schimmel fa6ad058bc mlxsw: reg: Adjust definition of enum mlxsw_reg_sfgc_type
Define max which would be needed later on.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:54 -07:00
Jiri Pirko 36b78e8aba mlxsw: reg: Remove extra space in SFGC ID define
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:53 -07:00
Jiri Pirko 3f0effd16b mlxsw: reg: Uppercase letters in register IDs
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:52 -07:00
Jiri Pirko 6cf9dc8b77 mlxsw: Use dev_level_ratelimited instead of net_ratelimit & dev_level
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:51 -07:00
Jiri Pirko 18ea54454e mlxsw: core: Do not use EMADs in mlxsw_emad_fini
Be symmetric with mlxsw_emad_init and don't use EMADs in mlxsw_emad_fini
cleanup function. Use command interface instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:51 -07:00
Jiri Pirko 3e2206da73 mlxsw: pci: Limit number of entries being sent in single MAP_FA cmd
Firmware accepts only limited number of mapping entries for MAP_FA
command. In order to prevent overflow, introduce a limit and in case the
number of entries is bigger, call MAP_FA multiple times.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:49 -07:00
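
The approach in 3e2206da73, splitting a long list of mapping entries over several commands, can be sketched as a simple chunking loop. The callback type, the `sketch_cmd_log` recorder, and the limit passed in are all hypothetical; the real per-command limit is defined by the firmware interface and is not stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command callback: send entries [first, first + count). */
typedef int (*sketch_send_fn)(size_t first, size_t count, void *ctx);

/*
 * Issue as many commands as needed so that no single command carries
 * more than max_per_cmd entries. Stops at the first failing command.
 */
int sketch_map_in_chunks(size_t total, size_t max_per_cmd,
                         sketch_send_fn send, void *ctx)
{
    assert(max_per_cmd > 0);

    for (size_t first = 0; first < total; ) {
        size_t count = total - first;
        if (count > max_per_cmd)
            count = max_per_cmd;

        int err = send(first, count, ctx);
        if (err)
            return err;
        first += count;
    }
    return 0;
}

/* Example callback that only records what it was asked to send. */
struct sketch_cmd_log {
    size_t cmds;
    size_t entries;
    size_t last_count;
};

int sketch_record_cmd(size_t first, size_t count, void *ctx)
{
    struct sketch_cmd_log *log = ctx;
    (void)first;
    log->cmds++;
    log->entries += count;
    log->last_count = count;
    return 0;
}
```

For example, 70 entries with a limit of 32 per command result in three commands carrying 32, 32 and 6 entries.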
Jiri Pirko c85c3882ad mlxsw: pci: Remove MLXSW_PCI_RDQS/SDQS defines and checks
Remove strict number check of queues count as various ASICs have
different counts.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:48 -07:00
Jiri Pirko 424e1114af mlxsw: pci: Do not use MLXSW_PCI_SDQS_COUNT define
Use mlxsw_pci_sdq_count helper instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:48 -07:00
Jiri Pirko e4c870b1b4 mlxsw: pci: Use MLXSW_PCI_CQS_MAX instead of MLXSW_PCI_CQS_COUNT
The count of CQs can be different for various ASICs, so just define
maximal value and check for that.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:47 -07:00
Jiri Pirko ffe053285b mlxsw: switchx2: Use ETH_ALEN for mac address length
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:45 -07:00
Ido Schimmel 33a704a59b mlxsw: Remove multicast ID configuration
With respect to a firmware change, the Switch Multicast ID (SMID)
register is no longer needed, so the related configuration code can be
removed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:45 -07:00
Ido Schimmel 53ca376eec mlxsw: core: Fix race condition in __mlxsw_emad_transmit
Under certain conditions EMAD responses can be returned from the device
even before setting trans_active. This will cause the EMAD Rx listener
to drop the EMAD response - as there are no active transactions - and
timeouts will be generated.

Fix this by setting trans_active before transmitting the EMAD skb.

Fixes: 4ec14b7634 ("mlxsw: Add interface to access registers and process events")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 06:03:06 -07:00
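
The ordering fix in 53ca376eec, marking the transaction active before transmitting, can be illustrated with the user-space sketch below. The structure and function names are hypothetical; the "device" here is a callback that replies before the transmit call returns, which is the worst case the commit message describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_emad {
    atomic_bool trans_active;  /* set while a response is expected */
    bool resp_accepted;        /* set by the Rx path when a response is taken */
};

/* Rx path: a response is dropped unless a transaction is active. */
void sketch_emad_rx(struct sketch_emad *e)
{
    if (atomic_load(&e->trans_active)) {
        e->resp_accepted = true;
        atomic_store(&e->trans_active, false);
    }
}

/* Stand-in for a device that answers before transmit returns. */
void sketch_xmit_instant_reply(struct sketch_emad *e)
{
    sketch_emad_rx(e);
}

/* Tx path: mark the transaction active first, then hand off the request. */
void sketch_emad_transmit(struct sketch_emad *e,
                          void (*xmit)(struct sketch_emad *))
{
    assert(xmit != NULL);
    atomic_store(&e->trans_active, true);
    xmit(e);
}
```

If the flag were set after `xmit(e)` instead, the instant reply would find no active transaction and be dropped, and the sender would eventually time out.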
Jack Morgenstein 2b3ddf27f4 net/mlx4_core: Replace VF zero mac with random mac in mlx4_core
By design, when no default MAC addresses are set in the Hypervisor for VFs,
the VFs are passed zero-macs. When such a MAC is received by the VF, it
generates a random MAC address and registers that MAC address
with the Hypervisor.

This random mac generation is currently done in the mlx4_en module.
There is a problem, though, if the mlx4_ib module is loaded by a VF before
the mlx4_en module. In this case, for RoCE, mlx4_ib will see the un-replaced
zero-mac and register that zero-mac as part of QP1 initialization.

Having a zero-mac in the port's MAC table creates problems for a
Baseboard Management Console. The BMC occasionally sends packets with a
zero-mac destination MAC. If there is a zero-mac present in the port's
MAC table, the FW will send such BMC packets to the host driver rather than
to the wire, and BMC will stop working.

To address this problem, we move the replacement of zero-mac addresses
with random-mac addresses to procedure mlx4_slave_cap(), which is part of the
driver startup for VFs, and is before activation of mlx4_ib and mlx4_en.
As a result, zero-mac addresses will never be registered in the port MAC table
by the driver.

In addition, when mlx4_en does initialize the net device, it needs to set
the NET_ADDR_RANDOM flag in the netdev structure if the address was
randomly generated. This is done so that udev on the VM does not create
a new device name after each VF probe (VM boot and such). To accomplish this,
we add a per-port flag in mlx4_dev which gets set whenever mlx4_core replaces
a zero-mac with a randomly-generated mac. This flag is examined when mlx4_en
initializes the net-device.

Fix was suggested by Matan Barak <matanb@mellanox.com>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-14 19:14:44 -07:00
Eli Cohen e3297246c2 net/mlx5_core: Wait for FW readiness on startup
On device initialization, wait till firmware indicates that it is done
with initialization before proceeding to initialize the device.

Also update initialization segment layout to match driver/firmware
interface definitions.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-14 19:14:43 -07:00
Majd Dibbiny 89d44f0a6c net/mlx5_core: Add pci error handlers to mlx5_core driver
This patch implements the pci_error_handlers for mlx5_core, which allow the
driver to recover from PCI errors.

Once an error is detected in the PCI, the mlx5_pci_err_detected is called
and it:
1) Marks the device to be in 'Internal Error' state.
2) Dispatches an event to the mlx5_ib to flush all the outstanding cqes
with error.
3) Returns all the ongoing commands with error.
4) Unloads the driver.

Afterwards, the FW is reset and mlx5_pci_slot_reset is called and it
enables the device and restores its PCI state.

If the latter succeeds, mlx5_pci_resume is called, and it loads the SW
stack.
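
The three-stage recovery sequence (error detected, slot reset, resume) can be pictured with a small userspace state machine. Everything below is a hypothetical model: the enum, struct model_dev and the model_* functions are made up for illustration and do not use the kernel's PCI error handler types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device states for a userspace model of the recovery flow. */
enum dev_state { DEV_UP, DEV_INTERNAL_ERROR, DEV_RESET_DONE };

struct model_dev {
    enum dev_state state;
    bool loaded;        /* is the SW stack loaded? */
    bool pci_enabled;
};

/* Stage 1: error detected. Mark the device, fail outstanding work, unload. */
static void model_err_detected(struct model_dev *dev)
{
    assert(dev);
    dev->state = DEV_INTERNAL_ERROR;
    /* flushing CQEs and failing ongoing commands would happen here */
    dev->loaded = false;
    dev->pci_enabled = false;   /* model: unusable until the slot is reset */
}

/* Stage 2: after the FW reset. Re-enable the device, restore PCI state. */
static bool model_slot_reset(struct model_dev *dev)
{
    assert(dev);
    if (dev->state != DEV_INTERNAL_ERROR)
        return false;
    dev->pci_enabled = true;
    dev->state = DEV_RESET_DONE;
    return true;
}

/* Stage 3: only after a successful slot reset. Load the SW stack again. */
static void model_resume(struct model_dev *dev)
{
    assert(dev);
    if (dev->state != DEV_RESET_DONE)
        return;
    dev->loaded = true;
    dev->state = DEV_UP;
}
```

The point of the model is the ordering: resume does nothing unless the slot reset has completed, which mirrors the "if the latter succeeds" condition in the text.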

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-14 19:14:42 -07:00
Eli Cohen fd76ee4da5 net/mlx5_core: Fix internal error detection conditions
The detection of a fatal condition has been updated to take into account
the state reported by the device or by detecting an all ones read of the
firmware version which indicates that the device is not accessible.
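
A minimal sketch of the two conditions, assuming a hypothetical struct health_sample that holds what the health check has read from the device (the field and function names are not taken from the driver):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what the health check reads from the device. */
struct health_sample {
    bool device_reports_error;  /* state reported by the device */
    uint32_t fw_ver;            /* raw firmware version register */
};

/* An all-ones read suggests the device no longer answers on the bus. */
static bool in_fatal_state(const struct health_sample *s)
{
    assert(s);
    return s->device_reports_error || s->fw_ver == 0xffffffffu;
}
```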

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-14 19:14:41 -07:00
Christoph Hellwig adec640e03 mlx5: stop including <asm-generic/kmap_types.h>
<linux/highmem.h> is the place to get the kmap type flags; asm-generic
files are generic implementations only to be used by architecture code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-10-15 00:21:10 +02:00
Ido Schimmel bee1f753bf mlxsw: Fix bug in __mlxsw_item_bit_array_offset
When calculating the shift needed in order to access a bit array element
in a byte, we should multiply the index by the element size and not
assume it is fixed at 2-bits.
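
The corrected calculation can be sketched in plain C as below. The helper name bit_array_shift() and its standalone form are assumptions made for illustration; the real function also computes a byte offset, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/*
 * Shift of element `index` inside its byte, for elements that are
 * `element_size` bits wide (element_size must divide 8).
 */
static uint8_t bit_array_shift(uint16_t index, uint8_t element_size)
{
    uint8_t in_byte_index;

    assert(element_size != 0 && BITS_PER_BYTE % element_size == 0);
    in_byte_index = index % (BITS_PER_BYTE / element_size);
    return in_byte_index * element_size;    /* not in_byte_index * 2 */
}
```

With 4-bit elements, index 3 is the second element of its byte, so the shift is 4; a multiplier fixed at 2 bits would give 2 and select the wrong bits.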

Fixes: 93c1edb27f ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:08:09 -07:00
Elad Raz 4b0c2541cb mlxsw: switchx2: changing order of exit fallbacks
Fixes: 31557f0f97 ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support")
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:08:08 -07:00
Achiad Shochat c07543431e net/mlx5e: Disable VLAN filter in promiscuous mode
When the device was set to promiscuous mode, we didn't disable
VLAN filtering, which is wrong behaviour, fix that.

Now when the device is set to promiscuous mode RX packets
sent over any VLAN (or no VLAN tag at all) will be accepted.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:43:43 -07:00
Jiri Pirko 13b7938883 net/mlx5: Fix typo in mlx5_query_port_pvlc
We used the wrong register name for querying the PVLC register

Fixes: a124d13ef5 ('net/mlx5_core: Add more query port helpers')
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:43:43 -07:00
Carol L Soto 820d39f3c4 net/mlx4_core: Avoid failing the interrupts test
Test interrupts fails if not all completion vectors called
request_irq. This case happens if only mlx4_en is loaded and
we have more completion vectors than rx rings.

Fixes: c66fa19c40 ('net/mlx4: Add EQ pool')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:43:38 -07:00
Saeed Mahameed 95e196337a net/mlx4_core: Fix resource tracker error flow in add_res_range
The 'for' loop when undoing rb-tree insertions and list-adds in the
error flow in add_res_range had errors, fix them.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:52 -07:00
Jack Morgenstein a5b3c56ef7 net/mlx4_core: Fix mailbox leak in error flow when performing update qp
The procedure mlx4_update_qp leaks mailboxes in its error-flow, fix that.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:51 -07:00
Ido Shamay ba4b87aedd net/mlx4_en: Add steering rules after RSS creation
Changed the receive control flow in a way that steering
rules are added only when the RSS object is already in RTR/RTS mode.
Some optimization features, which are enabled by the device firmware,
require this condition in order to be effective.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:50 -07:00
Eli Cohen ac6ea6e81a net/mlx5_core: Use private health thread for each device
Use a single threaded work queue for each device in the system instead of
using one thread for all devices. This is required so we can concurrently
process system error handling for all the devices that need that.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:49 -07:00
Eli Cohen 0144a95e2a net/mlx5_core: Use accessor functions to read from device memory
Use ioread function to read health buffer data. In addition, print the
firmware version as a string for readability and also use dev_err so that
the device string is printed.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:48 -07:00
Eli Cohen 020446e01e net/mlx5_core: Prepare cmd interface to system errors handling
In preparation for handling system errors at the mlx5_core level, change the
interface of cmd_work_handler to accept a 64 bit argument for the vector.

This allows encoding a flag that signifies when the handler is called
as a result of driver logic that wishes to terminate commands that
the hardware may not be able to terminate. Such command completions
are detected at the handler and proper return status is encoded.

To be able to terminate page handler commands, we make sure to set
the corresponding bit in the bitmask.
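
One way to carry such a flag in a 64-bit argument is to keep the command bitmask in the low 32 bits and use a higher bit as the marker. The sketch below assumes bit 32 for the flag; the macro, struct and function names are hypothetical and not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: bit 32 marks a completion triggered by the driver. */
#define DRIVER_TRIGGERED_COMP (UINT64_C(1) << 32)

struct comp_request {
    uint32_t vector;        /* bitmask of command slots to complete */
    bool driver_triggered;  /* true when the driver forces termination */
};

static uint64_t encode_comp(uint32_t vector, bool driver_triggered)
{
    return (uint64_t)vector | (driver_triggered ? DRIVER_TRIGGERED_COMP : 0);
}

static void decode_comp(uint64_t vec, struct comp_request *req)
{
    assert(req);
    req->vector = (uint32_t)(vec & 0xffffffffu);
    req->driver_triggered = (vec & DRIVER_TRIGGERED_COMP) != 0;
}
```

A handler that decodes the argument this way can report a distinct status for commands the driver terminated itself, while the low 32 bits still say which command slots are affected.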

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:48 -07:00
Eli Cohen 5a7883989b net/mlx5_core: Improve mlx5 messages
Improve the messages printed by the mlx5 macros to include the device
string. In addition, prefix names used by the macros with two underscores
to avoid possible name collisions.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-09 07:27:46 -07:00
Carol L Soto 85121d6ee6 net/mlx4: Remove shared_ports variable at mlx4_enable_msi_x
If we get MAX_MSIX interrupts, we would like each receive ring to have
its own MSI-X interrupt line. The shared_ports variable in
mlx4_enable_msi_x is therefore not needed.

Fixes: 9293267a3e ('net/mlx4_core: Capping number of requested MSIXs to MAX_MSIX')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-08 05:20:24 -07:00
Arnd Bergmann bcb9db49bb mlxsw: fix warnings for big-endian 32-bit dma_addr_t
The recently added mlxsw driver produces warnings in ARM
allmodconfig:

drivers/net/ethernet/mellanox/mlxsw/pci.c: In function 'mlxsw_pci_cmd_exec':
drivers/net/ethernet/mellanox/mlxsw/pci.c:1585:59: warning: right shift count >= width of type [-Wshift-count-overflow]
linux/byteorder/big_endian.h:38:51: note: in definition of macro '__cpu_to_be32'
drivers/net/ethernet/mellanox/mlxsw/pci.c:76:2: note: in expansion of macro 'iowrite32be'

This uses upper_32_bits() to extract the bits while avoiding that warning.
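
The idea behind upper_32_bits() can be reproduced in userspace as follows. The macros below are local re-creations written for this sketch (shifting by 16 twice), and split_addr64()/split_addr32() are hypothetical helpers standing in for code that writes the two halves of a bus address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Two 16-bit shifts are valid even when the argument is only 32 bits
 * wide, whereas a single ">> 32" on a 32-bit type is undefined in C
 * and triggers -Wshift-count-overflow.
 */
#define UPPER_32_BITS(n) ((uint32_t)(((n) >> 16) >> 16))
#define LOWER_32_BITS(n) ((uint32_t)(n))

/* Split a 64-bit bus address into the two 32-bit halves. */
static void split_addr64(uint64_t addr, uint32_t *hi, uint32_t *lo)
{
    assert(hi && lo);
    *hi = UPPER_32_BITS(addr);
    *lo = LOWER_32_BITS(addr);
}

/* Same code for a 32-bit address type: the upper half is simply 0. */
static void split_addr32(uint32_t addr, uint32_t *hi, uint32_t *lo)
{
    assert(hi && lo);
    *hi = UPPER_32_BITS(addr);
    *lo = LOWER_32_BITS(addr);
}
```

The same source line therefore compiles cleanly whether the address type is 32 or 64 bits wide.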

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Fixes: eda6500a98 "mlxsw: Add PCI bus implementation"
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-08 05:04:30 -07:00
Jiri Pirko 1f86839874 switchdev: rename SWITCHDEV_ATTR_* enum values to SWITCHDEV_ATTR_ID_*
To be aligned with obj.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03 04:49:37 -07:00
David S. Miller f6d3125fa3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/dsa/slave.c

net/dsa/slave.c simply had overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-02 07:21:25 -07:00
Linus Torvalds 3deaa4f531 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

1) Fix regression in SKB partial checksum handling, from Pravin B
   Shelar.

2) Fix VLAN inside of VXLAN handling in i40e driver, from Jesse
   Brandeburg.

3) Cure softlockups during accept() in SCTP, from Karl Heiss.

4) MSG_PEEK should return multiple SKBs worth of data in AF_UNIX, from
   Aaron Conole.

5) IPV6 erroneously ignores output interface specifier in lookup key for
   route lookups, fix from David Ahern.

6) In Marvell DSA driver, forward unknown frames to CPU port, from
   Andrew Lunn.

7) Missing flow flag initializations in some code paths, from David
   Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Initialize flow flags in input path
  net: dsa: fix preparation of a port STP update
  testptp: Silence compiler warnings on ppc64
  net/mlx4: Handle return codes in mlx4_qp_attach_common
  dsa: mv88e6xxx: Enable forwarding for unknown to the CPU port
  skbuff: Fix skb checksum partial check.
  net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set
  net sysfs: Print link speed as signed integer
  bna: fix error handling
  af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
  af_unix: Convert the unix_sk macro to an inline function for type safety
  net: sctp: Don't use 64 kilobyte lookup table for four elements
  l2tp: protect tunnel->del_work by ref_count
  net/ibm/emac: bump version numbers for correct work with ethtool
  sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
  sctp: Whitespace fix
  i40e/i40evf: check for stopped admin queue
  i40e: fix VLAN inside VXLAN
  r8169: fix handling rtl_readphy result
  net: hisilicon: fix handling platform_get_irq result
2015-10-01 21:55:35 -04:00
Linus Torvalds 46c8217c4a Changes for 4.3-rc4
- Fixes for mlx5 related issues
 - Fixes for ipoib multicast handling

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 - Fixes for mlx5 related issues
 - Fixes for ipoib multicast handling

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/ipoib: increase the max mcast backlog queue
  IB/ipoib: Make sendonly multicast joins create the mcast group
  IB/ipoib: Expire sendonly multicast joins
  IB/mlx5: Remove pa_lkey usages
  IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY
  IB/iser: Add module parameter for always register memory
  xprtrdma: Replace global lkey with lkey local to PD
2015-10-01 16:38:52 -04:00
Robb Manes 23860f103b net/mlx4: Handle return codes in mlx4_qp_attach_common
Both new_steering_entry() and existing_steering_entry() return values
based on their success or failure, but currently they fall through
silently.  This can make troubleshooting difficult, as we were unable
to tell which one of these two functions returned errors or
specifically what code was returned.  This patch remedies that
situation by passing the return codes to err, which is returned by
mlx4_qp_attach_common() itself.

This also addresses a leak in the call to mlx4_bitmap_free() as well.

Signed-off-by: Robb Manes <rmanes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 21:14:01 -07:00
Eli Cohen 171bb2c560 net/mlx5_core: Update health syndromes
Update new health monitored syndromes and their descriptions.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:50 -07:00
Eli Cohen 78ccb25861 net/mlx5_core: Fix wrong name in struct
The name refers to syndrome, so use ext_synd instead of ext_sync.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:50 -07:00
Majd Dibbiny a31208b1e1 net/mlx5_core: New init and exit flow for mlx5_core
In the new flow, we separate the pci initialization and teardown from the
initialization and teardown of the other resources.

init_one calls mlx5_pci_init that handles the pci resources initialization.
It then calls mlx5_load_one to initialize the remainder of the resources.

When removing a device, remove_one is invoked. However, now remove_one
calls mlx5_unload_one to free all the resources except the pci resources.
When mlx5_unload_one returns, mlx5_pci_close is called to free the pci
resources.

The above separation will allow us to implement the pci error handlers and
suspend and resume callbacks.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:50 -07:00
Eli Cohen a8ffe63e60 net/mlx5_core: Fix notification of page supplement error
Some errors did not result in notifying firmware that the page request
could not be fulfilled. Fix this and put the notification logic into a
separate function.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:49 -07:00
Eli Cohen be87544de8 net/mlx5_core: Fix async commands return code
In case of async command completion, the error code returned should take
into account the command completion status.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:49 -07:00
Achiad Shochat 6c3dbd2d72 net/mlx5_core: Remove redundant "err" variable usage
Cosmetic change.
Do not use an err variable just to assign and return it.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:49 -07:00
Saeed Mahameed 97909302f9 net/mlx5_core: Fix struct type in the DESTROY_TIR/TIS device commands
Used the output mailbox format for input mailbox.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:49 -07:00
Achiad Shochat 343b29f308 net/mlx5e: Priv state flag not rolled-back upon netdev open error
The private mlx5 state flag that indicates that the netdev is
opened is set at the beginning of the netdev open flow.
In case an error occurred later in the mlx5 netdev open flow, this
flag was not cleared, remaining set although the netdev is actually
closed.
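
The rollback pattern can be sketched with a toy open routine. struct open_priv, its fail_setup test knob and the model_* functions are hypothetical; they only illustrate that a flag set at the start of the flow has to be cleared again on every error path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical private state for a userspace model of the open flow. */
struct open_priv {
    bool opened;        /* stands in for the driver's "opened" state flag */
    bool fail_setup;    /* test knob: make resource setup fail */
};

static int model_setup_resources(const struct open_priv *priv)
{
    return priv->fail_setup ? -1 : 0;
}

static int model_open(struct open_priv *priv)
{
    int err;

    assert(priv);
    priv->opened = true;        /* flag is set at the start of the flow */

    err = model_setup_resources(priv);
    if (err) {
        priv->opened = false;   /* roll the flag back on failure */
        return err;
    }
    return 0;
}
```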

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:19:49 -07:00
Linus Torvalds 518a7cb698 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
    instead of cloning them.  From Daniel Borkmann.

 2) When converting classical BPF into eBPF, fix the setting of the
    source reg to BPF_REG_X.  From Tycho Andersen.

 3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
    Linus Lussing.

 4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.

 5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
    Roopa Prabhu.

 6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.

 7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
    Florian Westphal.

 8) Check DMA mapping errors in bna driver, from Ivan Vecera.

 9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
    misclearing of interrupt status in TX timeout handler, etc.) from
    David Woodhouse.

10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
    Hugne.

11) Fix autobind races et al. in netlink code, from Herbert Xu with
    help from Tejun Heo and others.

12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.

13) Fix various races in timewait timer and reqsk_queue_hash_req, from
    Eric Dumazet.

14) Fix array overruns in mac80211, from Johannes Berg and Dan
    Carpenter.

15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.

16) Fix race between poll_one_napi and napi_disable, from Neil Horman.

17) Fix byte order in geneve tunnel port config, from John W Linville.

18) Fix handling of ARP replies over lightweight tunnels, from Jiri
    Benc.

19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
    Kok and Roopa Prabhu.

20) Several reference count handling bug fixes in the PHY/MDIO layer
    from Russell King.

21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.

22) Fix crash in icmp_route_lookup(), from David Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net: Fix panic in icmp_route_lookup
  net: update docbook comment for __mdiobus_register()
  ppp: fix lockdep splat in ppp_dev_uninit()
  net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
  phy: marvell: add link partner advertised modes
  net: fix net_device refcounting
  phy: add phy_device_remove()
  phy: fixed-phy: properly validate phy in fixed_phy_update_state()
  net: fix phy refcounting in a bunch of drivers
  of_mdio: fix MDIO phy device refcounting
  phy: add proper phy struct device refcounting
  phy: fix mdiobus module safety
  net: dsa: fix of_mdio_find_bus() device refcount leak
  phy: fix of_mdio_find_bus() device refcount leak
  ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
  ip6_gre: Reduce log level in ip6gre_err() to debug
  fib_rules: fix fib rule dumps across multiple skbs
  bnx2x: byte swap rss_key to comply to Toeplitz specs
  net: revert "net_sched: move tp->root allocation into fw_init()"
  lwtunnel: remove source and destination UDP port config option
  ...
2015-09-26 06:01:33 -04:00
Sagi Grimberg c6790aa9f4 IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY
Commit 96249d70dd ("IB/core: Guarantee that a local_dma_lkey
is available") allows ULPs that make use of the local dma key to keep
working as before by allocating a DMA MR with local permissions and
converted these consumers to use the MR associated with the PD
rather than device->local_dma_lkey.

ConnectIB has some known issues with memory registration
using the local_dma_lkey (SEND, RDMA, RECV seem to work OK).

Thus don't expose support for it (remove device->local_dma_lkey
setting), and take advantage of the above commit such that no regression
is introduced to working systems.

The local_dma_lkey support will be restored in CX4 depending on FW
capability query.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-25 10:46:51 -04:00
Eric Dumazet 4671fc6d47 net/mlx4_en: really allow to change RSS key
When changing rss key, we do not want to overwrite user provided key
by the one provided by netdev_rss_key_fill(), which is the host random
key generated at boot time.

Fixes: 947cbb0ac2 ("net/mlx4_en: Support for configurable RSS hash function")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eyal Perry <eyalpe@mellanox.com>
CC: Amir Vadai <amirv@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 21:03:59 -07:00
Thomas Gleixner dc2ec62f75 net/mlx4_en: Use access helper irq_data_get_affinity_mask()
This is a preparatory patch for moving irq_data struct members. Search
and replace was done with coccinelle

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Amir Vadai <amirv@mellanox.com>
2015-09-15 17:06:28 +02:00
Linus Torvalds 26d2177e97 Changes for 4.3
- Create drivers/staging/rdma
 - Move amso1100 driver to staging/rdma and schedule for deletion
 - Move ipath driver to staging/rdma and schedule for deletion
 - Add hfi1 driver to staging/rdma and set TODO for move to regular tree
 - Initial support for namespaces to be used on RDMA devices
 - Add RoCE GID table handling to the RDMA core caching code
 - Infrastructure to support handling of devices with differing
   read and write scatter gather capabilities
 - Various iSER updates
 - Kill off unsafe usage of global mr registrations
 - Update SRP driver
 - Misc. mlx4 driver updates
 - Support for the mr_alloc verb
 - Support for a netlink interface between kernel and user space cache
   daemon to speed path record queries and route resolution
 - Initial support for safe hot removal of verbs devices

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull infiniband/rdma updates from Doug Ledford:
 "This is a fairly sizeable set of changes.  I've put them through a
  decent amount of testing prior to sending the pull request due to
  that.

  There are still a few fixups that I know are coming, but I wanted to
  go ahead and get the big, sizable chunk into your hands sooner rather
  than waiting for those last few fixups.

  Of note is the fact that this creates what is intended to be a
  temporary area in the drivers/staging tree specifically for some
  cleanups and additions that are coming for the RDMA stack.  We
  deprecated two drivers (ipath and amso1100) and are waiting to hear
  back if we can deprecate another one (ehca).  We also put Intel's new
  hfi1 driver into this area because it needs to be refactored and a
  transfer library created out of the factored out code, and then it and
  the qib driver and the soft-roce driver should all be modified to use
  that library.

  I expect drivers/staging/rdma to be around for three or four kernel
  releases and then to go away as all of the work is completed and final
  deletions of deprecated drivers are done.

  Summary of changes for 4.3:

   - Create drivers/staging/rdma
   - Move amso1100 driver to staging/rdma and schedule for deletion
   - Move ipath driver to staging/rdma and schedule for deletion
   - Add hfi1 driver to staging/rdma and set TODO for move to regular
     tree
   - Initial support for namespaces to be used on RDMA devices
   - Add RoCE GID table handling to the RDMA core caching code
   - Infrastructure to support handling of devices with differing read
     and write scatter gather capabilities
   - Various iSER updates
   - Kill off unsafe usage of global mr registrations
   - Update SRP driver
   - Misc. mlx4 driver updates
   - Support for the mr_alloc verb
   - Support for a netlink interface between kernel and user space cache
     daemon to speed path record queries and route resolution
   - Initial support for safe hot removal of verbs devices"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (136 commits)
  IB/ipoib: Suppress warning for send only join failures
  IB/ipoib: Clean up send-only multicast joins
  IB/srp: Fix possible protection fault
  IB/core: Move SM class defines from ib_mad.h to ib_smi.h
  IB/core: Remove unnecessary defines from ib_mad.h
  IB/hfi1: Add PSM2 user space header to header_install
  IB/hfi1: Add CSRs for CONFIG_SDMA_VERBOSITY
  mlx5: Fix incorrect wc pkey_index assignment for GSI messages
  IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow
  IB/uverbs: reject invalid or unknown opcodes
  IB/cxgb4: Fix if statement in pick_local_ip6adddrs
  IB/sa: Fix rdma netlink message flags
  IB/ucma: HW Device hot-removal support
  IB/mlx4_ib: Disassociate support
  IB/uverbs: Enable device removal when there are active user space applications
  IB/uverbs: Explicitly pass ib_dev to uverbs commands
  IB/uverbs: Fix race between ib_uverbs_open and remove_one
  IB/uverbs: Fix reference counting usage of event files
  IB/core: Make ib_dealloc_pd return void
  IB/srp: Create an insecure all physical rkey only if needed
  ...
2015-09-09 08:33:31 -07:00
Moni Shoua 79857cd31f net/mlx4: Postpone the registration of net_device
The mlx4 network driver was registered in the context of the 'add'
function of the core driver (called when HW should be registered).
This causes the netdev event NETDEV_REGISTER to be sent in a context
where the answer to get_protocol_dev() callback returns NULL. This may
be confusing to listeners of netdev events.
This patch is a preparation for the patch that implements the
get_netdev() callback in the IB/mlx4 driver.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:20 -04:00
Sagi Grimberg a3c874200c mlx5: Fix missing device local_dma_lkey
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the device local_dma_lkey. This breaks
rpcrdma drivers.

Query and set this lkey when creating the device resources.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 22:54:46 -04:00
Carol L Soto b0f6446377 net/mlx4_core: Fix uninitialized variable used in error path
The uninitialized value name in mlx4_en_activate_cq was used in order
to print an error message. Fixing it by replacing it with cq->vector.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:40:27 -07:00
Carol L Soto 9293267a3e net/mlx4_core: Capping number of requested MSIXs to MAX_MSIX
We currently manage IRQs in pool_bm which is a bit field
of MAX_MSIX bits. Thus, allocating more than MAX_MSIX
interrupts can't be managed in pool_bm.
Fixing this by capping number of requested MSIXs to
MAX_MSIX.
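
As a rough illustration of why the request has to be capped, the sketch below tracks vectors in a single 64-bit bitmap. The value 64 for MODEL_MAX_MSIX and the names struct irq_pool, cap_msix_request() and pool_alloc() are assumptions for this example only.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_MSIX 64   /* hypothetical: one bit per vector in the pool */

struct irq_pool {
    uint64_t bitmap;        /* bit i set => vector i is in use */
};

/* Never ask for more vectors than the bitmap can track. */
static int cap_msix_request(int wanted)
{
    assert(wanted >= 0);
    return wanted > MODEL_MAX_MSIX ? MODEL_MAX_MSIX : wanted;
}

/* Take the first free vector, or return -1 when the pool is exhausted. */
static int pool_alloc(struct irq_pool *pool)
{
    int i;

    assert(pool);
    for (i = 0; i < MODEL_MAX_MSIX; i++) {
        if (!(pool->bitmap & (UINT64_C(1) << i))) {
            pool->bitmap |= UINT64_C(1) << i;
            return i;
        }
    }
    return -1;
}
```

Any vector beyond the bitmap width has no bit to represent it, so capping the request up front keeps the number of interrupts consistent with what the pool can manage.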

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:40:26 -07:00
Ido Schimmel 1e81779ae4 mlxsw: Make mailboxes 4KB aligned
The HW-SW contract requires mailboxes passed to the firmware to be 4KB
aligned. Previously, these mailboxes were mapped using streaming DMA
routines, which do not guarantee the bus addresses to be 4KB aligned.
Under certain conditions this constraint was indeed violated and errors
were observed.

By using consistent DMA mapping routines together with a mailbox size of
4KB we are guaranteed not to violate the constraint.
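
The alignment constraint itself is easy to express. In the sketch below, mbox_addr_ok() is the check, and model_streaming_map() is a deliberately simplified, hypothetical model of a streaming mapping in which the bus address inherits the buffer's offset inside its page; real DMA mapping behaviour depends on the platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MBOX_ALIGN 4096u    /* 4KB, as required by the HW-SW contract */

static bool mbox_addr_ok(uint64_t bus_addr)
{
    return (bus_addr & (MBOX_ALIGN - 1)) == 0;
}

/*
 * Simplified model of a streaming mapping: the bus address keeps whatever
 * offset the buffer has inside its page, so alignment is a matter of luck.
 */
static uint64_t model_streaming_map(uint64_t page_bus_addr, uint32_t offset_in_page)
{
    assert(mbox_addr_ok(page_bus_addr) && offset_in_page < MBOX_ALIGN);
    return page_bus_addr + offset_in_page;
}
```

Under this model a buffer that starts 0x40 bytes into a page fails the check, which is the kind of violation the commit describes.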

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:31:17 -07:00
Jiri Pirko 262df6919e mlxsw: adjust transmit fail log message level in __mlxsw_emad_transmit
When transmit fails, it is an error, not a warning.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:31:17 -07:00
Ido Schimmel a585dabd96 mlxsw: Remove duplicate included header
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:31:17 -07:00
Rana Shahout 5283af899a net/mlx5e: Avoid accessing NULL pointer at ndo_select_queue
To avoid multiply/division operations on the data path,
we hold a {channel, tc}==>txq mapping table.
We held this mapping table inside the channel object that is
being destroyed upon some configuration operations (e.g MTU change).
So in case ndo_select_queue occurs during such a configuration operation,
it may access a NULL channel pointer, resulting in kernel panic.
To fix this issue we moved the {channel, tc}==>txq mapping table
outside the channel object so that it will be available also
during such configuration operations.
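
A minimal sketch of such a mapping table, kept in a long-lived private structure rather than in a channel object. struct txq_priv, the size limits and the txq layout (channel + tc * num_channels) are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_CHANNELS 8    /* hypothetical limits for the sketch */
#define MODEL_MAX_TC       8

/* Lives in the long-lived private struct, not in a channel object. */
struct txq_priv {
    int num_channels;
    int num_tc;
    uint16_t channeltc_to_txq[MODEL_MAX_CHANNELS][MODEL_MAX_TC];
};

/* Fill the table once, at configuration time (assumed txq layout). */
static void build_txq_map(struct txq_priv *priv)
{
    int ch, tc;

    assert(priv->num_channels <= MODEL_MAX_CHANNELS);
    assert(priv->num_tc <= MODEL_MAX_TC);
    for (ch = 0; ch < priv->num_channels; ch++)
        for (tc = 0; tc < priv->num_tc; tc++)
            priv->channeltc_to_txq[ch][tc] = ch + tc * priv->num_channels;
}

/* Data path: a plain table lookup, no multiply or divide. */
static uint16_t select_txq(const struct txq_priv *priv, int ch, int tc)
{
    return priv->channeltc_to_txq[ch][tc];
}
```

Since the table does not belong to any channel, a lookup stays valid while channels are being torn down and re-created.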

Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 13:45:09 -07:00
Julia Lawall 5c12197939 mlxsw: fix error return code
Return a negative error code on failure.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
identifier ret; expression e1,e2;
@@
(
if (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}
// </smpl>
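
In C terms, the pattern this semantic match looks for is an error path that returns a variable still holding 0. The functions below are a hypothetical illustration, not the mlxsw code that was fixed; model_alloc() is a stand-in allocator that can be told to fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator stand-in; returns NULL when asked to fail. */
static void *model_alloc(int fail)
{
    static int slot;

    return fail ? NULL : &slot;
}

/* Buggy shape: `ret` is still 0 when the allocation fails. */
static int init_buggy(int fail)
{
    int ret = 0;
    void *p = model_alloc(fail);

    if (!p)
        goto out;       /* bug: reports success on failure */
    /* ... use p ... */
out:
    return ret;
}

/* Fixed shape: assign a negative errno value before leaving. */
static int init_fixed(int fail)
{
    int ret = 0;
    void *p = model_alloc(fail);

    if (!p) {
        ret = -ENOMEM;
        goto out;
    }
    /* ... use p ... */
out:
    assert(ret <= 0);   /* convention: 0 or a negative error code */
    return ret;
}
```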

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 13:37:31 -07:00
David S. Miller ecf842f65c mlx5e: Fix sparse warnings in mlx5e_handle_csum().
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: sparse: incorrect type in argument 1 (different base types)
   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44:    expected restricted __sum16 [usertype] n
   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44:    got restricted __be16 [usertype] check_sum

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 21:22:26 -07:00
Achiad Shochat bbceefce9a net/mlx5e: Support RX CHECKSUM_COMPLETE
Only for packets with first ethertype set to IPv4/6 for now.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:36 -07:00
Achiad Shochat 3c2d18ef22 net/mlx5e: Support ethtool get/set_pauseparam
Only rx/tx pause settings.
Autoneg setting is currently not supported.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:36 -07:00
Achiad Shochat 6fa1bcab6b net/mlx5e: Ethtool link speed setting fixes
- Port speed settings are applied by the device only upon
  port admin status transition from DOWN to UP.
  So we enforce this transition regardless of the port's
  current operation state (which may be occasionally DOWN if
  for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status)
  register set/query access functions.
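
The first item can be pictured with a toy model, assuming (as stated
above) that the device latches a new speed only on an admin DOWN to UP
transition. The struct, enum and function names are hypothetical and do
not reflect the device interface.

```c
enum admin_status { ADMIN_DOWN, ADMIN_UP };   /* hypothetical names */

struct port {
    enum admin_status admin;
    unsigned requested_speed;
    unsigned applied_speed;
};

/* Toy device behaviour: the speed is applied only on DOWN -> UP. */
static void port_set_admin(struct port *p, enum admin_status s)
{
    if (p->admin == ADMIN_DOWN && s == ADMIN_UP)
        p->applied_speed = p->requested_speed;
    p->admin = s;
}

/* Force a DOWN -> UP transition so the new speed takes effect,
 * whatever the current state of the port is. */
static void port_set_speed(struct port *p, unsigned speed)
{
    p->requested_speed = speed;
    port_set_admin(p, ADMIN_DOWN);
    port_set_admin(p, ADMIN_UP);
}
```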

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat d9a40271cf net/mlx5e: HW LRO changes/fixes
- Change the maximum LRO session size from 16KB to 64KB
- Reduce the LRO session timeout from 512us to 32us in
  order to reduce the TCP latency of non-LRO'ed flows.
- Fix skb_shinfo(skb)->gso_size and set skb_shinfo(skb)->gso_type.
- Fix a bug accessing un-initialized mdev pointer.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat e842b1001d net/mlx5e: Support smaller RX/TX ring sizes
We unintentionally limited the minimum ring size too much.

TX minimum ring size reduced from 128 to 64.
RX minimum ring size reduced from 128 to 2.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat 2d75b2bc8a net/mlx5e: Add ethtool RSS configuration options
- get_rxfh_key_size
- get_rxfh_indir_size
- get/set_rxfh indirection table and RSS Toeplitz hash key
- get_rxnfc

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat 936896e908 net/mlx5e: Make RSS indirection table size a constant
The indirection table size was defined by a variable that
was actually assigned a constant value.
Since we do not have any foreseen intention to make it configurable
we simply made it a constant.

We also limit the number of channels such that the RSS indirection
table could always populate all RX rings.
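
A rough sketch of why the channel count is capped, assuming a round-robin
fill of the table. INDIR_SIZE and the helper names are made up for this
example and are not the driver's values.

```c
#define INDIR_SIZE 128   /* hypothetical constant table size */

/* Cap the channel count so every RX ring can get at least one entry. */
static unsigned clamp_channels(unsigned requested)
{
    return requested < INDIR_SIZE ? requested : INDIR_SIZE;
}

/* Spread the rings over the table round-robin. Because
 * num_channels <= INDIR_SIZE, each ring index appears at least once. */
static void fill_indir(unsigned *table, unsigned num_channels)
{
    for (unsigned i = 0; i < INDIR_SIZE; i++)
        table[i] = i % num_channels;
}
```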

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat 57afead544 net/mlx5e: Have a single RSS Toeplitz hash key
No need to generate a unique key per TIR.
Generating a single key per netdev and copying it to all
its TIRs.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
David S. Miller 182ad468e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cavium/Kconfig

The cavium conflict was overlapping dependency
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:23:11 -07:00
Fabio Estevam ed8db18dea mellanox: mlxsw: Use '%zx' to print size_t format
Use '%zx' to print size_t format in order to fix the following build warning:

drivers/net/ethernet/mellanox/mlxsw/item.h:65:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'size_t' [-Wformat=]
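
As a standalone userspace illustration (format_size is a made-up helper,
not code from item.h), the 'z' length modifier matches size_t on every
architecture, whereas '%lx' is only correct where size_t happens to be
unsigned long:

```c
#include <stdio.h>
#include <stddef.h>

/* Print a size_t in hex using the portable 'z' length modifier. */
static int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "0x%zx", value);
}
```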

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 21:12:49 -07:00
Ido Schimmel e577516b9d mlxsw: Fix use-after-free bug in mlxsw_sx_port_xmit
Store the length of the skb before transmitting it and use it for stats
instead of skb->len, since skb might have been freed already.
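
The pattern can be sketched in userspace C as follows. struct pkt stands
in for the skb, and hw_transmit is a hypothetical hook that takes
ownership of the packet and may free it; neither is the driver's code.

```c
#include <stdlib.h>

struct pkt { size_t len; };   /* stand-in for struct sk_buff */
struct stats { unsigned long tx_packets, tx_bytes; };

/* Hypothetical transmit hook: takes ownership and frees the packet. */
static int hw_transmit(struct pkt *p)
{
    free(p);
    return 0;
}

static int port_xmit(struct pkt *p, struct stats *st)
{
    size_t len = p->len;        /* read the length while p is valid */
    int err = hw_transmit(p);   /* p must not be touched after this */

    if (!err) {
        st->tx_packets++;
        st->tx_bytes += len;    /* not p->len: p may be freed already */
    }
    return err;
}
```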

This issue was discovered using the Kernel Address sanitizer (KASan).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Ido Schimmel 3bfcd34764 mlxsw: Use correct skb length when dumping payload
Do not use the length of the transmitted skb (which was freed), but
that of the response skb.

This issue was discovered using the Kernel Address sanitizer (KASan).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Ido Schimmel d003462a50 mlxsw: Simplify mlxsw_sx_port_xmit function
Previously we only checked if the transmission queue is not full in the
middle of the xmit function. This led to complex logic due to the fact
that sometimes we need to reallocate the headroom for our Tx header.

Allow the switch driver to know if the transmission queue is not full
before sending the packet and remove this complex logic.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Jiri Pirko 7b7b9cff74 mlxsw: Strip FCS from incoming packets
FCS of incoming packets is already checked by HW. Just strip it out.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Jiri Pirko 74ed207e2a mlxsw: Make pci module dependent on HAS_DMA and HAS_IOMEM
This resolves compile errors on um-allyesconfig.

Note that there are many other drivers which have the same issue.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:09 -07:00
Ido Schimmel e61011b5e0 mlxsw: Make system port to local port mapping explicit
System ports are unique identifiers in a multi-ASIC environment that
represent all the available ports in the system. Local ports on the
other hand, are unique only within the local ASIC.

Since system port to local port mapping is not part of the HW-SW
contract and since only single-ASIC configurations are currently
supported, set an explicit 1:1 mapping by configuring the Switch System
Port Record (SSPR) register.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:09 -07:00
Ido Schimmel 26a80f6e54 mlxsw: Call free_netdev when removing port
When removing a port's netdevice we should also free the memory
allocated by alloc_etherdev(). Do this by calling free_netdev() at the
end of the teardown sequence.

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:09 -07:00
Carol L Soto fe1e1876d8 net/mlx5_core: Set log_uar_page_sz for non 4K page size architecture
The driver failed to configure the page size for architectures with a
page size other than 4K.

Fixes: 938fe83 ("net/mlx5_core: New device capabilities handling")
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 15:55:48 -07:00
Gal Pressman efea389d3c net/mlx5_core: Support physical port counters
Added physical port counters in the following standard formats to
ethtool statistics:
  - IEEE 802.3
  - RFC2863
  - RFC2819

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:59 -07:00
Achiad Shochat 9b37b07fcb net/mlx5e: Take advantage of the light-weight netdev open/stop
Now that TIRs, TISs and flow tables are kept alive while the netdev is
stopped (after executing ndo_stop()) we can do the following
improvements:

- Obsolete the active_vlans SW shadow.
- Do not delete/add flow table rules upon ndo_stop/open.
  In addition to simplifying the flow, this change also speeds up
  the ndo_open/close operations.
- Obsolete synchronization of threads accessing the flow tables
  with the netdev stop/open threads.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:59 -07:00
Achiad Shochat 1cefa326ff net/mlx5e: Disable async events before unregister_netdev()
It does not make sense to allow events while the netdev is
unregistered.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 40ab6a6ebe net/mlx5e: Rename/move functions following the ndo_stop flow change
Rename some functions that used to be invoked upon ndo_open/stop and
are now invoked upon create/destroy_netdev() in order to better hint
their place in the flow.

Change the location of some functions in the file so that functions involved
in ndo_open/stop flow will not be interleaved with other functions.

This is a cosmetic change, no logical change here.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 5c50368f38 net/mlx5e: Light-weight netdev open/stop
Create/destroy TIRs, TISs and flow tables upon PCI probe/remove rather
than upon the netdev ndo_open/stop.

Upon ndo_stop(), redirect all RX traffic to the (lately introduced)
"Drop RQ" and then close only the RX/TX rings, leaving the TIRs,
TISs and flow tables alive.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat d9eea403ca net/mlx5_core: Introduce access function to modify RSS/LRO params
To be used by the mlx5 Eth driver in following commit.

This is in preparation for netdev "light-weight" open/stop flow
change described in previous commit.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 50cfa25aba net/mlx5e: Introduce the "Drop RQ"
RX traffic routed to this RQ will be silently dropped, at the NIC HW
level.

This is in preparation for netdev "light-weight" open/stop flow
change described in previous commit.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 4cbeaff54f net/mlx5e: Unify the RX flow
Generally an RX packet flows through the following objects:
Flow table --> TIR --> RQT --> RQ

Where:
- TIR stands for "Transport Interface Receive", defining the RSS and
  LRO parameters.
- RQT stands for "RQ Table", implementing the RSS indirection table.
- RQ stands for "Receive Queue"

For flows that need neither LRO nor RSS, the driver made a shortcut to
the above RX flow by pointing to the RQ directly from the TIR, yielding
this flow:
Flow table --> TIR --> RQ

In this commit we remove this shortcut by "inserting" a single-RQ RQT
between the TIR and the RQ, i.e. RX packets will reach the same RQ but
will go through an RQT of size 1, pointing to just a single RQ.

This way the RX traffic re-direction to/from the "Drop RQ" will be more
uniform (AKA "one flow"), as it will involve only RQTs re-direction and
no TIRs re-direction.
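
A simplified model of the unified flow, with invented struct layouts and
names: every TIR resolves its RQ through an RQT, a non-RSS flow simply
uses an RQT of size 1, and redirecting traffic means rewriting RQT
entries only.

```c
#include <stddef.h>

#define RQT_MAX 8   /* hypothetical capacity */

struct rqt {
    size_t size;            /* 1 for flows without RSS */
    unsigned rqn[RQT_MAX];  /* RQ numbers */
};

struct tir {
    struct rqt *rqt;        /* every TIR points at an RQT, never an RQ */
};

/* Resolve the RQ for a packet; with size == 1 the hash is irrelevant. */
static unsigned tir_pick_rq(const struct tir *t, unsigned hash)
{
    return t->rqt->rqn[hash % t->rqt->size];
}

/* Redirect all traffic of an RQT (e.g. to a drop RQ) without
 * modifying any TIR. */
static void rqt_redirect(struct rqt *r, unsigned rqn)
{
    for (size_t i = 0; i < r->size; i++)
        r->rqn[i] = rqn;
}
```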

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
David S. Miller 5510b3c2a1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	arch/s390/net/bpf_jit_comp.c
	drivers/net/ethernet/ti/netcp_ethss.c
	net/bridge/br_multicast.c
	net/ipv4/ip_fragment.c

All four conflicts were cases of simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-31 23:52:20 -07:00
Jiri Pirko 31557f0f97 mlxsw: Introduce Mellanox SwitchX-2 ASIC support
Benefit from the previously introduced Mellanox Switch infrastructure and
add a driver for the SwitchX-2 ASIC. Note that this driver is very simple
now. It implements the bare minimum for getting the device to work on the
slow path.
Fast-path offload functionality is going to be added soon.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Reviewed-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-30 00:05:00 -07:00
Ido Schimmel 4ec14b7634 mlxsw: Add interface to access registers and process events
Ethernet Management Datagrams (EMADs) are Ethernet packets sent between
the host and the device in order to configure the available device registers.
Another use case is notifications sent from the device to the host,
letting it know about certain events, such as port up / down.

Add the ability to construct EMADs with provisions to construct and
parse the registers' payloads. Implement EMAD transaction layer
which is responsible for the reliable transmission of EMADs. Also, add
an infrastructure used by the switch driver to register for particular
events generated by the device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Reviewed-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-30 00:05:00 -07:00
Jiri Pirko eda6500a98 mlxsw: Add PCI bus implementation
Add PCI bus implementation for Mellanox Technologies Switch ASICs. This
includes firmware initialization, async queues manipulation and command
interface implementation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Reviewed-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-30 00:04:59 -07:00
Jiri Pirko 93c1edb27f mlxsw: Introduce Mellanox switch driver core
Add core components of Mellanox switch driver infrastructure.
Core infrastructure is designed so that it can be used by multiple
bus drivers (PCI now, I2C and SGMII are planned to be implemented
in the future). Multiple switch kind drivers can be registered as well.
This core serves as a glue between buses and drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Reviewed-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-30 00:04:59 -07:00
Achiad Shochat 98e81b0ad6 net/mlx5e: Remove the mlx5e_update_priv_params() function
It was used to update netdev priv parameters that require stopping
and re-opening the device in a generic way - it got the new
parameters and did: ndo_stop(), copy new parameters into current
parameters, ndo_open().

We chose to remove it for two reasons:
1) It requires additional instance of struct mlx5e_params on the
   stack and looking forward we expect this struct to grow.
2) Sometimes we want to do additional operations (besides
   just updating the priv parameters) while the netdev is stopped.
   For example, updating netdev->mtu @mlx5e_change_mtu() should
   be done while the netdev is stopped (done in this commit).

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:47 -07:00
Achiad Shochat 1fc22739a8 net/mlx5e: Introduce create/destroy RSS indir table access functions
Introduce access functions to create/destroy RSS indirection table
and use it in the Ethernet driver.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:47 -07:00
Achiad Shochat 1f2a30037b net/mlx5e: Do not use netdev_err() before the netdev is registered
Since it is un-named at this time.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:46 -07:00
Achiad Shochat 97de9f310a net/mlx5e: Avoid redundant de-reference
Use the already defined rq pointer directly.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:46 -07:00
Achiad Shochat 28abbfddf4 net/mlx5e: Remove redundant assignment of sq->user_index
It is not needed by the mlx5 Eth driver since it has a CQ per RQ/SQ.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:46 -07:00
Achiad Shochat a4418a6c36 net/mlx5e: Remove redundant field mlx5e_priv->num_tc
This field already exists under the mlx5e_params struct

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:46 -07:00
Achiad Shochat 68cdf5d6e9 net/mlx5e: Use hard-coded 4K page size for RQ/SQ/CQ
The page size of the device's RQ/SQ/CQ objects is defined in 4K
units regardless of the system page size.
Thus using Linux's PAGE_SHIFT macro yields wrong device
configuration in systems where PAGE_SHIFT!=12.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:46 -07:00
Haggai Abramonvsky c928ed5517 net/mlx5_core: Check the return value of mlx5_command_exec()
mlx5_cmd_exec() might fail - need to check return value.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29 23:04:46 -07:00