Commit Graph

281 Commits

Author SHA1 Message Date
Petr Machata 47259544e0 mlxsw: spectrum: Add lowest_shaper_bs to struct mlxsw_sp
Lower limit of burst size configuration is dependent on system type. Add a
datum to track the value. Initialize as appropriate in mlxsw_spX_init().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Jiri Pirko 3a3e627ce0 spectrum: Add a delayed work to update SPAN buffsize according to speed
When PUDE event is handled and the link is up, update the port SPAN
buffer size according to the current speed.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20 13:25:46 +01:00
Jiri Pirko ff9fdfec5f mlxsw: spectrum: Fix SPAN egress mirroring buffer size for Spectrum-2
For SPAN egress mirroring buffer size, it is needed to use a different
formula for Spectrum and Spectrum-2. Move the buffer size computation to
ops and implement new formula for Spectrum-2.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20 13:25:46 +01:00
Jiri Pirko ac9cc4e250 mlxsw: spectrum: Push code getting port speed into a helper
Currently PTP code queries directly PTYS register for port speed from
work scheduled upon PUDE event. Since the speed needs to be used for
SPAN buffer size computation as well, push the code into a separate
helper.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20 13:25:45 +01:00
Petr Machata 19f405b988 mlxsw: spectrum_qdisc: Support offloading of ETS Qdisc
Handle TC_SETUP_QDISC_ETS, add a new ops structure for the ETS Qdisc.
Invoke the extended prio handlers implemented in the previous patch. For
stats ops, invoke directly the prio callbacks, which are not sensitive to
differences between PRIO and ETS.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18 13:32:30 -08:00
Jiri Pirko 013da29791 mlxsw: spectrum: Use port_module_max_width to compute base port index
Instead of using constant value, use port_module_max_width which is
aligned with the cluster size.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-31 10:54:47 -07:00
Jiri Pirko 49185277cc mlxsw: spectrum: Remember split base local port and use it in unsplit
Don't compute the original base local port during unsplit, rather
remember it in mlxsw_sp_port structure during split port creation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-31 10:54:47 -07:00
Jiri Pirko 4a7f970f12 mlxsw: spectrum: Replace port_to_module array with array of structs
Store the initial PMLP register configuration into array of structures
instead of just simple array of module numbers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-31 10:54:47 -07:00
Danielle Ratson 868678c574 mlxsw: spectrum: Register switched port analyzers (SPAN) as resource
The switch supports an enhanced switched port analyzer that enables
selecting network traffic for analysis by a network analyzer.

SPAN agents are configured and consumed whenever a tc filter is added
with a mirror action to a new destination. The destination can either be
a physical port (e.g., swp1), a VLAN device or a gretap.

Expose the maximum number of SPAN agents and their current usage to the
user.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-18 10:05:37 -07:00
Jiri Pirko 5bcfb6a45a mlxsw: Propagate extack down to register_fib_notifier()
During the devlink reaload the extack is present, so propagate it all
the way down to register_fib_notifier() call in spectrum_router.c.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04 11:10:56 -07:00
Jiri Pirko 053e92aa3c mlxsw: spectrum: Take devlink net instead of init_net
Follow-up patch is going to allow to reload devlink instance into
different network namespace, so use devlink_net() helper instead
of init_net.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04 11:10:56 -07:00
Petr Machata dc4f3eb08a mlxsw: spectrum_ptp: Add counters for GC events
On Spectrum-1, timestamped PTP packets and the corresponding timestamps need to
be kept in caches until both are available, at which point they are matched up
and packets forwarded as appropriate. However, not all packets will ever see
their timestamp, and not all timestamps will ever see their packet. It is
necessary to dispose of such abandoned entries, so a garbage collector was
introduced in commit 5d23e41597 ("mlxsw: spectrum: PTP: Garbage-collect
unmatched entries").

If these GC events happen often, it is a sign of a problem. However because this
whole mechanism is taking place behind the scenes, there is no direct way to
determine whether garbage collection took place.

Therefore to fix this, on Spectrum-1 only, expose four artificial ethtool
counters for the GC events: GCd timestamps and packets, in TX and RX directions.

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28 18:24:04 -07:00
Shalom Toledo 3f61967f41 mlxsw: spectrum: Prevent auto negotiation on number of lanes
After 50G-1-lane and 100G-2-lanes link modes were introduced, the driver
is facing situations in which the hardware auto negotiates not only on
speed and type, but also on number of lanes.

Prevent auto negotiation on number of lanes by allowing only port speeds
that can be supported on a given port according to its width.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28 18:24:04 -07:00
Ido Schimmel b5ce611fd9 mlxsw: spectrum: Add devlink-trap support
Register supported packet traps (layer 2 drops only, currently) and
associated trap group with devlink during driver initialization.

The amount of traffic generated by these packet drop traps is capped at
10Kpps to ensure the CPU is not overwhelmed by incoming packets.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-21 12:58:39 -07:00
David S. Miller 13dfb3fa49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Just minor overlapping changes in the conflicts here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 18:44:57 -07:00
Petr Machata c6b36bdd04 mlxsw: spectrum_ptp: Increase parsing depth when PTP is enabled
Spectrum systems have a configurable limit on how far into the packet they
parse. By default, the limit is 96 bytes.

An IPv6 PTP packet is layered as Ethernet/IPv6/UDP (14+40+8 bytes), and
sequence ID of a PTP event is only available 32 bytes into payload, for a
total of 94 bytes. When an additional 802.1q header is present as
well (such as when ptp4l is running on a VLAN port), the parsing limit is
exceeded. Such packets are not recognized as PTP, and are not timestamped.

Therefore generalize the current VXLAN-specific parsing depth setting to
allow reference-counted requests from other modules as well. Keep it in the
VXLAN module, because the MPRS register also configures UDP destination
port number used for VXLAN, and is thus closely tied to the VXLAN code
anyway.

Then invoke the new interfaces from both VXLAN (in obvious places), as well
as from PTP code, when the (global) timestamping configuration changes from
disabled to enabled or vice versa.

Fixes: 8748642751 ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-29 13:55:05 -07:00
Jiri Pirko c9588e2812 mlxsw: spectrum_acl: Track rules that forbid egress block bind
Some matches and actions are not supported on egress. Track such rules
and forbid a bind of block which contains them to egress.

With this patch, the kernel tells the user he cannot do that:
$ tc qdisc add dev ens16np1 ingress_block 22 clsact
$ tc filter add block 22 protocol 802.1q pref 2 handle 101 flower vlan_id 100 skip_sw action pass
$ tc qdisc add dev ens16np2 egress_block 22 clsact
Error: mlxsw_spectrum: Block cannot be bound to egress because it contains unsupported rules.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-27 14:32:31 -07:00
Ido Schimmel 577fa14d21 mlxsw: spectrum: Do not process learned records with a dummy FID
The switch periodically sends notifications about learned FDB entries.
Among other things, the notification includes the FID (Filtering
Identifier) and the port on which the MAC was learned.

In case the driver does not have the FID defined on the relevant port,
the following error will be periodically generated:

mlxsw_spectrum2 0000:06:00.0 swp32: Failed to find a matching {Port, VID} following FDB notification

This is not supposed to happen under normal conditions, but can happen
if an ingress tc filter with a redirect action is installed on a bridged
port. The redirect action will cause the packet's FID to be changed to
the dummy FID and a learning notification will be emitted with this FID
- which is not defined on the bridged port.

Fix this by having the driver ignore learning notifications generated
with the dummy FID and delete them from the device.

Another option is to chain an ignore action after the redirect action
which will cause the device to disable learning, but this means that we
need to consume another action whenever a redirect action is used. In
addition, the scenario described above is merely a corner case.

Fixes: cedbb8b259 ("mlxsw: spectrum_flower: Set dummy FID before forward action")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Alex Kushnarov <alexanderk@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-17 15:19:46 -07:00
Pablo Neira Ayuso f9e30088d2 net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
And any other existing fields in this structure that refer to tc.
Specifically:

* tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule().
* TC_CLSFLOWER_* to FLOW_CLS_*.
* tc_cls_common_offload to tc_cls_common_offload.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09 14:38:51 -07:00
Shalom Toledo 5fc1733897 mlxsw: spectrum: Set up PTP shaper when port status has changed
When getting port up down event (PUDE), change the PTP shaper
configuration based on hardware time stamping on/off and the port's
speed.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 15:28:57 -07:00
Shalom Toledo 4ae5cc42d3 mlxsw: spectrum: Add new operation for getting the port's speed
New operation for getting the port's speed as part of port-type-speed
operations.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 15:28:57 -07:00
Petr Machata d92e4e6e33 mlxsw: spectrum: PTP: Support timestamping on Spectrum-1
On Spectrum-1, timestamps arrive through a pair of dedicated events:
MLXSW_TRAP_ID_PTP_ING_FIFO and _EGR_FIFO. The payload delivered with
those traps is contents of the timestamp FIFO at a given port in a given
direction. Add a Spectrum-1-specific handler for these two events which
decodes the timestamps and forwards them to the PTP module.

Add a function that parses a packet, dispatching to ptp_classify_raw(),
and decodes PTP message type, domain number, and sequence ID. Add a new
mlxsw dependency on the PTP classifier.

Add helpers that can store and retrieve unmatched timestamps and SKBs to
the hash table added in a preceding patch.

Add the matching code itself: upon arrival of a timestamp or a packet,
look up the corresponding unmatched entry, and match it up. If there is
none, add a new unmatched entry. This logic is the same on ingress as on
egress.

Packets and timestamps that never matched need to be eventually disposed
of. A garbage collector added in a follow-up patch will take care of
that. Since currently all this code is turned off, no crud will
accumulate in the hash table.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 18:58:34 -07:00
Petr Machata 810256cec1 mlxsw: spectrum: PTP: Add PTP initialization / finalization
Add two ptp_ops: init and fini, to initialize and finalize the PTP
subsystem. Call as appropriate from mlxsw_sp_init() and _fini().

Lay the groundwork for Spectrum-1 support. On Spectrum-1, the received
timestamped packets and their corresponding timestamps arrive
independently, and need to be matched up. Introduce the related data types
and add to struct mlxsw_sp_ptp_state the hash table that will keep the
unmatched entries.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 18:58:34 -07:00
Petr Machata aed4b57211 mlxsw: spectrum: PTP: Hook into packet receive path
When configured, the Spectrum hardware can recognize PTP packets and
trap them to the CPU using dedicated traps, PTP0 and PTP1.

One reason to get PTP packets under dedicated traps is to have a
separate policer suitable for the amount of PTP traffic expected when
switch is operated as a boundary clock. For this, add two new trap
groups, MLXSW_REG_HTGT_TRAP_GROUP_SP_PTP0 and _PTP1, and associate the
two PTP traps with these two groups.

In the driver, specifically for Spectrum-1, event PTP packets will need
to be paired up with their timestamps. Those arrive through a different
set of traps, added later in the patch set. To support this future use,
introduce a new PTP op, ptp_receive.

It is possible to configure which PTP messages should be trapped under
which PTP trap. On Spectrum systems, we will use PTP0 for event
packets (which need timestamping), and PTP1 for control packets (which
do not). Thus configure PTP0 trap with a custom callback that defers to
the ptp_receive op.

Additionally, L2 PTP packets are actually trapped through the LLDP trap,
not through any of the PTP traps. So treat the LLDP trap the same way as
the PTP0 trap. Unlike PTP traps, which are currently still disabled,
LLDP trap is active. Correspondingly, have all the implementations of
the ptp_receive op return true, which the handler treats as a signal to
forward the packet immediately.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 18:58:34 -07:00
Petr Machata dadbc6bc09 mlxsw: spectrum: Add support for traps specific to Spectrum-1
On Spectrum-1, timestamps for PTP packets are delivered through queues
of ingress and egress timestamps. There are two event traps
corresponding to activity on each of those queues. This mechanism is
absent on Spectrum-2, and therefore the traps should only be registered
on Spectrum-1.

Carry a chip-specific listener array in mlxsw_sp->listeners and
listeners_count. Register listeners from that array in
mlxsw_sp_traps_init(). Add a new listener array for Spectrum-1 traps and
configure the newly-added mlxsw_sp->listeners with this array.

The listener array is empty for now, the events will be added in a later
patch.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 18:58:34 -07:00
Jiri Pirko 0c1f391d19 mlxsw: spectrum_flower: Implement support for ingress device matching
Benefit from the previously extended flow_dissector infrastructure and
offload matching on ingress port.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-19 10:09:22 -04:00
Shalom Toledo 412cd2ad18 mlxsw: spectrum: PTP physical hardware clock initialization
Initialize the PTP physical hardware clock.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-13 22:34:55 -07:00
Ido Schimmel 8f6862065d mlxsw: spectrum_buffers: Add extack messages for invalid configurations
Add extack messages to better communicate invalid configuration to the
user.

Example:

# devlink sb pool set pci/0000:01:00.0 pool 0 size 104857600 thtype dynamic
Error: mlxsw_spectrum: Exceeded shared buffer size.
devlink answers: Invalid argument

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-22 22:09:32 -07:00
Florian Fainelli 3d705f07d1 net: Remove switchdev_ops
Now that we have converted all possible callers to using a switchdev
notifier for attributes we do not have a need for implementing
switchdev_ops anymore, and this can be removed from all drivers the
net_device structure.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 12:39:56 -08:00
Shalom Toledo d3eaf1085a mlxsw: spectrum: Add Spectrum-2 ASIC port type-speed operations
Add Spectrum-2 ASIC port type-speed operations.

Since multiple ethtool link modes are represented using a single bit in the
ASIC, the driver forces the user to configure all types per a specific
speed. For example, if the user wants to advertise 100Gbps 4-lanes speed,
he should advertise all the types of 100Gbps 4-lanes speed that are
supported by the ASIC as shown below:

  Supported ethtool bits for 100Gbps 4-lanes:
      0x1000000000      100000baseKR4 Full
      0x2000000000      100000baseSR4 Full
      0x4000000000      100000baseCR4 Full
      0x8000000000      100000baseLR4_ER4 Full

  Command for advertising 100Gbps 4-lanes:
      ethtool -s enp3s0np1 advertise 0xF000000000

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo c5b870df69 mlxsw: spectrum: Add port type-speed operations
Add port type-speed operations in order to have different operations for
different ASICs. For now, both ASICs use the same pointer.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Petr Machata bb6c346cef mlxsw: spectrum_buffers: Reject overlarge headroom size requests
cap_max_headroom_size holds maximum headroom size supported.
Overstepping that limit might under certain conditions lead to ASIC
freeze.

Query and store the value, and add mlxsw_sp_sb_max_headroom_cells() for
obtaining the stored value. In __mlxsw_sp_port_headroom_set(), reject
requests where the total port buffer is larger than the advertised
maximum.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21 15:57:46 -08:00
Petr Machata c39f3e0e4f mlxsw: spectrum: Add struct mlxsw_sp_sb_vals
Spectrum-2 will be configured with a different shared buffer
configuration than Spectrum-1. Therefore introduce a structure for
keeping the chip-specific default and immutable configuration.

Configuration mutable in runtime will still be kept in struct
mlxsw_sp_sb.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21 15:57:45 -08:00
Jiri Pirko 98bbf70c1c mlxsw: spectrum: add "acl_region_rehash_interval" devlink param
Expose new driver-specific "acl_region_rehash_interval" devlink param
which would allow user to alter default ACL region rehash interval.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko a339bf8aaf mlxsw: spectrum_acl: Pass hints priv all the way to ERP code
The hints priv comes from ERP code and it is possible to obtain it from
TCAM code. Add arg to appropriate functions so the hints
priv could be passed back down to ERP code. Pass NULL now as the
follow-up patches would pass an actual hints priv pointer.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko 29a2102a29 mlxsw: spectrum_acl: Implement basic ERP rehash hits creation
Introduce an initial implementation of rehash logic in ERP code.
Currently, the rehash is considered as needed only in case number of
roots in the hints is smaller than the number of roots actually in use.
In that case return hints pointer and let it be obtained through the
callpath through the Spectrum-2 TCAM op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:49 -08:00
Jiri Pirko 42d704e018 mlxsw: spectrum_acl: Remove unnecessary arg on action_replace call path
No need to pass ruleset/group and chunk pointers on action_replace call
path, nobody uses them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 10:43:15 -08:00
Ido Schimmel eff42aa998 mlxsw: spectrum: Expose functions to create and destroy underlay RIF
In Spectrum-2, instead of providing the ID of the virtual router (VR)
where NVE underlay lookups will occur as in Spectrum-1, the ID of a
router interface (RIF) in this VR is required.

Expose functions to create and destroy such a RIF.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23 09:28:27 -08:00
Nir Dotan 1f5b230339 mlxsw: spectrum: Set RIF ops per ASIC type
Set RIF ops array as member of mlxsw_sp in order to control which RIF
operations callbacks are called per ASIC type. This is needed to control
per ASIC handling of loopback RIF configurations.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-20 11:12:57 -08:00
Ido Schimmel 0417d25e7d mlxsw: spectrum: Switch to VID 4095 as default VID
There is no need to abuse VID 1 anymore and we can instead use VID 4095
as the default VLAN, which will be configured on the port throughout its
lifetime.

The OVS join / leave functions are changed to enable VIDs 1-4094
(inclusive) instead of 2-4095. This because VID 4095 is now the default
VLAN instead of 1.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 15:48:54 -08:00
Ido Schimmel 346fca3b58 mlxsw: spectrum: Store pointer to default port VLAN in port struct
Subsequent patches will need to access the default port VLAN. Since this
VLAN will exist throughout the lifetime of the port, simply store it in
the port's struct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 15:48:54 -08:00
Ido Schimmel a2d2a20553 mlxsw: spectrum: Replace hard-coded default VID with a define
Subsequent patches are going to replace the current default VID (1) with
VLAN_N_VID - 1 (4095).

Prepare for this conversion by replacing the hard-coded '1' with a
define.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 15:48:54 -08:00
Ido Schimmel f40be47a3e mlxsw: spectrum_router: Do not force specific configuration order
In symmetric routing, the only two members in the VLAN corresponding to
the L3 VNI are the router port and the VXLAN tunnel.

In case the VXLAN device is already enslaved to the bridge and only
later the VLAN interface is configured, the tunnel will not be
offloaded.

The reason for this is that when the router interface (RIF)
corresponding to the VLAN interface is configured, it calls the core
fid_get() API which does not check if NVE should be enabled on the FID.

Instead, call into the bridge code which will check if NVE should be
enabled on the FID.

This effectively means that the same code path is used to retrieve a FID
when either a local port or a router port joins the FID.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 15:48:54 -08:00
Ido Schimmel 965fa8e600 mlxsw: spectrum_router: Make RIF deletion more robust
In the past we had multiple instances where RIFs were not properly
deleted.

One of the reasons for leaking a RIF was that at the time when IP
addresses were flushed from the respective netdev (prompting the
destruction of the RIF), the netdev was no longer a mlxsw upper. This
caused the inet{,6}addr notification blocks to ignore the NETDEV_DOWN
event and leak the RIF.

Instead of checking whether the netdev is our upper when an IP address
is removed, we can instead check if the netdev has a RIF configured.

To look up a RIF we need to access mlxsw private data, so the patch
stores the notification blocks inside a mlxsw struct. This then allows
us to use container_of() and extract the required private data.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 635c8c8bba mlxsw: spectrum: Remove reference count from VLAN entries
Commit b3529af6bb ("spectrum: Reference count VLAN entries") started
reference counting port-VLAN entries in a similar fashion to the 8021q
driver.

However, this is not actually needed and only complicates things.
Instead, the driver should forbid the creation of a VLAN on a port if
this VLAN already exists. This would also solve the issue fixed by the
mentioned commit.

Therefore, remove the get()/put() API and use create()/destroy()
instead.

One place that needs special attention is VLAN addition in a VLAN-aware
bridge via switchdev operations. In case the VLAN flags (e.g., 'pvid')
are toggled, then the VLAN entry already exists. To prevent the driver
from wrongly returning EEXIST, the driver is changed to check in the
prepare phase whether the entry already exists and only returns an error
in case it is not associated with the correct bridge port.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel f1d7c33d6a mlxsw: spectrum_fid: Remove unused function
This function is no longer used. Remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 32fd4b49a3 mlxsw: spectrum_router: Do not destroy RIFs based on FID's reference count
Currently, when a RIF is constructed on top of a FID, the RIF increments
the FID's reference count and the RIF is destroyed when the FID's
reference count drops to 1. This effectively means that when no local
ports are member in the FID, the FID is destroyed regardless if the
router port is a member in the FID or not.

The above can lead to the unexpected behavior in which routes using a
VLAN interface as their nexthop device are no longer offloaded after the
last local port leaves the corresponding VLAN (FID).

Example:
# ip -4 route show dev br0.10
192.0.2.0/24 proto kernel scope link src 192.0.2.1 offload
# bridge vlan del vid 10 dev swp3
# ip -4 route show dev br0.10
192.0.2.0/24 proto kernel scope link src 192.0.2.1

After the patch, the route is offloaded before and after the VLAN is
removed from local port 'swp3', as the RIF corresponding to 'br0.10'
continues to exists.

In order to remove RIFs' reliance on the underlying FID's reference
count, we need to add a reference count to sub-port RIFs, which are RIFs
that correspond to physical ports and their uppers (e.g., LAG devices).

In this case, each {Port, VID} ('struct mlxsw_sp_port_vlan') needs to
hold a reference on the RIF. For example:

                       bond0.10
                          |
                        bond0
                          |
                      +-------+
                      |       |
                    swp1    swp2

Both {Port 1, VID 10} and {Port 2, VID 10} will hold a reference on the
RIF corresponding to 'bond0.10'. When the last reference is dropped, the
RIF will be destroyed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Petr Machata 9329b8162b mlxsw: spectrum: Add mlxsw_sp.mac_mask
The Spectrum hardware demands that all router interfaces in the system
have the same first 38 resp. 36 bits of MAC address: the former limit
holds on Spectrum, the latter on Spectrum-2. Add a field that refers to
the required prefix mask and initialize in mlxsw_sp1_init() and
mlxsw_sp2_init().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13 18:41:39 -08:00
Petr Machata 9735f2d2fe mlxsw: spectrum_router: Generalize mlxsw_sp_netdevice_router_port_event()
Prepare mlxsw_sp_netdevice_router_port_event() for handling of
NETDEV_PRE_CHANGEADDR. Split out the part that deals with the actual
changes and call it for the two events currently handled.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13 18:41:39 -08:00
Nir Dotan c20580c21f mlxsw: spectrum_acl: Support rule creation without action creation
Up until now, when ACL rule was created its action was created with it.
It suits well for tc flower where ACL rule always needs an action, however
it does not suit multicast router, where the action is created prior to
setting a route, which in Spectrum-2 is actually an ACL rule.

Add support for rule creation without action creation. Do it by adding
afa_block argument to mlxsw_sp_acl_rule_create, which if NULL then an
action would be created, also add an indication within struct
mlxsw_sp_acl_rule_info that tells if the action should be destroyed when
the rule is destroyed.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-11 23:01:33 -08:00