Commit Graph

6197 Commits

Author SHA1 Message Date
Vlad Buslov b63293e759 net/mlx5e: Show/set Rx network flow classification rules on ul rep
Reuse infrastructure that already exists for pf in legacy mode to show/set
Rx network flow classification rules for uplink representors.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09 16:58:49 -07:00
Vlad Buslov 6783e8b29f net/mlx5e: Init ethtool steering for representors
During transition to uplink representors the code responsible for
initializing ethtool steering functionality wasn't added to representor
init rx routine. This causes NULL pointer dereference during configuration
of network flow classification rule with ethtool (only possible to
reproduce with next commit in this series which registers necessary ethtool
callbacks).

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09 16:58:48 -07:00
Vlad Buslov 01013ad355 net/mlx5e: Show/set Rx flow indir table and RSS hash key on ul rep
Reuse infrastructure that already exists for pf in legacy mode to show/set
Rx flow hash indirection table and RSS hash key for uplink representors.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09 16:58:46 -07:00
Saeed Mahameed 20f7b37ffc net/mlx5e: Introduce root ft concept for representors netdevs
Uplink representor traffic will be redirected to an empty root ft rather
than directly to a direct tir or ttc table, this root ft will be empty and
will be used as a link for auto-chaining with ttc table or ethtool tables
in downstream patches.

On load, fs core will connect uplink rep root_ft with ttc table.  In case
ethtool steering will be used, fs core will auto connect root_ft with
the ethtool bypass tables, which will be connected with the ttc table.

vport_rx_rule[uplink_rep]->root_ft->ethtool->ttc.

For non-uplink representors, for simplicity root_ft will always point at
ttc table, hence the replace vport_rx rule logic is removed.

vport_rx_rule[non_uplink_rep]->root_ft(ttc).

For now ethtool steering support can only be available on uplink rep.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09 16:58:44 -07:00
Parav Pandit cc617ceda0 net/mlx5: E-switch, make query inline mode a static function
mlx5_eswitch_inline_mode_get() is used only in eswitch_offloads.c.
Hence, make it static and adjacent to its caller function.

Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:42 -07:00
Paul Blakey 891b8f3321 net/mlx5: Allocate smaller size tables for ft offload
Instead of giving ft tables one of the largest tables available - 4M,
give it a more reasonable size - 64k. Especially since it will
always be created as a miss hook in the following patch.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:40 -07:00
Dan Carpenter d9fb932fde net/mlx5e: Fix an IS_ERR() vs NULL check
The esw_vport_tbl_get() function returns error pointers on error.

Fixes: 96e326878f ("net/mlx5e: Eswitch, Use per vport tables for mirroring")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:38 -07:00
Eli Cohen 2fbbc30da0 net/mlx5: Verify goto chain offload support
According to PRM, forward to flow table along with either packet
reformat or decap is supported only if reformat_and_fwd_to_table
capability is set for the flow table.

Add dependency on the capability and pack all the conditions for "goto
chain" in a single function.

Fix language in error message in case of not supporting forward to a
lower numbered flow table.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:36 -07:00
Majd Dibbiny 1e62e222db net/mlx5: E-Switch, Use vport metadata matching only when mandatory
Multi-port RoCE mode requires tagging traffic that passes through the
vport.
This matching can cause performance degradation, therefore disable it
and use the legacy matching on vhca_id and source_port when possible.

Fixes: 92ab1eb392 ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:35 -07:00
Mark Bloch 2f5438ca0e net/mlx5: Tidy up and fix reverse christmas ordring
Use reverse chirstmas tree inside mlx5e_ethtool_get_link_ksettings.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:33 -07:00
Mark Bloch c268ca6087 net/mlx5: Expose port speed when possible
When port speed can't be reported based on ext_eth_proto_capability
or eth_proto_capability instead of reporting speed as unknown check
if the port's speed can be inferred based on the data_rate_oper field.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:31 -07:00
Saeed Mahameed a70ed9d8ec Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
This series adds some HW bits and definitions for mlx5 driver, to be
used by downstream features in both rdma and netdev branches.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: HW bit for goto chain offload support
  net/mlx5: Expose link speed directly
  net/mlx5: Introduce TLS and IPSec objects enums
  net/mlx5: Introduce egress acl forward-to-vport capability
  net/mlx5: Expose raw packet pacing APIs
  net/mlx5e: Replace zero-length array with flexible-array member
  net/mlx5: fix spelling mistake "reserverd" -> "reserved"

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09 16:58:26 -07:00
Jiri Pirko d7cb1e3ba1 flow_offload: introduce "disabled" HW stats type and allow it in mlxsw
Introduce new type for disabled HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Jiri Pirko f16e7f64e4 mlxsw: spectrum_acl: Ask device for rule stats only if counter was created
Set a flag in case rule counter was created. Only query the device for
stats of a rule, which has the valid counter assigned.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Jiri Pirko 4885547951 flow_offload: introduce "delayed" HW stats type and allow it in mlx5
Introduce new type for delayed HW stats and allow the value in
mlx5 offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Jiri Pirko d60d7ed4c8 flow_offload: introduce "immediate" HW stats type and allow it in mlxsw
Introduce new type for immediate HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Jiri Pirko c4afd0c816 mlxsw: restrict supported HW stats type to "any"
Currently don't allow actions with any other type to be inserted.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Jiri Pirko 3632f6d390 mlxsw: spectrum_flower: Do not allow mixing HW stats types for actions
As there is one set of counters for the whole action chain, forbid to
mix the HW stats types.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Jiri Pirko 319a1d1947 flow_offload: check for basic action hw stats type
Introduce flow_action_basic_hw_stats_types_check() helper and use it
in drivers. That sanitizes the drivers which do not have support
for action HW stats types.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:07:48 -07:00
Saeed Mahameed bd673da6d9 net/mlx5: Introduce TLS and IPSec objects enums
Expose the TLS encryption key general object type enum correctly,
and add the IPSec encryption key general object type enum.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-07 13:19:25 -08:00
Eli Cohen 0b13645474 net/mlx5: Clear LAG notifier pointer after unregister
After returning from unregister_netdevice_notifier_dev_net(), set the
notifier_call field to NULL so successive call to mlx5_lag_add() will
function as expected.

Fixes: 7907f23adc ("net/mlx5: Implement RoCE LAG feature")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05 15:13:45 -08:00
Sebastian Hense 404402abd5 net/mlx5e: Fix endianness handling in pedit mask
The mask value is provided as 64 bit and has to be casted in
either 32 or 16 bit. On big endian systems the wrong half was
casted which resulted in an all zero mask.

Fixes: 2b64beba02 ("net/mlx5e: Support header re-write of partial fields in TC pedit offload")
Signed-off-by: Sebastian Hense <sebastian.hense1@ibm.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05 15:13:42 -08:00
Tariq Toukan f28ca65efa net/mlx5e: kTLS, Fix wrong value in record tracker enum
Fix to match the HW spec: TRACKING state is 1, SEARCHING is 2.
No real issue for now, as these values are not currently used.

Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05 15:13:39 -08:00
Tariq Toukan 56917766de net/mlx5e: kTLS, Fix TCP seq off-by-1 issue in TX resync flow
We have an off-by-1 issue in the TCP seq comparison.
The last sequence number that belongs to the TCP packet's payload
is not "start_seq + len", but one byte before it.
Fix it so the 'ends_before' is evaluated properly.

This fixes a bug that results in error completions in the
kTLS HW offload flows.

Fixes: ffbd9ca94e ("net/mlx5e: kTLS, Fix corner-case checks in TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05 15:13:36 -08:00
Hamdan Igbaria 692b0399a2 net/mlx5: DR, Fix postsend actions write length
Fix the send info write length to be (actions x action) size in bytes.

Fixes: 297cccebdc ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05 15:13:33 -08:00
Petr Machata 7bec1a45d5 mlxsw: spectrum_qdisc: Support offloading of FIFO Qdisc
There are two peculiarities about offloading FIFO:

- sometimes the qdisc has an unspecified handle (it is "invisible")
- it may be created before the qdisc that it will be a child of

These features make the offload a bit more tricky. The approach chosen in
this patch is to make note of all the FIFOs that needed to be rejected
because their parents were not known. Later when the parent is created,
they are offloaded

FIFO is only offloaded for its counters, queue length is ignored.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 14:03:32 -08:00
Petr Machata c4e372e2ac mlxsw: spectrum_qdisc: Add handle parameter to ..._ops.replace
PRIO and ETS will need to check the value of qdisc handle in their
handlers. Add it to the callback and propagate through.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 14:03:31 -08:00
Petr Machata ee88450d25 mlxsw: spectrum_qdisc: Introduce struct mlxsw_sp_qdisc_state
In order to have a tidy structure where to put information related to Qdisc
offloads, introduce a new structure. Move there the two existing pieces of
data: root_qdisc and tclass_qdiscs. Embed them directly, because there's no
reason to go through pointer anymore. Convert users, update init/fini
functions.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 14:03:31 -08:00
Jakub Kicinski 55808762f3 mlx5: reject unsupported coalescing params
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

v3: adjust commit message for new member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 12:12:35 -08:00
Yishai Hadas 1326034b3c net/mlx5: Expose raw packet pacing APIs
Expose raw packet pacing APIs to be used by DEVX based applications.
The existing code was refactored to have a single flow with the new raw
APIs.

The new raw APIs considered the input of 'pp_rate_limit_context', uid,
'dedicated', upon looking for an existing entry.

This raw mode enables future device specification data in the raw
context without changing the existing logic and code.

The ability to ask for a dedicated entry gives control for application
to allocate entries according to its needs.

A dedicated entry may not be used by some other process and it also
enables the process spreading its resources to some different entries
for use different hardware resources as part of enforcing the rate.

The counter per entry was changed to be u64 to prevent any option to
overflow.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-03-05 14:18:09 +02:00
Gustavo A. R. Silva 30a87f150b net: mlxfw: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-03 17:39:19 -08:00
Parav Pandit 162add8cba net/mlx5e: Use devlink virtual flavour for VF devlink port
Use newly introduce 'virtual' port flavour for devlink
port of PCI VF devlink device in non-representors mode.

While at it, remove recently introduced empty lines at end of the file.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-03 15:40:40 -08:00
David S. Miller 549da33801 mlx5-updates-2020-02-27
mlx5 misc updates and minor cleanups:
 
 1) Use per vport tables for mirroring
 2) Improve log messages for SW steering (DR)
 3) Add devlink fdb_large_groups parameter
 4) E-Switch, Allow goto earlier chain
 5) Don't allow forwarding between uplink representors
 6) Add support for devlink-port in non-representors mode
 7) Minor misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl5YYYwACgkQSD+KveBX
 +j7vIQgAsFQSjJGdyUzVDfeHu94Ze79OMRstuEh9fbcNXeyawMTShPLPE94E91qo
 Nj6AfZp8sH9M6fLtFcOMrPou87ITslP3mQ/C2cYWWJn/iq9RN13kO+howZNeRhlk
 iettT8qHGYR7aU3tBU38pc9/bWVHRPt+FZGBeCAcvwgQ1zfjBEtPcMJ8svjsfETA
 0bzEDEq0eKEaynHr3phJpzbcnvzm3154HkfZtDZFAApgr4tpEGSlFa4n+8ctcmqy
 zf82HH8GnB+GhTUPU3W33NwyH2otHOcE3dPAepQNmS3f8EemMJebAtjkexo6SfjY
 rwUs5qTU05myOy5RPmzwhQsnzRyPBw==
 =z2AA
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-02-27

mlx5 misc updates and minor cleanups:

1) Use per vport tables for mirroring
2) Improve log messages for SW steering (DR)
3) Add devlink fdb_large_groups parameter
4) E-Switch, Allow goto earlier chain
5) Don't allow forwarding between uplink representors
6) Add support for devlink-port in non-representors mode
7) Minor misc cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28 11:59:53 -08:00
David S. Miller 9f6e055907 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The mptcp conflict was overlapping additions.

The SMC conflict was an additional and removal happening at the same
time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 18:31:39 -08:00
Roi Dayan bc1d75fa79 net/mlx5e: Remove redundant comment about goto slow path
The code is self explanatory and makes the comment redundant.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:42 -08:00
Eli Cohen 178f69b477 net/mlx5e: Reduce number of arguments in slow path handling
mlx5e_tc_offload_to_slow_path() and mlx5e_tc_unoffload_from_slow_path()
take an extra argument allocated on the stack of the caller but not used
by the caller. Avoid the extra argument and use local variable in the
function itself.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:39 -08:00
Eli Cohen dec481c86e net/mlx5e: Remove unused argument from parse_tc_pedit_action()
parse_attr is not used by parse_tc_pedit_action() so revmove it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:36 -08:00
Roi Dayan 61644c3de8 net/mlx5e: Use NL_SET_ERR_MSG_MOD() extack for errors
This to be consistent and adds the module name to the error message.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:34 -08:00
Roi Dayan 4ccd83f40c net/mlx5e: Use netdev_warn() instead of pr_err() for errors
This is for added netdev prefix that helps identify
the source of the message.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:31 -08:00
Roi Dayan 237ac8ded4 net/mlx5e: Use netdev_warn() for errors for added prefix
This helps identify the source of the message.
If netdev still doesn't exists use mlx5_core_warn().

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:28 -08:00
Erez Shitrit b7d0db5520 net/mlx5: DR, Improve log messages
Few print messages are in debug level where they should be in error, and
few messages are missing.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:25 -08:00
Hamdan Igbaria f64092997f net/mlx5: DR, Change matcher priority parameter type
Change matcher priority parameter type from u16 to u32,
this change is needed since sometimes upper levels
create a matcher with priority bigger than 2^16.

Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:23 -08:00
Jianbo Liu 87dac697a0 net/mlx5e: Add devlink fdb_large_groups parameter
Add a devlink parameter to control the number of large groups in a
autogrouped flow table. The default value is 15, and the range is between 1
and 1024.

The size of each large group can be calculated according to the following
formula: size = 4M / (fdb_large_groups + 1).

Examples:
- Set the number of large groups to 20.
    $ devlink dev param set pci/0000:82:00.0 name fdb_large_groups \
      cmode driverinit value 20

  Then run devlink reload command to apply the new value.
    $ devlink dev reload pci/0000:82:00.0

- Read the number of large groups in flow table.
    $ devlink dev param show pci/0000:82:00.0 name fdb_large_groups
    pci/0000:82:00.0:
      name fdb_large_groups type driver-specific
        values:
          cmode driverinit value 20

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:19 -08:00
Jianbo Liu 8aa9f3be73 net/mlx5: Change the name of steering mode param id
The prefix should be "MLX5_DEVLINK_PARAM_ID_" for all in
mlx5_devlink_param_id enum.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:17 -08:00
Vladyslav Tarasiuk c6acd629ee net/mlx5e: Add support for devlink-port in non-representors mode
Added devlink_port field to mlx5e_priv structure and a callback to
netdev ops to enable devlink to get info about the port. The port
registration happens at driver initialization.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:14 -08:00
Vladyslav Tarasiuk ab8f963a11 net/mlx5e: Rename representor get devlink port function
Rename representor's mlx5e_get_devlink_port() to
mlx5e_rep_get_devlink_port().
The downstream patch will add a non-representor mlx5e function called
mlx5e_get_devlink_phy_port().

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:11 -08:00
Roi Dayan 297eaf5b95 net/mlx5: E-Switch, Allow goto earlier chain if FW supports it
Mellanox FW can support this if ignore_flow_level capability exists.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:08 -08:00
Eli Cohen 96e326878f net/mlx5e: Eswitch, Use per vport tables for mirroring
When using port mirroring, we forward the traffic to another table and
use that table to forward to the mirrored vport. Since the hardware
loses the values of reg c, and in particular reg c0, we fail the match
on the input vport which previously existed in reg c0. To overcome this
situation, we use a set of per vport tables, positioned at the lowest
priority, and forward traffic to those tables. Since these tables are
per vport, we can avoid matching on reg c0.

Fixes: c01cfd0f11 ("net/mlx5: E-Switch, Add match on vport metadata for rule in fast path")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:05 -08:00
Eli Cohen 1708dd5468 net/mlx5: Eswitch, avoid redundant mask
misc_params.source_port is a 16 bit field already so no need for
redundant masking against 0xffff. Also change local variables type to
u16.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:40:02 -08:00
Tonghao Zhang ffec97020f net/mlx5e: Don't allow forwarding between uplink
We can install forwarding packets rule between uplink
in switchdev mode, as show below. But the hardware does
not do that as expected (mlnx_perf -i $PF1, we can't get
the counter of the PF1). By the way, if we add the uplink
PF0, PF1 to Open vSwitch and enable hw-offload, the rules
can be offloaded but not work fine too. This patch add a
check and if so return -EOPNOTSUPP.

$ tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1 \
    flower skip_sw action mirred egress redirect dev $PF1

$ tc -d -s filter show dev $PF0 ingress
    skip_sw
    in_hw in_hw_count 1
    action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
    ...
    Sent hardware 408954 bytes 4173 pkt
    ...

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27 16:39:59 -08:00
Amit Cohen ac004e8416 mlxsw: pci: Wait longer before accessing the device after reset
During initialization the driver issues a reset to the device and waits
for 100ms before checking if the firmware is ready. The waiting is
necessary because before that the device is irresponsive and the first
read can result in a completion timeout.

While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is
insufficient for Spectrum-3.

Fix this by increasing the timeout to 200ms.

Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:09:22 -08:00
Jiri Pirko ec4a514a68 mlxsw: reg: Update module_type values in PMTM register and map them to width
There are couple new values that PMTM register can return
in module_type field. Add them and map them to module width in
mlxsw_core_module_max_width(). Fix the existing names on the way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:08:46 -08:00
Jiri Pirko e387f7d5fc mlx5: register lag notifier for init network namespace only
The current code causes problems when the unregistering netdevice could
be different then the registering one.

Since the check in mlx5_lag_netdev_event() does not allow any other
network namespace anyway, fix this by registerting the lag notifier
per init network namespace only.

Fixes: d48834f9d4 ("mlx5: Use dev_net netdevice notifier registrations")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Aya Levin <ayal@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 11:16:14 -08:00
Petr Machata 3b909c552a mlxsw: spectrum: Add mlxsw_sp_span_ops.buffsize_get for Spectrum-3
The buffer factor on Spectrum-3 is larger than on Spectrum-2. Add a new
callback and use it for mlxsw_sp->span_ops on Spectrum-3.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:44:42 -08:00
Ido Schimmel b401ff8541 mlxsw: spectrum: Initialize advertised speeds to supported speeds
During port initialization the driver instructs the device to only
advertise speeds that can be supported by the port's current width.

Since the device now returns the supported speeds based on the port's
current width, the driver no longer needs to compute the speeds that can
be advertised.

Simplify port initialization by setting the advertised speeds to the
queried supported speeds.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:44:42 -08:00
Petr Machata 8a29581eb0 mlxsw: spectrum: Move the ECN-marked packet counter to ethtool
Spectrum-1 and Spectrum-2 do not have a per-TC counter of number of packets
marked by ECN. The value reported currently is the total number of marked
packets. Showing this value at individual TC Qdiscs is misleading.

Move the counter to ethtool instead.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:44:42 -08:00
Jiri Pirko 648e53cac7 mlxsw: spectrum_switchdev: Optimize SFN records processing
Currently, only one SFN query is done from repetitive work at a time,
processing 64 entries. Another work iteration is scheduled in 100ms,
that means that the max rate of learned FDB entries is limited to 6400/s.
That is slow. Fix this by doing 2 optimizations:
1) Run 10 SFN queries at a time.
2) In case the SFN is not drained, schedule work with 0 delay to allow
   to continue processing rest of the records.

On a testing setup with 500K entries the time to process decreased
from 870secs to 10secs.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Alex Kushnarov <alexanderk@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:44:42 -08:00
Saeed Mahameed 586ee9e8a3 net/mlx5: sparse: warning: Using plain integer as NULL pointer
Return NULL instead of 0.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
2020-02-25 17:06:21 -08:00
Saeed Mahameed 5edc4c7275 net/mlx5: sparse: warning: incorrect type in assignment
drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:191:13:
sparse: warning: incorrect type in assignment (different base types)

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
2020-02-25 17:06:19 -08:00
Nathan Chancellor fa2b491287 net/mlx5: Fix header guard in rsc_dump.h
Clang warns:

 In file included from
 ../drivers/net/ethernet/mellanox/mlx5/core/main.c:73:
 ../drivers/net/ethernet/mellanox/mlx5/core/diag/rsc_dump.h:4:9: warning:
 '__MLX5_RSC_DUMP_H' is used as a header guard here, followed by #define
 of a different macro [-Wheader-guard]
 #ifndef __MLX5_RSC_DUMP_H
         ^~~~~~~~~~~~~~~~~
 ../drivers/net/ethernet/mellanox/mlx5/core/diag/rsc_dump.h:5:9: note:
 '__MLX5_RSC_DUMP__H' is defined here; did you mean '__MLX5_RSC_DUMP_H'?
 #define __MLX5_RSC_DUMP__H
         ^~~~~~~~~~~~~~~~~~
         __MLX5_RSC_DUMP_H
 1 warning generated.

Make them match to get the intended behavior and remove the warning.

Fixes: 12206b1723 ("net/mlx5: Add support for resource dump")
Link: https://github.com/ClangBuiltLinux/linux/issues/897
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:06:16 -08:00
Tariq Toukan e9c1d2539d net/mlx5e: RX, Use indirect calls wrapper for handling compressed completions
We can avoid an indirect call per compressed completion wrapping the
completion handling call with the appropriate helper.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:06:10 -08:00
Tariq Toukan 2c8f80b3e3 net/mlx5e: RX, Use indirect calls wrapper for posting descriptors
We can avoid an indirect call per NAPI cycle wrapping the RX descriptors
posting call with the appropriate helper.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:06:07 -08:00
Maxim Mikityanskiy 6e0504c698 net/mlx5e: Change inline mode correctly when changing trust state
The current steps that are performed when the trust state changes, if
the channels are active:

1. The trust state is changed in hardware.

2. The new inline mode is calculated.

3. If the new inline mode is different, the channels are recreated using
the new inline mode.

This approach has some issues:

1. There is a time gap between changing trust state in hardware and
starting sending enough inline headers (the latter happens after
recreation of channels). It leads to failed transmissions and error
CQEs.

2. If the new channels fail to open, we'll be left with the old ones,
but the hardware will be configured for the new trust state, so the
interval when we can see TX errors never ends.

This patch fixes the issues above by moving the trust state change into
the preactivate hook that runs during the recreation of the channels
when no channels are active, so it eliminates the gap of partially
applied configuration. If the inline mode doesn't change with the change
of the trust state, the channels won't be recreated, just like before
this patch.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:06:04 -08:00
Maxim Mikityanskiy b9ab5d0ecf net/mlx5e: Add context to the preactivate hook
Sometimes the preactivate hook of mlx5e_safe_switch_channels needs more
parameters than just struct mlx5e_priv *. For such cases, a new
parameter (void *context) is added to preactivate hooks.

Some of the existing normal functions are currently used as preactivate
callbacks. To avoid adding an extra unused parameter, they are wrapped
in an automatic way using the MLX5E_DEFINE_PREACTIVATE_WRAPPER_CTX
macro.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:06:02 -08:00
Maxim Mikityanskiy 35a78ed4c3 net/mlx5e: Allow mlx5e_switch_priv_channels to fail and recover
Currently mlx5e_switch_priv_channels expects that the preactivate hook
doesn't fail, however, it can fail, because it may set hardware
parameters. This commit addresses this issue and provides a way to
recover from failures of the preactivate hook: the old channels are not
closed until the point where nothing can fail anymore, so in case
preactivate fails, the driver can roll back the old channels and
activate them again.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:59 -08:00
Maxim Mikityanskiy 600a3952a2 net/mlx5e: Remove unneeded netif_set_real_num_tx_queues
The number of queues is now updated by mlx5e_update_netdev_queues in a
centralized way, when no channels are active. Remove an extra occurrence
of netif_set_real_num_tx_queues to prepare it for the next commit.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:56 -08:00
Maxim Mikityanskiy 3909a12e79 net/mlx5e: Fix configuration of XPS cpumasks and netdev queues in corner cases
Currently, mlx5e notifies the kernel about the number of queues and sets
the default XPS cpumasks when channels are activated. This
implementation has several corner cases, in which the kernel may not be
updated on time, or XPS cpumasks may be reset when not directly touched
by the user.

This commit fixes these corner cases to match the following expected
behavior:

1. The number of queues always corresponds to the number of channels
configured.

2. XPS cpumasks are set to driver's defaults on netdev attach.

3. XPS cpumasks set by user are not reset, unless the number of channels
changes. If the number of channels changes, they are reset to driver's
defaults. (In general case, when the number of channels increases or
decreases, it's not possible to guess how to convert the current XPS
cpumasks to work with the new number of channels, so we let the user
reconfigure it if they change the number of channels.)

XPS cpumasks are no longer stored per channel. Only one temporary
cpumask is used. The old stored cpumasks didn't reflect the user's
changes and were not used after applying them.

A scratchpad area is added to struct mlx5e_priv. As cpumask_var_t
requires allocation, and the preactivate hook can't fail, we need to
preallocate the temporary cpumask in advance. It's stored in the
scratchpad.

Fixes: 149e566fef ("net/mlx5e: Expand XPS cpumask to cover all online cpus")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:53 -08:00
Maxim Mikityanskiy fe867cac9e net/mlx5e: Use preactivate hook to set the indirection table
mlx5e_ethtool_set_channels updates the indirection table before
switching to the new channels. If the switch fails, the indirection
table is new, but the channels are old, which is wrong. Fix it by using
the preactivate hook of mlx5e_safe_switch_channels to update the
indirection table at the stage when nothing can fail anymore.

As the code that updates the indirection table is now encapsulated into
a new function, use that function in the attach flow when the driver has
to reduce the number of channels, and prepare the code for the next
commit.

Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:51 -08:00
Maxim Mikityanskiy dca147b3dc net/mlx5e: Rename hw_modify to preactivate
mlx5e_safe_switch_channels accepts a callback to be called before
activating new channels. It is intended to configure some hardware
parameters in cases where channels are recreated because some
configuration has changed.

Recently, this callback has started being used to update the driver's
internal MLX5E_STATE_XDP_OPEN flag, and the following patches also
intend to use this callback for software preparations. This patch
renames the hw_modify callback to preactivate, so that the name fits
better.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:47 -08:00
Maxim Mikityanskiy c2c95271f9 net/mlx5e: Encapsulate updating netdev queues into a function
As a preparation for one of the following commits, create a function to
encapsulate the code that notifies the kernel about the new amount of
RX and TX queues. The code will be called multiple times in the next
commit.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:45 -08:00
Tariq Toukan 02377e6edf net/mlx5e: Add missing LRO cap check
The LRO boolean state in params->lro_en must not be set in case
the NIC is not capable.
Enforce this check and remove the TODO comment.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:42 -08:00
Eran Ben Elisha 4229e0ea2c net/mlx5e: Define one flow for TXQ selection when TCs are configured
We shall always extract channel index out of the txq, regardless
of the relation between txq_ix and num channels. The extraction is
always valid, as if txq is smaller than number of channels,
txq_ix == priv->txq2sq[txq_ix]->ch_ix.

By doing so, we can remove an if clause from the select queue method,
and have one flow for all packets.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 17:05:39 -08:00
Jiri Pirko 6de9fceeaa mlxsw: spectrum_trap: Lookup and pass cookie down to devlink_trap_report()
Use the cookie index received along with the packet to lookup original
flow_offload cookie binary and pass it down to devlink_trap_report().
Add "fa_cookie" metadata to the ACL trap.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25 11:05:55 -08:00
Jiri Pirko 78a7dcb7c9 mlxsw: pci: Extract cookie index for ACL discard trap packets
In case the received packet comes in due to one of ACL discard traps,
take the user_def_val_orig_pkt_len field from CQE and store it
in skb->cb as ACL cookie index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25 11:05:55 -08:00
Jiri Pirko 6d19d2bdc8 mlxsw: core_acl_flex_actions: Implement flow_offload action cookie offload
Track cookies coming down to driver by flow_offload.
Assign a cookie_index to each unique cookie binary. Use previously
defined "Trap with userdef" flex action to ask HW to pass cookie_index
alongside with the dropped packets.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25 11:05:55 -08:00
Jiri Pirko ec12165195 mlxsw: core_acl_flex_actions: Add trap with userdef action
Expose "Trap action with userdef". It is the same as already
defined "Trap action" with a difference that it would ask the policy
engine to pass arbitrary value (userdef) alongside with received packets.
This would be later on used to carry cookie index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25 11:05:54 -08:00
Jiri Pirko 5a2e106c74 devlink: extend devlink_trap_report() to accept cookie and pass
Add cookie argument to devlink_trap_report() allowing driver to pass in
the user cookie. Pass on the cookie down to drop monitor code.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25 11:05:54 -08:00
Jiri Pirko 45dbee0905 mlxsw: spectrum_trap: Add ACL devlink-trap support
Add the trap group used to report ACL drops. Setup the trap IDs for
ingress/egress flow action drop. Register the two packet traps
associated with ACL trap group with devlink during driver
initialization. As these are "source traps", set the disabled
trap group to be the dummy, discarding as many packets in HW
as possible.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko e612523041 mlxsw: spectrum_trap: Introduce dummy group with thin policer
For "source traps" it is not possible to change HPKT action to discard.
But there is still need to disallow packets arriving to CPU as much as
possible. Handle this by introduction of a "dummy group". It has a
"thin" policer, which passes as less packets to CPU as possible. The
rest is going to be discarded there. The "dummy group" is to be used
later on by ACL trap (which is a "source trap").

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko dbd1ddad28 mlxsw: core: Extend MLXSW_RXL_DIS to register disabled trap group
Extend the mlxsw_listener struct to contain trap group for disabled
traps too. Rename the original "trap_group" item to "en_trap_group" as
it represents enabled state. Let both groups be the same for MLXSW_RXL
however extend MLXSW_RXL_DIS to register separate groups for enable and
disable.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko c83da2929f mlxsw: core: Allow to enable/disable rx_listener for trap
For source traps, the "thin policer" is going to be used in order
to reduce the amount of trapped packets to minimum. However, there
will be still small number of packets coming in that need to be dropped
in the driver. Allow to enable/disable rx_listener related to specific
trap in order to prevent unwanted packets to go up the stack.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko 3e6cacaf51 mlxsw: acl_flex_actions: Trap all ACL dropped packets to DISCARD_*_ACL traps
Introduce a new set of traps:
DISCARD_INGRESS_ACL and DISCARD_EGRESS_ACL
Set the trap_action from NOP to TRAP which causes the packets dropped
by the TRAP action to be trapped under new trap IDs, depending on the
ingress/egress binding point.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko 3128f3a150 mlxsw: spectrum_acl: Pass the ingress indication down to flex action
The ACL flex action will have to know if it is in ingress or egress,
so it can use correct trap ID. Pass the ingress indication down to it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko 86272d3397 mlxsw: spectrum_flower: Disable mixed bound blocks to contain action drop
Action drop is going to be tracked by two separate traps, one for
ingress and one for egress. Prepare for it and disallow the possibility
to have drop action in blocks which are bound to both ingress and
egress.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko 68cc7ecc1b mlxsw: spectrum_acl: Track ingress and egress block bindings
Count the number of ingress and egress block bindings. Use the egress
counter in "is_egress_bound" helper. Add couple of helpers to check
ingress and mixed bound.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:07 -08:00
Jiri Pirko 4a23d45a3e mlxsw: spectrum_trap: Prepare mlxsw_core_trap_action_set() to handle not only action
Rename function mlxsw_core_trap_action_set() to
mlxsw_core_trap_state_set() and pass bool enabled instead of action.
Figure out the action according to the enabled state there.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:06 -08:00
Jiri Pirko 99ff9cc249 mlxsw: spectrum_trap: Use listener->en/dis_action instead of hard-coded values
The listener fields en_action and dis_action now contain the actions to
be used for TRAP and DROP devlink trap actions. Use them directly
instead of the hard-coded values.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:06 -08:00
Jiri Pirko 76d4067fe1 mlxsw: core: Allow to register disabled traps using MLXSW_RXL_DIS
Introduce a new macro MLXSW_RXL_DIS that allows to register listeners
as disabled. That allows that from now on, the "action" can be
understood always as "enabled action" and "unreg_action" as "disabled
action". Rename them and treat them accordingly.

Use the new macro for defining drops in spectrum_trap.c.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:06 -08:00
Jiri Pirko 1ef658a377 mlxsw: spectrum_trap: Set unreg_action to be SET_FW_DEFAULT
Currently it does not really matter if it is set to DISCARD or
SET_FW_DEFAULT because it is set only during unregister of the listener.
The unreg_action is going to be used for disabling the listener too, so
change to SET_FW_DEFAULT and ensure the HW is going to behave correctly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:55:06 -08:00
Colin Ian King f2ce925a7d net/mlxfw: fix spelling mistake: "progamming" -> "programming"
There is a spelling mistake in a literal string. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-24 11:03:46 -08:00
Ido Schimmel df6470273e mlxsw: pci: Remove unused values
Since commit f3a52c6162 ("mlxsw: pci: Utilize MRSR register to perform
FW reset") the driver no longer issues a reset via the PCI BAR, so the
offset of the reset bit is unused. Remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:07:18 -08:00
Jiri Pirko d356b3e82b mlxsw: core: Remove priv from listener equality comparison
During packet receive, only the first matching RX listener
in rx_listener_list is going to get the packet. So there is no
meaning in registering two equal listeners with different privs.
Remove priv from equality comparison and disable possibility
of doing it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:59 -08:00
Jiri Pirko b32bd7f73a mlxsw: spectrum_acl: Make block arg const where appropriate
There are couple of places where block pointer as a function argument
can be const. So make those const.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:59 -08:00
Jiri Pirko 16adc56c45 mlxsw: spectrum_trap: Make global arrays const as they should be
The global arrays are treated as const, they should be const, so make
them const.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:59 -08:00
Jiri Pirko 62c7f2512c mlxsw: core: Remove initialization to false of mlxsw_listener struct
No need to initialize to false, so remove it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:59 -08:00
Jiri Pirko 0bb57112d7 mlxsw: core: Convert is_event and is_ctrl bools to be single bits
No need for these two flags to be bool. Covert them to bits of u8.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:59 -08:00
Jiri Pirko 8ec80a8b12 mlxsw: core: Remove dummy union name from struct mlxsw_listener
The dummy name for union is no longer needed, remove it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:58 -08:00
Jiri Pirko 2225d0803d mlxsw: core: Remove unused action field from mlxsw_rx_listener struct
Commit 0791051c43 ("mlxsw: core: Create a generic function to
register / unregister traps") moved this field to struct mlxsw_listener,
but forgot to remove it from struct mlxsw_rx_listener.
Remove the unused field.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:58 -08:00
Jiri Pirko 3cbc37e6e9 mlxsw: spectrum_trap: Move policer initialization to mlxsw_sp_trap_init()
No need to initialize a single policer multiple times for each group.
So move the initialization to be done from mlxsw_sp_trap_init(), making
the function much simpler. Also, rename it so it is with sync with
spectrum.c policers initialization.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:58 -08:00
Jiri Pirko 1255ea6ba2 mlxsw: core_acl_flex_actions: Rename Trap / Discard Action to Trap Action
The Trap / Discard Action action got renamed in PRM, so rename it in the
code as well.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:58 -08:00
Jiri Pirko a5118ef102 mlxsw: spectrum_trap: Move functions to avoid their forward declarations
No need to have forward declarations for mlxsw_sp_rx_drop_listener()
and mlxsw_sp_rx_exception_listener(). Just move them up and avoid it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:58 -08:00
Jiri Pirko aa2794b42f mlxsw: spectrum_trap: Use err variable instead of directly checking func return value
When calling mlxsw_sp_rx_listener(), use err variable instead of directly
checking func return value.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:06:58 -08:00
Ido Schimmel 9811f7a2c9 mlxsw: spectrum: Remove RTNL where possible
After introducing the router lock in previous patches and making sure it
protects internal router structures, we no longer need to rely on RTNL
to serialize access to these structures.

Remove RTNL from call sites that no longer require it.

Two calls sites that keep taking the lock are
mlxsw_sp_router_fibmr_event_work() and mlxsw_sp_inet6addr_event_work().
The first calls into ACL code that still assumes RTNL is taken. The
second potentially calls into the FID code that also relies on RTNL.
Removing RTNL from these two call sites is the subject of future work.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 50c173c3a1 mlxsw: spectrum_router: Take router lock from exported helpers
The routing code exports some helper functions that can be called from
other driver modules such as the bridge. These helpers are never called
with the router lock already held and therefore need to take it in order
to serialize access to shared router structures.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 1be54763e1 mlxsw: spectrum_router: Take router lock from inetaddr listeners
Another entry point into the routing code is from inetaddr listeners.
The driver registers listeners to IPv4 and IPv6 inetaddr notification
chains in order to understand when a RIF needs to be created or
destroyed.

Serialize access to shared router structures from these listeners by
taking the router lock when processing inetaddr events.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel b43c12e7a6 mlxsw: spectrum_router: Take router lock from netdev listener
One entry point into the routing code is from the netdev listener block.
Some netdev events require access to internal router structures. For
example, changing the MTU of a netdev requires looking-up the backing
RIF and adjusting its MTU.

In order to serialize access to shared router structures, take the
router lock when processing netdev events that require access to it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 6a5c69cd55 mlxsw: spectrum_dpipe: Take router lock from dpipe code
The dpipe code traverses internal router structures such as neighbours
and adjacency entries and dumps them to user space via netlink. Up until
now the routing code did not have its own locks and relied on RTNL lock
to serialize access. This is going to change with the introduction of
the router lock.

Take the router lock in the code paths where RTNL lock is currently
taken so that the latter could be removed by subsequent patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 894276e85c mlxsw: spectrum_router: Take router lock from inside routing code
There are several work items in the routing code that currently rely on
RTNL lock to guard against concurrent changes. Have these work items
acquire the router lock in preparation for the removal for RTNL lock
from the routing code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 20bf5d82bb mlxsw: spectrum_router: Introduce router lock
Introduce a mutex to protect the internal structure of the routing code.
A single lock is added instead of a more fine-grained and complicated
locking scheme because there is not a lot of concurrency in the routing
code.

The main motivation is remove the dependence on RTNL lock, which is
currently used by both the process pushing routes to the kernel and the
workqueue pushing the routes to the underlying device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 8e18d85eab mlxsw: spectrum_router: Store NVE decapsulation configuration in router
When a host route is added, the driver checks if the route needs to be
promoted to perform NVE decapsulation based on the current NVE
configuration. If so, the index of the decapsulation entry is retrieved
and associated with the route.

Currently, this information is stored in the NVE module which the router
module consults. Since the information is protected under RTNL and since
route insertion happens with RTNL held, there is no problem to retrieve
the information from the NVE module.

However, this is going to change and route insertion will no longer
happen under RTNL. Instead, a dedicated lock will be introduced for the
router module.

Therefore, store this information in the router module and change the
router module to consult this copy.

The validity of the information is set / cleared whenever an NVE tunnel
is initialized / de-initialized. When this happens the NVE module calls
into the router module to promote / demote the relevant host route.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 2a60c460b5 mlxsw: spectrum_router: Expose router struct to internal users
The dpipe code accesses internal router data structures and acquires
RTNL to protect against their changes. Subsequent patches will remove
reliance on RTNL and introduce a dedicated lock to protect router data
structures.

Publish the router struct to internal users such as the dpipe, so that
they could acquire it instead of RTNL.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel f38656d067 mlxsw: spectrum_mr: Protect multicast route list with a lock
Protect the per-table multicast route list with a lock and remove RTNL
from the delayed work that periodically updates the kernel about packets
and bytes statistics from each multicast route.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel c366de8589 mlxsw: spectrum_mr: Protect multicast table list with a lock
The multicast table list is traversed from a delayed work that
periodically updates the kernel about packets and bytes statistics from
each multicast route.

The list is currently protected by RTNL, but subsequent patches will
remove the driver's dependence on this contended lock.

In order to be able to remove dependence on RTNL in the next patch,
guard this list with a dedicated mutex.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Ido Schimmel 3e3c8dafc0 mlxsw: spectrum_mr: Publish multicast route after writing it to the device
The driver periodically traverses the linked list of multicast routes
and updates the kernel about packets and bytes statistics from each
multicast route. These statistics are read from a counter associated
with the route when it is written to the device.

Currently, multicast routes are published via this linked list before
they are associated with a counter. Despite that, it is not possible for
the driver to access an invalid counter because the delayed work that
reads the statistics and multicast route addition / deletion are
mutually exclusive using RTNL.

In order to be able to remove RTNL, the list needs to be protected by a
dedicated lock, but any route published via the list must have an
associated counter, otherwise the driver will access an invalid counter.

Solve this by re-ordering the operations during multicast route addition
so that the route is only added to the linked list after it was written
to the device. Similarly, during deletion the route is first removed
from the linked list before its deletion from the device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-22 21:24:51 -08:00
Eran Ben Elisha b7331aa204 net/mlx5: Add fsm_reactivate callback support
Add support for fsm reactivate via MIRC (Management Image Re-activation
Control) set and query commands.
For re-activation flow, driver shall first run MIRC set, and then wait
until FW is done (via querying MIRC status).

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
Eran Ben Elisha 958dfd0dc6 net/mlxfw: Add reactivate flow support to FSM burn flow
Expose fsm_reactivate callback to the mlxfw_dev_ops struct. FSM reactivate
is needed before flashing the new image in order to flush the old flashed
but not running firmware image.

In case mlxfw_dev do not support the reactivation, this step will be
skipped. But if later image flash will fail, a hint will be provided by
the extack to advise the user that the failure might be related to it.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
Saeed Mahameed 5042e8b97d net/mlxfw: Use MLXFW_ERR_MSG macro for error reporting
Instead of always calling both mlxfw_err and NL_SET_ERR_MSG_MOD with the
same message, use the dedicated macro instead.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
Saeed Mahameed 6a3f707c00 net/mlxfw: Convert pr_* to dev_* in mlxfw_fsm.c
Introduce mlxfw_{info, err, dbg} macros and make them call corresponding
dev_* macros, then convert all instances of pr_* to mlxfw_*.

This will allow printing the device name mlxfw is operating on.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
Saeed Mahameed f7fe7aa88f net/mlxfw: More error messages coverage
Make sure mlxfw_firmware_flash reports a detailed user readable error
message in every possible error path, basically every time
mlxfw_dev->ops->*() is called and an error is returned, or when image
initialization is failed.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
Saeed Mahameed 86a1270fd7 net/mlxfw: Improve FSM err message reporting and return codes
Report unique and standard error codes corresponding to the specific
FW flash error. In addition, add a more detailed error messages to
netlink.

Before:
$ devlink dev flash pci/0000:05:00.0 file ...
Error: mlxfw: Firmware flash failed.
devlink answers: Invalid argument

After:
$ devlink dev flash pci/0000:05:00.0 file ...
Error: mlxfw: Firmware flash failed: pending reset.
devlink answers: Device busy

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
Saeed Mahameed 4ae575661f net/mlxfw: Generic mlx FW flash status notify
FW flash status notify is currently implemented via a callback to the
caller mlx module, and all it is doing is to call
devlink_flash_update_status_notify with the specific module devlink
instance.

Instead of repeating the whole process for all mlx modules and
re-implement the status_notify callback again and again. Just provide the
devlink instance as part of mlxfw_dev when calling mlxfw_firmware_flash
and let mlxfw do the devlink status updates directly.

This will be very useful for adding status notify support to mlx5, as
already done in this patch, with a simple one line of just providing the
devlink instance to mlxfw_firmware_flash.

mlxfw now depends on NET_DEVLINK as all other mlx modules.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:41:10 -08:00
David S. Miller e65ee2fb54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflict resolution of ice_virtchnl_pf.c based upon work by
Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 13:39:34 -08:00
Ido Schimmel 9ef87b244e mlxsw: spectrum_nve: Make tunnel initialization symmetric
The device supports a single VTEP whose configuration is shared between
all VXLAN tunnels.

While the shared configuration is cleared upon the destruction of the
last tunnel - in mlxsw_sp_nve_tunnel_fini() - it is set in
mlxsw_sp_nve_fid_enable(), after calling mlxsw_sp_nve_tunnel_init().

Make tunnel initialization and destruction symmetric and set the
configuration in mlxsw_sp_nve_tunnel_init().

This will later allow us to protect the shared configuration with a
lock.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel b69e1337ff mlxsw: spectrum: Export function to check if RIF exists
After the previous patch, all the callers of mlxsw_sp_rif_find_by_dev()
outside of the routing code use it to understand if a RIF exists for the
passed netdev.

Therefore, export a function to check if a RIF exists and make
mlxsw_sp_rif_find_by_dev() internal to the routing code.

This will later allow us to more easily introduce the router lock which
will also protect the RIFs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel 5e9a664da8 mlxsw: spectrum: Prevent RIF access outside of routing code
There are currently 5 users of mlxsw_sp_rif_find_by_dev() outside of the
routing code. Only one call site actually needs to dereference the
router interface (RIF). The rest merely need to know if a RIF exists for
the provided netdev.

Convert this call site to query the needed information directly from the
routing code instead of dereferencing the RIF.

This will later allow us to replace mlxsw_sp_rif_find_by_dev() with a
function that checks if a RIF exist.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel 1c6d6b5145 mlxsw: spectrum_router: Prepare function for router lock introduction
The function de-associates the port-vlan from its router interface
(RIF). It is called both from the netdev notifier block and the inetaddr
notifier block that will soon hold the router lock.

Make sure that router code calls the internal version, as it will
already have the router lock held when the function is called.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel fbf8b356e5 mlxsw: spectrum_router: Prepare function for router lock introduction
The function removes the FDB entry that directs the macvlan's MAC to the
router port. It is called from both the netdev notifier block and the
inetaddr notifier block that will soon hold the router lock.

Make sure that only the netdev notifier calls the exported version, so
that is will take the router lock, which will already be held by the
inetaddr notifier.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel f24fbf4de5 mlxsw: spectrum_router: Do not assume RTNL is taken when resolving underlay device
The function that resolves the underlay device of the IPIP tunnel
assumes that RTNL is taken, but this will not be correct when RTNL is
removed from the route insertion path.

Convert the function to use dev_get_by_index_rcu() instead of
__dev_get_by_index() and make sure it is always called from an RCU
read-side critical section.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel 23d154c0d0 mlxsw: spectrum_router: Do not assume RTNL is taken during RIF teardown
IPv6 addresses are deleted in an atomic context, so the driver defers
the potential teardown of the associated router interface (RIF) to a
work item that takes RTNL.

The RIF is only destroyed if the associated netdev does not have any IP
addresses (both IPv4 and IPv6). The IPv4 device ('struct in_device') is
currently fetched via __in_dev_get_rtnl() which assumes RTNL is taken.

Since RTNL is going to be removed, convert it to use __in_dev_get_rcu()
from an RCU read-side critical section.

Note that the IPv6 device ('struct inet6_dev') is fetched via
__in6_dev_get(), which does not require RTNL.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel c43ef22843 mlxsw: spectrum_router: Do not assume RTNL is taken during nexthop init
RTNL is going to be removed from route insertion path, so use
__in_dev_get_rcu() from an RCU read-side critical section instead of
__in_dev_get_rtnl() which assumes RTNL is taken.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel eb833eec3b mlxsw: spectrum_span: Only update mirroring agents if present
In order not to needlessly schedule the work item that updates the
mirroring agents, only schedule it if there are any mirroring agents
present.

This is done by adding an atomic counter that counts the active
mirroring agents.

It is incremented / decremented whenever a mirroring agent is created /
destroyed. It is read before scheduling the work item and in the
devlink-resource occupancy callback.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel 622110f24b mlxsw: spectrum: Convert callers to use new mirroring API
Previous patch added a work item in the mirroring code that will take
care of updating the active mirroring agents in response to different
events.

Change the mirroring agents update function - mlxsw_sp_span_respin() -
to invoke this work item when called.

Therefore there is no need for callers to schedule a work item
themselves.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:34 -08:00
Ido Schimmel a8e7e6e7c3 mlxsw: spectrum_span: Prepare work item to update mirroring agents
The driver updates its mirroring agents whenever it receives a
notification about an event that can affect these. For example, the
addition of a route might require the driver to change the egress port
of an ERSPAN session.

Currently, RTNL needs to be held when these agents are updates, so the
driver either:

1. Calls directly into the mirroring code, in case RTNL is held

2. Schedules a work item that will take RTNL and call into the mirroring
code

Simplify this by having the mirroring code schedule the work item for
the update instead of requiring callers to schedule a work item
themselves.

The conversion of the callers will be done in the next patch to make
review easier.

This will later allow us to remove RTNL from different parts of the
driver. It will also allow us to only schedule the work item in case
there are active mirroring agents, which is information private to the
mirroring code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:33 -08:00
Ido Schimmel 6627b93bf7 mlxsw: spectrum_span: Use struct_size() to simplify allocation
Allocate the main mirroring struct and the individual structs for the
different mirroring agents in a single allocation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:33 -08:00
Ido Schimmel 9a9f8d1e74 mlxsw: spectrum_span: Do no expose mirroring agents to entire driver
The struct holding the different mirroring agents is currently allocated
as part of the main driver struct. This is unlike other driver modules.

Allocate the memory required to store the different mirroring agents as
part of the initialization of the mirroring module.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:33 -08:00
Ido Schimmel 6c5a688e75 mlxsw: spectrum: Protect counter pool with a lock
The counter pool is a shared resource. It is used by both the ACL code
to allocate counters for actions and by the routing code to allocate
counters for adjacency entries (for example).

Currently, all allocations are protected by RTNL, but this is going to
change with the removal of RTNL from the routing code.

Therefore, protect counter allocations with a spin lock.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:33 -08:00
Ido Schimmel 48fe78cebd mlxsw: spectrum_kvdl: Protect allocations with a lock
The KVDL is used to store objects allocated throughout various places
in the driver. For example, both nexthops (adjacency entries) and ACL
actions are stored in the KVDL.

Currently, all allocations are protected by RTNL, but this is going to
change with the removal of RTNL from the routing code.

Therefore, protect KVDL allocations with a lock. A mutex is used since
the free operation can block in Spectrum-2.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-20 10:04:33 -08:00
Paul Blakey b8ce903709 net/mlx5e: Restore tunnel metadata on miss
In tunnel and chains setup, we decapsulate the packets on first chain hop,
if we miss on later chains, the packet will comes up without tunnel header,
so it won't be taken by the tunnel device automatically, which fills the
tunnel metadata, and further tc tunnel matches won't work.

On miss, we get the tunnel mapping id, which was set on the chain 0 rule
that decapsulated the packet. This rule matched the tunnel outer
headers. From the tunnel mapping id, we get to this tunnel matches
and restore the equivalent tunnel info metadata dst on the skb.
We also set the skb->dev to the relevant device (tunnel device).
Now further tc processing can be done on the relevant device.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:49 -08:00
Paul Blakey 6724e66b90 net/mlx5: E-Switch, Get reg_c1 value on miss
The HW model implicitly decapsulates tunnels on chain 0 and sets reg_c1
with the mapped tunnel id. On miss, the packet does not have the outer
header and the driver restores the tunnel information from the tunnel id.

Getting reg_c1 value in software requires enabling reg_c1 loopback and
copying reg_c1 to reg_b. reg_b comes up on CQE as cqe->imm_inval_pkey.

Use the reg_c0 restoration rules to also copy reg_c1 to reg_B.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:49 -08:00
Paul Blakey 0a7fcb78cc net/mlx5e: Support inner header rewrite with goto action
The hardware supports header rewrite of outer headers only.
To perform header rewrite on inner headers, we must first
decapsulate the packet.

Currently, the hardware decap action is explicitly set by the tc
tunnel_key unset action. However, with goto action the user won't
use the tunnel_key unset action. In addition, header rewrites actions
will not apply to the inner header as done by the software model.

To support this, we will map each tunnel matches seen on a tc rule to
a unique tunnel id, implicity add a decap action on tc chain 0 flows,
and mark the packets with this unique tunnel id. Tunnel matches on
the decapsulated tunnel on later chains will match on this unique id
instead of the actual packet.

We will also use this mapping to restore the tunnel info metadata
on miss.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:49 -08:00
Paul Blakey 7f2fd0a5f8 net/mlx5e: Disallow inserting vxlan/vlan egress rules without decap/pop
Currently, rules on tunnel devices can be offloaded without decap action
when a vlan pop action exists. Similarly, the driver will offload rules
on vlan interfaces with no pop action when a decap action exists.

Disallow the faulty behavior by checking that vlan egress rules do pop or
drop and vxlan egress rules do decap, as intended.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:49 -08:00
Paul Blakey ea4cd837b9 net/mlx5e: Move tc tunnel parsing logic with the rest at tc_tun module
Currently, tunnel parsing is split between en_tc and tc_tun. The next
patch will replace the tunnel fields matching with a register match,
and will not need this parsing.

Move the tunnel parsing logic to tc_tun as a pre-step for skipping
it in the next patch.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:49 -08:00
Paul Blakey 6ae4a6a594 net/mlx5e: Allow re-allocating mod header actions
Currently the size of the mod header actions array is deduced from the
number of parsed TC header rewrite actions. However, mod header actions
are also used for setting HW register values. Support the dynamic
reallocation of the mod header array as a pre-step for adding HW
registers mod actions.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:49 -08:00
Paul Blakey d6d2778286 net/mlx5: E-Switch, Restore chain id on miss
Chain ids are mapped to the lower part of reg C, and after loopback
are copied to to CQE via a restore rule's flow_tag.

To let tc continue in the correct chain, we find the corresponding
chain id in the eswitch chain id <-> reg C mapping, and set the SKB's
tc extension chain to it.

That tells tc to continue processing from this set chain.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:48 -08:00
Paul Blakey dfd9e7500c net/mlx5e: Rx, Split rep rx mpwqe handler from nic
Copy the current rep mpwqe rx handler which is also used by nic
profile. In the next patch, we will add rep specific logic, just
for the rep profile rx handler.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:48 -08:00
Paul Blakey 8f1e0b97cc net/mlx5: E-Switch, Mark miss packets with new chain id mapping
Currently, if we miss in hardware after jumping to some chain,
we continue in chain 0 in software. This is wrong, and with the new
tc skb extension we can now restore the chain id on the skb, so
tc can continue with in the correct chain.

To restore the chain id in software after a miss in hardware, we create
a register mapping from 32bit chain ids to 16bit of reg_c0 (that
survives loopback), to 32bit chain ids. We then mark packets that
miss on some chain with the current chain id mapping on their reg_c0
field. Using this mapping, we will support up to 64K concurrent
chains.

This register survives loopback and gets to the CQE on flow_tag
via the eswitch restore rules.

In next commit, we will reverse the mapping we got on the CQE
to a chain id and tell tc to continue in the sw chain where we
left off via the tc skb extension.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:48 -08:00
Paul Blakey 11b717d615 net/mlx5: E-Switch, Get reg_c0 value on CQE
On RX side create a restore table in OFFLOADS namespace.
This table will match on all values for reg_c0 we will use,
and set it to the flow_tag. This flow tag can then be read on the CQE.

As there is no copy action from reg c0 to flow tag, instead we have to
set the flow tag explictily. We add an API so callers can add all the used
reg_c0 values (tags) and for each of those we add a restore rule.

This will be used in a following patch to save the miss chain mapping
tag on reg_c0 and from it restore the tc chain on the skb.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:48 -08:00
Paul Blakey 0f0d3827c0 net/mlx5: E-Switch, Move source port on reg_c0 to the upper 16 bits
Multi chain support requires the miss path to continue the processing
from the last chain id, and for that we need to save the chain
miss tag (a mapping for 32bit chain id) on reg_c0 which will
come in a next patch.

Currently reg_c0 is exclusively used to store the source port
metadata, giving it 32bit, it is created from 16bits of vcha_id,
and 16bits of vport number.

We will move this source port metadata to upper 16bits, and leave the
lower bits for the chain miss tag. We compress the reg_c0 source port
metadata to 16bits by taking 8 bits from vhca_id, and 8bits from
the vport number.

Since we compress the vport number to 8bits statically, and leave two
top ids for special PF/ECPF numbers, we will only support a max of 254
vports with this strategy.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:48 -08:00
Paul Blakey 7f30db1ed8 net/mlx5: Introduce mapping infra for mapping unique ids to data
Add a new interface for mapping data to a given id range (max_id),
and back again. It uses xarray as the id allocator and for finding a
given id. For locking it uses xa_lock (spin_lock) for add()/del(),
and rcu_read_lock for find().

This mapping interface also supports delaying the mapping removal via
a workqueue. This is for cases where we need the mapping to have
some grace period in regards to finding it back again, for example
for packets arriving from hardware that were marked with by a rule
with an old mapping that no longer exists.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19 17:49:48 -08:00
Gustavo A. R. Silva e99f8e7f88 mlxsw: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19 16:23:10 -08:00
Gustavo A. R. Silva 339ffae598 net/mlx5e: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-02-19 09:13:10 +02:00
Christophe JAILLET 0120936a9f net/mlx5: Remove a useless 'drain_workqueue()' call in 'mlx5e_ipsec_cleanup()'
'destroy_workqueue()' already calls 'drain_workqueue()', there is no need
to call it explicitly.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:31 -08:00
Aya Levin b5ede32d33 net/mlx5e: Add support for FEC modes based on 50G per lane links
Introduce new FEC modes:
- RS-FEC-(544,514)
- LL_RS-FEC-(272,257+1)
Add support in ethtool for set and get callbacks for the new modes
above. While RS-FEC-(544,514) is mapped to exsiting RS FEC mode,
LL_RS-FEC-(272,257+1) is mapped to a new ethtool link mode: LL-RS.

Add support for FEC on 50G per lane link modes up to 400G. The new link
modes uses a u16 fields instead of u8 fields for the legacy link modes.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:31 -08:00
Aya Levin 3c19208ea9 net/mlxe5: Separate between FEC and current speed
FEC mode is per link type, not necessary per speed. This patch access
FEC register by link modes instead of speeds. This patch will allow
further enhacment of link modes supporting FEC with the same speed
(different lane type).

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:31 -08:00
Aya Levin 2132b71f78 net/mlx5e: Advertise globaly supported FEC modes
Ethtool advertise supported link modes on an interface. Per each FEC
mode, query if there is a link type which supports it. If so, add this
FEC mode to the supported FEC modes list. Prior to this patch, ethtool
advertised only the supported FEC modes on the current link type.
Add an explicit mapping between internal FEC modes and ethtool link mode
bits. With this change, adding new FEC modes in the downstream patch
would be easier.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:30 -08:00
Aya Levin 4bd9d5070b net/mlx5e: Enforce setting of a single FEC mode
Ethtool command allow setting of several FEC modes in a single set
command. The driver can only set a single FEC mode at a time. With this
patch driver will reply not-supported on setting several FEC modes.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:30 -08:00
Aya Levin 511aa2aa63 net/mlx5e: Set FEC to auto when configured mode is not supported
When configuring FEC mode, driver tries to set it for all available
link types. If a link type doesn't support a FEC mode, set this link
type to auto (FW best effort). Prior to this patch, when a link type
didn't support a FEC mode is was set to no FEC.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:30 -08:00
Joe Perches b21aef7e71 mlx5: Use proper logging and tracing line terminations
netdev_err should use newline termination but mlx5_health_report
is used in a trace output function devlink_health_report where
no newline should be used.

Remove the newlines from a couple formats and add a format string
of "%s\n" to the netdev_err call to not directly output the
logging string.

Also use snprintf to avoid any possible output string overrun.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:30 -08:00
Aya Levin 0f56d3c5d8 net/mlx5e: Support dump callback in RX reporter
Add support for SQ's FW dump on RX reporter's events. Use Resource dump
API to retrieve the relevant data: RX slice, RQ dump, RX buffer and
ICOSQ dump (depends on the error). Wrap it in formatted messages and
store the binary output in devlink core.

Example:
$ devlink health dump show pci/0000:00:0b.0 reporter rx
RX Slice:
   data:
     00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
     22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
     00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
     22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
     00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
  RQs:
    RQ:
      rqn: 1512
      data:
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
    RQ:
      rqn: 1517
      data:
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de

$ devlink health dump show pci/0000:00:0b.0 reporter rx -jp
{
    "RX Slice": {
    	"data":[ 0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,128,0,1,0,0,0,0,173,222]
    },
    "RQs": [ {
            "RQ": {
                "index": 1512,
                "data": [ 0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,128,0,1,0,0,0,0,173,222]
            }
        },{
            "RQ": {
                "index": 1517,
                "data": [ 0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,128,0,1,0,0,0,0,173]
            }
        } ]
}

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:29 -08:00
Aya Levin 5f29458b77 net/mlx5e: Support dump callback in TX reporter
Add support for SQ's FW dump on TX reporter's events. Use Resource dump
API to retrieve the relevant data: SX slice, SQ dump and SQ buffer. Wrap
it in formatted messages and store the binary output in devlink core.

Example:
$ devlink health dump show pci/0000:00:0b.0 reporter tx
SX Slice:
   data:
     00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
     22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
     00 02 01 00 00 00 00 80 00 01 00 00 00 00 ad de
     22 01 00 00 00 00 ad de 00 20 40 90 81 88 ff ff
     00 00 00 00 00 00 00 00 15 00 15 00 00 00 00 00
     ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
     00 00 00 00 00 00 00 80 81 ae 41 06 00 ea ff ff
  SQs:
    SQ:
      index: 1511
      data:
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 02 01 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 20 40 90 81 88 ff ff
        00 00 00 00 00 00 00 00 15 00 15 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 80 81 ae 41 06 00 ea ff ff
    SQ:
      index: 1516
      data:
        00 00 00 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 02 01 00 00 00 00 80 00 01 00 00 00 00 ad de
        22 01 00 00 00 00 ad de 00 20 40 90 81 88 ff ff
        00 00 00 00 00 00 00 00 15 00 15 00 00 00 00 00
        ff ff ff ff 01 00 00 00 00 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 80 81 ae 41 06 00 ea ff ff

$ devlink health dump show pci/0000:00:0b.0 reporter tx -jp
{
    "SX Slice": {
    	"data": [ 0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,32,64,144,129,136,255,255,0,0,0,0,0,0,0,0,21,0,21,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,129,174,65,6,0,234,255,255],
    	},
    "SQs": [ {
            "SQ": {
                "index": 1511,
                "data": [ 0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,32,64,144,129,136,255,255,0,0,0,0,0,0,0,0,21,0,21,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,129,174,65,6,0,234,255,255]
            }
        },{
            "SQ": {
                "index": 1516,
                "data": [ 0,0,0,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,128,0,1,0,0,0,0,173,222,34,1,0,0,0,0,173,222,0,32,64,144,129,136,255,255,0,0,0,0,0,0,0,0,21,0,21,0,0,0,0,0,255,255,255,255,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,128,129,174,65,6,0,234,255,255]
            }
        } ]
}

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:29 -08:00
Aya Levin 0a56be3c88 net/mlx5e: Gather reporters APIs together
Assemble all the API's to ease insertion of dump callbacks in the
following patches in the set.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:29 -08:00
Aya Levin 12206b1723 net/mlx5: Add support for resource dump
On driver load:
- Initialize resource dump data structure and memory access tools (mkey
  & pd).
- Read the resource dump's menu which contains the FW segment
  identifier. Each record is identified by the segment name (ASCII).

During the driver's course of life, users (like reporters) may request
dumps per segment. The user should create a command providing the
segment identifier (SW enumeration) and command keys. In return, the
user receives a command context. In order to receive the dump, the user
should supply the command context and a memory (aligned to a PAGE) on
which the dump content will be written. Since the dump may be larger
than the given memory, the user may resubmit the command until received
an indication of end-of-dump. It is the user's responsibility to destroy
the command.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:17:29 -08:00
Erez Shitrit 13a7e459a4 net/mlx5: DR, Handle reformat capability over sw-steering tables
On flow table creation, send the relevant flags according to what the FW
currently supports.
When FW doesn't support reformat option over SW-steering managed table,
the driver shouldn't pass this.

Fixes: 988fd6b32d ("net/mlx5: DR, Pass table flags at creation to lower layer")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:20 -08:00
Paul Blakey 76781623f0 net/mlx5: Fix lowest FDB pool size
The pool sizes represent the pool sizes in the fw. when we request
a pool size from fw, it will return the next possible group.
We track how many pools the fw has left and start requesting groups
from the big to the small.
When we start request 4k group, which doesn't exists in fw, fw
wants to allocate the next possible size, 64k, but will fail since
its exhausted. The correct smallest pool size in fw is 128 and not 4k.

Fixes: 39ac237ce0 ("net/mlx5: E-Switch, Refactor chains and priorities")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:20 -08:00
Dmytro Linkin 383de10815 net/mlx5e: Don't clear the whole vf config when switching modes
There is no need to reset all vf config (except link state) between
legacy and switchdev modes changes.
Also, set link state to AUTO, when legacy enabled.

Fixes: 3b83b6c2e0 ("net/mlx5e: Clear VF config when switching modes")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:19 -08:00
Hamdan Igbaria 52d214976d net/mlx5: DR, Fix matching on vport gvmi
Set vport gvmi in the tag, only when source gvmi is set in the bit mask.

Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:19 -08:00
Aya Levin 1ad6c43c6a net/mlx5e: Fix crash in recovery flow without devlink reporter
When health reporters are not supported, recovery function is invoked
directly, not via devlink health reporters.

In this direct flow, the recover function input parameter was passed
incorrectly and is causing a kernel oops. This patch is fixing the input
parameter.

Following call trace is observed on rx error health reporting.

Internal error: Oops: 96000007 [#1] PREEMPT SMP
Process kworker/u16:4 (pid: 4584, stack limit = 0x00000000c9e45703)
Call trace:
mlx5e_rx_reporter_err_rq_cqe_recover+0x30/0x164 [mlx5_core]
mlx5e_health_report+0x60/0x6c [mlx5_core]
mlx5e_reporter_rq_cqe_err+0x6c/0x90 [mlx5_core]
mlx5e_rq_err_cqe_work+0x20/0x2c [mlx5_core]
process_one_work+0x168/0x3d0
worker_thread+0x58/0x3d0
kthread+0x108/0x134

Fixes: c50de4af1d ("net/mlx5e: Generalize tx reporter's functionality")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:19 -08:00
Aya Levin 5ee090ed0d net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDY
Initialize RQ doorbell counters to zero prior to moving an RQ from RST
to RDY state. Per HW spec, when RQ is back to RDY state, the descriptor
ID on the completion is reset. The doorbell record must comply.

Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:19 -08:00
Huy Nguyen 3d9c5e023a net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepa
rtnl_bridge_getlink is protected by rcu lock, so mlx5_eswitch_get_vepa
cannot take mutex lock. Two possible issues can happen:
1. User at the same time change vepa mode via RTM_SETLINK command.
2. User at the same time change the switchdev mode via devlink netlink
interface.

Case 1 cannot happen because rtnl executes one message in order.
Case 2 can happen but we do not expect user to change the switchdev mode
when changing vepa. Even if a user does it, so he will read a value
which is no longer valid.

Fixes: 8da202b249 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18 19:01:18 -08:00
Ido Schimmel da1f9f8cb7 mlxsw: spectrum: Reduce dependency between bridge and router code
Commit f40be47a3e ("mlxsw: spectrum_router: Do not force specific
configuration order") added a call from the routing code to the bridge
code in order to handle the case where VNI should be set on a FID
following the joining of the router port to the FID.

This is no longer required, as previous patches made VXLAN devices
explicitly take a reference on the FID and set VNI on it.

Therefore, remove the unnecessary call and simply have the RIF take a
reference on the FID without checking if VNI should also be set on it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17 14:42:53 -08:00
Ido Schimmel 578e55124c mlxsw: spectrum_switchdev: Remove VXLAN checks during FID membership
As explained in previous patch, VXLAN devices now take a reference on
the FID and not only local ports. Therefore, there is no need for local
ports to check if they need to set a VNI on the FID when they join the
FID.

Remove these unnecessary checks.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17 14:42:53 -08:00
Ido Schimmel 71afb45a14 mlxsw: spectrum_switchdev: Have VXLAN device take reference on FID
Up until now only local ports and the router port (which is also a local
port) took a reference on the corresponding FID (Filtering Identifier)
when joining a bridge. For example:

        192.0.2.1/24
            br0
             |
      +------+------+
      |             |
     swp1        vxlan0

In this case the reference count of the FID will be '2'. Since the VXLAN
device does not take a reference on the FID, whenever a local port joins
the bridge it needs to check if a VXLAN device is already enslaved. If
the VXLAN device should be mapped to the FID in question, then the VXLAN
device's VNI is set on the FID.

Beside the fact that this scheme special-cases the VXLAN device, it also
creates an unnecessary dependency between the routing and bridge code:

1. [R] IP address is added on 'br0', which prompts the creation of a RIF
   and a backing FID
2. [B] VNI is enabled on backing FID
3. [R] Host route corresponding to VXLAN device's source address is
   promoted to perform NVE decapsulation

[R] - Routing code
[B] - Bridge code

This back and forth dependency will become problematic when a lock is
added in the routing code instead of relying on RTNL, as it will result
in an AA deadlock.

Instead, have the VXLAN device take a reference on the FID just like all
the other netdev members of the bridge. In order to correctly handle the
case where VXLAN devices are already enslaved to the bridge when it is
offloaded, walk the bridge's slaves and replay the configuration.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17 14:42:53 -08:00
Ido Schimmel 23a1a0b391 mlxsw: spectrum_switchdev: Propagate extack to bridge creation function
Propagate extack to bridge creation function so that error messages
could be passed to user space via netlink instead of printing them to
kernel log.

A subsequent patch will pass the new extack argument to more functions.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17 14:42:53 -08:00
Ido Schimmel b96f546980 mlxsw: spectrum_fid: Use 'refcount_t' for FID reference counting
'refcount_t' is very useful for catching over/under flows. Convert the
FID (Filtering Identifier) objects to use it instead of 'unsigned int'
for their reference count.

A subsequent patch in the series will change the way VXLAN devices hold
/ release the FID reference, which is why the conversion is made now.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-17 14:42:53 -08:00
Ido Schimmel 3a99cbb6fa mlxsw: spectrum_dpipe: Add missing error path
In case devlink_dpipe_entry_ctx_prepare() failed, release RTNL that was
previously taken and free the memory allocated by
mlxsw_sp_erif_entry_prepare().

Fixes: 2ba5999f00 ("mlxsw: spectrum: Add Support for erif table entries access")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07 18:47:01 +01:00
Vadim Pasternak 36844c855b mlxsw: core: Add validation of hardware device types for MGPIR register
When reading the number of gearboxes from the hardware, the driver does
not validate the returned 'device type' field. The driver can therefore
wrongly assume that the queried devices are gearboxes.

On Spectrum-3 systems that support different types of devices, this can
prevent the driver from loading, as it will try to query the
temperature sensors from devices which it assumes are gearboxes and in
fact are not.

For example:
[  218.129230] mlxsw_minimal 2-0048: Reg cmd access status failed (status=7(bad parameter))
[  218.138282] mlxsw_minimal 2-0048: Reg cmd access failed (reg_id=900a(mtmp),type=write)
[  218.147131] mlxsw_minimal 2-0048: Failed to setup temp sensor number 256
[  218.534480] mlxsw_minimal 2-0048: Fail to register core bus
[  218.540714] mlxsw_minimal: probe of 2-0048 failed with error -5

Fix this by validating the 'device type' field.

Fixes: 2e265a8b6c ("mlxsw: core: Extend hwmon interface with inter-connect temperature attributes")
Fixes: f14f4e621b ("mlxsw: core: Extend thermal core with per inter-connect device thermal zones")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07 18:47:01 +01:00
Ido Schimmel 490f0542a7 mlxsw: spectrum_router: Clear offload indication from IPv6 nexthops on abort
Unlike IPv4, in IPv6 there is no unique structure to represent the
nexthop and both the route and nexthop information are squashed to the
same structure ('struct fib6_info'). In order to improve resource
utilization the driver consolidates identical nexthop groups to the same
internal representation of a nexthop group.

Therefore, when the offload indication of a nexthop changes, the driver
needs to iterate over all the linked fib6_info and toggle their offload
flag accordingly.

During abort, all the routes are removed from the device and unlinked
from their nexthop group. The offload indication is cleared just before
the group is destroyed, but by that time no fib6_info is linked to the
group and the offload indication remains set.

Fix this by clearing the offload indication just before dropping the
reference from the nexthop.

Fixes: ee5a0448e7 ("mlxsw: spectrum_router: Set hardware flags for routes")
Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Alex Kushnarov <alexanderk@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07 18:47:01 +01:00
Ido Schimmel 0508ff8934 mlxsw: spectrum_router: Prevent incorrect replacement of local table routes
The driver uses the same table to represent both the main and local
routing tables. Prevent routes in the main table from replacing routes
in the local table to reflect the fact that the local table is consulted
first during lookup.

Fixes: b6a1d871d3 ("mlxsw: spectrum_router: Start using new IPv4 route notifications")
Fixes: dacad7b34b ("mlxsw: spectrum_router: Start using new IPv6 route notifications")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07 18:47:01 +01:00
Tariq Toukan 61c00cca41 net/mlx5: Deprecate usage of generic TLS HW capability bit
Deprecate the generic TLS cap bit, use the new TX-specific
TLS cap bit instead.

Fixes: a12ff35e0f ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06 12:24:24 -08:00
Tariq Toukan b57e66ad42 net/mlx5e: TX, Error completion is for last WQE in batch
For a cyclic work queue, when not requesting a completion per WQE,
a single CQE might indicate the completion of several WQEs.
However, in case some WQE in the batch causes an error, then an error
completion is issued, breaking the batch, and pointing to the offending
WQE in the wqe_counter field.

Hence, WQE-specific error CQE handling (like printing, breaking, etc...)
should be performed only for the last WQE in batch.

Fixes: 130c7b46c9 ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events")
Fixes: fd9b4be800 ("net/mlx5e: RX, Support multiple outstanding UMR posts")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06 12:24:23 -08:00
Raed Salem 08db2cf577 net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctx
SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx,
however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function
nullifies sa_ctx pointer without freeing the memory allocated,
hence the memory leak.

Fix by free SA context when the SA is released.

Fixes: d6c4f0298c ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06 12:24:23 -08:00
Raed Salem 0dc2c534f1 net/mlx5: IPsec, Fix esp modify function attribute
The function mlx5_fpga_esp_validate_xfrm_attrs is wrongly used
with negative negation as zero value indicates success but it
used as failure return value instead.

Fix by remove the unary not negation operator.

Fixes: 05564d0ae0 ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06 12:24:22 -08:00
Maor Gottlieb c1948390d7 net/mlx5: Fix deadlock in fs_core
free_match_list could be called when the flow table is already
locked. We need to pass this notation to tree_put_node.

It fixes the following lockdep warnning:

[ 1797.268537] ============================================
[ 1797.276837] WARNING: possible recursive locking detected
[ 1797.285101] 5.5.0-rc5+ #10 Not tainted
[ 1797.291641] --------------------------------------------
[ 1797.299917] handler10/9296 is trying to acquire lock:
[ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at:
tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.319694]
[ 1797.319694] but task is already holding lock:
[ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at:
nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
[ 1797.344707]
[ 1797.344707] other info that might help us debug this:
[ 1797.356952]  Possible unsafe locking scenario:
[ 1797.356952]
[ 1797.368333]        CPU0
[ 1797.373357]        ----
[ 1797.378364]   lock(&node->lock);
[ 1797.384222]   lock(&node->lock);
[ 1797.390031]
[ 1797.390031]  *** DEADLOCK ***
[ 1797.390031]
[ 1797.403003]  May be due to missing lock nesting notation
[ 1797.403003]
[ 1797.414691] 3 locks held by handler10/9296:
[ 1797.421465]  #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at:
tc_setup_cb_add+0x70/0x250
[ 1797.432810]  #1: ffff88a030081490 (&comp->sem){++++}, at:
mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core]
[ 1797.445829]  #2: ffff889ad399a0a0 (&node->lock){++++}, at:
nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
[ 1797.459913]
[ 1797.459913] stack backtrace:
[ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not
tainted 5.5.0-rc5+ #10
[ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS
2.4.3 01/17/2017
[ 1797.491480] Call Trace:
[ 1797.496701]  dump_stack+0x96/0xe0
[ 1797.502864]  __lock_acquire.cold.63+0xf8/0x212
[ 1797.510301]  ? lockdep_hardirqs_on+0x250/0x250
[ 1797.517701]  ? mark_held_locks+0x55/0xa0
[ 1797.524547]  ? quarantine_put+0xb7/0x160
[ 1797.531422]  ? lockdep_hardirqs_on+0x17d/0x250
[ 1797.538913]  lock_acquire+0xd6/0x1f0
[ 1797.545529]  ? tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.553701]  down_write+0x94/0x140
[ 1797.560206]  ? tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.568464]  ? down_write_killable_nested+0x170/0x170
[ 1797.576925]  ? del_hw_flow_group+0xde/0x1f0 [mlx5_core]
[ 1797.585629]  tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.593891]  ? free_match_list.part.25+0x147/0x170 [mlx5_core]
[ 1797.603389]  free_match_list.part.25+0xe0/0x170 [mlx5_core]
[ 1797.612654]  _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core]
[ 1797.621838]  ? lock_acquire+0xd6/0x1f0
[ 1797.629028]  ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core]
[ 1797.637981]  ? alloc_insert_flow_group+0x420/0x420 [mlx5_core]
[ 1797.647459]  ? try_to_wake_up+0x4c7/0xc70
[ 1797.654881]  ? lock_downgrade+0x350/0x350
[ 1797.662271]  ? __mutex_unlock_slowpath+0xb1/0x3f0
[ 1797.670396]  ? find_held_lock+0xac/0xd0
[ 1797.677540]  ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
[ 1797.686467]  mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
[ 1797.695134]  ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core]
[ 1797.704270]  ? irq_exit+0xa5/0x170
[ 1797.710764]  ? retint_kernel+0x10/0x10
[ 1797.717698]  ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230
[mlx5_core]
[ 1797.728708]  mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core]
[ 1797.738713]  ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core]
[ 1797.748384]  ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core]
[ 1797.757400]  mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core]
[ 1797.767665]  mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core]
[ 1797.776886]  ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core]
[ 1797.785562]  ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core]
[ 1797.795353]  __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core]
[ 1797.804558]  ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0
[mlx5_core]
[ 1797.815093]  ? wait_for_completion+0x260/0x260
[ 1797.823272]  mlx5e_configure_flower+0xe94/0x1620 [mlx5_core]
[ 1797.832792]  ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core]
[ 1797.842096]  ? down_read+0x11a/0x2e0
[ 1797.849090]  ? down_write+0x140/0x140
[ 1797.856142]  ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core]
[ 1797.866027]  tc_setup_cb_add+0x11a/0x250
[ 1797.873339]  fl_hw_replace_filter+0x25e/0x320 [cls_flower]
[ 1797.882385]  ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower]
[ 1797.891607]  fl_change+0x1d54/0x1fb6 [cls_flower]
[ 1797.899772]  ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
[cls_flower]
[ 1797.910728]  ? lock_downgrade+0x350/0x350
[ 1797.918187]  ? __radix_tree_lookup+0xa5/0x130
[ 1797.926046]  ? fl_set_key+0x1590/0x1590 [cls_flower]
[ 1797.934611]  ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
[cls_flower]
[ 1797.945673]  tc_new_tfilter+0xcd1/0x1240
[ 1797.953138]  ? tc_del_tfilter+0xb10/0xb10
[ 1797.960688]  ? avc_has_perm_noaudit+0x92/0x320
[ 1797.968721]  ? avc_has_perm_noaudit+0x1df/0x320
[ 1797.976816]  ? avc_has_extended_perms+0x990/0x990
[ 1797.985090]  ? mark_lock+0xaa/0x9e0
[ 1797.991988]  ? match_held_lock+0x1b/0x240
[ 1797.999457]  ? match_held_lock+0x1b/0x240
[ 1798.006859]  ? find_held_lock+0xac/0xd0
[ 1798.014045]  ? symbol_put_addr+0x40/0x40
[ 1798.021317]  ? rcu_read_lock_sched_held+0xd0/0xd0
[ 1798.029460]  ? tc_del_tfilter+0xb10/0xb10
[ 1798.036810]  rtnetlink_rcv_msg+0x4d5/0x620
[ 1798.044236]  ? rtnl_bridge_getlink+0x460/0x460
[ 1798.052034]  ? lockdep_hardirqs_on+0x250/0x250
[ 1798.059837]  ? match_held_lock+0x1b/0x240
[ 1798.067146]  ? find_held_lock+0xac/0xd0
[ 1798.074246]  netlink_rcv_skb+0xc6/0x1f0
[ 1798.081339]  ? rtnl_bridge_getlink+0x460/0x460
[ 1798.089104]  ? netlink_ack+0x440/0x440
[ 1798.096061]  netlink_unicast+0x2d4/0x3b0
[ 1798.103189]  ? netlink_attachskb+0x3f0/0x3f0
[ 1798.110724]  ? _copy_from_iter_full+0xda/0x370
[ 1798.118415]  netlink_sendmsg+0x3ba/0x6a0
[ 1798.125478]  ? netlink_unicast+0x3b0/0x3b0
[ 1798.132705]  ? netlink_unicast+0x3b0/0x3b0
[ 1798.139880]  sock_sendmsg+0x94/0xa0
[ 1798.146332]  ____sys_sendmsg+0x36c/0x3f0
[ 1798.153251]  ? copy_msghdr_from_user+0x165/0x230
[ 1798.160941]  ? kernel_sendmsg+0x30/0x30
[ 1798.167738]  ___sys_sendmsg+0xeb/0x150
[ 1798.174411]  ? sendmsg_copy_msghdr+0x30/0x30
[ 1798.181649]  ? lock_downgrade+0x350/0x350
[ 1798.188559]  ? rcu_read_lock_sched_held+0xd0/0xd0
[ 1798.196239]  ? __fget+0x21d/0x320
[ 1798.202335]  ? do_dup2+0x2a0/0x2a0
[ 1798.208499]  ? lock_downgrade+0x350/0x350
[ 1798.215366]  ? __fget_light+0xd6/0xf0
[ 1798.221808]  ? syscall_trace_enter+0x369/0x5d0
[ 1798.229112]  __sys_sendmsg+0xd3/0x160
[ 1798.235511]  ? __sys_sendmsg_sock+0x60/0x60
[ 1798.242478]  ? syscall_trace_enter+0x233/0x5d0
[ 1798.249721]  ? syscall_slow_exit_work+0x280/0x280
[ 1798.257211]  ? do_syscall_64+0x1e/0x2e0
[ 1798.263680]  do_syscall_64+0x72/0x2e0
[ 1798.269950]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: bd71b08ec2 ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06 12:24:22 -08:00
Nathan Chancellor 91a7d4bf3e mlxsw: spectrum_qdisc: Fix 64-bit division error in mlxsw_sp_qdisc_tbf_rate_kbps
When building arm32 allmodconfig:

ERROR: "__aeabi_uldivmod"
[drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined!

rate_bytes_ps has type u64, we need to use a 64-bit division helper to
avoid a build error.

Fixes: a44f58c41b ("mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31 08:52:41 -08:00
Christophe JAILLET 6dd4b4f393 mlxsw: minimal: Fix an error handling path in 'mlxsw_m_port_create()'
An 'alloc_etherdev()' called is not ballanced by a corresponding
'free_netdev()' call in one error handling path.

Slighly reorder the error handling code to catch the missed case.

Fixes: c100e47caa ("mlxsw: minimal: Add ethtool support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27 11:13:53 +01:00
Jiri Pirko d48834f9d4 mlx5: Use dev_net netdevice notifier registrations
Register the dev_net notifier and allow the per-net notifier to follow
the device into different namespace.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27 11:03:44 +01:00
David S. Miller 4d8773b68e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflict in mlx5 because changes happened to code that has
moved meanwhile.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-26 10:40:21 +01:00
Petr Machata a44f58c41b mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc
React to the TC messages that were introduced in a preceding patch and
configure egress maximum shaper as appropriate. TBF can be used as a root
qdisc or under one of PRIO or strict ETS bands.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata dbacf8ba58 mlxsw: spectrum: Configure shaper rate and burst size together
In order to allow configuration of burst size together with shaper rate,
extend mlxsw_sp_port_ets_maxrate_set() with a burst_size argument. Convert
call sites to pass 0 (for default).

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata 47259544e0 mlxsw: spectrum: Add lowest_shaper_bs to struct mlxsw_sp
Lower limit of burst size configuration is dependent on system type. Add a
datum to track the value. Initialize as appropriate in mlxsw_spX_init().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata 92afbfedb7 mlxsw: reg: Increase MLXSW_REG_QEEC_MAS_DIS
As the port speeds grow, the current value of "unlimited shaper",
200000000Kbps, might become lower than the actually supported speeds. Bump
it to the maximum value that fits in the corresponding QEEC field, which is
about 2.1Tbps.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata 23effa2479 mlxsw: reg: Add max_shaper_bs to QoS ETS Element Configuration
The QEEC register configures scheduling elements. One of the bits of
configuration is the burst size to use for the shaper installed on the
element. Add the necessary fields to support this configuration.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata be1d5a8a77 mlxsw: spectrum_qdisc: Extract a common leaf unoffload function
When the RED Qdisc is unoffloaded, it needs to reduce the reported backlog
by the amount that is in the HW, so that only the SW backlog is contained
in the counter. The same thing will need to be done by TBF, and likely any
other leaf Qdisc as well.

Extract a helper mlxsw_sp_qdisc_leaf_unoffload() and call it from
mlxsw_sp_qdisc_red_unoffload().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata 3d0d592193 mlxsw: spectrum_qdisc: Add mlxsw_sp_qdisc_get_class_stats()
Add a wrapper around mlxsw_sp_qdisc_collect_tc_stats() and
mlxsw_sp_qdisc_update_stats() for the simple case of doing both in one go:
mlxsw_sp_qdisc_get_class_stats(). Dispatch to that function from
mlxsw_sp_qdisc_get_red_stats(). This new function will be useful for other
leaf Qdiscs as well.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata cf9af379cd mlxsw: spectrum_qdisc: Extract a per-TC stat function
Extract from mlxsw_sp_qdisc_get_prio_stats() two new functions:
mlxsw_sp_qdisc_collect_tc_stats() to accumulate stats for that one TC only,
and mlxsw_sp_qdisc_update_stats() that makes the stats relative to base
values stored earlier. Use them from mlxsw_sp_qdisc_get_red_stats().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Tariq Toukan 342508c1c7 net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel path
When TCP out-of-order is identified (unexpected tcp seq mismatch), driver
analyzes the packet and decides what handling should it get:
1. go to accelerated path (to be encrypted in HW),
2. go to regular xmit path (send w/o encryption),
3. drop.

Packets marked with skb->decrypted by the TLS stack in the TX flow skips
SW encryption, and rely on the HW offload.
Verify that such packets are never sent un-encrypted on the wire.
Add a WARN to catch such bugs, and prefer dropping the packet in these cases.

Fixes: 46a3ea9807 ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24 12:04:40 -08:00
Tariq Toukan 1e92899791 net/mlx5e: kTLS, Remove redundant posts in TX resync flow
The call to tx_post_resync_params() is done earlier in the flow,
the post of the control WQEs is unnecessarily repeated. Remove it.

Fixes: 700ec49742 ("net/mlx5e: kTLS, Fix missing SQ edge fill")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24 12:04:37 -08:00
Tariq Toukan ffbd9ca94e net/mlx5e: kTLS, Fix corner-case checks in TX resync flow
There are the following cases:

1. Packet ends before start marker: bypass offload.
2. Packet starts before start marker and ends after it: drop,
   not supported, breaks contract with kernel.
3. packet ends before tls record info starts: drop,
   this packet was already acknowledged and its record info
   was released.

Add the above as comment in code.

Mind possible wraparounds of the TCP seq, replace the simple comparison
with a call to the TCP before() method.

In addition, remove logic that handles negative sync_len values,
as it became impossible.

Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Fixes: 46a3ea9807 ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24 12:04:35 -08:00
Dmytro Linkin 3b83b6c2e0 net/mlx5e: Clear VF config when switching modes
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS
mode, when switching to it first time, VF can go up independently to
his representor, which is not expected.
Perform clearing of VF config when switching modes and set link state
to AUTO as default value. Also, when switching to OFFLOADS mode set
link state to DOWN, which allow VF link state to be controlled by its
REP.

Fixes: 1ab2068a4c ("net/mlx5: Implement vports admin state backup/restore")
Fixes: 556b9d16d3 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24 12:04:32 -08:00
Erez Shitrit c0702a4bd4 net/mlx5: DR, use non preemptible call to get the current cpu number
Use raw_smp_processor_id instead of smp_processor_id() otherwise we will
get the following trace in debug-kernel:
	BUG: using smp_processor_id() in preemptible [00000000] code: devlink
	caller is dr_create_cq.constprop.2+0x31d/0x970 [mlx5_core]
	Call Trace:
	dump_stack+0x9a/0xf0
	debug_smp_processor_id+0x1f3/0x200
	dr_create_cq.constprop.2+0x31d/0x970
	genl_family_rcv_msg+0x5fd/0x1170
	genl_rcv_msg+0xb8/0x160
	netlink_rcv_skb+0x11e/0x340

Fixes: 297cccebdc ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24 12:04:29 -08:00