Commit Graph

1105068 Commits

Author SHA1 Message Date
Ivan Bornyakov 9794ef5a68 net: phy: marvell-88x2222: set proper phydev->port
phydev->port was not set and always reported as PORT_TP.
Set phydev->port according to inserted SFP module.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:25:31 +01:00
Radhey Shyam Pandey 3a51e969fa dt-bindings: net: xilinx: document xilinx emaclite driver binding
Add basic description for the xilinx emaclite driver DT bindings.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:23:24 +01:00
David S. Miller e91b3b6184 Merge branch 'ipa-simplify-completion-stats'
Alex Elder says:

====================
net: ipa: simplify completion statistics

The first patch in this series makes the name used for variables
representing a TRE ring be consistent everywhere.  The second
renames two structure fields to better represent their purpose.

The last four rework a little code that manages some tranaction and
byte transfer statistics maintained mainly for TX endpoints.  For
the most part this series is refactoring.  The last one also
includes the first step toward no longer assuming an event ring is
dedicated to a single channel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Alex Elder c5bddecbb9 net: ipa: rework gsi_channel_tx_update()
Rename gsi_channel_tx_update() to be gsi_trans_tx_completed(), and
pass it just the transaction pointer, deriving the channel from the
transaction.  Update the comments above the function to provide a
more concise description of how statistics for TX endpoints are
maintained and used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Alex Elder dbad2fa719 net: ipa: stop counting total RX bytes and transactions
In gsi_evt_ring_rx_update(), we update each transaction so its len
field reflects the actual number of bytes received.  In the process,
the total number of transactions and bytes processed on the channel
are summed, and added to a running total for the channel.

But we don't actually use those running totals for RX endpoints.
They're maintained for TX channels to support CoDel when they are
associated with a "real" network device.

So stop maintaining these totals for RX endpoints, and update the
comment where the fields are defined to make it clear they're only
valid for TX channels.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Alex Elder 65d39497fa net: ipa: simplify TX completion statistics
When a TX request is issued, its channel's accumulated byte and
transaction counts are recorded.  This currently does *not* take
into account the transaction being committed.

Later, when the transaction completes, the number of bytes and
transactions that have completed since the transaction was committed
are reported to the network stack.  The transaction and its byte
count are accounted for at that time.

Instead, record the transaction and its bytes in the counts recorded
at commit time.  This avoids the need to do so when the transaction
completes, and provides a (small) simplification of that code.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Alex Elder 4e0f28e9ee net: ipa: introduce gsi_trans_tx_committed()
Create a new function that encapsulates recording information needed
for TX channel statistics when a transaction is committed.

Record the accumulated length in the transaction before the call
(for both RX and TX), so it can be used when updating TX statistics.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Alex Elder 3eeabea6c8 net: ipa: rename two transaction fields
There are two fields in a GSI transaction that keep track of TRE
counts.  The first represents the number of TREs reserved for the
transaction in the TRE ring; that's currently named "tre_count".
The second is the number of TREs that are actually *used* by the
transaction at the time it is committed.

Rename the "tre_count" field to be "rsvd_count", to make its meaning
a little more specific.  The "_count" is present in the name mainly
to avoid interpreting it as a reserved (not-to-be-used) field.  This
name also distinguishes it from the "tre_count" field associated
with a channel.

Rename the "used" field to be "used_count", to match the convention
used for reserved TREs.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Alex Elder 2295947bda net: ipa: use "tre_ring" for all TRE ring local variables
All local variables that represent event rings are named "ring".

All but two functions that represent a channel's TRE ring with a
local variable use the name "tre_ring".  For consistency, use that
name in the two functions that don't fit the pattern.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:07:58 +01:00
Jakub Kicinski 5cb3ab50a3 Merge branch 'support-mt7531-on-bpi-r2-pro'
Frank Wunderlich says:

====================
Support mt7531 on BPI-R2 Pro

This Series add Support for the mt7531 switch on Bananapi R2 Pro board.

This board uses port5 of the switch to conect to the gmac0 of the
rk3568 SoC.

Currently CPU-Port is hardcoded in the mt7530 driver to port 6.

Compared to v1 the reset-Patch was dropped as it was not needed and
CPU-Port-changes are completely rewriten based on suggestions/code from
Vladimir Oltean (many thanks to this).
In DTS Patch i only dropped the status-property that was not
needed/ignored by driver.

Due to the Changes i also made a regression test on mt7623 bpi-r2
(mt7623 soc + mt7530) and bpi-r64 (mt7622 soc + mt7531) with cpu-
port 6. Tests were done directly (ipv4 config on dsa user port)
and with vlan-aware bridge including vlan that was tagged outgoing
on dsa user port.
====================

Link: https://lore.kernel.org/r/20220610170541.8643-1-linux@fw-web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:19 -07:00
Frank Wunderlich c1804463e5 arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board
Add Device Tree node for mt7531 switch connected to gmac0.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:16 -07:00
Frank Wunderlich ae07485d7a dt-bindings: net: dsa: make reset optional and add rgmii-mode to mt7531
A board may have no independent reset-line, so reset cannot be used
inside switch driver.

E.g. on Bananapi-R2 Pro switch and gmac are connected to same reset-line.

Resets should be acquired only to 1 device/driver. This prevents reset to
be bound to switch-driver if reset is already used for gmac. If reset is
only used by switch driver it resets the switch *and* the gmac after the
mdio bus comes up resulting in mdio bus goes down. It takes some time
until all is up again, switch driver tries to read from mdio, will fail
and defer the probe. On next try the reset does the same again.

Make reset optional for such boards.

Allow port 5 as cpu-port and phy-mode rgmii for mt7531.

- MT7530 supports RGMII on port 5 and RGMII/TRGMII on port 6.
- MT7531 supports on port 5 RGMII and SGMII (dual-sgmii) and
  SGMII on port 6.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:16 -07:00
Frank Wunderlich 1f9a6abecf net: dsa: mt7530: get cpu-port via dp->cpu_dp instead of constant
Replace last occurences of hardcoded cpu-port by cpu_dp member of
dsa_port struct.

Now the constant can be dropped.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:16 -07:00
Frank Wunderlich 6e19bc26cc net: dsa: mt7530: rework mt753[01]_setup
Enumerate available cpu-ports instead of using hardcoded constant.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:16 -07:00
Frank Wunderlich a9c317417c net: dsa: mt7530: rework mt7530_hw_vlan_{add,del}
Rework vlan_add/vlan_del functions in preparation for dynamic cpu port.

Currently BIT(MT7530_CPU_PORT) is added to new_members, even though
mt7530_port_vlan_add() will be called on the CPU port too.

Let DSA core decide when to call port_vlan_add for the CPU port, rather
than doing it implicitly.

We can do autonomous forwarding in a certain VLAN, but not add br0 to that
VLAN and avoid flooding the CPU with those packets, if software knows it
doesn't need to process them.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:15 -07:00
Frank Wunderlich e0dda31197 dt-bindings: net: dsa: convert binding for mediatek switches
Convert txt binding to yaml binding for Mediatek switches.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:35:15 -07:00
Jakub Kicinski 68d5428931 Merge branch 'mlxsw-remove-xm-support'
Ido Schimmel says:

====================
mlxsw: Remove XM support

The XM was supposed to be an external device connected to the
Spectrum-{2,3} ASICs using dedicated Ethernet ports. Its purpose was to
increase the number of routes that can be offloaded to hardware. This was
achieved by having the ASIC act as a cache that refers cache misses to the
XM where the FIB is stored and LPM lookup is performed.

Testing was done over an emulator and dedicated setups in the lab, but
the product was discontinued before shipping to customers.

Therefore, in order to remove dead code and reduce complexity of the
code base, revert the three patchsets that added XM support.
====================

Link: https://lore.kernel.org/r/20220613132116.2021055-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 21:51:08 -07:00
Petr Machata 87c0a3c676 mlxsw: Revert "Prepare for XM implementation - LPM trees"
This reverts commit 923ba95ea2 ("Merge branch
'mlxsw-spectrum-prepare-for-xm-implementation-lpm-trees'").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 21:51:05 -07:00
Petr Machata 725ff53204 mlxsw: Revert "Prepare for XM implementation - prefix insertion and removal"
This reverts commit e7086213f7 ("Merge branch
'mlxsw-spectrum-prepare-for-xm-implementation-prefix-insertion-and-removal'").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 21:51:05 -07:00
Petr Machata 6a4b02b8fa mlxsw: Revert "Introduce initial XM router support"
This reverts commit 75c2a8fe8e ("Merge branch
'mlxsw-introduce-initial-xm-router-support'").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 21:51:05 -07:00
Jakub Kicinski 6ac6dc746d Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
mlx5-next: updates 2022-06-14

1) Updated HW bits and definitions for upcoming features
 1.1) vport debug counters
 1.2) flow meter
 1.3) Execute ASO action for flow entry
 1.4) enhanced CQE compression

2) Add ICM header-modify-pattern RDMA API

Leon Says
=========

SW steering manipulates packet's header using "modifying header" actions.
Many of these actions do the same operation, but use different data each time.
Currently we create and keep every one of these actions, which use expensive
and limited resources.

Now we introduce a new mechanism - pattern and argument, which splits
a modifying action into two parts:
1. action pattern: contains the operations to be applied on packet's header,
mainly set/add/copy of fields in the packet
2. action data/argument: contains the data to be used by each operation
in the pattern.

This way we reuse same patterns with different arguments to create new
modifying actions, and since many actions share the same operations, we end
up creating a small number of patterns that we keep in a dedicated cache.

These modify header patterns are implemented as new type of ICM memory,
so the following kernel patch series add the support for this new ICM type.
==========

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Add bits and fields to support enhanced CQE compression
  net/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK
  net/mlx5: group fdb cleanup to single function
  net/mlx5: Add support EXECUTE_ASO action for flow entry
  net/mlx5: Add HW definitions of vport debug counters
  net/mlx5: Add IFC bits and enums for flow meter
  RDMA/mlx5: Support handling of modify-header pattern ICM area
  net/mlx5: Manage ICM of type modify-header pattern
  net/mlx5: Introduce header-modify-pattern ICM properties
====================

Link: https://lore.kernel.org/r/20220614184028.51548-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 19:09:39 -07:00
Jakub Kicinski 7e5e8ec7db docs: tls: document the TLS_TX_ZEROCOPY_RO
Add missing documentation for the TLS_TX_ZEROCOPY_RO opt-in.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Link: https://lore.kernel.org/r/20220610180212.110590-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-14 11:14:17 +02:00
Marco Bonelli 19d62f5eea ethtool: Fix and simplify ethtool_convert_link_mode_to_legacy_u32()
Fix the implementation of ethtool_convert_link_mode_to_legacy_u32(), which
is supposed to return false if src has bits higher than 31 set. The current
implementation uses the complement of bitmap_fill(ext, 32) to test high
bits of src, which is wrong as bitmap_fill() fills _with long granularity_,
and sizeof(long) can be > 4. No users of this function currently check the
return value, so the bug was dormant.

Also remove the check for __ETHTOOL_LINK_MODE_MASK_NBITS > 32, as the enum
ethtool_link_mode_bit_indices contains far beyond 32 values. Using
find_next_bit() to test the src bitmask works regardless of this anyway.

Signed-off-by: Marco Bonelli <marco@mebeim.net>
Link: https://lore.kernel.org/r/20220609134900.11201-1-marco@mebeim.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-13 23:11:35 -07:00
Rasmus Villemoes bfa54812f0 net: phy: fixed_phy: set phy_mask before calling mdiobus_register()
There's no point probing for phys on this artificial bus, so we can
save a little bit of boot time by telling mdiobus_register() not to do
that.

This doesn't have any functional change, since, at this point,
fixed_mdio_read() returns 0xffff for all addresses/registers, so

  mdiobus_scan() -> get_phy_device() -> get_phy_c22_id()

will return -ENODEV, which is just ignored.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/r/20220606200208.1665417-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-13 23:11:24 -07:00
Ofer Levi cdcdce948d net/mlx5: Add bits and fields to support enhanced CQE compression
Expose ifc bits and add needed structure fields and methods to
support enhanced CQE compression feature.
The enhanced CQE compression feature improves cpu utiliziation with
better packet latency from nic to host.

Signed-off-by: Ofer Levi <oferle@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:59:06 -07:00
Shay Drory d107ba1f7c net/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK
Remove not used MLX5_CAP_BITS_RW_MASK.
While at it, remove CAP_MASK, MLX5_CAP_OFF_CMDIF_CSUM
and MLX5_DEV_CAP_FLAG_*, since MLX5_CAP_BITS_RW_MASK
was their only user.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:59:06 -07:00
Shay Drory 684f062c97 net/mlx5: group fdb cleanup to single function
Currently, the allocation of fdb software objects are done is single
function, oppose to the cleanup of them.
Group the cleanup of fdb software objects to single function.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:59:06 -07:00
Jianbo Liu 91707779a4 net/mlx5: Add support EXECUTE_ASO action for flow entry
Attach flow meter to FTE with object id and index.
Use metadata register C5 to store the packet color meter result.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:59:06 -07:00
Saeed Mahameed 3e94e61bd4 net/mlx5: Add HW definitions of vport debug counters
total_q_under_processor_handle - number of queues in error state due to an
async error or errored command.

send_queue_priority_update_flow - number of QP/SQ priority/SL update
events.

cq_overrun - number of times CQ entered an error state due to an
overflow.

async_eq_overrun -number of time an EQ mapped to async events was
overrun.

comp_eq_overrun - number of time an EQ mapped to completion events was
overrun.

quota_exceeded_command - number of commands issued and failed due to quota
exceeded.

invalid_command - number of commands issued and failed dues to any reason
other than quota exceeded.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:59:05 -07:00
Jianbo Liu f5d23ee137 net/mlx5: Add IFC bits and enums for flow meter
Add/extend structure layouts and defines for flow meter.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:59:05 -07:00
Yevgeny Kliteynik a6492af380 RDMA/mlx5: Support handling of modify-header pattern ICM area
Add support for allocate/deallocate and registering MR of the new type
of ICM area. Support exists only for devices that support sw_owner_v2.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:58:01 -07:00
Yevgeny Kliteynik 667658364b net/mlx5: Manage ICM of type modify-header pattern
Added support for managing new type of ICM for devices that
support sw_owner_v2.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:58:01 -07:00
Yevgeny Kliteynik 795e10b450 net/mlx5: Introduce header-modify-pattern ICM properties
Added new fields for device memory capabilities, in order to
support creation of ICM memory for modify header patterns.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-13 14:58:01 -07:00
Yajun Deng c04245328d net: make __sys_accept4_file() static
__sys_accept4_file() isn't used outside of the file, make it static.

As the same time, move file_flags and nofile parameters into
__sys_accept4_file().

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 13:47:15 +01:00
Eric Dumazet 219160be49 tcp: sk_forced_mem_schedule() optimization
sk_memory_allocated_add() has three callers, and returns
to them @memory_allocated.

sk_forced_mem_schedule() is one of them, and ignores
the returned value.

Change sk_memory_allocated_add() to return void.

Change sock_reserve_memory() and __sk_mem_raise_allocated()
to call sk_memory_allocated().

This removes one cache line miss [1] for RPC workloads,
as first skbs in TCP write queue and receive queue go through
sk_forced_mem_schedule().

[1] Cache line holding tcp_memory_allocated.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 13:35:25 +01:00
Parthiban Veerasooran 4066bf4ce3 net: smsc95xx: add support for Microchip EVB-LAN8670-USB
This patch adds support for Microchip's EVB-LAN8670-USB 10BASE-T1S
ethernet device to the existing smsc95xx driver by adding the new
USB VID/PID pairs.

Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 13:33:14 +01:00
Yinjun Zhang 5f30671d8d nfp: support 48-bit DMA addressing for NFP3800
48-bit DMA addressing is supported in NFP3800 HW and implemented
in NFDK firmware, so enable this feature in driver now. Note that
with this change, NFD3 firmware, which doesn't implement 48-bit
DMA, cannot be used for NFP3800 any more.

RX free list descriptor, used by both NFD3 and NFDK, is also modified
to support 48-bit DMA. That's OK because the top bits is always get
set to 0 when assigned with 40-bit address.

Based on initial work of Jakub Kicinski <jakub.kicinski@netronome.com>.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 13:31:39 +01:00
David S. Miller 11a1585f26 Merge branch 'ipa-refactoring'
Alex Elder says:

====================
net: ipa: simple refactoring

This series contains some minor code improvements.

The first patch verifies that the configuration is compatible with a
recently-defined limit.  The second and third rename two fields so
they better reflect their use in the code.  The next gets rid of an
empty function by reworking its only caller.

The last two begin to remove the assumption that an event ring is
associated with a single channel.  Eventually we'll support having
multiple channels share an event ring but some more needs to be done
before that can happen.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Alex Elder bcec9ecbaf net: ipa: derive channel from transaction
In gsi_channel_tx_queued(), we report when a transaction gets passed
to hardware.  Change that function so it takes transaction rather
than a channel as its argument, and derive the channel from the
transaction.  Rename the function accordingly.

Delete the header comments above the function definition; the ones
above the declaration in "gsi_private.h" should suffice.  In
addition, the comments above gsi_channel_tx_update() do a fine job
of explaining what's going on.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Alex Elder 7dd9558fed net: ipa: determine channel from event
Each event in an event ring describes the TRE whose completion
caused the event.  Currently, every event ring is dedicated to a
single channel, so the channel is easily derived from the event
ring.

An event ring can actually be shared by more than one channel
though, and to distinguish events for one channel from another, the
event structure contains a field indicating which channel the event
is associated with.

In gsi_event_trans(), use the channel ID in an event to determine
which channel the event is for.  This makes the channel pointer now
passed to that function irrelevant; pass the GSI pointer to that
function instead.

And although it shouldn't happen, warn if an event arrives that
records a channel ID that's not in use, or if the event does not
have a transaction associated with it.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Alex Elder 983a1a3081 net: ipa: simplify endpoint transaction completion
When a GSI transaction completes, ipa_endpoint_trans_complete() is
eventually called.  That handles TX and RX completions separately,
but ipa_endpoint_tx_complete() is a no-op.

Instead, have ipa_endpoint_trans_complete() return immediately for a
TX transaction, and incorporate code from ipa_endpoint_rx_complete()
to handle RX transactions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Alex Elder 317595d2ce net: ipa: rename endpoint->trans_tre_max
The trans_tre_max field of the IPA endpoint structure is only used
to limit the number of fragments allowed for an SKB being prepared
for transmission.  Recognizing that, rename the field skb_frag_max,
and reduce its value by 1.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Alex Elder 88e03057e4 net: ipa: rename channel->tlv_count
Each GSI channel has a TLV FIFO of a certain size, specified in the
configuration data for an AP channel.  That size dictates the
maximum number of TREs that are allowed in a single transaction.

The only way that value is used after initialization is as a limit
on the number of TREs in a transaction; calling it "tlv_count"
isn't helpful, and in fact gsi_channel_trans_tre_max() exists to
sort of abstract it.

Instead, rename the channel->tlv_count field trans_tre_max, and get
rid of the helper function.  Update a couple of comments as well.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Alex Elder 92f78f81ac net: ipa: verify command channel TLV count
In commit 8797972aff ("net: ipa: remove command info pool"), the
maximum number of IPA commands that would be sent in a single
transaction was defined.  That number can't exceed the size of the
TLV FIFO on the command channel, and we can check that at runtime.

To add this check, pass a new flag to gsi_channel_data_valid() to
indicate the channel being checked is being used for IPA commands.
Knowing that we can also verify the channel direction is correct.

Use a new local variable that refers to the command-specific portion
of the data being checked.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:01:58 +01:00
Yinjun Zhang 27f2533bcc nfp: flower: support to offload pedit of IPv6 flowinto fields
Previously the traffic class field is ignored while firmware has
already supported to pedit flowinfo fields, including traffic
class and flow label, now add it back.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Link: https://lore.kernel.org/r/20220609080136.151830-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:23:17 -07:00
Bin Chen 10e11aa241 ethernet: Remove vf rate limit check for drivers
The commit a14857c27a ("rtnetlink: verify rate parameters for calls to
ndo_set_vf_rate") has been merged to master, so we can to remove the
now-duplicate checks in drivers.

Signed-off-by: Bin Chen <bin.chen@corigine.com>
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220609084717.155154-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:19:32 -07:00
Jakub Kicinski 68c51dd992 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
10GbE Intel Wired LAN Driver Updates 2022-06-09

Maximilian Heyne adds reporting of VF statistics on ixgbe via iproute2
interface.

Kai-Heng Feng removes duplicate defines from igb.

Jiaqing Zhao fixes typos in e1000, ixgb, and ixgbe drivers.

Julia Lawall fixes typos for fm10k, ixgbe, and ice drivers.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  drivers/net/ethernet/intel: fix typos in comments
  ixgbe: Fix typos in comments
  ixgb: Fix typos in comments
  e1000: Fix typos in comments
  igb: Remove duplicate defines
  drivers, ixgbe: export vf statistics
====================

Link: https://lore.kernel.org/r/20220609171257.2727150-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:07:32 -07:00
Jakub Kicinski e10b02ee5b Merge branch 'net-reduce-tcp_memory_allocated-inflation'
Eric Dumazet says:

====================
net: reduce tcp_memory_allocated inflation

Hosts with a lot of sockets tend to hit so called TCP memory pressure,
leading to very bad TCP performance and/or OOM.

The problem is that some TCP sockets can hold up to 2MB of 'forward
allocations' in their per-socket cache (sk->sk_forward_alloc),
and there is no mechanism to make them relinquish their share
under mem pressure.
Only under some potentially rare events their share is reclaimed,
one socket at a time.

In this series, I implemented a per-cpu cache instead of a per-socket one.

Each CPU has a +1/-1 MB (256 pages on x86) forward alloc cache, in order
to not dirty tcp_memory_allocated shared cache line too often.

We keep sk->sk_forward_alloc values as small as possible, to meet
memcg page granularity constraint.

Note that memcg already has a per-cpu cache, although MEMCG_CHARGE_BATCH
is defined to 32 pages, which seems a bit small.

Note that while this cover letter mentions TCP, this work is generic
and supports TCP, UDP, DECNET, SCTP.
====================

Link: https://lore.kernel.org/r/20220609063412.2205738-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 16:21:40 -07:00
Eric Dumazet 0f2c269398 net: unexport __sk_mem_{raise|reduce}_allocated
These two helpers are only used from core networking.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 16:21:27 -07:00
Eric Dumazet 4890b686f4 net: keep sk->sk_forward_alloc as small as possible
Currently, tcp_memory_allocated can hit tcp_mem[] limits quite fast.

Each TCP socket can forward allocate up to 2 MB of memory, even after
flow became less active.

10,000 sockets can have reserved 20 GB of memory,
and we have no shrinker in place to reclaim that.

Instead of trying to reclaim the extra allocations in some places,
just keep sk->sk_forward_alloc values as small as possible.

This should not impact performance too much now we have per-cpu
reserves: Changes to tcp_memory_allocated should not be too frequent.

For sockets not using SO_RESERVE_MEM:
 - idle sockets (no packets in tx/rx queues) have zero forward alloc.
 - non idle sockets have a forward alloc smaller than one page.

Note:

 - Removal of SK_RECLAIM_CHUNK and SK_RECLAIM_THRESHOLD
   is left to MPTCP maintainers as a follow up.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 16:21:27 -07:00