Commit Graph

99680 Commits

Author SHA1 Message Date
Jakub Kicinski b646acd5eb net: re-solve some conflicts after net -> net-next merge
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-16 23:12:23 -08:00
David S. Miller d489ded1a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-16 17:51:13 -08:00
Robert Hancock b5d007e2aa net: phy: broadcom: Do not modify LED configuration for SFP module PHYs
bcm54xx_config_init was modifying the PHY LED configuration to enable link
and activity indications. However, some SFP modules (such as Bel-Fuse
SFP-1GBT-06) have no LEDs but use the LED outputs to control the SFP LOS
signal, and modifying the LED settings will cause the LOS output to
malfunction. Skip this configuration for PHYs which are bound to an SFP
bus.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:23:23 -08:00
Robert Hancock b834489bce net: phy: Add is_on_sfp_module flag and phy_on_sfp helper
Add a flag and helper function to indicate that a PHY device is part of
an SFP module, which is set on attach. This can be used by PHY drivers
to handle SFP-specific quirks or behavior.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:23:23 -08:00
Robert Hancock 3afd021899 net: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616S
The default configuration for the BCM54616S PHY may not match the desired
mode when using 1000BaseX or SGMII interface modes, such as when it is on
an SFP module. Add code to explicitly set the correct mode using
programming sequences provided by Bel-Fuse:

https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-05-series.pdf
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-06-series.pdf

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:23:23 -08:00
Sven Van Asbroeck 966df6ded2 lan743x: sync only the received area of an rx ring buffer
On cpu architectures w/o dma cache snooping, dma_unmap() is a
is a very expensive operation, because its resulting sync
needs to invalidate cpu caches.

Increase efficiency/performance by syncing only those sections
of the lan743x's rx ring buffers that are actually in use.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:15:21 -08:00
Sven Van Asbroeck a8db76d40e lan743x: boost performance on cpu archs w/o dma cache snooping
The buffers in the lan743x driver's receive ring are always 9K,
even when the largest packet that can be received (the mtu) is
much smaller. This performs particularly badly on cpu archs
without dma cache snooping (such as ARM): each received packet
results in a 9K dma_{map|unmap} operation, which is very expensive
because cpu caches need to be invalidated.

Careful measurement of the driver rx path on armv7 reveals that
the cpu spends the majority of its time waiting for cache
invalidation.

Optimize by keeping the rx ring buffer size as close as possible
to the mtu. This limits the amount of cache that requires
invalidation.

This optimization would normally force us to re-allocate all
ring buffers when the mtu is changed - a disruptive event,
because it can only happen when the network interface is down.

Remove the need to re-allocate all ring buffers by adding support
for multi-buffer frames. Now any combination of mtu and ring
buffer size will work. When the mtu changes from mtu1 to mtu2,
consumed buffers of size mtu1 are lazily replaced by newly
allocated buffers of size mtu2.

These optimizations double the rx performance on armv7.
Third parties report 3x rx speedup on armv8.

Tested with iperf3 on a freescale imx6qp + lan7430, both sides
set to mtu 1500 bytes, measure rx performance:

Before:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-20.00  sec   550 MBytes   231 Mbits/sec    0
After:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-20.00  sec  1.33 GBytes   570 Mbits/sec    0

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:08:48 -08:00
Vladimir Oltean 3af409ca27 net: enetc: fix destroyed phylink dereference during unbind
The following call path suggests that calling unregister_netdev on an
interface that is up will first bring it down.

enetc_pf_remove
-> unregister_netdev
   -> unregister_netdevice_queue
      -> unregister_netdevice_many
         -> dev_close_many
            -> __dev_close_many
               -> enetc_close
                  -> enetc_stop
                     -> phylink_stop

However, enetc first destroys the phylink instance, then calls
unregister_netdev. This is already dissimilar to the setup (and error
path teardown path) from enetc_pf_probe, but more than that, it is buggy
because it is invalid to call phylink_stop after phylink_destroy.

So let's first unregister the netdev (and let the .ndo_stop events
consume themselves), then destroy the phylink instance, then free the
netdev.

Fixes: 71b77a7a27 ("enetc: Migrate to PHYLINK and PCS_LYNX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:05:07 -08:00
Maxime Chevallier 4906887a8a net: mvneta: Implement mqprio support
Implement a basic MQPrio support, inserting rules in RX that translate
the TC to prio mapping into vlan prio to queues.

The TX logic stays the same as when we don't offload the qdisc.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:03:26 -08:00
Maxime Chevallier cf9bf87128 net: mvneta: Remove per-cpu queue mapping for Armada 3700
According to Errata #23 "The per-CPU GbE interrupt is limited to Core
0", we can't use the per-cpu interrupt mechanism on the Armada 3700
familly.

This is correctly checked for RSS configuration, but the initial queue
mapping is still done by having the queues spread across all the CPUs in
the system, both in the init path and in the cpu_hotplug path.

Fixes: 2636ac3cc2 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 15:03:26 -08:00
David S. Miller 44c3203975 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
pull-request: mlx5-next 2021-02-16

The patches in this pr are already submitted and reviewed through the
netdev and rdma mailing lists.

The series includes mlx5 HW bits and definitions for mlx5 real time clock
translation and handling in the mlx5 driver clock module to enable and
support such mode [1]

[1] https://patchwork.kernel.org/project/netdevbpf/patch/20210212223042.449816-7-saeed@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:53:30 -08:00
Gary Guo 18af77c50f drivers: net: xilinx_emaclite: remove arch limitation
The changes made in eccd540 is enough for xilinx_emaclite to run
without problem on 64-bit systems. I have tested it on a Xilinx
FPGA with RV64 softcore. The architecture limitation in Kconfig
seems no longer necessary.

A small change is included to print address with %lx instead of
casting to int and print with %x.

Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:48:59 -08:00
Horatiu Vultur a026c50b59 net: dsa: felix: Add support for MRP
Implement functions 'port_mrp_add', 'port_mrp_del',
'port_mrp_add_ring_role' and 'port_mrp_del_ring_role' to call the mrp
functions from ocelot.

Also all MRP frames that arrive to CPU on queue number OCELOT_MRP_CPUQ
will be forward by the SW.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:47:46 -08:00
Horatiu Vultur d8ea7ff399 net: mscc: ocelot: Add support for MRP
Add basic support for MRP. The HW will just trap all MRP frames on the
ring ports to CPU and allow the SW to process them. In this way it is
possible to for this node to behave both as MRM and MRC.

Current limitations are:
- it doesn't support Interconnect roles.
- it supports only a single ring.
- the HW should be able to do forwarding of MRP Test frames so the SW
  will not need to do this. So it would be able to have the role MRC
  without SW support.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:47:46 -08:00
Robert Hancock 06b334f08b net: phy: marvell: Ensure SGMII auto-negotiation is enabled for 88E1111
When 88E1111 is operating in SGMII mode, auto-negotiation should be enabled
on the SGMII side so that the link will come up properly with PCSes which
normally have auto-negotiation enabled. This is normally the case when the
PHY defaults to SGMII mode at power-up, however if we switched it from some
other mode like 1000Base-X, as may happen in some SFP module situations,
it may not be, particularly for modules which have 1000Base-X
auto-negotiation defaulting to disabled.

Call genphy_check_and_restart_aneg on the fiber page to ensure that auto-
negotiation is properly enabled on the SGMII interface.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:16:58 -08:00
Marek Behún cfb971dec5 sfp: add support for 5gbase-t SFPs
The sfp_parse_support() function is setting 5000baseT_Full in some cases.
Now that we have PHY_INTERFACE_MODE_5GBASER interface mode available,
change sfp_select_interface() to return PHY_INTERFACE_MODE_5GBASER if
5000baseT_Full is set in the link mode mask.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:15:12 -08:00
Marek Behún f6813bdafd net: phylink: Add 5gbase-r support
Add 5GBASER interface type and speed to phylink.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:15:12 -08:00
Florian Fainelli 32aeba1f7a tg3: Remove unused PHY_BRCM flags
The tg3 driver tried to communicate towards the PHY driver whether it
wanted RGMII in-band signaling enabled or disabled however there is
nothing that looks at those flags in drivers/net/phy/broadcom.c so this
does do not anything.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:11:57 -08:00
Shyam Sundar S K 9eab3fdb41 net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFP
Frequent link up/down events can happen when a Bel Fuse SFP part is
connected to the amd-xgbe device. Try to avoid the frequent link
issues by resetting the PHY as documented in Bel Fuse SFP datasheets.

Fixes: e722ec8237 ("amd-xgbe: Update the BelFuse quirk to support SGMII")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:09:46 -08:00
Shyam Sundar S K 84fe68eb67 net: amd-xgbe: Reset link when the link never comes back
Normally, auto negotiation and reconnect should be automatically done by
the hardware. But there seems to be an issue where auto negotiation has
to be restarted manually. This happens because of link training and so
even though still connected to the partner the link never "comes back".
This needs an auto-negotiation restart.

Also, a change in xgbe-mdio is needed to get ethtool to recognize the
link down and get the link change message. This change is only
required in a backplane connection mode.

Fixes: abf0a1c2b2 ("amd-xgbe: Add support for SFP+ modules")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:09:45 -08:00
Shyam Sundar S K 186edbb510 net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning
The current driver calls netif_carrier_off() late in the link tear down
which can result in a netdev watchdog timeout.

Calling netif_carrier_off() immediately after netif_tx_stop_all_queues()
avoids the warning.

 ------------[ cut here ]------------
 NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out
 WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220
 Modules linked in: amd_xgbe(E)  amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down
 CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E
 Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019
 RIP: 0010:dev_watchdog+0x20d/0x220
 Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48
 RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0
 RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018
 R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000
 R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010
 FS:  0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0
 Call Trace:
  <IRQ>
  ? pfifo_fast_reset+0x100/0x100
  call_timer_fn+0x2b/0x130
  run_timer_softirq+0x3e8/0x440
  ? enqueue_hrtimer+0x39/0x90

Fixes: e722ec8237 ("amd-xgbe: Update the BelFuse quirk to support SGMII")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:09:45 -08:00
Shyam Sundar S K 30b7edc82e net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout
Sometimes mailbox commands timeout when the RX data path becomes
unresponsive. This prevents the submission of new mailbox commands to DXIO.
This patch identifies the timeout and resets the RX data path so that the
next message can be submitted properly.

Fixes: 549b32af9f ("amd-xgbe: Simplify mailbox interface rate change code")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:09:45 -08:00
Bjarni Jonasson ca0d7fd0a5 net: phy: mscc: coma mode disabled for VSC8514
The 'coma mode' (configurable through sw or hw) provides an
optional feature that may be used to control when the PHYs become active.
The typical usage is to synchronize the link-up time across
all PHY instances. This patch releases coma mode if not done by hardware,
otherwise the phys will not link-up.

Fixes: e4f9ba642f ("net: phy: mscc: add support for VSC8514 PHY.")
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:06:19 -08:00
Bjarni Jonasson 85e97f0b98 net: phy: mscc: improved serdes calibration applied to VSC8514
The current IB serdes calibration algorithm (performed by the onboard 8051)
has proven to be unstable for the VSC8514 QSGMII phy.
A new algorithm has been developed based on
'Frequency-offset Jittered-Injection' or 'FoJi' method which solves
all known issues.  This patch disables the 8051 algorithm and
replaces it with the new FoJi algorithm.
The calibration is now performed in a new file (mscc_serdes.c),
which can act as an placeholder for future serdes configurations.

Fixes: e4f9ba642f ("net: phy: mscc: add support for VSC8514 PHY.")
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:06:18 -08:00
Bjarni Jonasson 3cc2c646be net: phy: mscc: adding LCPLL reset to VSC8514
At Power-On Reset, transients may cause the LCPLL to lock onto a
clock that is momentarily unstable. This is normally seen in QSGMII
setups where the higher speed 6G SerDes is being used.
This patch adds an initial LCPLL Reset to the PHY (first instance)
to avoid this issue.

Fixes: e4f9ba642f ("net: phy: mscc: add support for VSC8514 PHY.")
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:06:18 -08:00
Aya Levin 432119de33 net/mlx5: Add cyc2time HW translation mode support
Device timestamp can be in real time mode (cycles to time translation is
offloaded into the Hardware). With real time mode, HW provides timestamp
which is already translated into nanoseconds.

With this mode, driver adjusts both the HW and timecounter (to keep
clock_info_page updated) using callbacks: adjfreq, adjtime and settime.
HW clock modifications are done via MTUTC access reg commands. Driver is
allowed to modify HW real time clock only if MCAM ptpcyc2realtime_modify
capability is set.

Add MTUTC set function to be used for configuring the HW real time
clock. Modify existing code to support both internal timer (with
conversion via timecounter_cyc2time() and real time (no conversions).

Align the signatures of the helpers converting from timestamp to
nanoseconds. With that, when allocating a queue assign the corresponding
callback with respect to the capability.

Adjust 1PPS timestamp calculation flows based on the timestamp mode.

Cyc2time offload brings two major advantages:
- Improve MTAE (Max Time Absolute Error) for HW TS by up to 160 ns over a
  100% loaded CPU.
- Faster data-path timestamp to nanoseconds, as translation is
  lock-less and done in HW.

On real time mode, timestamp format is 32 high bits of seconds and 32
low bits of nanoseconds. On some flows, driver shall convert this format
into nanoseconds wall-clock with REAL_TIME_TO_NS macro.

HW supports a single clock, and it is shared by all functions on a
device. In case real time clock is used, it is recommended to use
a single GM to all device's functions.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16 14:04:54 -08:00
Eran Ben Elisha de19cd6cc9 net/mlx5: Move some PPS logic into helper functions
Some of PPS logic (timestamp calculations) fits only internal timer
timestamp mode. Move these logics into helper functions. Later in the
patchset cyc2time HW translation mode will expose its own PPS timestamp
calculations.

With this change, main flow will only hold calling PPS logic based on run
time mode.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16 14:04:54 -08:00
Eran Ben Elisha d6f3dc8f50 net/mlx5: Move all internal timer metadata into a dedicated struct
Internal timer mode (SW clock) requires some PTP clock related metadata
structs. Real time mode (HW clock) will not need these metadata structs.
This separation emphasize the different interfaces for HW clock and SW
clock.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16 14:04:54 -08:00
Eran Ben Elisha 1436de0b99 net/mlx5: Refactor init clock function
Function mlx5_init_clock() is responsible for internal PTP related metadata
initializations. Break mlx5_init_clock() to sub functions, each takes care
of its own logic.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16 14:04:54 -08:00
Vladimir Oltean 7f7ccdea8c net: dsa: sja1105: fix leakage of flooded frames outside bridging domain
Quite embarrasingly, I managed to fool myself into thinking that the
flooding domain of sja1105 source ports is restricted by the forwarding
domain, which it isn't. Frames which match an FDB entry are forwarded
towards that entry's DESTPORTS restricted by REACH_PORT[SRC_PORT], while
frames that don't match any FDB entry are forwarded towards
FL_DOMAIN[SRC_PORT] or BC_DOMAIN[SRC_PORT].

This means we can't get away with doing the simple thing, and we must
manage the flooding domain ourselves such that it is restricted by the
forwarding domain. This new function must be called from the
.port_bridge_join and .port_bridge_leave methods too, not just from
.port_bridge_flags as we did before.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:02:46 -08:00
Vladimir Oltean 4c44fc5e94 net: dsa: sja1105: fix configuration of source address learning
Due to a mistake, the driver always sets the address learning flag to
the previously stored value, and not to the currently configured one.
The bug is visible only in standalone ports mode, because when the port
is bridged, the issue is masked by .port_stp_state_set which overwrites
the address learning state to the proper value.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:02:46 -08:00
Geetha sowjanya 786621d200 octeontx2-af: cn10k: Fixes CN10K RPM reference issue
This patch fixes references to uninitialized variables and
debugfs entry name for CN10K platform and HW_TSO flag check.

Fixes: 3ad3f8f93c ("octeontx2-af: cn10k: MAC internal loopback support").
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>

v1-v2
- Clear HW_TSO flag for 96xx B0 version.

This patch fixes the bug introduced by the commit
3ad3f8f93c ("octeontx2-af: cn10k: MAC internal loopback support").
These changes are not yet merged into net branch, hence submitting
to net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:59:47 -08:00
Vladimir Oltean 6b73b7c96a net: dsa: felix: perform teardown on error in felix_setup
If the driver fails to probe, it would be nice to not leak memory.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:52:57 -08:00
Vladimir Oltean 42b5adbbac net: dsa: felix: don't deinitialize unused ports
ocelot_init_port is called only if dsa_is_unused_port == false, however
ocelot_deinit_port is called unconditionally. This causes a warning in
the skb_queue_purge inside ocelot_deinit_port saying that the spin lock
protecting ocelot_port->tx_skbs was not initialized.

Fixes: e5fb512d81 ("net: mscc: ocelot: deinitialize only initialized ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:52:14 -08:00
Chen Lin 6825a456c9 ionic: Remove unused function pointer typedef ionic_reset_cb
Remove the 'ionic_reset_cb' typedef as it is not used.

Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:43:58 -08:00
David S. Miller b8af417e4d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-02-16

The following pull-request contains BPF updates for your *net-next* tree.

There's a small merge conflict between 7eeba1706e ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f81 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:

  [...]
                lock_sock(sk);
                err = tcp_zerocopy_receive(sk, &zc, &tss);
                err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
                                                          &zc, &len, err);
                release_sock(sk);
  [...]

We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).

The main changes are:

1) Adds support of pointers to types with known size among global function
   args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.

2) Add bpf_iter for task_vma which can be used to generate information similar
   to /proc/pid/maps, from Song Liu.

3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
   rewriting bind user ports from BPF side below the ip_unprivileged_port_start
   range, both from Stanislav Fomichev.

4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
   as well as per-cpu maps for the latter, from Alexei Starovoitov.

5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
   for sleepable programs, both from KP Singh.

6) Extend verifier to enable variable offset read/write access to the BPF
   program stack, from Andrei Matei.

7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
   query device MTU from programs, from Jesper Dangaard Brouer.

8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
   tracing programs, from Florent Revest.

9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
   otherwise too many passes are required, from Gary Lin.

10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
    verification both related to zero-extension handling, from Ilya Leoshkevich.

11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.

12) Batch of AF_XDP selftest cleanups and small performance improvement
    for libbpf's xsk map redirect for newer kernels, from Björn Töpel.

13) Follow-up BPF doc and verifier improvements around atomics with
    BPF_FETCH, from Brendan Jackman.

14) Permit zero-sized data sections e.g. if ELF .rodata section contains
    read-only data from local variables, from Yonghong Song.

15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:14:06 -08:00
Sasha Levin 88a686728b kbuild: simplify access to the kernel's version
Instead of storing the version in a single integer and having various
kernel (and userspace) code how it's constructed, export individual
(major, patchlevel, sublevel) components and simplify kernel code that
uses it.

This should also make it easier on userspace.

Signed-off-by: Sasha Levin <sashal@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-02-16 12:01:45 +09:00
Alex Elder 25c5a7e89b net: ipa: initialize all resources
We configure the minimum and maximum number of various types of IPA
resources in ipa_resource_config().  It iterates over resource types
in the configuration data and assigns resource limits to each
resource group for each type.

Unfortunately, we are repeatedly initializing the resource data for
the first type, rather than initializing each of the types whose
limits are specified.

Fix this bug.

Fixes: 4a0d7579d4 ("net: ipa: avoid going past end of resource group array")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:26:29 -08:00
Colin Ian King f6724cd497 i40e: Fix uninitialized variable mfs_max
The variable mfs_max is not initialized and is being compared to find
the maximum value. Fix this by initializing it to 0.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 90bc8e003b ("i40e: Add hardware configuration for software based DCB")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:21:46 -08:00
Heiner Kallweit 93e8990c24 net: phy: rename PHY_IGNORE_INTERRUPT to PHY_MAC_INTERRUPT
Some internal PHY's have their events like link change reported by the
MAC interrupt. We have PHY_IGNORE_INTERRUPT to deal with this scenario.
I'm not too happy with this name. We don't ignore interrupts, typically
there is no interrupt exposed at a PHY level. So let's rename it to
PHY_MAC_INTERRUPT. This is in line with phy_mac_interrupt(), which is
called from the MAC interrupt handler to handle PHY events.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:20:49 -08:00
Michael Walle 63477a5d4c net: phy: at803x: add MDIX support to AR8031/33
AR8035 recently gained MDIX support. The same functions will work for
the AR8031/33 PHY. We just need to add the at803x_config_aneg()
callback.

This was tested on a Kontron sl28 board.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:20:00 -08:00
Sukadev Bhattiprolu 4a41c421f3 ibmvnic: serialize access to work queue on remove
The work queue is used to queue reset requests like CHANGE-PARAM or
FAILOVER resets for the worker thread. When the adapter is being removed
the adapter state is set to VNIC_REMOVING and the work queue is flushed
so no new work is added. However the check for adapter being removed is
racy in that the adapter can go into REMOVING state just after we check
and we might end up adding work just as it is being flushed (or after).

The ->rwi_lock is already being used to serialize queue/dequeue work.
Extend its usage ensure there is no race when scheduling/flushing work.

Fixes: 6954a9e419 ("ibmvnic: Flush existing work items before device removal")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Cc:Uwe Kleine-König <uwe@kleine-koenig.org>
Cc:Saeed Mahameed <saeed@kernel.org>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:17:30 -08:00
Florian Fainelli 5d4358ede8 net: phy: broadcom: Allow BCM54210E to configure APD
BCM54210E/BCM50212E has been verified to work correctly with the
auto-power down configuration done by bcm54xx_adjust_rxrefclk(), add it
to the list of PHYs working.

While we are at it, provide an appropriate name for the bit we are
changing which disables the RXC and TXC during auto-power down when
there is no energy on the cable.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:15:25 -08:00
Florian Fainelli 17d3a83afb net: phy: broadcom: Remove unused flags
We have a number of unused flags defined today and since we are scarce
on space and may need to introduce new flags in the future remove and
shift every existing flag down into a contiguous assignment.
PHY_BCM_FLAGS_MODE_1000BX was only used internally for the BCM54616S
PHY, so we allocate a driver private structure instead to store that
flag instead of canibalizing one from phydev->dev_flags for that
purpose.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:15:25 -08:00
Florian Fainelli 133bf7b4fb net: phy: broadcom: Avoid forward for bcm54xx_config_clock_delay()
Avoid a forward declaration by moving the callers of
bcm54xx_config_clock_delay() below its body.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:15:25 -08:00
Lijun Pan 7d3a7b9ea5 ibmvnic: skip send_request_unmap for timeout reset
Timeout reset will trigger the VIOS to unmap it automatically,
similarly as FAILVOER and MOBILITY events. If we unmap it
in the linux side, we will see errors like
"30000003: Error 4 in REQUEST_UNMAP_RSP".
So, don't call send_request_unmap for timeout reset.

Fixes: ed651a1087 ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:12:26 -08:00
Lijun Pan 42557dab78 ibmvnic: add memory barrier to protect long term buffer
dma_rmb() barrier is added to load the long term buffer before copying
it to socket buffer; and dma_wmb() barrier is added to update the
long term buffer before it being accessed by VIOS (virtual i/o server).

Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Acked-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:12:01 -08:00
Lijun Pan 1a42156f52 ibmvnic: substitute mb() with dma_wmb() for send_*crq* functions
The CRQ and subCRQ descriptors are DMA mapped, so dma_wmb(),
though weaker, is good enough to protect the data structures.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Acked-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:11:04 -08:00
Lijun Pan 1c7d45e7b2 ibmvnic: simplify reset_long_term_buff function
The only thing reset_long_term_buff() should do is set
buffer to zero. After doing that, it is not necessary to
send_request_map again to VIOS since it actually does not
change the mapping. So, keep memset function and remove all
others.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:10:38 -08:00
Gustavo A. R. Silva 7f76963b69 i40e: Fix incorrect argument in call to ipv6_addr_any()
It seems that the right argument to be passed is &tcp_ip6_spec->ip6dst,
not &tcp_ip6_spec->ip6src, when calling function ipv6_addr_any().

Addresses-Coverity-ID: 1501734 ("Copy-paste error")
Fixes: efca91e89b ("i40e: Add flow director support for IPv6")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:07:13 -08:00
Rafał Miłecki 14b3b46a67 net: broadcom: bcm4908_enet: set MTU on open & on request
Hardware comes up with default max frame size set to 1518. When using it
with switch it results in actual Ethernet MTU 1492:
1518 - 14 (Ethernet header) - 4 (Broadcom's tag) - 4 (802.1q) - 4 (FCS)

Above means hardware in its default state can't handle standard Ethernet
traffic (MTU 1500).

Define maximum possible Ethernet overhead and always set MAC max frame
length accordingly. This change fixes handling Ethernet frames of length
1506 - 1514.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 15:06:36 -08:00
Nobuhiro Iwamatsu b38dd98ff8 net: stmmac: Add Toshiba Visconti SoCs glue driver
Add dwmac-visconti to the stmmac driver in Toshiba Visconti ARM SoCs.
This patch contains only the basic function of the device. There is no
clock control, PM, etc. yet. These will be added in the future.

Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 14:59:35 -08:00
Heiner Kallweit 7ce189faa7 r8169: fix resuming from suspend on RTL8105e if machine runs on battery
Armin reported that after referenced commit his RTL8105e is dead when
resuming from suspend and machine runs on battery. This patch has been
confirmed to fix the issue.

Fixes: e80bd76fbf ("r8169: work around power-saving bug on some chip versions")
Reported-by: Armin Wolf <W_Armin@gmx.de>
Tested-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 14:56:07 -08:00
Stefan Chulski 3a616b92a9 net: mvpp2: Add TX flow control support for jumbo frames
With MTU less than 1500B on all ports, the driver uses per CPU pool mode.
If one of the ports set to jumbo frame MTU size, all ports move
to shared pools mode.
Here, buffer manager TX Flow Control reconfigured on all ports.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:37:13 -08:00
Stefan Chulski 7c29451550 net: mvpp2: reduce tx-fifo for loopback port
1KB is enough for loopback port, so 2KB can be distributed
between other ports.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:36:11 -08:00
Vladimir Oltean 8841f6e63f net: dsa: sja1105: make devlink property best_effort_vlan_filtering true by default
The sja1105 driver has a limitation, extensively described under
Documentation/networking/dsa/sja1105.rst and
Documentation/networking/devlink/sja1105.rst, which says that when the
ports are under a bridge with vlan_filtering=1, traffic to and from
the network stack is not possible, unless the driver-specific
best_effort_vlan_filtering devlink parameter is enabled.

For users, this creates a 'wtf' moment. They need to go to the
documentation and find about the existence of this property, then maybe
install devlink and set it to true.

Having best_effort_vlan_filtering enabled by the kernel by default
delays that 'wtf' moment (maybe up to the point that it never even
happens). The user doesn't need to care that the driver supports
addressing the ports individually by retagging VLAN IDs until he/she
needs to use more than 32 VLAN IDs (since there can be at most 32
retagging rules). Only then do they need to think whether they need the
full VLAN table, at the expense of no individual port addressing, or
not.

But the odds that an sja1105 user will need more than 32 VLANs
terminated by the CPU is probably low. And, if we were to follow the
principle that more advanced use cases should require more advanced
preparation steps, then it makes more sense for ping to 'just work'
while CPU termination of > 32 VLAN IDs to require a bit more forethought
and possibly a driver-specific devlink param.

So we should be able to safely change the default here, and make this
driver act just a little bit more sanely out of the box.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:23:57 -08:00
Tong Zhang a67f061615 net: wan/lmc: dont print format string when not available
dev->name is determined only after calling register_hdlc_device(),
however ,it is used by printk before the name is fully determined.

  [    4.565137] hdlc%d: detected at e8000000, irq 11

Instead of printing out a %d, print hdlc directly

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:03:22 -08:00
Tong Zhang 62e69bc419 net: wan/lmc: unregister device when no matching device is found
lmc set sc->lmc_media pointer when there is a matching device.
However, when no matching device is found, this pointer is NULL
and the following dereference will result in a null-ptr-deref.

To fix this issue, unregister the hdlc device and return an error.

[    4.569359] BUG: KASAN: null-ptr-deref in lmc_init_one.cold+0x2b6/0x55d [lmc]
[    4.569748] Read of size 8 at addr 0000000000000008 by task modprobe/95
[    4.570102]
[    4.570187] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7 #94
[    4.570527] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-preb4
[    4.571125] Call Trace:
[    4.571261]  dump_stack+0x7d/0xa3
[    4.571445]  kasan_report.cold+0x10c/0x10e
[    4.571667]  ? lmc_init_one.cold+0x2b6/0x55d [lmc]
[    4.571932]  lmc_init_one.cold+0x2b6/0x55d [lmc]
[    4.572186]  ? lmc_mii_readreg+0xa0/0xa0 [lmc]
[    4.572432]  local_pci_probe+0x6f/0xb0
[    4.572639]  pci_device_probe+0x171/0x240
[    4.572857]  ? pci_device_remove+0xe0/0xe0
[    4.573080]  ? kernfs_create_link+0xb6/0x110
[    4.573315]  ? sysfs_do_create_link_sd.isra.0+0x76/0xe0
[    4.573598]  really_probe+0x161/0x420
[    4.573799]  driver_probe_device+0x6d/0xd0
[    4.574022]  device_driver_attach+0x82/0x90
[    4.574249]  ? device_driver_attach+0x90/0x90
[    4.574485]  __driver_attach+0x60/0x100
[    4.574694]  ? device_driver_attach+0x90/0x90
[    4.574931]  bus_for_each_dev+0xe1/0x140
[    4.575146]  ? subsys_dev_iter_exit+0x10/0x10
[    4.575387]  ? klist_node_init+0x61/0x80
[    4.575602]  bus_add_driver+0x254/0x2a0
[    4.575812]  driver_register+0xd3/0x150
[    4.576021]  ? 0xffffffffc0018000
[    4.576202]  do_one_initcall+0x84/0x250
[    4.576411]  ? trace_event_raw_event_initcall_finish+0x150/0x150
[    4.576733]  ? unpoison_range+0xf/0x30
[    4.576938]  ? ____kasan_kmalloc.constprop.0+0x84/0xa0
[    4.577219]  ? unpoison_range+0xf/0x30
[    4.577423]  ? unpoison_range+0xf/0x30
[    4.577628]  do_init_module+0xf8/0x350
[    4.577833]  load_module+0x3fe6/0x4340
[    4.578038]  ? vm_unmap_ram+0x1d0/0x1d0
[    4.578247]  ? ____kasan_kmalloc.constprop.0+0x84/0xa0
[    4.578526]  ? module_frob_arch_sections+0x20/0x20
[    4.578787]  ? __do_sys_finit_module+0x108/0x170
[    4.579037]  __do_sys_finit_module+0x108/0x170
[    4.579278]  ? __ia32_sys_init_module+0x40/0x40
[    4.579523]  ? file_open_root+0x200/0x200
[    4.579742]  ? do_sys_open+0x85/0xe0
[    4.579938]  ? filp_open+0x50/0x50
[    4.580125]  ? exit_to_user_mode_prepare+0xfc/0x130
[    4.580390]  do_syscall_64+0x33/0x40
[    4.580586]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    4.580859] RIP: 0033:0x7f1a724c3cf7
[    4.581054] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 48 891
[    4.582043] RSP: 002b:00007fff44941c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    4.582447] RAX: ffffffffffffffda RBX: 00000000012ada70 RCX: 00007f1a724c3cf7
[    4.582827] RDX: 0000000000000000 RSI: 00000000012ac9e0 RDI: 0000000000000003
[    4.583207] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
[    4.583587] R10: 00007f1a72527300 R11: 0000000000000246 R12: 00000000012ac9e0
[    4.583968] R13: 0000000000000000 R14: 00000000012acc90 R15: 0000000000000001
[    4.584349] ==================================================================

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:02:02 -08:00
Vladimir Oltean 1f778d500d net: mscc: ocelot: avoid type promotion when calling ocelot_ifh_set_dest
Smatch is confused by the fact that a 32-bit BIT(port) macro is passed
as argument to the ocelot_ifh_set_dest function and warns:

ocelot_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type?
seville_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type?

The destination port mask is copied into a 12-bit field of the packet,
starting at bit offset 67 and ending at 56.

So this DSA tagging protocol supports at most 12 bits, which is clearly
less than 32. Attempting to send to a port number > 12 will cause the
packing() call to truncate way before there will be 32-bit truncation
due to type promotion of the BIT(port) argument towards u64.

Therefore, smatch's fears that BIT(port) will do the wrong thing and
cause unexpected truncation for "port" values >= 32 are unfounded.
Nonetheless, let's silence the warning by explicitly passing an u64
value to ocelot_ifh_set_dest, such that the compiler does not need to do
a questionable type promotion.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 12:42:19 -08:00
Colin Ian King 4773acf3d4 b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
The documentation for the PHY update [1] states:

Loop 4 times with index i

    If PHY Revision >= 3
        Copy table[i] to coef[i]
    Otherwise
        Set coef[i] to 0

the copy of the table to coef is currently implemented the wrong way
around, table is being updated from uninitialized values in coeff.
Fix this by swapping the assignment around.

[1] https://bcm-v4.sipsolutions.net/802.11/PHY/N/RestoreCal/

Fixes: 2f258b74d1 ("b43: N-PHY: implement restoring general configuration")
Addresses-Coverity: ("Uninitialized scalar variable")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 12:40:41 -08:00
Ayush Sawal 2355a6773a cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds
The Max imm data size in cxgb4 is not similar to the max imm data size
in the chtls. This caused an mismatch in output of is_ofld_imm() of
cxgb4 and chtls. So fixed this by keeping the max wreq size of imm data
same in both chtls and cxgb4 as MAX_IMM_OFLD_TX_DATA_WR_LEN.

As cxgb4's max imm. data value for ofld packets is changed to
MAX_IMM_OFLD_TX_DATA_WR_LEN. Using the same in cxgbit also.

Fixes: 36bedb3f2e ("crypto: chtls - Inline TLS record Tx")
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 12:39:33 -08:00
Heiner Kallweit d2a0437081 r8169: fix resuming from suspend on RTL8105e if machine runs on battery
Armin reported that after referenced commit his RTL8105e is dead when
resuming from suspend and machine runs on battery. This patch has been
confirmed to fix the issue.

Fixes: e80bd76fbf ("r8169: work around power-saving bug on some chip versions")
Reported-by: Armin Wolf <W_Armin@gmx.de>
Tested-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 12:34:32 -08:00
Wei Liu 3019270282 Revert "Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer"
This reverts commit a8c3209998.

It is reported that the said commit caused regression in netvsc.

Reported-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-02-15 10:49:11 +00:00
Jan Beulich 3194a1746e xen-netback: don't "handle" error by BUG()
In particular -ENOMEM may come back here, from set_foreign_p2m_mapping().
Don't make problems worse, the more that handling elsewhere (together
with map's status fields now indicating whether a mapping wasn't even
attempted, and hence has to be considered failed) doesn't require this
odd way of dealing with errors.

This is part of XSA-362.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-15 08:55:31 +01:00
Stefan Chulski 935a11845a net: mvpp2: improve Networking Complex Control register naming
GENCONF_CTRL0_PORTX naming improved.
Non functional change.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:40:43 -08:00
Stefan Chulski 9ad78d81cb net: mvpp2: improve mvpp2_get_sram return
Use PTR_ERR_OR_ZERO instead of IS_ERR and PTR_ERR.
Non functional change.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:40:43 -08:00
Stefan Chulski f704177e47 net: mvpp2: improve Packet Processor version check
Use >= MVPP22 instead of != MVPP21.
Non functional change.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:40:43 -08:00
Stefan Chulski 8b986866b2 net: mvpp2: simplify PPv2 version ID read
PPv2.1 contain 0 in Version ID register, priv->hw_version check
can be removed.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:40:43 -08:00
Vladimir Oltean 89153ed6eb net: dsa: propagate extack to .port_vlan_filtering
Some drivers can't dynamically change the VLAN filtering option, or
impose some restrictions, it would be nice to propagate this info
through netlink instead of printing it to a kernel log that might never
be read. Also netlink extack includes the module that emitted the
message, which means that it's easier to figure out which ones are
driver-generated errors as opposed to command misuse.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:38:12 -08:00
Vladimir Oltean 31046a5fd9 net: dsa: propagate extack to .port_vlan_add
Allow drivers to communicate their restrictions to user space directly,
instead of printing to the kernel log. Where the conversion would have
been lossy and things like VLAN ID could no longer be conveyed (due to
the lack of support for printf format specifier in netlink extack), I
chose to keep the messages in full form to the kernel log only, and
leave it up to individual driver maintainers to move more messages to
extack.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:38:11 -08:00
Vladimir Oltean 0a6f17c6ae net: dsa: tag_ocelot_8021q: add support for PTP timestamping
For TX timestamping, we use the felix_txtstamp method which is common
with the regular (non-8021q) ocelot tagger. This method says that skb
deferral is needed, prepares a timestamp request ID, and puts a clone of
the skb in a queue waiting for the timestamp IRQ.

felix_txtstamp is called by dsa_skb_tx_timestamp() just before the
tagger's xmit method. In the tagger xmit, we divert the packets
classified by dsa_skb_tx_timestamp() as PTP towards the MMIO-based
injection registers, and we declare them as dead towards dsa_slave_xmit.
If not PTP, we proceed with normal tag_8021q stuff.

Then the timestamp IRQ fires, the clone queued up from felix_txtstamp is
matched to the TX timestamp retrieved from the switch's FIFO based on
the timestamp request ID, and the clone is delivered to the stack.

On RX, thanks to the VCAP IS2 rule that redirects the frames with an
EtherType for 1588 towards two destinations:
- the CPU port module (for MMIO based extraction) and
- if the "no XTR IRQ" workaround is in place, the dsa_8021q CPU port
the relevant data path processing starts in the ptp_classify_raw BPF
classifier installed by DSA in the RX data path (post tagger, which is
completely unaware that it saw a PTP packet).

This time we can't reuse the same implementation of .port_rxtstamp that
also works with the default ocelot tagger. That is because felix_rxtstamp
is given an skb with a freshly stripped DSA header, and it says "I don't
need deferral for its RX timestamp, it's right in it, let me show you";
and it just points to the header right behind skb->data, from where it
unpacks the timestamp and annotates the skb with it.

The same thing cannot happen with tag_ocelot_8021q, because for one
thing, the skb did not have an extraction frame header in the first
place, but a VLAN tag with no timestamp information. So the code paths
in felix_rxtstamp for the regular and 8021q tagger are completely
independent. With tag_8021q, the timestamp must come from the packet's
duplicate delivered to the CPU port module, but there is potentially
complex logic to be handled [ and prone to reordering ] if we were to
just start reading packets from the CPU port module, and try to match
them to the one we received over Ethernet and which needs an RX
timestamp. So we do something simple: we tell DSA "give me some time to
think" (we request skb deferral by returning false from .port_rxtstamp)
and we just drop the frame we got over Ethernet with no attempt to match
it to anything - we just treat it as a notification that there's data to
be processed from the CPU port module's queues. Then we proceed to read
the packets from those, one by one, which we deliver up the stack,
timestamped, using netif_rx - the same function that any driver would
use anyway if it needed RX timestamp deferral. So the assumption is that
we'll come across the PTP packet that triggered the CPU extraction
notification eventually, but we don't know when exactly. Thanks to the
VCAP IS2 trap/redirect rule and the exclusion of the CPU port module
from the flooding replicators, only PTP frames should be present in the
CPU port module's RX queues anyway.

There is just one conflict between the VCAP IS2 trapping rule and the
semantics of the BPF classifier. Namely, ptp_classify_raw() deems
general messages as non-timestampable, but still, those are trapped to
the CPU port module since they have an EtherType of ETH_P_1588. So, if
the "no XTR IRQ" workaround is in place, we need to run another BPF
classifier on the frames extracted over MMIO, to avoid duplicates being
sent to the stack (once over Ethernet, once over MMIO). It doesn't look
like it's possible to install VCAP IS2 rules based on keys extracted
from the 1588 frame headers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean c8c0ba4fe2 net: dsa: felix: setup MMIO filtering rules for PTP when using tag_8021q
Since the tag_8021q tagger is software-defined, it has no means by
itself for retrieving hardware timestamps of PTP event messages.

Because we do want to support PTP on ocelot even with tag_8021q, we need
to use the CPU port module for that. The RX timestamp is present in the
Extraction Frame Header. And because we can't use NPI mode which redirects
the CPU queues to an "external CPU" (meaning the ARM CPU running Linux),
then we need to poll the CPU port module through the MMIO registers to
retrieve TX and RX timestamps.

Sadly, on NXP LS1028A, the Felix switch was integrated into the SoC
without wiring the extraction IRQ line to the ARM GIC. So, if we want to
be notified of any PTP packets received on the CPU port module, we have
a problem.

There is a possible workaround, which is to use the Ethernet CPU port as
a notification channel that packets are available on the CPU port module
as well. When a PTP packet is received by the DSA tagger (without timestamp,
of course), we go to the CPU extraction queues, poll for it there, then
we drop the original Ethernet packet and masquerade the packet retrieved
over MMIO (plus the timestamp) as the original when we inject it up the
stack.

Create a quirk in struct felix is selected by the Felix driver (but not
by Seville, since that doesn't support PTP at all). We want to do this
such that the workaround is minimally invasive for future switches that
don't require this workaround.

The only traffic for which we need timestamps is PTP traffic, so add a
redirection rule to the CPU port module for this. Currently we only have
the need for PTP over L2, so redirection rules for UDP ports 319 and 320
are TBD for now.

Note that for the workaround of matching of PTP-over-Ethernet-port with
PTP-over-MMIO queues to work properly, both channels need to be
absolutely lossless. There are two parts to achieving that:
- We keep flow control enabled on the tag_8021q CPU port
- We put the DSA master interface in promiscuous mode, so it will never
  drop a PTP frame (for the profiles we are interested in, these are
  sent to the multicast MAC addresses of 01-80-c2-00-00-0e and
  01-1b-19-00-00-00).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean 924ee317f7 net: mscc: ocelot: refactor ocelot_xtr_irq_handler into ocelot_xtr_poll
Since the felix DSA driver will need to poll the CPU port module for
extracted frames as well, let's create some common functions that read
an Extraction Frame Header, and then an skb, from a CPU extraction
group.

We abuse the struct ocelot_ops :: port_to_netdev function a little bit,
in order to retrieve the DSA port net_device or the ocelot switchdev
net_device based on the source port information from the Extraction
Frame Header, but it's all in the benefit of code simplification -
netdev_alloc_skb needs it. Originally, the port_to_netdev method was
intended for parsing act->dev from tc flower offload code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean 7c4bb540e9 net: dsa: tag_ocelot: create separate tagger for Seville
The ocelot tagger is a hot mess currently, it relies on memory
initialized by the attached driver for basic frame transmission.
This is against all that DSA tagging protocols stand for, which is that
the transmission and reception of a DSA-tagged frame, the data path,
should be independent from the switch control path, because the tag
protocol is in principle hot-pluggable and reusable across switches
(even if in practice it wasn't until very recently). But if another
driver like dsa_loop wants to make use of tag_ocelot, it couldn't.

This was done to have common code between Felix and Ocelot, which have
one bit difference in the frame header format. Quoting from commit
67c2404922 ("net: dsa: felix: create a template for the DSA tags on
xmit"):

    Other alternatives have been analyzed, such as:
    - Create a separate tag_seville.c: too much code duplication for just 1
      bit field difference.
    - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
      tag_brcm.c, which would have a separate .xmit function. Again, too
      much code duplication for just 1 bit field difference.
    - Allocate the template from the init function of the tag_ocelot.c
      module, instead of from the driver: couldn't figure out a method of
      accessing the correct port template corresponding to the correct
      tagger in the .xmit function.

The really interesting part is that Seville should have had its own
tagging protocol defined - it is not compatible on the wire with Ocelot,
even for that single bit. In principle, a packet generated by
DSA_TAG_PROTO_OCELOT when booted on NXP LS1028A would look in a certain
way, but when booted on NXP T1040 it would look differently. The reverse
is also true: a packet generated by a Seville switch would be
interpreted incorrectly by Wireshark if it was told it was generated by
an Ocelot switch.

Actually things are a bit more nuanced. If we concentrate only on the
DSA tag, what I said above is true, but Ocelot/Seville also support an
optional DSA tag prefix, which can be short or long, and it is possible
to distinguish the two taggers based on an integer constant put in that
prefix. Nonetheless, creating a separate tagger is still justified,
since the tag prefix is optional, and without it, there is again no way
to distinguish.

Claiming backwards binary compatibility is a bit more tough, since I've
already changed the format of tag_ocelot once, in commit 5124197ce5
("net: dsa: tag_ocelot: use a short prefix on both ingress and egress").
Therefore I am not very concerned with treating this as a bugfix and
backporting it to stable kernels (which would be another mess due to the
fact that there would be lots of conflicts with the other DSA_TAG_PROTO*
definitions). It's just simpler to say that the string values of the
taggers have ABI value starting with kernel 5.12, which will be when the
changing of tag protocol via /sys/class/net/<dsa-master>/dsa/tagging
goes live.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean 40d3f295b5 net: mscc: ocelot: use common tag parsing code with DSA
The Injection Frame Header and Extraction Frame Header that the switch
prepends to frames over the NPI port is also prepended to frames
delivered over the CPU port module's queues.

Let's unify the handling of the frame headers by making the ocelot
driver call some helpers exported by the DSA tagger. Among other things,
this allows us to get rid of the strange cpu_to_be32 when transmitting
the Injection Frame Header on ocelot, since the packing API uses
network byte order natively (when "quirks" is 0).

The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and
the cpu extraction queue mask to something, but the code doesn't do it,
so we don't do it either.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean 137ffbc4bb net: mscc: ocelot: refactor ocelot_port_inject_frame out of ocelot_port_xmit
The felix DSA driver will inject some frames through register MMIO, same
as ocelot switchdev currently does. So we need to be able to reuse the
common code.

Also create some shim definitions, since the DSA tagger can be compiled
without support for the switch driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean 5f016f42d3 net: mscc: ocelot: use DIV_ROUND_UP helper in ocelot_port_inject_frame
This looks a bit nicer than the open-coded "(x + 3) % 4" idiom.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:43 -08:00
Vladimir Oltean a94306cea5 net: mscc: ocelot: better error handling in ocelot_xtr_irq_handler
The ocelot_rx_frame_word() function can return a negative error code,
however this isn't being checked for consistently. Errors being ignored
have not been seen in practice though.

Also, some constructs can be simplified by using "goto" instead of
repeated "break" statements.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:43 -08:00
Vladimir Oltean d7795f8f26 net: mscc: ocelot: only drain extraction queue on error
It appears that the intention of this snippet of code is to not exit
ocelot_xtr_irq_handler() while in the middle of extracting a frame.
The problem in extracting it word by word is that future extraction
attempts are really easy to get desynchronized, since the IRQ handler
assumes that the first 16 bytes are the IFH, which give further
information about the frame, such as frame length.

But during normal operation, "err" will not be 0, but 4, set from here:

		for (i = 0; i < OCELOT_TAG_LEN / 4; i++) {
			err = ocelot_rx_frame_word(ocelot, grp, true, &ifh[i]);
			if (err != 4)
				break;
		}

		if (err != 4)
			break;

In that case, draining the extraction queue is a no-op. So explicitly
make this code execute only on negative err.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:43 -08:00
Vladimir Oltean f833ca293d net: mscc: ocelot: stop returning IRQ_NONE in ocelot_xtr_irq_handler
Since the xtr (extraction) IRQ of the ocelot switch is not shared, then
if it fired, it means that some data must be present in the queues of
the CPU port module. So simplify the code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:43 -08:00
Michael Chan f4d95c3c19 bnxt_en: Improve logging of error recovery settings information.
We currently only log the error recovery settings if it is enabled.
In some cases, firmware disables error recovery after it was
initially enabled.  Without logging anything, the user will not be
aware of this change in setting.

Log it when error recovery is disabled.  Also, change the reset count
value from hexadecimal to decimal.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan df97b34d3a bnxt_en: Reply to firmware's echo request async message.
This is a new async message that the firmware can send to check if it
can communicate with the driver.  This is an added error detection
scheme that firmware can use if it suspects errors in the PCIe
interface.  When the driver receives this async message, it will reply
back echoing some data in the async message.  If the firmware is not
getting the reply with the proper data after some retries, error
recovery will kick in.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan 41435c3940 bnxt_en: Initialize "context kind" field for context memory blocks.
If firmware provides the offset to the "context kind" field of the
relevant context memory blocks, we'll initialize just that field for
each block instead of initializing all of context memory.

Populate the bnxt_mem_init structure with the proper offset returned
by firmware.  If it is older firmware and the information is not
available, we set the offset to an invalid value and fall back to
the old behavior of initializing every byte.  Otherwise, we initialize
only the "context kind" byte at the offset.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan e9696ff33c bnxt_en: Add context memory initialization infrastructure.
Currently, the driver calls memset() to set all relevant context memory
used by the chip to the initial value.  This can take many milliseconds
with the potentially large number of context pages allocated for the
chip.

To make this faster, we only need to initialize the "context kind" field
of each block of context memory.  This patch sets up the infrastructure
to do that with the bnxt_mem_init structure.  In the next patch, we'll
add the logic to obtain the offset of the "context kind" from the
firmware.  This patch is not changing the current behavior of calling
memset() to initialize all relevant context memory.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Michael Chan dab62e7c2d bnxt_en: Implement faster recovery for firmware fatal error.
During some fatal firmware error conditions, the PCI config space
register 0x2e which normally contains the subsystem ID will become
0xffff.  This register will revert back to the normal value after
the chip has completed core reset.  If we detect this condition,
we can poll this config register immediately for the value to revert.
Because we use config read cycles to poll this register, there is no
possibility of Master Abort if we happen to read it during core reset.
This speeds up recovery significantly as we don't have to wait for the
conservative min_time before polling MMIO to see if the firmware has
come out of reset.  As soon as this register changes value we can
proceed to re-initialize the device.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Edwin Peer be6d755f3d bnxt_en: selectively allocate context memories
Newer devices may have local context memory instead of relying on the
host for backing store. In these cases, HWRM_FUNC_BACKING_STORE_QCAPS
will return a zero entry size to indicate contexts for which the host
should not allocate backing store.

Selectively allocate context memory based on device capabilities and
only enable backing store for the appropriate contexts.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Michael Chan 31f67c2ee0 bnxt_en: Update firmware interface spec to 1.10.2.16.
The main changes are the echo request/response from firmware for error
detection and the NO_FCS feature to transmit frames without FCS.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Robert Hancock 6c8f06bb2e net: axienet: Support dynamic switching between 1000BaseX and SGMII
Newer versions of the Xilinx AXI Ethernet core (specifically version 7.2 or
later) allow the core to be configured with a PHY interface mode of "Both",
allowing either 1000BaseX or SGMII modes to be selected at runtime. Add
support for this in the driver to allow better support for applications
which can use both fiber and copper SFP modules.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:38:53 -08:00
Robert Hancock 66b51663cd net: axienet: hook up nway_reset ethtool operation
Hook up the nway_reset ethtool operation to the corresponding phylink
function so that "ethtool -r" can be supported.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:38:53 -08:00
Robert Hancock 57baf8cc70 net: axienet: Handle deferred probe on clock properly
This driver is set up to use a clock mapping in the device tree if it is
present, but still work without one for backward compatibility. However,
if getting the clock returns -EPROBE_DEFER, then we need to abort and
return that error from our driver initialization so that the probe can
be retried later after the clock is set up.

Move clock initialization to earlier in the process so we do not waste as
much effort if the clock is not yet available. Switch to use
devm_clk_get_optional and abort initialization on any error reported.
Also enable the clock regardless of whether the controller is using an MDIO
bus, as the clock is required in any case.

Fixes: 09a0354cad ("net: axienet: Use clock framework to get device clock rate")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:36:31 -08:00
David S. Miller 5cdaf9d6fa Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2021-02-12

This series contains updates to i40e, ice, and ixgbe drivers.

Maciej does cleanups on the following drivers.
For i40e, removes redundant check for XDP prog, cleans up no longer
relevant information, and removes an unused function argument.
For ice, removes local variable use, instead returning values directly.
Moves skb pointer from buffer to ring and removes an unneeded check for
xdp_prog in zero copy path. Also removes a redundant MTU check when
changing it.
For i40e, ice, and ixgbe, stores the rx_offset in the Rx ring as
the value is constant so there's no need for continual calls.

Bjorn folds a decrement into a while statement.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:24:47 -08:00
Rob Herring 83c4a4eec0 of: Remove of_dev_{get,put}()
of_dev_get() and of_dev_put are just wrappers for get_device()/put_device()
on a platform_device. There's also already platform_device_{get,put}()
wrappers for this purpose. Let's update the few users and remove
of_dev_{get,put}().

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Patrice Chotard <patrice.chotard@st.com>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Gilles Muller <Gilles.Muller@inria.fr>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-usb@vger.kernel.org
Cc: cocci@systeme.lip6.fr
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210211232745.1498137-2-robh@kernel.org
2021-02-12 19:23:39 -06:00
Dany Madden a6f2fe5f10 ibmvnic: change IBMVNIC_MAX_IND_DESCS to 16
The supported indirect subcrq entries on Power8 is 16. Power9
supports 128. Redefined this value to 16 to minimize the driver from
having to reset when migrating between Power9 and Power8. In our rx/tx
performance testing, we found no performance difference between 16 and
128 at this time.

Fixes: f019fb6392 ("ibmvnic: Introduce indirect subordinate Command Response Queue buffer")
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:17:05 -08:00
Vladimir Oltean 4d94235495 net: dsa: sja1105: offload bridge port flags to device
The chip can configure unicast flooding, broadcast flooding and learning.
Learning is per port, while flooding is per {ingress, egress} port pair
and we need to configure the same value for all possible ingress ports
towards the requested one.

While multicast flooding is not officially supported, we can hack it by
using a feature of the second generation (P/Q/R/S) devices, which is that
FDB entries are maskable, and multicast addresses always have an odd
first octet. So by putting a match-all for 00:01:00:00:00:00 addr and
00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is
always checked last, and does not take precedence in front of any other
MDB. So it behaves effectively as an unknown multicast entry.

For the first generation switches, this feature is not available, so
unknown multicast will always be treated the same as unknown unicast.
So the only thing we can do is request the user to offload the settings
for these 2 flags in tandem, i.e.

ip link set swp2 type bridge_slave flood off
Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
ip link set swp2 type bridge_slave flood off mcast_flood off
ip link set swp2 type bridge_slave mcast_flood on
Error: sja1105: This chip cannot configure multicast flooding independently of unicast.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean 421741ea56 net: mscc: ocelot: offload bridge port flags to device
We should not be unconditionally enabling address learning, since doing
that is actively detrimential when a port is standalone and not offloading
a bridge. Namely, if a port in the switch is standalone and others are
offloading the bridge, then we could enter a situation where we learn an
address towards the standalone port, but the bridged ports could not
forward the packet there, because the CPU is the only path between the
standalone and the bridged ports. The solution of course is to not
enable address learning unless the bridge asks for it.

We need to set up the initial port flags for no learning and flooding
everything, and also when the port joins and leaves the bridge.
The flood configuration was already configured ok for standalone mode
in ocelot_init, we just need to disable learning in ocelot_init_port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean b360d94f1b net: mscc: ocelot: use separate flooding PGID for broadcast
In preparation of offloading the bridge port flags which have
independent settings for unknown multicast and for broadcast, we should
also start reserving one destination Port Group ID for the flooding of
broadcast packets, to allow configuring it individually.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean 6edb9e8d45 net: dsa: felix: restore multicast flood to CPU when NPI tagger reinitializes
ocelot_init sets up PGID_MC to include the CPU port module, and that is
fine, but the ocelot-8021q tagger removes the CPU port module from the
unknown multicast replicator. So after a transition from the default
ocelot tagger towards ocelot-8021q and then again towards ocelot,
multicast flooding towards the CPU port module will be disabled.

Fixes: e21268efbe ("net: dsa: felix: perform switch setup for tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean a8b659e7ff net: dsa: act as passthrough for bridge port flags
There are multiple ways in which a PORT_BRIDGE_FLAGS attribute can be
expressed by the bridge through switchdev, and not all of them can be
emulated by DSA mid-layer API at the same time.

One possible configuration is when the bridge offloads the port flags
using a mask that has a single bit set - therefore only one feature
should change. However, DSA currently groups together unicast and
multicast flooding in the .port_egress_floods method, which limits our
options when we try to add support for turning off broadcast flooding:
do we extend .port_egress_floods with a third parameter which b53 and
mv88e6xxx will ignore? But that means that the DSA layer, which
currently implements the PRE_BRIDGE_FLAGS attribute all by itself, will
see that .port_egress_floods is implemented, and will report that all 3
types of flooding are supported - not necessarily true.

Another configuration is when the user specifies more than one flag at
the same time, in the same netlink message. If we were to create one
individual function per offloadable bridge port flag, we would limit the
expressiveness of the switch driver of refusing certain combinations of
flag values. For example, a switch may not have an explicit knob for
flooding of unknown multicast, just for flooding in general. In that
case, the only correct thing to do is to allow changes to BR_FLOOD and
BR_MCAST_FLOOD in tandem, and never allow mismatched values. But having
a separate .port_set_unicast_flood and .port_set_multicast_flood would
not allow the driver to possibly reject that.

Also, DSA doesn't consider it necessary to inform the driver that a
SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it
just calls .port_egress_floods for the CPU port. When we'll add support
for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real
problem because the flood settings will need to be held statefully in
the DSA middle layer, otherwise changing the mrouter port attribute will
impact the flooding attribute. And that's _assuming_ that the underlying
hardware doesn't have anything else to do when a multicast router
attaches to a port than flood unknown traffic to it.  If it does, there
will need to be a dedicated .port_set_mrouter anyway.

So we need to let the DSA drivers see the exact form that the bridge
passes this switchdev attribute in, otherwise we are standing in the
way. Therefore we also need to use this form of language when
communicating to the driver that it needs to configure its initial
(before bridge join) and final (after bridge leave) port flags.

The b53 and mv88e6xxx drivers are converted to the passthrough API and
their implementation of .port_egress_floods is split into two: a
function that configures unicast flooding and another for multicast.
The mv88e6xxx implementation is quite hairy, and it turns out that
the implementations of unknown unicast flooding are actually the same
for 6185 and for 6352:

behind the confusing names actually lie two individual bits:
NO_UNKNOWN_MC -> FLOOD_UC = 0x4 = BIT(2)
NO_UNKNOWN_UC -> FLOOD_MC = 0x8 = BIT(3)

so there was no reason to entangle them in the first place.

Whereas the 6185 writes to MV88E6185_PORT_CTL0_FORWARD_UNKNOWN of
PORT_CTL0, which has the exact same bit index. I have left the
implementations separate though, for the only reason that the names are
different enough to confuse me, since I am not able to double-check with
a user manual. The multicast flooding setting for 6185 is in a different
register than for 6352 though.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:04 -08:00
Vladimir Oltean e18f4c18ab net: switchdev: pass flags and mask to both {PRE_,}BRIDGE_FLAGS attributes
This switchdev attribute offers a counterproductive API for a driver
writer, because although br_switchdev_set_port_flag gets passed a
"flags" and a "mask", those are passed piecemeal to the driver, so while
the PRE_BRIDGE_FLAGS listener knows what changed because it has the
"mask", the BRIDGE_FLAGS listener doesn't, because it only has the final
value. But certain drivers can offload only certain combinations of
settings, like for example they cannot change unicast flooding
independently of multicast flooding - they must be both on or both off.
The way the information is passed to switchdev makes drivers not
expressive enough, and unable to reject this request ahead of time, in
the PRE_BRIDGE_FLAGS notifier, so they are forced to reject it during
the deferred BRIDGE_FLAGS attribute, where the rejection is currently
ignored.

This patch also changes drivers to make use of the "mask" field for edge
detection when possible.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:04 -08:00
Vladimir Oltean 4c08c586ff net: switchdev: propagate extack to port attributes
When a struct switchdev_attr is notified through switchdev, there is no
way to report informational messages, unlike for struct switchdev_obj.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:04 -08:00
David S. Miller b0aae0bde2 octeontx2: Fix condition.
Fixes: 93efb0c656 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()")
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:56:08 -08:00
Alex Elder 6170b6dab2 net: ipa: introduce gsi_channel_initialized()
Create a simple helper function that indicates whether a channel has
been initialized.  This abstacts/hides the details of how this is
determined.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:54:17 -08:00
Alex Elder a266ad6b5d net: ipa: introduce ipa_table_hash_support()
Introduce a new function to abstract the knowledge of whether hashed
routing and filter tables are supported for a given IPA instance.

IPA v4.2 is the only one that doesn't support hashed tables (now
and for the foreseeable future), but the name of the helper function
is better for explaining what's going on.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:54:17 -08:00
Alex Elder 2d65ed7692 net: ipa: fix register write command validation
In ipa_cmd_register_write_valid() we verify that values we will
supply to a REGISTER_WRITE IPA immediate command will fit in
the fields that need to hold them.  This patch fixes some issues
in that function and ipa_cmd_register_write_offset_valid().

The dev_err() call in ipa_cmd_register_write_offset_valid() has
some printf format errors:
  - The name of the register (corresponding to the string format
    specifier) was not supplied.
  - The IPA base offset and offset need to be supplied separately to
    match the other format specifiers.
Also make the ~0 constant used there to compute the maximum
supported offset value explicitly unsigned.

There are two other issues in ipa_cmd_register_write_valid():
  - There's no need to check the hash flush register for platforms
    (like IPA v4.2) that do not support hashed tables
  - The highest possible endpoint number, whose status register
    offset is computed, is COUNT - 1, not COUNT.

Fix these problems, and add some additional commentary.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:54:17 -08:00
Alex Elder 4c7ccfcd09 net: ipa: use dev_err_probe() in ipa_clock.c
When initializing the IPA core clock and interconnects, it's
possible we'll get an EPROBE_DEFER error.  This isn't really an
error, it's just means we need to be re-probed later.

Use dev_err_probe() to report the error rather than dev_err().
This avoids polluting the log with these "error" messages.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:54:17 -08:00
Alex Elder 571b1e7e58 net: ipa: use a separate pointer for adjusted GSI memory
This patch actually fixes a bug, though it doesn't affect the two
platforms supported currently.  The fix implements GSI memory
pointers a bit differently.

For IPA version 4.5 and above, the address space for almost all GSI
registers is adjusted downward by a fixed amount.  This is currently
handled by adjusting the I/O virtual address pointer after it has
been mapped.  The bug is that the pointer is not "de-adjusted" as it
should be when it's unmapped.

This patch fixes that error, but it does so by maintaining one "raw"
pointer for the mapped memory range.  This is assigned when the
memory is mapped and used to unmap the memory.  This pointer is also
used to access the two registers that do *not* sit in the "adjusted"
memory space.

Rather than adjusting *that* pointer, we maintain a separate pointer
that's an adjusted copy of the "raw" pointer, and that is used for
most GSI register accesses.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:54:16 -08:00
Gustavo A. R. Silva 93efb0c656 octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()
Code at line 967 implies that rsp->fwdata.supported_fec may be up to 4:

 967: if (rsp->fwdata.supported_fec <= FEC_MAX_INDEX)

If rsp->fwdata.supported_fec evaluates to 4, then there is an
out-of-bounds read at line 971 because fec is an array with
a maximum of 4 elements:

 954         const int fec[] = {
 955                 ETHTOOL_FEC_OFF,
 956                 ETHTOOL_FEC_BASER,
 957                 ETHTOOL_FEC_RS,
 958                 ETHTOOL_FEC_BASER | ETHTOOL_FEC_RS};
 959 #define FEC_MAX_INDEX 4

 971: fecparam->fec = fec[rsp->fwdata.supported_fec];

Fix this by properly indexing fec[] with rsp->fwdata.supported_fec - 1.
In this case the proper indexes 0 to 3 are used when
rsp->fwdata.supported_fec evaluates to a range of 1 to 4, correspondingly.

Fixes: d0cf9503e9 ("octeontx2-pf: ethtool fec mode support")
Addresses-Coverity-ID: 1501722 ("Out-of-bounds read")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:47:39 -08:00
Colin Ian King a6e0ee35ee octeontx2-af: Fix spelling mistake "recievd" -> "received"
There is a spelling mistake in the text in array rpm_rx_stats_fields,
fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:45:43 -08:00
David S. Miller 79201f358d wireless-drivers-next patches for v5.12
Second set of patches for v5.12. Last time there was a smaller pull
 request so unsurprisingly this time we have a big one. mt76 has new
 hardware support and lots of new features, iwlwifi getting new
 features and rtw88 got NAPI support. And the usual cleanups and fixes
 all over.
 
 Major changes:
 
 ath10k
 
 * support setting SAR limits via nl80211
 
 rtw88
 
 * support 8821 RFE type2 devices
 
 * NAPI support
 
 iwlwifi
 
 * add new FW API support
 
 * support for new So devices
 
 * support for RF interference mitigation (RFI)
 
 * support for PNVM (Platform Non-Volatile Memory, a firmware data
   file) from BIOS
 
 mt76
 
 * add new mt7921e driver
 
 * 802.11 encap offload support
 
 * support for multiple pcie gen1 host interfaces on 7915
 
 * 7915 testmode support
 
 * 7915 txbf support
 
 brcmfmac
 
 * support for CQM RSSI notifications
 
 wil6210
 
 * support for extended DMG MCS 12.1 rate
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJgJl5/AAoJEG4XJFUm622bCrcIAIwarm51aHq8R0Py0ZYBEiIM
 STe/x3gjxVhzYHd438APchEMnePY6pWZ6e00GnZwMyWxZpMHK+yOYzWrewLvYI29
 qyPFkjGdE6PhYjL+lJjYHn4M1cdAkO4t0FSvCy5OvOnuIgu0Yz3TXXQVxZtrzrIc
 2q+bOR1kKaVBe8NOjggnxTWe4mTj6efeTkD0D5M+IbppERtmpcVLra9FSgz5IcLl
 hlKyNNtiNo/CmQu0bGejXR7ip+JA08f5No6TOlEQKR6pBvBvwrvgXHsq9rfQO8qA
 CDhz4DqfPPYMNXnJuFVAzYsw4raZblwTg5GtIjH7e0cbXxSAx50Ne2Hd850nkO8=
 =eQGJ
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-2021-02-12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:
====================
wireless-drivers-next patches for v5.12

Second set of patches for v5.12. Last time there was a smaller pull
request so unsurprisingly this time we have a big one. mt76 has new
hardware support and lots of new features, iwlwifi getting new
features and rtw88 got NAPI support. And the usual cleanups and fixes
all over.

Major changes:

ath10k

* support setting SAR limits via nl80211

rtw88

* support 8821 RFE type2 devices

* NAPI support

iwlwifi

* add new FW API support

* support for new So devices

* support for RF interference mitigation (RFI)

* support for PNVM (Platform Non-Volatile Memory, a firmware data
  file) from BIOS

mt76

* add new mt7921e driver

* 802.11 encap offload support

* support for multiple pcie gen1 host interfaces on 7915

* 7915 testmode support

* 7915 txbf support

brcmfmac

* support for CQM RSSI notifications

wil6210

* support for extended DMG MCS 12.1 rate
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 16:43:13 -08:00
Hao Chen 80a9f3f1fa net: hns3: refactor out hclge_rm_vport_all_mac_table()
hclge_rm_vport_all_mac_table() is bloated, so split it into
separate functions for readability and maintainability.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Huazhong Tan 5fd0e7b4f7 net: hns3: refactor out hclgevf_set_rss_tuple()
To make it more readable and maintainable, split
hclgevf_set_rss_tuple() into two parts.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Huazhong Tan e291eff3bc net: hns3: refactor out hclge_set_rss_tuple()
To make it more readable and maintainable, split
hclge_set_rss_tuple() into two parts.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Yufeng Mo eb0faf32b8 net: hns3: split out hclgevf_cmd_send()
hclgevf_cmd_send() is bloated, so split it into separate
functions for readability and maintainability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Yufeng Mo 76f82fd9b1 net: hns3: split out hclge_cmd_send()
hclge_cmd_send() is bloated, so split it into separate
functions for readability and maintainability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Jian Shen b3712fa73d net: hns3: split out hclge_dbg_dump_qos_buf_cfg()
hclge_dbg_dump_qos_buf_cfg() is bloated, so split it into
separate functions for readability and maintainability.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Jian Shen 73f7767ed0 net: hns3: refactor out hclgevf_get_rss_tuple()
To improve code readability and maintainability, separate
the flow type parsing part and the converting part from
bloated hclgevf_get_rss_tuple().

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Jian Shen 405642a15c net: hns3: refactor out hclge_get_rss_tuple()
To improve code readability and maintainability, separate
the flow type parsing part and the converting part from
bloated hclge_get_rss_tuple().

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Peng Li 88936e320c net: hns3: refactor out hclge_set_vf_vlan_common()
To improve code readability and maintainability, separate
the command handling part and the status parsing part from
bloated hclge_set_vf_vlan_common().

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Jiaran Zhang eaede83567 net: hns3: use ipv6_addr_any() helper
Use common ipv6_addr_any() to determine if an addr is ipv6 any addr.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Peng Li c318af3f56 net: hns3: clean up hns3_dbg_cmd_write()
As more commands are added, hns3_dbg_cmd_write() is going to
get more bloated, so move the part about command check into
a separate function.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Peng Li 433e280277 net: hns3: refactor out hclgevf_cmd_convert_err_code()
To improve code readability and maintainability, refactor
hclgevf_cmd_convert_err_code() with an array of imp_errcode
and common_errno mapping, instead of a bloated switch/case.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Peng Li 1c9a98b0ba net: hns3: refactor out hclge_cmd_convert_err_code()
To improve code readability and maintainability, refactor
hclge_cmd_convert_err_code() with an array of imp_errcode
and common_errno mapping, instead of a bloated switch/case.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 13:13:15 -08:00
Maciej Fijalkowski c0d4e9d223 ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring
Output of ixgbe_rx_offset() is based on ethtool's priv flag setting, which
when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to ixgbe_ring that is meant to hold the
ixgbe_rx_offset() result and use it within ixgbe_clean_rx_irq().
Furthermore, use it within ixgbe_alloc_mapped_page().

Last but not least, un-inline the function of interest as it lives in .c
file so let compiler do the decision about the inlining.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:40:03 -08:00
Maciej Fijalkowski f1b1f409bf ice: store the result of ice_rx_offset() onto ice_ring
Output of ice_rx_offset() is based on ethtool's priv flag setting, which
when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to ice_ring that is meant to hold the
ice_rx_offset() result and use it within ice_clean_rx_irq().
Furthermore, use it within ice_alloc_mapped_page().

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:36:57 -08:00
Maciej Fijalkowski f7bb0d71d6 i40e: store the result of i40e_rx_offset() onto i40e_ring
Output of i40e_rx_offset() is based on ethtool's priv flag setting,
which when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to i40e_ring that is meant to hold the
i40e_rx_offset() result and use it within i40e_clean_rx_irq().
Furthermore, use it within i40e_alloc_mapped_page().

Last but not least, un-inline the function of interest so that compiler
makes the decision about inlining as it lives in .c file.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:35:13 -08:00
Björn Töpel f892a9af0c i40e: Simplify the do-while allocation loop
Fold the count decrement into the while-statement.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:33:42 -08:00
Maciej Fijalkowski 5c57e507f2 ice: skip NULL check against XDP prog in ZC path
Whole zero-copy variant of clean Rx IRQ is executed when xsk_pool is
attached to rx_ring and it can happen only when XDP program is present
on interface. Therefore it is safe to assume that program is always
!NULL and there is no need for checking it in ice_run_xdp_zc.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:28:40 -08:00
Maciej Fijalkowski 43a925e49d ice: remove redundant checks in ice_change_mtu
dev_validate_mtu checks that mtu value specified by user is not less
than min mtu and not greater than max allowed mtu. It is being done
before calling the ndo_change_mtu exposed by driver, so remove these
redundant checks in ice_change_mtu.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:20:13 -08:00
Maciej Fijalkowski 29b82f2a09 ice: move skb pointer from rx_buf to rx_ring
Similar thing has been done in i40e, as there is no real need for having
the sk_buff pointer in each rx_buf. Non-eop frames can be simply handled
on that pointer moved upwards to rx_ring.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:18:40 -08:00
Maciej Fijalkowski 59c97d1b51 ice: simplify ice_run_xdp
There's no need for 'result' variable, we can directly return the
internal status based on action returned by xdp prog.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:14:30 -08:00
Maciej Fijalkowski d06e2f05b4 i40e: adjust i40e_is_non_eop
i40e_is_non_eop had a leftover comment and unused skb argument which was
used for placing the skb onto rx_buf in case when current buffer was
non-eop one. This is not relevant anymore as commit e72e56597b
("i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring") pulled the
non-complete skb handling out of rx_bufs up to rx_ring.  Therefore,
let's adjust the function arguments that i40e_is_non_eop takes.

Furthermore, since there is already a function responsible for bumping
the ntc, make use of that and drop that logic from i40e_is_non_eop so
that the scope of this function is limited to what the name actually
states.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:13:15 -08:00
Maciej Fijalkowski 4a14994a92 i40e: drop misleading function comments
i40e_cleanup_headers has a statement about check against skb being
linear or not which is not relevant anymore, so let's remove it.

Same case for i40e_can_reuse_rx_page, it references things that are not
present there anymore.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 10:10:59 -08:00
Maciej Fijalkowski 99f097270a i40e: drop redundant check when setting xdp prog
Net core handles the case where netdev has no xdp prog attached and
current prog is NULL. Therefore, remove such check within
i40e_xdp_setup.

Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-12 09:49:32 -08:00
Po-Hao Huang 9d083348e9 rtw88: 8822c: update RF_B (2/2) parameter tables to v60
Update RTL8822C devices' RF_A tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-9-pkshih@realtek.com
2021-02-12 09:51:15 +02:00
Po-Hao Huang 6817cbdd9d rtw88: 8822c: update RF_B (1/2) parameter tables to v60
Update RTL8822C devices' RF_B tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-8-pkshih@realtek.com
2021-02-12 09:51:14 +02:00
Po-Hao Huang 0e5abd1172 rtw88: 8822c: update RF_A parameter tables to v60
Update RTL8822C devices' RF_A tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-7-pkshih@realtek.com
2021-02-12 09:51:12 +02:00
Po-Hao Huang 9e27d4bf12 rtw88: 8822c: update MAC/BB parameter tables to v60
Update RTL8822C devices' MAC/BB tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-6-pkshih@realtek.com
2021-02-12 09:51:10 +02:00
Po-Hao Huang fe101716c7 rtw88: replace tx tasklet with work queue
Replace tasklet so we can do tx scheduling in parallel. Since throughput
is delay-sensitive in most cases, we allocate a dedicated, high priority
wq for our needs.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-5-pkshih@realtek.com
2021-02-12 09:51:09 +02:00
Po-Hao Huang 9e2fd29864 rtw88: add napi support
Use napi to reduce overhead on rx interrupts.

Driver used to interrupt kernel for every Rx packet, this could
affect both system and network performance. NAPI is a mechanism that
uses polling when processing huge amount of traffic, by doing this
the number of interrupts can be decreased.

Network performance can also benefit from this patch. Since TCP
connection is bidirectional and acks are required for every several
packets. These ack packets occupie the PCI bus bandwidth and could
lead to performance degradation.

When napi is used, GRO receive is enabled by default in the mac80211
stack. So mac80211 won't pass every RX TCP packets to the kernel TCP
network stack immediately. Instead an aggregated large length TCP packet
will be delivered.

This reduces the tx acks sent and gains rx performance. After the patch,
the Rx throughput increases about 25Mbps in 11ac.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-4-pkshih@realtek.com
2021-02-12 09:50:58 +02:00
Po-Hao Huang d77ddc34d7 rtw88: add rts condition
Since we set the IEEE80211_HW_HAS_RATE_CONTROL flag, so use_rts in
ieee80211_tx_info will never be set in the ieee80211_xmit_fast path.
Add length check for skb to decide whether rts is needed.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-3-pkshih@realtek.com
2021-02-12 09:50:55 +02:00
Po-Hao Huang 4830872685 rtw88: add dynamic rrsr configuration
Register rrsr determines the response rate we send.
In field tests, using rate higher than current tx rate could lead
to difficulty for the receiving end to receive management/control
frames. Calculate current modulation level by tx rate then cross out
rate higher than those.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-2-pkshih@realtek.com
2021-02-12 09:50:53 +02:00
Luca Coelho 3304b6f937 iwlwifi: remove incorrect comment in pnvm
We use this driver as a backport that also runs on older kernels (as
part of the backports project).  So we use some checks to backport or
prevent code from compiling in incompatible kernel version.

When I took one of the PNVM patches from the backport, I accidentally
left the comment that a certain part of the code doesn't work in older
kernels.  This obviously should never be valid for the mainline.
Remove this comment.

Reported-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210211223049.40d545a0fa89.I04793aaa5312b926335c8db32131f000432df511@changeid
2021-02-12 09:46:35 +02:00
Tariq Toukan 2af3e35c5a net/mlx5: Remove TLS dependencies on XPS
No real dependency on XPS, but on RX queue mapping, which
is being selected by TLS_DEVICE.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 19:08:06 -08:00
Moshe Shemesh e1c3940c60 net/mlx5e: Check tunnel offload is required before setting SWP
Check that tunnel offload is required before setting Software Parser
offsets to get Geneve HW offload. In case of Geneve packet we check HW
offload support of SWP in mlx5e_tunnel_features_check() and set features
accordingly, this should be reflected in skb offload requested by the
kernel and we should add the Software Parser offsets only if requested.
Otherwise, in case HW doesn't support SWP for Geneve, data path will
mistakenly try to offload Geneve SKBs with skb->encapsulation set,
regardless of whether offload was requested or not on this specific SKB.

Fixes: e3cfc7e6b7 ("net/mlx5e: TX, Add geneve tunnel stateless offload support")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:16 -08:00
Oz Shlomo a217313152 net/mlx5e: CT: manage the lifetime of the ct entry object
The ct entry object is accessed by the ct add, del, stats and restore
methods. In addition, it is referenced from several hash tables.

The lifetime of the ct entry object was not managed which triggered race
conditions as in the following kasan dump:
[ 3374.973945] ==================================================================
[ 3374.988552] BUG: KASAN: use-after-free in memcmp+0x4c/0x98
[ 3374.999590] Read of size 1 at addr ffff00036129ea55 by task ksoftirqd/1/15
[ 3375.016415] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G           O      5.4.31+ #1
[ 3375.055301] Call trace:
[ 3375.060214]  dump_backtrace+0x0/0x238
[ 3375.067580]  show_stack+0x24/0x30
[ 3375.074244]  dump_stack+0xe0/0x118
[ 3375.081085]  print_address_description.isra.9+0x74/0x3d0
[ 3375.091771]  __kasan_report+0x198/0x1e8
[ 3375.099486]  kasan_report+0xc/0x18
[ 3375.106324]  __asan_load1+0x60/0x68
[ 3375.113338]  memcmp+0x4c/0x98
[ 3375.119409]  mlx5e_tc_ct_restore_flow+0x3a4/0x6f8 [mlx5_core]
[ 3375.131073]  mlx5e_rep_tc_update_skb+0x1d4/0x2f0 [mlx5_core]
[ 3375.142553]  mlx5e_handle_rx_cqe_rep+0x198/0x308 [mlx5_core]
[ 3375.154034]  mlx5e_poll_rx_cq+0x2a0/0x1060 [mlx5_core]
[ 3375.164459]  mlx5e_napi_poll+0x1d4/0xa78 [mlx5_core]
[ 3375.174453]  net_rx_action+0x28c/0x7a8
[ 3375.182004]  __do_softirq+0x1b4/0x5d0

Manage the lifetime of the ct entry object by using synchornization
mechanisms for concurrent access.

Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:15 -08:00
Shay Drory edac23c2b3 net/mlx5: Disable devlink reload for lag devices
Devlink reload can't be allowed on lag devices since reloading one lag
device will cause traffic on the bond to get stucked.
Users who wish to reload a lag device, need to remove the device from
the bond, and only then reload it.

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:15 -08:00
Shay Drory 7ab91f2b03 net/mlx5: Disallow RoCE on lag device
In lag mode, setting roce enabled/disable of lag device have no effect.
e.g.: bond device (roce/vf_lag) roce status remain unchanged.
Therefore disable it and add an error message.

Fixes: cc9defcbb8 ("net/mlx5: Handle "enable_roce" devlink param")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:15 -08:00
Shay Drory c70f8597fc net/mlx5: Disallow RoCE on multi port slave device
In dual port mode, setting roce enabled/disable for the slave device
have no effect. e.g.: the slave device roce status remain unchanged.
Therefore disable it and add an error message.
Enable or disable roce of the master device affect both master and slave
devices.

Fixes: cc9defcbb8 ("net/mlx5: Handle "enable_roce" devlink param")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:14 -08:00
Shay Drory d89ddaae17 net/mlx5: Disable devlink reload for multi port slave device
Devlink reload can't be allowed on a multi port slave device, because
reload of slave device doesn't take effect.

The right flow is to disable devlink reload for multi port slave
device. Hence, disabling it in mlx5_core probing.

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:14 -08:00
Maxim Mikityanskiy b850bbff96 net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context
wait_for_resync is unreliable - if it timeouts, priv_rx will be freed
anyway. However, mlx5e_ktls_handle_get_psv_completion will be called
sooner or later, leading to use-after-free. For example, it can happen
if a CQ error happened, and ICOSQ stopped, but later on the queues are
destroyed, and ICOSQ is flushed with mlx5e_free_icosq_descs.

This patch converts the lifecycle of priv_rx to fully refcount-based, so
that the struct won't be freed before the refcount goes to zero.

Fixes: 0419d8c9d8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:13 -08:00