The Completion Queue Element version 2 (CQEv2) includes a field called
'mirror_reason' which indicates why the packet was mirrored to the CPU.
Add the field so that it can be used by a later patch.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets that are mirrored to the CPU port are trapped with one of eight
trap identifiers. Add them.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trap identifier was increased to 10 bits in new versions of the
Programmer's Reference Manual (PRM).
Increase it accordingly in the Host PacKet Trap (HPKT) register and in
the Completion Queue Element (CQE).
This is significant for subsequent patches that will introduce trap
identifiers which utilize the extended range.
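As a rough illustration of what widening the field means, the sketch
below extracts a trap identifier from a 16-bit word with a configurable
width. The helper name and the position of the field in the word are
assumptions made for this example; they are not taken from the PRM or
from the driver. The 9-bit width in the usage note is only an example of
a narrower field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the low 'width' bits of a 16-bit word. */
static uint16_t trap_id_get(uint16_t word, unsigned int width)
{
	assert(width > 0 && width <= 16);
	return word & (uint16_t)((1u << width) - 1);
}
```

With a 10-bit width, an identifier such as 0x23F survives intact, while
a 9-bit mask would silently truncate it to 0x03F.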
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mirroring packets to the CPU port the mirrored packets are trapped
to the CPU. However, unlike other traps, it is not possible to set a
policer on the associated trap group. Instead, the policer needs to be
set on the SPAN agent.
Moreover, the policer ID must be within a specified range: From a
configurable (even) base ID to this base plus the maximum number of SPAN
agents.
While the immediate use case is to set the policer on a SPAN agent that
mirrors to the CPU port, a policer can be set on any SPAN agent.
Therefore, the operation is implemented for all SPAN agent types.
Extend the SPAN agent request API to allow passing the desired policer
ID that should be bound to the SPAN agent. Return an error for
Spectrum-1, as it does not support policer setting on a SPAN agent.
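The range constraint described above can be sketched as a simple
validity check. The function and parameter names are hypothetical, and
treating the upper bound as exclusive is an assumption; the text only
says that the range runs from the base to the base plus the maximum
number of SPAN agents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check, not the driver's code. */
static bool span_policer_id_valid(uint16_t pid, uint16_t pid_base,
				  uint16_t max_span_agents)
{
	assert(max_span_agents > 0);

	/* The base ID is described as an even value. */
	if (pid_base % 2 != 0)
		return false;

	/* Assumed half-open range: [base, base + max_span_agents). */
	return pid >= pid_base &&
	       (uint32_t)pid < (uint32_t)pid_base + max_span_agents;
}
```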
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the only parameter of a SPAN agent is the netdev which
the SPAN agent should mirror to.
The next patch will add the ability to request a SPAN agent that mirrors
to a specific netdev and has a specific policer ID bound to it. This is
required when mirroring packets to the CPU port.
Therefore, encapsulate the sole parameter to mlxsw_sp_span_agent_get()
in a structure, so that it could later be extended with policer
information.
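A minimal sketch of the idea, using stand-in names rather than the real
mlxsw definitions: the single argument moves into a structure, so a
later change can add members without touching every caller.

```c
#include <assert.h>
#include <stddef.h>

struct net_device;	/* opaque stand-in for the kernel type */

/* Hypothetical parameter block; only the destination netdev for now. */
struct span_agent_parms {
	const struct net_device *to_dev;
};

/* Stub that only demonstrates the calling convention. */
static int span_agent_get(const struct span_agent_parms *parms, int *span_id)
{
	assert(parms != NULL && span_id != NULL);
	*span_id = 0;	/* a real implementation would look up or allocate */
	return 0;
}
```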
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Spectrum-2 and Spectrum-3 ASICs are able to mirror packets towards
the CPU. These packets are then trapped like any other packet, but with
a special packet trap and additional metadata such as why the packet was
mirrored.
The ability to mirror packets towards the CPU will be utilized by a
subsequent patch set that will mirror packets that were dropped by the
ASIC for various buffer-related reasons, such as tail-drop and
early-drop.
Add mirroring towards the CPU as a new SPAN agent type and re-use the
functions that mirror to a physical port where possible.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the destination netdev to which we mirror must be a valid
netdev. However, this is going to change with the introduction of
mirroring towards the CPU port, as the CPU port does not have a backing
netdev.
Avoid dereferencing the destination netdev when it is not clear if it is
valid or not.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The parms_set() callback is supposed to fill in the parameters for the
SPAN agent, such as the destination port and encapsulation info, if any.
When mirroring to the CPU port we cannot resolve the destination port
(the CPU port) without access to the driver private info.
Pass the driver private info to parms_set() callback so that it could be
used later on to resolve the CPU port.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The various SPAN agent types differ in their mirror targets (i.e.,
physical port netdev vs. VLAN netdev) and the encapsulation headers that
they need to encapsulate the mirrored packets with.
The Spectrum-2 and Spectrum-3 ASICs support a SPAN agent type that is
able to mirror towards the CPU, whereas the Spectrum-1 ASIC does not.
Prepare for the addition of this new SPAN agent type by splitting the
SPAN agent operations to be per-ASIC.
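One way to picture the split is a per-ASIC table of supported agent
types. The sketch below uses invented type and variable names and
reduces each agent type to an enum value; it is not the driver's data
structure.

```c
#include <assert.h>
#include <stddef.h>

enum span_type { SPAN_PHYS_PORT, SPAN_VLAN, SPAN_CPU };

/* Hypothetical per-ASIC set of supported SPAN agent types. */
struct span_ops_set {
	const enum span_type *types;
	size_t count;
};

static const enum span_type sp1_types[] = { SPAN_PHYS_PORT, SPAN_VLAN };
static const enum span_type sp2_types[] = {
	SPAN_PHYS_PORT, SPAN_VLAN, SPAN_CPU,
};

static const struct span_ops_set sp1_span_ops = {
	sp1_types, sizeof(sp1_types) / sizeof(sp1_types[0]),
};
static const struct span_ops_set sp2_span_ops = {
	sp2_types, sizeof(sp2_types) / sizeof(sp2_types[0]),
};

/* Return 1 if the given ASIC's set contains the agent type, else 0. */
static int span_type_supported(const struct span_ops_set *ops,
			       enum span_type type)
{
	assert(ops != NULL);
	for (size_t i = 0; i < ops->count; i++)
		if (ops->types[i] == type)
			return 1;
	return 0;
}
```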
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow setting mirroring_pid_base using MOGCR register.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow setting session_id and pid as part of port analyzer
configurations.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type defining 'NETDEV_TX_OK', but this
driver returns '0' instead of 'NETDEV_TX_OK'.
Fix this by returning 'NETDEV_TX_OK' instead of '0'.
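The shape of the fix, shown with simplified stand-ins for the kernel
types (the enum below is reduced to two values and the function name is
made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions. */
enum netdev_tx { NETDEV_TX_OK = 0x00, NETDEV_TX_BUSY = 0x10 };
typedef enum netdev_tx netdev_tx_t;

struct sk_buff;
struct net_device;

/* Hypothetical xmit handler: return the enum constant, not a bare 0. */
static netdev_tx_t demo_start_xmit(struct sk_buff *skb,
				   struct net_device *dev)
{
	(void)skb;
	(void)dev;
	return NETDEV_TX_OK;
}
```

The numeric value is unchanged; the point is that the return statement
matches the declared type.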
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200629104009.84077-1-luc.vanoostenryck@gmail.com
Add version number info along with the firmware name so the driver can
pick the correct revision of the FW file. Move the FW filename macro into
the driver code and add MODULE_FIRMWARE to specify the FW needed by the
module.
Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200710051826.3267-6-ajay.kathat@microchip.com
Modify WILC1000 binary filename to use single unified wilc1000 FW.
A single wilc1000 binary is used for different wilc1000 revisions.
Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200710051826.3267-5-ajay.kathat@microchip.com
Avoid the warning reported below, which appears when 'CONFIG_PM' is
undefined.
'warning: unused variable 'wowlan_support' [-Wunused-const-variable]'
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200710051826.3267-4-ajay.kathat@microchip.com
Use 'strlcpy' instead of 'strncpy' to avoid the 'stringop-truncation'
compiler warning.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200710051826.3267-3-ajay.kathat@microchip.com
The code in the BRCM80211 BRCMSMAC driver is using the legacy
GPIO API to do a complex check of the validity of the base of
the GPIO chip and whether it is present at all, and then adds
an offset to the base of the chip.
Use the existing function to obtain a GPIO line internally
from a GPIO chip so we can use the offset directly and
modernize the code to use GPIO descriptors instead of integers
from the global GPIO numberspace.
Cc: Wright Feng <wright.feng@cypress.com>
Cc: Frank Kao <frank.kao@cypress.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200711210150.4943-1-linus.walleij@linaro.org
The driver relies on the compatible string from DT to determine which
FW configuration file it should load. The DTS spec allows for '/' as
part of the compatible string. We change this to '-' so that we will
still be able to load the config file, even when the compatible has a
'/'. This explicitly fixes the firmware loading for
"solidrun,cubox-i/q".
Signed-off-by: Matthias Brugger <mbrugger@suse.com>
Reviewed-by: Hans deGoede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200701112201.6449-1-matthias.bgg@kernel.org
Linux 3.6 introduced TSQ, which has a per-socket threshold for TCP Tx
packets to reduce latency. In flow control mode, the host driver enqueues
skbs in the hanger and TCP doesn't push new skbs until the host frees the
skb when receiving the fwstatus event. So set the pacing shift to 8 to
send them as a single large aggregate frame to the bus layer.
43455 TX TCP throughput in different FC modes on Linux 5.4.18
sk_pacing_shift : Throughput (fcmode=0)
10: 245 Mbps
9: 245 Mbps
8: 246 Mbps
7: 246 Mbps
sk_pacing_shift : Throughput (fcmode=1)
10: 182 Mbps
9: 197 Mbps
8: 206 Mbps
7: 207 Mbps
sk_pacing_shift : Throughput (fcmode=2)
10: 180 Mbps
9: 197 Mbps
8: 206 Mbps
7: 207 Mbps
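For intuition about the shift values in the table: as far as the TSQ
logic is understood here, the per-socket limit is derived from the
socket's pacing rate shifted right by sk_pacing_shift, so each decrement
of the shift doubles the amount of data TCP may queue below the socket.
The sketch below only does that arithmetic. The function name and the
example rate are assumptions, and the additional lower and upper bounds
that the TCP stack applies are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes allowed in flight below the socket. */
static uint64_t tsq_budget_bytes(uint64_t pacing_rate_bytes_per_sec,
				 unsigned int pacing_shift)
{
	assert(pacing_shift < 64);
	return pacing_rate_bytes_per_sec >> pacing_shift;
}
```

At an assumed pacing rate of 25,600,000 bytes per second, a shift of 10
yields a budget of 25,000 bytes, while a shift of 8 yields 100,000
bytes.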
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200624091608.25154-3-wright.feng@cypress.com
When a USB or SDIO device gets an abnormal bus disconnection, the host
driver tries to clean up the skbs in PSQ and TXQ (the skb pointers in the
hanger slots are linked to PSQ and TXQ), so we should set the state of the
skb hanger slot to BRCMF_FWS_HANGER_ITEM_STATE_FREE before freeing the skb.
brcmf_fws_bus_txq_cleanup already sets BRCMF_FWS_HANGER_ITEM_STATE_FREE
before freeing the skb, therefore do the same in brcmf_fws_psq_flush to
avoid the following warning message.
[ 1580.012880] ------------ [ cut here ]------------
[ 1580.017550] WARNING: CPU: 3 PID: 3065 at
drivers/net/wireless/broadcom/brcm80211/brcmutil/utils.c:49
brcmu_pkt_buf_free_skb+0x21/0x30 [brcmutil]
[ 1580.184017] Call Trace:
[ 1580.186514] brcmf_fws_cleanup+0x14e/0x190 [brcmfmac]
[ 1580.191594] brcmf_fws_del_interface+0x70/0x90 [brcmfmac]
[ 1580.197029] brcmf_proto_bcdc_del_if+0xe/0x10 [brcmfmac]
[ 1580.202418] brcmf_remove_interface+0x69/0x190 [brcmfmac]
[ 1580.207888] brcmf_detach+0x90/0xe0 [brcmfmac]
[ 1580.212385] brcmf_usb_disconnect+0x76/0xb0 [brcmfmac]
[ 1580.217557] usb_unbind_interface+0x72/0x260
[ 1580.221857] device_release_driver_internal+0x141/0x200
[ 1580.227152] device_release_driver+0x12/0x20
[ 1580.231460] bus_remove_device+0xfd/0x170
[ 1580.235504] device_del+0x1d9/0x300
[ 1580.239041] usb_disable_device+0x9e/0x270
[ 1580.243160] usb_disconnect+0x94/0x270
[ 1580.246980] hub_event+0x76d/0x13b0
[ 1580.250499] process_one_work+0x144/0x360
[ 1580.254564] worker_thread+0x4d/0x3c0
[ 1580.258247] kthread+0x109/0x140
[ 1580.261515] ? rescuer_thread+0x340/0x340
[ 1580.265543] ? kthread_park+0x60/0x60
[ 1580.269237] ? SyS_exit_group+0x14/0x20
[ 1580.273118] ret_from_fork+0x25/0x30
[ 1580.300446] ------------ [ cut here ]------------
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200624091608.25154-2-wright.feng@cypress.com
Commit 4684997d9e ("brcmfmac: reset PCIe bus on a firmware crash")
added a reset function to recover from a firmware trap on the PCIe bus.
This commit adds an implementation for the SDIO bus.
Upon an SDIO firmware trap, do the following:
- Remove the device
- Reset hardware
- Probe the device again
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200622144851.165248-1-chi-hsien.lin@cypress.com
Ignore a FW event if the event's BSSID is different from the BSSID of the
currently connected AP. Also check whether the interface state is
connected; if it is not, the link down event can be ignored.
Signed-off-by: Able Liao <Able.Liao@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200618160739.21457-4-chi-hsien.lin@cypress.com
Currently, brcmf_link_down() always calls cfg80211_disconnected() with
locally_generated=1, which is not always the case. Add an event source
argument to the link down handler and set locally_generated based on the
real trigger.
Signed-off-by: Soontak Lee <soontak.lee@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200618160739.21457-3-chi-hsien.lin@cypress.com
It is not possible to change back to a visible SSID because there is
no routine to disable the hidden SSID.
Signed-off-by: Soontak Lee <soontak.lee@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200618160739.21457-2-chi-hsien.lin@cypress.com
Commit 4905432b28b7 ("brcmfmac: Fix P2P Group Formation failure via Go-neg
method") did not initialize requested_dwell properly, resulting in an
always-false dwell time overflow check. Fix it by setting the correct
requested_dwell value.
Fixes: 4905432b28b7 ("brcmfmac: Fix P2P Group Formation failure via Go-neg method")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Joseph Chuang <joseph.chuang@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200610152106.175257-7-chi-hsien.lin@cypress.com
Move the credit map setting to the right place so that
brcmf_fws_return_credits() cannot return without setting the credit map.
This fixes the zero-throughput stalls in softAP mode when the STA is
using PM 1 mode.
Signed-off-by: Double Lo <double.lo@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200610152106.175257-6-chi-hsien.lin@cypress.com
There is a mismatch of tx status flag values between host and firmware.
This makes the host misinterpret the flags and handle credit returns
incorrectly. So update the flags to match the firmware ones.
Signed-off-by: Chung-Hsien Hsu <stanley.hsu@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200610152106.175257-5-chi-hsien.lin@cypress.com
It is observed that sometimes, when sdiod is low on tx credits in low
RSSI scenarios, the data path consumes all sdiod rx credits and there is
no sdiod rx credit available for the control path. This causes the host
and card to go out of sync, resulting in link loss between host and
card. To prevent this, some credits are reserved for the control path.
Note that TXCTL_CREDITS can't be larger than the firmware default
credit update threshold of 2; otherwise there will be a deadlock, with
both sides waiting for each other.
Signed-off-by: Amar Shankar <amsr@cypress.com>
Signed-off-by: Jia-Shyr Chuang <joseph.chuang@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200610152106.175257-4-chi-hsien.lin@cypress.com
In the wifi firmware, the max length of the IOCTL/IOVAR buffer is 8192.
Increase the max size of the message buffer for control packets to match
the wifi firmware so that return buffers can come back.
Signed-off-by: Soontak Lee <soontak.lee@cypress.com>
Signed-off-by: Jia-Shyr Chuang <joseph.chuang@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200610152106.175257-3-chi-hsien.lin@cypress.com
Currently, credit borrowing allows only the BE access category to
borrow credits. Fix the credit borrowing logic so that borrowing is
available to all access categories and credits are borrowed only from
the lower categories. This fixes WFA 802.11n certs 5.2.27 failures.
Signed-off-by: Raveendran Somu <raveendran.somu@cypress.com>
Signed-off-by: Jia-Shyr Chuang <joseph.chuang@cypress.com>
Signed-off-by: Chung-Hsien Hsu <stanley.hsu@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200610152106.175257-2-chi-hsien.lin@cypress.com
The BSS info flag definition needs to be fixed from 0x2 to 0x4.
This flag is for RSSI info received on channel.
All firmware branches define it as 0x4, so this is a bug in brcmfmac.
Signed-off-by: Prasanna Kerekoppa <prasanna.kerekoppa@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200604071835.3842-6-wright.feng@cypress.com
The firmware state machines are not fully suitable for concurrent
station interface support; unexpected errors may be hit if we have 2
different SSIDs and roaming scenarios concurrently.
To avoid a bad user experience while this is not fully validated, we
disallow the user from creating two concurrent station interfaces.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200604071835.3842-5-wright.feng@cypress.com
brcmfmac host driver makes SDIO bus sleep and stops SDIO watchdog if no
pending event or data. As a result, host driver does not poll firmware
console buffer before buffer overflow, which leads to missing firmware
logs. We should not stop SDIO watchdog if console_interval is non-zero
in debug build.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200604071835.3842-4-wright.feng@cypress.com
When the host driver retrieves MAC addresses from the dongle, it copies
memory from drvr->mac to perm_addr. But at that moment drvr->mac is an
all-zero array, which causes the permanent MAC address in wiphy to be all
zero as well. To fix this, we set drvr->mac before setting perm_addr.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200604071835.3842-3-wright.feng@cypress.com
Truncate the additional bytes if extra bytes have been received.
The current code only prints a warning and proceeds without handling it.
But in one of the crashes reported by DVT, this causes the crash
intermittently. So limit the processing to skb->len.
Signed-off-by: Raveendran Somu <raveendran.somu@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200604071835.3842-2-wright.feng@cypress.com
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is another switch from Vitesse / Microsemi / Microchip, that has
10 ports (8 external, 2 internal) and is integrated into the Freescale /
NXP T1040 PowerPC SoC. It is very similar to Felix from NXP LS1028A,
except that this is a platform device and Felix is a PCI device, and it
doesn't support IEEE 1588 and TSN.
Like Felix, this driver configures its own PCS on the internal MDIO bus
using a phy_device abstraction for it (yes, it will be refactored to use
a raw mdio_device, like other phylink drivers do, but let's keep it like
that for now). But unlike Felix, the MDIO bus and the PCS are not from
the same vendor. The PCS is the same QorIQ/Layerscape PCS as found in
Felix/ENETC/DPAA*, but the internal MDIO bus that is used to access it
is actually an instantiation of drivers/net/phy/mdio-mscc-miim.c. But it
would be difficult to reuse that driver (it doesn't even use regmap, and
it's less than 200 lines of code), so we hand-roll here some internal
MDIO bus accessors within seville_vsc9953.c, which serves the purpose of
driving the PCS absolutely fine.
Also, same as Felix, the PCS doesn't support dynamic reconfiguration of
SerDes protocol, so we need to do pre-validation of PHY mode from device
tree and not let phylink change it.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix is not actually meant to be a DSA driver only for the switch
inside NXP LS1028A, but an umbrella for all Vitesse / Microsemi /
Microchip switches that are register-compatible with Ocelot and that are
used in DSA mode (with an NPI Ethernet port).
For the dsa_switch_ops exported by the felix driver to be generic enough
to be used by other non-PCI switches, we need to move the PCI-specific
probing to the low-level translation module felix_vsc9959.c. This way,
other switches can have their own probing functions, as platform devices
or otherwise.
This patch also removes the "Felix instance table", which did not stand
the test of time and is unnecessary at this point.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ocelot_wm_encode function deals with setting thresholds for pause
frame start and stop. In Ocelot and Felix the register layout is the
same, but for Seville, it isn't. The easiest way to accommodate Seville
hardware configuration is to introduce a function pointer for setting
this up.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Seville has a different bitwise layout than Ocelot and Felix.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Ocelot switches do not support flow control on Ethernet interfaces
where a DSA tag must be added. If pause frames are enabled, they will be
encapsulated in the DSA tag just like regular frames, and the DSA master
will not recognize them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't want ocelot_port_set_maxlen to enable pause frame TX, just to
adjust the pause thresholds.
Move the unconditional enabling of pause TX to ocelot_init_port. There
is no good place to put such setting because it shouldn't be
unconditional. But at the moment it is, we're not changing that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With this patch we try to kill 2 birds with 1 stone.
First of all, some switches that use tag_ocelot.c don't have the exact
same bitfield layout for the DSA tags. The destination ports field is
different for Seville VSC9953 for example. So the choices are to either
duplicate tag_ocelot.c into a new tag_seville.c (sub-optimal) or somehow
take into account a supposed ocelot->dest_ports_offset when packing this
field into the DSA injection header (again not ideal).
Secondly, tag_ocelot.c already needs to memset a 128-bit area to zero
and call some packing() functions of dubious performance in the
fastpath. And most of the values it needs to pack are pretty much
constant (BYPASS=1, SRC_PORT=CPU, DEST=port index). So it would be good
if we could improve that.
The proposed solution is to allocate a memory area per port at probe
time, initialize that with the statically defined bits as per chip
hardware revision, and just perform a simpler memcpy in the fastpath.
Other alternatives have been analyzed, such as:
- Create a separate tag_seville.c: too much code duplication for just 1
bit field difference.
- Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
tag_brcm.c, which would have a separate .xmit function. Again, too
much code duplication for just 1 bit field difference.
- Allocate the template from the init function of the tag_ocelot.c
module, instead of from the driver: couldn't figure out a method of
accessing the correct port template corresponding to the correct
tagger in the .xmit function.
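The proposed solution can be sketched as follows. The header layout
here is invented for the example (the position of the BYPASS bit and
the way the destination bit is computed are assumptions), as are the
names and the port count; only the structure of "initialize once at
probe time, memcpy in the fastpath" mirrors the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INJ_HDR_LEN	16	/* 128-bit injection header */
#define NUM_PORTS	4	/* hypothetical port count */

static uint8_t xmit_template[NUM_PORTS][INJ_HDR_LEN];

/* Probe time: fill in the bits that never change for this port. */
static void template_init(unsigned int port, unsigned int dest_bit_offset)
{
	unsigned int bit = dest_bit_offset + port;

	assert(port < NUM_PORTS);
	assert(bit < INJ_HDR_LEN * 8);

	memset(xmit_template[port], 0, INJ_HDR_LEN);
	xmit_template[port][0] = 0x80;	/* invented position of BYPASS=1 */
	xmit_template[port][bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Fast path: one memcpy; per-packet fields would be patched afterwards. */
static void xmit_fill_header(unsigned int port, uint8_t *hdr)
{
	assert(port < NUM_PORTS);
	memcpy(hdr, xmit_template[port], INJ_HDR_LEN);
}
```

A chip with a different destination ports offset only passes a different
dest_bit_offset at probe time; the fastpath code is identical.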
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently Felix and Ocelot share the same bit layout in these per-port
registers, but Seville does not. So we need reg_fields for that.
Actually since these are per-port registers, we need to also specify the
number of ports, and register size per port, and use the regmap API for
multiple ports.
There's a more subtle point to be made about the other 2 register
fields:
- QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG
- QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE
which we are not writing any longer, for 2 reasons:
- Using the previous API (ocelot_write_rix), we were only writing 1 for
Felix and Ocelot, which was their hardware-default value, and which
there wasn't any intention in changing.
- In the case of SCH_NEXT_CFG, in fact Seville does not have this
register field at all, and therefore, if we want to have common code
we would be required to not write to it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the register definitions for the MSCC MIIM MDIO controller in
preparation for seville_vsc9953.c to create its accessors for the
internal MDIO bus.
Since we've introduced elements to ocelot_regfields that are not
instantiated by felix and ocelot, we need to define the size of the
regfields arrays explicitly, otherwise ocelot_regfields_init, which
iterates up to REGFIELD_MAX, will fault on the undefined regfield
entries (if we're lucky).
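The array-size point can be shown with a small sketch using made-up
names. With designated initializers and no explicit size, the array is
only as long as its highest initialized index plus one, so a loop up to
the enum's MAX value would read past the end. Sizing the array
explicitly zero-fills the entries that a chip does not instantiate.

```c
#include <assert.h>

enum regfield { REGFIELD_A, REGFIELD_B, REGFIELD_NEW, REGFIELD_MAX };

struct reg_field_desc {
	unsigned int reg;
	unsigned int lsb;
	unsigned int msb;
};

/* Explicitly sized: REGFIELD_NEW is zero-initialized, not missing. */
static const struct reg_field_desc old_chip_fields[REGFIELD_MAX] = {
	[REGFIELD_A] = { 0x10, 0, 3 },
	[REGFIELD_B] = { 0x14, 4, 7 },
};

/* Count instantiated entries; a zero 'reg' marks "unused" in this sketch. */
static unsigned int regfields_count_defined(const struct reg_field_desc *f)
{
	unsigned int n = 0;

	assert(f != 0);
	for (int i = 0; i < REGFIELD_MAX; i++)
		if (f[i].reg != 0)
			n++;
	return n;
}
```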
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At the moment, there are some minimal register differences between
VSC7514 Ocelot and VSC9959 Felix. To be precise, the PCS1G registers are
missing from Felix because it was integrated with an NXP PCS.
But with VSC9953 Seville (not yet introduced), the register differences
are more pronounced. The MAC registers are located at different offsets
within the DEV_GMII target. So we need to refactor the driver to keep a
regmap even for per-port registers. The callers of the ocelot_port_readl
and ocelot_port_writel were kept unchanged, only the implementation is
now more generic.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHYLIB is not selected by mdio-mscc-miim but it uses mdio devres helpers.
Explicitly select MDIO_DEVRES in this driver's Kconfig entry.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1814cff267 ("net: phy: add a Kconfig option for mdio_devres")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RED qevents early_drop and mark can be offloaded under the following
fairly strict conditions:
- At most one filter is configured at the qevent block
- The protocol is "any"
- The classifier is matchall
- The action is trap, sample, or mirror with the same conditions as
with other SPAN offloads
- The hw_counters type is none
In this patchset, implement offload of mirror for early_drop qevent.
The ECN trigger is currently not implemented in the FW and therefore
the mark qevent is not supported.
The qevent notifications look exactly like regular block binding
notifications with a binder type that identifies them as qevents.
Therefore the details of processing this binding are fairly similar
to the matchall offload.
struct flow_block_offload.sch points at the qdisc in question. Use it to
figure out if the qdisc is offloaded at all and what TC it configures.
Bounce bindings on not-offloaded qdiscs.
Individual bindings are kept in a list so that several qevents can share
the same block and all binding points get configured as the configured
filters change.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two RED qevents have been introduced recently. From the point of view of a
driver, qevents are simply blocks with unusual binder types. However they
need to be handled by different logic than ACL-like flows.
Thus rename mlxsw_sp_setup_tc_block() to mlxsw_sp_setup_tc_block_clsact()
and move the binder-type dispatch from there to spectrum.c into a new
function of the original name. The new dispatcher is easier to extend with
new binder types.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A following patch introduces offloading of filters attached to blocks bound
to the RED tail_drop qevent. The only classifier that mlxsw will permit in
this role is matchall. mlxsw currently offloads matchall filters used with
clsact qdisc. The data structures used for that offload will come in handy for
the qevent offload as well. Publish them in spectrum.h.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The field "dev" in struct mlxsw_sp_flow_block_binding is not used. Drop it.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No clean-up is performed at the target label of this goto. Convert it to a
direct return.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While the binding of global mirroring triggers to a SPAN agent is
global, packets are only mirrored if they belong to a port and TC on
which the trigger was enabled. This makes it possible, for example, to mirror
packets that were tail-dropped on a specific netdev.
Implement the operations that enable / disable a global
mirroring trigger on a specific port and TC.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Global mirroring triggers are triggers that are only keyed by their
trigger, as opposed to per-port triggers, which are keyed by their
trigger and port.
Such triggers allow mirroring packets that were tail/early dropped or
ECN marked to a SPAN agent.
Implement the previously added trigger operations for these global
triggers. Since such triggers are only supported from Spectrum-2
onwards, have the Spectrum-1 operations return an error.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, a SPAN agent can only be bound to a per-port trigger where
the trigger is either an incoming packet (INGRESS) or an outgoing packet
(EGRESS) to / from the port.
The subsequent patch will introduce the concept of global mirroring
triggers. The binding / unbinding of global triggers is different than
that of per-port triggers. Such triggers also need to be enabled /
disabled on a per-{port, TC} basis and are only supported from
Spectrum-2 onwards.
Add trigger operations that allow us to abstract these differences. Only
implement the operations for per-port triggers. Next patch will
implement the operations for global triggers.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The per-ASIC SPAN operations are relevant to the SPAN module and
therefore should be implemented there and not in the main driver file.
Move them.
These operations will be extended later on.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register is used for global port analyzer configurations.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register is used to configure the mirror enable for different
mirror reasons.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, shared blocks were only relevant for the pseudo-qdiscs ingress
and clsact. Recently, a qevent facility was introduced, which allows binding
blocks to well-defined slots of a qdisc instance. RED in particular
got two qevents: early_drop and mark. Drivers that wish to offload these
blocks will be sent the usual notification, and need to know which qdisc the
block is related to.
To that end, extend flow_block_offload with a "sch" pointer, and initialize
as appropriate. This prompts changes in the indirect block facility, which
now tracks the scheduler in addition to the netdevice. Update signatures of
several functions similarly.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit affects comments (and in one case, whitespace) only.
Throughout the IPA code, return statements are documented using
"@Return:", whereas they should use "Return:" instead. Fix these
mistakes.
In function definitions, some parameters are missing their descriptive
comments, and in structure definitions, some fields are missing
theirs. Add these missing descriptions.
Some arguments changed name and type along the way, but their
descriptions were not updated (an endpoint pointer is now used in
many places that previously used an endpoint ID). Fix these
incorrect parameter descriptions.
In the description for the ipa_clock structure, one field had a
semicolon instead of a colon in its description. Fix this.
Add a missing function description for ipa_gsi_endpoint_data_empty().
All of these issues were identified when building with "W=1".
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
kbuild test robot found that addr_to_string() is available only when
DEBUG is defined. And I found that what that function is doing is
what %pM will do. Thus, replace %s with %pM and remove thread-unsafe
addr_to_string() function.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
The FEC allocates 2K buffers, but loses some of it due to
alignment. It can however support an MTU bigger than the default. This
is particularly interesting when used in combination with Ethernet
switches supporting DSA, which have extra headers. The DSA core will
try to increase the MTU to support these extra headers. If the max
size defaults to that of standard Ethernet we get a warning. By
setting the max to what the driver actually supports, we avoid this
warning.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell switches support jumbo packets. So implement the
callbacks needed for changing the MTU.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case devlink reload failed, it is possible to trigger a
use-after-free when querying the kernel for device info via 'devlink dev
info' [1].
This happens because as part of the reload error path the PCI command
interface is de-initialized and its mailboxes are freed. When the
devlink '->info_get()' callback is invoked the device is queried via the
command interface and the freed mailboxes are accessed.
Fix this by initializing the command interface once during probe and not
during every reload.
This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
and also allows user space to query the running firmware version (for
example) from the device after a failed reload.
[1]
BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355
CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
memcpy+0x39/0x60 mm/kasan/common.c:106
memcpy include/linux/string.h:406 [inline]
mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
__netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0x150/0x190 net/socket.c:672
____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
___sys_sendmsg+0xff/0x170 net/socket.c:2417
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: a9c8336f65 ("mlxsw: core: Add support for devlink info command")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not trigger a warning when a memory allocation fails. Remove
the WARN_ON().
The warning is constantly triggered by syzkaller when it is injecting
faults:
[ 2230.758664] FAULT_INJECTION: forcing a failure.
[ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
[ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
...
[ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
[ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
[ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
[ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Fixes: 3057224e01 ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Utilize new devlink-health port reporters API to move rx and tx
reporters from device to port.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register devlink ports upon NIC init. TX and RX health reporters handle
errors which may occur early on at driver initialization. And because
these reporters are to be moved to port context, they require devlink
ports to be already registered.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The calls to pm_runtime_force_suspend/resume() functions are only
relevant if the device is not configured to act as a WoL wakeup source.
Add the device_may_wakeup() test before calling them.
Fixes: 3e2a5e1539 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As we now use the phylink call to phylink_stop() in the non-WoL path,
there is no need for this call to netif_carrier_off() anymore. It can
disturb the underlying phylink FSM.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keep previous function goals and integrate phylink actions into them.
phylink_ethtool_get_wol() is not enough to figure out if the Ethernet driver
supports Wake-on-Lan.
Initialization of "supported" and "wolopts" members is done in the phylink
function, so there is no need to keep it in the calling function.
phylink_ethtool_set_wol() return value is considered and determines
if the MAC has to handle WoL or not. The case where the PHY doesn't
implement WoL leads to the MAC configuring it to provide this feature.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the way the "magic-packet" DT property is handled in the
macb_probe() function, matching DT binding documentation.
Now we mark the device as "wakeup capable" instead of calling the
device_init_wakeup() function that would enable the wakeup source.
For Ethernet WoL, enabling the wakeup_source is done by
using ethtool and associated macb_set_wol() function that
already calls device_set_wakeup_enable() for this purpose.
That would reduce power consumption by cutting more clocks if
"magic-packet" property is set but WoL is not configured by ethtool.
Fixes: 3e2a5e1539 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the proper struct device pointer to check if the wakeup flag
and wakeup source are set.
Use the one passed by function call which is equivalent to
&bp->dev->dev.parent.
This prevents a spurious interrupt from being triggered when the
Wake-on-Lan feature is used.
Fixes: d54f89af6c ("net: macb: Add pm runtime support")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5 connection tracking offloads updates:
1) Restore CT state from lookup in zone instead of tupleid
On a miss, use this zone + 5 tuple taken from the skb, to lookup the CT
entry and restore it, instead of the driver allocated tuple id.
This improves flow insertion rate by avoiding the allocation of a header
rewrite context to maintain the tupleid.
2) Re-use modify header HW objects for identical modify actions.
3) Expand tunnel register mappings
Reg_c1 is 32 bits wide. Before this patchset, 24 bits were allocated
for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel
options mappings.
Restoring the ct state from zone lookup instead of tuple id requires
reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
mappings.
Expand tunnel and tunnel options register mappings to 12 bits each.
4) Trivial cleanup and fixes.
Merge tag 'mlx5-updates-2020-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-07-09
This series provides updates to mlx5 CT (connection tracking) offloads
For more information please see tag log below.
Please pull and let me know if there is any problem.
The following conflict is expected when net is merged into net-next:
to resolve just use the hunks from net-next.
<<<<<<< HEAD (net-next)
mlx5_tc_ct_del_ft_entry(ct_priv, entry);
kfree(entry);
======= (net)
mlx5_tc_ct_entry_del_rules(ct_priv, entry);
kfree(entry);
>>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af
mlx5 connection tracking offloads updates:
1) Restore CT state from lookup in zone instead of tupleid
On a miss, use this zone + 5 tuple taken from the skb, to lookup the CT
entry and restore it, instead of the driver allocated tuple id.
This improves flow insertion rate by avoiding the allocation of a header
rewrite context to maintain the tupleid.
2) Re-use modify header HW objects for identical modify actions.
3) Expand tunnel register mappings
Reg_c1 is 32 bits wide. Before this patchset, 24 bits were allocated
for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel
options mappings.
Restoring the ct state from zone lookup instead of tuple id requires
reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
mappings.
Expand tunnel and tunnel options register mappings to 12 bits each.
4) Trivial cleanup and fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to new infra, make use of the ability to sleep in the callback.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to new infra, taking advantage of sleeping in callbacks.
v2:
- use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication
that the offload is active.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of new common udp_tunnel_nic infra. ixgbe supports
IPv4 only, and only single VxLAN and Geneve ports (one each).
v2:
- split out the RXCSUM feature handling to separate change;
- declare structs separately;
- use ti.type instead of assuming table 0 is VxLAN;
- move setting netdev->udp_tunnel_nic_info to its own switch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears the clearing of UDP tunnel ports when RXCSUM
is disabled is unnecessary. Driver will not pay attention
to checksum bits if RXCSUM is not set, so we can let
the hardware parse the packets.
Note that the UDP tunnel port NDO handlers don't pay attention
to the state of RXCSUM, so the ports could have been re-programmed,
anyway.
This cleanup simplifies later conversion patch.
v2:
- break this out of the following patch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add UDP tunnel port handlers to our fake driver so we can test
the core infra.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cater to devices which:
(a) may want to sleep in the callbacks;
(b) only have IPv4 support;
(c) need all the programming to happen while the netdev is up.
Drivers attach UDP tunnel offload info struct to their netdevs,
where they declare how many UDP ports of various tunnel types
they support. Core takes care of tracking which ports to offload.
Use a fixed-size array since this matches what almost all drivers
do, and avoids the complexity and uncertainty around memory allocations
in an atomic context.
Make sure that tunnel drivers don't try to replay the ports when
new NIC netdev is registered. Automatic replays would mess up
reference counting, and will be removed completely once all drivers
are converted.
v4:
- use a #define NULL to avoid build issues with CONFIG_INET=n.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this commit, on ft flush, ft entries were not removed
from the ct_tuple hashtables. Fix it.
Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
"flow" parameter is not used in __mlx5_tc_ct_flow_offload_clear(),
remove it.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Instead of having to deal with converting between int and ERR_PTR for
return values in mlx5_tc_ct_flow_offload(), make the internal helper
functions return a ptr to mlx5_flow_handle instead of passing it as an
output param. This also avoids gcc confusion and false alarms, so
remove the redundant ERR_PTR rule initialization.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reg_c1 is 32 bits wide. Originally, 24 bits were allocated for the tuple_id,
6 bits for tunnel mapping and 2 bits for tunnel options mappings.
Restoring the ct state from zone lookup instead of tuple id requires
reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
mappings.
Expand tunnel and tunnel options register mappings to 12 bits each.
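As an illustration of the resulting split, the sketch below packs an 8 bit zone mapping, a 12 bit tunnel mapping and a 12 bit tunnel options mapping into one 32 bit value. The field order and shift values are assumptions made for the example only; the commit message states the widths but not the bit positions, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Field widths as described in the commit message. */
#define ZONE_BITS	8u
#define TUN_OPTS_BITS	12u
#define TUN_ID_BITS	12u

/* Assumed bit positions (not taken from the driver). */
#define ZONE_SHIFT	0u
#define TUN_OPTS_SHIFT	(ZONE_SHIFT + ZONE_BITS)
#define TUN_ID_SHIFT	(TUN_OPTS_SHIFT + TUN_OPTS_BITS)

#define FIELD_MASK(bits)	((1u << (bits)) - 1u)

_Static_assert(ZONE_BITS + TUN_OPTS_BITS + TUN_ID_BITS == 32,
	       "the three mappings must fill the 32 bit register exactly");

/* Pack the three mappings into one register value, truncating each field. */
uint32_t reg_c1_pack(uint32_t zone, uint32_t tun_id, uint32_t tun_opts)
{
	return ((zone & FIELD_MASK(ZONE_BITS)) << ZONE_SHIFT) |
	       ((tun_opts & FIELD_MASK(TUN_OPTS_BITS)) << TUN_OPTS_SHIFT) |
	       ((tun_id & FIELD_MASK(TUN_ID_BITS)) << TUN_ID_SHIFT);
}

uint32_t reg_c1_zone(uint32_t reg)
{
	return (reg >> ZONE_SHIFT) & FIELD_MASK(ZONE_BITS);
}

uint32_t reg_c1_tun_opts(uint32_t reg)
{
	return (reg >> TUN_OPTS_SHIFT) & FIELD_MASK(TUN_OPTS_BITS);
}

uint32_t reg_c1_tun_id(uint32_t reg)
{
	return (reg >> TUN_ID_SHIFT) & FIELD_MASK(TUN_ID_BITS);
}
```

With 12 bits each, up to 4096 tunnel mappings and 4096 tunnel options mappings can be encoded, compared to 64 and 4 with the earlier 6 bit and 2 bit fields.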
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use a single byte mapping for zone restore register (zone matching
remains 16 bit).
This makes room for using the freed 8 bits on register C1 for
mapping more tunnels and tunnel options.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
After removing the tupleid register which changed per tuple,
tuple modify headers set the ct_state, zone, mark, and label registers.
For non-natted tuples going through the same tc rules path, their values
will be the same, and all their modify headers will be the same.
Re-use tuple modify header when possible, by adding each new modify
header to a hashtable, and looking up identical ones before creating
a new one.
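The lookup-before-create idea can be sketched in plain C as follows. This is not the mlx5 implementation: the fixed-size table, the 16 byte key that stands in for the encoded modify actions, the hw_id counter and the function names are all hypothetical, and locking is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MH_KEY_LEN	16	/* hypothetical encoding of the modify actions */
#define MH_TABLE_SIZE	64

struct mod_hdr_entry {
	uint8_t key[MH_KEY_LEN];
	unsigned int refcnt;	/* 0 means the slot is free */
	int hw_id;		/* stand-in for the HW object handle */
};

struct mod_hdr_table {
	struct mod_hdr_entry slots[MH_TABLE_SIZE];
	int next_hw_id;
};

/* FNV-1a hash over the key bytes. */
static uint32_t mh_hash(const uint8_t *key)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < MH_KEY_LEN; i++) {
		h ^= key[i];
		h *= 16777619u;
	}
	return h;
}

/* Return an existing entry with an identical key, or create a new one. */
struct mod_hdr_entry *mod_hdr_get(struct mod_hdr_table *t, const uint8_t *key)
{
	uint32_t start = mh_hash(key) % MH_TABLE_SIZE;
	struct mod_hdr_entry *free_slot = NULL;
	uint32_t i;

	for (i = 0; i < MH_TABLE_SIZE; i++) {
		struct mod_hdr_entry *e = &t->slots[(start + i) % MH_TABLE_SIZE];

		if (e->refcnt == 0) {
			if (!free_slot)
				free_slot = e;	/* remember the first free slot */
			continue;
		}
		if (memcmp(e->key, key, MH_KEY_LEN) == 0) {
			e->refcnt++;		/* identical actions: re-use */
			return e;
		}
	}

	if (!free_slot)
		return NULL;			/* table full */

	memcpy(free_slot->key, key, MH_KEY_LEN);
	free_slot->refcnt = 1;
	free_slot->hw_id = t->next_hw_id++;
	return free_slot;
}

/* Drop one reference; the slot becomes free when the count reaches zero. */
void mod_hdr_put(struct mod_hdr_entry *e)
{
	if (e->refcnt > 0)
		e->refcnt--;
}
```

Two callers that pass the same key get the same entry with a reference count of two, so only one HW object would be needed for both.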
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor sharing of mod headers to new file and while there,
remove spin lock and flows list, as this is only used for warn on.
Use the generic API in the next patch to re-use tuple modify headers
for identical modify actions.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Remove tupleid, and replace it with zone_restore, which is the zone an
established tuple sets after match. On miss, use this zone + tuple
taken from the skb, to lookup the ct entry and restore it.
This improves flow insertion rate by avoiding the allocation of a header
rewrite context.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Next patches will remove the tupleid register that is used
to restore the ct state on miss, and instead use the tuple on
the missed packet to lookup which state to restore.
Disable tuple rewrites after connection tracking.
For tuple rewrites, inject a ct_state=-trk match so it won't
change the tuple for established flows (+trk) that passed connection
tracking, and instead miss to software.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The next patch will pass the mlx5e_priv struct to the
modify_header_match_supported method. Use this opportunity to refactor
the existing pr_info call to a netdev_info call.
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
With ct clear we don't jump to the ct tables, so header rewrite
of 5-tuple can be done in place (and not moved to after the CT action).
Check for ct clear action, and if so, allow 5-tuple header
rewrite.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Save original tuple and natted tuple in two new hashtables.
This is a pre-step for restoring ct state after hw miss by performing a
5-tuple lookup on the hash tables.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When eswitch is unsupported, the -EPERM error code is currently returned
instead of -EOPNOTSUPP.
As a result, the VF device's devlink virtual port is not enumerated, because
the port_function_get() callback returns -EPERM instead of -EOPNOTSUPP.
Hence, return the error code -EOPNOTSUPP when eswitch is unsupported.
Fixes: bd93975353 ("net/mlx5: E-switch, Introduce and use eswitch support check helper")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
CT entries are deleted via a workqueue from netfilter. If removing the
module before that, the rules are cleaned by the driver itself, but the
memory entries for them are not freed. Fix that.
Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The device unit for port buffer size, xoff_threshold and xon_threshold is
cells. Fix a bug in the driver where the cell unit size was hard-coded to
128 bytes, which is wrong for some hardware versions.
Make the driver read the cell size from the SBCAM register and translate
bytes to cell units accordingly.
In order to fix the bug, this patch exposes SBCAM (Shared buffer
capabilities mask) layout and defines.
If SBCAM.cap_cell_size is valid, use it for all bytes to cells
calculations. If not valid, fall back to 128.
The cell size does not change on the fly per device. Instead of issuing an
SBCAM access reg command every time such translation is needed, cache it in
mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz
as a param to every function that needs bytes to cells translation.
While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to
en_dcbnl.c, as it is only used by that file.
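A minimal sketch of the translation described above, under two assumptions that the commit message does not spell out: a cap_cell_size of zero is treated as "not valid", and partial cells are truncated by integer division. The function names are hypothetical and are not the driver's helpers.

```c
#include <stdint.h>

/* Fallback when the device does not report a valid cell size: 1 << 7 = 128. */
#define DEFAULT_CELL_SHIFT	7u

/* Pick the cell size once (e.g. at init time) and cache the result. */
uint32_t port_buffer_cell_size(uint32_t cap_cell_size)
{
	/* Assumption: zero means the capability field is not valid. */
	return cap_cell_size ? cap_cell_size : (1u << DEFAULT_CELL_SHIFT);
}

/* Translate a byte count to device cells using the cached cell size. */
uint32_t bytes_to_cells(uint32_t bytes, uint32_t cell_size)
{
	return bytes / cell_size;	/* assumption: truncating division */
}

/* Translate device cells back to bytes. */
uint32_t cells_to_bytes(uint32_t cells, uint32_t cell_size)
{
	return cells * cell_size;
}
```

With a reported cell size of 256 bytes, a 4096 byte buffer is 16 cells; with the 128 byte fallback, the same buffer is 32 cells, which is why the hard-coded value gives wrong results on hardware with a different cell size.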
Fixes: 0696d60853 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Some released FW versions mistakenly don't set the capability that 50G
per lane link-modes are supported for VFs (ptys_extended_ethernet
capability bit). When the capability is unset, read
PTYS.ext_eth_proto_capability (always reliable).
If PTYS.ext_eth_proto_capability is valid (has a non-zero value)
conclude that the HCA supports 50G per lane. Otherwise, conclude that
the HCA doesn't support 50G per lane.
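The decision described above reduces to a two step check, sketched here with hypothetical parameter names. The real driver reads these values from firmware capabilities and the PTYS register; the sketch takes them as plain arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether 50G per lane link modes are supported.
 * cap_bit:  the ptys_extended_ethernet capability bit (may be wrongly unset)
 * ext_caps: value of PTYS.ext_eth_proto_capability
 */
bool supports_50g_per_lane(bool cap_bit, uint32_t ext_caps)
{
	if (cap_bit)
		return true;

	/* Capability unset: fall back to the register, non-zero means valid. */
	return ext_caps != 0;
}
```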
Fixes: a08b4ed137 ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When detaching netdev, remove vxlan port configuration using
udp_tunnel_drop_rx_info. During function reload, configuration will be
restored using udp_tunnel_get_rx_info. This ensures sync between
firmware and driver. Use udp_tunnel_get_rx_info even if its physical
interface is down.
Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In mlx5e_configure_flower() the flow pointer is protected by rcu read lock.
However, after the cited commit the pointer is being used outside of the rcu
read block. Extend the block to protect all pointer accesses.
Fixes: 553f932838 ("net/mlx5e: Support tc block sharing for representors")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix eeprom SFP query support by setting i2c_addr, offset and page number
correctly. Unlike QSFP modules, SFP eeprom params are as follows:
- i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511.
- Page number is always zero.
- Page offset is always relative to zero.
As part of eeprom query, query the module ID (SFP / QSFP*) via helper
function to set the params accordingly.
In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid
unnecessary casting.
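The SFP addressing rules above can be sketched as below. The struct and function names are hypothetical, and the reading of "relative to zero" as "offset within the selected I2C address" is an assumption of this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SFP_I2C_ADDR_LOW	0x50
#define SFP_I2C_ADDR_HIGH	0x51
#define SFP_BANK_SIZE		256

struct eeprom_query {
	uint8_t i2c_addr;
	uint8_t page;
	uint16_t offset;	/* offset within the selected I2C address */
};

/*
 * Translate a flat SFP EEPROM offset (0 - 511) into query parameters.
 * Returns false if the offset is outside the SFP address space.
 */
bool sfp_eeprom_params(uint16_t offset, struct eeprom_query *q)
{
	if (offset >= 2 * SFP_BANK_SIZE)
		return false;

	q->page = 0;			/* page number is always zero for SFP */
	if (offset < SFP_BANK_SIZE) {
		q->i2c_addr = SFP_I2C_ADDR_LOW;
		q->offset = offset;
	} else {
		q->i2c_addr = SFP_I2C_ADDR_HIGH;
		q->offset = offset - SFP_BANK_SIZE;
	}
	return true;
}
```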
Fixes: a708fb7b1f ("net/mlx5e: ethtool, Add support for EEPROM high pages query")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, all the input checks are done in driver.
After adding the split capability to devlink port, move the checks to
devlink.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new attribute that indicates the split ability of devlink port.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, port attributes like flavour, port number and whether the port
was split are set when initializing a port.
Set the split ability of the port as well, based on port_mapping->width
field and split attribute of devlink port in spectrum, so that it could be
easily passed to devlink in the next patch.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new devlink port attribute that indicates the port's number of lanes.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
The attribute is not passed to user space in case the number of lanes is
invalid (0).
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, port attributes like flavour, port number and whether the
port was split are set when initializing a port.
Set the number of lanes of the port as well so that it could be easily
passed to devlink in the next patch.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, devlink_port_attrs_set() accepts a long list of parameters,
most of which are devlink port attributes.
Use the devlink_port_attrs struct to replace the relevant parameters.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/phy/mscc/mscc_ptp.c:1496:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
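For readers unfamiliar with the idiom, here is a user-space sketch of what
the error-pointer helpers do. The lower-case names are stand-ins written
for this example, not the kernel macros themselves, and the sketch assumes
the usual convention that the top 4095 addresses encode negative errno
values.

```c
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the reserved error range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Replaces the open-coded "if (is_err(p)) return ptr_err(p); return 0;". */
static inline int ptr_err_or_zero(const void *ptr)
{
	return is_err(ptr) ? (int)ptr_err(ptr) : 0;
}
```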
Fixes: 7d272e63e0 ("net: phy: mscc: timestamping and PHC support")
CC: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the u16 skb->vlan_tci is being right shifted twice by
VLAN_PRIO_SHIFT, once in the macro skb_vlan_tag_get_prio and explicitly
by VLAN_PRIO_SHIFT afterwards. The combined shift amount is larger than
the u16 so the end result is always zero. Remove the second explicit
shift as this is extraneous.
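The effect is easy to reproduce outside the kernel. The sketch below uses
the standard 802.1Q layout (priority in the top three bits of the TCI, so
a shift of 13); the helper names are made up for the example.

```c
#include <stdint.h>

#define VLAN_PRIO_MASK	0xe000
#define VLAN_PRIO_SHIFT	13

/* Extract the 3-bit priority: mask, then shift once. */
static inline unsigned int vlan_prio_once(uint16_t tci)
{
	return (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
}

/* Shifting the already-extracted priority again leaves nothing. */
static inline unsigned int vlan_prio_twice(uint16_t tci)
{
	return vlan_prio_once(tci) >> VLAN_PRIO_SHIFT;
}
```

With a TCI of 0xA005 the single shift yields priority 5, while the double
shift yields 0 for every possible input.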
Fixes: 6e9fdb60d3 ("net: systemport: Add support for VLAN transmit acceleration")
Addresses-Coverity: ("Operands don't affect result")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_broadcast_addr() to assign the broadcast address instead of
memset().
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
NVM config file address will be modified when the MBI image is upgraded.
Driver would return stale config values if user reads the nvm-config
(via ethtool -d) in this state. The fix is to re-populate nvm attribute
info while reading the nvm config values/partition.
Changes from previous version:
-------------------------------
v3: Corrected the formatting in 'Fixes' tag.
v2: Added 'Fixes' tag.
Fixes: 1ac4329a1c ("qed: Add configuration information to register dump and debug data")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bond_ipsec_* helpers don't need RTNL, and can potentially get called
without it being held, so switch from rtnl_dereference() to
rcu_dereference() to access bond struct data.
Lightly tested with xfrm bonding, no problems found, should address the
syzkaller bug referenced below.
Reported-by: syzbot+582c98032903dcc04816@syzkaller.appspotmail.com
CC: Huy Nguyen <huyn@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert all-mask IP address to Big Endian, instead, for comparison.
Fixes: f286dd8eaa ("cxgb4: use correct type for all-mask IP address comparison")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's possible that device removal happens when the bond is in non-AB mode,
and addition happens in AB mode, so bond_ipsec_del_sa() never gets called,
which leaves security associations in an odd state if bond_ipsec_add_sa()
then gets called after switching the bond into AB. Just call add and
delete universally for all modes to keep things consistent.
However, it's also possible that this code gets called when the system is
shutting down, and the xfrm subsystem has already been disconnected from
the bond device, so we need to do some error-checking and bail, lest we
hit a null ptr deref.
Fixes: a3b658cfb6 ("bonding: allow xfrm offload setup post-module-load")
CC: Huy Nguyen <huyn@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This activates the support to use the CPU tag to properly
direct ingress traffic to the right port.
Bit 15 in register RTL8368RB_CPU_CTRL_REG can be set to
1 to disable the insertion of the CPU tag which is what
the code currently does. The bit 15 define calls this
setting RTL8368RB_CPU_INSTAG which is confusing since the
inverse meaning is implied: programmers may think that
setting this bit to 1 will *enable* inserting the tag
rather than disabling it, so rename this setting in
bit 15 to RTL8368RB_CPU_NO_TAG which is more to the
point.
After this e.g. ping works out-of-the-box with the
RTL8366RB.
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch exposes new link modes using 100Gbps per lane, including 100G,
200G and 400G modes.
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define 100G, 200G and 400G link modes using 100Gbps per lane
LR, ER and FR are defined as a single link mode because they are
using same technology and by design are fully interoperable.
EEPROM content indicates if the module is LR, ER, or FR, and the
user space ethtool decoder is planned to support decoding these
modes in the EEPROM.
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
CC: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bare-metal use cases require giving firmware and the embedded
application processor control over VLAN offloads. The driver should
not attempt to override or utilize this feature in such scenarios
since it will not work as expected.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware VLAN offload feature on our NIC does not have separate
knobs for handling customer and service tags on RX. Either offloading
of both must be enabled or both must be disabled. Introduce definitions
for the combined feature set in order to clean up the code and make
this constraint more clear. Technically these features can be separately
enabled on TX, however, since the default is to turn both on, the
combined TX feature set is also introduced for code consistency.
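One way to picture the constraint is a fix-up step that keeps the two RX
bits in lockstep. This is only a sketch: the bit values and names are
hypothetical stand-ins for the netdev feature flags, and the policy shown
(a change to either tag applies to both) is an assumption about how a
driver might enforce it.

```c
#include <stdint.h>

/* Hypothetical feature bits standing in for the netdev feature flags. */
#define F_VLAN_CTAG_RX	(1u << 0)
#define F_VLAN_STAG_RX	(1u << 1)
#define F_VLAN_ALL_RX	(F_VLAN_CTAG_RX | F_VLAN_STAG_RX)

/*
 * Keep RX C-tag and S-tag offload in lockstep. 'active' is the feature
 * set currently in effect, 'requested' is what the user asked for.
 */
static inline uint32_t vlan_rx_fix_features(uint32_t active,
					    uint32_t requested)
{
	uint32_t rx = requested & F_VLAN_ALL_RX;

	if (rx != F_VLAN_ALL_RX) {
		if (active & F_VLAN_ALL_RX)
			requested &= ~F_VLAN_ALL_RX;	/* one turned off: drop both */
		else if (rx)
			requested |= F_VLAN_ALL_RX;	/* one turned on: enable both */
	}
	return requested;
}
```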
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the new infrastructure in place, we can now support the setting of
the indirection table from ethtool.
When changing channels, in a rare case that firmware cannot reserve the
rings that were promised, we will still try to keep the RSS map and only
revert to default when absolutely necessary.
v4: Revert RSS map to default during ring change only when absolutely
necessary.
v3: Add warning messages when firmware cannot reserve the requested RX
rings, and when the RSS table entries have to change to default.
v2: When changing channels, if the RSS table size changes and RSS map
is non-default, return error.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have the logical indirection table, we can return these
proper logical indices directly to ethtool -x instead of the physical
IDs.
Reported-by: Jakub Kicinski <kicinski@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have the logical table, we can fill the HW RSS table
using the logical table's entries and converting them to the HW
specific format. Re-initialize the logical table to standard
distribution if the number of RX rings changes during ring reservation.
v4: Use bnxt_get_rxfh_indir_size() to get the RSS table size.
v2: Use ALIGN() to roundup the RSS table size.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some chips, this varies based on the number of RX rings. Add this
helper function and refactor the existing code to use it.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver currently does not keep track of the logical RSS indirection
table. The hardware RSS table is set up with standard default ring
distribution when initializing the chip. This makes it difficult to
support user specified indirection table entries. As a first step, add
the logical table in the main bnxt structure and allocate it according
to chip specific table size. Add a function that sets up default
RSS distribution based on the number of RX rings.
v4: Use bnxt_get_rxfh_indir_size() for the current RSS table size.
v2: Use kmalloc_array() since we init. all entries afterwards.
Use ALIGN() to roundup the RSS table size.
Use ethtool_rxfh_indir_default() to init. the default entries.
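A minimal user-space sketch of the idea: allocate a table rounded up to a
chip-specific granularity and fill it with the default round-robin
distribution. The function name and the granularity of 64 are assumptions
made for the example; the fill mirrors the modulo spreading that
ethtool_rxfh_indir_default() performs.

```c
#include <stdint.h>
#include <stdlib.h>

/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))

/* Allocate the logical table and spread entries evenly over the rings. */
static inline uint16_t *rss_indir_tbl_alloc(unsigned int n_entries,
					    unsigned int n_rx_rings)
{
	unsigned int size = ALIGN_UP(n_entries, 64u);	/* 64 is illustrative */
	uint16_t *tbl;
	unsigned int i;

	if (!n_rx_rings)
		return NULL;

	tbl = malloc(size * sizeof(*tbl));
	if (!tbl)
		return NULL;

	/* every entry is initialized, so no zeroing allocator is needed */
	for (i = 0; i < size; i++)
		tbl[i] = (uint16_t)(i % n_rx_rings);

	return tbl;
}
```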
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up bnxt_get_rxfh_indir_size() to return the proper current RSS
table size for P5 chips. Change it to non-static so that bnxt.c
can use it to get the table size.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we allocate one page for the hardware DMA RSS indirection
table. While the size is currently big enough for all chips, future
chip variations may support bigger sizes, so it is better to calculate
and store the chip specific size and allocate accordingly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have moved the PHY ethtool statistics to be dynamically
registered, we no longer need to inline those for ethtool. This used to
be done to avoid cross symbol referencing and allow ethtool to be
decoupled from PHYLIB entirely.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend ethtool_phy_ops to include the 3 function pointers necessary for
implementing PHY statistics. In a subsequent change we will uninline
those functions.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes ip dst and ipv6 address filters.
There were 2 mistakes in the code, which led to the issue:
* invalid register was used for ipv4 dst address;
* incorrect write order of dwords for ipv6 addresses.
Fixes: 23e7a718a4 ("net: aquantia: add rx-flow filter definitions")
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have a number of error conditions that can lead to the driver not
probing successfully, move the print to the point where we are sure
dsa_register_switch() has succeeded. This avoids repeated prints in case
of probe deferral for instance.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The queue reset pattern is used in a couple of different places,
only slightly different from each other, and could cause
issues if one gets changed and the other doesn't. This puts
them together so that only one version is needed, yet each
can have slightly different effects by passing in a pointer
to a work function to do whatever configuration twiddling is
needed in the middle of the reset.
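The shape of such a shared helper can be sketched as follows. All names
here are hypothetical, and restarting the queues even when the work
function fails is an assumption of this sketch rather than something the
commit message states.

```c
#include <stddef.h>

/* Hypothetical per-interface state; only what the sketch needs. */
struct lif {
	int running;
	unsigned int ring_size;
};

typedef int (*reset_cb_t)(struct lif *lif, void *arg);

/* Stop the queues, run the optional reconfiguration step, restart. */
static inline int lif_reset_queues(struct lif *lif, reset_cb_t cb, void *arg)
{
	int err = 0;

	lif->running = 0;		/* stop */
	if (cb)
		err = cb(lif, arg);	/* caller-specific twiddling */
	lif->running = 1;		/* restart, even if cb failed */

	return err;
}

/* Example work function: change the ring size while stopped. */
static inline int set_ring_size(struct lif *lif, void *arg)
{
	if (lif->running)
		return -1;
	lif->ring_size = *(unsigned int *)arg;
	return 0;
}
```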
This specifically addresses issues seen where under loops
of changing ring size or queue count parameters we could
occasionally bump into the netdev watchdog.
v2: added more commit message commentary
Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When generating a debug dump, the driver first collects all data in binary
form, and then performs per-feature formatting to human-readable form if
it is supported.
For ethtool -d, this is roughly incorrect for two reasons. First of all,
drivers should always provide only original raw dumps to Ethtool without
any changes.
The second, and more critical, is that Ethtool's output buffer size is
strictly determined by ethtool_ops::get_regs_len(), and all data *must*
fit in it. The current version of driver always returns the size of raw
data, but the size of the formatted buffer exceeds it in most cases.
This leads to out-of-bound writes and memory corruption.
Address both issues by adding an option to return original, non-formatted
debug data, and using it for Ethtool case.
v2:
- Expand commit message to make it more clear;
- No functional changes.
Fixes: c965db4446 ("qed: Add support for debug data collection")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are spelling mistakes in various literal strings. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Utilize ethtool_set_ethtool_phy_ops to register a suitable set of PHY
ethtool operations in a dynamic fashion such that ethtool will no longer
directly reference PHY library symbols.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If phylib is built as a module and CONFIG_MDIO_DEVICE is 'y', the
mdio_device and mdio_bus code will be in the phylib module, not in the
kernel image. Meanwhile we build mdio_devres depending on the
CONFIG_MDIO_DEVICE symbol, so if it's 'y', it will go into the kernel
and we'll hit the following linker error:
ld: drivers/net/phy/mdio_devres.o: in function `devm_mdiobus_alloc_size':
>> drivers/net/phy/mdio_devres.c:38: undefined reference to `mdiobus_alloc_size'
ld: drivers/net/phy/mdio_devres.o: in function `devm_mdiobus_free':
>> drivers/net/phy/mdio_devres.c:16: undefined reference to `mdiobus_free'
ld: drivers/net/phy/mdio_devres.o: in function `__devm_mdiobus_register':
>> drivers/net/phy/mdio_devres.c:87: undefined reference to `__mdiobus_register'
ld: drivers/net/phy/mdio_devres.o: in function `devm_mdiobus_unregister':
>> drivers/net/phy/mdio_devres.c:53: undefined reference to `mdiobus_unregister'
ld: drivers/net/phy/mdio_devres.o: in function `devm_of_mdiobus_register':
>> drivers/net/phy/mdio_devres.c:120: undefined reference to `of_mdiobus_register'
Add a hidden Kconfig option for MDIO_DEVRES which will be currently
selected by CONFIG_PHYLIB as there are no non-phylib users of these
helpers.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: ac3a68d566 ("net: phy: don't abuse devres in devm_mdiobus_register()")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rwlock.h should not be included directly. Instead linux/spinlock.h
should be included. Including it directly will break the RT build.
Fixes: 549c243e4e ("net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the ISR, we poll the event register for the queues in need of
service and then enter polled mode. After this point, the event
register will never be read again until we exit polled mode.
In a scenario where a UDP flow is routed back out through the same
interface, i.e. "router-on-a-stick", we'll typically only see an rx
queue event initially. Once we start to process the incoming flow
we'll be locked in polled mode, but we'll never clean the tx rings since
that event is never caught.
Eventually the netdev watchdog will trip, causing all buffers to be
dropped and then the process starts over again.
Rework the NAPI poll to keep trying to consume the entire budget as
long as new events are coming in, making sure to service all rx/tx
queues, in priority order, on each pass.
Fixes: 4d494cdc92 ("net: fec: change data structure to support multiqueue")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Fugang Duan <fugang.duan@nxp.com>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clang static analysis flags this garbage return
drivers/net/ethernet/marvell/sky2.c:208:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
return v;
^~~~~~~~
static inline u16 gm_phy_read( ...
{
u16 v;
__gm_phy_read(hw, port, reg, &v);
return v;
}
__gm_phy_read can return without setting v.
So handle similar to skge.c's gm_phy_read, initialize v.
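The pattern of the fix can be sketched in user space as below.
fake_phy_read() is a hypothetical stand-in for __gm_phy_read() that, like
the real function, may return without writing its output; the initial
value of 0 is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in: fails without touching *val for bad registers. */
static inline int fake_phy_read(unsigned int reg, uint16_t *val)
{
	if (reg > 31)
		return -1;	/* error path leaves *val unwritten */
	*val = (uint16_t)(0x1000 + reg);
	return 0;
}

/* Wrapper that always returns a defined value. */
static inline uint16_t phy_read_checked(unsigned int reg)
{
	uint16_t v = 0;		/* initialized, so the error path is defined */

	fake_phy_read(reg, &v);
	return v;
}
```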
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should not use legacy power management as they have to manage power
states and related operations, for the device, themselves. This driver was
handling them with the help of PCI helper functions.
With generic PM, all essentials will be handled by the PCI core. Driver
needs to do only device-specific operations.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should not use legacy power management as they have to manage power
states and related operations, for the device, themselves.
With generic PM, all essentials will be handled by the PCI core. Driver
needs to do only device-specific operations.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
priv->page_pool is an array, so comparing against it will always return true.
Do a meaningful check by checking priv->page_pool[0] instead.
While at it, clear the page_pool pointers on deallocation, or when an
allocation error happens during init.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: c2d6fe6163 ("mvpp2: XDP TX support")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain configurations without power management support, gcc reports
the following warning:
drivers/net/ethernet/sun/cassini.c:5206:12: warning:
'cas_resume' defined but not used [-Wunused-function]
5206 | static int cas_resume(struct device *dev_d)
| ^~~~~~~~~~
Mark cas_resume() as __maybe_unused to make it clear.
Fixes: f193f4ebde ("sun/cassini: use generic power management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The upgraded .suspend() and .resume() throw
"defined but not used [-Wunused-function]" warning for certain
configurations.
Mark them with "__maybe_unused" attribute.
Compile-tested only.
Fixes: b0db0cc2f6 ("sun/niu: use generic power management")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To ensure that the octeon MDIO driver has been loaded, the Cavium
ethernet drivers reference a dummy symbol in the MDIO driver. This
forces it to be loaded first. And this symbol has not been cleanly
implemented, resulting in warnings when building with W=1 C=1.
Since device tree is being used, and a phandle points to the PHY on
the MDIO bus, we can make use of deferred probing. If the PHY fails to
connect, it should be because the MDIO bus driver has not loaded
yet. Return -EPROBE_DEFER so it will be tried again later.
Additionally, add a MODULE_SOFTDEP() to give user space a hint as to
what order it should load the modules.
v2:
s/octoen/octeon/
Add MODULE_SOFTDEP()
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MIPS low level register access functions seem to be missing
__iomem annotation. This causes lots of sparse warnings, when code
casts off the __iomem. Make the Cavium MDIO drivers cleaner by pushing
the casts lower down into the helpers, allow the drivers to work as
normal, with __iomem.
bus->register_base is now a void *, rather than a u64. So forming the
mii_bus->id string cannot use %llx any more. Use %px, so this kernel
address is still exposed to user space, as it was before.
v2: s/cases/causes/g
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: Robert Richter <rrichter@marvell.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
ntohs() expects to be passed a __be16. Correct the type of the
variable holding the sequence ID.
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This array is not used outside of phy_device.c, so make it static.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid the W=1 warning that symbol 'genphy_c45_driver' was not
declared. Should it be static?
Declare it in the phy header file.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the kerneldoc for a few structure and function calls,
as reported by C=1 W=1.
Cc: Alexandru Ardelean <alexaundru.ardelean@analog.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
By placing the GENMASK value into an unsigned int and then passing it
to FIELD_PREP(), the type is reduced down from ULL. Given the reduced
size of the type, the range checks in FIELD_PREP() are always true, and
-Wtype-limits then gives a warning.
By skipping the intermediate variable, the warning can be avoided.
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dynamically generate a unique GPIO interrupt name, based on the
device name and the GPIO name. For example:
103: 0 sx1503q 12 Edge sff2-los
104: 0 sx1503q 13 Edge sff2-tx-fault
The sffX indicates the SFP the los and tx-fault are associated with.
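A user-space sketch of the name construction, using snprintf() in place of
devm_kasprintf(). The function name is hypothetical and the "%s-%s" format
is inferred from the example output above, not taken from the driver.

```c
#include <stdio.h>

/* Build "<device>-<gpio>" into buf; returns the untruncated length. */
static inline int gpio_irq_name(char *buf, size_t len,
				const char *dev_name, const char *gpio_name)
{
	return snprintf(buf, len, "%s-%s", dev_name, gpio_name);
}
```

Called with "sff2" and "tx-fault", this produces the "sff2-tx-fault" name
seen in the listing.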
v3:
- reverse Christmas tree new variable
- fix spaces vs tabs
v2:
- added net-next to PATCH part of subject line
- switched to devm_kasprintf()
Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include "ipa_gsi.h" in "ipa_gsi.c", so the public functions are
defined before they are used in "ipa_gsi.c". This addresses some
warnings that are reported with a "W=1" build.
Fixes: c3f398b141 ("soc: qcom: ipa: IPA interface to GSI")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pointers to two struct types are used in "ipa_gsi.h", without those
struct types being forward-declared. Add these declarations.
Fixes: c3f398b141 ("soc: qcom: ipa: IPA interface to GSI")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Building with "W=1" did exactly what it was supposed to do, namely
point out some suspicious-looking code to be verified not to contain
bugs.
Some QMI message structures defined in "ipa_qmi_msg.c" contained
some bad field names (duplicating the "elem_size" field instead of
defining the "offset" field), almost certainly due to copy/paste
errors that weren't obvious in a scan of the code. Fix these bugs.
Fixes: 530f9216a9 ("soc: qcom: ipa: AP/modem communications")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
SYSTEMPORT is capable of performing VLAN transmit acceleration, support
that by configuring it appropriately, providing the VLAN ID and PCP/DEI
where necessary.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flow Dissector's keys are mostly Network / Big Endian. U{16,32}_MAX are
the same in either of byteorders, but let's make sparse happy with
wrapping them into noops.
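That the all-ones constants are byte-order invariant can be checked with a
plain byte swap. The swab helpers below are written out for the example
and are not the kernel's own.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static inline uint16_t swab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static inline uint32_t swab32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) << 8)  |
	       ((x & 0x00ff0000u) >> 8)  |
	       ((x & 0xff000000u) >> 24);
}
```

Swapping 0xffff or 0xffffffff returns the same value, whereas an ordinary
value such as 0x1234 becomes 0x3412, which is why only the type
annotation matters for the MAX constants.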
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the function arguments was renamed some time ago, but this
wasn't reflected in its kernel-doc comment.
Also add the description for return values.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current code assumes that both host and device operate in Little Endian
in lots of places. While this is true for the x86 platform, this doesn't
mean we should not care about this.
This commit addresses all parts of the code that were pointed out by sparse
checker. All operations with restricted (__be*/__le*) types are now
protected with explicit from/to CPU conversions, even if they're noops on
common setups.
I'm sure there are more such places, but this implies a deeper code
investigation, and is a subject for future work.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use intermediate pointers instead of multiple dereferencing to
simplify and beautify parts of code that will be addressed in
the next commit.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To not mix functional and stylistic changes, correct indentation
of code that will be modified in the subsequent commits.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of the kernel-doc warnings when building with W=1+ by
rewriting the problematic doc comments according to the
recommended format and style.
Note that this only fixes problems found in C source files,
headers aren't in scope for now.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the prototype of qed_hw_err_notify() with the following:
* constify "fmt" argument according to printk() declarations;
* annotate it with __cold attribute to move the function out of
the line;
* annotate it with __printf() attribute;
This eliminates W=1+ warning:
drivers/net/ethernet/qlogic/qed/qed_hw.c: In function
‘qed_hw_err_notify’:
drivers/net/ethernet/qlogic/qed/qed_hw.c:851:3: warning: function
‘qed_hw_err_notify’ might be a candidate for ‘gnu_printf’ format
attribute [-Wsuggest-attribute=format]
len = vsnprintf(buf, QED_HW_ERR_MAX_STR_SIZE, fmt, vl);
^~~
as well as saves some code size:
add/remove: 0/0 grow/shrink: 2/4 up/down: 40/-125 (-85)
Function old new delta
qed_dmae_execute_command 1680 1711 +31
qed_spq_post 1104 1113 +9
qed_int_sp_dpc 3554 3545 -9
qed_mcp_cmd_and_union 1896 1876 -20
qed_hw_err_notify 395 352 -43
qed_mcp_handle_events 2630 2577 -53
Total: Before=368645, After=368560, chg -0.02%
__printf() will also be helpful with catching bad format strings
and arguments.
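For illustration, the same annotations can be applied to a small
user-space function with the GCC/Clang attribute syntax that the kernel's
__cold and __printf() macros wrap. The function name and signature are
hypothetical, not those of qed_hw_err_notify().

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * fmt is parameter 3 and the variadic arguments start at parameter 4,
 * so the compiler can check callers' format strings. 'cold' marks the
 * function as rarely executed.
 */
__attribute__((cold, format(printf, 3, 4)))
static int err_notify(char *buf, size_t len, const char *fmt, ...)
{
	va_list vl;
	int n;

	va_start(vl, fmt);
	n = vsnprintf(buf, len, fmt, vl);
	va_end(vl);

	return n;
}
```

With the attribute in place, a call such as err_notify(buf, len, "%s", 5)
would be flagged by -Wformat at compile time.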
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix several sparse warnings by moving struct declarations into
the corresponding header files:
drivers/net/ethernet/qlogic/qed/qed_dcbx.c:2402:32: warning:
symbol 'qed_dcbnl_ops_pass' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_ll2.c:2754:26: warning: symbol
'qed_ll2_ops_pass' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_ptp.c:449:30: warning: symbol
'qed_ptp_ops_pass' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_sriov.c:5265:29: warning:
symbol 'qed_iov_ops_pass' was not declared. Should it be static?
(some of them were declared twice in different header files)
Also make qed_hw_err_type_descr[] const while at it.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Static variables (and functions, unless they're inline) should not
be declared in header files.
Move the static array iro_arr[] from "qed_hsi.h" to the sole place
where it's used, "qed_init_ops.c". This eliminates lots of warnings
(42 of them actually) against W=1+:
In file included from drivers/net/ethernet/qlogic/qed/qed.h:51:0,
from drivers/net/ethernet/qlogic/qed/qed_ooo.c:40:
drivers/net/ethernet/qlogic/qed/qed_hsi.h:4421:18: warning: 'iro_arr'
defined but not used [-Wunused-const-variable=]
static const u32 iro_arr[] = {
^~~~~~~
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new structure geneve_config and moves the per-device
configuration attributes to it, like we already have in VXLAN with
struct vxlan_config. This ends up being pretty invasive since those
attributes are used everywhere.
This allows us to clean up the argument lists for geneve_configure (4
arguments instead of 8) and geneve_nl2info (5 instead of 9).
This also reduces the copy-paste of code setting those attributes
between geneve_configure and geneve_changelink to a single memcpy,
which would have avoided the bug fixed in commit
56c09de347 ("geneve: allow changing DF behavior after creation").
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
On link down, the draining of the S/G cache should be done on all
_possible_ CPUs, not just the ones that are online at that moment.
Fix this by changing the iterator.
Fixes: d70446ee1f ("dpaa2-eth: send a scatter-gather FD instead of realloc-ing")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The assignment 'err = -ENODEV;' in au1000_probe() is
duplicated, so remove the redundant one. Also remove the
extra blank lines in the file au1000_eth.c.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable promiscuous mode on the PF, set the VF link state to enable,
run iperf on the VF, and then run the self test on the PF. The self
test fails occasionally and may cause a use-after-free
problem.
[ 87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 87.159722] ==================================================================
[ 87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608
[ 87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186
[ 87.201012]
[ 87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty #4
[ 87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020
[ 87.238292] Call trace:
[ 87.243173] dump_backtrace+0x0/0x280
[ 87.250491] show_stack+0x24/0x30
[ 87.257114] dump_stack+0xe8/0x140
[ 87.263911] print_address_description.isra.8+0x70/0x380
[ 87.274538] __kasan_report+0x12c/0x230
[ 87.282203] kasan_report+0xc/0x18
[ 87.288999] __asan_load1+0x60/0x68
[ 87.295969] hex_dump_to_buffer+0x140/0x608
[ 87.304332] print_hex_dump+0x140/0x1e0
[ 87.312000] hns3_lb_check_skb_data+0x168/0x170
[ 87.321060] hns3_clean_rx_ring+0xa94/0xfe0
[ 87.329422] hns3_self_test+0x708/0x8c0
The length of packet sent by the selftest process is only
128 + 14 bytes, and the min buffer size of a BD is 256 bytes,
and the receive process will make sure the packet sent by
the selftest process is in the linear part, so only check
the linear part in hns3_lb_check_skb_data().
So fix this use-after-free by using skb_headlen() to dump
skb->data instead of skb->len.
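The difference between the two lengths can be sketched in userspace as
follows. struct pkt is a simplified, hypothetical stand-in for
struct sk_buff; only the relation headlen = len - data_len is taken from
skb_headlen().

```c
/* Simplified stand-in for struct sk_buff; only the lengths matter here. */
struct pkt {
	unsigned int len;		/* total bytes: linear part + fragments */
	unsigned int data_len;		/* bytes held in fragments */
	const unsigned char *data;	/* start of the linear part */
};

/* Mirrors what skb_headlen() computes: bytes readable through ->data. */
static unsigned int pkt_headlen(const struct pkt *p)
{
	return p->len - p->data_len;
}

/* Sum only the linear bytes; reading p->len bytes could run past ->data. */
static unsigned int pkt_sum_linear(const struct pkt *p)
{
	unsigned int i, sum = 0;

	for (i = 0; i < pkt_headlen(p); i++)
		sum += p->data[i];

	return sum;
}
```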
Fixes: c39c4d98dc ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When unloading the driver, if flag HNS3_NIC_STATE_INITED has
already been cleared, debugfs will not be uninitialized, so fix it.
Fixes: b2292360bb ("net: hns3: Add debugfs framework registration")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When asserting the VF reset fails, flag HCLGEVF_STATE_CMD_DISABLE
and the handshake status should not be set, otherwise the retry will
fail. So add a check for asserting the VF reset and return
directly when it fails.
Fixes: ef5f8e507e ("net: hns3: stop handling command queue while resetting VF")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is a PF reset pending before FLR prepare, FLR's
preparatory work will not fail, but the FLR rebuild procedure
will fail because of this pending reset. So the pending PF reset
should be handled in the FLR preparatory work.
Fixes: 8627bdedc4 ("net: hns3: refactor the precedure of PF FLR")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
The driver was calling pci_save/restore_state(), which is no longer needed.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
In this driver:
gem_suspend() calls gem_do_stop() which in turn invokes
pci_disable_device(). As the PCI helper function is not called at the
end/start of the function body, breaking the function in two parts
may change its behavior.
The only other function invoking gem_do_stop() is gem_close(). Hence,
gem_close() and gem_suspend() can do the required end steps on their own.
The same case is with gem_resume(). Both gem_resume() and gem_open()
invoke gem_do_start(). Again, make the caller functions do the required
steps on their own.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a case where the ID_REV register read fails, the memory for the
private data structure has to be freed before returning an error from
the function smsc95xx_bind().
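The error-path pattern described above can be sketched in userspace as
follows. All names (struct priv, read_id_rev(), bind_dev()) and the value
returned by the fake register read are hypothetical; this is not the
driver's code.

```c
#include <errno.h>
#include <stdlib.h>

struct priv { int chip_id; };	/* hypothetical private data */

/* Hypothetical register read; fails when 'fail' is non-zero. */
static int read_id_rev(int fail, int *val)
{
	if (fail)
		return -EIO;
	*val = 0x9500;
	return 0;
}

static int bind_dev(int fail, struct priv **out)
{
	struct priv *p = malloc(sizeof(*p));
	int ret;

	if (!p)
		return -ENOMEM;

	ret = read_id_rev(fail, &p->chip_id);
	if (ret < 0)
		goto free_priv;	/* do not leak p on the error path */

	*out = p;
	return 0;

free_priv:
	free(p);
	return ret;
}
```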
Fixes: bbd9f9ee69 ("smsc95xx: add wol support for more frame types")
Signed-off-by: Andre Edich <andre.edich@microchip.com>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value of the function smsc95xx_reset() must be checked
to avoid returning false success from the function smsc95xx_bind().
Fixes: 2f7ca802bd ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver")
Signed-off-by: Andre Edich <andre.edich@microchip.com>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When this driver transmits data,
first this driver will remove a pseudo header of 1 byte,
then the lapb module will prepend the LAPB header of 2 or 3 bytes,
then this driver will prepend a length field of 2 bytes,
then the underlying Ethernet device will prepend its own header.
So, the header length required should be:
-1 + 3 + 2 + "the header length needed by the underlying device".
This patch fixes kernel panic when this driver is used with AF_PACKET
SOCK_DGRAM sockets.
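The arithmetic above, written out as a small sketch. The macro and
function names are hypothetical; the byte counts are the ones listed in
this message, with the 3-byte LAPB header taken as the worst case.

```c
#define PSEUDO_HDR_LEN	1	/* removed by the driver on transmit */
#define LAPB_HDR_MAX	3	/* LAPB header is 2 or 3 bytes; use the larger */
#define LEN_FIELD_LEN	2	/* length field prepended by the driver */

/* lower_hdr_len: header length needed by the underlying Ethernet device. */
static int needed_headroom(int lower_hdr_len)
{
	return -PSEUDO_HDR_LEN + LAPB_HDR_MAX + LEN_FIELD_LEN + lower_hdr_len;
}
```

For an underlying device that needs 14 bytes, this gives
-1 + 3 + 2 + 14 = 18 bytes.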
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The comments before struct vsc73xx_platform and struct vsc73xx_spi use
kerneldoc format, but then fail to document the members of these
structures. All the structure members are self-evident, and the driver
has no other kerneldoc comments, so change these to plain comments to
avoid warnings.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since lan9303_adjust_link() is a void function, there is no option to
return an error. So just remove the variable and let any errors be
discarded.
Cc: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oddly, GENMASK() requires signed bit numbers, so that it can compare
them for < 0. If passed an unsigned type, we get warnings about the
test never being true.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oddly, GENMASK() requires signed bit numbers, so that it can compare
them for < 0. If passed an unsigned type, we get warnings about the
test never being true. There is no danger of overflow here, udf is
always a u8, so there is plenty of space when expanding to an int.
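A minimal sketch of the widening described above. MASK32() is a
simplified stand-in, not the kernel's GENMASK() definition, and
udf_mask() is a hypothetical helper that assumes bit numbers in the
range 0..31.

```c
#include <stdint.h>

/* Simplified stand-in for GENMASK(h, l): bits l..h set, 0 <= l <= h <= 31. */
#define MASK32(h, l) \
	((uint32_t)(((~0u) << (l)) & ((~0u) >> (31 - (h)))))

/* The bit number arrives as a u8; widening it to int cannot overflow. */
static uint32_t udf_mask(uint8_t udf)
{
	int n = udf;	/* signed bit number, as the macro's checks expect */

	return MASK32(n, 0);
}
```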
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A __be16 variable should be initialised with a __be16 value. So add a
htons(). In this case it is pointless, given the value being assigned
is 0xffff, but it stops sparse from warning.
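Why the conversion is a no-op for this particular value can be checked
in userspace with the libc htons(). The helper name below is
hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Returns 1 when converting v to network order leaves it unchanged. */
static int same_in_both_orders(uint16_t v)
{
	return htons(v) == v;
}
```

0xffff reads the same with its two bytes swapped, so the result is
identical on little-endian and big-endian hosts; the htons() only
documents the type.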
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
leX_to_cpu() expects to be passed an __leX type.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't act on any errors reading registers while handling watchdog
interrupt. Since this is an interrupt handler, we cannot return such
errors. So just remove the variable.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flow spec member vlan_tci is in network order. Hence comparisons
should be made against network order values.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oddly, GENMASK() requires signed bit numbers, so that it can compare
them for < 0. If passed an unsigned type, we get warnings about the
test never being true.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phylink now requires that parameters established through
auto-negotiation be written into the MAC at the time of the
mac_link_up() callback. In the case of felix, that means taking the port
out of reset, setting the correct timers for PAUSE frames, and
enabling/disabling TX flow control.
This patch also splits the inband and noinband configuration of the
vsc9959 PCS (currently found in a function called "init") into 2
different functions, which have a nomenclature closer to phylink:
"config", for inband setup, and "link_up", for noinband (forced) setup.
This is necessary as a preparation step for giving up control of the PCS
to phylink, which will be done in further patch series.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phylink uses the .mac_an_restart method to offer the user an
implementation of the "ethtool -r" behavior, when the media-side auto
negotiation can be restarted by the local MAC PCS. This is the case for
fiber modes 1000Base-X and 2500Base-X (IEEE clause 37) that don't have
an Ethernet PHY connected locally, and the media is connected to the MAC
PCS directly.
On the other hand, the Cisco SGMII and USXGMII standards also have an
auto negotiation mechanism based on IEEE 802.3 clause 37 (their
respective specs require a MAC PCS and a PHY PCS to implement the same
state machine, which is described in IEEE 802.3 "Auto-Negotiation Figure
37-6"), so the ability to restart auto-negotiation is intrinsically
symmetrical (the MAC PCS can do it too).
However, it appears that not all SGMII/USXGMII PHYs have logic to
restart the MDI-side auto-negotiation process when they detect a
transition of the SGMII link from data mode to configuration mode.
Some do (VSC8234) and some don't (AR8033, MV88E1111). IEEE and/or Cisco
specification wordings do not help to prove whether propagating the "AN
restart" event from MII side ("mr_restart_an") to MDI side
("mr_restart_negotiation") is required behavior - neither of them
specifies any mandatory interaction between the clause 37 AN state
machine from Figure 37-6 and the clause 28 AN state machine from Figure
28-18.
Therefore, even if a certain behavior could be proven as being required,
real-life SGMII/USXGMII PHYs are inconsistent enough that a clause 37 AN
restart cannot be used by phylink to reliably trigger a media-side
renegotiation, when the user requests it via ethtool.
The only remaining use that the .mac_an_restart callback might possibly
have, given what we know now, is to implement some silicon quirks, but
so far that has proven to not be necessary.
So remove this code for now, since it never gets called and we don't
foresee any circumstance in which it might be, either.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
state->speed holds a value of 10, 100, 1000 or 2500, but
SYS_MAC_FC_CFG_FC_LINK_SPEED expects a value in the range 0, 1, 2 or 3.
So set the correct speed encoding into this register.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In VSC9959, the PCS is the one who performs rate adaptation (symbol
duplication) to the speed negotiated by the PHY. The MAC is unaware of
that and must remain configured for gigabit. If it is configured at
OCELOT_SPEED_10 or OCELOT_SPEED_100, it'll start transmitting PAUSE
frames out of control and never recover, _even if_ we then reconfigure
it at OCELOT_SPEED_1000 afterwards.
This patch fixes a bug that luckily did not have any functional impact.
We were writing 10, 100, 1000 etc into this 2-bit field in
DEV_CLOCK_CFG, but the hardware expects values in the range 0, 1, 2, 3.
So all speed values were getting truncated to 0, which is
OCELOT_SPEED_2500, and which also appears to be fine.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ping tested:
[ 11.808455] mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 11.816497] IPv6: ADDRCONF(NETDEV_CHANGE): swp0: link becomes ready
[root@LS1028ARDB ~] # ethtool -s swp0 advertise 0x4
[ 18.844591] mscc_felix 0000:00:00.5 swp0: Link is Down
[ 22.048337] mscc_felix 0000:00:00.5 swp0: Link is Up - 100Mbps/Half - flow control off
[root@LS1028ARDB ~] # ip addr add 192.168.1.1/24 dev swp0
[root@LS1028ARDB ~] # ping 192.168.1.2
PING 192.168.1.2 (192.168.1.2): 56 data bytes
(...)
^C--- 192.168.1.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.383/0.611/1.051 ms
[root@LS1028ARDB ~] # ethtool -s swp0 advertise 0x10
[ 355.637747] mscc_felix 0000:00:00.5 swp0: Link is Down
[ 358.788034] mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Half - flow control off
[root@LS1028ARDB ~] # ping 192.168.1.2
PING 192.168.1.2 (192.168.1.2): 56 data bytes
(...)
^C
--- 192.168.1.2 ping statistics ---
16 packets transmitted, 16 packets received, 0% packet loss
round-trip min/avg/max = 0.301/0.384/1.138 ms
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver appears to write to BMCR_SPEED and BMCR_DUPLEX, fields which
are read-only, since they are actually configured through the
vendor-specific IF_MODE (0x14) register.
But the reason we're writing back the read-only values of MII_BMCR is to
alter these writable fields:
BMCR_RESET
BMCR_LOOPBACK
BMCR_ANENABLE
BMCR_PDOWN
BMCR_ISOLATE
BMCR_ANRESTART
In particular, the only field which is really relevant to this driver is
BMCR_ANENABLE. Clarify that intention by spelling it out, using
phy_set_bits and phy_clear_bits.
The driver also made a few writes to BMCR_RESET and BMCR_ANRESTART which
are unnecessary and may temporarily disrupt the link to the PHY. Remove
them.
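The intent can be sketched in userspace as a read-modify-write on a
single bit. fake_bmcr and the reg_*() helpers are hypothetical stand-ins
modelled on phy_set_bits() and phy_clear_bits(); BMCR_ANENABLE uses the
value from include/uapi/linux/mii.h.

```c
#include <stdint.h>

#define BMCR_ANENABLE	0x1000	/* value from include/uapi/linux/mii.h */

static uint16_t fake_bmcr;	/* hypothetical stand-in for MII_BMCR */

/* Read-modify-write helpers that touch only the requested bits. */
static void reg_set_bits(uint16_t *reg, uint16_t bits)
{
	*reg |= bits;
}

static void reg_clear_bits(uint16_t *reg, uint16_t bits)
{
	*reg &= (uint16_t)~bits;
}

/* Toggle only autoneg; every other bit in the register is left alone. */
static void pcs_set_autoneg(int enable)
{
	if (enable)
		reg_set_bits(&fake_bmcr, BMCR_ANENABLE);
	else
		reg_clear_bits(&fake_bmcr, BMCR_ANENABLE);
}
```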
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rmnet can have only two bridge interfaces.
One of them is a link interface and the other one is added by
the master operation.
The rmnet interface shouldn't allow adding additional
bridge interfaces by the master operation.
But there is no code to deny additional interfaces,
so an interface leak occurs.
Test commands:
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link set dummy1 master rmnet0
ip link set dummy2 master rmnet0
ip link del rmnet0
In the above test commands, dummy0 is attached to rmnet in VND mode.
Then, dummy1 is attached to rmnet0 in BRIDGE mode.
At this point, the dummy0 mode is switched from VND to BRIDGE automatically.
Then, dummy2 is attached to rmnet in BRIDGE mode.
At this point, rmnet0 should deny this operation.
But rmnet0 doesn't deny it,
so the splat below occurs when the rmnet0 interface is deleted.
Splat looks like:
[ 186.684787][ C2] WARNING: CPU: 2 PID: 1009 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
[ 186.684788][ C2] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_x
[ 186.684805][ C2] CPU: 2 PID: 1009 Comm: ip Not tainted 5.8.0-rc1+ #621
[ 186.684807][ C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 186.684808][ C2] RIP: 0010:rollback_registered_many+0x986/0xcf0
[ 186.684811][ C2] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 5
[ 186.684812][ C2] RSP: 0018:ffff8880cd9472e0 EFLAGS: 00010287
[ 186.684815][ C2] RAX: ffff8880cc56da58 RBX: ffff8880ab21c000 RCX: ffffffff9329d323
[ 186.684816][ C2] RDX: 1ffffffff2be6410 RSI: 0000000000000008 RDI: ffffffff95f32080
[ 186.684818][ C2] RBP: dffffc0000000000 R08: fffffbfff2be6411 R09: fffffbfff2be6411
[ 186.684819][ C2] R10: ffffffff95f32087 R11: 0000000000000001 R12: ffff8880cd947480
[ 186.684820][ C2] R13: ffff8880ab21c0b8 R14: ffff8880cd947400 R15: ffff8880cdf10640
[ 186.684822][ C2] FS: 00007f00843890c0(0000) GS:ffff8880d4e00000(0000) knlGS:0000000000000000
[ 186.684823][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 186.684825][ C2] CR2: 000055b8ab1077b8 CR3: 00000000ab612006 CR4: 00000000000606e0
[ 186.684826][ C2] Call Trace:
[ 186.684827][ C2] ? lockdep_hardirqs_on_prepare+0x379/0x540
[ 186.684829][ C2] ? netif_set_real_num_tx_queues+0x780/0x780
[ 186.684830][ C2] ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[ 186.684831][ C2] ? __kasan_slab_free+0x126/0x150
[ 186.684832][ C2] ? kfree+0xdc/0x320
[ 186.684834][ C2] ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[ 186.684835][ C2] unregister_netdevice_many.part.135+0x13/0x1b0
[ 186.684836][ C2] rtnl_delete_link+0xbc/0x100
[ ... ]
[ 238.440071][ T1009] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1
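The missing check can be modelled with a small userspace sketch. The
names and the -EINVAL return value are assumptions made for
illustration, not the driver's actual code.

```c
#include <errno.h>

enum port_mode { MODE_VND, MODE_BRIDGE };	/* hypothetical names */

struct port {
	enum port_mode mode;
};

/*
 * Attach a second device as a bridge peer. A port in VND mode has only
 * its link interface, so one more is allowed and the port switches to
 * BRIDGE mode. A port already in BRIDGE mode has both interfaces, so
 * any further attach is refused.
 */
static int port_add_bridge(struct port *p)
{
	if (p->mode != MODE_VND)
		return -EINVAL;

	p->mode = MODE_BRIDGE;
	return 0;
}
```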
Fixes: 037f9cdf72 ("net: rmnet: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
.suspend() calls __qlcnic_shutdown, which then calls qlcnic_82xx_shutdown;
.resume() calls __qlcnic_resume, which then calls qlcnic_82xx_resume;
Both ...82xx..() are defined in
drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c and are used only in
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c.
Hence upgrade them and remove PCI function calls, like pci_save_state() and
pci_enable_wake(), inside them.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
In this driver:
netxen_nic_resume() calls netxen_nic_attach_func() which then invokes PCI
helper functions like pci_enable_device(), pci_set_power_state() and
pci_restore_state(). Other function:
- netxen_io_slot_reset()
also calls netxen_nic_attach_func().
Also, netxen_io_slot_reset() returns a specific value based on the return
value of netxen_nic_attach_func() as a whole. Thus, we cannot simply move
some piece of code from netxen_nic_attach_func() to it.
Hence, define a new function netxen_nic_attach_late_func() to do the tasks
which have to be done after the PCI helper functions have done their job.
Now, netxen_nic_attach_func() invokes netxen_nic_attach_late_func(), thus
netxen_io_slot_reset() behaves normally.
And, netxen_nic_resume() calls netxen_nic_attach_late_func() to avoid PCI
helper function calls.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Private structure members live_ports, on_ports, rx_ports, tx_ports are
initialized but not used anywhere. Let's remove them.
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DSA subsystem moved to phylink and adjust_link() became deprecated in
the process. This patch removes adjust_link from the KSZ DSA switches and
adds phylink_mac_link_up() and phylink_mac_link_down().
Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending a mailbox message from the work of an aeq event, another
aeq event will be triggered. Because the previous aeq work has not
exited and only one work item can be executed at a time in the same
workqueue, the mailbox sending function will return a timeout failure.
Create and use another workqueue to fix this.
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds driver changes to perform Idlechk dump during the debug
data collection.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch populates a database of idlechk tests (registers and
predicates) and performs the idlechk using this data.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds register definitions required for Idlechk implementation.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the transmit part of XDP support, which includes:
- support for XDP_TX in mvpp2_xdp()
- .ndo_xdp_xmit hook for AF_XDP and XDP_REDIRECT with mvpp2 as destination
mvpp2_xdp_submit_frame() is a generic function which is called by
mvpp2_xdp_xmit_back() when doing XDP_TX, and by mvpp2_xdp_xmit when
doing AF_XDP or XDP_REDIRECT target.
The buffer allocation has been reworked to be able to map the buffers
as DMA_FROM_DEVICE or DMA_BIDIRECTIONAL depending on whether native
XDP is in use or not.
Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add XDP native support.
For now, only the XDP_DROP, XDP_PASS and XDP_REDIRECT
verdicts are supported.
Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the page_pool API for memory management.
This is a prerequisite for native XDP support.
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In mvpp2_swf_bm_pool_init_percpu(), a reference to a struct
mvpp2_bm_pool is obtained traversing multiple structs, when a
local variable already points to the same object.
Fix it and, while at it, give the variable a meaningful name.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For IPA v4.2, the exact interpretation of the register that defines
the timeout for avoiding head-of-line blocking was a little unclear.
We're only assigning a 0 timeout to it right now, so that wasn't
very important. But now that I know how it's supposed to work, I'm
fixing it.
The register represents a tick counter, where each tick is equal to
128 IPA core clock cycles. For IPA v3.5.1, the register contains
a simple counter value. But for IPA v4.2, the register contains two
fields, base and scale, which approximate the tick counter as:
ticks = base << scale
The base and scale values to use for a given tick count are computed
using clever bit operations, and measures are taken to make the
resulting time period as close as possible to that requested.
There's no need for ipa_endpoint_init_hol_block_timer() to return
an error, so change its return type to void.
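One way to pick base and scale for a requested tick count, rounding to
the nearest representable value, is sketched below. The 5-bit base width
and all names are assumptions made for illustration, the limit on scale
imposed by its own field width is ignored, and this is not necessarily
the algorithm the driver uses.

```c
#include <stdint.h>

#define BASE_WIDTH 5	/* assumed width of the base field, in bits */

struct hol_timer {
	uint32_t base;
	uint32_t scale;
};

/* 1-based position of the highest set bit; 0 when v is 0. */
static int fls32(uint32_t v)
{
	int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/* Encode ticks as base << scale, rounded to the nearest such value. */
static struct hol_timer hol_timer_encode(uint32_t ticks)
{
	struct hol_timer t = { ticks, 0 };
	int high = fls32(ticks);
	uint64_t rounded;

	if (high <= BASE_WIDTH)
		return t;	/* fits in base as-is, no scaling needed */

	t.scale = high - BASE_WIDTH;
	/* Add half of the discarded range so truncation rounds to nearest. */
	rounded = (uint64_t)ticks + (1u << (t.scale - 1));
	/* Rounding may carry into one more bit; bump the scale if so. */
	if (rounded >> (t.scale + BASE_WIDTH))
		t.scale++;
	t.base = (uint32_t)(rounded >> t.scale);

	return t;
}
```

For example, 100 ticks encode exactly as base 25, scale 2, while 1000
ticks encode as base 31, scale 5, that is 992 ticks.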
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a new function that returns the current rate of the IPA core
clock.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds kernel TX timestamps to the xen-netfront driver. Tested with chrony
on an AWS EC2 instance.
Signed-off-by: Daniel Drown <dan-netdev@drown.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The GENET driver interfaces with internal MoCA interface as well as
external MoCA chips like the BCM6802/6803 through a fixed link
interface. It is desirable for the mocad user-space daemon to be able to
control the carrier state based upon out of band messages that it
receives from the MoCA chip.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When handling a TX timeout, if the TX reporter was not able to recover
from the error, reopen the channels. If a reopen of the channels was
attempted, do not keep looping over the TX queues for timeouts.
With that, the reporters' state and separation will better
expose the driver's state.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add helper which retrieves the RQ WQE's head. Use this helper in RX
reporter diagnose callback.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use txrx.h to contain helper functions regarding TX/RX. In the coming
patches, I will add more RQ helpers.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Change the hierarchy of the RX reporter 'Common config' in the diagnose
output to match the 'Common config' of the TX reporter which reflects
that CQ is a helper to the traffic queues.
Before:
$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
RQ:
type: 2 stride size: 2048 size: 8
CQ:
stride size: 64 size: 1024
RQs:
...
After:
$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
RQ:
type: 2 stride size: 2048 size: 8
CQ:
stride size: 64 size: 1024
RQs:
...
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When a CQE error is received, the driver inspects the syndrome given by
the firmware. RQ recovery is initiated only as a result of a fatal
syndrome, i.e. a syndrome which sets the RQ into an error state. Hence
there is no need to query the RQ state at the beginning of the recovery
process. Add additional debug prints before recovering.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
During a queue's recovery, the driver waits for a flush. The flush
timeout is set to 2 seconds. Add a define for this value for the benefit
of the RX and TX reporters.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Creation of devlink health reporters is not fatal for mlx5e instance load.
In case of an error in a reporter's creation, the return value is ignored.
Change all reporter creation functions to return void.
In addition, with this change, a failure in creating a reporter will not
prevent the driver from trying to create the next reporter in the list.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We have yet another new scheme for NVRAM, and a corresponding new MCDI.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QDMA subsystem on EF100 needs this information.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since ethtool_common.o will be built into both sfc and sfc_ef100 drivers,
it can't use KBUILD_MODNAME directly. Instead, make it reference a
string provided by the individual driver code.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously this was only happening in ef10-specific code.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
_down() merely removes all our filters and VLANs, it doesn't free
efx->filter_state itself.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we only allocate VIs for the number of TXQs we actually need, we
cannot naively use "channel * TXQ_TYPES + txq" for the TXQ number, as
this has gaps (when efx->tx_queues_per_channel < EFX_TXQ_TYPES) and
thus overruns the driver's VI allocations, causing the firmware to
reject the MC_CMD_INIT_TXQ based on INSTANCE.
Thus, we distinguish INSTANCE (stored in tx_queue->queue) from LABEL
(tx_queue->label); the former is allocated starting from 0 in
efx_set_channels(), while the latter is simply the txq type (index in
channel->tx_queue array).
To simplify things, rather than changing tx_queues_per_channel after
setting up TXQs, make Siena always probe its HIGHPRI queues at start
of day, rather than deferring it until tc mqprio enables them.
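The gap problem described above can be illustrated with a small sketch. This is not the driver's code: TXQ_TYPES_EXAMPLE, naive_instance() and dense_instance() are hypothetical names, and the dense formula assumes every channel has the same number of queues, which is simply the closed form of handing out instances consecutively from 0.

```c
/* Hypothetical constant: number of slots in each channel's tx_queue array. */
#define TXQ_TYPES_EXAMPLE 4u

/* Naive numbering: leaves gaps when a channel uses fewer than all slots. */
static unsigned int naive_instance(unsigned int channel, unsigned int type)
{
    return channel * TXQ_TYPES_EXAMPLE + type;
}

/*
 * Dense numbering: instances are consecutive starting from 0, so they
 * never exceed the number of queues actually allocated.
 */
static unsigned int dense_instance(unsigned int channel, unsigned int type,
                                   unsigned int queues_per_channel)
{
    return channel * queues_per_channel + type;
}
```

With three channels of two queues each, only six instances exist, yet the naive scheme numbers the last queue 9, while the dense scheme numbers it 5. The type index remains available separately as the label.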
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Siena needs four TX queues (csum * highpri), EF10 needs two (csum),
and EF100 only needs one (as checksumming is controlled entirely by
the transmit descriptor). Rather than having various bits of ad-hoc
code to decide which queues to set up etc., put the knowledge of how
many TXQs a channel has in one place.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of exposing this old module parameter on the new driver (thus
having to keep it forever after for compatibility), let's confine it
to the old one; if we find later that we need the feature, we ought
to support it properly, with ethtool set-channels.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EF100 only supports MSI-X, so there's no need for the new driver to
expose this old module parameter.
Since it's now visible to the linker, we have to rename it internally
to efx_interrupt_mode to avoid symbol collisions in non-modular
builds.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All NICs supported by this driver are capable of MSI-X interrupts (only
Falcon A1 wasn't, and that's now hived off into its own driver), so no
need for a nic-type parameter. Besides, the code that checked it was
buggy anyway (the following assignment that checked min_interrupt_mode
overrode it).
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unprivileged functions (such as VFs) may set their MTU by use of the
'control' field of MC_CMD_SET_MAC_EXT, as used in efx_mcdi_set_mtu().
If calling efx_ef10_mac_reconfigure() from efx_change_mtu(), and the
NIC supports the above (SET_MAC_ENHANCED capability), use it rather
than efx_mcdi_set_mac().
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable act is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Have functions that write endpoint configuration registers return
immediately if they are not valid for the direction of transfer for
the endpoint. This allows most of the calls in ipa_endpoint_program()
to be made unconditionally. Reorder the register writes to match
the order of their definition (based on offset).
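A minimal user-space sketch of this pattern follows. The structure and function names are hypothetical, and plain struct members stand in for hardware registers; only the idea of each writer checking the endpoint direction itself is taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical endpoint model; members stand in for hardware registers. */
struct example_endpoint {
    bool is_tx;
    uint32_t mode_reg;          /* meaningful for TX endpoints only */
    uint32_t metadata_mask_reg; /* meaningful for RX endpoints only */
};

static void example_write_mode(struct example_endpoint *ep, uint32_t val)
{
    if (!ep->is_tx)
        return; /* not valid for RX: nothing to write */
    ep->mode_reg = val;
}

static void example_write_metadata_mask(struct example_endpoint *ep,
                                        uint32_t val)
{
    if (ep->is_tx)
        return; /* not valid for TX: nothing to write */
    ep->metadata_mask_reg = val;
}

/* The caller no longer needs to test the direction before each call. */
static void example_program(struct example_endpoint *ep)
{
    example_write_mode(ep, 1);
    example_write_metadata_mask(ep, 0xff);
}
```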
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPA version 4.0+ does not support endpoint suspend. Put a test at
the top of ipa_endpoint_program_suspend() that returns immediately
if suspend is not supported rather than making that check in the caller.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPA version 3.5.1 has a hardware quirk that requires special
handling if an RX endpoint is suspended while aggregation is active.
This handling is implemented by ipa_endpoint_suspend_aggr().
Have ipa_endpoint_program_suspend() be responsible for calling
ipa_endpoint_suspend_aggr() if suspend mode is being enabled on
an endpoint. If the endpoint does not support aggregation, or if
aggregation isn't active, this call will continue to have no effect.
Move the definition of ipa_endpoint_suspend_aggr() up in the file so
its definition precedes the new earlier reference to it. This
requires ipa_endpoint_aggr_active() and ipa_endpoint_force_close()
to be moved as well.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPA version 4.2 has a hardware quirk that affects endpoint delay
mode, so it isn't used there. Isolate the test that avoids using
delay mode for that version inside ipa_endpoint_program_delay(),
rather than making that check in the caller.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The number of ports is incorrectly set to the maximum available for a DSA
switch. Even if the extra ports are not used, this causes some functions
to be called later, like port_disable() and port_stp_state_set(). If the
driver doesn't check the port index, it will end up modifying unknown
registers.
Fixes: b987e98e50 ("dsa: add DSA switch driver for Microchip KSZ9477")
Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain configurations without power management support, the
following warnings happen:
drivers/net/ethernet/mellanox/mlx4/main.c:4388:12:
warning: 'mlx4_resume' defined but not used [-Wunused-function]
4388 | static int mlx4_resume(struct device *dev_d)
| ^~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx4/main.c:4373:12: warning:
'mlx4_suspend' defined but not used [-Wunused-function]
4373 | static int mlx4_suspend(struct device *dev_d)
| ^~~~~~~~~~~~
Mark these functions as __maybe_unused to make it clear to the
compiler that this is going to happen based on the configuration,
which is the standard for these types of functions.
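As a rough user-space illustration (GCC or Clang assumed), the attribute can be reproduced as below. EXAMPLE_HAVE_PM and the function names are hypothetical stand-ins for a power management configuration option and the driver callbacks.

```c
/* The kernel's __maybe_unused expands to a compiler attribute like this. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Hypothetical stand-in for a power management configuration option. */
#define EXAMPLE_HAVE_PM 0

struct device;

/* Referenced only when EXAMPLE_HAVE_PM is set; no warning either way. */
static int __maybe_unused example_suspend(struct device *dev)
{
    (void)dev;
    return 0;
}

#if EXAMPLE_HAVE_PM
int example_run_suspend(struct device *dev)
{
    return example_suspend(dev);
}
#endif
```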
Fixes: 0e3e206a3e ("mlx4: use generic power management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain configurations without power management support, gcc reports
the following warning:
drivers/net/ethernet/micrel/ksz884x.c:7182:12: warning:
'pcidev_suspend' defined but not used [-Wunused-function]
7182 | static int pcidev_suspend(struct device *dev_d)
| ^~~~~~~~~~~~~~
Mark pcidev_suspend() as __maybe_unused to make it clear.
Fixes: 64120615d1 ("ksz884x: use generic power management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove is_udp variable that is used in only one place and use
ip_hdr(skb)->protocol == IPPROTO_UDP check instead.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not initialize queue variable. It is already initialized in for loops.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use hweight32() to count set bits in queue_mask.
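For readers outside the kernel, hweight32() returns the number of set bits in a 32-bit word. A portable user-space stand-in (the name popcount32() is made up for this sketch) could look like:

```c
#include <stdint.h>

/* User-space stand-in for the kernel's hweight32(): count the set bits. */
static unsigned int popcount32(uint32_t w)
{
    unsigned int n = 0;

    while (w) {
        w &= w - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

Applied to a queue mask, the result is the number of queues present.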
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bit 0 of queue_mask is set at the beginning of the
macb_probe_queues() function. Do not set it again after reading
DCFG6; instead, use the "|=" operator.
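A sketch of the resulting shape, with a plain argument standing in for the register read. The function name and the 8-bit field width are assumptions made for illustration only.

```c
#include <stdint.h>

/* Queue 0 always exists; further queues come from a register value. */
static uint32_t example_build_queue_mask(uint32_t reg_value)
{
    uint32_t queue_mask = 0x1; /* bit 0 is set once, up front */

    /* OR in the remaining bits instead of assigning and re-setting bit 0. */
    queue_mask |= reg_value & 0xff; /* hypothetical field width */
    return queue_mask;
}
```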
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is an immutable branch shared between wireless-drivers-next and
staging-next for moving wilc1000 driver out of staging to drivers/net/wireless
directory.
The KSZ9893 3-Port Gigabit Ethernet Switch can be controlled via SPI,
I²C or MDIO (very limited and not supported by this driver). While there
is already a compatible entry for the SPI bus, it was missing for I²C.
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2020-07-01
This series contains updates to all Intel drivers, but a majority of the
changes are to the i40e driver.
Jeff converts 'fall through' comments to the 'fallthrough;' keyword for
all Intel drivers. He also removes an unnecessary delay in the ixgbe
ethtool diagnostics test.
Arkadiusz implements Total Port Shutdown for i40e. This is the revised
patch based on Jakub's feedback from an earlier submission of this
patch, where additional code comments and description were needed to
describe the functionality.
Wei Yongjun fixes return error code for iavf_init_get_resources().
Magnus optimizes XDP code in i40e, starting with the AF_XDP zero-copy
transmit completion path, then by only executing a division when
necessary in the napi_poll data path, and finally by moving the check
for a full transmit ring outside the send loop to increase performance.
Ciara adds XDP ring statistics to i40e and the ability to dump these
statistics and descriptors.
Tony fixes reporting iavf statistics.
Radoslaw adds support for 2.5 and 5 Gbps by implementing the newer ethtool
ksettings API in ixgbe.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2020-07-01
This series contains updates to the ice driver only.
Jacob implements a devlink region for device capabilities.
Bruce removes structs containing only one-element arrays that are either
unused or only used for indexing, using pointer arithmetic or other
indexing to access the elements instead. He also converts "C struct hack"
variable-length types to the preferred C99 flexible array member.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the pre-C90-extension "C struct hack" method (using a single-
element array at the end of a structure for implementing variable-length
types) to the preferred use of C99 flexible array member.
Additional code cleanups were done near areas affected by this change.
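For illustration, a flexible array member in a user-space sketch. The structure and helper are hypothetical and not taken from the driver; kernel code would normally size such an allocation with the struct_size() helper rather than open-coding the arithmetic.

```c
#include <stdint.h>
#include <stdlib.h>

/* Old style would end with "uint32_t elems[1];" and over-count by one. */
struct example_list {
    uint16_t count;
    uint32_t elems[]; /* C99 flexible array member: adds no size itself */
};

static struct example_list *example_list_alloc(uint16_t count)
{
    struct example_list *l;

    /* Header plus exactly 'count' trailing elements, zero-initialised. */
    l = calloc(1, sizeof(*l) + (size_t)count * sizeof(l->elems[0]));
    if (l)
        l->count = count;
    return l;
}
```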
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
There are a number of structures that consist of a one-element array as the
only struct member. Some of those are unused so remove them. Others are
used to index into a buffer/array consisting of a variable number of a
different data or structure type. Those are unnecessary since we can use
simple pointer arithmetic or index directly into the buffer to access
individual elements of the buffer/array.
Additional code cleanups were done near areas affected by this change.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
At the moment, bonding xfrm crypto offload can only be set up if the bonding
module is loaded with active-backup mode already set. We need to be able to
make this work with bonds set to AB after the bonding driver has already
been loaded.
So what's done here is:
1) move #define BOND_XFRM_FEATURES to net/bonding.h so it can be used
by both bond_main.c and bond_options.c
2) set BOND_XFRM_FEATURES in bond_dev->hw_features universally, rather than
only when loading in AB mode
3) wire up xfrmdev_ops universally too
4) disable BOND_XFRM_FEATURES in bond_dev->features if not AB
5) exit early (non-AB case) from bond_ipsec_offload_ok, to prevent a
performance hit from traversing into the underlying drivers
6) toggle BOND_XFRM_FEATURES in bond_dev->wanted_features and call
netdev_change_features() from bond_option_mode_set()
In my local testing, I can change bonding modes back and forth on the fly,
have hardware offload work when I'm in AB, and see no performance penalty
to non-AB software encryption, despite having xfrm bits all wired up for
all modes now.
Fixes: 18cb261afd ("bonding: support hardware encryption offload to slaves")
Reported-by: Huy Nguyen <huyn@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new devlink region used for capturing a snapshot of the device
capabilities buffer which is reported by the firmware over the AdminQ.
This information can be useful in debugging driver and firmware
interactions.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The convention throughout the IPA driver is to directly use
single-bit field mask values, rather than using (for example)
u32_encode_bits() to set or clear them.
Fix the one place that doesn't follow that convention, which sets
HOL_BLOCK_EN_FMASK in ipa_endpoint_init_hol_block_enable().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A handful of registers are valid only for RX endpoints, and some
others are valid only for TX endpoints. For these endpoints, add
a comment above their defined offset macro that indicates the
endpoints to which they apply.
Extend the endpoint parameter naming convention as well, to make
these constraints more explicit.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The INIT_MODE endpoint configuration register is only valid for TX
endpoints. Rather than writing a zero to that register for RX
endpoints, avoid writing the register at all.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The INIT_HDR_METADATA_MASK endpoint configuration register is only
valid for RX endpoints. Rather than writing a zero to that register
for TX endpoints, avoid writing the register at all.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The INIT_HOL_BLOCK_EN and INIT_HOL_BLOCK_TIMER endpoint registers
are only valid for RX endpoints.
Have ipa_endpoint_modem_hol_block_clear_all() skip writing these
registers for TX endpoints.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The microcontroller shared memory area is at the beginning of the
IPA resident memory. IPA_MEM_UC_OFFSET was defined as the offset
within that region where it's found, but it's 0, and it's never
actually used. Just get rid of the definition, and move some of the
description it had to be above the definition of the ipa_uc_mem_area
structure.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make minor updates to error messages reported in "gsi.c":
- Use local variables to reduce multi-line function calls
- Don't use parentheses in messages
- Do some slight rewording in a few cases
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We check the state of an event ring or channel both before and after
any GSI command issued that will change that state. In most--but
not all--cases, if the state is something different than expected we
report an error message.
Add error messages where missing, so that all unexpected states
provide information about what went wrong. Drop the parentheses
around the state value shown in all cases.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reuse the "limit" local variable in ipa_endpoint_init_aggr() when
setting the aggregation size limit. Simple cleanup.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Halve the time limit used when aggregation is enabled on an RX
endpoint, to half a millisecond.
Use DIV_ROUND_CLOSEST() to compute the value that represents the
time period, to get better accuracy in the event the time limit is
not an even multiple of the granularity.
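The effect of rounding to the closest value can be seen in a simplified, unsigned-only stand-in for DIV_ROUND_CLOSEST(). The 128-microsecond granularity used in the note below is a hypothetical figure chosen to show the difference, not a value from the driver.

```c
/* Simplified, unsigned-only stand-in for the kernel's DIV_ROUND_CLOSEST(). */
static unsigned int example_div_round_closest(unsigned int x, unsigned int d)
{
    return (x + d / 2) / d;
}
```

With a 500 microsecond limit and a 128 microsecond granularity, plain integer division yields 3 units (384 microseconds), whereas rounding to the closest yields 4 units (512 microseconds), which is nearer to the requested limit.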
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timer used for aggregation makes use of an internal 32 kHz clock.
The granularity of the timer is programmed by a field whose value is
computed by ipa_aggr_granularity_val(). Redefine the way that value
is computed by using a new TIMER_FREQUENCY constant representing the
underlying clock frequency.
Add two BUILD_BUG_ON() calls to ensure the value used is valid.
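A user-space sketch of the idea, using C11 _Static_assert in place of BUILD_BUG_ON(). The constant values, the two checks chosen, and the tick computation are all assumptions made for illustration; the actual register encoding computed by ipa_aggr_granularity_val() is not shown here.

```c
/* Assumed values for illustration only. */
#define EXAMPLE_TIMER_FREQUENCY  32000u   /* 32 kHz clock, in Hz */
#define EXAMPLE_USEC_PER_SEC     1000000u
#define EXAMPLE_GRANULARITY_USEC 500u     /* hypothetical granularity */

/* Compile-time checks, analogous in spirit to BUILD_BUG_ON(). */
_Static_assert(EXAMPLE_GRANULARITY_USEC != 0,
               "granularity must be non-zero");
_Static_assert(EXAMPLE_GRANULARITY_USEC * EXAMPLE_TIMER_FREQUENCY >=
               EXAMPLE_USEC_PER_SEC,
               "granularity must cover at least one clock tick");

/* Number of clock ticks in one granularity period, rounded to closest. */
static unsigned int example_granularity_ticks(void)
{
    return (EXAMPLE_GRANULARITY_USEC * EXAMPLE_TIMER_FREQUENCY +
            EXAMPLE_USEC_PER_SEC / 2) / EXAMPLE_USEC_PER_SEC;
}
```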
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the offset adjustment and netfront state reading
needed to make XDP work on the netfront side.
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds basic XDP processing to the xen-netfront driver.
We run an XDP program for an RX response received from the netback
driver. We also request xen-netback to adjust the data offset to leave
header space that bpf_xdp_adjust_head() can use for custom headers.
Synchronization between the frontend and backend parts is done
by using xenbus state switching:
Reconfiguring -> Reconfigured -> Connected
The UDP packet drop rate using an XDP program is around 310 kpps
using ./pktgen_sample04_many_flows.sh, and 160 kpps without the patch.
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add full support for the new version of the ethtool API. The new API allows
the 2500base-T and 5000base-T supported and advertised link speed modes to
be used.
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
There is a 4 second delay in ixgbe_diag_test() that is holding up other
ioctls such as SIOCGIFCONF that Oracle database applications use.
One of Oracle's products runs "ethtool -t ethX online" periodically for
system monitoring, and that is impacting database applications that use
SIOCGIFCONF at the same time.
This 4 second delay was needed in our early 1GbE parts to give the PHY
time to recover from a reset. This code was carried forward to the 10 GbE
driver even though it was not needed for the supported PHYs in the ixgbe
driver.
CC: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
CC: Jack Vogel <jack.vogel@oracle.com>
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Commit bac8486116 ("iavf: Refactor the watchdog state machine") inverted
the logic for when to update statistics. Statistics should be updated when
no other commands are pending; instead, they were only requested when a
command was processed. iavf_request_stats() would see a pending request
and not request statistics to be updated. This caused statistics to never
be updated; fix the logic.
Fixes: bac8486116 ("iavf: Refactor the watchdog state machine")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Interfaces already exist for dumping Rx and Tx descriptor information.
Introduce another for doing the same for XDP descriptors.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Prior to this, only the Rx and Tx ring statistics were dumped. The XDP
ring statistics are now dumped as well.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Prior to this, only Rx and Tx ring statistics were accounted for.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Move the check if the HW Tx ring is full to outside the send
loop. Currently it is checked for every single descriptor that we
send. Instead, tell the send loop to only process a maximum number of
packets equal to the number of available slots in the Tx ring. This
way, we can remove the check inside the send loop and gain some
performance.
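A minimal sketch of this restructuring, with a hypothetical ring model rather than the driver's data structures: the free-slot count is computed once and used to cap the loop, so the loop body needs no per-descriptor fullness test.

```c
/* Hypothetical Tx ring model: 'used' slots out of 'size'. */
struct example_tx_ring {
    unsigned int size;
    unsigned int used;
};

static unsigned int example_ring_unused(const struct example_tx_ring *r)
{
    return r->size - r->used;
}

/* Send up to 'pending' packets; returns how many were queued. */
static unsigned int example_xmit(struct example_tx_ring *r,
                                 unsigned int pending)
{
    /* One check before the loop: cap the batch at the free slots. */
    unsigned int budget = example_ring_unused(r);
    unsigned int sent = 0;

    if (budget > pending)
        budget = pending;

    while (sent < budget) {
        r->used++; /* stands in for filling one descriptor */
        sent++;
    }
    return sent;
}
```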
Suggested-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Eliminate a division in the napi_poll data path. This division is
executed even though it is only needed in the rare case when there are
not enough interrupt lines so they have to be shared between queue
pairs. Instead, just test for this case and only execute the division
if needed. The code has been lifted from the ice driver.
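The shape of the change can be sketched as follows. The function and parameter names are hypothetical, and giving each ring a budget of at least 1 in the shared case is an assumption of this sketch.

```c
/* Per-ring NAPI budget; divide only when several rings share a vector. */
static int example_budget_per_ring(int budget, int num_ringpairs)
{
    if (num_ringpairs > 1) {
        /* Rare case: interrupt line shared between queue pairs. */
        int share = budget / num_ringpairs;

        return share > 0 ? share : 1;
    }
    /* Common case: one ring pair per vector, no division needed. */
    return budget;
}
```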
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Improve the performance of the AF_XDP zero-copy Tx completion
path. When there are no XDP buffers being sent using XDP_TX or
XDP_REDIRECT, we do not have to go through the SW ring to clean up any
entries since the AF_XDP path does not use these. In these cases, just
fast forward the next-to-use counter and skip going through the SW
ring. The limit on the maximum number of entries to complete is also
removed since the algorithm is now O(1). To simplify the code path, the
maximum number of entries to complete for the XDP path is therefore
also increased from 256 to 512 (the default number of Tx HW
descriptors). This should be fine since the completion in the XDP path
is faster than in the SKB path that has 256 as the maximum number.
This patch provides around 4% throughput improvement for the l2fwd
application in xdpsock on my machine.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
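The class of bug is easy to reproduce in a small user-space sketch. The function below is hypothetical and only mirrors the pattern: an error variable still holds 0 from an earlier successful step when an allocation fails, so the error path must set it explicitly.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical init routine taking the allocator as a parameter. */
static int example_init(void *(*alloc)(size_t), void **out)
{
    void *buf;
    int err;

    err = 0; /* stands in for an earlier step that succeeded */

    buf = alloc(64);
    if (!buf) {
        err = -ENOMEM; /* without this, the error path returns 0 */
        goto err_out;
    }

    *out = buf;
    return 0;

err_out:
    return err;
}

/* Allocator that always fails, to exercise the error path. */
static void *example_failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```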
Fixes: b66c7bc1cd ("iavf: Refactor init state machine")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>