Commit Graph

123 Commits

Author SHA1 Message Date
Huazhong Tan c99fead7cb net: hns3: add debugfs support for interrupt coalesce
Since users may need to check the current configuration of the
interrupt coalesce, add debugfs support for querying this info.

Create a single file "coalesce_info" for it, which can be queried
by "cat coalesce_info" to return the result to userspace.

For devices whose version is above V3 (including V3), the GL
register contains both the usecs value and the 1us unit
configuration. When getting the usecs configuration from this
register, it would include the confusing unit configuration, so add
a GL mask to get the correct value, and add a QL mask for the frames
configuration as well.

The display style is below:
$ cat coalesce_info
tx interrupt coalesce info:
VEC_ID  ALGO_STATE  PROFILE_ID  CQE_MODE  TUNE_STATE  STEPS_LEFT...
0       IN_PROG     4           EQE       ON_TOP      0...
1       START       3           EQE       LEFT        1...

rx interrupt coalesce info:
VEC_ID  ALGO_STATE  PROFILE_ID  CQE_MODE  TUNE_STATE  STEPS_LEFT...
0       IN_PROG     3           EQE       LEFT        1...
1       IN_PROG     0           EQE       ON_TOP      0...

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:00:58 +01:00
Yunsheng Lin 9f9f0f1999 net: hns3: fix for miscalculation of rx unused desc
The rx unused desc is the desc that needs a new buffer attached
before being refilled to the hw to receive a new packet. The number
of descs needing a new buffer is calculated using next_to_use and
next_to_clean. When next_to_use == next_to_clean, the hns3 driver
currently assumes that all the descs have a buffer attached, but
'next_to_use == next_to_clean' also means that all the descs need a
new buffer attached if the hw has consumed all the descs and the
driver has not attached any buffer to them yet.

This patch adds 'refill' in desc_cb to indicate whether a new
buffer has been refilled to a desc.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin adfb7b4966 net: hns3: fix the max tx size according to user manual
Currently the max tx size supported by the hw is calculated using
the max BD num supported by the hw. According to the hw user manual,
the max tx size is a fixed value for both non-TSO and TSO skbs.

This patch updates the max tx size according to the manual.

Fixes: 8ae10cfb5089 ("net: hns3: support tx-scatter-gather-fraglist feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Hao Chen c74e503572 net: hns3: add some required spaces
Add some required spaces to improve readability.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-31 12:36:42 +01:00
Hao Chen 0cb0704149 net: hns3: add required space in comment
Add some required spaces in comments for cleanup.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-28 11:20:05 +01:00
Yufeng Mo cce1689eb5 net: hns3: add ethtool support for CQE/EQE mode configuration
Add support in ethtool for switching EQE/CQE mode.
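
A possible usage example (assuming an ethtool version that exposes
the CQE mode coalesce parameters; the interface name is only
illustrative):
$ ethtool -C eth0 cqe-mode-tx on cqe-mode-rx on
$ ethtool -c eth0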

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-24 07:38:29 -07:00
Yufeng Mo 9f0c6f4b74 net: hns3: add support for EQE/CQE mode configuration
For devices whose version is above V3 (including V3), the GL can
select EQE or CQE mode, so add support for it.

In CQE mode, the coalesce timer restarts when the first new
completion occurs, while in EQE mode the timer does not restart.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-24 07:38:29 -07:00
Yufeng Mo ddccc5e368 net: hns3: add support for triggering reset by ethtool
Currently, four reset types are supported for the HNS3 ethernet
driver: IMP reset, global reset, function reset, and FLR. Only
FLR can now be triggered by the user. To restore the device when
an exception occurs, add support for triggering reset by ethtool.

Run the "ethtool --reset DEVNAME mgmt | all | dedicated" to
trigger the IMP | global | function reset manually.
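
For example, a global reset on a PF could be triggered like below
(interface name is only illustrative):
$ ethtool --reset eth0 all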

In addition, a VF can only trigger the function reset.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/1628602128-15640-1-git-send-email-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-11 15:03:30 -07:00
Yunsheng Lin 93188e9642 net: hns3: support skb's frag page recycling based on page pool
This patch adds skb's frag page recycling support based on
the frag page support in page pool.

The performance improves by about 10~20% for a single-thread iperf
TCP flow with the IOMMU disabled when the iperf server and irq/NAPI
run on different CPUs.

The performance improves by about 135% (14 Gbit to 33 Gbit) for a
single-thread iperf TCP flow when the IOMMU is in strict mode and
the iperf server shares the same cpu with irq/NAPI.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09 15:49:01 -07:00
Yunsheng Lin 99f6b5fb5f net: hns3: use bounce buffer when rx page can not be reused
Currently the rx page will be reused to receive a future packet
when the stack releases the previous skb quickly. If the old page
can not be reused, a new page will be allocated and mapped, which
consumes a lot of cpu when the IOMMU is in strict mode, especially
when the application and irq/NAPI happen to run on the same cpu.

So allocate a new frag to memcpy the data to avoid the costly IOMMU
unmapping/mapping operations, and add "frag_alloc_err" and
"frag_alloc" stats to the "ethtool -S ethX" cmd.

The throughput improves by above 50% when running a single thread
of iperf using TCP when the IOMMU is in strict mode and iperf shares
the same cpu with irq/NAPI (rx_copybreak = 2048 and mtu = 1500).

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 00:36:06 -07:00
Yunsheng Lin 7459775e9f net: hns3: support dma_map_sg() for multi frags skb
Using the queue based tx buffer, it is also possible to allocate a
sgl buffer and use skb_to_sgvec() to convert the skb to a sgvec, so
that dma_map_sg() can be used to decrease the overhead of IOMMU
mapping and unmapping.

Firstly, it reduces the number of buffers. For example, a tcp skb
may have a 66-byte header and 3 fragments of 4328, 32768, and 28064
bytes. With this patch, dma_map_sg() will combine them into two
buffers: the 66-byte header and one 65160-byte fragment, by using
the IOMMU.

Secondly, it reduces the number of dma mapping and unmapping
operations. All of the original 4 buffers are mapped only once
rather than 4 times.

The throughput improves by above 10% when running a single thread of
iperf using TCP when the IOMMU is in strict mode.

Suggested-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 00:36:06 -07:00
Yunsheng Lin 907676b130 net: hns3: use tx bounce buffer for small packets
When the packet or frag size is small, it causes both security and
performance issues. As dma can't map a sub-page, some extra kernel
data is visible to devices. On the other hand, the overhead of dma
map and unmap is huge when the IOMMU is on.

So add a queue based tx shared bounce buffer to memcpy the small
packet when the len of the xmitted skb is below tx_copybreak. Add a
tx_spare_buf_size module param to set the size of the tx spare
buffer, and add set/get_tunable support to set or query the
tx_copybreak.
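
A possible way to query and tune it from userspace (assuming the
standard ethtool tx-copybreak tunable; interface name and value are
only illustrative):
$ ethtool --get-tunable eth0 tx-copybreak
$ ethtool --set-tunable eth0 tx-copybreak 2000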

The throughput improves from 30 Gbps to 90+ Gbps when running 16
netperf threads with a 32KB UDP message size when the IOMMU is in
strict mode (tx_copybreak = 2000 and mtu = 1500).

Suggested-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 00:36:06 -07:00
Yunsheng Lin 26f1ccdf60 net: hns3: minor refactor related to desc_cb handling
desc_cb is used to store mapping and freeing info for the
corresponding desc, which is used in the cleaning process. There
will be more desc_cb types coming up when supporting the tx bounce
buffer, so change the desc_cb type to a bit-wise value in order to
reduce the desc_cb type checking operations in the data path.

Also move the desc_cb type definition to hns3_enet.h because it is
only used in hns3_enet.c, and declare a local variable desc_cb in
hns3_clear_desc() to reduce lines of code.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 00:36:06 -07:00
Huazhong Tan 0bf5eb7885 net: hns3: add support for PTP
Add PTP support for the HNS3 ethernet driver.
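
After this, the timestamping capabilities could be queried with
something like (interface name is only illustrative):
$ ethtool -T eth0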

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:43:16 -07:00
Jian Shen 2ba306627f net: hns3: add support for modify VLAN filter state
Previously, due to a hardware limitation, the port VLAN filter is
effective for both a PF and its VFs simultaneously, so a single
function is not able to enable/disable it separately, and the VLAN
filter state is enabled by default. Now some devices support each
function bypassing the port VLAN filter, so each function can switch
the VLAN filter separately. Add a capability flag to check whether
the device supports modifying the VLAN filter state. If the flag is
on, the user will be able to modify the VLAN filter state by
ethtool -K.
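
For example, the filter state could then be toggled and checked like
below (assuming the standard rx-vlan-filter feature name; interface
name is only illustrative):
$ ethtool -K eth0 rx-vlan-filter off
$ ethtool -k eth0 | grep rx-vlan-filter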

Furthermore, the default VLAN filter state is also changed
according to whether a non-zero VLAN is in use. The device can then
receive packets with any VLAN tag if only VLAN 0 is used.

The function hclge_need_enable_vport_vlan_filter() is used to help
implement the above changes. The VLAN filter handling for promisc
mode can also be simplified by this function.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:53:07 -07:00
Huazhong Tan 307ea4ce3e net: hns3: switch to dim algorithm for adaptive interrupt moderation
The Linux kernel has support for a dynamic interrupt moderation
algorithm known as "dimlib". Replace the custom driver-specific
implementation of dynamic interrupt moderation with the kernel's
algorithm.
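
Adaptive moderation itself can typically be toggled through the
standard ethtool coalesce knobs, e.g. (interface name is only
illustrative):
$ ethtool -C eth0 adaptive-rx on adaptive-tx on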

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:25:37 -07:00
Huazhong Tan 77e9184869 net: hns3: refactor dump bd info of debugfs
Currently, the debugfs command for bd info is implemented by
"echo xxxx > cmd" and records the information in dmesg, which is
unnecessary and heavy.

To improve it, add two debugfs directories "tx_bd_info" and
"rx_bd_info", and create a file for each queue under these two
directories. The bd info of a specific queue can be queried by
"cat tx_bd_info/tx_bd_queue*" or "cat rx_bd_info/rx_bd_queue*",
returning the result to userspace rather than recording it in dmesg.

The display style is below:
$ cat rx_bd_info/rx_bd_queue0
Queue 0 rx bd info:
BD_IDX   L234_INFO  PKT_LEN   SIZE...
0        0x0             60     60...
1        0x0           1512   1512...

$ cat tx_bd_info/tx_bd_queue0
Queue 0 tx bd info:
BD_IDX     ADDRESS  VLAN_TAG  SIZE...
0          0x0          0        0...
1          0x0          0        0...

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:07:34 -07:00
Yufeng Mo 5e69ea7ee2 net: hns3: refactor the debugfs process
Currently, each debugfs command needs to create a file to get
the information. To better support more debugfs commands, the
debugfs process is reconstructed, including the process of
creating dentries and files, and obtaining information.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:07:34 -07:00
Huazhong Tan 1ddc028ac8 net: hns3: refactor out RX completion checksum
Only when RXD advanced layout is enabled will the checksum of the
entire packet be calculated in some cases (e.g. ip fragments) and
filled into the least significant 16 bits of the unused addr field.

So refactor out the handling of the RX completion checksum: adjust
the location of the checksum in the RX descriptor, and use the ptype
table to identify whether this kind of checksum is calculated.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:07:33 -07:00
Huazhong Tan 796640778c net: hns3: support RXD advanced layout
Currently, the driver gets the packet type by parsing the
L3_ID/L4_ID/OL3_ID/OL4_ID from the RX descriptor, which is
time-consuming.

Now some new devices support RXD advanced layout, which combines the
previous OL3_ID/OL4_ID into an 8-bit ptype field, so the driver gets
the packet type by looking up only one table, and L3_ID/L4_ID become
reserved fields.

Considering compatibility, the firmware reports the capability of
RXD advanced layout, and the driver will identify and enable it by
default. This patch provides the basic function: identify and enable
the RXD advanced layout, and refactor out hns3_rx_checksum() by
using the ptype table to handle the RX checksum if supported.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:07:33 -07:00
Yunsheng Lin 811c0830eb net: hns3: add tx send size handling for tso skb
The actual size on wire for a tso skb should be
(gso_segs - 1) * hdr + skb->len instead of skb->len, which can be
seen by the user with the 'ethtool -S ethX' cmd, and 'Byte Queue
Limit' also uses the send size stat to do the queue limiting. So add
send_bytes in the desc_cb to record the actual send size for a skb.
Since send_bytes is only for the tx desc_cb and page_offset is only
for the rx desc, reuse the same space for both of them.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29 13:21:01 -07:00
Yunsheng Lin d5d5e0193e net: hns3: add handling for xmit skb with recursive fraglist
Currently the hns3 driver only handles an xmit skb with one level
of fraglist skb. Add handling for multiple levels by calling
hns3_tx_bd_num() recursively when calculating the bd num and calling
hns3_fill_skb_to_desc() recursively when filling the tx desc.

When the skb has a fraglist level of 24, the skb is simply dropped
and stats.max_recursion_level is added to record the error. Move the
stat handling from hns3_nic_net_xmit() to hns3_nic_maybe_stop_tx()
in order to handle the different error stats and add the
'max_recursion_level' and 'hw_limitation' stats.

Note that the max recursive level of 24 is chosen according to
commit 48a1df6533 ("skbuff: return -EMSGSIZE in skb_to_sgvec to
prevent overflow").

Also note that we are not able to find a testcase to verify the
recursive fraglist case, so a Fixes tag is not provided.

Reported-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29 13:21:00 -07:00
Huazhong Tan 64749c9c38 net: hns3: remove redundant return value of hns3_uninit_all_ring()
Since hns3_uninit_all_ring() only returns 0, remove this redundant
return value and the function declaration in hns3_enet.h.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 15:34:07 -08:00
Yufeng Mo 9393eb5034 net: hns3: clean up unnecessary parentheses in macro definitions
In macro definitions, parentheses are unnecessary in some cases,
such as the calling parameter of a function, the left variable
of the equal sign, and so on. So remove these unnecessary
parentheses according to these rules.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 15:34:07 -08:00
Yufeng Mo e070c8b91a net: hns3: add support for obtaining the maximum frame size
Since newer hardware may support a different frame size, add
support for obtaining this capability from the firmware instead of
using a fixed value.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06 14:36:05 -08:00
Huazhong Tan 3e2816219d net: hns3: add udp tunnel checksum segmentation support
For devices that have the capability to handle udp tunnel checksum
segmentation, add support for it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 15:16:31 -08:00
Huazhong Tan 66d52f3bf3 net: hns3: add support for TX hardware checksum offload
For the device that supports TX hardware checksum, the hardware
can calculate the checksum from the start and fill the checksum
to the offset position, which reduces the operations of
calculating the type and header length of L3/L4. So add this
feature for the HNS3 ethernet driver.

The previous simple BD description is unsuitable, so rename it as
HW TX CSUM.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 15:16:31 -08:00
Huazhong Tan 4b2fe769aa net: hns3: add support for RX completion checksum
In some cases (for example ip fragment), hardware will
calculate the checksum of whole packet in RX, and setup
the HNS3_RXD_L2_CSUM_B flag in the descriptor, so add
support to utilize this checksum.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 15:16:31 -08:00
Huazhong Tan de25bcc47f net: hns3: rename gl_adapt_enable in struct hns3_enet_coalesce
Besides GL (Gap Limiting), QL (Quantity Limiting) can be modified
dynamically when DIM is supported. So rename gl_adapt_enable as
adapt_enable in struct hns3_enet_coalesce.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:21 -08:00
Huazhong Tan 5ac84b02d3 net: hns3: add support for 1us unit GL configuration
For devices whose version is above V3 (including V3), the GL
configuration can be set in 1us units, so add support for
configuring this field.
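
The GL value corresponds to the usecs coalesce parameter, so with
this it could be tuned at a finer granularity via ethtool, e.g.
(interface name and values are only illustrative):
$ ethtool -C eth0 rx-usecs 20 tx-usecs 20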

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:20 -08:00
Huazhong Tan ab16b49cdf net: hns3: add support for querying maximum value of GL
For maintainability and compatibility, add support for querying
the maximum value of GL.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:20 -08:00
Huazhong Tan 91bfae25ee net: hns3: add support for configuring interrupt quantity limiting
QL (Quantity Limiting) means that the hardware supports interrupt
coalescing based on the frame quantity. QL can be configured when
int_ql_max in the device's specification is non-zero, so add support
for configuring it. Also, rename two coalesce init functions to fit
their purpose.
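
When QL is supported, the frame based coalesce parameters could be
configured via ethtool, e.g. (interface name and values are only
illustrative):
$ ethtool -C eth0 rx-frames 32 tx-frames 32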

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:20 -08:00
Guangbin Huang dbaae5bb46 net: hns3: dump tqp enable status in debugfs
Add debugfs support to dump the tqp enable status.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:14:24 -07:00
Huazhong Tan fd665b3dba net: hns3: replace macro HNS3_MAX_NON_TSO_BD_NUM
Currently, the driver is able to query the device's specifications,
which include the maximum BD number of a non TSO packet, so replace
the macro HNS3_MAX_NON_TSO_BD_NUM with the queried value, and
rewrite the macro HNS3_MAX_NON_TSO_SIZE, whose value depends on the
maximum BD number of a non TSO packet.

Also, add a new parameter max_non_tso_bd_num to the functions
hns3_tx_bd_num() and hns3_skb_need_linearized(), so that they can
get the maximum BD number of a non TSO packet from the caller
instead of calculating it themselves. The note of
hns3_skb_need_linearized() should be updated as well.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:14:24 -07:00
Yunsheng Lin 619ae331d1 net: hns3: use napi_consume_skb() when cleaning tx desc
Use napi_consume_skb() to batch consuming skb when cleaning
tx desc in NAPI polling.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 16:14:28 -07:00
Yunsheng Lin 48ee56fd0b net: hns3: use writel() to optimize the barrier operation
writel() can be used to order I/O vs memory by default when writing
portable drivers, so use writel() to replace wmb() +
writel_relaxed(). Since writel() is dma_wmb() + writel_relaxed() for
ARM64, there is an optimization here because dma_wmb() is a lighter
barrier than wmb().

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 16:14:28 -07:00
Yunsheng Lin 20d06ca267 net: hns3: optimize the tx clean process
Currently the HNS3_RING_TX_RING_HEAD_REG register is read to
determine how many tx descs can be cleaned. To avoid the register
read operation in the critical data path, use the valid bit in the
tx desc to determine if a specific tx desc can be cleaned.

The hns3 driver sets the valid bit in the tx desc before ringing a
doorbell to the hw, and the hw will only clear the valid bit of the
tx desc after the corresponding packet has been sent out to the
wire. And because next_to_use for the tx ring is a changing variable
while the driver is filling the tx desc, reuse the pull_len of the
rx ring to record the tx desc that has been notified to the hw, so
that hns3_nic_reclaim_desc() can decide how many tx descs' valid
bits need checking when reclaiming tx descs.

The io_err_cnt stat is also removed since it is not used anymore.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 16:14:28 -07:00
Yunsheng Lin f6061a056c net: hns3: batch tx doorbell operation
Use netdev_xmit_more() to defer the tx doorbell operation when the
skb is passed to the driver continuously. By doing this we can
improve the overall xmit performance by avoiding some doorbell
operations.

Also, the tx_err_cnt stat is not used, so rename it to the tx_more
stat.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 16:14:28 -07:00
Yunsheng Lin aeda9bf87a net: hns3: batch the page reference count updates
Batch the page reference count updates instead of doing them one at
a time. By doing this we can improve the overall receive performance
by avoiding some atomic increment operations when the rx page is
reused.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 16:14:28 -07:00
Guojia Liao 2c7bcc1de1 net: hns3: remove some unused function hns3_update_promisc_mode()
hns3_update_promisc_mode() is defined but not used, so remove it.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 19:51:41 -07:00
Huazhong Tan 3d93fda0bf net: hns3: remove some unused macros related to queue
There are several queue-related macros defined but never used, so
remove them.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 19:51:41 -07:00
Huazhong Tan b7ae986f69 net: hns3: remove unused field 'io_base' in struct hns3_enet_ring
'io_base' has been defined and initialized, but never used,
so remove it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08 19:51:40 -07:00
David S. Miller a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Yunsheng Lin 48ae74c9d8 net: hns3: fix for not calculating TX BD send size correctly
With GRO and fraglist support, the SKB can be aggregated to
a total size of 65535, and when that SKB is forwarded through
a bridge, the size of the SKB may be pushed to exceed the size
of 65535 when br_dev_queue_push_xmit() is called.

The max send size of a BD supported by the HW is 65535. When a SKB
with a headlen of over 65535 is sent to the driver, the driver needs
to use multiple BDs to send the linear data, and the send size of
the last BD is calculated incorrectly by the driver, which uses the
'&' operation, causing a TX error.

Use the '%' operation to fix this problem.

Fixes: 3fe13ed95d ("net: hns3: avoid mult + div op in critical data path")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:49:17 -07:00
Barry Song cb0e3e6115 net: hns3: pointer type of buffer should be void
Change the type of the buffer address from unsigned char to void.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:43:10 -07:00
Huazhong Tan 5e86178dce net: hns3: remove some unused fields in struct hns3_nic_priv
Remove some fields which are defined in struct hns3_nic_priv but
not used, and remove the related definitions of struct
hns3_udp_tunnel and enum hns3_udp_tnl_type.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan bd13f7e129 net: hns3: remove some unused macros
There are some macros defined in hns3_enet.h but not used anywhere,
so remove them.

Reported-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:18:10 -07:00
Jian Shen 039ba863e8 net: hns3: optimize the filter table entries handling when resetting
Currently, the PF driver removes all (including its VFs') MAC/VLAN
flow director table entries when resetting, and restores them after
the reset has completed.

In fact, the hardware will clear all table entries only in an IMP
reset or a global reset. So the driver only needs to restore the
table entries in these cases, and needs to do nothing for a PF
reset, FLR or other function level reset.

This patch optimizes it by removing the unnecessary table entry
clearing and restoring in the reset flow, and doing the restoring
after the reset has completed.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-25 20:29:44 -07:00
Jian Shen c631c69682 net: hns3: refactor the promisc mode setting
As the HNS3 driver doesn't update the MAC address directly in
hns3_set_rx_mode() now, it can't know whether the MAC table is full
from __dev_uc_sync() and __dev_mc_sync(), so it's senseless to
handle the overflow promisc there.

This patch removes the handling of overflow promisc from
hns3_set_rx_mode(), and updates the promisc mode in the service
task.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-25 20:29:44 -07:00
Leon Romanovsky cad99e5068 net/hns: Remove custom driver version in favour of global one
Use globally defined kernel version instead of custom driver variant.

Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 13:27:37 -07:00