Commit Graph

873653 Commits

Author SHA1 Message Date
Jianping Liu 6a8f6c32d9 README.md: update README.md
Update README.md with OC Kernel features introduce.
2024-09-09 19:00:59 +08:00
Jianping Liu e9367f76c5 dist: set with_ofed to 0 in OC8 kernel
OC need not compile mlnx commercial drivers.
2024-08-30 16:28:40 +08:00
Huang Cun e5646556fd
!218 [next-5.4]scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
Merge pull request !218 from chenyi/linux-5.4/cy
2024-08-28 02:25:38 +00:00
Yihang Li f69b22de11 scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I96KNQ
CVE: NA

----------------------------------------------------------------------

We found that the second parameter of function ata_wait_after_reset() is
incorrectly used. We call smp_ata_check_ready_type() to poll the device
type until the 30s timeout, so the correct deadline should be (jiffies +
30000).

Fixes: 3c2673a09c ("scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset")
Signed-off-by: xiabing <xiabing12@h-partners.com>
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Bing Xia <xiabing12@h-partners.com>
Signed-off-by: chenyi <chenyi211@huawei.com>
2024-08-20 04:26:38 -04:00
Huang Cun 2bf5a747a0 gpu: phytium: add depends on ARM64 and ARCH_PHYTIUM
Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 17:30:31 +08:00
Huang Cun a4ffb5a429 sssnic: adapting compilation for tencentos
Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 17:19:32 +08:00
Huang Cun 37c2f83262 cpufreq: CPPC: Add support for frequency invariance's divide-by zero part
fix ampere cpu kernel warning:
cpufreg: cpufreg_online: ->get() failed
refer upstream 1eb5dde674 (cpufreq: CPPC: Add support for frequency invariance)
backport this commit's divide-by zero part

Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 14:22:59 +08:00
Huang Cun 4ad0c632c2 update arm-cmn.c from tkernel5
This is the original code update from tkernel5 master tree with commit id 1f87429485

Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 14:22:37 +08:00
Robin Murphy c649df6d2d perf: Add Arm CMN-600 PMU driver
config: set CONFIG_ARM_CMN=m

To support arm-cmn.ko for performance debug
‘for’ loop initial declarations are only allowed in C99 or C11 mode
so set CFLAGS_arm-cmn.o += -std=gnu99

commit 0ba64770a2 upstream.

Initial driver for PMU event counting on the Arm CMN-600 interconnect.
CMN sports an obnoxiously complex distributed PMU system as part of
its debug and trace features, which can do all manner of things like
sampling, cross-triggering and generating CoreSight trace. This driver
covers the PMU functionality, plus the relevant aspects of watchpoints
for simply counting matching flits.

Tested-by: Tsahi Zidenberg <tsahee@amazon.com>
Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 14:21:45 +08:00
Bartosz Golaszewski b264ba41b2 devres: provide devm_krealloc()
add this patch for arm-cmn
commit f82485722e upstream.

Implement the managed variant of krealloc(). This function works with
all memory allocated by devm_kmalloc() (or devres functions using it
implicitly like devm_kmemdup(), devm_kstrdup() etc.).

Managed realloc'ed chunks can be manually released with devm_kfree().

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200824173859.4910-2-brgl@bgdev.pl
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 11:43:51 +08:00
Bartosz Golaszewski ac3e23db87 devres: move the size check from alloc_dr() into a separate function
add this patch for arm-cmn
commit dc2a633ccb upstream.

We will perform the same size check in devm_krealloc(). Move the relevant
code into a separate helper.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Link: https://lore.kernel.org/r/20200629065008.27620-3-brgl@bgdev.pl
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-08-13 11:43:24 +08:00
Jianping Liu 8698752488 config-readme: update config-readme
Describe how to generate and change kernel config.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
2024-07-16 16:35:01 +08:00
Jianping Liu 3a5b595eca thirdparty/bnxt: set -Wframe-larger-than=2144 in Makefile
To fix compile error when building debug kernel:
bnxt/tfc_v3/tfc_tbl_scope.c: In function 'tfc_tbl_scope_mem_alloc':
bnxt/tfc_v3/tfc_tbl_scope.c:1185:1: error: the frame size of 2144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
2024-07-16 16:35:01 +08:00
Jianping Liu a61b032d57 thirdpaty/bnxt: update bnxt_en driver to 230
For details, the version is bnxt_en-1.10.3-230.2.52.0 . Note, bnxt_en
doesn't support RDMA, which needing bnxt_re driver.

Makefile in thirdpaty/bnxt have been modified:
add line 54: BNXT_SRC=$(srctree)/drivers/thirdparty/bnxt
add line 1420: src=$(BNXT_SRC)

Without line 1420, make dist-rpm will compile error as below:
drivers/thirdparty/bnxt/bnxt.c:111:10: fatal error: tfc.h: No such file or directory
 #include "tfc.h"

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
2024-07-16 16:35:01 +08:00
chinaljp030 701790887b
!195 [next-5.4] DRM phytium Add Phytium Display Engine support
Merge pull request !195 from xuyan213/5.4-test
2024-07-15 10:55:27 +00:00
xuyan 3a51cb695a DRM: Fix Phytium DRM build fail
Fix error :modpost "__sanitizer_cov_trace_cmpf"

Reviewed-by: Jiakun Shuai<shuaijiakun1288@phytium.com.cn>
Signed-off-by: Yang Xun <yangxun@phytium.com.cn>
Signed-off-by: Chen Baozi <chenbaozi@phytium.com.cn>
Signed-off-by: xuyan <xuyan1481@phytium.com.cn>
2024-07-15 10:39:05 +08:00
Jason Xing 2f729cf216 sssnic: add one dependency in Kconfig
Getting rid of prefix 'CONFIG_' can solve the issue. In the
previous patch, I missed one place.

Most importantly, I added dependency on CONFIG_PCI_ATS since
This drivers relys on CONFIG_PCI_ATS, so we need to adjust in
Kconfig files.

Fixes: ee280c4189a13 ("sssnic: support this new driver")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
2024-06-27 11:04:15 +08:00
Jason Xing 5f4dd239de sssnic: fix wrong dependencies in Kconfig
Getting rid of prefix 'CONFIG_' can solve the issue.

Fixes: ee280c4189a13 ("sssnic: support this new driver")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
2024-06-27 11:04:03 +08:00
xuyan 19f3a59713 drm/ast: Bugfix display error for ps23xx when using ast bmc
card

When the ast bmc card is detected on ps23xx SoCs, change the card's
vram to uncache mode to avoid unnecessary trouble.

Reviewed-by: Jiakun Shuai<shuaijiakun1288@phytium.com.cn>
Signed-off-by: Xu Yan <xuyan1481@phytium.com.cn>
Signed-off-by: WangHao <wanghao1851@cphytium.com.cn>
2024-06-26 14:26:16 +08:00
xuyan c328a15761 drm/phytium: Bugfix enable efi fb for ps23xx when using pe2201 bmc card
When enable efi fb, replace the original efi fb0 when loading the
dc driver.

Reviewed-by: Jiakun Shuai<shuaijiakun1288@phytium.com.cn>
Signed-off-by: Xu Yan <xuyan1481@phytium.com.cn>
Signed-off-by: WangHao <wanghao1851@phytium.com.cn>
2024-06-26 14:24:19 +08:00
xuyan 4c42a59bb6 drm/phytium: Bugfix Xorg startup for ps23xx when using pe2201
bmc card

When the pe2201 bmc card is detected on ps23xx SoCs, map the card's
graphics memory to device attributes to avoid unnecessary trouble.

Reviewed-by: Jiakun Shuai<shuaijiakun1288@phytium.com.cn>
Signed-off-by: Xu Yan <xuyan1481@phytium.com.cn>
Signed-off-by: WangHao <wanghao1851@phytium.com.cn>
2024-06-26 14:21:46 +08:00
lishuo 9debb2ab33 DRM phytium Add Phytium Display Engine support
Phytium Display Engine support,DC/DP driver patch.
Signed-off-by: Yang Xun <yangxun@phytium.com.cn>
Signed-off-by: Chen Baozi <chenbaozi@phytium.com.cn>
Reviewed-by:Hongbo Mao<maohongbo@phytium.com.cn>
(cherry picked from commit efc4d36af4)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
2024-06-26 14:21:05 +08:00
xuyan 823ace9213 Add phytium pci definition
add phytium pci definition

Reviewed-by: Jiakun Shuai<shuaijiakun1288@phytium.com.cn>
Signed-off-by: xuyan<xuyan1481@phytium.com.cn>
2024-06-26 14:19:03 +08:00
Yihang Li ef9fcd99f1 scsi: hisi_sas: Check whether debugfs is enabled before removing or releasing it
hisi_sas debugfs remove should be executed only when debugfs is enabled.
Check whether debugfs is enabled and then remove it only if enabled.

Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1705904747-62186-4-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: chenyi <chenyi211@huawei.com>
2024-06-12 17:11:45 +08:00
Niklas Cassel 1159f42562 scsi: core: Kick the requeue list after inserting when flushing
When libata calls ata_link_abort() to abort all ata queued commands, it
calls blk_abort_request() on the SCSI command representing each QC.

This causes scsi_timeout() to be called, which calls scsi_eh_scmd_add() for
each SCSI command.

scsi_eh_scmd_add() sets the SCSI host to state recovery, and then adds the
command to shost->eh_cmd_q.

This will wake up the SCSI EH, and eventually the libata EH strategy
handler will be called, which calls scsi_eh_flush_done_q() to either flush
retry or flush finish each failed command.

The commands that are flush retried by scsi_eh_flush_done_q() are done so
using scsi_queue_insert().

Before commit 8b566edbdb ("scsi: core: Only kick the requeue list if
necessary"), __scsi_queue_insert() called blk_mq_requeue_request() with the
second argument set to true, indicating that it should always kick/run the
requeue list after inserting.

After commit 8b566edbdb ("scsi: core: Only kick the requeue list if
necessary"), __scsi_queue_insert() does not kick/run the requeue list after
inserting, if the current SCSI host state is recovery (which is the case in
the libata example above).

This optimization is probably fine in most cases, as I can only assume that
most often someone will eventually kick/run the queues.

However, that is not the case for scsi_eh_flush_done_q(), where we can see
that the request gets inserted to the requeue list, but the queue is never
started after the request has been inserted, leading to the block layer
waiting for the completion of command that never gets to run.

Since scsi_eh_flush_done_q() is called by SCSI EH context, the SCSI host
state is most likely always in recovery when this function is called.

Thus, let scsi_eh_flush_done_q() explicitly kick the requeue list after
inserting a flush retry command, so that scsi_eh_flush_done_q() keeps the
same behavior as before commit 8b566edbdb ("scsi: core: Only kick the
requeue list if necessary").

Simple reproducer for the libata example above:
$ hdparm -Y /dev/sda
$ echo 1 > /sys/class/scsi_device/0\:0\:0\:0/device/delete

Fixes: 8b566edbdb ("scsi: core: Only kick the requeue list if necessary")
Reported-by: Kevin Locke <kevin@kevinlocke.name>
Closes: https://lore.kernel.org/linux-scsi/ZZw3Th70wUUvCiCY@kevinlocke.name/
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20240111120533.3612509-1-cassel@kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: chenyi <chenyi211@huawei.com>
2024-06-12 17:11:42 +08:00
Yihang Li ceac9bfc4e scsi: hisi_sas: Correct the number of global debugfs registers
In function debugfs_debugfs_snapshot_global_reg_v3_hw() it uses
debugfs_axi_reg.count (which is the number of axi debugfs registers) to
acquire the number of global debugfs registers.

Use debugfs_global_reg.count to acquire the number of global debugfs
registers instead.

Fixes: 623a4b6d5c ("scsi: hisi_sas: Move debugfs code to v3 hw driver")
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1702525516-51258-6-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: chenyi <chenyi211@huawei.com>
2024-06-12 17:11:37 +08:00
Luo Jiaxing b3dcc0c786 scsi: hisi_sas: Run I_T nexus resets in parallel for clear nexus reset
For a clear nexus reset operation, the I_T nexus resets are executed
serially for each device. For devices attached through an expander, this
may take 2s per device; so, in total, could take a long time.

Reduce the total time by running the I_T nexus resets in parallel through
async operations.

Link: https://lore.kernel.org/r/1623058179-80434-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: chenyi <chenyi211@huawei.com>
2024-06-12 17:11:30 +08:00
Russell King (Oracle) dcbccb8b56 net: phy: avoid kernel warning dump when stopping an errored PHY
When taking a network interface down (or removing a SFP module) after
the PHY has encountered an error, phy_stop() complains incorrectly
that it was called from HALTED state.

The reason this is incorrect is that the network driver will have
called phy_start() when the interface was brought up, and the fact
that the PHY has a problem bears no relationship to the administrative
state of the interface. Taking the interface administratively down
(which calls phy_stop()) is always the right thing to do after a
successful phy_start() call, whether or not the PHY has encountered
an error.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>

 Conflicts:
	drivers/net/phy/phy.c
	include/linux/phy.h
2024-06-12 16:57:03 +08:00
Florian Fainelli ed4867ebaa net: phy: Improved PHY error reporting in state machine
When the PHY library calls phy_error() something bad has happened, and
we halt the PHY state machine. Calling phy_error() from the main state
machine however is not precise enough to know whether the issue is
reading the link status or starting auto-negotiation.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>

 Conflicts:
	drivers/net/phy/phy.c
2024-06-12 16:56:56 +08:00
Danielle Ratson 6e68dd0e8d ethtool: Expose the number of lanes in use
Currently, ethtool does not expose how many lanes are used when the
link is up.

After adding a possibility to advertise or force a specific number of
lanes, the lanes in use value can be either the maximum width of the port
or below.

Extend ethtool to expose the number of lanes currently in use for
drivers that support it.

For example:

$ ethtool -s swp1 speed 100000 lanes 4
$ ethtool -s swp2 speed 100000 lanes 4
$ ip link set swp1 up
$ ip link set swp2 up
$ ethtool swp1
Settings for swp1:
        Supported ports: [ FIBRE         Backplane ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Advertised link modes:  100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
	Advertised pause frame use: No
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: 100000Mb/s
	Lanes: 4
	Duplex: Full
	Auto-negotiation: on
	Port: Direct Attach Copper
	PHYAD: 0
	Transceiver: internal
	Link detected: yes

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 16:56:49 +08:00
Danielle Ratson f43960dac3 ethtool: Get link mode in use instead of speed and duplex parameters
Currently, when user space queries the link's parameters, as speed and
duplex, each parameter is passed from the driver to ethtool.

Instead, get the link mode bit in use, and derive each of the parameters
from it in ethtool.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>

 Conflicts:
	net/ethtool/linkmodes.c
2024-06-12 16:56:36 +08:00
Danielle Ratson 6abf13bd7f ethtool: Extend link modes settings uAPI with lanes
Currently, when auto negotiation is on, the user can advertise all the
linkmodes which correspond to a specific speed, but does not have a
similar selector for the number of lanes. This is significant when a
specific speed can be achieved using different number of lanes.  For
example, 2x50 or 4x25.

Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
ethtool_link_settings' with lanes field in order to implement a new
lanes-selector that will enable the user to advertise a specific number
of lanes as well.

When auto negotiation is off, lanes parameter can be forced only if the
driver supports it. Add a capability bit in 'struct ethtool_ops' that
allows ethtool know if the driver can handle the lanes parameter when
auto negotiation is off, so if it does not, an error message will be
returned when trying to set lanes.

Example:

$ ethtool -s swp1 lanes 4
$ ethtool swp1
  Settings for swp1:
	Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKR/Full
                                40000baseCR4/Full
				40000baseSR4/Full
				40000baseLR4/Full
                                25000baseCR/Full
                                25000baseSR/Full
				50000baseCR2/Full
                                100000baseSR4/Full
				100000baseCR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  40000baseCR4/Full
				40000baseSR4/Full
				40000baseLR4/Full
                                100000baseSR4/Full
				100000baseCR4/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Link detected: no

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>

 Conflicts:
	net/ethtool/linkmodes.c
	net/ethtool/netlink.h
2024-06-12 16:56:30 +08:00
Danielle Ratson 6ad15a601f ethtool: Validate master slave configuration before rtnl_lock()
Create a new function for input validations to be called before
rtnl_lock() and move the master slave validation to that function.

This would be a cleanup for next patch that would add another validation
to the new function.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>

 Conflicts:
	net/ethtool/linkmodes.c
2024-06-12 16:56:23 +08:00
Jie Wang 39bc05fc7b net: page_pool: optimize page pool page allocation in NUMA scenario
Currently NIC packet receiving performance based on page pool deteriorates
occasionally. To analysis the causes of this problem page allocation stats
are collected. Here are the stats when NIC rx performance deteriorates:

bandwidth(Gbits/s)		16.8		6.91
rx_pp_alloc_fast		13794308	21141869
rx_pp_alloc_slow		108625		166481
rx_pp_alloc_slow_h		0		0
rx_pp_alloc_empty		8192		8192
rx_pp_alloc_refill		0		0
rx_pp_alloc_waive		100433		158289
rx_pp_recycle_cached		0		0
rx_pp_recycle_cache_full	0		0
rx_pp_recycle_ring		362400		420281
rx_pp_recycle_ring_full		6064893		9709724
rx_pp_recycle_released_ref	0		0

The rx_pp_alloc_waive count indicates that a large number of pages' numa
node are inconsistent with the NIC device numa node. Therefore these pages
can't be reused by the page pool. As a result, many new pages would be
allocated by __page_pool_alloc_pages_slow which is time consuming. This
causes the NIC rx performance fluctuations.

The main reason of huge numa mismatch pages in page pool is that page pool
uses alloc_pages_bulk_array to allocate original pages. This function is
not suitable for page allocation in NUMA scenario. So this patch uses
alloc_pages_bulk_array_node which has a NUMA id input parameter to ensure
the NUMA consistent between NIC device and allocated pages.

Repeated NIC rx performance tests are performed 40 times. NIC rx bandwidth
is higher and more stable compared to the datas above. Here are three test
stats, the rx_pp_alloc_waive count is zero and rx_pp_alloc_slow which
indicates pages allocated from slow patch is relatively low.

bandwidth(Gbits/s)		93		93.9		93.8
rx_pp_alloc_fast		60066264	61266386	60938254
rx_pp_alloc_slow		16512		16517		16539
rx_pp_alloc_slow_ho		0		0		0
rx_pp_alloc_empty		16512		16517		16539
rx_pp_alloc_refill		473841		481910		481585
rx_pp_alloc_waive		0		0		0
rx_pp_recycle_cached		0		0		0
rx_pp_recycle_cache_full	0		0		0
rx_pp_recycle_ring		29754145	30358243	30194023
rx_pp_recycle_ring_full		0		0		0
rx_pp_recycle_released_ref	0		0		0

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20220705113515.54342-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:57 +08:00
Uladzislau Rezki (Sony) b73ba69864 mm/page_alloc: add an alloc_pages_bulk_array_node() helper
Patch series "vmalloc() vs bulk allocator", v2.

This patch (of 3):

Add a "node" variant of the alloc_pages_bulk_array() function.  The helper
guarantees that a __alloc_pages_bulk() is invoked with a valid NUMA node
ID.

Link: https://lkml.kernel.org/r/20210516202056.2120-1-urezki@gmail.com
Link: https://lkml.kernel.org/r/20210516202056.2120-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:56 +08:00
hongrongxuan 70350410c7 tencent.config: arm64: default select CONFIG_PAGE_POOL_STATS 2024-06-12 13:17:56 +08:00
Jie Wang 9e2881148b net: page_pool: add page allocation stats for two fast page allocate path
Currently If use page pool allocation stats to analysis a RX performance
degradation problem. These stats only count for pages allocate from
page_pool_alloc_pages. But nic drivers such as hns3 use
page_pool_dev_alloc_frag to allocate pages, so page stats in this API
should also be counted.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:55 +08:00
Lorenzo Bianconi 6ec340cad1 net: page_pool: introduce ethtool stats
Introduce page_pool APIs to report stats through ethtool and reduce
duplicated code in each driver.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:55 +08:00
Lorenzo Bianconi f1b097cf6a page_pool: Add recycle stats to page_pool_put_page_bulk
Add missing recycle stats to page_pool_put_page_bulk routine.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/3712178b51c007cfaed910ea80e68f00c916b1fa.1649685634.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:55 +08:00
Joe Damato 5fc015629d Documentation: update networking/page_pool.rst
Add the new stats API, kernel config parameter, and stats structure
information to the page_pool documentation.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:54 +08:00
Lorenzo Bianconi 6b73e4a946 net: page_pool: Add page_pool_put_page_bulk() to page_pool.rst
Introduce page_pool_put_page_bulk() entry into the API section of
page_pool.rst

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/a6a5141b4d7b7b71fa7778b16b48f80095dd3233.1606146163.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:54 +08:00
Ilias Apalodimas ba01c48ca8 net: page_pool: Add documentation on page_pool API
Add documentation explaining the basic functionality and design
principles of the API

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:54 +08:00
Joe Damato a0a58dc852 page_pool: Add function to batch and return stats
Adds a function page_pool_get_stats which can be used by drivers to obtain
stats for a specified page_pool.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:53 +08:00
Joe Damato 6c05b6c3b5 page_pool: Add recycle stats
Add per-cpu stats tracking page pool recycling events:
	- cached: recycling placed page in the page pool cache
	- cache_full: page pool cache was full
	- ring: page placed into the ptr ring
	- ring_full: page released from page pool because the ptr ring was full
	- released_refcnt: page released (and not recycled) because refcnt > 1

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:53 +08:00
Joe Damato 0c7d0cb59b page_pool: Add allocation stats
Add per-pool statistics counters for the allocation path of a page pool.
These stats are incremented in softirq context, so no locking or per-cpu
variables are needed.

This code is disabled by default and a kernel config option is provided for
users who wish to enable them.

The statistics added are:
	- fast: successful fast path allocations
	- slow: slow path order-0 allocations
	- slow_high_order: slow path high order allocations
	- empty: ptr ring is empty, so a slow path allocation was forced.
	- refill: an allocation which triggered a refill of the cache
	- waive: pages obtained from the ptr ring that cannot be added to
	  the cache due to a NUMA mismatch.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:52 +08:00
Russell King 9a6a9df235 net: phylink: clarify flow control settings in documentation
Clarify the expected flow control settings operation in the phylink
documentation for each negotiation mode.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:52 +08:00
Russell King 202442fe4e net: phylink: improve initial mac configuration
Improve the initial MAC configuration so we get a configuration which
more represents the final operating mode, in particular with respect
to the flow control settings.

We do this by:
1) more fully initialising our phy state, so we can use this as the
   initial state for PHY based connections.
2) reading the fixed link state.
3) ensuring that in-band mode has sane pause settings for SGMII vs
   802.3z negotiation modes.

In all three cases, we ensure that state->link is false, just in case
any MAC drivers have other ideas by mis-using this member, and we also
take account of manual pause mode configuration at this point.

This avoids MLO_PAUSE_AN being seen in mac_config() when operating in
PHY, fixed mode or inband SGMII mode, thereby giving cleaner semantics
to the pause flags.  As a result of this, the pause flags now indicate
in a mode-independent way what is required from a mac_config()
implementation.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>

 Conflicts:
	drivers/net/phy/phylink.c
2024-06-12 13:17:52 +08:00
Russell King cca40c4a30 net: phylink: allow ethtool -A to change flow control advertisement
When ethtool -A is used to change the pause modes, the pause
advertisement is not being changed, but the documentation in
uapi/linux/ethtool.h says we should be. Add that capability to
phylink.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:51 +08:00
Russell King 6b1f64e2fd net: phylink: resolve fixed link flow control
Resolve the fixed link flow control using the recently introduced
linkmode_resolve_pause() helper, which we use in
phylink_get_fixed_state() only when operating in full duplex mode.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:51 +08:00
Russell King 9dce982e61 net: phylink: use phylib resolved flow control modes
Use the new phy_get_pause() helper to get the resolved pause modes for
a PHY rather than resolving the pause modes ourselves. We temporarily
retain our pause mode resolution for causes where there is no PHY
attached, e.g. for fixed-link modes.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:51 +08:00