OpenCloudOS-Kernel/net/core
Jie Wang 39bc05fc7b net: page_pool: optimize page pool page allocation in NUMA scenario
Currently NIC packet receiving performance based on page pool deteriorates
occasionally. To analysis the causes of this problem page allocation stats
are collected. Here are the stats when NIC rx performance deteriorates:

bandwidth(Gbits/s)		16.8		6.91
rx_pp_alloc_fast		13794308	21141869
rx_pp_alloc_slow		108625		166481
rx_pp_alloc_slow_h		0		0
rx_pp_alloc_empty		8192		8192
rx_pp_alloc_refill		0		0
rx_pp_alloc_waive		100433		158289
rx_pp_recycle_cached		0		0
rx_pp_recycle_cache_full	0		0
rx_pp_recycle_ring		362400		420281
rx_pp_recycle_ring_full		6064893		9709724
rx_pp_recycle_released_ref	0		0

The rx_pp_alloc_waive count indicates that a large number of pages' numa
node are inconsistent with the NIC device numa node. Therefore these pages
can't be reused by the page pool. As a result, many new pages would be
allocated by __page_pool_alloc_pages_slow which is time consuming. This
causes the NIC rx performance fluctuations.

The main reason of huge numa mismatch pages in page pool is that page pool
uses alloc_pages_bulk_array to allocate original pages. This function is
not suitable for page allocation in NUMA scenario. So this patch uses
alloc_pages_bulk_array_node which has a NUMA id input parameter to ensure
the NUMA consistent between NIC device and allocated pages.

Repeated NIC rx performance tests are performed 40 times. NIC rx bandwidth
is higher and more stable compared to the datas above. Here are three test
stats, the rx_pp_alloc_waive count is zero and rx_pp_alloc_slow which
indicates pages allocated from slow patch is relatively low.

bandwidth(Gbits/s)		93		93.9		93.8
rx_pp_alloc_fast		60066264	61266386	60938254
rx_pp_alloc_slow		16512		16517		16539
rx_pp_alloc_slow_ho		0		0		0
rx_pp_alloc_empty		16512		16517		16539
rx_pp_alloc_refill		473841		481910		481585
rx_pp_alloc_waive		0		0		0
rx_pp_recycle_cached		0		0		0
rx_pp_recycle_cache_full	0		0		0
rx_pp_recycle_ring		29754145	30358243	30194023
rx_pp_recycle_ring_full		0		0		0
rx_pp_recycle_released_ref	0		0		0

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20220705113515.54342-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: hongrongxuan <hongrongxuan@huawei.com>
2024-06-12 13:17:57 +08:00
..
Makefile ethtool: move to its own directory 2024-06-12 13:17:03 +08:00
bpf_sk_storage.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
datagram.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
datagram.h net/core: Allow the compiler to verify declaration and definition consistency 2019-03-27 13:49:44 -07:00
dev.c net: add inline function skb_csum_is_sctp 2024-06-12 13:16:45 +08:00
dev_addr_lists.c net: remove unnecessary variables and callback 2019-10-24 14:53:49 -07:00
dev_ioctl.c ethtool: add timestamping related string sets 2024-06-12 13:17:27 +08:00
devlink.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
drop_monitor.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
dst.c net: print proper warning on dst underflow 2019-09-26 09:05:56 +02:00
dst_cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
failover.c failover: allow name change on IFF_UP slave interfaces 2019-04-10 22:12:26 -07:00
fib_notifier.c net: fib_notifier: move fib_notifier_ops from struct net into per-net struct 2019-09-07 17:28:22 +02:00
fib_rules.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
filter.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
flow_dissector.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
flow_offload.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
gen_estimator.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
gen_stats.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
gro_cells.c gro_cells: make sure device is up in gro_cells_receive() 2019-03-10 11:07:14 -07:00
hwbm.c net: hwbm: Make the hwbm_pool lock a mutex 2019-06-09 19:40:10 -07:00
link_watch.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
lwt_bpf.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
lwtunnel.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
neighbour.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
net-procfs.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
net-sysfs.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
net-sysfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
net-traces.c page_pool: add tracepoints for page_pool with details need by XDP 2019-06-19 11:23:13 -04:00
net_namespace.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
netclassid_cgroup.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
netevent.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netpoll.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
netprio_cgroup.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_pool.c net: page_pool: optimize page pool page allocation in NUMA scenario 2024-06-12 13:17:57 +08:00
pktgen.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
ptp_classifier.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
request_sock.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
rtnetlink.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
rx_cgroup.h tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
scm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
secure_seq.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
skbuff.c skbuff: Fix a potential race while recycling page_pool packets 2024-06-12 13:17:00 +08:00
skmsg.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
sock.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
sock_diag.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
sock_map.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
sock_reuseport.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
stream.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
sysctl_net_core.c tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
timestamping.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
tso.c net: Use skb accessors in network core 2019-07-22 20:47:56 -07:00
tx_cgroup.h tkernel: sync code to the same with tk4 pub/lts/0017-kabi 2024-06-12 13:13:20 +08:00
utils.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
xdp.c net: page_pool: API cleanup and comments 2024-06-12 13:16:53 +08:00