samples/bpf: program demonstrating access to xdp_rxq_info
This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.
The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.
Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ. Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.
Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats CPU pps issue-pps
XDP-RX CPU 0 900,473 0
XDP-RX CPU 2 906,921 0
XDP-RX CPU total 1,807,395
RXQ stats RXQ:CPU pps issue-pps
rx_queue_index 0:0 180,098 0
rx_queue_index 0:sum 180,098
rx_queue_index 1:0 180,098 0
rx_queue_index 1:sum 180,098
rx_queue_index 2:2 906,921 0
rx_queue_index 2:sum 906,921
rx_queue_index 3:0 180,098 0
rx_queue_index 3:sum 180,098
rx_queue_index 4:0 180,082 0
rx_queue_index 4:sum 180,082
rx_queue_index 5:0 180,093 0
rx_queue_index 5:sum 180,093
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-03 18:26:19 +08:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0
|
|
|
|
* Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
|
|
|
|
*
|
|
|
|
* Example howto extract XDP RX-queue info
|
|
|
|
*/
|
|
|
|
#include <uapi/linux/bpf.h>
|
samples/bpf: extend xdp_rxq_info to read packet payload
There is a cost associated with reading the packet data payload
that this test ignored. Add option --read to allow enabling
reading part of the payload.
This sample/tool helps us analyse an issue observed with a NIC
mlx5 (ConnectX-5 Ex) and an Intel(R) Xeon(R) CPU E5-1650 v4.
With no_touch of data:
Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:no_touch
XDP stats CPU pps issue-pps
XDP-RX CPU 0 14,465,157 0
XDP-RX CPU 1 14,464,728 0
XDP-RX CPU 2 14,465,283 0
XDP-RX CPU 3 14,465,282 0
XDP-RX CPU 4 14,464,159 0
XDP-RX CPU 5 14,465,379 0
XDP-RX CPU total 86,789,992
When not touching data, we observe that the CPUs have idle cycles.
When reading data the CPUs are 100% busy in softirq.
With reading data:
Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:read
XDP stats CPU pps issue-pps
XDP-RX CPU 0 9,620,639 0
XDP-RX CPU 1 9,489,843 0
XDP-RX CPU 2 9,407,854 0
XDP-RX CPU 3 9,422,289 0
XDP-RX CPU 4 9,321,959 0
XDP-RX CPU 5 9,395,242 0
XDP-RX CPU total 56,657,828
The effect seen above is a result of cache-misses occuring when
more RXQs are being used. Based on perf-event observations, our
conclusion is that the CPUs DDIO (Direct Data I/O) choose to
deliver packet into main memory, instead of L3-cache. We also
found, that this can be mitigated by either using less RXQs or by
reducing NICs the RX-ring size.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-25 22:27:43 +08:00
|
|
|
#include <uapi/linux/if_ether.h>
|
|
|
|
#include <uapi/linux/in.h>
|
samples/bpf: program demonstrating access to xdp_rxq_info
This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.
The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.
Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ. Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.
Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats CPU pps issue-pps
XDP-RX CPU 0 900,473 0
XDP-RX CPU 2 906,921 0
XDP-RX CPU total 1,807,395
RXQ stats RXQ:CPU pps issue-pps
rx_queue_index 0:0 180,098 0
rx_queue_index 0:sum 180,098
rx_queue_index 1:0 180,098 0
rx_queue_index 1:sum 180,098
rx_queue_index 2:2 906,921 0
rx_queue_index 2:sum 906,921
rx_queue_index 3:0 180,098 0
rx_queue_index 3:sum 180,098
rx_queue_index 4:0 180,082 0
rx_queue_index 4:sum 180,082
rx_queue_index 5:0 180,093 0
rx_queue_index 5:sum 180,093
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-03 18:26:19 +08:00
|
|
|
#include "bpf_helpers.h"
|
|
|
|
|
|
|
|
/* Config setup from with userspace
|
|
|
|
*
|
|
|
|
* User-side setup ifindex in config_map, to verify that
|
|
|
|
* ctx->ingress_ifindex is correct (against configured ifindex)
|
|
|
|
*/
|
|
|
|
struct config {
|
|
|
|
__u32 action;
|
|
|
|
int ifindex;
|
samples/bpf: extend xdp_rxq_info to read packet payload
There is a cost associated with reading the packet data payload
that this test ignored. Add option --read to allow enabling
reading part of the payload.
This sample/tool helps us analyse an issue observed with a NIC
mlx5 (ConnectX-5 Ex) and an Intel(R) Xeon(R) CPU E5-1650 v4.
With no_touch of data:
Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:no_touch
XDP stats CPU pps issue-pps
XDP-RX CPU 0 14,465,157 0
XDP-RX CPU 1 14,464,728 0
XDP-RX CPU 2 14,465,283 0
XDP-RX CPU 3 14,465,282 0
XDP-RX CPU 4 14,464,159 0
XDP-RX CPU 5 14,465,379 0
XDP-RX CPU total 86,789,992
When not touching data, we observe that the CPUs have idle cycles.
When reading data the CPUs are 100% busy in softirq.
With reading data:
Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:read
XDP stats CPU pps issue-pps
XDP-RX CPU 0 9,620,639 0
XDP-RX CPU 1 9,489,843 0
XDP-RX CPU 2 9,407,854 0
XDP-RX CPU 3 9,422,289 0
XDP-RX CPU 4 9,321,959 0
XDP-RX CPU 5 9,395,242 0
XDP-RX CPU total 56,657,828
The effect seen above is a result of cache-misses occuring when
more RXQs are being used. Based on perf-event observations, our
conclusion is that the CPUs DDIO (Direct Data I/O) choose to
deliver packet into main memory, instead of L3-cache. We also
found, that this can be mitigated by either using less RXQs or by
reducing NICs the RX-ring size.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-25 22:27:43 +08:00
|
|
|
__u32 options;
|
|
|
|
};
|
|
|
|
enum cfg_options_flags {
|
|
|
|
NO_TOUCH = 0x0U,
|
|
|
|
READ_MEM = 0x1U,
|
samples/bpf: program demonstrating access to xdp_rxq_info
This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.
The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.
Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ. Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.
Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats CPU pps issue-pps
XDP-RX CPU 0 900,473 0
XDP-RX CPU 2 906,921 0
XDP-RX CPU total 1,807,395
RXQ stats RXQ:CPU pps issue-pps
rx_queue_index 0:0 180,098 0
rx_queue_index 0:sum 180,098
rx_queue_index 1:0 180,098 0
rx_queue_index 1:sum 180,098
rx_queue_index 2:2 906,921 0
rx_queue_index 2:sum 906,921
rx_queue_index 3:0 180,098 0
rx_queue_index 3:sum 180,098
rx_queue_index 4:0 180,082 0
rx_queue_index 4:sum 180,082
rx_queue_index 5:0 180,093 0
rx_queue_index 5:sum 180,093
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-03 18:26:19 +08:00
|
|
|
};
|
|
|
|
struct bpf_map_def SEC("maps") config_map = {
|
|
|
|
.type = BPF_MAP_TYPE_ARRAY,
|
|
|
|
.key_size = sizeof(int),
|
|
|
|
.value_size = sizeof(struct config),
|
|
|
|
.max_entries = 1,
|
|
|
|
};
|
|
|
|
|
|
|
|
/* Common stats data record (shared with userspace) */
|
|
|
|
struct datarec {
|
|
|
|
__u64 processed;
|
|
|
|
__u64 issue;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct bpf_map_def SEC("maps") stats_global_map = {
|
|
|
|
.type = BPF_MAP_TYPE_PERCPU_ARRAY,
|
|
|
|
.key_size = sizeof(u32),
|
|
|
|
.value_size = sizeof(struct datarec),
|
|
|
|
.max_entries = 1,
|
|
|
|
};
|
|
|
|
|
|
|
|
#define MAX_RXQs 64
|
|
|
|
|
|
|
|
/* Stats per rx_queue_index (per CPU) */
|
|
|
|
struct bpf_map_def SEC("maps") rx_queue_index_map = {
|
|
|
|
.type = BPF_MAP_TYPE_PERCPU_ARRAY,
|
|
|
|
.key_size = sizeof(u32),
|
|
|
|
.value_size = sizeof(struct datarec),
|
|
|
|
.max_entries = MAX_RXQs + 1,
|
|
|
|
};
|
|
|
|
|
|
|
|
SEC("xdp_prog0")
|
|
|
|
int xdp_prognum0(struct xdp_md *ctx)
|
|
|
|
{
|
|
|
|
void *data_end = (void *)(long)ctx->data_end;
|
|
|
|
void *data = (void *)(long)ctx->data;
|
|
|
|
struct datarec *rec, *rxq_rec;
|
|
|
|
int ingress_ifindex;
|
|
|
|
struct config *config;
|
|
|
|
u32 key = 0;
|
|
|
|
|
|
|
|
/* Global stats record */
|
|
|
|
rec = bpf_map_lookup_elem(&stats_global_map, &key);
|
|
|
|
if (!rec)
|
|
|
|
return XDP_ABORTED;
|
|
|
|
rec->processed++;
|
|
|
|
|
|
|
|
/* Accessing ctx->ingress_ifindex, cause BPF to rewrite BPF
|
|
|
|
* instructions inside kernel to access xdp_rxq->dev->ifindex
|
|
|
|
*/
|
|
|
|
ingress_ifindex = ctx->ingress_ifindex;
|
|
|
|
|
|
|
|
config = bpf_map_lookup_elem(&config_map, &key);
|
|
|
|
if (!config)
|
|
|
|
return XDP_ABORTED;
|
|
|
|
|
|
|
|
/* Simple test: check ctx provided ifindex is as expected */
|
|
|
|
if (ingress_ifindex != config->ifindex) {
|
|
|
|
/* count this error case */
|
|
|
|
rec->issue++;
|
|
|
|
return XDP_ABORTED;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Update stats per rx_queue_index. Handle if rx_queue_index
|
|
|
|
* is larger than stats map can contain info for.
|
|
|
|
*/
|
|
|
|
key = ctx->rx_queue_index;
|
|
|
|
if (key >= MAX_RXQs)
|
|
|
|
key = MAX_RXQs;
|
|
|
|
rxq_rec = bpf_map_lookup_elem(&rx_queue_index_map, &key);
|
|
|
|
if (!rxq_rec)
|
|
|
|
return XDP_ABORTED;
|
|
|
|
rxq_rec->processed++;
|
|
|
|
if (key == MAX_RXQs)
|
|
|
|
rxq_rec->issue++;
|
|
|
|
|
samples/bpf: extend xdp_rxq_info to read packet payload
There is a cost associated with reading the packet data payload
that this test ignored. Add option --read to allow enabling
reading part of the payload.
This sample/tool helps us analyse an issue observed with a NIC
mlx5 (ConnectX-5 Ex) and an Intel(R) Xeon(R) CPU E5-1650 v4.
With no_touch of data:
Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:no_touch
XDP stats CPU pps issue-pps
XDP-RX CPU 0 14,465,157 0
XDP-RX CPU 1 14,464,728 0
XDP-RX CPU 2 14,465,283 0
XDP-RX CPU 3 14,465,282 0
XDP-RX CPU 4 14,464,159 0
XDP-RX CPU 5 14,465,379 0
XDP-RX CPU total 86,789,992
When not touching data, we observe that the CPUs have idle cycles.
When reading data the CPUs are 100% busy in softirq.
With reading data:
Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:read
XDP stats CPU pps issue-pps
XDP-RX CPU 0 9,620,639 0
XDP-RX CPU 1 9,489,843 0
XDP-RX CPU 2 9,407,854 0
XDP-RX CPU 3 9,422,289 0
XDP-RX CPU 4 9,321,959 0
XDP-RX CPU 5 9,395,242 0
XDP-RX CPU total 56,657,828
The effect seen above is a result of cache-misses occuring when
more RXQs are being used. Based on perf-event observations, our
conclusion is that the CPUs DDIO (Direct Data I/O) choose to
deliver packet into main memory, instead of L3-cache. We also
found, that this can be mitigated by either using less RXQs or by
reducing NICs the RX-ring size.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-25 22:27:43 +08:00
|
|
|
/* Default: Don't touch packet data, only count packets */
|
|
|
|
if (unlikely(config->options & READ_MEM)) {
|
|
|
|
struct ethhdr *eth = data;
|
|
|
|
|
|
|
|
if (eth + 1 > data_end)
|
|
|
|
return XDP_ABORTED;
|
|
|
|
|
|
|
|
/* Avoid compiler removing this: Drop non 802.3 Ethertypes */
|
|
|
|
if (ntohs(eth->h_proto) < ETH_P_802_3_MIN)
|
|
|
|
return XDP_ABORTED;
|
|
|
|
}
|
|
|
|
|
samples/bpf: program demonstrating access to xdp_rxq_info
This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.
The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.
Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ. Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.
Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats CPU pps issue-pps
XDP-RX CPU 0 900,473 0
XDP-RX CPU 2 906,921 0
XDP-RX CPU total 1,807,395
RXQ stats RXQ:CPU pps issue-pps
rx_queue_index 0:0 180,098 0
rx_queue_index 0:sum 180,098
rx_queue_index 1:0 180,098 0
rx_queue_index 1:sum 180,098
rx_queue_index 2:2 906,921 0
rx_queue_index 2:sum 906,921
rx_queue_index 3:0 180,098 0
rx_queue_index 3:sum 180,098
rx_queue_index 4:0 180,082 0
rx_queue_index 4:sum 180,082
rx_queue_index 5:0 180,093 0
rx_queue_index 5:sum 180,093
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-03 18:26:19 +08:00
|
|
|
return config->action;
|
|
|
|
}
|
|
|
|
|
|
|
|
char _license[] SEC("license") = "GPL";
|