28af22c6c8
The current layout of net_device is not optimal for cacheline usage. The member adj_list.lower linked list is split between cacheline 2 and 3. The ifindex is placed together with stats (struct net_device_stats), although most modern drivers don't update this stats member. The members netdev_ops, mtu and hard_header_len are placed on three different cachelines. These members are accessed for XDP redirect into devmap, which were noticeably with perf tool. When not using the map redirect variant (like TC-BPF does), then ifindex is also used, which is placed on a separate fourth cacheline. These members are also accessed during forwarding with regular network stack. The members priv_flags and flags are on fast-path for network stack transmit path in __dev_queue_xmit (currently located together with mtu cacheline). This patch creates a read mostly cacheline, with the purpose of keeping the above mentioned members on the same cacheline. Some netdev_features_t members also becomes part of this cacheline, which is on purpose, as function netif_skb_features() is on fast-path via validate_xmit_skb(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/161168277983.410784.12401225493601624417.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
---|---|---|
.. | ||
acpi | ||
asm-generic | ||
clocksource | ||
crypto | ||
drm | ||
dt-bindings | ||
keys | ||
kunit | ||
kvm | ||
linux | ||
math-emu | ||
media | ||
memory | ||
misc | ||
net | ||
pcmcia | ||
ras | ||
rdma | ||
scsi | ||
soc | ||
sound | ||
target | ||
trace | ||
uapi | ||
vdso | ||
video | ||
xen |