linux-sg2042/include
Guillaume Nault 891584f48a inet: frags: re-introduce skb coalescing for local delivery
Before commit d4289fcc9b ("net: IP6 defrag: use rbtrees for IPv6
defrag"), a netperf UDP_STREAM test[0] using big IPv6 datagrams (thus
generating many fragments) and running over an IPsec tunnel, reported
more than 6Gbps throughput. After that patch, the same test gets only
9Mbps when receiving on a be2net nic (driver can make a big difference
here, for example, ixgbe doesn't seem to be affected).

By reusing the IPv4 defragmentation code, IPv6 lost fragment coalescing
(IPv4 fragment coalescing was dropped by commit 14fe22e334 ("Revert
"ipv4: use skb coalescing in defragmentation"")).

Without fragment coalescing, be2net runs out of Rx ring entries and
starts to drop frames (ethtool reports rx_drops_no_frags errors). Since
the netperf traffic is only composed of UDP fragments, any lost packet
prevents reassembly of the full datagram. Therefore, fragments which
have no possibility to ever get reassembled pile up in the reassembly
queue, until the memory accounting exeeds the threshold. At that point
no fragment is accepted anymore, which effectively discards all
netperf traffic.

When reassembly timeout expires, some stale fragments are removed from
the reassembly queue, so a few packets can be received, reassembled
and delivered to the netperf receiver. But the nic still drops frames
and soon the reassembly queue gets filled again with stale fragments.
These long time frames where no datagram can be received explain why
the performance drop is so significant.

Re-introducing fragment coalescing is enough to get the initial
performances again (6.6Gbps with be2net): driver doesn't drop frames
anymore (no more rx_drops_no_frags errors) and the reassembly engine
works at full speed.

This patch is quite conservative and only coalesces skbs for local
IPv4 and IPv6 delivery (in order to avoid changing skb geometry when
forwarding). Coalescing could be extended in the future if need be, as
more scenarios would probably benefit from it.

[0]: Test configuration
Sender:
ip xfrm policy flush
ip xfrm state flush
ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1
ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir in tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow
ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1
ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir out tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow
netserver -D -L fc00:2::1

Receiver:
ip xfrm policy flush
ip xfrm state flush
ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1
ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir in tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow
ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1
ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir out tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow
netperf -H fc00:2::1 -f k -P 0 -L fc00:1::1 -l 60 -t UDP_STREAM -I 99,5 -i 5,5 -T5,5 -6

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08 15:55:10 -07:00
..
acpi It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
asm-generic asm-generic: fix -Wtype-limits compiler warnings 2019-08-03 07:02:01 -07:00
clocksource clocksource/drivers: Continue making Hyper-V clocksource ISA agnostic 2019-07-03 11:00:59 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
drm drm/fb-helper: Instanciate shadow FB if configured in device's mode_config 2019-08-01 15:01:35 +02:00
dt-bindings ARM: Device-tree updates 2019-07-19 17:19:24 -07:00
keys request_key improvements 2019-07-08 19:19:37 -07:00
kvm KVM: arm/arm64: Support chained PMU counters 2019-07-05 13:56:22 +01:00
linux Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-08-06 17:11:59 -07:00
math-emu
media media updates for v5.3-rc1 2019-07-09 09:47:22 -07:00
misc powerpc updates for 5.3 2019-07-13 16:08:36 -07:00
net inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
pcmcia It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
ras
rdma RDMA/devices: Remove the lock around remove_client_context 2019-08-01 11:44:48 -04:00
scsi scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure 2019-07-29 21:12:35 -04:00
soc Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental into mauro 2019-07-22 13:51:20 -06:00
sound SPDX fixes for 5.3-rc2 2019-07-28 10:00:06 -07:00
target
trace tracing: Fix header include guards in trace event headers 2019-07-30 21:49:06 -04:00
uapi Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-08-06 17:11:59 -07:00
vdso vdso: Remove superfluous #ifdef __KERNEL__ in vdso/datapage.h 2019-06-26 07:28:09 +02:00
video drm main pull request for v5.3-rc1 (sans mm changes) 2019-07-15 19:04:27 -07:00
xen xen: avoid link error on ARM 2019-07-31 08:14:12 +02:00
Kbuild kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist. 2019-07-21 11:43:43 -07:00