linux-sg2042/net
Koen De Schepper aecfde2310 tcp: Ensure DCTCP reacts to losses
RFC8257 §3.5 explicitly states that "A DCTCP sender MUST react to
loss episodes in the same way as conventional TCP".

Currently, Linux DCTCP performs no cwnd reduction when losses
are encountered. Optionally, the dctcp_clamp_alpha_on_loss resets
alpha to its maximal value if a RTO happens. This behavior
is sub-optimal for at least two reasons: i) it ignores losses
triggering fast retransmissions; and ii) it causes unnecessary large
cwnd reduction in the future if the loss was isolated as it resets
the historical term of DCTCP's alpha EWMA to its maximal value (i.e.,
denoting a total congestion). The second reason has an especially
noticeable effect when using DCTCP in high BDP environments, where
alpha normally stays at low values.

This patch replace the clamping of alpha by setting ssthresh to
half of cwnd for both fast retransmissions and RTOs, at most once
per RTT. Consequently, the dctcp_clamp_alpha_on_loss module parameter
has been removed.

The table below shows experimental results where we measured the
drop probability of a PIE AQM (not applying ECN marks) at a
bottleneck in the presence of a single TCP flow with either the
alpha-clamping option enabled or the cwnd halving proposed by this
patch. Results using reno or cubic are given for comparison.

                          |  Link   |   RTT    |    Drop
                 TCP CC   |  speed  | base+AQM | probability
        ==================|=========|==========|============
                    CUBIC |  40Mbps |  7+20ms  |    0.21%
                     RENO |         |          |    0.19%
        DCTCP-CLAMP-ALPHA |         |          |   25.80%
         DCTCP-HALVE-CWND |         |          |    0.22%
        ------------------|---------|----------|------------
                    CUBIC | 100Mbps |  7+20ms  |    0.03%
                     RENO |         |          |    0.02%
        DCTCP-CLAMP-ALPHA |         |          |   23.30%
         DCTCP-HALVE-CWND |         |          |    0.04%
        ------------------|---------|----------|------------
                    CUBIC | 800Mbps |   1+1ms  |    0.04%
                     RENO |         |          |    0.05%
        DCTCP-CLAMP-ALPHA |         |          |   18.70%
         DCTCP-HALVE-CWND |         |          |    0.06%

We see that, without halving its cwnd for all source of losses,
DCTCP drives the AQM to large drop probabilities in order to keep
the queue length under control (i.e., it repeatedly faces RTOs).
Instead, if DCTCP reacts to all source of losses, it can then be
controlled by the AQM using similar drop levels than cubic or reno.

Signed-off-by: Koen De Schepper <koen.de_schepper@nokia-bell-labs.com>
Signed-off-by: Olivier Tilmans <olivier.tilmans@nokia-bell-labs.com>
Cc: Bob Briscoe <research@bobbriscoe.net>
Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <borkmann@iogearbox.net>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 10:51:16 -07:00
..
6lowpan 6lowpan: fix debugfs_simple_attr.cocci warnings 2019-01-22 09:51:19 +01:00
9p 9p/net: fix memory leak in p9_client_create 2019-03-13 11:50:04 +01:00
802
8021q net: Remove switchdev.h inclusion from team/bond/vlan 2019-02-24 17:40:46 -08:00
appletalk appletalk: Fix potential NULL pointer dereference in unregister_snap_client 2019-03-15 11:25:48 -07:00
atm net: atm: Add another IS_ENABLED(CONFIG_COMPAT) in atm_dev_ioctl 2019-03-07 10:14:50 -08:00
ax25 ax25: fix possible use-after-free 2019-01-23 11:18:00 -08:00
batman-adv batman-adv: Fix genl notification for throughput_override 2019-03-25 09:31:19 +01:00
bluetooth Bluetooth: Add quirk for reading BD_ADDR from fwnode property 2019-02-26 10:08:26 +01:00
bpf bpf: fix warning about using plain integer as NULL 2019-03-08 21:17:07 +01:00
bpfilter bpfilter: re-add header search paths to tools include to fix build error 2019-02-23 13:34:40 -08:00
bridge netfilter: bridge: set skb transport_header before entering NF_INET_PRE_ROUTING 2019-03-18 16:21:54 +01:00
caif net: caif: use skb helpers instead of open-coding them 2019-02-17 11:01:17 -08:00
can can: bcm: check timer values before ktime conversion 2019-01-22 11:33:46 +01:00
ceph libceph: wait for latest osdmap in ceph_monc_blacklist_add() 2019-03-20 16:27:40 +01:00
core net-gro: Fix GRO flush when receiving a GSO packet. 2019-04-03 21:40:52 -07:00
dcb
dccp dccp: Fix memleak in __feat_register_sp 2019-04-01 18:15:10 -07:00
decnet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-01-29 21:18:54 -08:00
dns_resolver
dsa net: dsa: Implement flow_dissect callback for tag_qca 2019-03-28 16:57:19 -07:00
ethernet net/ethernet: Add parse_protocol header_ops support 2019-02-22 12:55:31 -08:00
hsr net/hsr: fix possible crash in add_timer() 2019-03-07 11:02:08 -08:00
ieee802154 net: remove unused struct inet_frag_queue.fragments field 2019-02-26 08:27:05 -08:00
ife
ipv4 tcp: Ensure DCTCP reacts to losses 2019-04-04 10:51:16 -07:00
ipv6 ipv6: Fix dangling pointer when ipv6 fragment 2019-04-03 21:42:20 -07:00
iucv iucv: Remove SKB list assumptions. 2018-11-10 16:55:11 -08:00
kcm kcm: switch order of device registration to fix a crash 2019-04-01 14:59:20 -07:00
key af_key: unconditionally clone on broadcast 2019-02-12 10:36:42 +01:00
l2tp l2tp: fix infoleak in l2tp_ip6_recvmsg() 2019-03-13 14:19:35 -07:00
l3mdev l3mdev: add function to retreive upper master 2018-12-03 14:15:26 -08:00
lapb
llc llc: do not use sk_eat_skb() 2018-10-22 19:59:20 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-02-24 12:06:19 -08:00
mac802154
mpls mpls: Fix 6PE forwarding 2019-03-19 16:00:22 -07:00
ncsi net: ncsi: fix a missing check for nla_nest_start 2019-03-16 11:44:33 -07:00
netfilter netfilter: nf_tables: add missing ->release_ops() in error path of newrule() 2019-03-20 08:32:58 +01:00
netlabel netlabel: fix out-of-bounds memory accesses 2019-02-27 21:45:24 -08:00
netlink genetlink: Fix a memory leak on error path 2019-03-21 09:29:06 -07:00
netrom netrom: switch to sock timer API 2019-01-27 10:38:04 -08:00
nfc nfc: Fix to check for kmemdup failure 2019-03-19 13:48:07 -07:00
nsh
openvswitch openvswitch: fix flow actions reallocation 2019-03-28 17:15:44 -07:00
packet net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec 2019-03-20 10:46:50 -07:00
phonet phonet: fix building with clang 2019-02-21 16:23:56 -08:00
psample
qrtr mm: replace all open encodings for NUMA_NO_NODE 2019-03-05 21:07:14 -08:00
rds net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock(). 2019-03-28 17:17:18 -07:00
rfkill rfkill: gpio: Remove unused include 2018-12-18 13:13:56 +01:00
rose net: rose: fix a possible stack overflow 2019-03-18 16:53:22 -07:00
rxrpc rxrpc: avoid clang -Wuninitialized warning 2019-03-23 21:48:30 -04:00
sched net/sched: act_sample: fix divide by zero in the traffic path 2019-04-04 10:46:33 -07:00
sctp sctp: initialize _pad of sockaddr_in before copying to user memory 2019-04-01 18:08:19 -07:00
smc Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-03-12 13:27:20 -07:00
strparser net: strparser: fix a missing check for create_singlethread_workqueue 2019-03-15 12:51:56 -07:00
sunrpc SUNRPC: fix uninitialized variable warning 2019-03-26 13:04:32 -07:00
switchdev switchdev: Remove unused transaction item queue 2019-03-01 21:35:19 -08:00
tipc tipc: handle the err returned from cmd header function 2019-03-31 16:45:57 -07:00
tls net: tls: prevent false connection termination with offload 2019-03-29 13:38:50 -07:00
unix io_uring-2019-03-06 2019-03-08 14:48:40 -08:00
vmw_vsock vsock/virtio: fix kernel panic from virtio_transport_reset_no_sock 2019-03-08 15:15:44 -08:00
wimax
wireless Merge remote-tracking branch 'net-next/master' into mac80211-next 2019-02-22 13:48:13 +01:00
x25 net/x25: reset state in x25_connect() 2019-03-11 15:40:14 -07:00
xdp xsk: fix umem memory leak on cleanup 2019-03-16 01:27:51 +01:00
xfrm xfrm: Fix inbound traffic via XFRM interfaces across network namespaces 2019-02-18 10:58:54 +01:00
Kconfig net: devlink: turn devlink into a built-in 2019-02-26 08:49:05 -08:00
Makefile net: split out functions related to registering inflight socket files 2019-02-28 08:24:23 -07:00
compat.c Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-05 14:08:26 -08:00
socket.c net: add documentation to socket.c 2019-03-15 15:29:47 -07:00
sysctl_net.c