linux-sg2042

Eric Dumazet 605ad7f184 tcp: refine TSO autosizing	2014-12-09 16:39:22 -05:00

Commit `95bd09eb27` ("tcp: TSO packets automatic sizing") tried to control TSO size, but did this at the wrong place (sendmsg() time)

At sendmsg() time, we might have a pessimistic view of flow rate, and we end up building very small skbs (with 2 MSS per skb).

This is bad because :

- It sends small TSO packets even in Slow Start where rate quickly increases.
- It tends to make socket write queue very big, increasing tcp_ack() processing time, but also increasing memory needs, not necessarily accounted for, as fast clones overhead is currently ignored.
- Lower GRO efficiency and more ACK packets.

Servers with a lot of small lived connections suffer from this.

Lets instead fill skbs as much as possible (64KB of payload), but split them at xmit time, when we have a precise idea of the flow rate. skb split is actually quite efficient.

Patch looks bigger than necessary, because TCP Small Queue decision now has to take place after the eventual split.

As Neal suggested, introduce a new tcp_tso_autosize() helper, so that tcp_tso_should_defer() can be synchronized on same goal.

Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable contains number of mss that we can put in GSO packet, and is not related to the autosizing goal anymore.

Tested:

40 ms rtt link

    nstat >/dev/null
    netperf -H remote -l -2000000 -- -s 1000000
    nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets"

Before patch :

    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/s

     87380 2000000 2000000    0.36         44.22
    IpInReceives                    600                0.0
    IpOutRequests                   599                0.0
    TcpOutSegs                      1397               0.0
    IpExtOutOctets                  2033249            0.0

After patch :

    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380 2000000 2000000    0.36         44.27
    IpInReceives                    221                0.0
    IpOutRequests                   232                0.0
    TcpOutSegs                      1397               0.0
    IpExtOutOctets                  2013953            0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
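
The commit message describes choosing the TSO packet size at xmit time from the current flow rate, through a tcp_tso_autosize() helper. The sketch below is a userspace illustration of that idea only, not the kernel code. The struct, its field names, the function name, the roughly 1 ms sizing budget (rate >> 10), and the example values are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for per-socket state; not a kernel structure. */
struct flow_sketch {
    uint64_t pacing_rate;   /* estimated flow rate, bytes per second */
    uint32_t gso_max_bytes; /* largest payload one GSO packet may carry */
    uint32_t gso_max_segs;  /* device limit on segments per GSO packet */
    uint32_t min_tso_segs;  /* floor on segments (assumed 2 in the usage note) */
};

/* Return how many MSS-sized segments to put in the next TSO packet.
 * Assumption: budget is about 1 ms of data at the current rate
 * (rate >> 10 approximates rate / 1000), clamped to the limits above.
 */
static uint32_t tso_autosize_sketch(const struct flow_sketch *f, uint32_t mss)
{
    uint64_t bytes;
    uint32_t segs;

    assert(mss > 0);

    bytes = f->pacing_rate >> 10;
    if (bytes > f->gso_max_bytes)
        bytes = f->gso_max_bytes;

    segs = (uint32_t)(bytes / mss);
    if (segs < f->min_tso_segs)
        segs = f->min_tso_segs;   /* slow flows still send the minimum */
    if (segs > f->gso_max_segs)
        segs = f->gso_max_segs;   /* never exceed the device limit */

    return segs;
}
```

With an MSS of 1448 bytes and a 64KB payload cap, a flow estimated at 10,000,000 bytes per second would get 6 segments per packet under these assumptions, a very slow flow falls back to the 2-segment minimum, and a very fast flow is capped by the payload limit. Because the decision is taken at xmit time, the same full-sized skb can be split differently as the rate estimate improves.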
..
netfilter	netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one module	2014-11-27 13:08:42 +01:00
Kconfig	net: Move fou_build_header into fou.c and refactor	2014-11-05 16:30:02 -05:00
Makefile	net: Add Geneve tunneling protocol driver	2014-10-06 00:32:20 -04:00
af_inet.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-29 20:47:48 -08:00
ah4.c	ipsec: Remove obsolete MAX_AH_AUTH_LEN	2014-09-18 10:54:36 +02:00
arp.c	neigh: remove dynamic neigh table registration support	2014-11-11 15:23:54 -05:00
cipso_ipv4.c	cipso: remove NULL assignment on static	2014-11-04 15:09:52 -05:00
datagram.c	net: Save TX flow hash in sock and set in skbuf on xmit	2014-07-07 21:14:21 -07:00
devinet.c	ipv4: fail early when creating netdev named all or default	2014-07-29 11:43:50 -07:00
esp4.c	net: esp: Convert NETDEBUG to pr_info	2014-11-06 15:11:10 -05:00
fib_frontend.c	ipv4: Restore accept_local behaviour in fib_validate_source()	2014-08-22 12:23:10 -07:00
fib_lookup.h	ipv4: make fib_detect_death static	2013-12-28 17:01:46 -05:00
fib_rules.c	ipv4: Fix incorrect error code when adding an unreachable route	2014-11-16 14:11:45 -05:00
fib_semantics.c	ipv4: fix nexthop attlen check in fib_nh_match	2014-10-14 15:59:37 -04:00
fib_trie.c	list: fix order of arguments for hlist_add_after(_rcu)	2014-08-06 18:01:24 -07:00
fou.c	gue: Call remcsum_adjust	2014-11-26 12:25:44 -05:00
geneve.c	vlan: introduce *vlan_hwaccel_push_inside helpers	2014-11-21 14:20:17 -05:00
gre_demux.c	net: Fix GRE RX to use skb_transport_header for GRE header offset	2014-09-08 15:23:05 -07:00
gre_offload.c	gre: Use inner mac length when computing tunnel length	2014-10-30 19:51:56 -04:00
icmp.c	icmp: Remove some spurious dropped packet profile hits from the ICMP path	2014-11-18 15:28:28 -05:00
igmp.c	ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs	2014-11-16 16:55:06 -05:00
inet_connection_sock.c	ipv4: make ip_local_reserved_ports per netns	2014-05-14 15:31:45 -04:00
inet_diag.c	inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets	2014-01-13 22:35:46 -08:00
inet_fragment.c	net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited	2014-11-11 14:10:31 -05:00
inet_hashtables.c	net: use reciprocal_scale() helper	2014-08-23 12:21:21 -07:00
inet_lro.c	lro: remove dead code	2013-12-29 16:34:25 -05:00
inet_timewait_sock.c	tcp/dccp: remove twchain	2013-10-08 23:19:24 -04:00
inetpeer.c	inet: remove dead inetpeer sequence code	2014-09-08 16:42:42 -07:00
ip_forward.c	net: rename local_df to ignore_df	2014-05-12 14:03:41 -04:00
ip_fragment.c	net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited	2014-11-11 14:10:31 -05:00
ip_gre.c	fou: Fix typo in returning flags in netlink	2014-11-05 22:18:20 -05:00
ip_input.c	net: Fix memory leak if TPROXY used with TCP early demux	2014-01-27 16:22:11 -08:00
ip_options.c	ipv4: rename ip_options_echo to __ip_options_echo()	2014-09-28 16:35:42 -04:00
ip_output.c	net; ipv[46] - Remove 2 unnecessary NETDEBUG OOM messages	2014-11-06 15:11:10 -05:00
ip_sockglue.c	net-timestamp: allow reading recv cmsg on errqueue with origin tstamp	2014-12-08 20:20:48 -05:00
ip_tunnel.c	ip_tunnel: Ops registration for secondary encap (fou, gue)	2014-11-12 15:01:35 -05:00
ip_tunnel_core.c	ipv4: fix a potential use after free in ip_tunnel_core.c	2014-10-17 23:45:26 -04:00
ip_vti.c	ip_tunnel: the lack of vti_link_ops' dellink() cause kernel panic	2014-11-23 21:11:17 -05:00
ipcomp.c	ipcomp4: Use the IPsec protocol multiplexer API	2014-02-25 07:04:17 +01:00
ipconfig.c	ipv4: remove 0/NULL assignment on static	2014-11-04 15:09:52 -05:00
ipip.c	fou: Fix typo in returning flags in netlink	2014-11-05 22:18:20 -05:00
ipmr.c	net: set name_assign_type in alloc_netdev()	2014-07-15 16:12:48 -07:00
netfilter.c	netfilter: remove double colon	2014-02-19 11:41:25 +01:00
ping.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-29 20:47:48 -08:00
proc.c	tcp_cubic: add SNMP counters to track how effective is Hystart	2014-12-09 14:58:23 -05:00
protocol.c	net: Export inet_offloads and inet6_offloads	2014-09-19 17:15:31 -04:00
raw.c	ipv4: Avoid reading user iov twice after raw_probe_proto_opt	2014-11-10 14:25:35 -05:00
route.c	ipv4: Do not cache routing failures due to disabled forwarding.	2014-10-30 19:20:40 -04:00
syncookies.c	net: allow setting ecn via routing table	2014-11-04 16:06:09 -05:00
sysctl_net_ipv4.c	tcp: allow for bigger reordering level	2014-10-29 15:05:15 -04:00
tcp.c	tcp: refine TSO autosizing	2014-12-09 16:39:22 -05:00
tcp_bic.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_cong.c	tcp: spelling s/plugable/pluggable	2014-11-04 15:09:52 -05:00
tcp_cubic.c	tcp_cubic: refine Hystart delay threshold	2014-12-09 14:58:23 -05:00
tcp_dctcp.c	net: tcp: add DCTCP congestion control algorithm	2014-09-29 00:13:10 -04:00
tcp_diag.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_fastopen.c	tcp: remove unnecessary assignment.	2014-09-29 12:31:12 -04:00
tcp_highspeed.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_htcp.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_hybla.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_illinois.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_input.c	new helper: memcpy_from_msg()	2014-11-24 04:28:48 -05:00
tcp_ipv4.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-29 20:47:48 -08:00
tcp_lp.c	tcp: remove in_flight parameter from cong_avoid() methods	2014-05-03 19:23:07 -04:00
tcp_memcontrol.c	percpu_counter: add @gfp to percpu_counter_init()	2014-09-08 09:51:29 +09:00
tcp_metrics.c	tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic	2014-08-14 14:38:54 -07:00
tcp_minisocks.c	tcp: change TCP_ECN prefixes to lower case	2014-09-29 14:41:22 -04:00
tcp_offload.c	net: Remove MPLS GSO feature.	2014-11-05 23:52:33 -08:00
tcp_output.c	tcp: refine TSO autosizing	2014-12-09 16:39:22 -05:00
tcp_probe.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_scalable.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_timer.c	net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited	2014-11-11 14:10:31 -05:00
tcp_vegas.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_vegas.h	net: ipv4/ipv6: Remove extern from function prototypes	2013-10-19 19:12:11 -04:00
tcp_veno.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tcp_westwood.c	net: tcp: split ack slow/fast events from cwnd_event	2014-09-29 00:13:10 -04:00
tcp_yeah.c	tcp: whitespace fixes	2014-09-01 18:12:45 -07:00
tunnel4.c	net: Convert printks to pr_<level>	2012-03-11 23:42:51 -07:00
udp.c	udp: Neaten and reduce size of compute_score functions	2014-12-08 20:28:47 -05:00
udp_diag.c	netlink: rename ssk to sk in struct netlink_skb_params	2013-04-19 14:57:56 -04:00
udp_impl.h	net: ipv4/ipv6: Remove extern from function prototypes	2013-10-19 19:12:11 -04:00
udp_offload.c	net: Remove MPLS GSO feature.	2014-11-05 23:52:33 -08:00
udp_tunnel.c	udp-tunnel: Add a few more UDP tunnel APIs	2014-09-19 15:57:15 -04:00
udplite.c	net: Eliminate no_check from protosw	2014-05-23 16:28:53 -04:00
xfrm4_input.c	xfrm4: Add IPsec protocol multiplexer	2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c	ipv4: ERROR: code indent should use tabs where possible	2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c	…
xfrm4_mode_tunnel.c	inetpeer: get rid of ip_id_count	2014-06-02 11:00:41 -07:00
xfrm4_output.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-05-24 00:32:30 -04:00
xfrm4_policy.c	xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly	2014-03-14 07:28:07 +01:00
xfrm4_protocol.c	xfrm4: Remove duplicate semicolon	2014-06-30 07:49:47 +02:00
xfrm4_state.c	inet: make no_pmtu_disc per namespace and kill ipv4_config	2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c	sit: add IPv4 over IPv4 support	2013-05-31 17:19:05 -07:00