OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Eric Dumazet	740b0f1841	tcp: switch rtt estimations to usec resolution Upcoming congestion controls for TCP require usec resolution for RTT estimations. Millisecond resolution is simply not enough these days. FQ/pacing in DC environments also require this change for finer control and removal of bimodal behavior due to the current hack in tcp_update_pacing_rate() for 'small rtt' TCP_CONG_RTT_STAMP is no longer needed. As Julian Anastasov pointed out, we need to keep user compatibility : tcp_metrics used to export RTT and RTTVAR in msec resolution, so we added RTT_US and RTTVAR_US. An iproute2 patch is needed to use the new attributes if provided by the kernel. In this example ss command displays a srtt of 32 usecs (10Gbit link) lpk51:~# ./ss -i dst lpk52 Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port tcp ESTAB 0 1 10.246.11.51:42959 10.246.11.52:64614 cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559 Updated iproute2 ip command displays : lpk51:~# ./ip tcp_metrics \| grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51 Old binary displays : lpk51:~# ip tcp_metrics \| grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51 With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Larry Brakmo <brakmo@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 17:08:40 -05:00
Stanislav Fomichev	45f7435968	tcp: remove unused min_cwnd member of tcp_congestion_ops Commit `684bad1107` "tcp: use PRR to reduce cwin in CWR state" removed all calls to min_cwnd, so we can safely remove it. Also, remove tcp_reno_min_cwnd because it was only used for min_cwnd. Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-13 18:22:34 -05:00
Yuchung Cheng	9f9843a751	tcp: properly handle stretch acks in slow start Slow start now increases cwnd by 1 if an ACK acknowledges some packets, regardless the number of packets. Consequently slow start performance is highly dependent on the degree of the stretch ACKs caused by receiver or network ACK compression mechanisms (e.g., delayed-ACK, GRO, etc). But slow start algorithm is to send twice the amount of packets of packets left so it should process a stretch ACK of degree N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A follow up patch will use the remainder of the N (if greater than 1) to adjust cwnd in the congestion avoidance phase. In addition this patch retires the experimental limited slow start (LSS) feature. LSS has multiple drawbacks but questionable benefit. The fractional cwnd increase in LSS requires a loop in slow start even though it's rarely used. Configuring such an increase step via a global sysctl on different BDPS seems hard. Finally and most importantly the slow start overshoot concern is now better covered by the Hybrid slow start (hystart) enabled by default. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-04 19:57:59 -05:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Stephen Hemminger	a252bebe22	tcp: mark tcp_congestion_ops read_mostly Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-10 00:40:17 -08:00
Joe Perches	9d4fb27db9	net/ipv4: Move && and \|\| to end of previous line On Sun, 2009-11-22 at 16:31 -0800, David Miller wrote: > It should be of the form: > if (x && > y) > > or: > if (x && y) > > Fix patches, rather than complaints, for existing cases where things > do not follow this pattern are certainly welcome. Also collapsed some multiple tabs to single space. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-23 10:41:23 -08:00
Ilpo Järvinen	c3a05c6050	[TCP]: Cong.ctrl modules: remove unused good_ack from cong_avoid Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:41 -08:00
Stephen Hemminger	30cfd0baf0	[TCP]: congestion control API pass RTT in microseconds This patch changes the API for the callback that is done after an ACK is received. It solves a couple of issues: * Some congestion controls want higher resolution value of RTT (controlled by TCP_CONG_RTT_SAMPLE flag). These don't really want a ktime, but all compute a RTT in microseconds. * Other congestion control could use RTT at jiffies resolution. To keep API consistent the units should be the same for both cases, just the resolution should change. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:27:57 -07:00
Stephen Hemminger	16751347a0	[TCP]: remove unused argument to cong_avoid op None of the existing TCP congestion controls use the rtt value pased in the ca_ops->cong_avoid interface. Which is lucky because seq_rtt could have been -1 when handling a duplicate ack. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-18 01:46:58 -07:00
Ilpo Järvinen	b9ce204f0a	[TCP]: Congestion control API RTT sampling fix Commit `164891aadf` broke RTT sampling of congestion control modules. Inaccurate timestamps could be fed to them without providing any way for them to identify such cases. Previously RTT sampler was called only if FLAG_RETRANS_DATA_ACKED was not set filtering inaccurate timestamps nicely. In addition, the new behavior could give an invalid timestamp (zero) to RTT sampler if only skbs with TCPCB_RETRANS were ACKed. This solves both problems. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-15 15:08:43 -07:00
YOSHIFUJI Hideaki	84299b3bc4	[TCP]: Fix linkage errors on i386. To avoid raw division, use ktime_to_timeval() to get usec. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:49 -07:00
Stephen Hemminger	164891aadf	[TCP]: Congestion control API update. Do some simple changes to make congestion control API faster/cleaner. * use ktime_t rather than timeval * merge rtt sampling into existing ack callback this means one indirect call versus two per ack. * use flags bits to store options/settings Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:45 -07:00
Wong Hoi Sing Edison	bfbea8a886	[TCP] tcp-lp: prevent chance for oops This patch fix the chance for tcp_lp_remote_hz_estimator return 0, if 0 < rhz < 64. It also make sure the flag LP_VALID_RHZ is set correctly. Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:07 -07:00
Alexey Dobriyan	298969727e	[TCP] tcp_lp: use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:03 -07:00
Dave Jones	bf0d52492d	[NET]: Remove unnecessary config.h includes from net/ config.h is automatically included by kbuild these days. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:21 -07:00
Wong Hoi Sing Edison	3795da47e8	[TCP] tcp-lp: bug fix for oops in 2.6.18-rc6 Sorry that the patch submited yesterday still contain a small bug. This version have already been test for hours with BT connections. The oops is now difficult to reproduce. Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:09 -07:00
Wong Hoi Sing Edison	7c106d7e78	[TCP]: TCP Low Priority congestion control TCP Low Priority is a distributed algorithm whose goal is to utilize only the excess network bandwidth as compared to the ``fair share`` of bandwidth as targeted by TCP. Available from: http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf Original Author: Aleksandar Kuzmanovic <akuzma@northwestern.edu> See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation. As of 2.6.13, Linux supports pluggable congestion control algorithms. Due to the limitation of the API, we take the following changes from the original TCP-LP implementation: o We use newReno in most core CA handling. Only add some checking within cong_avoid. o Error correcting in remote HZ, therefore remote HZ will be keeped on checking and updating. o Handling calculation of One-Way-Delay (OWD) within rtt_sample, sicne OWD have a similar meaning as RTT. Also correct the buggy formular. o Handle reaction for Early Congestion Indication (ECI) within pkts_acked, as mentioned within pseudo code. o OWD is handled in relative format, where local time stamp will in tcp_time_stamp format. Port from 2.4.19 to 2.6.16 as module by: Wong Hoi Sing Edison <hswong3i@gmail.com> Hung Hing Lun <hlhung3i@gmail.com> Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:21 -07:00

17 Commits