166 Commits

Author SHA1 Message Date
Vishesh Yadav 1976f2c371 FlowTransport: Sample connect latencies 2020-11-10 12:12:01 -08:00
Vishesh Yadav 7bdcb01bdf FlowTransport: Count number of incoming/outgoing/failed connections with logging 2020-11-02 18:51:48 -08:00
Evan Tschannen 52828f9e03 Added bytesSent to the PingLatency logging; increasing the logging interval from 1 second to 3 seconds 2020-10-11 23:05:21 -07:00
Evan Tschannen e82bd2a6ff added a suppressFor just in case a warning is spammy 2020-10-06 18:27:19 -07:00
Evan Tschannen d59ee9cc30 add a sample when a ping times out 2020-10-06 17:22:33 -07:00
Evan Tschannen 1eccabc285 added logging for a scenario that should not happen 2020-10-06 17:19:02 -07:00
Evan Tschannen eddf8530d8 fix compile errors 2020-10-06 17:09:14 -07:00
Evan Tschannen 12d2f3a8f3 fixed includes 2020-10-06 17:01:13 -07:00
Evan Tschannen 2c97a7e160 added missing include 2020-10-06 16:56:52 -07:00
Evan Tschannen 822312b31d only track latencies to public network addresses; use a continousSample to get median and p90 latencies 2020-10-06 16:55:35 -07:00
Evan Tschannen 2166f9a3dd added logging about bytes received 2020-10-06 16:07:35 -07:00
Evan Tschannen 9efda1fec5 added logging for the ping latencies for all network connections 2020-10-06 13:58:05 -07:00
Evan Tschannen b4a1c269b8 lastConnectTime was not being set for incomingConnections 2020-08-31 09:10:30 -07:00
Evan Tschannen c9ff450a36 do not reject a connection as redundant if our existing connection is more than 15 seconds old 2020-08-30 18:49:49 -07:00
Evan Tschannen f6163d0a79 fix compile errors 2020-07-09 22:53:02 -07:00
Evan Tschannen 717242a0ee reset WAN network connections every 5 minutes if responses take more than 500ms 2020-07-09 22:50:47 -07:00
Evan Tschannen a835123680 exit fdbserver after a ReceiverError, because a packet which should be delivered will never be received 2020-04-22 23:38:46 -07:00
Evan Tschannen 810bba2067 cleanup calls to FlowTransport::isClient() 2020-04-22 23:36:40 -07:00
Evan Tschannen 976c2fc7a8
Update fdbrpc/FlowTransport.actor.cpp
Co-Authored-By: Alex Miller <35046903+alexmiller-apple@users.noreply.github.com>
2020-03-04 16:13:59 -08:00
Evan Tschannen 820957025f accept connections in batches of 20 to improve performance 2020-03-04 14:24:57 -08:00
Evan Tschannen 65fbe0d0bc revert AcceptSocket priority change because of bad performance results 2020-02-21 19:22:14 -08:00
Evan Tschannen dc3826e2fd fix: tls throttling would re-insert the failure into the map 2020-02-20 18:17:39 -08:00
Evan Tschannen f04e311a1e Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
# Conflicts:
#	fdbserver/SimulatedCluster.actor.cpp
#	flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Evan Tschannen 3c4d551647 improve prioritization of connection monitor and listen given that listen is no longer expensive (because handshake is done separately) 2020-02-19 18:50:21 -08:00
Evan Tschannen 46d5f5e325 do not trigger the resetPing if we cannot actually remove the peer, because it will cause us to reset the timeout, so repeated calls to removePeer can keep a dead peer from being removed 2020-02-19 15:17:50 -08:00
Evan Tschannen 69de430057 separate handshaking from connection to improve pipelining 2020-02-06 16:45:54 -08:00
Evan Tschannen 53d0867a17 limit the number of connections a process can attempt to establish in parallel 2020-02-04 18:15:10 -08:00
Evan Tschannen 84853dd1fd switched SSL implementation to use boost ssl 2020-02-04 14:56:40 -08:00
Evan Tschannen 9e137d3b49 fix: addPeerReference only marks a connection as healthy if it is the first peerReference
added additional logging to long LoadBalance calls, and when the failure monitor state changes for an address
2019-12-19 18:26:29 -08:00
Andrew Noyes 46d10dc7dc Fix "null passed as argument declared not null"
Fix several such reports from ubsan

E.g.

/Users/anoyes/workspace/foundationdb/flow/Arena.h:794:16: runtime error: null pointer passed as argument 1, which is declared to never be null
2019-12-03 14:46:53 -08:00
Evan Tschannen 067dc55bfb fix: making _conn a state variable was keeping connections open that should be closed 2019-11-21 16:08:32 -08:00
Evan Tschannen 569c6d4476 throws of connection_failed() from net()->connect did not result in clients marking a connection as failed in the failure monitor 2019-11-21 13:08:59 -08:00
A.J. Beamon aad9fa3baa Don't check for too many connections closed on client connections 2019-11-13 13:00:43 -08:00
A.J. Beamon ef801a6432 Rename LargePacket warnings to distinguish between sent and received packets. Also remove Net2_ prefix from packet size trace events. 2019-11-12 09:23:46 -08:00
A.J. Beamon 562ce17eca Initialize outgoingConnectionIdle in the constructor. Add back line to connectionKeeper that is needed in some looping cases 2019-10-10 12:48:35 -07:00
A.J. Beamon ad8604f24a Fix spurious ConnectionClosed event when starting a connection. 2019-10-10 10:34:44 -07:00
Vishesh Yadav cf56b005e8 Add comment for pinging incompatible clients
If client is incompatible, connectionMonitor relies on peer->resetPing
to be triggered whenever data is received to prevent ping timeout.
The server stopped sending pings since 6.2 which meant resetPing
doesn't get triggered.
2019-08-30 11:17:22 -07:00
Evan Tschannen 8fc28dd730 fix: continue pinging incompatible clients from the servers so that the client knows the server process is active 2019-08-29 16:51:03 -07:00
Evan Tschannen 1c0484cffc fix: do not close connections which have outstanding tryGetReplies with the peer 2019-08-29 16:49:57 -07:00
Evan Tschannen 0eb0e7a44a made Peer reference counted to avoid other potential bugs involving accessing Peer after it has been destroyed 2019-08-09 11:52:12 -07:00
Evan Tschannen 84fd1003a5 do not close idle network connections with incompatible servers 2019-08-08 23:47:00 -07:00
Evan Tschannen 98b643b7ae fix: connectionReader could access self in its destructor after the object has already been deleted, if an error is thrown from connectionMonitor while still on the stack from scanPackets 2019-08-08 23:35:44 -07:00
mpilman 370ba8b841 Remove --object-serializer flag from executables 2019-08-06 09:25:40 -07:00
mpilman d5caf0c1b4 Merge branch 'flatbuffers-fixes2' of github.com:mpilman/foundationdb into flatbuffers-fixes2 2019-07-16 14:47:40 -07:00
mpilman 75d4b612cf Make object serializer versioned 2019-07-12 11:53:14 -07:00
Andrew Noyes 70f0726185 Support PacketBuffer's of arbitrary size 2019-07-11 23:03:31 -07:00
Vishesh Yadav 4b8eb27134 fdbrpc: Move setStatus line in addPeerReference 2019-07-09 15:01:12 -07:00
Vishesh Yadav 983343978e fdbrpc: ConnectionMonitor should close unreferenced after delay
Potentially for cases where it goes up to 1 immediately.
2019-07-09 14:24:16 -07:00
Vishesh Yadav 22678267cd fdbrpc: Don't drop idle connections from server
Instead try pinging the client and let that decide whether the client
is alive or not. Ideally, it should always be failed since a well
behaved client would have closed the connection.
2019-07-09 14:24:16 -07:00
Vishesh Yadav 1f9c80f633 fdbrpc: Instead of tracking last sent data, track last sent non-ping data
* This will allow client to continue monitoring peer connections while
connection stays open, so that there is no period of "uncertainty"
without previous no-monitoring approach.

* Use multiplier for incoming connection idle timeout

* Update idle connection timeout values and leaked connection timeout in
simulator.
2019-07-09 14:24:16 -07:00