Steve Atherton
813dbfb297
TLS handshake threads are now created with TLS initialization instead of when the network is created, since they may not be needed and this avoids calling startThread() with a custom stack size too early in process execution. startThread() now logs a warning if the given stack size can't be used. Reorganized TLS handshake thread state near the other TLS-related state.
2020-06-23 03:09:56 -07:00
Steve Atherton
c3ce0034bf
Background threads for TLS handshakes use a stack size based on a knob. Platform::startThread() now accepts a stack size. The generic thread pool implementation takes an optional stack size override which is used for each added thread.
2020-06-23 02:08:01 -07:00
Steve Atherton
3853860cf1
Changed default TLS handshake thread count.
2020-06-22 16:28:30 -07:00
Steve Atherton
6330f8b458
TLS handshakes can now be done using up to a configured number of background threads if any of them are not busy.
2020-06-21 06:23:32 -07:00
Evan Tschannen
aed2d34bcb
Merge branch 'master' into feature-proxy-load-balance
...
# Conflicts:
# fdbclient/NativeAPI.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# flow/Knobs.cpp
2020-05-01 09:19:39 -07:00
Evan Tschannen
76fb345dd1
Merge branch 'master' into feature-tree-broadcast
...
# Conflicts:
# fdbrpc/FailureMonitor.actor.cpp
2020-04-29 09:51:22 -07:00
Evan Tschannen
ba3e2af473
Merge commit '5288033bcfe40c3ade97c8bf2d04cf31b3f16cb1' into feature-tree-broadcast
2020-04-17 15:17:37 -07:00
Vishesh Yadav
8c8f23bff2
Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning
2020-04-16 00:45:35 -07:00
A.J. Beamon
d8690d31cd
Merge branch 'master' into per-priority-busy-logging
...
# Conflicts:
# flow/Net2.actor.cpp
2020-04-15 08:31:30 -07:00
A.J. Beamon
b1172417f5
Merge branch 'master' into per-priority-busy-logging
...
# Conflicts:
# flow/Knobs.cpp
# flow/Knobs.h
# flow/Net2.actor.cpp
2020-04-14 14:22:12 -07:00
A.J. Beamon
e104a2e3a6
Merge commit 'cf01233f28a2c42908656a39f458a4475c1d44a3' into run-loop-busy-profiler
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/NativeAPI.actor.h
# fdbserver/fdbserver.actor.cpp
# flow/Net2.actor.cpp
2020-04-14 14:02:24 -07:00
Evan Tschannen
07cc0a8d74
code cleanup
2020-04-10 17:02:11 -07:00
Vishesh Yadav
13447f439f
fdbrpc: Add a constant to onFailedFor()
...
Since we mark an address as failed when a connection fails, this
patch adds a constant to compensate for the time needed to reconnect and
make sure the endpoint is actually down. This constant is equal to
FAILURE_MIN_DELAY, which was used by the centralized
FailureMonitoringClient that was removed earlier.
2020-04-08 19:34:40 -07:00
Vishesh Yadav
975e6b1d9a
Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning
...
Removed merge conflict with old build system.
2020-04-08 19:25:13 -07:00
Vishesh Yadav
fdc1048f75
Add knob to turn off marking unstable connections
2020-04-03 15:53:00 -07:00
Vishesh Yadav
1d35f2ff5a
Mark a connection as failed for X seconds if it closes too often
2020-04-03 15:53:00 -07:00
tclinken
884e92bb49
Atomically update dependent knobs
2020-04-01 15:18:49 -07:00
Evan Tschannen
e08f0201f1
merge release 6.2 into master
2020-03-17 12:51:47 -07:00
A.J. Beamon
d8cfabe73b
Extend the allocation tracing disabling flag to cover more parts of trace logging as a precaution. Make it possible to disable via knob.
2020-03-16 13:59:31 -07:00
Evan Tschannen
303df197cf
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# bindings/c/test/mako/mako.c
# documentation/sphinx/source/release-notes.rst
# fdbbackup/backup.actor.cpp
# fdbclient/NativeAPI.actor.cpp
# fdbclient/NativeAPI.actor.h
# fdbserver/DataDistributionQueue.actor.cpp
# fdbserver/Knobs.cpp
# fdbserver/Knobs.h
# fdbserver/LogRouter.actor.cpp
# fdbserver/SkipList.cpp
# fdbserver/fdbserver.actor.cpp
# flow/CMakeLists.txt
# flow/Knobs.cpp
# flow/Knobs.h
# flow/flow.vcxproj
# flow/flow.vcxproj.filters
# versions.target
2020-03-06 18:22:46 -08:00
Evan Tschannen
39050308ff
lower accept batch size just to be conservative with the change
2020-03-05 18:17:49 -08:00
Evan Tschannen
820957025f
accept connections in batches of 20 to improve performance
2020-03-04 14:24:57 -08:00
Evan Tschannen
924d335aa7
Merge branch 'release-6.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# flow/Knobs.cpp
# flow/Knobs.h
2020-02-25 18:25:19 -08:00
Evan Tschannen
65fbe0d0bc
revert AcceptSocket priority change because of bad performance results
2020-02-21 19:22:14 -08:00
Evan Tschannen
96258b9809
Merge branch 'release-6.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbcli/fdbcli.actor.cpp
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/FlowTransport.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistribution.actor.h
# fdbserver/DataDistributionQueue.actor.cpp
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/QuietDatabase.actor.cpp
# fdbserver/SkipList.cpp
# fdbserver/StorageMetrics.actor.h
# fdbserver/TLogServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/KVStoreTest.actor.cpp
# flow/CMakeLists.txt
# flow/Knobs.cpp
# flow/Knobs.h
# flow/genericactors.actor.cpp
# flow/serialize.h
2020-02-21 19:09:16 -08:00
Evan Tschannen
f04e311a1e
Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
...
# Conflicts:
# fdbserver/SimulatedCluster.actor.cpp
# flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Evan Tschannen
08c318d28a
re-added the connect lock in the fdbcli so that the timeout is not spent before a connection has been initiated (because of the handshake lock)
2020-02-20 10:43:34 -08:00
Evan Tschannen
69b5a1fbe3
more priority improvements
2020-02-20 10:11:43 -08:00
Evan Tschannen
fbd45963d8
The cluster controller waits until no new workers register for 1.0 before starting a bad recruitment
2020-02-19 16:48:30 -08:00
Alex Miller
fe78524bbc
Merge pull request #2678 from sears/networktest_perf
...
Add some tuning knobs to networktestclient; also, measure latency directly
2020-02-19 14:38:09 -08:00
Evan Tschannen
693e469003
Changed the handshake lock to a BoundedFlowLock, which will enforce that old handshakes complete before starting to initiate new handshakes
2020-02-14 16:49:52 -08:00
Russell Sears
7724c644e5
Add some tuning knobs to networktestclient; also, measure latency directly.
2020-02-13 13:11:54 -08:00
Andrew Noyes
1248d2b8b4
Remove USE_OBJECT_SERIALIZER knob
2020-02-12 10:41:52 -08:00
A.J. Beamon
abb75f7eb7
Add logging to indicate the time spent at each priority that exceeds some minimum busyness threshold
2020-02-07 14:34:24 -08:00
Evan Tschannen
69de430057
separate handshaking from connection to improve pipelining
2020-02-06 16:45:54 -08:00
Evan Tschannen
53d0867a17
limit the number of connections a process can attempt to establish in parallel
2020-02-04 18:15:10 -08:00
Evan Tschannen
4524831456
Merge pull request #2518 from vishesh/task/failmon-remove-server
...
FailureMonitoring: Server processes no longer need to talk to ClusterController
2020-02-03 17:22:50 -08:00
A.J. Beamon
182dac7cd5
Convert the slow task profiler into a run loop profiler that also logs when the run loop is 100% busy for a knob-configurable duration.
2020-01-28 12:09:37 -08:00
Evan Tschannen
78adbea834
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# flow/Knobs.h
# versions.target
2020-01-21 21:38:19 -08:00
Evan Tschannen
afd3ec13ff
added knobs
2020-01-21 18:58:34 -08:00
Vishesh Yadav
daef5f011a
Merge remote-tracking branch 'apple/master' into task/failmon-remove-server
2020-01-21 13:20:15 -08:00
Evan Tschannen
3f9d9d8b84
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# cmake/FlowCommands.cmake
# documentation/sphinx/source/release-notes.rst
# fdbclient/StorageServerInterface.h
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# flow/Knobs.h
# flow/Platform.cpp
# versions.target
2020-01-16 18:37:47 -08:00
Evan Tschannen
0e916fdbed
throttle client TLS errors longer than server errors so that when both happen simultaneously the server throttling will be disabled when the client makes its next attempt
2020-01-12 22:12:18 -08:00
Balachandar Namasivayam
741aa523e6
Establishing a TLS connection through the handshake process is expensive, and the fdbserver process can easily become saturated by repeated TLS handshakes when only a few hundred clients have bad certificates. Hence, throttle the number of handshakes done on the server per client IP if the client has a bad certificate.
2020-01-10 16:19:41 -08:00
Vishesh Yadav
598b2eaeb0
fdbrpc: Add warning when peer is unavailable for a long time
2020-01-08 13:55:13 -08:00
Evan Tschannen
83ad9caf54
implemented a load balancing algorithm which evens out the number of requests processed by each proxy
2020-01-08 01:59:01 -08:00
mpilman
7a62d3b526
Changed failure monitor ping delay to 1 second
2019-12-11 11:23:24 -08:00
Evan Tschannen
3c769fcf60
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# versions.target
2019-11-22 15:39:19 -08:00
Evan Tschannen
27cb299d84
simulation can sometimes randomly hang or throw connection_failed, instead of always doing one or the other
2019-11-21 16:24:18 -08:00
Evan Tschannen
2727b91c46
simulation tests network connections failing due to errors instead of just hanging
2019-11-21 12:33:07 -08:00