Commit Graph

79 Commits

Author SHA1 Message Date
mpilman 7a62d3b526 Changed failure monitor ping delay to 1 second 2019-12-11 11:23:24 -08:00
Evan Tschannen 3c769fcf60 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	versions.target
2019-11-22 15:39:19 -08:00
Evan Tschannen 27cb299d84 simulation can sometimes randomly hang or throw connection_failed, instead of always doing one or the other 2019-11-21 16:24:18 -08:00
Evan Tschannen 2727b91c46 simulation tests network connections failing due to errors instead of just hanging 2019-11-21 12:33:07 -08:00
Meng Xu c4d1e6e1a9 Trace:Severity:Include SevNoInfo to mute trace
Define SevFRMutationInfo to trace mutations in restore.
2019-11-04 16:18:40 -08:00
Evan Tschannen 3325980c03 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer_6_0.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.actor.h
#	fdbserver/worker.actor.cpp
#	versions.target
2019-10-24 17:38:15 -07:00
mpilman 325a8e4213 remove confusing USE_ODIRECT knob 2019-10-24 11:44:03 -07:00
mpilman f23392ec5a Don't use O_DIRECT in EIO by default 2019-10-24 11:39:55 -07:00
mpilman 7ad0e20e48 Added knob to disable O_DIRECT 2019-10-24 11:20:14 -07:00
mpilman f41f19b5f6 Introduced knob to set eio parallelism 2019-10-24 11:20:14 -07:00
Evan Tschannen 688940b685 merge 6.2 into master 2019-10-21 11:43:46 -07:00
Evan Tschannen 42b7acf7b7
Merge pull request #2202 from etschannen/feature-share-mutations
Backup and DR would not share mutations if started on different versions of FDB
2019-10-16 20:28:39 -07:00
Evan Tschannen 35ef9b32de fix: if establishing a TLS connection took longer than 10ms, we could spend all our CPU establishing new connections instead of pinging to maintain existing connections, leading to an infinite loop 2019-10-16 17:26:01 -07:00
Evan Tschannen b88b1614c1 bool knobs would improperly report a SevWarnAlways that they were unrecognized 2019-10-01 18:53:04 -07:00
Vishesh Yadav e005ef8e9a
Merge pull request #2030 from kaomakino/kaomakino/quickack
TCP_QUICKACK test
2019-09-10 17:06:12 -07:00
Kao Makino e5bbaa293a Add FLOW_TCP_NODELAY and FLOW_TCP_QUICKACK (Linux-only) knobs 2019-09-10 20:13:15 +00:00
Vishesh Yadav 8a54a57420
Merge pull request #2026 from alexmiller-apple/knobs-for-network-test
Make the networktest payload size a knob so that it can be changed.
2019-09-10 00:11:42 -07:00
Alex Miller b4de920da6 Knobs style is to use `e` or shift instead of writing long numbers. 2019-08-21 18:39:02 -07:00
Alex Miller 49c623826f Make the networktest payload size a knob so that it can be changed. 2019-08-21 18:37:50 -07:00
Evan Tschannen 41b908752e increased move keys parallelism to be less of a decrease just in case lowering this could effect normal data distribution
raised target durability lag versions to give more time for batch limiting to come into play before this limit is hit
changed max_bad_options to better reflect the name
2019-08-21 14:55:21 -07:00
Evan Tschannen 54282288cb disabled zone_id load balancing, because it can cause hot read shards 2019-08-19 14:04:21 -07:00
Evan Tschannen 51cedd24c8 load balance will send reads to remote servers if more than one alternative is failed or overloaded 2019-08-19 13:59:49 -07:00
Evan Tschannen da8163fd5a allow one requests every second to skip there all_alteratives_failed delay, because if a client has a timeout longer than the delay we will never invalidate the key servers cache 2019-08-09 13:03:40 -07:00
Evan Tschannen 36b7b98d3c Buggified the idle_timeout so that it happens in simulation 2019-08-09 12:35:55 -07:00
Evan Tschannen 5f7d3498ea The delay for all_alteratives_failed can scale all the way up to 30.0 at a much slow time ratio 2019-08-09 12:35:19 -07:00
mpilman 370ba8b841 Remove --object-serializer flag from executables 2019-08-06 09:25:40 -07:00
Evan Tschannen 2d136af2bd bool knobs can now be set with the words “true” or “false” instead of just a number 2019-07-30 18:21:24 -07:00
A.J. Beamon 94be9560ea Add cache_memory parameter to fdbserver to control the size of the (4K) page cache. Change the default slighty from 2000 MiB to 2GiB. 2019-07-23 15:05:21 -07:00
Evan Tschannen 5d3e69b6fc
Merge pull request #1820 from fzhjon/load-balance-locality
Introduced a knob that can turn locality on/off
2019-07-16 16:40:43 -07:00
Jon Fu 7d37040725 split the single knob into two for finer-grained control 2019-07-16 12:46:02 -07:00
A.J. Beamon d5051b08dd Make trace event field lengths (and total event sizes) default knobified and configurable. Add a transaction option to control the field length of transaction debug logging. Make the program start command line field less likely to be truncated. 2019-07-12 16:12:35 -07:00
Jon Fu 19a91765f6 introduced a knob that can turn locality on/off 2019-07-09 16:16:15 -07:00
Vishesh Yadav 983343978e fdbrpc: ConnectionMonitor should close unreferenced after delay
Potentially for cases, where it goes up to 1 immediately.
2019-07-09 14:24:16 -07:00
Vishesh Yadav 1f9c80f633 fdbrpc: Instead of tracking last sent data, track last sent non-ping data
* This will allow client to continue monitoring peer connections while
connection stays open, so that there is no period of "uncertainity"
without previous no-monitoring approach.

* Use multiplier for incoming connection idle timeout

* Update idle connection timeout values and leaked connection timeout in
simulator.
2019-07-09 14:24:16 -07:00
Vishesh Yadav 867986cdea fdbrpc: Reduced connection monitoring from clients
This patch does two changes to connection monitoring:

1. Connection monitoring at client side will check if the connection
has been stayed idle for some time. If connection is unused for a
while, we close the connection. There is some weirdness involved here
as ping messages are by themselves are connection traffic. We get over
this by making it two-phase process, first being checking idle
reliable traffic, followed by disabling pings and then checking for
idle unreliable traffic.

2. Connection monitoring of clients from server will no longer send
pings to clients. Instead, it keep monitor the received bytes and
close after certain period of inactivity.
2019-07-09 14:24:16 -07:00
Jingyu Zhou e6ff23c420 Allow next packet size to be read when receiving data
For large packet, allocate sizeof(uint32_t) more bytes for next packet size.
Also add knob MIN_PACKET_BUFFER_FREE_BYTES, which is used to trigger allocation
of a new arena when free bytes are lower than this threshold.
2019-06-25 10:18:56 -07:00
Jingyu Zhou c2c82c89a5 Remove knob DEFAULT_PACKET_BUFFER_BYTES 2019-06-25 10:18:56 -07:00
Jingyu Zhou 2363326ecb Add knobs for various packet buffer sizes 2019-06-25 10:18:56 -07:00
A.J. Beamon 603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
Steve Atherton 5a8c97480a
Merge pull request #1506 from nikolas-ioannou/feature-pagecache-lru
AsyncFileCached: switch from a random to an LRU cache eviction policy
2019-05-17 13:42:21 -07:00
mpilman 1d5572a078 removed unused knob 2019-05-13 14:15:22 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Austin Seipp 7a10199a35 flow: fix some print/scan format warnings
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2019-05-06 13:35:29 -07:00
Nikolas Ioannou fdb3992990 AsyncFileCached: switch to string for eviction policy knob 2019-05-03 14:04:43 +02:00
Nikolas Ioannou b2280e15bf AsyncFileCached: support for lru cache eviction policy
- Added a knob to control page cache eviction policy (0=random default, 1=lru)
- Added page cache hit, miss, and eviction stats
2019-05-02 17:35:30 +02:00
Evan Tschannen 390ab9cfed A process will mark itself as degraded if it continually disconnects from a different process which the failure monitor thinks is healthy 2019-04-04 14:11:12 -07:00
Evan Tschannen d670b74d69 prevent trace event spam by combining huge arena samples 2019-03-30 13:36:13 -07:00
Evan Tschannen 70ac5ffda0 added random logging for huge arenas 2019-03-20 11:20:47 -07:00
Evan Tschannen c8122edb6b avoid relying on 128<<10 in knobs.cpp 2019-03-18 12:40:15 -07:00
Evan Tschannen d2eb7578fd Occasionally log a backtrace when the fast allocators allocates a new magazine to help track down memory leaks 2019-03-17 21:59:56 -07:00