sfc-gh-tclinkenbeard
45c9a0abc7
Revert "Revert "Add limiting health metrics""
This reverts commit 209ebcc595.
2020-11-13 17:24:57 -08:00
Trevor Clinkenbeard
209ebcc595
Revert "Add limiting health metrics"
2020-11-13 17:08:46 -08:00
sfc-gh-tclinkenbeard
9bb93dadf1
Reenabled Throttling.toml test (as a rare test)
2020-11-13 11:34:32 -08:00
Xiaoxi Wang
f930f9bbb6
update toml test spec
2020-09-02 05:25:40 +00:00
Evan Tschannen
a49cb41de7
Merge branch 'release-6.3'
# Conflicts:
# CMakeLists.txt
# cmake/ConfigureCompiler.cmake
# fdbserver/Knobs.cpp
# fdbserver/StorageCache.actor.cpp
# fdbserver/storageserver.actor.cpp
# flow/ThreadHelper.actor.h
# flow/serialize.h
# tests/CMakeLists.txt
2020-07-29 00:31:55 -07:00
Alex Miller
2841efe938
Rewrite most .txt tests into (pretty) .toml files.
This includes build/txt-to-toml.py which did the rewrites, and
can be used to rewrite other non-in-tree test spec files to toml.
I didn't touch status or restarting tests yet. Restarting will be handled
later. It turns out that I don't understand how status tests work.
2020-07-12 14:47:40 -07:00
sfc-gh-tclinkenbeard
10a4b8e321
Make Downgrade test rare
2020-07-09 17:52:41 -07:00
Alex Miller
17570b5b10
Make the testspec more restrictive in terms of what can be set where.
Testspec is currently very permissive in very misleading ways. In particular,
the tester parser itself will swallow K=V settings and apply them at the test
level, which breaks how a person would expect the scoping to work. Other
settings apply to the entire simulation run globally, but appear to be workload
specific. Even further, others affect simulation cluster creation or test
harness behavior, but can again be set anywhere in a testspec.
This changes testspec parsing to error if a setting that applies globally is
anywhere but the top of the file, or if a setting that applies test-wide is
applied to a workload instead of a test.
2020-07-06 02:03:30 -07:00
A.J. Beamon
80a235aa80
Add some correctness tests
2020-05-04 10:15:18 -07:00
Stephen Atherton
9227de5c20
Redwood correctness unit test was using wallclock based time limit which breaks determinism.
2019-11-11 15:13:58 -08:00
Evan Tschannen
d8ea3dbf9a
Added the ability to configure a cluster from a JSON file
2018-08-16 17:34:59 -07:00
Evan Tschannen
9c918a28f6
fix: status was reporting no replicas remaining when the remote datacenter was initially configured with usable_regions=2
2018-08-09 13:16:09 -07:00
Evan Tschannen
6d76ff67a3
added the connection string to status
2018-07-09 22:11:58 -07:00
Evan Tschannen
507b3bacb0
fix: killing all tlogs in one region prevents the remote logs from recovering in that region; do not allow that to prevent us from configuring usable_regions=1.
added more recovery states.
2018-07-05 00:08:51 -07:00
Evan Tschannen
866ccfe344
added the ability to allow the master to finish recovery before all storage servers in both regions have their mutations. This allows you to recover from scenarios where you lose all your tlogs in one dc.
2018-07-04 01:59:04 -04:00
Evan Tschannen
0123627d67
Merge branch 'master' into feature-remote-logs
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-06-22 10:43:07 -07:00
Evan Tschannen
8a8914f046
re-added the ability to configure the number of log routers. Many log routers are needed to get a sufficient number of sockets involved in copying data across the WAN
2018-06-22 00:04:00 -07:00
A.J. Beamon
e8f66df001
Add metrics for watches and mutations on the storage server. The storage server tracks its lag with the logs, and status tries to report a more accurate measure of this lag.
2018-06-21 15:59:43 -07:00
A.J. Beamon
62eeefcc8a
Update status schema with new field
2018-06-20 12:09:28 -07:00
Evan Tschannen
1ccfb3a0f4
fix: log_anti_quorum was always 0 in simulation
removed durableStorageQuorum, because it is no longer a useful configuration parameter
2018-06-18 10:24:57 -07:00
Evan Tschannen
e8c462882b
re-added remote_logs as a parameter, because it could be useful to have a different number of logs between when recruited as primary and remote
2018-06-18 10:22:34 -07:00
Evan Tschannen
0913368651
added usable_regions to specify if we will replicate into a remote region
remote replication defaults to the primary replication
removed remote_logs, because they should be specified as an override in the regions object
2018-06-17 19:31:15 -07:00
Evan Tschannen
246abd1207
added full_replication to status
2018-06-14 21:14:18 -07:00
Evan Tschannen
0103b6f5ed
added datacenter_version_difference to status
2018-06-14 19:09:25 -07:00
Evan Tschannen
99e21c869c
fixed a number of status calculations, and re-enabled the status workload
2018-06-14 17:58:57 -07:00
A.J. Beamon
ca720e1540
Merge pull request #297 from apple/release-5.2
Merge 5.2 to Master
2018-05-08 12:04:20 -07:00
A.J. Beamon
432a295bc2
Add read bytes and read keys info to status. Collect this information directly from StorageMetrics rather than through ratekeeper.
2018-05-04 12:01:40 -07:00
Evan Tschannen
91bb8faa45
Merge commit 'f773b9460d31d31b7d421860fc647936f31aa1fa'
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-03-09 14:47:03 -08:00
A.J. Beamon
b2ef6e1358
Add missing available_bytes fields to test status schemas
2018-03-09 14:17:20 -08:00
Evan Tschannen
cb25564d38
simulated cluster supports fearless configurations
removed unused simulation variables
run the simulation with only 1 coordinator most of the time, since we protect the coordinator from being killed, and protecting too many things is bad for simulation
2018-02-15 18:32:39 -08:00
Evan Tschannen
264dc44dfa
fixed many more bugs associated with running without remote logs
2018-01-17 17:03:17 -08:00
Evan Tschannen
3ec45d38a0
Merge branch 'master' into feature-remote-logs
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-01-06 13:54:45 -08:00
Evan Tschannen
5ac4f73978
Merge branch 'release-5.1' into feature-remote-logs
# Conflicts:
# fdbclient/NativeAPI.actor.cpp
# fdbrpc/Locality.h
# fdbrpc/simulator.h
# fdbserver/ApplyMetadataMutation.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/LogSystemPeekCursor.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WorkerInterface.h
# fdbserver/masterserver.actor.cpp
# flow/Net2.actor.cpp
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-01-05 11:33:42 -08:00
A.J. Beamon
5015119115
Generalize the message that gets displayed in status if a cluster file's contents are incorrect.
2018-01-05 10:29:47 -08:00
A.J. Beamon
7cf17df821
Merge branch 'master' into log-group-for-unsupported-clients
# Conflicts:
# flow/Net2.actor.cpp
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2017-11-01 11:31:02 -07:00
A.J. Beamon
fcbeea104f
Update documentation and tests with new connected_clients status schema
2017-11-01 11:25:50 -07:00
Evan Tschannen
48901a9223
added a list of tlog IDs that are missing to status
2017-10-24 16:28:50 -07:00
Evan Tschannen
15962cf079
Merge branch 'master' into feature-remote-logs
# Conflicts:
# fdbrpc/Locality.cpp
# fdbrpc/Locality.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/ClusterRecruitmentInterface.h
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WorkerInterface.h
# fdbserver/fdbserver.vcxproj.filters
# fdbserver/masterserver.actor.cpp
# fdbserver/worker.actor.cpp
# flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Yichi Chiang
d4f75630de
Support log group field in status json
2017-09-28 16:31:29 -07:00
Evan Tschannen
e8b895c878
added the ability to disable connection failures for a period of time after one happens
2017-09-18 12:46:29 -07:00
Evan Tschannen
5e4a94120c
update status json schema
2017-07-15 16:18:59 -07:00
Stephen Atherton
7260e38545
Merge branch 'fix-io-timeout-handling'
# Conflicts:
# fdbrpc/AsyncFileKAIO.actor.h
# fdbrpc/sim2.actor.cpp
# fdbserver/KeyValueStoreSQLite.actor.cpp
# fdbserver/optimisttest.actor.cpp
# fdbserver/worker.actor.cpp
# fdbserver/workloads/MachineAttrition.actor.cpp
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2017-05-26 17:43:28 -07:00
FDB Dev Team
a674cb4ef4
Initial repository commit
2017-05-25 13:48:44 -07:00