370 Commits

Author SHA1 Message Date
Richard Chen c77d9e4abe merge conflicts 2020-12-02 21:53:19 +00:00
David Youngworth d64cf8b9e3 Merge branch 6.3 into master 2020-11-17 11:22:45 -08:00
Andrew Noyes c50e997f60 Make status tests deterministic
This change seems to be incorrect since afaict INetwork::timer isn't
guaranteed to be monotonic. Maybe we can make that guarantee or add an
INetwork::timer_monotonic symbol?
2020-11-05 17:05:34 +00:00
sfc-gh-tclinkenbeard cf4c8e375f Merge remote-tracking branch 'origin/release-6.3' into merge 2020-10-29 22:15:41 -07:00
Russell Sears 92a5178b4a
Merge branch 'release-6.3' into release-6-2-2020-10-23 2020-10-23 12:24:48 -07:00
Richard Chen 055add9682 conflicts 2020-10-23 06:33:00 +00:00
A.J. Beamon 6a6ea56596 Restore line that stores the data lag seconds of a storage server. This value is used to add a data lag message to status. 2020-10-20 10:12:00 -07:00
Xin Dong 944f30484a
Merge pull request #3759 from dongxinEric/misc/3739/expose-time-since-last-recovery
This resolves issue #3739 by exposing time since last full recovery.
2020-10-19 09:03:31 -07:00
Richard Chen 41843f07e6 add simulator support for different process versions and ProtocolVersion test 2020-10-12 18:19:31 +00:00
Markus Pilman 268ba0bddc Merge remote-tracking branch 'origin/release-6.3' into merge-6.3 2020-10-01 14:14:06 -06:00
Evan Tschannen b1180f8eb4 fixed naming and comments 2020-09-30 20:35:09 -07:00
Evan Tschannen b1570c740f extraTlogEligileZones should consider the database available both during a failover and also if the cluster cannot recruit tlogs in the remote region 2020-09-30 18:10:04 -07:00
Evan Tschannen 8c729ca8e6 only add additional fault tolerance for availability if automatic failover is enabled 2020-09-30 18:04:23 -07:00
Evan Tschannen 9f61039858 more fixes 2020-09-30 16:52:58 -07:00
Evan Tschannen d7454ac7da fixed compile error 2020-09-30 16:49:36 -07:00
Evan Tschannen fe5c30e778 fault tolerance was not being properly increased when usable regions was 2 and satellites are configured. 2020-09-30 16:41:00 -07:00
Xin Dong 480fc82779 Resolve review comments 2020-09-25 16:58:54 -07:00
Xin Dong a96d6f85c5 Removed redundant field number_of_old_generations_of_tlogs from status json 2020-09-24 09:44:51 -07:00
Xin Dong 77048c3d0f Handle possible timeout when getting a read version. Updated documentation of the status json format 2020-09-23 13:50:59 -07:00
Xin Dong 50f681cd32
Apply suggestions from code review
Co-authored-by: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-09-23 10:54:49 -07:00
Meng Xu cf69f455a9
Merge pull request #3785 from apple/release-6.3
Merge Release 6.3 to master
2020-09-17 14:43:56 -07:00
Xin Dong 4df0f60729 Instead of using fully_recovered, use accepting_commits as a signal of DB turned available. Also add the number of old generations into status 2020-09-17 09:55:25 -07:00
Young Liu cc5bc16bd8 Rename more places from proxy to commit proxy 2020-09-15 22:29:49 -07:00
Xin Dong 3c7bd3549a Fix compile errors 2020-09-11 14:23:27 -07:00
Xin Dong f2f3351560 Only report if the field FullyRecoveredAtVersion exists. 2020-09-11 13:44:17 -07:00
Xin Dong 2619e4d3df Use version clock to mitigate network clock skew. 2020-09-11 13:39:16 -07:00
Xin Dong 224f23b0f8 Rely on MasterRecoveryState message since we only care about the current generation. 2020-09-11 11:45:02 -07:00
Young Liu 35bef73a1c Rename proxy to commit proxy 2020-09-10 17:44:15 -07:00
Young Liu 1867ee1f5f Change cli output format 2020-09-09 22:34:36 -07:00
Young Liu 1155d015c9 fetch current log generation as well 2020-09-09 11:54:58 -07:00
Trevor Clinkenbeard 62dd1f7234
Merge pull request #3696 from sfc-gh-xwang/tag-report
report busiest write tag of each storage server
2020-09-08 15:21:14 -07:00
XiaoxiWang 2935d3d4f6 change workload; solve some comments 2020-09-08 21:47:49 +00:00
Xin Dong 4363dd0f25 This resolves issue #3739 by exposing time since last full recovery. 2020-09-08 14:26:01 -07:00
Young Liu 23e1ff694c Report missing old tlogs in recovery between accepting commits and storage recovered 2020-09-08 13:35:42 -07:00
Young Liu 6c3d919295 Fix status fetcher for GrvProxyStats 2020-09-08 11:11:45 -07:00
XiaoxiWang ecf2c0109c more concise status json 2020-09-04 18:40:45 +00:00
XiaoxiWang 5b5087c566 format 2020-09-04 16:34:05 +00:00
XiaoxiWang 7660fb3beb report busiest tags in status json 2020-09-04 16:33:59 +00:00
Young Liu 63b3612ad5 Merge master branch and resolve conflicts 2020-08-24 16:42:31 -07:00
XiaoxiWang 1f134d1534 format 2020-08-21 05:06:13 +00:00
XiaoxiWang 9398a78a3a add busy-read count and busy-write count to status json 2020-08-21 04:50:56 +00:00
XiaoxiWang bc6e42c634 add status json report for recommended throttled tags 2020-08-19 19:22:16 +00:00
Young Liu d6a23a4d6b Resolve comments to make GRV proxy a separate process class 2020-08-06 00:01:57 -07:00
Young Liu df6b676ccb Fix status bug and backup minKnownCommittedVersion bug 2020-07-24 00:49:16 -07:00
Young Liu ff4bae5cd3 Fix status test 2020-07-23 12:04:02 -07:00
Young Liu 229ab0d5f1 Fix some conflicts and remote debugging trace events 2020-07-22 23:35:46 -07:00
Young Liu 525f10e30c Merge master branch 2020-07-22 16:08:49 -07:00
Young Liu 302cf5c45f Remove debug trace events 2020-07-22 12:20:22 -07:00
Young Liu 5b06d69d25 Pass watches test 2020-07-15 00:37:41 -07:00
A.J. Beamon b09dddc07e Merge branch 'release-6.2' into merge-release-6.2-into-release-6.3
# Conflicts:
#	cmake/ConfigureCompiler.cmake
#	documentation/sphinx/source/downloads.rst
#	fdbrpc/FlowTransport.actor.cpp
#	fdbrpc/fdbrpc.vcxproj
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/LogSystemPeekCursor.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/Status.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	flow/flow.vcxproj
2020-07-10 15:06:34 -07:00
Evan Tschannen 8befb0829d
Merge pull request #3481 from ajbeamon/fix-dc-timeout-message
Add missing messages to schema and rename one to match later versions
2020-07-10 10:30:21 -07:00
A.J. Beamon b51beead53 The backport of a change in later versions didn't include some updates to the schema and a change to the name of one of the messages. 2020-07-09 16:58:13 -07:00
A.J. Beamon 04d1217941 Track statistics about server-side request latency on each process, to include min, max, mean, and various percentiles. 2020-07-09 16:39:15 -07:00
A.J. Beamon e10704fd76 Cherry-pick region related status changes from 6.3 2020-06-09 14:56:21 -07:00
A.J. Beamon d128252e90 Merge release-6.3 into master 2020-05-22 09:25:32 -07:00
Evan Tschannen 87350e1bf7
Merge pull request #3174 from ajbeamon/process-available-memory-balancing
Balance available memory based on the limits set for each process.
2020-05-20 14:20:11 -07:00
A.J. Beamon d636194d0d Remove deprecated fields in status: worst_version_lag_storage_server and limiting_version_lag_storage_server 2020-05-19 13:12:10 -07:00
A.J. Beamon b49eb0f67a Balance available memory based on the limits set for each process. Don't report more available memory than the limit. 2020-05-14 15:49:59 -07:00
A.J. Beamon bc0873adf0 Update tag throttle count status fields 2020-05-12 15:50:08 -07:00
A.J. Beamon e0526e0095 Add busiest read tags to storage server status 2020-05-12 15:49:40 -07:00
A.J. Beamon aed97a9f20 Merge branch 'master' into transaction-tagging 2020-05-07 14:52:22 -07:00
Evan Tschannen ff992060cd
Merge pull request #3073 from tclinken/fix-open-for-ide-build
Fix non-boost-related OPEN_FOR_IDE build errors
2020-05-07 14:47:59 -07:00
A.J. Beamon 36454bb3b8 Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbclient/MasterProxyInterface.h
#	fdbclient/NativeAPI.actor.cpp
2020-05-04 10:23:25 -07:00
tclinken 943e8e7e84 More fixes for OPEN_FOR_IDE build 2020-05-01 23:04:12 -07:00
Evan Tschannen 17815fb6bf
Merge pull request #3037 from ajbeamon/status-busy-use-new-field
Use the updated field name in status when fetching process busyness info
2020-05-01 23:02:54 -07:00
A.J. Beamon 6ada5359b8 Merge branch 'master' into transaction-tagging 2020-04-29 14:27:21 -07:00
A.J. Beamon 054d6bca65 Use the updated field name in status when fetching process busyness info 2020-04-27 11:38:54 -07:00
Evan Tschannen 33efb9ec97 code cleanup based on review comments 2020-04-17 15:05:01 -07:00
Evan Tschannen 1476057996 properly cache serialization of serverDBInfo 2020-04-11 19:30:05 -07:00
Evan Tschannen 07cc0a8d74 code cleanup 2020-04-10 17:02:11 -07:00
Evan Tschannen e8d333733a Merge branch 'master' into feature-tree-broadcast 2020-04-10 13:51:09 -07:00
Evan Tschannen ce4493f679 many bug fixes 2020-04-10 13:45:16 -07:00
A.J. Beamon 36da61dd9c Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/vexillographer/fdb.options
2020-04-07 21:12:14 -07:00
A.J. Beamon 2309e9f156 Consistently use timeout instead of timedout in status messages. 2020-04-07 08:43:23 -07:00
Evan Tschannen a51c92854a Merge branch 'master' into feature-tree-broadcast
# Conflicts:
#	fdbserver/WorkerInterface.actor.h
#	fdbserver/worker.actor.cpp
2020-04-06 21:09:44 -07:00
Evan Tschannen 477d66b46d implemented a tree broadcast for txn state message for proxies, and serverDBInfo for workers 2020-04-05 23:09:36 -07:00
A.J. Beamon 2336f073ad Checkpointing a bunch of work on throttles. Rudimentary implementation of auto-throttling. Support for manual throttling via fdbcli. Throttles are stored in the system keyspace. 2020-04-03 15:24:14 -07:00
Xin Dong 9a451bbf1a Address review comments 2020-04-03 10:49:40 -07:00
Xin Dong dfe5ae3f4c
Update fdbserver/Status.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-04-03 10:25:49 -07:00
Xin Dong eaae9397e5 Address review comments 2020-04-02 11:04:58 -07:00
Xin Dong 5f710bde6a
Apply suggestions from code review
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-04-02 09:40:13 -07:00
Xin Dong e755583c07 Address review comments. 2020-04-01 15:13:04 -07:00
Xin Dong 484393e879
Update fdbserver/Status.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-03-31 09:42:42 -07:00
Xin Dong 012d41548e Address review comments 2020-03-30 13:55:59 -07:00
Evan Tschannen e08f0201f1 merge release 6.2 into master 2020-03-17 12:51:47 -07:00
Evan Tschannen ed4d02a3e4
Merge pull request #2812 from etschannen/feature-proxy-mem-limit
Limit the amount of requests the proxy can queue up in memory
2020-03-16 14:56:56 -07:00
A.J. Beamon fe19f30999
Merge pull request #2813 from etschannen/feature-satellite-usable-regions
do not recruit satellite tlogs when usable regions=1
2020-03-16 11:54:42 -07:00
A.J. Beamon f2defc3a3a
Merge pull request #2814 from etschannen/feature-delay-recovery
Prevent coordinated state from filling up with too many old generations
2020-03-16 11:45:17 -07:00
Evan Tschannen e5d53c863b report in status the number of active generations 2020-03-16 10:29:17 -07:00
Evan Tschannen 04b752b40a Added additional logging related to memory errors (including in status) 2020-03-13 18:31:22 -07:00
Evan Tschannen 4640edf5d6 do not recruit satellite tlogs when usable regions=1 2020-03-13 10:24:52 -07:00
Alex Miller d86a601b84 Add cluster.processes.id.network.tls_policy.hz to status.
This allows monitoring of TLS policy failures, but one has to go scrape
for TLSPolicyFailure trace events to figure out why they're happening.
2020-03-13 02:46:10 -07:00
Evan Tschannen 303df197cf Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	bindings/c/test/mako/mako.c
#	documentation/sphinx/source/release-notes.rst
#	fdbbackup/backup.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/LogRouter.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/flow.vcxproj
#	flow/flow.vcxproj.filters
#	versions.target
2020-03-06 18:22:46 -08:00
Evan Tschannen 6296465e07 Make the DD priority associated with populating a remote region lower than machine failures 2020-03-04 14:07:32 -08:00
Evan Tschannen 96258b9809 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/QuietDatabase.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/StorageMetrics.actor.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KVStoreTest.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/genericactors.actor.cpp
#	flow/serialize.h
2020-02-21 19:09:16 -08:00
A.J. Beamon 78cb1071dc Status should use the full list of proxies 2020-02-07 15:44:02 -08:00
mpilman d09e07f1f5 Merge remote-tracking branch 'upstream/master' into features/icc 2020-02-04 10:26:18 -08:00
mengranwo f597aa7e18 WIP : deployable/stable version since Nov 3. Start rebase to master branch 2020-01-15 13:49:45 -08:00
Evan Tschannen 3325980c03 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer_6_0.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.actor.h
#	fdbserver/worker.actor.cpp
#	versions.target
2019-10-24 17:38:15 -07:00
Evan Tschannen 35ac0071a8 fixed a compiler error 2019-10-22 17:06:54 -07:00