Commit Graph

365 Commits

Author SHA1 Message Date
sfc-gh-tclinkenbeard 4669f837fa Add uses of makeReference 2020-11-07 22:10:18 -08:00
Jon Fu 51db9a7e0a add static method to access backup pause key instead of constructing it manually 2020-11-06 14:03:29 -05:00
Jon Fu b90ad11483 add some more trace and comments 2020-11-05 14:23:08 -05:00
Jon Fu bda72d9a3d first draft at changing snapshot backup behaviour 2020-11-02 17:12:30 -05:00
Xin Dong 944f30484a
Merge pull request #3759 from dongxinEric/misc/3739/expose-time-since-last-recovery
This resolves issue #3739 by exposing time since last full recovery.
2020-10-19 09:03:31 -07:00
Trevor Clinkenbeard 24ea35e56f
Merge pull request #3748 from sfc-gh-ljoswiak/visibility-2
Add TLogVersion::V6
2020-10-14 17:35:32 -07:00
Jon Fu e696c4a8eb remove redundant line 2020-10-07 15:23:17 -04:00
Lukas Joswiak dea7000970 Merge remote-tracking branch 'upstream/master' into visibility-1 2020-10-06 18:38:15 -07:00
Jon Fu 44cd3b0999 add stopBackup to incrementalBackup workload 2020-09-30 14:24:52 -04:00
Jon Fu 3ceb44f4df add TEST macros in code paths 2020-09-28 16:40:38 -04:00
Xin Dong a96d6f85c5 Removed redundant field number_of_old_generations_of_tlogs from status json 2020-09-24 09:44:51 -07:00
Jon Fu 69580593dd Merge branch 'master' of https://github.com/apple/foundationdb into jfu-snapshot-record-version 2020-09-23 15:35:05 -04:00
Xin Dong 50f681cd32
Apply suggestions from code review
Co-authored-by: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-09-23 10:54:49 -07:00
Xin Dong 4df0f60729 Instead of using fully_recovered, use accepting_commits as a signal of DB turned available. Also add the number of old generations into status 2020-09-17 09:55:25 -07:00
Young Liu cc5bc16bd8 Rename more places from proxy to commit proxy 2020-09-15 22:29:49 -07:00
Xin Dong 3c7bd3549a Fix compile errors 2020-09-11 14:23:27 -07:00
Xin Dong 2619e4d3df Use version clock to mitigate network clock skew. 2020-09-11 13:39:16 -07:00
Jon Fu 260c8d9568 Merge branch 'master' of https://github.com/apple/foundationdb into jfu-snapshot-record-version 2020-09-11 15:05:58 -04:00
Jon Fu 22996284c7 added changes to allow writing of last epoch end version to special keys when performing recovery due to snapshot 2020-09-11 15:00:11 -04:00
Xin Dong 224f23b0f8 Rely on MasterRecoveryState message since we only care about the current generation. 2020-09-11 11:45:02 -07:00
Young Liu 35bef73a1c Rename proxy to commit proxy 2020-09-10 17:44:15 -07:00
Xin Dong 4363dd0f25 This resolves issue #3739 by exposing time since last full recovery. 2020-09-08 14:26:01 -07:00
Lukas Joswiak 2a58e775d2 Add original changes 2020-09-04 15:36:47 -07:00
Young Liu 16ef2bd3bd Pending commit 2020-08-12 10:34:07 -07:00
Young Liu 79ce16650d merge master branch 2020-08-11 19:22:10 -07:00
Xiaoge Su d8bb36c4c8 Refactor applyMetadataMutations to accept less args 2020-08-06 11:40:40 -07:00
Young Liu d6a23a4d6b Resolve comments to make GRV proxy a separate process class 2020-08-06 00:01:57 -07:00
Young Liu bfa4eb9ab2 Resolve comments 2020-07-30 14:45:03 -07:00
Young Liu 30ea639666 Remove debug traces 2020-07-29 07:55:05 -07:00
Young Liu f7b76a92af pass joshua 2020-07-29 07:26:55 -07:00
Evan Tschannen a49cb41de7 Merge branch 'release-6.3'
# Conflicts:
#	CMakeLists.txt
#	cmake/ConfigureCompiler.cmake
#	fdbserver/Knobs.cpp
#	fdbserver/StorageCache.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	flow/ThreadHelper.actor.h
#	flow/serialize.h
#	tests/CMakeLists.txt
2020-07-29 00:31:55 -07:00
Young Liu 229ab0d5f1 Fix some conflicts and remove debugging trace events 2020-07-22 23:35:46 -07:00
Young Liu 525f10e30c Merge master branch 2020-07-22 16:08:49 -07:00
Young Liu 302cf5c45f Remove debug trace events 2020-07-22 12:20:22 -07:00
Evan Tschannen 220ede3564 fixed compile error 2020-07-20 11:35:20 -07:00
Evan Tschannen be67e9cfc7 wait for the correct cluster controller interface before starting master recovery 2020-07-20 11:29:37 -07:00
Evan Tschannen 32c0169fc8 use the old logic for lifetime since we already have verified the cluster controller is correct 2020-07-20 10:26:47 -07:00
Young Liu 2703cedac5 Fixed known bugs 2020-07-17 22:24:52 -07:00
Evan Tschannen 6a38f81269 do not kill the master unless we have a dbInfo from the current cluster controller 2020-07-17 14:59:38 -07:00
Young Liu 21c1998cca Fix MaxTLogQueueSize Bug 2020-07-16 15:56:04 -07:00
Young Liu 5b06d69d25 Pass watches test 2020-07-15 00:37:41 -07:00
Meng Xu ef8c1060a2 Merge branch 'master' into mengxu/tmp-merge-6.3 2020-07-13 10:15:56 -07:00
Jingyu Zhou 5cc5d9cf1e Log peer address whose failure can cause master recovery
So when there is a master recovery due to a failed tlog, proxy, resolver, or log
router, we can have a trace event that tells which address the
master thinks is dead.
2020-07-10 15:57:03 -07:00
Markus Pilman 0fbe7101c3 Revert "Revert "Request tracing""
This reverts commit 327cc31e35.
2020-07-07 10:06:13 -06:00
Meng Xu f3302833ce
Merge pull request #3435 from apple/release-6.3
Merge Release 6.3 to master
2020-06-30 10:08:28 -07:00
Jingyu Zhou d883426c6a Fix spammy GotBackupProgress events
Only print these types of events during master recovery and don't log them for
backup workers.
2020-06-27 21:30:38 -07:00
Young Liu 8cd97a2be8 Resolve Evan's comments:
- Only report commit version when commit version is larger than the known committed version.
- Fix task priorities of [Get/Report]LiveCommittedVersion [request/reply].
- Fix some code style issues.
2020-06-17 10:02:25 -07:00
Young Liu 4dfb903a3a tmp merge 2020-06-16 20:32:07 -07:00
Jingyu Zhou 327cc31e35
Revert "Request tracing" 2020-06-16 12:32:42 -07:00
Young Liu bf524fc6f2 Added span trace in serveLiveCommittedVersion 2020-06-13 17:47:13 -07:00
Young Liu f211a54593 Merged from upstream master 2020-06-13 16:47:12 -07:00
Young Liu f8c457d74d Minor fix against Meng's comments 2020-06-13 16:27:08 -07:00
Meng Xu 80334351c7
Merge pull request #3329 from sfc-gh-mpilman/features/flatbuffers-debugtx
Request tracing
2020-06-12 14:50:19 -07:00
Young Liu ff7354a7de Recovered parallelism between the proxy-master rpc and its concurrent events and some minor fixes on formatting and variables naming 2020-06-11 14:07:37 -07:00
Young Liu a47806a966 Fixed locked and metadataVersion in GetReadVersion 2020-06-10 15:55:23 -07:00
Markus Pilman 9ef93714ac Address review comments 2020-06-10 15:48:49 -07:00
Markus Pilman 4ab3441a95 Merge remote-tracking branch 'origin/master' into features/flatbuffers-debugtx 2020-06-10 15:31:29 -07:00
Markus Pilman caabbec466 Spans for read and commit path 2020-06-10 09:59:41 -07:00
sfc-gh-tclinkenbeard 99bf993815 Replace BOOST_NOEXCEPT with noexcept 2020-06-09 22:39:19 -07:00
Young Liu 3a37e0af75 Serve GetReadVersion through master instead of peer proxies 2020-06-09 20:47:34 -07:00
Young Liu a038a02cdd Serve GetReadVersion through master 2020-06-09 11:16:23 -07:00
negoyal cf13e00a8f Merge remote-tracking branch 'origin/release-6.3' into fdb_cache_wo_allocator 2020-06-01 17:38:31 -07:00
Evan Tschannen ced65cd30b finished explicitly versioning everything stored in the database 2020-05-22 17:14:21 -07:00
Markus Pilman c2bc75516f Merge branch 'release-6.3' of github.com:apple/foundationdb into features/trace-roles 2020-05-14 10:34:53 -07:00
Markus Pilman 5f9b127e56 Emit traces regularly about role assignment
We are currently emitting Role transition traces when a role starts and
when it ends. While this is useful for debugging, it doesn't work well
with tools that inject data and might potentially miss some trace lines.

We do decorate each trace lines with the roles assigned to that
particular process, however, this is not sufficient for tools that can
make use of the UID -> Role mapping
2020-05-08 16:27:57 -07:00
negoyal dd033736ed Merge branch 'master' into fdb_cache_subfeature2 2020-05-04 17:29:43 -07:00
Evan Tschannen a442565e13 more work towards shrinking locality 2020-04-18 21:29:38 -07:00
Evan Tschannen 07cc0a8d74 code cleanup 2020-04-10 17:02:11 -07:00
Evan Tschannen a51c92854a Merge branch 'master' into feature-tree-broadcast
# Conflicts:
#	fdbserver/WorkerInterface.actor.h
#	fdbserver/worker.actor.cpp
2020-04-06 21:09:44 -07:00
Evan Tschannen 2a1bd97120 fix compilation errors 2020-04-06 20:58:43 -07:00
Evan Tschannen 477d66b46d implemented a tree broadcast for txn state message for proxies, and serverDBInfo for workers 2020-04-05 23:09:36 -07:00
negoyal acaf91ac47 Merge branch 'master' into fdb_cache_subfeature2 2020-03-26 13:33:08 -07:00
Evan Tschannen bb5799bd20
Merge pull request #2642 from xumengpanda/mengxu/new-backup-format-PR
FastRestore:Integrate with new backup format
2020-03-25 15:47:55 -07:00
Jingyu Zhou e2f317a0da Fix a crash failure 2020-03-25 09:18:49 -07:00
Jingyu Zhou 243d078596 Fix off by one error
Epoch end version is saved version + 1, so need +1 for minBackupVersion.
2020-03-23 20:44:31 -07:00
Jingyu Zhou 90b40e1d75 Merge branch 'mengxu/new-backup-format-PR-delta' of github.com:xumengpanda/foundationdb into backup-worker-bak
Resolve Conflicts:
	fdbclient/BackupAgent.actor.h
	fdbserver/BackupWorker.actor.cpp
	fdbserver/RestoreMaster.actor.cpp
	fdbserver/masterserver.actor.cpp
2020-03-23 13:35:33 -07:00
Meng Xu be67ab4d6a Correct comment based on review 2020-03-23 12:53:40 -07:00
Andrew Noyes fa8eaf9810 Assert recoverAndEndEpoch does not become ready 2020-03-23 12:40:00 -07:00
Meng Xu 3f31ebf659 New backup:Revise event name and explain code 2020-03-23 10:55:44 -07:00
Jingyu Zhou 97702d91c8 Skip recruiting backup workers for older epochs before min backup version
When master starts recruiting backup workers, if there is no active backup job
or the min version of the backup job is greater than old epoch's end version,
then these old epochs can be skipped.
2020-03-21 13:44:02 -07:00
Jingyu Zhou 818072f3cb Set oldest backup epoch if not recruiting backup workers
Since tlog is not kept until backup worker has pulled mutations from it, the
old tlogs can only be displaced after oldest backup epoch equals current epoch.
So if master is not recruiting backup workers, it should set the oldest backup
epoch as the current epoch.
2020-03-20 20:16:43 -07:00
Jingyu Zhou 5359528132 Reduce a call to getLogSystemConfig() 2020-03-20 20:15:09 -07:00
Jingyu Zhou 12ed8ad536 Fix backup worker start version when logset start version is lower
The start version of tlog set can be smaller than the last epoch's end version.
In this case, set backup worker's start version as last epoch's end version to
avoid overlapping of version ranges among backup workers.
2020-03-20 20:15:08 -07:00
Jingyu Zhou 80d3fa1222 Add delay for master to recruit backup workers
This delay is to ensure old epoch's backup workers can save their progress in
the database. Otherwise, the new master could attempt to recruit backup
workers for the old epoch on version ranges that have already been popped. As
a result, the logs will lose data.
2020-03-20 20:15:08 -07:00
Jingyu Zhou fda6c08640 Include a total number of tags in partition log file names
This is needed for BackupContainer to check partitioned mutation logs are
continuous, i.e., restorable to a version.
2020-03-20 20:13:38 -07:00
Jingyu Zhou 5bf62c8f85 Reduce a call to getLogSystemConfig() 2020-03-19 10:08:19 -07:00
Jingyu Zhou 89d8f13038 Fix backup worker start version when logset start version is lower
The start version of tlog set can be smaller than the last epoch's end version.
In this case, set backup worker's start version as last epoch's end version to
avoid overlapping of version ranges among backup workers.
2020-03-18 16:41:35 -07:00
Jingyu Zhou 15437ffb53 Add delay for master to recruit backup workers
This delay is to ensure old epoch's backup workers can save their progress in
the database. Otherwise, the new master could attempt to recruit backup
workers for the old epoch on version ranges that have already been popped. As
a result, the logs will lose data.
2020-03-18 16:41:35 -07:00
Jingyu Zhou d8c6bf585d Include a total number of tags in partition log file names
This is needed for BackupContainer to check partitioned mutation logs are
continuous, i.e., restorable to a version.
2020-03-18 16:39:40 -07:00
Evan Tschannen e08f0201f1 merge release 6.2 into master 2020-03-17 12:51:47 -07:00
Evan Tschannen 56dee89e6e active generations should include the current one 2020-03-16 11:09:42 -07:00
Evan Tschannen e5d53c863b report in status the number of active generations 2020-03-16 10:29:17 -07:00
Evan Tschannen 818537ed2d
Update fdbserver/masterserver.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-03-14 15:04:46 -07:00
Evan Tschannen 2f2f56020f
Update fdbserver/masterserver.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-03-13 15:54:13 -07:00
Evan Tschannen a39effa57d delay recoveries after 70 outstanding generations, and stop recoveries after 100 outstanding generations to prevent a death spiral from filling up the coordinated state 2020-03-13 10:28:32 -07:00
negoyal cd949eca71 Merge branch 'master' into fdb_cache_subfeature2 2020-02-26 11:22:08 -08:00
Evan Tschannen 96258b9809 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/QuietDatabase.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/StorageMetrics.actor.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KVStoreTest.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/genericactors.actor.cpp
#	flow/serialize.h
2020-02-21 19:09:16 -08:00
A.J. Beamon df2b0452b4 Step 3 of fixing storage server range reads: change return type of readRange from VectorRef<KeyValueRef> to RangeResultRef. 2020-02-06 13:19:24 -08:00
negoyal 85cc35e81e Merge branch 'master' into HEAD 2020-02-05 14:59:55 -08:00
Jingyu Zhou 52c6737411 Rename backupLoggingEnabled as backupWorkerEnabled
To highlight the backup changes for 7.0. By default,
backup_worker_enabled flag is set for 7.0 version.
2020-02-04 10:09:16 -08:00
Jingyu Zhou 0db03f1d3c Use backup_logging_enabled flag
The default is to enable new backup workers. Users can disable this flag to
turn off the backup worker feature.
2020-02-03 20:03:22 -08:00
Jingyu Zhou 38aa1903fd Add a DB configuration option for backup workers
Right now, the default is to keep the old backup behavior, i.e., do NOT use
backup workers. Specifically, if BackupType is not set (or is set to default),
the master will not recruit backup workers and will not add pseudo locality for
backup workers.

The StartFullBackupTaskFunc is updated to check if backup worker is enabled.
Only when it is not enabled, starting a backup will wait on all backup workers
to be started.
2020-01-31 19:29:09 -08:00
mpilman 6cc827277f delete dead code 2020-01-24 14:28:09 -08:00
mpilman 4c3afa4208 Merge branch 'features/cache-initialization' of github.com:mpilman/foundationdb into features/cache-initialization 2020-01-24 11:03:25 -08:00
mpilman 51717c970d Fixed management api 2020-01-24 11:00:50 -08:00
Jingyu Zhou 8b67a89eed More review comments fixed. 2020-01-22 19:42:13 -08:00
Jingyu Zhou 1eaea91cb3 Address review comments 2020-01-22 19:42:13 -08:00
Jingyu Zhou e14246ac16 Add more information for trace events 2020-01-22 19:42:13 -08:00
Jingyu Zhou 4bed33031f Set backup worker start version to be savedVersion + 1
If no progress found, start version is set to epochBegin. So the start version
is the one after the last saved (or from last epoch's saved) version.
2020-01-22 19:42:13 -08:00
Jingyu Zhou 4ed75e37f3 BackupProgress uses old epoch's begin version if no progress found
Get rid of the complex logic of choosing the largest saved version from
previous epoch for the oldest epoch. Instead, use the begin version now
available from log system.
2020-01-22 19:38:46 -08:00
Jingyu Zhou 19eacac3ce Add a unit test for BackupProgress 2020-01-22 19:38:46 -08:00
Jingyu Zhou 64052f6349 Check and fill backup gaps for old epochs and tags
Sometimes the backup worker has not updated progress to the system space and a
master recovery happens. As a result, next epoch doesn't know the progress of
previous ones. This change is to check for such missing gaps and fill them with
the whole range [startVersion, endVersion).

The code is refactored into BackupProgress.actor.* to consolidate backup
progress processing for the master server.
2020-01-22 19:38:46 -08:00
Jingyu Zhou ed54aaa09e Fix a crash failure of empty backup interface 2020-01-22 19:38:46 -08:00
Jingyu Zhou 23985da6a0 Use backup worker failed error code during recovery
And use override instead of virtual in TagPartitionedLogSystem.
2020-01-22 19:38:45 -08:00
Jingyu Zhou 840e74d696 Allow storage server queue in consistency check
The backup worker needs to update its progress even during consistency check by
committing transactions to the database. Thus we can't really achieve zero storage
server queue. So add a limit of 10,000 to pass the consistency check.
2020-01-22 19:38:45 -08:00
Jingyu Zhou 9567bf730d Fix a crash due to null log system
When a master starts, backup worker from old epochs may send BackupWorkerDoneRequest
to it. The master can safely ignore it, since the checkRemoved logic of the
backup worker can self exit then.
2020-01-22 19:38:45 -08:00
Jingyu Zhou 0c08161d8e Remove old backup workers when done
For backup workers working on old epochs, once their work is done, they will
notify the master. Then the master removes them from the log system and
acknowledges back to the backup workers so that they can gracefully shut down.

The popping of a backup worker is stalled if there are workers from older
epochs still working. Otherwise, workers from old epochs will lose data.

However, allowing newer epoch to start backup can cause holes in version ranges.
The restore process must verify the backup progress to make sure there are no
holes, otherwise it has to wait.
2020-01-22 19:38:45 -08:00
Jingyu Zhou 85c4a4e422 Address review comments for PR #1625 2020-01-22 19:38:45 -08:00
Jingyu Zhou 22f4bef589 Fix a race that backup workers may not be registered
After the backup worker recruitment is done, we need to force trigger the
registration with cluster controller. Otherwise, the log system may not have
the backup workers, which can stall backup workers from obtaining a cursor and
resulting in mutations being kept in TLogs.
2020-01-22 19:38:45 -08:00
Jingyu Zhou 73824faf65 Track pseudo tags popping for individual IDs
For each log router ID, we track the popped version of each pseudo tag so that
the popping is only applied to the minimum of these versions.

Also add more tracing for popping and epochs.
2020-01-22 19:38:45 -08:00
Jingyu Zhou 580151e1d4 Refactor code using C++ 17 iterator 2020-01-22 19:38:45 -08:00
Jingyu Zhou c2b8ee3b53 Small improvement 2020-01-22 19:38:45 -08:00
Jingyu Zhou 19d6a889ff Recruit backup workers for old epochs
If there are unfinished ranges in the old epochs, the new master will recruit
backup workers responsible for finishing these ranges. These workers remain in
the cluster until the next epoch, when they will remove themselves.
2020-01-22 19:38:45 -08:00
Jingyu Zhou ac851619bb Fix merge errors with master 2020-01-22 19:38:45 -08:00
Jingyu Zhou 11964733b7 WIP: should be divided into smaller commits. 2020-01-22 19:38:45 -08:00
Jingyu Zhou 41f0cf2bb5 Add decode function for backup progress 2020-01-22 19:38:45 -08:00
Jingyu Zhou a4d6ebe79e Recruit backup worker in newEpoch 2020-01-22 19:37:48 -08:00
Jingyu Zhou eac49bca04 Add backup worker recruitment in master. 2020-01-22 19:35:30 -08:00
negoyal e8e5f2d118 Bug fixes in the cache role. 2020-01-07 16:51:40 -08:00
negoyal cf2563f1c7 Mix of various things, a lot of which will change. 2019-12-05 17:10:32 -08:00
negoyal a4a0bf18f9 Merging with Master. 2019-11-12 13:01:29 -08:00
Jon Fu d96a7b2c69 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-03 09:47:45 -07:00
Evan Tschannen 3cc5d484a5 the include and exclude commands do not need to set the moveKeysLockOwnerKey, which will kill the data distribution algorithm 2019-09-27 18:33:56 -07:00
A.J. Beamon 1f8a157b35 Extend the length allowed for configuration fields. Log the config if recovery fails due to invalid config. 2019-09-05 15:36:37 -07:00
Andrew Noyes 6aa0ada7b1 Replace scalar root types with proper messages 2019-08-28 14:40:50 -07:00
Evan Tschannen 4c9a392f05 the master checks the popped version of the txsTag before recovering the txnStateStore, to avoid restoring data that is later found to be popped 2019-08-05 17:01:48 -07:00
Evan Tschannen 5c98dcce6d revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators 2019-07-27 16:46:22 -07:00
Evan Tschannen b509a441e7 Merge branch 'master' into feature-skip-confirm
# Conflicts:
#	bindings/flow/tester/Tester.actor.cpp
#	bindings/go/src/_stacktester/stacktester.go
#	bindings/java/src/test/com/apple/foundationdb/test/AsyncStackTester.java
#	bindings/java/src/test/com/apple/foundationdb/test/StackTester.java
#	bindings/python/tests/tester.py
#	bindings/ruby/tests/tester.rb
#	documentation/sphinx/source/api-c.rst
#	documentation/sphinx/source/api-python.rst
#	documentation/sphinx/source/api-ruby.rst
#	documentation/sphinx/source/data-modeling.rst
#	documentation/sphinx/source/developer-guide.rst
#	fdbclient/vexillographer/fdb.options
#	fdbserver/MasterProxyServer.actor.cpp
2019-07-27 15:08:13 -07:00
Evan Tschannen 02de53160d only skip confirm epoch live if CAUSAL_READ_RISKY is enabled
time checked on the proxy should be less than the time waited by the master to account for clock speed differences
setting REQUIRED_MIN_RECOVERY_DURATION and ENFORCED_MIN_RECOVERY_DURATION to 0 will go back to the old behavior
2019-07-12 17:58:16 -07:00
Evan Tschannen a63969afb3 enforce a minimum recovery duration, which allows proxies to avoid checking if the epoch is alive as long as its last commit has been less than MINIMUM_RECOVERY_DURATION ago 2019-07-12 13:10:21 -07:00
Evan Tschannen d8948c8be1 Merge branch 'master' into feature-fast-txs-recovery
# Conflicts:
#	fdbserver/TagPartitionedLogSystem.actor.cpp
2019-07-10 13:59:52 -07:00
Evan Tschannen c348b3da51 After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies 2019-07-08 12:53:40 -07:00
Evan Tschannen 15e894c724 Merge in master 2019-07-05 15:49:24 -07:00
Alex Miller ea6898144d Merge remote-tracking branch 'upstream/master' into flowlock-api 2019-07-03 20:44:15 -07:00
Jingyu Zhou b69d7adabc Remove unused remoteRecovered from master server 2019-07-01 15:41:35 -07:00
Evan Tschannen 52efcfd136 fix: properly create the right number for txsTags when changing between different numbers of logs 2019-06-27 15:15:05 -07:00
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and make it easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
Evan Tschannen e0be631414 shard the txs tag so that more transaction logs are involved in its recovery 2019-06-19 18:15:09 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Jingyu Zhou 8b5449e608 Fix review comments for PR #1473 2019-04-29 16:45:42 -07:00