* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""
Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.
2. Update Status.actor to use the ClusterController interface to track
recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
prefix. For now it is kept as "Master"; the knob should allow a
smooth transition to the "Cluster" prefix, which seems more appropriate.
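A minimal sketch of how a knob-defined prefix could feed trace event names.
The struct, field, and function names below are hypothetical and chosen for
illustration only; they are not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for the server knob set; the field name is illustrative.
struct RecoveryKnobs {
    // Kept as "Master" for now; could later be switched to "Cluster".
    std::string recoveryEventNamePrefix = "Master";
};

// Build a trace event name from the configured prefix and an event suffix.
inline std::string recoveryEventName(const RecoveryKnobs& knobs, std::string_view suffix) {
    assert(!suffix.empty()); // an event name needs a non-empty suffix in this sketch
    std::string name = knobs.recoveryEventNamePrefix;
    name.append(suffix.data(), suffix.size());
    return name;
}
```

With the default prefix, recoveryEventName(knobs, "RecoveryState") yields
"MasterRecoveryState"; changing only the knob value renames every event
built this way.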
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine,
which is responsible for recruiting all other processes as well as
restoring the cluster state.
This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.
Potential advantages of this scheme:
1. Simplified design where ClusterController recruits the "sequencer"
process like other worker processes, compared to the current scheme
where the "sequencer" process gets special treatment. In the newer
scheme, the sequencer is responsible for maintaining/providing the
"committed version" (as expected).
2. ClusterController is responsible for recruiting worker processes;
in the current scheme the sequencer, though orchestrating the recovery
state machine, needs to reach out to the ClusterController to recruit
worker processes etc.
NOTE:
This patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; necessary
updates were also made for both functionality and performance
reasons.
Next Steps:
Cluster recovery documentation will be updated in the near future.
- Assertion failures in MoveKeys.actor.cpp
- Wrong results returned by getRange()
Changes:
DatabaseContext.h, NativeAPI.actor.[h,cpp]:
- Introduce a new flag, TransactionInfo::readVersionObtainedFromGrvProxy.
- Set this flag to true by default, and clear it when the read version of a
transaction is explicitly set (by using setVersion()).
- Modify getLatestCommitVersions() to not populate "latestCommitVersions" if
this flag is not set. (This will cause the storage server to read at the
specified read version.)
- Modify getRange() actor to always use the specified version as the read
version (except when the specified version is latestVersion).
- Modify waitForCommittedVersion(), getRawVersion(), and getConsistentReadVersion()
to update local version vector cache after receiving GetReadVersionReply.
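The flag behavior described above can be sketched as follows. This is a
simplified illustration under stated assumptions, not the NativeAPI code: the
types TransactionSketch and latestCommitVersionsFor, and the use of a plain
std::map keyed by an integer tag as the version cache, are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using Version = int64_t;

// Hypothetical, simplified stand-in for the per-transaction state.
struct TransactionSketch {
    Version readVersion = -1;
    // True by default; cleared when the read version is set explicitly.
    bool readVersionObtainedFromGrvProxy = true;

    void setVersion(Version v) {
        assert(v >= 0); // this sketch assumes explicit read versions are non-negative
        readVersion = v;
        readVersionObtainedFromGrvProxy = false;
    }
};

// Returns the cached per-tag commit versions only when the read version came
// from a GRV proxy. An empty result stands for "read at the specified version".
inline std::map<int, Version> latestCommitVersionsFor(const TransactionSketch& tr,
                                                      const std::map<int, Version>& cache) {
    if (!tr.readVersionObtainedFromGrvProxy) {
        return {};
    }
    return cache;
}
```

The point of the guard is that a version chosen by the caller has no relation
to the locally cached version vector, so the cache is not consulted for it.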
IClientApi.h, IConfigTransaction.h, ISingleThreadTransaction.h,
MultiVersionTransaction[.actor].[h,cpp], ThreadSafeTransaction.[h,cpp],
ApiWorkload.h:
- Add methods to get the spanID of a transaction and also the version vector
cached in a transaction. (Likely to be useful for debugging simulation test
failures.)
VersionVector.h:
- Update "maxVersion" when populating/applying a delta. (Note that empty
mutation messages only update VersionVector::maxVersion.)
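A minimal sketch of why "maxVersion" has to travel with a delta. The
VersionVectorSketch type and its method names are hypothetical and much
simpler than VersionVector.h; tags are modeled as plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

using Version = int64_t;

// Hypothetical, simplified version vector: tag id -> latest commit version.
struct VersionVectorSketch {
    std::map<int, Version> versions;
    Version maxVersion = 0;

    void setVersion(int tag, Version v) {
        Version& cur = versions[tag];
        cur = std::max(cur, v);
        maxVersion = std::max(maxVersion, v);
    }

    // Models an empty mutation message: no per-tag entry changes.
    void advanceMaxVersion(Version v) {
        assert(v >= 0); // versions are assumed non-negative in this sketch
        maxVersion = std::max(maxVersion, v);
    }

    // Entries newer than `since`, always carrying the current maxVersion.
    VersionVectorSketch getDelta(Version since) const {
        VersionVectorSketch delta;
        for (const auto& [tag, v] : versions) {
            if (v > since) delta.versions[tag] = v;
        }
        delta.maxVersion = maxVersion;
        return delta;
    }

    // Merge a delta; maxVersion is updated even if the delta has no entries.
    void applyDelta(const VersionVectorSketch& delta) {
        for (const auto& [tag, v] : delta.versions) {
            Version& cur = versions[tag];
            cur = std::max(cur, v);
        }
        maxVersion = std::max(maxVersion, delta.maxVersion);
    }
};
```

If applyDelta skipped the maxVersion update, a receiver that only ever saw
deltas produced after empty mutation messages would keep a stale maxVersion.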
BackupWorker.actor.cpp:
- Update local version vector cache after receiving GetReadVersionReply message.
Status.actor.cpp:
- Update local version vector cache and
TransactionInfo::info.readVersionObtainedFromGrvProxy after setting the
read version.
This merges the release-6.3 branch right before it was fully formatted.
There were quite a few conflicts, which are resolved here. CoroFlow had
a check for OOM errors introduced in 6.3, but it didn't seem applicable in
the new implementation, which seems to use boost.