This commit refactors the logic for recruiting singletons,
which is done by the ClusterController. This allows for far
easier additions of new singletons in the future, and also
cleans up the code.
Also, the logic for recruiting DD was changed to mirror
the logic for recruiting RK. Although the logic for RK
allows there to be many RKs existing at once, the moveKeysLock
mechanism used by DD still prevents multiple DDs existing at once.
Since Ratekeeper and DataDistributor are no longer running with Master, they
might be running with stateful processes before a new Master becomes alive,
which is undesirable.
This PR adds a monitoring of both Ratekeeper and DataDistributor at Cluster
Controller -- if Master runs on a stateless class and RK/DD runs at a worse
class, then RK/DD will be killed. I.e., RK/DD should be running at their own
classes or on the same stateless process as Master. After restart, RK/DD should
be running at a better process class.
Add a new role for ratekeeper.
Remove StorageServerChanges from data distribution.
Ratekeeper monitors storage servers, which borrows the idea from
DataDistribution.
Use getRateInfo's endpoint as the ID for the DataDistributorInterface.
For now, added a "rejoined" flag for ClusterControllerData and Proxy.
TODO: move DataDistributorInterface into ServerDBInfo.
Let cluster controller to start a new data distributor role by sending a
message to a chosen worker.
Change MasterInterface usage in DataDistribution to masterId
Add DataDistributor rejoin handling.
This allows the data distributor to tell the new cluster controller of its
existence so that the controller doesn't spawn a new one. I.e., there should
be only ONE data distributor in the cluster.
If DataDistributor (DD) doesn't join in a while, then ClusterController (CC) tries
to recruit one as DD. CC also monitors DD and restarts one if it failed.
The Proxy is also monitoring the DD. If DD failed, the Proxy will ask CC for
the new DD.
Add GetRecoveryInfo RPC to master server, which is called by data distributor
to obtain the recovery Transaction version from the master server.