We are currently emitting Role transition traces when a role starts and
when it ends. While this is useful for debugging, it doesn't work well
with tools that inject data and might potentially miss some trace lines.
We do decorate each trace lines with the roles assigned to that
particular process, however, this is not sufficient for tools that can
make use of the UID -> Role mapping
When master starts recruiting backup workers, if there is no active backup job
or the min version of the backup job is greater than old epoch's end version,
then these old epochs can be skipped.
Since tlog is not kept until backup worker has pulled mutations from it, the
old tlogs can only be displaced after oldest backup epoch equals current epoch.
So if master is not recruiting backup workers, it should set the oldest backup
epoch as the current epoch.
The start version of tlog set can be smaller than the last epoch's end version.
In this case, set backup worker's start version as last epoch's end version to
avoid overlapping of version ranges among backup workers.
This delay is to ensure old epoch's backup workers can save their progress in
the database. Otherwise, the new master could attempts to recruit backup
workers for the old epoch on version ranges that have already been popped. As
a result, the logs will lose data.
The start version of tlog set can be smaller than the last epoch's end version.
In this case, set backup worker's start version as last epoch's end version to
avoid overlapping of version ranges among backup workers.
This delay is to ensure old epoch's backup workers can save their progress in
the database. Otherwise, the new master could attempts to recruit backup
workers for the old epoch on version ranges that have already been popped. As
a result, the logs will lose data.
Right now, the default is to keep the old backup behavior, i.e., do NOT use
backup workers. Specifically, if BackupType is not set (or is set to default),
the master will not recruit backup workers and will not add pseudo locality for
backup workers.
The StartFullBackupTaskFunc is updated to check if backup worker is enabled.
Only when it is not enabled, starting a backup will wait on all backup workers
to be started.
Get rid of the complex logic of choosing the largest saved version from
previous epoch for the oldest epoch. Instead, use the begin version now
available from log system.
Sometimes the backup worker has not updated progress to the system space and a
master recovery happens. As a result, next epoch doesn't know the progress of
previous ones. This change is to check for such missing gaps and fill them with
the whole range [startVersion, endVersion).
The code is refactored into BackupProgress.actor.* to consolidate backup
progress processing for the master server.
The backup worker needs to update its progress even during consistency check by
commit transactions to the database. Thus we can't really achieve zero storage
server queue. So add a limit of 10,000 to pass the consistency check.
When a master starts, backup worker from old epochs may send BackupWorkerDoneRequest
to it. The master can be safely ignore it, since the checkRemoved logic of the
backup worker can self exit then.
For backup workers working on old epochs, once their work is done, they will
notify the master. Then the master removes them from the log system and
acknowledge back to the backup workers so that they can gracefully shut down.
The popping of a backup worker is stalled if there are workers from older
epochs still working. Otherwise, workers from old epochs will lost data.
However, allowing newer epoch to start backup can cause holes in version ranges.
The restore process must verify the backup progress to make sure there are no
holes, otherwise it has to wait.
After the backup worker recruitment is done, we need to force trigger the
registration with cluster controller. Otherwise, the log system may not have
the backup workers, which can stall backup workers from obtaining a cursor and
resulting in mutations being kept in TLogs.
For each log router ID, we track the popped version of each pseudo tag so that
the popping only applied to the minimum of these versions.
Also add more tracing for popping and epochs.
If there are unfinished ranges in the old epochs, the new master will recruit
backup workers responsible for finishing these ranges. These workers remains in
the cluster until the next epoch, when it will remove itself.
time checked on the proxy should be less than the time waited by the master to account for clock speed differences
setting REQUIRED_MIN_RECOVERY_DURATION and ENFORCED_MIN_RECOVERY_DURATION to 0 will go back to the old behavior