Each mutation is prefixed with its version, sub-version, and size, all in
big-endian representation. This is required, especially for the leading version
field, because we use 0xFF for padding. A little-endian version number can
easily start with 0xFF and collide with the padding, while a big-endian one has
0x00 as its first byte for any version below 2^56.
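
A minimal sketch of such a prefix, for illustration only. The field widths
(8-byte version, 4-byte sub-version, 4-byte size) and the names Header,
putBigEndian, encodeHeader, and looksLikePadding are assumptions made for this
example, not the actual serialization code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: 8-byte version, 4-byte sub-version, 4-byte size.
using Header = std::array<uint8_t, 16>;

// Write the low `n` bytes of `value` to `out`, most significant byte first.
inline void putBigEndian(uint8_t* out, uint64_t value, std::size_t n) {
    assert(n >= 1 && n <= 8);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<uint8_t>(value >> (8 * (n - 1 - i)));
    }
}

// Build the big-endian prefix that precedes a serialized mutation.
inline Header encodeHeader(int64_t version, uint32_t sub, uint32_t size) {
    Header h{};
    putBigEndian(h.data(), static_cast<uint64_t>(version), 8);
    putBigEndian(h.data() + 8, sub, 4);
    putBigEndian(h.data() + 12, size, 4);
    return h;
}

// A reader can tell padding from a header by looking at the first byte.
inline bool looksLikePadding(const Header& h) {
    return h[0] == 0xFF;
}
```

With version 0x01FF, a little-endian encoding would begin with 0xFF and be
mistaken for padding. The big-endian prefix begins with 0x00.
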
Duplicates can happen because backup workers may store the logs for old epochs
successfully but fail to update the progress before another recovery happens.
As a result, the next epoch retries and creates duplicate log files.
VersionedData used to include a MutationRef, which is made from BinaryReader.
Unfortunately, the StringRef inside MutationRef points to memory allocated from
the BinaryReader's arena, which is freed after the BinaryReader is destroyed.
Changing to a StringRef that points to the serialized mutation fixes this bug.
fdbconvert is intended to convert new backup files, which are tagged mutation
logs, to the old backup format. The actual conversion is not included in this
commit and will be added in future commits.
Note that the BackupContainer needs to be updated to support the new backup
files, which is also not included in this commit.
Get rid of the complex logic of choosing the largest saved version from the
previous epoch for the oldest epoch. Instead, use the begin version now
available from the log system.
This simplifies the backup process: whenever there is an old epoch in the log
system, we always know its begin version and can back up from that version if
no progress is known for that old epoch.
Sometimes a master recovery happens before the backup worker has updated its
progress in the system space. As a result, the next epoch doesn't know the
progress of previous ones. This change checks for such gaps and fills them with
the whole range [startVersion, endVersion).
The code is refactored into BackupProgress.actor.* to consolidate backup
progress processing for the master server.
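
The gap filling can be sketched roughly as follows. This is a simplified
illustration, not the code in BackupProgress.actor.*: it tracks one saved
version per epoch, and the names EpochRange, BackupWork, and getUnfinishedWork
are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Version = int64_t;
using Epoch = int64_t;

// Version range [begin, end) of an old epoch, as known from the log system.
struct EpochRange {
    Version begin;
    Version end;
};

// A range [begin, end) that still has to be backed up for an epoch.
struct BackupWork {
    Epoch epoch;
    Version begin;
    Version end;
};

// savedVersions maps an epoch to the largest version reported as saved.
// An epoch with no entry has unknown progress and gets its whole range.
std::vector<BackupWork> getUnfinishedWork(const std::map<Epoch, EpochRange>& oldEpochs,
                                          const std::map<Epoch, Version>& savedVersions) {
    std::vector<BackupWork> work;
    for (const auto& kv : oldEpochs) {
        const EpochRange& range = kv.second;
        assert(range.begin <= range.end);
        Version next = range.begin; // default: no progress known
        auto it = savedVersions.find(kv.first);
        if (it != savedVersions.end()) {
            next = std::max(range.begin, it->second + 1);
        }
        if (next < range.end) {
            work.push_back({ kv.first, next, range.end });
        }
    }
    return work;
}
```

An epoch whose saved version already reaches end - 1 produces no work. An epoch
missing from savedVersions is scheduled for its full [begin, end) range.
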
Each master starts from an empty set of backup workers and recruits a new set,
so there is no need to save the current backup workers to DBCoreState. Note that
the current backup workers still need to be serialized to LogSystemConfig (in
ServerDBInfo) so that backup workers can check if they have been displaced.
For backup workers working on old epochs, make it a contract that the worker
won't pull messages after the end version. This potentially saves memory and
simplifies the saving logic.
Fix the wrong backup epoch when sending BackupWorkerDoneRequest.
Previously, the pop version was the min of minKnownCommittedVersion and
endVersion. For a backup worker of a previous epoch, the endVersion should be
used.
For an ILogPeekCursor, the arena becomes invalid if hasMessage() is false.
So the backup worker needs to keep a reference to the arena so that the message
refers to memory that is still valid.
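
The ownership pattern can be illustrated with standard C++ stand-ins. Here a
shared_ptr plays the role of a reference-counted arena, and PeekCursor,
SavedMessage, and take are hypothetical names used only for this sketch. The
real ILogPeekCursor and Arena types differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a reference-counted arena: memory lives while any handle does.
using Arena = std::shared_ptr<std::vector<std::string>>;

// Stand-in for StringRef: a non-owning pointer plus length.
struct StringRef {
    const char* data;
    std::size_t size;
};

// Hypothetical cursor that drops its arena once it has no message.
class PeekCursor {
    Arena arena_;

public:
    void load(const std::string& msg) {
        arena_ = std::make_shared<std::vector<std::string>>();
        arena_->push_back(msg);
    }
    bool hasMessage() const { return arena_ != nullptr; }
    StringRef getMessage() const {
        const std::string& s = arena_->back();
        return { s.data(), s.size() };
    }
    Arena arena() const { return arena_; }
    void advance() { arena_.reset(); } // cursor no longer keeps the memory alive
};

// The worker stores the arena handle next to the message that points into it.
struct SavedMessage {
    Arena arena;
    StringRef message;
};

SavedMessage take(const PeekCursor& cursor) {
    assert(cursor.hasMessage());
    return { cursor.arena(), cursor.getMessage() };
}
```

Because SavedMessage holds its own arena handle, the bytes behind message stay
valid after the cursor advances and hasMessage() turns false.
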
If a mutation has txsTag, then it is a change to the in-memory key-value store,
i.e., the transaction state store, and should be ignored by the backup worker.
The only exception is the "metadataVersionKey", which needs to be stored in
the backup.
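
As a rough sketch of this filter: the Mutation struct, the integer tag
representation, the value of kTxsTag, and the literal used for
kMetadataVersionKey are all assumptions made for illustration and are not taken
from the actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Assumed stand-ins: tags are plain ints and kTxsTag marks txs mutations.
const int kTxsTag = -1;
// Assumed key literal for metadataVersionKey; check the real definition.
const std::string kMetadataVersionKey = "\xff/metadataVersion";

struct Mutation {
    std::string key;
    std::vector<int> tags;
};

// Returns true if the backup worker should save this mutation.
bool shouldBackup(const Mutation& m) {
    const bool hasTxsTag = std::find(m.tags.begin(), m.tags.end(), kTxsTag) != m.tags.end();
    // Transaction state store changes are skipped, except metadataVersionKey.
    return !hasTxsTag || m.key == kMetadataVersionKey;
}
```
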
This is the first step in the new backup's data pipeline. Verification of file
content is needed in future commits. Clear documentation of the file format is
a work in progress.
The backup worker needs to update its progress even during the consistency
check by committing transactions to the database. Thus we can't really achieve
a zero storage server queue. So add a limit of 10,000 to pass the consistency
check.
When a master starts, backup workers from old epochs may send a
BackupWorkerDoneRequest to it. The master can safely ignore it, since the
checkRemoved logic lets the backup worker exit on its own.
For backup workers working on old epochs, once their work is done, they
notify the master. The master then removes them from the log system and
acknowledges the backup workers so that they can gracefully shut down.
The popping of a backup worker is stalled if there are workers from older
epochs still working. Otherwise, workers from old epochs will lose data.
However, allowing a newer epoch to start its backup can cause holes in version
ranges. The restore process must verify the backup progress to make sure there
are no holes; otherwise it has to wait.
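
One way to express the hole check, as a hedged sketch: VersionRange and
isContinuous are hypothetical names, and the real restore logic may track
progress differently (for example per tag).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Version = int64_t;

// A backed-up version range [begin, end).
struct VersionRange {
    Version begin;
    Version end;
};

// True if the union of `ranges` covers [begin, end) without holes.
bool isContinuous(std::vector<VersionRange> ranges, Version begin, Version end) {
    assert(begin <= end);
    std::sort(ranges.begin(), ranges.end(),
              [](const VersionRange& a, const VersionRange& b) { return a.begin < b.begin; });
    Version covered = begin; // everything in [begin, covered) is known to be saved
    for (const VersionRange& r : ranges) {
        if (covered >= end) break;
        if (r.begin > covered) return false; // hole in [covered, r.begin)
        covered = std::max(covered, r.end);
    }
    return covered >= end;
}
```

A restore would wait and re-check while isContinuous returns false for the
requested range.
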
For backup workers created for a previous epoch, we need to associate them with
the correct epoch so that peekLogRouter can later get the correct peek cursor.
Otherwise, the workers can never peek the missing range of mutations.
After the backup worker recruitment is done, we need to force the registration
with the cluster controller. Otherwise, the log system may not have the backup
workers, which can stall backup workers from obtaining a cursor and result in
mutations being kept in TLogs.
Separate the popping logic into an actor with a shorter interval than the upload
interval. More critically, even if there are no mutations (e.g., during a quiet
database period), the popped version should still be advanced.