Commit Graph

163 Commits

Author SHA1 Message Date
Jingyu Zhou 5a602f58e8 Start backup with a wait on all backup workers running
This wait ensures that backup workers are already saving mutations, so that no
mutations are missed. The CLI sets a "backupStartedKey" in the database and
waits for the backup's allWorkerStarted() key to be set.

Backup workers monitor changes to the "backupStartedKey" and start logging
mutations. Additionally, the backup worker for Tag(-2,0) checks that all other
workers have started (that is, their saved progress version is larger than the
backup's start version), and then sets the allWorkerStarted() key for the backup.
2020-01-31 19:29:09 -08:00
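The "all workers started" check described in the commit above can be sketched roughly as follows. This is a minimal Python illustration, not the FoundationDB code: the function name `all_workers_started` and the dictionary of per-worker saved versions are hypothetical stand-ins for the saved progress that the Tag(-2,0) worker inspects, and treating an empty map as "not started" is an assumption.

```python
from typing import Dict


def all_workers_started(saved_versions: Dict[str, int],
                        backup_start_version: int) -> bool:
    """Return True once every worker's saved progress version is larger
    than the backup's start version (hypothetical helper)."""
    # Assumption: if no worker has reported progress yet, keep waiting.
    if not saved_versions:
        return False
    return all(v > backup_start_version for v in saved_versions.values())


# Hypothetical worker names and versions, for illustration only.
progress = {"worker-0": 120, "worker-1": 95}
ready = all_workers_started(progress, backup_start_version=100)  # worker-1 lags
```

Only when such a check passes would the allWorkerStarted() key be set, which in turn releases the waiting CLI.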
Jingyu Zhou 83907bd453 Backup worker allows backups to be turned on and off
The monitoring loop watches the system key "backupStartedKey" and decides to be
in one of two modes: NOOP and backup. In the NOOP mode, the worker just pops
TLogs. In the backup mode, the worker pulls mutations from TLogs and saves the
mutations into logs.
2020-01-31 19:29:09 -08:00
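The two-mode decision in the commit above could look something like the sketch below. It is illustrative only: `Mode` and `choose_mode` are hypothetical names, and modelling an unset key as `None` is an assumption about how the key's absence is observed.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    NOOP = "noop"      # only pop TLogs, save nothing
    BACKUP = "backup"  # pull mutations from TLogs and save them into logs


def choose_mode(backup_started_value: Optional[bytes]) -> Mode:
    """Pick the worker mode from the current value of "backupStartedKey"."""
    # Assumption: None means the key is not set, i.e. no backup is active.
    if backup_started_value is None:
        return Mode.NOOP
    return Mode.BACKUP
```

A monitoring loop would call a function like this each time the key changes and switch behavior accordingly.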
Evan Tschannen 8f599e9d15 fix: backupWorker would crash when run outside of simulation 2020-01-23 19:06:39 -08:00
Jingyu Zhou 39fbacbc4f Address review comments 2020-01-22 19:43:40 -08:00
Jingyu Zhou 1311fec45a Add an option to get minKnownCommittedVersion from Proxies
The backup worker needs to use this version for popping when running in NOOP
mode. This option is added to GetReadVersionRequest, and proxies send back
minKnownCommittedVersion if the option is set.

Also add a couple of knobs for backup workers.
2020-01-22 19:42:13 -08:00
Jingyu Zhou 7989f3f015 Add NOOP to backup worker
The backup worker just blindly pops tags if the "backupStartedKey" is not set.
Note the commit version from TLog cannot be used as the pop version, because
for a single region, during a recovery the log router tags are used to recover
mutations. The backup worker can potentially pop mutations that are needed for
recovery, causing consistency errors. So the solution for now is to use commit
version - 5,000,000, which is a version guaranteed to be persisted on all
replicas.
2020-01-22 19:42:13 -08:00
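The pop-version rule from the commit above amounts to simple arithmetic, sketched here under stated assumptions. The constant and function names are hypothetical, and clamping the result at zero is an added safeguard that the commit message does not mention.

```python
# Margin taken from the commit message: commit version - 5,000,000.
POP_VERSION_MARGIN = 5_000_000


def noop_pop_version(commit_version: int) -> int:
    """Version a NOOP-mode worker would pop up to (hypothetical helper)."""
    # Assumption: never return a negative version for very early commits.
    return max(0, commit_version - POP_VERSION_MARGIN)
```

For example, a commit version of 12,000,000 gives a pop version of 7,000,000, leaving the most recent 5,000,000 versions in place in case a recovery needs them.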
Jingyu Zhou 60f360c954 Log oldest backup epoch in the backup worker 2020-01-22 19:38:46 -08:00
Jingyu Zhou 568a8a8e77 Use big endian for mutation log files
For each mutation, its version, sub-version, and size are written as a prefix in
big-endian representation. This is required, especially for the leading version
field, because we use 0xFF for padding. A little-endian version number can
easily collide with 0xFF, while big endian is guaranteed to have 0x00 as the
first byte.
2020-01-22 19:38:46 -08:00
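The byte-order argument in the commit above can be checked with a short sketch. The field widths used here (8-byte version, 4-byte sub-version, 4-byte size) and the helper names are assumptions for illustration; the commit message does not state them. The leading 0x00 byte holds for any non-negative version below 2**56.

```python
import struct

# Assumed layout: 8-byte version, 4-byte sub-version, 4-byte size, big endian.
HEADER_FORMAT = ">qII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def encode_header(version: int, sub_version: int, size: int) -> bytes:
    """Pack the per-mutation prefix in big-endian order."""
    return struct.pack(HEADER_FORMAT, version, sub_version, size)


def decode_header(buf: bytes):
    """Unpack (version, sub_version, size) from the start of buf."""
    return struct.unpack(HEADER_FORMAT, buf[:HEADER_SIZE])


# A version whose low-order byte is 0xFF shows the collision risk.
version = 0x01FF
big_first = encode_header(version, 0, 0)[0]      # high-order byte comes first
little_first = struct.pack("<q", version)[0]     # low-order byte comes first
```

With big endian the first byte of the prefix is 0x00, so a reader can tell a real mutation header apart from 0xFF padding; with little endian the same version would start with 0xFF.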
Jingyu Zhou 954743977b Add paddings to a block in mutation log files
This is needed because otherwise decoding cannot be performed.
2020-01-22 19:38:46 -08:00
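Block padding of the kind described in the commit above might be sketched as follows. The 0xFF pad byte comes from the big-endian commit message in this list; the block size and the `pad_block` helper are hypothetical, since the listing does not give the actual values or code.

```python
PAD_BYTE = b"\xff"  # padding byte named in the big-endian commit message


def pad_block(data: bytes, block_size: int) -> bytes:
    """Extend data with 0xFF bytes up to the next multiple of block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    remainder = len(data) % block_size
    if remainder == 0:
        return data  # already aligned, nothing to add
    return data + PAD_BYTE * (block_size - remainder)


# Hypothetical 16-byte block size, chosen only to keep the example small.
padded = pad_block(b"\x00" * 10, block_size=16)
```

Padding each block to a fixed size lets a decoder read the file block by block and treat a leading 0xFF as the end of the block's data.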
Jingyu Zhou e4aea9b66d Use VectorRef<Tag> for VersionedMessage 2020-01-22 19:38:46 -08:00
Jingyu Zhou 7f7ec99170 Serialize and deserialize new backup files
The BackupWorker writes files that can be read by FileConverter. Move
StringRefReader to the header file for reuse in FileConverter.
2020-01-22 19:38:46 -08:00
Jingyu Zhou f21d7ca44c Add tag ID to backup log file names 2020-01-22 19:38:46 -08:00
Jingyu Zhou 2c83fbfe6c Rename to BackupWorker.actor.cpp to be explicit
There is already one file named backup.actor.cpp in "fdbbackup/".
2020-01-22 19:38:46 -08:00