15 Commits

Author SHA1 Message Date
FDB Formatster df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
sfc-gh-tclinkenbeard 5b2e88b187 Use structured bindings in for loops 2020-12-27 01:46:20 -04:00
Jingyu Zhou d883426c6a Fix spammy GotBackupProgress events
Only print these events during master recovery, and don't log them for
backup workers.
2020-06-27 21:30:38 -07:00
Jingyu Zhou 7e5551ea19 Avoid overlapping version ranges for backup workers
Sometimes, an epoch's begin version is lower than the previous epoch's end
version. In some rare cases, the master ends up recruiting backup workers for
both epochs, which then have overlapping ranges of [epochBeginVersion, prevEpochEndVersion].
Since the popping order is by epoch, the previous epoch can pop the mutations and
save them to a log file. This epoch then misses the popped mutations in the
overlapping range, causing corrupted mutation logs.
2020-04-10 21:19:37 -07:00
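The overlap described in 7e5551ea19 can be illustrated with a minimal sketch. It assumes half-open ranges [begin, end) and resolves the overlap by raising a later epoch's begin version to the previous epoch's end version; the commit message does not say which side the actual fix trims. The names `VersionRange` and `trimOverlaps` are hypothetical and are not taken from the FoundationDB sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

using Version = int64_t;   // assumption: versions are 64-bit integers
using LogEpoch = int64_t;  // assumption: epochs are ordered integers

// Hypothetical half-open version range [begin, end).
struct VersionRange {
    Version begin;
    Version end;
};

// Walk the epochs in order and make sure no epoch starts before the
// largest end version seen so far, so each version has a single owner.
std::map<LogEpoch, VersionRange> trimOverlaps(const std::map<LogEpoch, VersionRange>& epochs) {
    std::map<LogEpoch, VersionRange> trimmed;
    std::optional<Version> prevEnd;
    for (const auto& [epoch, range] : epochs) {
        assert(range.begin <= range.end);
        VersionRange r = range;
        if (prevEnd) {
            r.begin = std::max(r.begin, *prevEnd);
        }
        if (r.begin < r.end) {
            trimmed.emplace(epoch, r);  // drop epochs left with nothing to back up
        }
        prevEnd = prevEnd ? std::max(*prevEnd, range.end) : range.end;
    }
    return trimmed;
}
```

With epoch 1 covering [100, 200) and epoch 2 covering [150, 300), the sketch leaves epoch 1 unchanged and trims epoch 2 to [200, 300).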
Jingyu Zhou c59b0844a9 Add total number of tags to WorkerBackupStatus
This allows the backup worker to check the number of tags.
2020-03-20 20:15:08 -07:00
Jingyu Zhou d8731a1796 Refactor to use std::find_if for more concise code 2020-03-20 20:15:08 -07:00
Jingyu Zhou 696ce6aa82 Fix compiling error of reverse iterators
The macOS and Windows compilers don't accept the use of the "!=" operator on
std::map::reverse_iterator.
2020-03-20 20:15:08 -07:00
Jingyu Zhou 31f7108eab Handle partial recovery in BackupProgress
A partial recovery can result in an empty epoch that copies the previous epoch's
version range. In this case, getOldEpochTagsVersionsInfo() will not return the
previous epoch's information. To correctly compute the start version for a
backup worker, we need to check the previous epoch's saved version. If it is
larger than this epoch's begin version, use the previously saved version as the
start version.
2020-03-20 20:15:08 -07:00
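The start-version rule in 31f7108eab can be written as a small function. This is a sketch that follows the wording of the commit message literally; the name `startVersionFor` and the use of `std::optional` for "no saved version found" are assumptions, and the real code may advance past the saved version rather than reuse it.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using Version = int64_t;  // assumption: versions are 64-bit integers

// Pick the start version for a backup worker in the current epoch.
// prevSaved is the previous epoch's saved version, if any was recorded.
Version startVersionFor(Version epochBegin, const std::optional<Version>& prevSaved) {
    assert(epochBegin >= 0);
    if (prevSaved && *prevSaved > epochBegin) {
        // The previous epoch already saved past this epoch's begin version
        // (the partial-recovery case), so resume from the saved version.
        return *prevSaved;
    }
    return epochBegin;
}
```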
Jingyu Zhou fda6c08640 Include a total number of tags in partition log file names
This is needed for BackupContainer to check that partitioned mutation logs are
continuous, i.e., restorable to a version.
2020-03-20 20:13:38 -07:00
Jingyu Zhou 5a602f58e8 Start backup with a wait on all backup workers running
This wait makes sure that backup workers are already saving mutations so
that no mutations are missed. The idea is that the CLI sets a "backupStartedKey"
in the database and waits for the allWorkerStarted() key of the backup to be set.

Backup workers monitor changes to the "backupStartedKey" and start logging
mutations. Additionally, the backup worker for Tag(-2,0) monitors whether all other
workers have started (by checking that their saved progress version is larger than
the backup's start version), and then sets the allWorkerStarted() key for the backup.
2020-01-31 19:29:09 -08:00
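The check that the Tag(-2,0) worker performs in 5a602f58e8 can be sketched as follows. This is an illustration only: tags are reduced to plain integer ids, the expected number of workers is passed in as `totalTags`, and the function name `allWorkersStarted` is hypothetical. How the real worker reads other workers' progress from the database is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

using Version = int64_t;  // assumption: versions are 64-bit integers

// savedByTag maps a (simplified) tag id to that worker's saved progress version.
// Returns true only when every expected worker has reported progress and
// each saved version is larger than the backup's start version.
bool allWorkersStarted(const std::map<int, Version>& savedByTag, int totalTags, Version backupStart) {
    assert(totalTags >= 0);
    if (static_cast<int>(savedByTag.size()) < totalTags) {
        return false;  // at least one worker has not reported any progress yet
    }
    return std::all_of(savedByTag.begin(), savedByTag.end(),
                       [backupStart](const auto& entry) { return entry.second > backupStart; });
}
```

Only when this predicate holds would the monitoring worker set the allWorkerStarted() key that the CLI is waiting on.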
Jingyu Zhou c08a192c75 Add a backup start key
If the backup key is not set, do not recruit backup workers for old epochs.
2020-01-22 19:42:13 -08:00
Jingyu Zhou 2b2325036a Fix compiler error of using override 2020-01-22 19:38:46 -08:00
Jingyu Zhou 4ed75e37f3 BackupProgress uses old epoch's begin version if no progress found
Get rid of the complex logic of choosing the largest saved version from the
previous epoch for the oldest epoch. Instead, use the begin version now
available from the log system.
2020-01-22 19:38:46 -08:00
Jingyu Zhou 250137a52f Change BackupProgress to be a class
Struct doesn't need addref() or delref() members, though.
2020-01-22 19:38:46 -08:00
Jingyu Zhou 64052f6349 Check and fill backup gaps for old epochs and tags
Sometimes a master recovery happens before the backup worker has updated its
progress in the system space. As a result, the next epoch doesn't know the progress
of previous ones. This change checks for such gaps and fills them with
the whole range [startVersion, endVersion).

The code is refactored into BackupProgress.actor.* to consolidate backup
progress processing for the master server.
2020-01-22 19:38:46 -08:00
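The gap filling described in 64052f6349 can be sketched for a single old epoch. The sketch assumes tags are plain integer ids in [0, numTags), that a saved version v means everything up to and including v is backed up (so work resumes at v + 1), and that ranges are half-open as in the commit message. `VersionRange` and `unfinishedRanges` are hypothetical names, not the functions in BackupProgress.actor.*.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using Version = int64_t;  // assumption: versions are 64-bit integers

// Hypothetical half-open version range [begin, end).
struct VersionRange {
    Version begin;
    Version end;
};

// For one old epoch, compute the range each tag still has to back up.
// Tags with no recorded progress get the whole range [startVersion, endVersion).
std::map<int, VersionRange> unfinishedRanges(Version startVersion, Version endVersion, int numTags,
                                             const std::map<int, Version>& savedByTag) {
    assert(startVersion <= endVersion);
    std::map<int, VersionRange> todo;
    for (int tag = 0; tag < numTags; ++tag) {
        auto it = savedByTag.find(tag);
        if (it == savedByTag.end()) {
            todo[tag] = VersionRange{startVersion, endVersion};  // gap: no progress known
        } else if (it->second + 1 < endVersion) {
            todo[tag] = VersionRange{it->second + 1, endVersion};  // partially backed up
        }
        // Otherwise the tag has already saved everything below endVersion.
    }
    return todo;
}
```

For an epoch covering [100, 200) with three tags, where tag 0 saved version 199 and tag 1 saved version 150, the sketch returns [151, 200) for tag 1 and the whole range [100, 200) for tag 2, and nothing for tag 0.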