Commit Graph

68 Commits

Author SHA1 Message Date
Meng Xu 41c0a1768f FastRestore:Make FastRestore event type more descriptive 2020-05-01 10:27:08 -07:00
Meng Xu 5ebafdb94c FastRestore:Apply clang-format to changes 2020-04-07 15:57:03 -07:00
Meng Xu e5b2cd81d5 FastRestore:Cleanup debug code 2020-04-07 15:56:44 -07:00
Meng Xu 0034d6fc85 FastRestore:Master:Fix:Handling the last log file 2020-04-07 13:28:11 -07:00
Meng Xu a51ff7aaae FastRestore:Fix:buildVersionBatches may lose the last log file
If the last log file's endVersion decides the last version batch's endVersion,
the buildVersionBatches function may quit early before including the last log file.

This causes some mutations to be missing and leads to an incorrect DB.

This commit also adds an ASSERT(maxVBVersion >= targetVersion) to
flag such errors as early as possible and simplify debugging.
2020-04-06 12:24:26 -07:00
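The sanity check named in the commit above can be illustrated with a minimal sketch. The struct and function names here (`BatchRangeSketch`, `maxBatchEndVersion`, `batchesCoverTarget`) are hypothetical and are not taken from the FoundationDB source; only the condition `maxVBVersion >= targetVersion` comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a version batch: only the version range is modeled.
struct BatchRangeSketch {
    int64_t beginVersion;
    int64_t endVersion;
};

// Largest endVersion over all batches; 0 if there are no batches.
int64_t maxBatchEndVersion(const std::vector<BatchRangeSketch>& batches) {
    int64_t maxVBVersion = 0;
    for (const auto& b : batches) {
        if (b.endVersion > maxVBVersion) maxVBVersion = b.endVersion;
    }
    return maxVBVersion;
}

// Mirrors the condition in ASSERT(maxVBVersion >= targetVersion):
// if it is false, some file needed to reach the target version was dropped.
bool batchesCoverTarget(const std::vector<BatchRangeSketch>& batches, int64_t targetVersion) {
    return maxBatchEndVersion(batches) >= targetVersion;
}
```

Checking this right after the batches are built surfaces a dropped last log file immediately, rather than as a data mismatch at the end of the restore.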
Meng Xu a81ec332a9 FastRestore:Fix:Master cannot throttle on in-progress version batches when it releases batches out of order in simulation 2020-04-04 17:34:26 -07:00
Jingyu Zhou 6b0d2923e7 Add target version as the limit for version batches
If using partitioned logs, the mutations after the target version can be
included if this limit is not considered.
2020-03-20 20:15:09 -07:00
Jingyu Zhou 6b9b93314e Check block padding is \0xff for new mutation logs 2020-03-20 20:13:38 -07:00
Jingyu Zhou 88ad28e576 Integrate parallel restore with partitioned logs
In parallel restore, use new getPartitionedRestoreSet() to get a set containing
partitioned mutation logs. The loader uses a new parser to extract mutations
from partitioned logs.

TODO: fix unable to restore errors.
2020-03-20 20:13:38 -07:00
Meng Xu 2520e8d44c FastRestore:Use more concise code as suggested in review 2020-03-01 22:32:36 -08:00
Meng Xu fe8b8bbbff FastRestore:Change vb state to class from enum 2020-02-27 20:15:25 -08:00
Meng Xu d77177367c FastRestore:Track each ongoing version batch progress state for applier and loader roles 2020-02-27 19:47:22 -08:00
Meng Xu 97d7eb49b5 FastRestore:Master:Report unavailable role periodically
Ping all restore roles and report unavailable ones.
2020-02-26 16:14:55 -08:00
Meng Xu ca726fc68e FastRestore:Introduce OOM protection
An actor is schedulable to run if the current worker has enough resources, i.e.,
the worker's memory usage is below the threshold.
Exception: If the actor is working on the current version batch, we have to schedule
the actor to run to avoid deadlock.
Future: When we release the actors that are blocked by memory usage, we should release them
in increasing order of their version batch.
2020-02-26 14:09:18 -08:00
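The scheduling rule in the commit above can be written as a small predicate. This is a sketch under stated assumptions: `WorkerMemoryState`, `isSchedulable`, and the byte-based threshold are hypothetical names chosen for illustration, not the identifiers used in the restore code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of a worker's memory situation.
struct WorkerMemoryState {
    int64_t usedBytes;      // current memory usage of the worker
    int64_t thresholdBytes; // assumed limit above which new work is deferred
};

// Returns true if an actor working on `actorBatchIndex` may run now.
bool isSchedulable(const WorkerMemoryState& mem, int actorBatchIndex, int currentBatchIndex) {
    // Exception from the commit message: work on the current version batch
    // must always proceed, otherwise the batch that would free memory can
    // never finish and the worker deadlocks.
    if (actorBatchIndex == currentBatchIndex) return true;
    // Otherwise only run while memory usage is below the threshold.
    return mem.usedBytes < mem.thresholdBytes;
}
```

The sketch does not model the "Future" item, releasing blocked actors in increasing order of version batch; that would need a queue ordered by batch index.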
Meng Xu 31a6ec34b7 Merge branch 'master' into mengxu/fast-restore-agent-PR 2020-02-18 16:17:59 -08:00
Meng Xu a12a161fb3 Merge branch 'master' into mengxu/fast-restore-pipeline-PR 2020-02-18 14:49:52 -08:00
Meng Xu c603b20e7e FastRestore:Resolve review comments 2020-02-18 14:08:27 -08:00
Meng Xu fd5b4af05a FastRestore:Add trace for each phase on master 2020-02-09 18:54:10 -08:00
Meng Xu 3b57bf1781 Merge branch 'master' into mengxu/fast-restore-agent-PR 2020-02-03 17:23:54 -08:00
Meng Xu 7f37a90c48 FastRestore:Introduce FASTRESTORE_VB_PARALLELISM
for controlling the number of concurrently running version batches.
2020-01-28 10:39:57 -08:00
Meng Xu 5330dd1937 FastRestore:Randomize running order of version batches 2020-01-27 22:39:25 -08:00
Meng Xu cab9d51e06 Merge branch 'master' into mengxu/fast-restore-pipeline-PR 2020-01-27 18:16:26 -08:00
Meng Xu 141609e80a FastRestore:Improve code style and fix typos 2020-01-27 18:13:14 -08:00
Meng Xu 75dc34f775 FastRestore:A single version may be larger than version batch threshold
In case data at a single version is larger than FASTRESTORE_VERSIONBATCH_MAX_BYTES,
we should allow a version batch to include the version and ignore the
FASTRESTORE_VERSIONBATCH_MAX_BYTES limit to avoid false positive in simulation.

In a real environment, this situation will report SevError to ask the DBA to
increase the memory limit for a version batch.
2020-01-27 12:17:20 -08:00
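A minimal sketch of the batching rule described above: versions are grouped until the size limit would be exceeded, but a batch always takes at least one version even when that version alone is over the limit. `VersionBytes`, `buildBatchesSketch`, and the inclusive `[first, last]` ranges are assumptions made for illustration; the real buildVersionBatches works on backup files and uses an exclusive endVersion.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical input: total bytes of data at one version.
struct VersionBytes {
    int64_t version;
    int64_t bytes;
};

// Groups consecutive versions into batches whose total size stays within
// maxBytes where possible. Input is assumed to be sorted by version.
// Each returned pair is the inclusive [first, last] version of one batch.
std::vector<std::pair<int64_t, int64_t>> buildBatchesSketch(const std::vector<VersionBytes>& data,
                                                            int64_t maxBytes) {
    std::vector<std::pair<int64_t, int64_t>> batches;
    int64_t batchBytes = 0;
    for (const auto& v : data) {
        bool startNew = batches.empty() || batchBytes + v.bytes > maxBytes;
        if (startNew) {
            // A new batch always accepts its first version, even if that
            // single version is already larger than maxBytes.
            batches.push_back({ v.version, v.version });
            batchBytes = v.bytes;
        } else {
            batches.back().second = v.version;
            batchBytes += v.bytes;
        }
    }
    return batches;
}
```

Because an oversized version gets a batch of its own instead of being rejected, the loop always makes progress.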
Meng Xu 76f30e71dc FastRestore:Init VersionBatch explicitly
Built-in variables may not be zero-initialized by the
compiler-provided default constructor.
2020-01-26 13:15:45 -08:00
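One common way to make such initialization explicit in C++ is default member initializers, sketched below. The struct and its fields are hypothetical and only loosely modeled on what a version batch might hold; the commit itself may have used a different approach, such as a user-defined constructor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical version batch with explicit defaults. Without the "= 0"
// initializers, `VersionBatchSketch vb;` would leave these built-in members
// with indeterminate values, because the implicit default constructor does
// not zero them.
struct VersionBatchSketch {
    int64_t beginVersion = 0;
    int64_t endVersion = 0;
    double sizeBytes = 0.0;
    int batchIndex = 0;
};
```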
Meng Xu b04e98771e FastRestore:Replace FastRestoreOpConfig with Knobs
And randomize value for the rest of knobs
2020-01-24 14:24:34 -08:00
Meng Xu 16f9ec45bd Merge branch 'master' into mengxu/fast-restore-pipeline-PR 2020-01-23 20:15:21 -08:00
Meng Xu 4bf579a6d5 FastRestore:Fix race condition in pipeline
Master should not start asking appliers to apply mutations at batchIndex
until all appliers have applied mutations at (batchIndex - 1).
Otherwise, mutations may not be applied in increasing order of versions,
because appliers at different batch index can have overlapped key ranges.
2020-01-23 16:34:45 -08:00
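The ordering constraint in the commit above can be sketched as a gate that the master consults before starting the apply phase of a batch. `ApplyGateSketch` and its methods are hypothetical names; batch indices are assumed to start at 1, with 0 meaning "nothing applied yet".

```cpp
#include <cassert>
#include <vector>

// Tracks, per applier, the highest batch index whose mutations it has applied.
class ApplyGateSketch {
public:
    explicit ApplyGateSketch(int numAppliers) : finished_(numAppliers, 0) {}

    // Record that `applier` has applied all mutations of `batchIndex`.
    void markApplied(int applier, int batchIndex) {
        if (batchIndex > finished_[applier]) finished_[applier] = batchIndex;
    }

    // The master may start applying `batchIndex` only once every applier has
    // finished (batchIndex - 1). Appliers of different batches can own
    // overlapping key ranges, so starting earlier could apply mutations out
    // of version order.
    bool canStartApplying(int batchIndex) const {
        for (int done : finished_) {
            if (done < batchIndex - 1) return false;
        }
        return true;
    }

private:
    std::vector<int> finished_;
};
```

Loading and sampling of later batches can still overlap with earlier ones; only the apply phase is serialized by this gate.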
Meng Xu 52e3d20d39 FastRestore:VersionBatch replace vector with set
In order to ensure each backup file only appears in version batch once.
2020-01-22 13:13:10 -08:00
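A short sketch of why a set helps here: inserting the same file twice is a no-op, so a batch cannot list a backup file more than once. `BackupFileSketch`, `VersionBatchFilesSketch`, and the ordering by `(beginVersion, fileName)` are assumptions for illustration and are not the types used in the restore code.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Hypothetical description of one backup file.
struct BackupFileSketch {
    std::string fileName;
    int64_t beginVersion;

    // Strict weak ordering required by std::set; two files compare equal
    // only if both the version and the name match.
    bool operator<(const BackupFileSketch& rhs) const {
        return std::tie(beginVersion, fileName) < std::tie(rhs.beginVersion, rhs.fileName);
    }
};

struct VersionBatchFilesSketch {
    std::set<BackupFileSketch> logFiles;

    // Returns false if the file was already part of this batch.
    bool addLogFile(const BackupFileSketch& f) { return logFiles.insert(f).second; }
};
```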
Meng Xu 4fbbff8ccd FastRestore:Cannot rely on default enum value
An enum class value does not have a standard default value.
We cannot assume the default value is 0 for a scalar enum type.
2020-01-21 21:43:32 -08:00
Meng Xu 009fcdeb16 FastRestore:Sanity check each restore asset is processed exactly once 2020-01-21 17:17:45 -08:00
Meng Xu 022783b449 Start batches in reverse order for testing and code cleanup 2020-01-21 14:49:40 -08:00
Meng Xu 4ac92d223b Cleanup batch buffer for each restore request 2020-01-21 14:49:36 -08:00
Meng Xu e6fe9f5745 Update fdbserver/RestoreMaster.actor.h
Correct typo in comment.

Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-01-21 10:59:01 -08:00
Meng Xu 441f3e2814 FastRestore:Master buffer data and progress for each batch 2020-01-17 17:01:06 -08:00
Meng Xu bfbf2164c4 FastRestore:Applier buffer data for multiple batches 2020-01-17 17:01:01 -08:00
Meng Xu f436ea806e FastRestore:Resolve review comment
1) Sort logfiles by endVersion

2) Exit program early when restore will not succeed

3) Do not increase nextVersion unnecessarily when
calculating version batches.

4) Change assert condition that ensures progress in
calculating version batches.
2020-01-13 14:08:27 -08:00
Meng Xu dba85d28fc FastRestore:Cosmetic revision 2020-01-08 10:53:53 -08:00
Meng Xu 83a572ae22 FastRestore:buildVersionBatches:remove unused variable 2020-01-07 18:24:23 -08:00
Meng Xu a2b26906e8 FastRestore:Filter out empty files before distributing workload
and clean up unused code
2020-01-07 17:01:53 -08:00
Meng Xu 9df02512ab FastRestore:Apply clang-format 2020-01-07 11:50:32 -08:00
Meng Xu 67e913c3d5 Change LoadingParam struct and endVersion definition
1) Remove endVersion field because it has been included in RestoreAsset;

2) Ensure endVersion in VersionBatch and RestoreAsset is always exclusive;

3) Revise ASSERT in loader and applier in situations when the dummy commit version
is endVersion, to avoid false positive ASSERT failures.
2020-01-07 11:48:03 -08:00
Meng Xu c3f8f3b445 FastRestore:Build VersionBatch less than threshold size 2020-01-07 11:46:56 -08:00
Meng Xu b5d7890ce0 FastRestore:Resolve review comments 2019-12-12 07:45:30 -08:00
Meng Xu 1371db4cdc FastRestore:Self code review and cleanup
1. Review memory use cases and improve:
Ensure state variables are initialized and
change unnecessary state variables to plain variables.

2. Remove debug code that is no longer useful;

3. Mute verbose debug.
2019-12-11 16:37:33 -08:00
Meng Xu 9383c3f0a6 FastRestore:Sampling:Apply clang format 2019-12-03 21:27:06 -08:00
Meng Xu 6b07c271f1 Fix non-deterministic error 2019-12-03 16:55:23 -08:00
Meng Xu 153b713b53 FastRestore:Add sampling on parsed mutations 2019-12-03 12:52:17 -08:00
Meng Xu 7e4c4ea98e FastRestore:Load mutations before assigning ranges to appliers 2019-11-12 17:14:17 -08:00
Meng Xu 27c7ef09a3 FastRestore:Revise code in self review
When we read the txnId from decodeRestoreApplierKey func,
we should convert the integer to little endian.
2019-11-03 20:43:17 -08:00
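The byte-order issue in the commit above can be illustrated with a portable sketch. It assumes that the key stores the integer in big-endian order (a common choice so that keys sort numerically); that assumption, and the helper names `encodeBigEndian32` and `decodeBigEndian32`, are not taken from the decodeRestoreApplierKey source.

```cpp
#include <cassert>
#include <cstdint>

// Writes `v` into out[0..3] with the most significant byte first.
void encodeBigEndian32(uint32_t v, unsigned char* out) {
    out[0] = static_cast<unsigned char>(v >> 24);
    out[1] = static_cast<unsigned char>(v >> 16);
    out[2] = static_cast<unsigned char>(v >> 8);
    out[3] = static_cast<unsigned char>(v);
}

// Reads four big-endian bytes and returns the value as a host integer.
// Using shifts instead of memcpy makes the result independent of whether
// the host is little- or big-endian.
uint32_t decodeBigEndian32(const unsigned char* p) {
    return (static_cast<uint32_t>(p[0]) << 24) | (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}
```

Copying the four bytes straight into an integer on a little-endian host would instead yield the byte-swapped value, which is the kind of error the commit describes.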