Commit Graph

257 Commits

Author SHA1 Message Date
Meng Xu 8569f1a93f FastRestore:Loader:Use shallow copy for single key mutations when sending mutation messages to appliers 2020-04-25 17:30:01 -07:00
Meng Xu 96855d9b47 FastRestore:Loader:Enable sending mutation messages out of order 2020-04-25 17:21:17 -07:00
Meng Xu 70ae22e72f FastRestore:Loader:Deep copy split mutation
Otherwise, the mutation buffer will hold an invalid address for the MutationRef
by the time it is sent to the applier.
2020-04-24 10:31:30 -07:00
Meng Xu f073049865 FastRestore:Revise trace events to be descriptive
Revert changes that send mutations to appliers out of order
2020-04-24 10:31:08 -07:00
Meng Xu 38193a3866 Merge branch 'master' into mengxu/fr-code-improvement-PR 2020-04-22 10:51:33 -07:00
Meng Xu 9b2f6d5c13 FastRestore:Loader:Enable sending message out of order 2020-04-22 08:57:11 -07:00
Meng Xu d85a39df69 FastRestore:Loader:Make requests no longer state variable 2020-04-21 22:21:59 -07:00
Meng Xu d21da5065a FastRestore:Loader:Merge MutationsVec and LogMessageVersionVec into VersionedMutationsVec
Remove the actor that sends one mutation message batch in the previous commit,
because that actor no longer reduces the code complexity.
2020-04-21 22:05:34 -07:00
Jingyu Zhou 6909f0b8fc Remove decodeRangeFileBlock from parallel restore
Reuse the one from fileBackup namespace.
2020-04-21 13:42:24 -07:00
Meng Xu 9ade63a685 FastRestore:Loader:Add sendOneMutationBatchToAppliers actor 2020-04-21 11:36:21 -07:00
Meng Xu 00188c3157 Merge branch 'master' into mengxu/fr-code-improvement-PR 2020-04-19 11:52:42 -07:00
Meng Xu d6021132a8 FastRestore:Merge mutationMap and mutationPartMap for efficiency 2020-04-19 11:52:17 -07:00
Meng Xu 719eda9421 FastRestore:Add an assertion in handleRestoreSysInfoRequest
as suggested in code review.
2020-04-18 22:42:46 -07:00
Meng Xu 10a6461d13 FastRestore:Change __inline__ to inline
__inline__ is compiler specific while inline is the standard keyword
2020-04-17 22:31:44 -07:00
Meng Xu 916d361587 BackupAndParallelRestoreCorrectness:Remove unnecessary checking optional variable 2020-04-17 18:32:14 -07:00
Meng Xu c8d049d0bb FastRestore:Loader:Add counter oldLogMutations 2020-04-17 15:21:59 -07:00
Meng Xu d6c1baa784 FastRestore:Filter out log mutations whose version is smaller than range mutation version 2020-04-15 19:45:03 -07:00
Meng Xu acdf6816b8 FastRestore:Loader:Fix:Last version in each RA is forgotten 2020-04-13 17:28:21 -07:00
Meng Xu 061bcd2fb4 FastRestore:Replace typeString with safe getTypeString func
Also fix compilation error in previous commit
2020-04-13 15:15:54 -07:00
Meng Xu 56cdd18224 FastRestore:Loader:Allow sending not all mutations at a version to appliers 2020-04-13 13:59:21 -07:00
Meng Xu 68adbef5bf FastRestore:Loader:Send multiple versions of mutations in a message to appliers 2020-04-12 11:18:41 -07:00
Meng Xu dbc9c23193 FastRestore:Loader:Send mutations at different versions in the same message to appliers
This increases the bandwidth sent from loaders to appliers.
2020-04-12 10:46:58 -07:00
Meng Xu 2325ab209f FastRestore:Applier:Avoid extra copy in getAndComputeStagingKeys 2020-04-08 12:22:08 -07:00
Meng Xu da7249ed1c FastRestore:Minor revision based on review comments 2020-04-02 11:15:22 -07:00
Meng Xu 6bce67ca75 FastRestore:Apply clang-format 2020-04-01 21:27:54 -07:00
Meng Xu 33c4be9c42 Improve debug message for debug mutations 2020-03-31 16:00:51 -07:00
Meng Xu e286f316b9 Increase generated key length for splitMutation unit test 2020-03-31 14:27:07 -07:00
Meng Xu b7da76223c Fix a tricky splitMutation bug
splitMutation result may include the end key of a clearrange mutation
2020-03-31 13:38:58 -07:00
Meng Xu ccbbdc4ba4 Unit test:Verify splitMutation by comparing with intersectingRanges result 2020-03-31 12:13:02 -07:00
Meng Xu 8a30526336 FastRestore:Remove commented assertion 2020-03-28 13:11:32 -07:00
Meng Xu 404a3e2619 FastRestore:Loader:Remove sanity check for the order of sending log and range mutations 2020-03-27 23:36:13 -07:00
Meng Xu 97f8e46388 Sanity check subversion for log mutations 2020-03-27 13:07:08 -07:00
Alex Miller 40d10aa990 Fix debugMutation uses that were concurrently added in new backup code 2020-03-27 04:01:18 -07:00
Jingyu Zhou 99f4ef6e0c Fix restore loader to handle mutation sub number
For old backup format, give them a sub sequence number starting from 0 for each
commit version.
2020-03-26 13:04:00 -07:00
Jingyu Zhou 4bdb32be14 Batch sending all mutations of a version from RestoreLoader
This optimization is to reduce the number of messages sent from loader to
applier, which was unintentionally done when introducing sub sequence numbers
for mutations.
2020-03-20 20:15:09 -07:00
Jingyu Zhou 799f0b4b0e Small code refactor 2020-03-20 20:15:09 -07:00
Jingyu Zhou e40f937d3a Fix missing mutations in splitMutation
When a range mutation is larger than the last split point, this mutation can
become missing in the RestoreLoader, which is fixed in this commit.
2020-03-20 20:15:09 -07:00
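The failure mode described above can be illustrated with a minimal sketch of splitting a range mutation at applier boundaries. This is not the project's splitMutation; `KeyRange`, `splitRange`, and the string keys are hypothetical names chosen for illustration. The point is that the tail of the range, beyond the last split point, must always be emitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a clear-range mutation over [begin, end).
struct KeyRange {
    std::string begin;
    std::string end; // exclusive
};

// Split `range` at every split point that falls strictly inside it.
// `splitPoints` must be sorted ascending; each point is assumed to start
// the key range owned by another applier.
std::vector<KeyRange> splitRange(const KeyRange& range,
                                 const std::vector<std::string>& splitPoints) {
    assert(range.begin < range.end);
    std::vector<KeyRange> pieces;
    std::string cursor = range.begin;
    for (const std::string& point : splitPoints) {
        if (point <= cursor) continue;  // at or before the part not yet covered
        if (point >= range.end) break;  // remaining points lie outside the range
        pieces.push_back(KeyRange{cursor, point});
        cursor = point;
    }
    // Always emit the tail, including the part beyond the last split point.
    pieces.push_back(KeyRange{cursor, range.end});
    return pieces;
}
```

Because every piece is half-open, the end key of the original range never appears as the start of a piece, which is the property the later splitMutation fixes in this log are concerned with.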
Jingyu Zhou fe51ba3d16 Give maximum subsequence number for snapshot mutations
This is needed so that mutations in partitioned logs are applied first and
snapshot mutations are applied later for the same commit version.
2020-03-20 20:15:09 -07:00
Jingyu Zhou 2eac17b553 StagingKey can add out-of-order mutations
For partitioned logs, mutations of the same version may be sent to applier
out-of-order. If one loader advances to the next version, an applier may
receive later version mutations for different loaders. So, dropping of early
mutations is wrong.
2020-03-20 20:13:38 -07:00
Jingyu Zhou ab0b59b0c3 Add subsequence number to restore loader & applier
The subsequence number is needed so that mutations of the same commit version
number, but from different partitioned logs can be correctly reassembled in
order.

For old backup files, the sub number is always 0. For partitioned mutation
logs, the actual sub number is used. For range files, the sub number is always
0.
2020-03-20 20:13:38 -07:00
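A minimal sketch of how a (version, subsequence) pair can restore apply order, assuming mutations arrive from several partitioned logs in arbitrary order. `VersionedMutation` and `sortForApply` are hypothetical names, not types from the restore code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record: a mutation tagged with its commit version and the
// subsequence number that orders it within that version.
struct VersionedMutation {
    int64_t version;
    int32_t sub;
    std::string payload;
};

// Order mutations by commit version first, then by subsequence number.
// stable_sort keeps the arrival order of entries with an identical pair.
void sortForApply(std::vector<VersionedMutation>& mutations) {
    std::stable_sort(mutations.begin(), mutations.end(),
                     [](const VersionedMutation& a, const VersionedMutation& b) {
                         return std::tie(a.version, a.sub) < std::tie(b.version, b.sub);
                     });
}
```

With this ordering, two mutations at the same commit version that came from different logs are applied in subsequence order regardless of which one reached the applier first.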
Jingyu Zhou 6b9b93314e Check block padding is \0xff for new mutation logs 2020-03-20 20:13:38 -07:00
Jingyu Zhou 35aafefb89 Consolidate StringRefReader classes
Fix a compiler error of unused variable too.
2020-03-20 20:13:38 -07:00
Jingyu Zhou 88ad28e576 Integrate parallel restore with partitioned logs
In parallel restore, use new getPartitionedRestoreSet() to get a set containing
partitioned mutation logs. The loader uses a new parser to extract mutations
from partitioned logs.

TODO: fix unable to restore errors.
2020-03-20 20:13:38 -07:00
Jingyu Zhou e15015ee6c Add mutation log version names
I.e., BACKUP_AGENT_MLOG_VERSION for 2001 and PARTITIONED_MLOG_VERSION for 4110.
2020-03-20 20:13:38 -07:00
Meng Xu d3071409c5 FastRestore:Add comment for integrating with new backup format 2020-03-20 20:13:38 -07:00
Meng Xu 2520e8d44c FastRestore:Use more concise code as suggested in review 2020-03-01 22:32:36 -08:00
Meng Xu 62b9043ff6 FastRestore:DB can be destroyed before master unlock it in simulation
Because restore roles run as workload in simulation,
they do not know when DB is destroyed by the backup and restore test workload.
So if DB is destroyed earlier than restore master unlocks DB, which is rare,
restore master should abort the unlocking DB step.
2020-02-28 14:25:58 -08:00
Meng Xu fbb6e8f39d FastRestore:Create low memory situation in simulation on purpose 2020-02-26 14:54:38 -08:00
Meng Xu 06495b90ae FastRestore:Loader:Use isSchedulable to guard OOM
And trigger delayed actors that are blocked on memory to recheck memory.
2020-02-26 14:35:05 -08:00
Meng Xu ca726fc68e FastRestore:Introduce OOM protection
An actor is schedulable to run if the current worker has enough resources, i.e.,
the worker's memory usage is below the threshold;
Exception: If the actor is working on the current version batch, we have to schedule
the actor to run to avoid dead-lock.
Future: When we release the actors that are blocked by memory usage, we should release them
in increasing order of their version batch.
2020-02-26 14:09:18 -08:00
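The scheduling rule in this commit message can be sketched as a single predicate. This is an assumption-laden illustration, not the real isSchedulable: `WorkerState`, its fields, and `canRunNow` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of a restore worker's state.
struct WorkerState {
    int64_t memoryUsedBytes;
    int64_t memoryThresholdBytes;
    int currentBatchIndex; // version batch the worker is currently finishing
};

// An actor may run if memory usage is below the threshold, or if it belongs
// to the current version batch. The second case is never blocked, because
// blocking it could stop the batch whose completion frees memory (deadlock).
bool canRunNow(const WorkerState& worker, int actorBatchIndex) {
    assert(worker.memoryThresholdBytes > 0);
    if (actorBatchIndex == worker.currentBatchIndex) return true;
    return worker.memoryUsedBytes < worker.memoryThresholdBytes;
}
```

Releasing blocked actors in increasing batch order, as the message proposes for the future, would sit in the caller and is not shown here.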
Meng Xu 505997ba0a FastRestore:Switch to new sendBatchRequests that tracks performance and straggler 2020-02-21 15:45:32 -08:00
Meng Xu 03f699f2f9 Merge branch 'master' into mengxu/fast-restore-applier-multi-applying-PR 2020-02-19 15:22:33 -08:00
Meng Xu 94d799552e FastRestore:Apply clang-format against master 2020-02-18 16:41:59 -08:00
Meng Xu 132f5aa9ba FastRestore:Improve trace name and cosmetic change 2020-02-18 16:41:19 -08:00
Meng Xu b5e60585aa FastRestore:Applier:Fix precompute mutation result 2020-02-13 12:57:47 -08:00
Meng Xu cda8fc189e FastRestore:AtomicOp:Intro weighted size for atomicOp
atomicOp has an amplified performance overhead to the cluster,
for example, an ADD operation can be small, but SS has to load
the value to do the operation and the value can be large.
2020-02-11 12:48:05 -08:00
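One way to read "weighted size" is a multiplier applied to the byte size of atomic operations when accounting for load. The sketch below assumes exactly that; `MutationKind`, `weightedSize`, and the `atomicOpWeight` parameter are hypothetical, and the actual weighting used by the restore code may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mutation categories for size accounting.
enum class MutationKind { Set, Clear, AtomicOp };

// Return the size used for load accounting. Atomic operations are scaled up
// because the storage server must read the existing value before applying them.
int64_t weightedSize(MutationKind kind, int64_t rawBytes, int64_t atomicOpWeight) {
    assert(rawBytes >= 0 && atomicOpWeight >= 1);
    return kind == MutationKind::AtomicOp ? rawBytes * atomicOpWeight : rawBytes;
}
```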
Meng Xu e76b6d824a FastRestore:Assign priority to actors to prioritize vb work
When we pipeline multiple version batches, we should prevent a later
version batch from blocking the earlier version batch by consuming
CPU resources.

To achieve the above, we should assign higher priority to actors
in later phases in a version batch.

Because restore master will not invoke an actor at a later phase unless
the actors at the earlier phases have finished, this priority assignment
will not cause deadlock.
2020-02-10 20:29:23 -08:00
Meng Xu dbce1e9974 FastRestore:Applier:Add metrics counter and proc counter 2020-02-10 16:38:26 -08:00
Meng Xu 1fc793d6a7 FastRestore:Loader:Add metrics counter 2020-02-09 22:06:14 -08:00
Meng Xu 72110de7e2 FastRestore:Add trace for quick perf. measurement 2020-02-06 19:48:26 -08:00
Meng Xu cab9d51e06 Merge branch 'master' into mengxu/fast-restore-pipeline-PR 2020-01-27 18:16:26 -08:00
Meng Xu 141609e80a FastRestore:Improve code style and fix typos 2020-01-27 18:13:14 -08:00
Meng Xu cfdcddd90e FastRestore:Loader:Pipeline sendMutationsToApplier actors 2020-01-23 20:22:05 -08:00
Meng Xu e011f39829 FastRestore:Add sanity check and trace events 2020-01-23 16:03:41 -08:00
Meng Xu 009fcdeb16 FastRestore:Sanity check each restore asset is processed exactly once 2020-01-21 17:17:45 -08:00
Meng Xu 022783b449 Start batches in reverse order for testings and code cleanup 2020-01-21 14:49:40 -08:00
Meng Xu 4ac92d223b Cleanup batch buffer for each restore request 2020-01-21 14:49:36 -08:00
Meng Xu d69bd2f661 FastRestore:Loader buffer data for multiple batches 2020-01-17 17:01:06 -08:00
Meng Xu bfbf2164c4 FastRestore:Applier buffer data for multiple batches 2020-01-17 17:01:01 -08:00
Meng Xu f436ea806e FastRestore:Resolve review comment
1) Sort logfiles by endVersion

2) Exit program early when restore will not succeed

3) Do not increase nextVersion unnecessarily when
calculating version batches.

4) Change assert condition that ensures progress in
calculating version batches.
2020-01-13 14:08:27 -08:00
Meng Xu c29e380076 FastRestore:Remove prevVersion from LoadingParam 2020-01-07 14:59:17 -08:00
Meng Xu 9df02512ab FastRestore:Apply clang-format 2020-01-07 11:50:32 -08:00
Meng Xu 67e913c3d5 Change LoadingParam struct and endVersion definition
1) Remove endVersion field because it has been included in RestoreAsset;

2) Ensure endVersion in VersionBatch and RestoreAsset is always exclusive;

3) Revise ASSERT in loader and applier in situations when the dummy commit version
is endVersion, to avoid false positive ASSERT failure.
2020-01-07 11:48:03 -08:00
Meng Xu c3f8f3b445 FastRestore:Build VersionBatch less than threshold size 2020-01-07 11:46:56 -08:00
Meng Xu c10035ba54 FastRestore:Use isInVersionRange based on code review 2019-12-23 15:01:27 -08:00
Meng Xu 8d6f511816 FastRestore:Resolve review comment
Filter out range mutations that do not overlap with the restore range.
Small changes on format.
2019-12-22 20:09:10 -08:00
Meng Xu 61b29de3ce FastRestore:Self code review
Clean up commented code;
Add sanity check.
2019-12-20 22:24:34 -08:00
Meng Xu ddcf3fdd80 FastRestore:Apply clang format 2019-12-20 22:00:36 -08:00
Meng Xu 2cd1f0780a FastRestore:Split asset to subasset for async parsing files 2019-12-20 21:44:40 -08:00
Meng Xu e98b2a0d1c FastRestore:Introduce RestoreAsset 2019-12-20 18:00:10 -08:00
Meng Xu ffc8f76710 FastRestore:Rename StringRefReaderMX to BackupStringRefReader 2019-12-19 11:49:37 -08:00
Meng Xu b5d7890ce0 FastRestore:Resolve review comments 2019-12-12 07:45:30 -08:00
Meng Xu 9670d64fbd FastRestore:Remove commented code 2019-12-11 16:48:40 -08:00
Meng Xu 1371db4cdc FastRestore:Self code review and cleanup
1. Review memory use cases and improve:
Ensure state variable is initialized and
change unnecessary state variable to variable.

2. Remove debug code that is no longer useful;

3. Mute verbose debug.
2019-12-11 16:37:33 -08:00
Meng Xu 9a6dabe47e Merge branch 'mengxu/fastrestore-code-cleanup-PR' into mengxu/fast-restore-fix-valgrind-PR 2019-12-10 20:05:35 -08:00
Meng Xu feb2a8c70c FastRestore Change RestoreSendMutationVectorVersionedRequest name
Change RestoreSendMutationVectorVersionedRequest to
RestoreSendVersionedMutationsRequest for better naming
2019-12-10 17:23:40 -08:00
Meng Xu 20a19978f9 FastRestore:LoadingParam cleanup 2019-12-10 17:20:44 -08:00
Meng Xu e8dfc1c187 Replace pop_front(size) with new empty standalone obj 2019-12-06 23:16:49 -08:00
Meng Xu 4a66366a05 Use MutationsVec instead of VectorRef 2019-12-06 22:00:40 -08:00
Meng Xu 39a4f2372f Change FASTRESTORE_SAMPLING_PERCENT to 0 to 100 2019-12-04 21:26:27 -08:00
Meng Xu c6b36dbffb FastRestore:Sampling:Resolve review comments 2019-12-04 17:35:11 -08:00
Meng Xu dd91d26dfa FastRestore:Sampling:Add FASTRESTORE_SAMPLING_RATE knob 2019-12-04 11:46:29 -08:00
Meng Xu 2b987d1945 FastRestore:typedef Standalone<VectorRef<MutationRef>> MutationsVec 2019-12-04 11:39:55 -08:00
Meng Xu 9383c3f0a6 FastRestore:Sampling:Apply clang format 2019-12-03 21:27:06 -08:00
Meng Xu 3310f67e9e Merge branch 'mengxu/fast-restore-fix-valgrind-PR' into mengxu/fast-restore-sampling-PR 2019-12-03 16:24:40 -08:00
Meng Xu 153b713b53 FastRestore:Add sampling on parsed mutations 2019-12-03 12:52:17 -08:00
Meng Xu 474f0067c4 Remove unneeded state 2019-11-25 23:10:14 -08:00
Meng Xu a04f314b1b
Merge pull request #2383 from jzhou77/restore
Use sizeof() to replace constant numbers
2019-11-22 16:14:44 -08:00
Jingyu Zhou 037e808253 Address review comments by changing variable names 2019-11-22 13:12:04 -08:00
Jingyu Zhou 9927a9013f Use sizeof() to replace constant numbers 2019-11-22 11:47:25 -08:00
Meng Xu 78f10f15b3 FastRestore:replace insert with emplace for map and vector
This resolves the review suggestions.
2019-11-21 22:47:04 -08:00
Meng Xu 343bcd104a FastRestore:Apply Clang format 2019-11-20 21:04:18 -08:00
Meng Xu 3f5491318d FastRestore:Fix bug that causes nondeterminism
1) Use map iterator instead of pointer to maintain stability when map is inserted or deleted
2) dummySampleWorkload: clear rangeToApplier data in each sampling phase. otherwise, we can
have an increasing number of keys assigned to the applier.
2019-11-15 11:30:09 -08:00
Meng Xu 9e36b897e6 FastRestore:Loaders must send to appliers log files data before range files 2019-11-12 21:43:12 -08:00
Meng Xu 592f4c0fc4 FastRestore:Remove RestoreSetApplierKeyRangeVectorRequest 2019-11-12 17:59:11 -08:00
Meng Xu 7e4c4ea98e FastRestore:Load mutations before assign ranges to appliers 2019-11-12 17:14:17 -08:00
Jingyu Zhou ae7e42face
Merge pull request #2313 from xumengpanda/mengxu/fastrestore-applyToDB-bugfix-PR
Performant restore [8/XX]: Fix bugs in applyToDB logic and add more tests
2019-11-12 08:50:23 -08:00
Meng Xu 630c29d160 FastRestore:resolve review comments
1) wait on whenAtLeast;
2) Put BigEndian64 into the function call and the decoder to prevent
future people from making the same mistake.
2019-11-11 17:00:16 -08:00
A.J. Beamon cf2ec3418c
Merge pull request #2317 from xumengpanda/mengxu/fastrestore-extend-atomicOpTest-PR
AtomicOps Test: Add more detailed debug information when test fails with opType = AddValue
2019-11-11 15:03:10 -08:00
Meng Xu 0ccded1929 AtomicOps:Resolve review comments 2019-11-05 19:27:49 -08:00
Meng Xu c4d1e6e1a9 Trace:Severity:Include SevNoInfo to mute trace
Define SevFRMutationInfo to trace mutations in restore.
2019-11-04 16:18:40 -08:00
Meng Xu e345c9061f FastRestore:Refine debug messages 2019-11-04 11:47:38 -08:00
Meng Xu 7903b47b82 FastRestore:Remove unnecessary return 2019-10-24 13:09:24 -07:00
Meng Xu c53f817c5e FastRestore:Convert handleInitVersionBatchRequest to plain func 2019-10-24 13:06:50 -07:00
Meng Xu 60d26ff5d7 FastRestore:Resolve review comments 2019-10-24 12:52:12 -07:00
Meng Xu bae0c907a6 FastRestore:Convert unnecessary actor function to plain function 2019-10-23 15:10:34 -07:00
Meng Xu ab4a375b95 FastRestore:RestoreLoader:Define SerializedMutationPartMap type 2019-10-17 10:12:38 -07:00
Meng Xu 78b1ebc7c2 FastRestore:Loader:Handle multiple mutations at same version in multiple files 2019-10-16 20:57:16 -07:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu 9cc832cfd6 FastRestore:Fix Mac and Windows compilation error 2019-08-02 14:33:08 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu 45083edf74 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
Meng Xu f1741aa90d FastRestore: Resolve review comments
1) Do not keep restore role data (e.g., masterData) in restore worker;
2) Change function parameter list by only passing in the needed variables in role data;
3) Remove unnecessary files vector from masterData;
4) Fix typos in comments and some function names.
2019-07-24 17:51:53 -07:00
Meng Xu 701676dbd2 FastRestore:Refactor code and add missing files
Add RestoreWorker.actor.cpp and RestoreWorkerInterface.actor.h back.
2019-06-18 09:54:27 -07:00
Meng Xu 022b555b69 FastRestore:Fix bug in finish restore
RestoreMaster may not receive all acks for the last command, i.e., finishRestore,
because RestoreLoaders and RestoreAppliers exit immediately after sending the ack.
If the ack is lost, it will not be resent.

This commit also removes some unneeded code.
This commit passes 50k random tests without errors.
2019-06-05 20:07:18 -07:00
Meng Xu 3fcb6ec0a1 FastRestore:Refactor RestoreLoader and fix bugs
Refactor RestoreLoader code and
Fix a bug in notifying restore finish.
2019-06-04 21:53:31 -07:00
Meng Xu 477fd152c0 FastRestore:Refactor code
1) Use the runRYWTransaction for simple DB access
2) Replace some printf with TraceEvent
3) Remove printf not used in debugging
4) Avoid wait inside the condition in loop-choose-when for
   the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
   the file only has functionalities related to restore worker.

Passed correctness test
2019-06-04 11:22:47 -07:00
Meng Xu a372c82db2 FastRestore:BugFix:Loader must distinguish range and log mutations sent to appliers 2019-05-30 21:22:33 -07:00
Meng Xu 450bda9a01 FastRestore:Refactor parsing backup file code
Refactor _parseRangeFileToMutationsOnLoader and
_parseLogFileToMutationsOnLoader functions and their callees
2019-05-30 14:01:48 -07:00
Meng Xu a3f61e6df7 FastRestore:Refactor:Reduce code size
1) Use runRYWTransaction to replace the loop-try style;
2) Remove unnecessary printf
3) Do not mistakenly send reply twice.
2019-05-29 17:03:50 -07:00
Meng Xu 9e1216af1c FastRestore:Remove CMDUID 2019-05-29 13:48:04 -07:00
Meng Xu 4f484a2a5d FastRestore:Refactor out the use of cmdID and other non-must functions 2019-05-29 13:26:17 -07:00
Meng Xu d56837ba16 FastRestore:Refactor LoadFileRequest
1) Remove global map to buffer the parsed mutations on loader.
   Use local map instead to increase parallelism.
2) Use std::map<LoadingParam, Future<Void>> to hold the actor
that parse a backup file and to de-duplicate requests.
3) Remove unused code.
2019-05-28 18:39:00 -07:00
Meng Xu fe2624fc22 FastRestore:Remove sampling phase
Remove the sampling phase to make the PR easier to review.
The sampling design and implementation may be changed and added in
next PR.
2019-05-26 21:34:58 -07:00
Meng Xu 3eadb31798 FastRestore:Resolve two major review comments
1) Add sendBatchRequests and getBatchReplies

sendBatchRequests is a generic actor to send requests without
processing replies.
getBatchReplies is similar to sendBatchRequests except that
it returns the replies to the caller.

2) Share applier interface to loaders by using RequestStream,
instead of using DB.
   Create RestoreSysInfo struct, with a purpose similar to DBInfo, for
   the restore system information that is shared among restore workers.
2019-05-24 21:53:21 -07:00
Meng Xu fac63a83c4 FastRestore:Use NotifiedVersion to deduplicate requests
Add a NotifiedVersion into an applier data which represents
the smallest version the applier is at.

When a loader sends mutation vector to appliers, it sends
the request that contains prevVersion and commitVersion.

This commit also puts actors into an actorCollector for
loop-choose-when situations.
2019-05-22 22:09:54 -07:00
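The deduplication idea can be sketched as a version chain: each request names the version it follows (prevVersion) and the version it produces (commitVersion), and the applier only accepts a request that extends its current version. `ApplierSketch` and `handle` are hypothetical. This synchronous sketch rejects a request that is not next in line, whereas the real code presumably waits on the notified version instead.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical applier state: the version reached so far and the commit
// versions that have been applied, in order.
struct ApplierSketch {
    int64_t version = 0;
    std::vector<int64_t> applied;

    // Apply the request only if it directly follows the current version.
    // A resent request carries a prevVersion the applier has already moved
    // past, so it is ignored instead of being applied twice.
    bool handle(int64_t prevVersion, int64_t commitVersion) {
        assert(prevVersion < commitVersion);
        if (version != prevVersion) return false;
        applied.push_back(commitVersion);
        version = commitVersion;
        return true;
    }
};
```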
Meng Xu 12817af03f FastRestore:Fix CMake compiling errors 2019-05-16 20:01:43 -07:00
Meng Xu 35b169fd2d FastRestore:Fix bug in registerMutationsToApplier
We forgot to update the applierInterface reference to the iterated
applyID
2019-05-14 22:10:09 -07:00
Meng Xu d9c97b5e5f FastRestore:Fix bug in sending a vector of mutations
When mutationVectorThreshold is not 1, a loader sends a vector of
mutations to an applier.

We should never mix mutations at different versions into the same vector.

The code in the previous commit may mix mutations at different versions.
This commit resolves the bug.
2019-05-14 21:04:36 -07:00
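The rule stated in this commit (a vector never mixes versions) can be sketched as a batching loop that starts a new batch whenever the version changes or the size threshold is reached. `Batch`, `buildBatches`, and `threshold` are hypothetical names; the input is assumed to be sorted by version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message payload: mutations that all share one version.
struct Batch {
    int64_t version;
    std::vector<std::string> mutations;
};

// Group (version, mutation) pairs into batches of at most `threshold`
// mutations each, never placing two different versions in the same batch.
std::vector<Batch> buildBatches(const std::vector<std::pair<int64_t, std::string>>& input,
                                std::size_t threshold) {
    assert(threshold >= 1);
    std::vector<Batch> batches;
    for (const auto& item : input) {
        const bool startNew = batches.empty() ||
                              batches.back().version != item.first ||
                              batches.back().mutations.size() >= threshold;
        if (startNew) batches.push_back(Batch{item.first, {}});
        batches.back().mutations.push_back(item.second);
    }
    return batches;
}
```

Later commits in this log relax the rule and allow several versions per message, so this sketch reflects the behavior at this point in the history only.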
Meng Xu f33e3bf8bc FastRestore:bugFix:loader must clear kvOps after using it
In the sampling phase, a loader will cache the mutations into kvOps map;
In the loading log file phase, the loader will do the same thing.
The loader must clear the kvOps map once the loader has used it; otherwise,
it will cache the sampled mutations twice, which leads to an
inconsistent restored DB.
2019-05-14 20:28:32 -07:00
Meng Xu f54a1e1463 FastRestore:Fix bug in deciding applierID in splitMutation 2019-05-14 17:39:44 -07:00
Meng Xu f8c654cd86 FastRestore:Fix splitMutation bug
The split range mutation had a wrong param1 for the first mutation produced
2019-05-14 17:05:50 -07:00
Meng Xu 1f159113e6 FastRestore:Test multiple appliers
Loaders will split a range mutation for multiple appliers when needed.
2019-05-14 16:41:04 -07:00
Meng Xu 6c4c807801 FastRestore:fix bug due to non-unique cmdid
This commit identifies the bug that explains
why DB may be restored to an inconsistent state.

The cmdid is used to achieve exactly-once delivery even when
the network can deliver a request twice.
This is under the assumption that cmdid is unique for each request!

However, this assumption may not hold for
the phase Loader_Send_Mutations_To_Applier, when loaders send parsed
mutations to appliers:
1) When the same loader loads multiple files, we reset the cmdid
for the phase;
2) When different loaders load files, each loader's cmdid starts from
0 for the phase.
Both situations can break the assumption, which causes appliers to
miss some mutations to apply. This breaks the cycle test.
2019-05-14 01:49:49 -07:00
Meng Xu c115e3ceb1 FastRestore: Remove handleSampleLogFileRequest
handleSampleLogFileRequest is replaced by handleLoadLogFileRequest
2019-05-13 18:49:13 -07:00
Meng Xu 730142d532 FastRestore: Mark sampled file as processed files
This commit should pass correctness test, but
it does not mean the fast restore logic is correct.

We should NOT mark sampled file as processed files.
2019-05-13 17:53:11 -07:00
Meng Xu 76dd8dc8a8 FastRestore: Fix splitMutation bug 2019-05-13 17:24:57 -07:00
Meng Xu c7cd758e01 FastRestore:Do not mark log file as processed in sampling
This commit will expose a potential bug in fast restore.
We may need to parse range file before log file.
2019-05-13 11:37:20 -07:00
Meng Xu 26b224cddc FastRestore:RestoreLoader: Unify parsing log file
Use a generic actor to parse log files for sampling phase and
load phase.
2019-05-13 11:36:29 -07:00
Meng Xu a2fef23678 FastRestore: Remove handleSampleRangeFile actor 2019-05-13 10:36:44 -07:00