Commit Graph

43 Commits

Author SHA1 Message Date
Meng Xu b506ff3af9 Fix merge conflict on missing struct VersionedMutation 2020-04-29 22:35:54 -07:00
Meng Xu a0d67cac16 Merge branch 'master' into mengxu/fr-code-improvement-PR 2020-04-29 21:07:33 -07:00
Meng Xu f073049865 FastRestore:Revise trace events to be descriptive
Revert changes that send mutations to appliers out of order
2020-04-24 10:31:08 -07:00
Meng Xu 46ec766cab FastRestore:Disable debug trace 2020-04-22 16:11:46 -07:00
Meng Xu 38193a3866 Merge branch 'master' into mengxu/fr-code-improvement-PR 2020-04-22 10:51:33 -07:00
Meng Xu d21da5065a FastRestore:Loader:Merge MutationsVec and LogMessageVersionVec into VersionedMutationsVec
Remove the actor that sends one mutation message batch in the previous commit,
because that actor no longer reduces the code complexity.
2020-04-21 22:05:34 -07:00
Meng Xu dbc9c23193 FastRestore:Loader:Send mutations at different versions in the same message to appliers
This increases the bandwidth sent from loaders to appliers.
2020-04-12 10:46:58 -07:00
Meng Xu 5ebafdb94c FastRestore:Apply clang-format to changes 2020-04-07 15:57:03 -07:00
Meng Xu 536e65cd76 FastRestore:Introduce debugFRMutation for debug keys 2020-04-05 15:00:36 -07:00
Meng Xu a81ec332a9 FastRestore:Fix:Master cannot throttle on in-progress version batches when it releases batches out of order in simulation 2020-04-04 17:34:26 -07:00
Jingyu Zhou ab0b59b0c3 Add subsequence number to restore loader & applier
The subsequence number is needed so that mutations of the same commit version
number, but from different partitioned logs can be correctly reassembled in
order.

For old backup files, the sub number is always 0. For partitioned mutation
logs, the actual sub number is used. For range files, the sub number is always
0.
2020-03-20 20:13:38 -07:00
Meng Xu dc848f4297 FastRestore:Disable verbose trace for perf. measurement 2020-02-06 20:50:23 -08:00
Meng Xu b04e98771e FastRestore:Replace FastRestoreOpConfig with Knobs
And randomize value for the rest of knobs
2020-01-24 14:24:34 -08:00
Meng Xu 009fcdeb16 FastRestore:Sanity check each restore asset is processed exactly once 2020-01-21 17:17:45 -08:00
Meng Xu f436ea806e FastRestore:Resolve review comment
1) Sort logfiles by endVersion

2) Exit program early when restore will not succeed

3) Do not increase nextVersion unnecessarily when
calculating version batches.

4) Change assert condition that ensures progress in
calculating version batches.
2020-01-13 14:08:27 -08:00
Meng Xu dba85d28fc FastRestore:Cosmetic revision 2020-01-08 10:53:53 -08:00
Meng Xu c3f8f3b445 FastRestore:Build VersionBatch less than threshold size 2020-01-07 11:46:56 -08:00
Meng Xu e98b2a0d1c FastRestore:Introduce RestoreAsset 2019-12-20 18:00:10 -08:00
Meng Xu 39a4f2372f Change FASTRESTORE_SAMPLING_PERCENT to 0 to 100 2019-12-04 21:26:27 -08:00
Meng Xu c6b36dbffb FastRestore:Sampling:Resolve review comments 2019-12-04 17:35:11 -08:00
Meng Xu 2b987d1945 FastRestore:typedef Standalone<VectorRef<MutationRef>> MutationsVec 2019-12-04 11:39:55 -08:00
Meng Xu 592f4c0fc4 FastRestore:Remove RestoreSetApplierKeyRangeVectorRequest 2019-11-12 17:59:11 -08:00
Meng Xu e7210fe842 Trace:Resolve review comments and add SevVerbose level 2019-11-05 09:42:29 -08:00
Meng Xu c4d1e6e1a9 Trace:Severity:Include SevNoInfo to mute trace
Define SevFRMutationInfo to trace mutations in restore.
2019-11-04 16:18:40 -08:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu c2c2536de2 FastRestore:Resolve compile errors due to conflict with master 2019-08-01 13:32:24 -07:00
Meng Xu 45083edf74 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
Meng Xu 022b555b69 FastRestore:Fix bug in finish restore
RestoreMaster may not receive all acks for the last command, i.e., finishRestore,
because RestoreLoaders and RestoreAppliers exit immediately after sending the ack.
If the ack is lost, it will not be resent.

This commit also removes some unneeded code.
This commit passes 50k random tests without errors.
2019-06-05 20:07:18 -07:00
Meng Xu 3fcb6ec0a1 FastRestore:Refactor RestoreLoader and fix bugs
Refactor RestoreLoader code and
Fix a bug in notifying restore finish.
2019-06-04 21:53:31 -07:00
Meng Xu 477fd152c0 FastRestore:Refactor code
1) Use the runRYWTransaction for simple DB access
2) Replace some printf with TraceEvent
3) Remove printf not used in debugging
4) Avoid wait inside the condition in loop-choose-when for
   the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
   the file only has functionalities related to restore worker.

Passed correctness test
2019-06-04 11:22:47 -07:00
Meng Xu 67f5c8b493 FastRestore:Remove performance status
Remove the non-functional code to reduce the code review size.
2019-05-30 20:24:40 -07:00
Meng Xu 9e1216af1c FastRestore:Remove CMDUID 2019-05-29 13:48:04 -07:00
Meng Xu 3eadb31798 FastRestore:Resolve two major review comments
1) Add sendBatchRequests and getBatchReplies

sendBatchRequests is a generic actor to send requests without
processing replies.
getBatchReplies is similar to sendBatchRequests except that
it returns the replies to the caller.

2) Share applier interface to loaders by using RequestStream,
instead of using DB.
   Create RestoreSysInfo struct, with a purpose similar to DBInfo, for
   the restore system information that is shared among restore workers.
2019-05-24 21:53:21 -07:00
Meng Xu 9ea83e0f3c FastRestore:Remove dbprintf 2019-05-17 17:34:42 -07:00
Meng Xu a7f1b69804 FastRestore:Add dbprintf 2019-05-15 21:15:15 -07:00
Meng Xu d8658a581f FastRestore:Change parameter for performance test 2019-05-15 19:53:14 -07:00
Meng Xu 86c936522d FastRestore:CMDUID should serialize nodeIndex 2019-05-14 16:03:32 -07:00
Meng Xu 5344e3faf7 FastRestore:Add nodeIndex to CMDUID
This avoids the duplicate cmdIDs from different loaders.
2019-05-14 15:04:09 -07:00
Meng Xu 3fcdc39b93 FastRestore:Recruit exact number of restore workers
We can configure 1 loader and 1 applier to simplify
the debug process.
2019-05-13 23:04:47 -07:00
Meng Xu ef9dcd545c FastRestore: Resolve review comments
1) Add type for RestoreCommandEnum
2) Make RestoreRoleStr const
2019-05-12 00:01:24 -07:00
Meng Xu 879bf8dc7b FastRestore: Bug fix for refactored code 2019-05-10 16:48:01 -07:00
Meng Xu a08a6776f5 FastRestore: Refactor to smaller components
The current code uses one restore interface to handle the work
for all restore roles, i.e., master, loader and applier.
This makes it harder to review, maintain, or scale.

This commit splits the restore into multiple roles by mimicking FDB
transaction system:
1) It uses a RestoreWorker as the process to host restore roles;
   This commit assumes one restore role per RestoreWorker; but
   it should be easy to extend to support multiple roles per RestoreWorker;
2) It creates 3 restore roles:
   RestoreMaster: Coordinate the restore process and send commands to the other two roles;
   RestoreLoader: Parse backup files to mutations and send mutations to appliers;
   RestoreApplier: Sort received mutations and apply them to DB in order.

Compilable version. To be tested in correctness.
2019-05-10 14:20:06 -07:00