Commit Graph

39 Commits

Author SHA1 Message Date
Meng Xu e345c9061f FastRestore:Refine debug messages 2019-11-04 11:47:38 -08:00
Andrew Noyes d4de608bb6 Fix OPEN_FOR_IDE build 2019-10-25 10:42:22 -07:00
Meng Xu 6f1ecd1b11 FastRestore:handleSendMutationVectorRequest:Receive mutations in order of versions 2019-10-21 14:31:21 -07:00
Meng Xu 17821a1424 FastRestore:unlockDatabase should always succeed 2019-10-17 00:50:13 -07:00
Meng Xu 3e2b3de4d0 FastRestore:RestoreMaster:Remove the extra lockDatabase in RestoreMaster 2019-10-17 00:50:13 -07:00
Meng Xu 2cd7010efb FastRestore:Add fileIndex to RestoreFileFR struct and bug fix
Fix bugs in RestoreMaster that cannot properly lock or unlock DB when
exception occurs;
Fix bug in ordering backup files
2019-10-17 00:50:13 -07:00
Meng Xu 048c341a7d FastRestore:Bug fix after merge with master
Include RestoreWorkerInterface.h instead of RestoreInterface.h into fdbserver.actor.cpp
Report warning instead of error when unlockDatabase throws error.
2019-09-04 21:14:09 -07:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu 2602cb3591 FastRestore:Rename RestoreConfig to RestoreConfigFR to fix link problem in windows
Because the current restore has defined RestoreConfig, windows linker complains.
This commit rename the RestoreConfig used in FastRestore as RestoreConfigFR.
2019-08-02 23:00:12 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu 93680cddd6 FastRestore:Remove redundant constructFilesWithVersionRange 2019-07-29 16:32:48 -07:00
Meng Xu b0c31f28af FastRestore:Fix bug that blocks restore
1) Should recruit only configured number of roles;
2) Should never register a restore master interface as a restore worker (loader or applier) interface.
2019-07-25 17:55:37 -07:00
Meng Xu 45083edf74 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
Meng Xu f1741aa90d FastRestore: Resolve review comments
1) Do not keep restore role data (e.g., masterData) in restore worker;
2) Change function parameter list by only passing in the needed variables in role data;
3) Remove unneccessary files vector from masterData;
4) Change typos in comments and some functions name.
2019-07-24 17:51:53 -07:00
Meng Xu 022b555b69 FastRestore:Fix bug in finish restore
RestoreMaster may not receive all acks. for the last command, i.e., finishRestore,
because RestoreLoaders and RestoreAppliers exit immediately after sending the ack.
If the ack is lost, it will not be resent.

This commit also removes some unneeded code.
This commit passes 50k random tests without errors.
2019-06-05 20:07:18 -07:00
Meng Xu 3fcb6ec0a1 FastRestore:Refactor RestoreLoader and fix bugs
Refactor RestoreLoader code and
Fix a bug in notifying restore finish.
2019-06-04 21:53:31 -07:00
Meng Xu 477fd152c0 FastRestore:Refactor code
1) Use the runRYWTransaction for simple DB access
2) Replace some printf with TraceEvent
3) Remove printf not used in debugging
4) Avoid wait inside the condition in loop-choose-when for
   the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
   the file only has functionalities related to restore worker.

Passed correctness test
2019-06-04 11:22:47 -07:00
Meng Xu 67f5c8b493 FastRestore:Remove performance status
Remove the non-functional code to reduce the code review size.
2019-05-30 20:24:40 -07:00
Meng Xu c3f3ba5b4a FastRestore:Remove old distributeWorkloadPerVersion 2019-05-30 17:47:59 -07:00
Meng Xu 45b9504ba6 FastRestore:Refactor distribute workload for version batch
Rewrite the code that collects files for a version batch and that
distribute workload among loaders for files in a version batch.
The new code is easier to understand and maintain.
2019-05-30 17:39:50 -07:00
Meng Xu a3f61e6df7 FastRestore:Rafctor:Reduce code size
1) Use runRYWTransaction to replace the loop-try style;
2) Remove unnecessary printf
3) Do not mistakenly send reply twice.
2019-05-29 17:03:50 -07:00
Meng Xu 9e1216af1c FastRestore:Remove CMDUID 2019-05-29 13:48:04 -07:00
Meng Xu 4f484a2a5d FastRestore:Refactor out the use of cmdID and other non-must functions 2019-05-29 13:26:17 -07:00
Meng Xu d56837ba16 FastRestore:Refactor LoadFileRequest
1) Remove global map to buffer the parsed mutations on loader.
   Use local map instead to increase parallelism.
2) Use std::map<LoadingParam, Future<Void>> to hold the actor
that parse a backup file and to de-duplicate requests.
3) Remove unused code.
2019-05-28 18:39:00 -07:00
Meng Xu fe2624fc22 FastRestore:Remove sampling phase
Remove the sampling phase to make the PR easier to review.
The sampling design and implementation may be changed and added in
next PR.
2019-05-26 21:34:58 -07:00
Meng Xu d21eb2bcce FastRestore:Simplify sending applier interfaces to loaders 2019-05-26 18:29:24 -07:00
Meng Xu 3eadb31798 FastRestore:Resolve two major reveiw comments
1) Add sendBatchRequests and getBatchReplies

sendBatchRequests is a generic actor to send requests without
processing replies.
getBatchReplies is similar to sendBatchRequests expect that
it returns the reply to caller.

2) Share applier interface to loaders by using RequestStream,
instead of using DB.
   Create RestoreSysInfo struct, similar purpose as DBInfo, for
 the restore system information that are shared among restore workers.
2019-05-24 21:53:21 -07:00
Meng Xu fac63a83c4 FastRestore:Use NotifiedVersion to deduplicate requests
Add a NotifiedVersion into an applier data which represents
the smallest version the applier is at.

When a loader sends mutation vector to appliers, it sends
the request that contains prevVersion and commitVersion.

This commits also put actor into an actorCollector for
loop-choose-when situation.
2019-05-22 22:09:54 -07:00
Meng Xu f235bb7e0d FastRestore:Use readVersion to trigger watch
Use readVersion to trigger watch on the restoreRequestTriggerKey and
restoreRequestDoneKey.
2019-05-22 13:20:59 -07:00
Meng Xu e8cc3add16 FastRestore:Add a general getBatchReplies func
The getBatchReplies takes the RequestStream, a set of interfaces, and
a set of requests.
It sends the requests via the RequestStream of the interfaces and
ensure each request has at least one reply returned.
2019-05-20 20:18:49 -07:00
Meng Xu cc2f4f320f FastRestore:Remove unnecessary delay 2019-05-20 11:30:17 -07:00
Meng Xu 9ea83e0f3c FastRestore:Remove dbprintf 2019-05-17 17:34:42 -07:00
Meng Xu a7f1b69804 FastRestore:Add dbprintf 2019-05-15 21:15:15 -07:00
Meng Xu 3fcdc39b93 FastRestore:Recruit exact number of restore worker
We can configure 1 loader and 1 applier to simplify
the debug process.
2019-05-13 23:04:47 -07:00
Meng Xu e9b881a7f9 FastRestore:Ask applier to apply to DB one by one
This is to simplify the logic to help debug.
2019-05-13 17:47:49 -07:00
Meng Xu fd92ab64e4 FastRestore: Clean code for RestoreApplier
Remove unused code and add comments to actors
2019-05-12 22:05:55 -07:00
Meng Xu 620cdd411e FastRestore:Add comments for each restore file 2019-05-12 21:53:43 -07:00
Meng Xu 879bf8dc7b FastRestore: Bug fix for refactored code 2019-05-10 16:48:01 -07:00
Meng Xu a08a6776f5 FastRestore: Refactor to smaller components
The current code uses one restore interface to handle the work
for all restore roles, i.e., master, loader and applier.
This makes it harder to review or maintain or scale.

This commit split the restore into multiple roles by mimicing FDB
transaction system:
1) It uses a RestoreWorker as the process to host restore roles;
   This commit assumes one restore role per RestoreWorker; but
   it should be easy to extend to support multiple roles per RestoreWorker;
2) It creates 3 restore roles:
   RestoreMaster: Coordinate the restore process and send commands to the other two roles;
   RestoreLoader: Parse backup files to mutations and send mutations to appliers;
   RestoreApplier: Sort received mutations and apply them to DB in order.

Compilable version. To be tested in correctness.
2019-05-10 14:20:06 -07:00