Commit Graph

63 Commits

Author SHA1 Message Date
Jingyu Zhou 0f57bf9685 Remove a SevError event
The same mutation can be present in overlapping mutation logs, so we cannot
assert its absence. This can happen for multiple reasons. One possibility
is that new TLogs can copy mutations from old generation TLogs; another
is that a backup worker is recruited without knowing previously saved progress.
2020-03-25 15:23:21 -07:00
Jingyu Zhou 196127fb92 Address review comments 2020-03-23 14:15:36 -07:00
Meng Xu 3f31ebf659 New backup:Revise event name and explain code 2020-03-23 10:55:44 -07:00
Jingyu Zhou 3513bbefe6 StagingKey uses mutation instead of a vector of mutations for each log version
Because each log version contains commit version and subsequence number, each
key can only have one mutation for its log version. This simplifies
StagingKey::add() a lot.
2020-03-18 16:44:17 -07:00
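
As a rough illustration of the idea in this commit message, the sketch below keeps at most one mutation per log version, where a log version is the pair (commit version, subsequence number). The class and method names are hypothetical and are not taken from the FoundationDB source; the real StagingKey is C++ and carries more state.

```python
from typing import Dict, List, Tuple

# A log version is assumed here to be (commit version, subsequence number).
LogVersion = Tuple[int, int]


class StagingKeySketch:
    """Hypothetical stand-in for StagingKey: one mutation per log version."""

    def __init__(self) -> None:
        self.pending: Dict[LogVersion, str] = {}

    def add(self, log_version: LogVersion, mutation: str) -> bool:
        """Record a mutation. Return False if this log version was already seen."""
        if log_version in self.pending:
            return False  # duplicate delivery, ignore it
        self.pending[log_version] = mutation
        return True

    def in_order(self) -> List[str]:
        """Return mutations sorted by (commit version, subsequence number)."""
        return [self.pending[v] for v in sorted(self.pending)]


key = StagingKeySketch()
first = key.add((100, 2), "add 1")
second = key.add((100, 1), "set 5")   # arrives out of order
repeat = key.add((100, 2), "add 1")   # same log version again
```

Because the map is keyed by the full log version, arrival order does not matter and a repeated delivery is dropped without a separate duplicate check.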
Jingyu Zhou b697e46b19 Fix duplicated mutation in StagingKey
For reasons not yet understood, duplicated mutations can be added to
StagingKey and need to be filtered out. Otherwise, atomic operations can
result in corrupted data in the database.
2020-03-18 16:41:35 -07:00
Jingyu Zhou af967210ee StagingKey can add out-of-order mutations
For partitioned logs, mutations of the same version may be sent to an applier
out of order. If one loader advances to the next version, an applier may
receive later-version mutations from different loaders. So dropping earlier
mutations is wrong.
2020-03-18 16:41:35 -07:00
Jingyu Zhou ace409b49a Add subsequence number to restore loader & applier
The subsequence number is needed so that mutations of the same commit version
number but from different partitioned logs can be correctly reassembled in
order.

For old backup files, the sub number is always 0. For partitioned mutation
logs, the actual sub number is used. For range files, the sub number is always
0.
2020-03-18 16:41:34 -07:00
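
A minimal sketch of the reassembly described in this commit message, assuming each partitioned log is already sorted by (commit version, subsequence number). The `LoggedMutation` type and `reassemble` function are hypothetical names used only for illustration.

```python
import heapq
from typing import Iterable, List, NamedTuple


class LoggedMutation(NamedTuple):
    version: int   # commit version
    sub: int       # subsequence number within that commit version
    payload: str   # stand-in for the actual mutation


def reassemble(*logs: Iterable[LoggedMutation]) -> List[LoggedMutation]:
    """Merge per-log streams, each already sorted by (version, sub)."""
    return list(heapq.merge(*logs, key=lambda m: (m.version, m.sub)))


log_a = [LoggedMutation(100, 1, "set k1"), LoggedMutation(100, 3, "clear k1")]
log_b = [LoggedMutation(100, 2, "add k1"), LoggedMutation(101, 1, "set k2")]
ordered = reassemble(log_a, log_b)
```

Without the subsequence number, the three mutations at version 100 would compare equal and their relative order across logs could not be recovered.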
Meng Xu 2520e8d44c FastRestore:Use more concise code as suggested in review 2020-03-01 22:32:36 -08:00
Meng Xu d001820219 FastRestore:getVersionBatchState can be called before VB is init 2020-02-28 14:47:11 -08:00
Meng Xu 62b9043ff6 FastRestore:DB can be destroyed before master unlock it in simulation
Because restore roles run as a workload in simulation,
they do not know when the DB is destroyed by the backup and restore test workload.
So if the DB is destroyed before the restore master unlocks it, which is rare,
the restore master should abort the DB unlock step.
2020-02-28 14:25:58 -08:00
Meng Xu 22b34bc609 FastRestore:getVersionBatchState can be called before version batch is initialized 2020-02-27 23:45:48 -08:00
Meng Xu 6018b64d73 FastRestore:Fix undefined ref to vtable error 2020-02-27 21:18:10 -08:00
Meng Xu f4cd0ef74f FastRestore:Apply clang-format 2020-02-27 20:59:56 -08:00
Meng Xu a6e66da29f FastRestore:fix compilation error for version batch state 2020-02-27 20:59:34 -08:00
Meng Xu fe8b8bbbff FastRestore:Change vb state to class from enum 2020-02-27 20:15:25 -08:00
Meng Xu d77177367c FastRestore:Track each ongoing version batch progress state for applier and loader roles 2020-02-27 19:47:22 -08:00
Meng Xu ca726fc68e FastRestore:Introduce OOM protection
An actor is schedulable to run if the current worker has enough resources, i.e.,
the worker's memory usage is below the threshold.
Exception: If the actor is working on the current version batch, we have to schedule
the actor to run to avoid deadlock.
Future: When we release the actors that are blocked by memory usage, we should release them
in increasing order of their version batch.
2020-02-26 14:09:18 -08:00
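
The scheduling rule in this commit message can be sketched roughly as follows. The function names, parameters, and the use of plain integers for version batches are assumptions made for illustration; they do not come from the FastRestore code.

```python
from typing import Iterable, List


def is_schedulable(memory_used: int, memory_threshold: int,
                   actor_batch: int, current_batch: int) -> bool:
    """Decide whether an actor may run under the memory guard."""
    # Actors on the current version batch always run, to avoid deadlock.
    if actor_batch == current_batch:
        return True
    # Everything else runs only while memory usage is below the threshold.
    return memory_used < memory_threshold


def release_order(blocked_batches: Iterable[int]) -> List[int]:
    """Order in which blocked actors would be released: lowest batch first."""
    return sorted(blocked_batches)


over_limit_current = is_schedulable(900, 800, actor_batch=3, current_batch=3)
over_limit_later = is_schedulable(900, 800, actor_batch=4, current_batch=3)
under_limit_later = is_schedulable(500, 800, actor_batch=4, current_batch=3)
```

The release ordering corresponds to the "Future" note in the message, so it describes an intended behavior rather than one this commit necessarily implements.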
Meng Xu fbf5020af9 FastRestore:Applier:Add fetchKeys counter 2020-02-26 11:37:40 -08:00
Meng Xu 6bd4703a9f FastRestore:Resolve review comments 2020-02-20 14:27:34 -08:00
Meng Xu 898a1ea3ed FastRestore:Applier:Handle mutations at same version 2020-02-19 17:29:46 -08:00
Meng Xu d5d26f589f FastRestore:Cosmetic change to improve code readability 2020-02-19 15:43:51 -08:00
Meng Xu 03f699f2f9 Merge branch 'master' into mengxu/fast-restore-applier-multi-applying-PR 2020-02-19 15:22:33 -08:00
Meng Xu 94d799552e FastRestore:Apply clang-format against master 2020-02-18 16:41:59 -08:00
Meng Xu 3e2c19630a FastRestore:Applier:atomicOp can work on an empty key 2020-02-13 23:05:54 -08:00
Meng Xu 0b27786811 FastRestore:Applier:Minor change for clang-format 2020-02-13 13:17:32 -08:00
Meng Xu b5e60585aa FastRestore:Applier:Fix precompute mutation result 2020-02-13 12:57:47 -08:00
Meng Xu b1b44d4477 FastRestore:Applier:Handle CompareAndClear atomicOp 2020-02-13 11:11:29 -08:00
Meng Xu 58dad5373b FastRestore:Applier:Handle CompareAndClear atomicOp 2020-02-13 11:06:20 -08:00
Meng Xu d3c01763d9 FastRestore:Applier:Handle version stamped key values 2020-02-13 10:48:36 -08:00
Meng Xu b008df97eb FastRestore:Applier:Multiple set-clear mutations at same version 2020-02-13 10:13:46 -08:00
Meng Xu 238b2cb8e4 FastRestore:Applier:Fix various bugs
1. Segmentation error.
2. Some mutations are not set, clear, or atomicOp; the precomputed result should ignore them.
2020-02-13 10:00:23 -08:00
Meng Xu 4394605b6f FastRestore:Applier:Fix:Final result mutation type must be set 2020-02-13 09:14:59 -08:00
Meng Xu acf34319c1 FastRestore:Applier:Precompute mutations and apply in parallel
Precompute mutations received by an applier;
Only apply the final result to the destination DB;
Execute multiple txns in parallel to apply final results to the destination DB.
2020-02-12 22:47:48 -08:00
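
A simplified sketch of the precompute step, assuming integer values and only three mutation kinds (set, clear, add). It folds all mutations for one key into a single final value, so only that result needs to be written to the destination DB. The `precompute` function and the tuple encoding are hypothetical; the parallel transaction step is not shown.

```python
from typing import List, Optional, Tuple

# ("set", v), ("clear", None), or ("add", n); other kinds are ignored.
Mutation = Tuple[str, Optional[int]]


def precompute(initial: Optional[int], mutations: List[Mutation]) -> Optional[int]:
    """Fold ordered mutations for one key into its final value (None = absent)."""
    value = initial
    for op, arg in mutations:
        if op == "set":
            value = arg
        elif op == "clear":
            value = None
        elif op == "add":
            # Assumption: an add on a missing key starts from zero.
            value = (value or 0) + (arg or 0)
        # Any other mutation type is skipped.
    return value


final_a = precompute(None, [("set", 5), ("add", 2), ("clear", None), ("add", 3)])
final_b = precompute(10, [("add", 1), ("noop", None)])
```

Here four mutations on one key collapse to a single write, which is what makes it possible to apply results for different keys in independent transactions.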
Meng Xu 2bc82ffd70 FastRestore:Applier:Store received mutation by key 2020-02-12 14:12:38 -08:00
Meng Xu c0f75d77b1 FastRestore:Applier:Intro StagingKey struct 2020-02-12 13:57:18 -08:00
Meng Xu cda8fc189e FastRestore:AtomicOp:Intro weighted size for atomicOp
atomicOp has an amplified performance overhead on the cluster.
For example, an ADD operation can be small, but the SS has to load
the value to do the operation, and the value can be large.
2020-02-11 12:48:05 -08:00
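
One way to picture a weighted size is to scale the byte size of atomic operations by a constant factor when accounting for load. The factor below and the function name are purely hypothetical; the commit message does not state the actual weight or formula.

```python
# Hypothetical weight; the real value is not given in the commit message.
ATOMIC_OP_WEIGHT = 10


def weighted_size(is_atomic_op: bool, size_bytes: int) -> int:
    """Return the size used for load accounting, inflated for atomic ops."""
    if is_atomic_op:
        return size_bytes * ATOMIC_OP_WEIGHT
    return size_bytes


plain_set = weighted_size(False, 8)
atomic_add = weighted_size(True, 8)
```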
Meng Xu 0c5997ca2d FastRestore:Add more traces 2020-02-10 17:01:59 -08:00
Meng Xu ad93e7bb0c FastRestore:Metrics:Minor change on trace name 2020-02-10 16:52:56 -08:00
Meng Xu dbce1e9974 FastRestore:Applier:Add metrics counter and proc counter 2020-02-10 16:38:26 -08:00
Meng Xu 141609e80a FastRestore:Improve code style and fix typos 2020-01-27 18:13:14 -08:00
Meng Xu 4ac92d223b Cleanup batch buffer for each restore request 2020-01-21 14:49:36 -08:00
Meng Xu bfbf2164c4 FastRestore:Applier buffer data for multiple batches 2020-01-17 17:01:01 -08:00
Meng Xu 35bc92b9a4 FastRestore:Refactor code to enable pipeline on Applier 2020-01-14 13:23:33 -08:00
Meng Xu e98b2a0d1c FastRestore:Introduce RestoreAsset 2019-12-20 18:00:10 -08:00
Meng Xu 1371db4cdc FastRestore:Self code review and cleanup
1. Review memory use cases and improve:
Ensure state variable is initialized and
change unnecessary state variable to variable.

2. Remove debug code that is no longer useful;

3. Mute verbose debug.
2019-12-11 16:37:33 -08:00
Andrew Noyes de8921b660 Move RestoreWorkerInterface to fdbclient 2019-10-25 10:42:22 -07:00
Andrew Noyes d4de608bb6 Fix OPEN_FOR_IDE build 2019-10-25 10:42:22 -07:00
Meng Xu 6f1ecd1b11 FastRestore:handleSendMutationVectorRequest:Receive mutations in order of versions 2019-10-21 14:31:21 -07:00
Meng Xu 0cd87df985 FastRestore:resetPerVersionBatch:fix compile error 2019-10-17 00:50:13 -07:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00