Meng Xu
528466e0e6
FastRestore:Fix Valgrind error InvalidSuppression
...
Trace.error() must explicitly include error_code_actor_cancelled
to handle the error.
2020-05-02 19:52:05 -07:00
Meng Xu
f9f1ac6594
FastRestore:Revise TraceEvent for better diagnosis
2020-05-01 16:31:55 -07:00
Meng Xu
134dbca0ee
FastRestore:Use canonical way to trace error
2020-05-01 13:35:13 -07:00
Meng Xu
41c0a1768f
FastRestore:Make FastRestore event type more descriptive
2020-05-01 10:27:08 -07:00
Meng Xu
038f3834fc
Merge branch 'master' into mengxu/fr-code-improvement-PR
2020-05-01 09:26:29 -07:00
Meng Xu
6bd71560f0
FastRestore:Reduce trace events in real cluster environment
2020-04-30 19:12:31 -07:00
Meng Xu
f073049865
FastRestore:Revise trace events to be descriptive
...
Revert changes that send mutations to appliers out of order
2020-04-24 10:31:08 -07:00
Meng Xu
d21da5065a
FastRestore:Loader:Merge MutationsVec and LogMessageVersionVec into VersionedMutationsVec
...
Remove the actor that sends one mutation message batch in the previous commit,
because that actor no longer reduces the code complexity.
2020-04-21 22:05:34 -07:00
Meng Xu
061bcd2fb4
FastRestore:Replace typeString with safe getTypeString func
...
Also fix compilation error in previous commit
2020-04-13 15:15:54 -07:00
Meng Xu
dbc9c23193
FastRestore:Loader:Send mutations at different versions in the same message to appliers
...
This increases the bandwidth sent from loaders to appliers.
2020-04-12 10:46:58 -07:00
Meng Xu
55ee034e7f
Merge pull request #2916 from jzhou77/backup-fix
...
Remove version stamp ops from RestoreApplier
2020-04-11 14:04:11 -07:00
Meng Xu
2325ab209f
FastRestore:Applier:Avoid extra copy in getAndComputeStagingKeys
2020-04-08 12:22:08 -07:00
Meng Xu
5ebafdb94c
FastRestore:Apply clang-format to changes
2020-04-07 15:57:03 -07:00
Meng Xu
e5b2cd81d5
FastRestore:Cleanup debug code
2020-04-07 15:56:44 -07:00
Jingyu Zhou
cd8215ecf2
Remove version stamp ops from RestoreApplier
...
Version stamp ops are converted into SET at the proxy, so the backup files
will never have them.
2020-04-06 22:27:47 -07:00
Meng Xu
a51ff7aaae
FastRestore:Fix:buildVersionBatches may lose the last log file
...
If the last log file's endVersion decides the last version batch's endVersion,
the buildVersionBatches function may quit early before including the last log file.
This causes some mutations to go missing and leads to an incorrect DB.
This commit also adds an ASSERT(maxVBVersion >= targetVersion) to
flag such errors as early as possible and simplify debugging.
2020-04-06 12:24:26 -07:00
Meng Xu
536e65cd76
FastRestore:Introduce debugFRMutation for debug keys
2020-04-05 15:00:36 -07:00
Meng Xu
432c99afd0
FastRestore:Applier:Keep incompleteStagingKeys content before values are applied to DB
...
This prevents incompleteStagingKeys from being cleared before getAndComputeStagingKeys() finishes using it.
2020-04-04 22:38:04 -07:00
Meng Xu
a81ec332a9
FastRestore:Fix:Master cannot throttle on in-progress version batches when it releases batches out of order in simulation
2020-04-04 17:34:26 -07:00
Meng Xu
6bce67ca75
FastRestore:Apply clang-format
2020-04-01 21:27:54 -07:00
Meng Xu
c69c959428
FastRestore:Fix:It is legal for a backup key to not exist in DB
2020-03-31 22:02:17 -07:00
Meng Xu
25e96a13d3
FastRestore:Fix clearrange on a key mistakenly clearing other keys
2020-03-31 17:45:19 -07:00
Meng Xu
212dadc2a1
Fix bug in add mutation on applier
...
For a clear range mutation, we may clear the right boundary key, which should not be cleared.
2020-03-31 16:51:08 -07:00
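The boundary rule named in the commit above (the right boundary key of a clear range must survive) can be illustrated with a minimal sketch. It assumes a half-open range [begin, end) over an ordered in-memory map. `KeyValueStore` and `clearRange` are hypothetical names for illustration, not the applier's actual code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory store, used only for illustration.
using KeyValueStore = std::map<std::string, std::string>;

// Clears every key in the half-open range [begin, end).
// The end key is the exclusive right boundary and must not be erased.
void clearRange(KeyValueStore& store, const std::string& begin, const std::string& end) {
    assert(begin <= end);
    auto first = store.lower_bound(begin); // first key >= begin
    auto last = store.lower_bound(end);    // first key >= end, so `end` itself is kept
    store.erase(first, last);
}
```

Using `upper_bound(end)` for the second iterator would also erase the boundary key, which is the kind of off-by-one the commit message describes.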
Meng Xu
33c4be9c42
Improve debug message for debug mutations
2020-03-31 16:00:51 -07:00
Jingyu Zhou
65e3b9192e
Add an assert for probably dead code
2020-03-28 21:19:47 -07:00
Meng Xu
13f343ec96
Resolve minor review comment
2020-03-28 16:03:01 -07:00
Jingyu Zhou
196127fb92
Address review comments
2020-03-23 14:15:36 -07:00
Jingyu Zhou
fea6155714
StagingKey uses mutation instead of a vector of mutations for each log version
...
Because each log version contains commit version and subsequence number, each
key can only have one mutation for its log version. This simplifies
StagingKey::add() a lot.
2020-03-20 20:15:09 -07:00
Jingyu Zhou
799f0b4b0e
Small code refactor
2020-03-20 20:15:09 -07:00
Jingyu Zhou
ab0b59b0c3
Add subsequence number to restore loader & applier
...
The subsequence number is needed so that mutations of the same commit version
number, but from different partitioned logs can be correctly reassembled in
order.
For old backup files, the sub number is always 0. For partitioned mutation
logs, the actual sub number is used. For range files, the sub number is always
0.
2020-03-20 20:13:38 -07:00
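As a rough sketch of why the subsequence number matters, mutations from different partitioned logs can be merged by ordering on the pair (commit version, subsequence). The `VersionedMutation` struct and `mergeInOrder` function below are hypothetical and only illustrate the ordering, not the loader's real data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record: a mutation tagged with its commit version and
// subsequence number (0 for old backup files and range files).
struct VersionedMutation {
    int64_t version;
    uint32_t sub;
    std::string payload;
};

// Order first by commit version, then by subsequence number.
inline bool operator<(const VersionedMutation& a, const VersionedMutation& b) {
    return std::tie(a.version, a.sub) < std::tie(b.version, b.sub);
}

// Merges mutations read from two partitioned logs into one ordered stream.
std::vector<VersionedMutation> mergeInOrder(std::vector<VersionedMutation> a,
                                            const std::vector<VersionedMutation>& b) {
    a.insert(a.end(), b.begin(), b.end());
    std::stable_sort(a.begin(), a.end());
    assert(std::is_sorted(a.begin(), a.end()));
    return a;
}
```

Without the `sub` field, two mutations at the same commit version from different logs would have no defined relative order.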
Andrew Noyes
c3b67c0c63
Fix OPEN_FOR_IDE build
2020-03-03 11:32:43 -08:00
Meng Xu
e6457ba0d5
FastRestore:Correct type for incompleteStagingKeys
2020-03-02 11:33:07 -08:00
Meng Xu
2520e8d44c
FastRestore:Use more concise code as suggested in review
2020-03-01 22:32:36 -08:00
Meng Xu
01c1a15caf
FastRestore:Applier:Limit fetch keys number in a txn in getAndComputeStagingKeys
2020-02-28 16:53:36 -08:00
Meng Xu
fe8b8bbbff
FastRestore:Change vb state to class from enum
2020-02-27 20:15:25 -08:00
Meng Xu
d77177367c
FastRestore:Track each ongoing version batch progress state for applier and loader roles
2020-02-27 19:47:22 -08:00
Meng Xu
fbb6e8f39d
FastRestore:Create low memory situation in simulation on purpose
2020-02-26 14:54:38 -08:00
Meng Xu
06495b90ae
FastRestore:Loader:Use isSchedulable to guard OOM
...
And trigger delayed actors that are blocked on memory to recheck memory.
2020-02-26 14:35:05 -08:00
Meng Xu
a354f6ffa2
FastRestore:Applier:Use isSchedulable to guard OOM
2020-02-26 14:12:56 -08:00
Meng Xu
ca726fc68e
FastRestore:Introduce OOM protection
...
An actor is schedulable to run if the current worker has enough resources, i.e.,
the worker's memory usage is below the threshold;
Exception: If the actor is working on the current version batch, we have to schedule
the actor to run to avoid deadlock.
Future: When we release the actors that are blocked by memory usage, we should release them
in increasing order of their version batch.
2020-02-26 14:09:18 -08:00
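The scheduling rule described in the commit above can be sketched as a small predicate. This is a minimal illustration under assumptions: `WorkerState`, its fields, and this `isSchedulable` signature are hypothetical and may differ from the real implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of a restore worker's state.
struct WorkerState {
    int64_t memoryUsedBytes;
    int64_t memoryThresholdBytes;
    int currentBatchIndex; // version batch the worker has to finish next
};

// An actor may run if memory usage is below the threshold. An actor that
// works on the current version batch always runs, because blocking it
// could deadlock the restore.
bool isSchedulable(const WorkerState& worker, int actorBatchIndex) {
    assert(worker.memoryThresholdBytes > 0);
    if (actorBatchIndex == worker.currentBatchIndex) {
        return true;
    }
    return worker.memoryUsedBytes < worker.memoryThresholdBytes;
}
```

The exception for the current batch is what keeps progress possible: finishing that batch is the only way buffered memory gets released.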
Meng Xu
fbf5020af9
FastRestore:Applier:Add fetchKeys counter
2020-02-26 11:37:40 -08:00
Meng Xu
6bd4703a9f
FastRestore:Resolve review comments
2020-02-20 14:27:34 -08:00
Meng Xu
d5d26f589f
FastRestore:Cosmetic change to improve code readability
2020-02-19 15:43:51 -08:00
Meng Xu
7897b1658f
FastRestore:Applier:Rename new apply actor names
2020-02-19 15:29:32 -08:00
Meng Xu
e4258d73f5
FastRestore:Applier:Remove applying actors that do not have good perf
2020-02-19 15:27:59 -08:00
Meng Xu
fe75a4cafb
FastRestore:Apply clang-format
2020-02-19 15:22:52 -08:00
Meng Xu
03f699f2f9
Merge branch 'master' into mengxu/fast-restore-applier-multi-applying-PR
2020-02-19 15:22:33 -08:00
Meng Xu
94d799552e
FastRestore:Apply clang-format against master
2020-02-18 16:41:59 -08:00
Meng Xu
132f5aa9ba
FastRestore:Improve trace name and cosmetic change
2020-02-18 16:41:19 -08:00
Meng Xu
31a6ec34b7
Merge branch 'master' into mengxu/fast-restore-agent-PR
2020-02-18 16:17:59 -08:00
Meng Xu
c603b20e7e
FastRestore:Resolve review comments
2020-02-18 14:08:27 -08:00
Meng Xu
c34a69df32
FastRestore:Applier:Remove unused func
2020-02-13 23:06:56 -08:00
Meng Xu
3e2c19630a
FastRestore:Applier:atomicOp can work on an empty key
2020-02-13 23:05:54 -08:00
Meng Xu
b57583a504
FastRestore:Applier:Handle multiple gets in parallel
2020-02-13 23:05:31 -08:00
Meng Xu
53f427c319
FastRestore:Applier:fix getAndComputeStagingKeys
2020-02-13 22:11:30 -08:00
Meng Xu
0d668ea0c3
FastRestore:Applier:Add more trace for perf tracking
2020-02-13 15:50:10 -08:00
Meng Xu
0b27786811
FastRestore:Applier:Minor change for clang-format
2020-02-13 13:17:32 -08:00
Meng Xu
b5e60585aa
FastRestore:Applier:Fix precompute mutation result
2020-02-13 12:57:47 -08:00
Meng Xu
b1b44d4477
FastRestore:Applier:Handle CompareAndClear atomicOp
2020-02-13 11:11:29 -08:00
Meng Xu
58dad5373b
FastRestore:Applier:Handle CompareAndClear atomicOp
2020-02-13 11:06:20 -08:00
Meng Xu
d3c01763d9
FastRestore:Applier:Handle version stamped key values
2020-02-13 10:48:36 -08:00
Meng Xu
b008df97eb
FastRestore:Applier:Multiple set-clear mutations at same version
2020-02-13 10:13:46 -08:00
Meng Xu
238b2cb8e4
FastRestore:Applier:Fix various bugs
...
1. segmentation error
2. some mutations are not set, clear, or atomicOp; the precomputed result should ignore them.
2020-02-13 10:00:23 -08:00
Meng Xu
acf34319c1
FastRestore:Applier:Precompute mutations and apply in parallel
...
Precompute mutations received by an applier;
Only apply the final result to the destination DB;
Execute multiple txns in parallel to apply final results to the destination DB.
2020-02-12 22:47:48 -08:00
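The precompute idea in the commit above can be sketched as follows: replay each key's mutations in version order in memory, and write only the final outcome to the destination. All names here (`Mutation`, `MutationType`, `precompute`) are hypothetical, and only set and clear are modeled. Atomic operations, which may need the key's existing value, are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

enum class MutationType { Set, Clear };

// Hypothetical single-key mutation received by an applier.
struct Mutation {
    int64_t version;
    MutationType type;
    std::string key;
    std::string value; // ignored for Clear
};

// Final outcome per key; std::nullopt means the key ends up cleared.
using FinalResults = std::map<std::string, std::optional<std::string>>;

FinalResults precompute(const std::vector<Mutation>& mutations) {
    // Stage mutations by key, ordered by version within each key.
    std::map<std::string, std::map<int64_t, const Mutation*>> staging;
    for (const Mutation& m : mutations) {
        staging[m.key][m.version] = &m;
    }
    FinalResults results;
    for (const auto& [key, byVersion] : staging) {
        std::optional<std::string> value;
        for (const auto& [version, m] : byVersion) {
            assert(version == m->version);
            if (m->type == MutationType::Set) {
                value = m->value;
            } else {
                value.reset();
            }
        }
        results[key] = value; // only this final value would be written to the DB
    }
    return results;
}
```

Because each key's final value is independent of the others, the resulting writes can be split across several transactions that run in parallel.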
Meng Xu
2bc82ffd70
FastRestore:Applier:Store received mutation by key
2020-02-12 14:12:38 -08:00
Meng Xu
c0f75d77b1
FastRestore:Applier:Intro StagingKey struct
2020-02-12 13:57:18 -08:00
Meng Xu
3e6bbe9e5b
FastRestore:Applier:Use real size for atomic op
2020-02-11 15:51:32 -08:00
Meng Xu
cda8fc189e
FastRestore:AtomicOp:Intro weighted size for atomicOp
...
atomicOp has an amplified performance overhead to the cluster,
for example, an ADD operation can be small, but SS has to load
the value to do the operation and the value can be large.
2020-02-11 12:48:05 -08:00
Meng Xu
e76b6d824a
FastRestore:Assign priority to actors to prioritize vb work
...
When we pipeline multiple version batches, we should prevent a later
version batch from blocking the earlier version batch by consuming
CPU resources.
To achieve the above, we should assign higher priority to actors
in later phases in a version batch.
Because the restore master will not invoke an actor at a later phase until
the actors at the earlier phases have finished, this priority assignment
will not cause deadlock.
2020-02-10 20:29:23 -08:00
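One way to picture this priority scheme is a run queue that always picks the task in the latest phase. The phase names, `Task`, and `RunQueue` below are hypothetical, and a plain `std::priority_queue` stands in for the real actor scheduler.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical phases of a version batch, from earliest to latest.
enum class Phase : int { LoadFiles = 0, SendMutations = 1, ApplyToDB = 2 };

struct Task {
    int batchIndex;
    Phase phase;
};

// Later phases get a higher priority value and therefore run first.
int priorityOf(const Task& t) { return static_cast<int>(t.phase); }

struct LowerPriority {
    bool operator()(const Task& a, const Task& b) const {
        return priorityOf(a) < priorityOf(b);
    }
};

using RunQueue = std::priority_queue<Task, std::vector<Task>, LowerPriority>;

// Removes and returns the highest-priority task.
Task popNext(RunQueue& queue) {
    assert(!queue.empty());
    Task next = queue.top();
    queue.pop();
    return next;
}
```

In this sketch an earlier batch that has reached its apply phase is served before a later batch that is still loading files, so the later batch cannot starve it of CPU.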
Meng Xu
325bd52939
FastRestore:Applier:Count appliedTxns
2020-02-10 17:13:20 -08:00
Meng Xu
dbce1e9974
FastRestore:Applier:Add metrics counter and proc counter
2020-02-10 16:38:26 -08:00
Meng Xu
9b7a00a64f
FastRestore:Mute trace when applying to DB
2020-02-06 20:52:24 -08:00
Meng Xu
dc848f4297
FastRestore:Disable verbose trace for perf. measurement
2020-02-06 20:50:23 -08:00
Meng Xu
cab9d51e06
Merge branch 'master' into mengxu/fast-restore-pipeline-PR
2020-01-27 18:16:26 -08:00
Meng Xu
141609e80a
FastRestore:Improve code style and fix typos
2020-01-27 18:13:14 -08:00
Meng Xu
b04e98771e
FastRestore:Replace FastRestoreOpConfig with Knobs
...
And randomize value for the rest of knobs
2020-01-24 14:24:34 -08:00
Meng Xu
4bf579a6d5
FastRestore:Fix race condition in pipeline
...
Master should not start asking appliers to apply mutations at batchIndex
until all appliers have applied mutations at (batchIndex - 1).
Otherwise, mutations may not be applied in increasing order of versions,
because appliers at different batch index can have overlapped key ranges.
2020-01-23 16:34:45 -08:00
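The ordering constraint in the commit above can be sketched as a gate the master checks before it asks appliers to apply a batch. `canStartApplying` and the `lastAppliedBatch` bookkeeping are hypothetical. Batch indices are assumed to start at 1, with 0 meaning nothing has been applied yet.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// lastAppliedBatch[i] is the highest batch index that applier i has fully
// applied (0 if none). The master may start applying batchIndex only when
// every applier has finished batchIndex - 1.
bool canStartApplying(int batchIndex, const std::vector<int>& lastAppliedBatch) {
    assert(batchIndex >= 1);
    return std::all_of(lastAppliedBatch.begin(), lastAppliedBatch.end(),
                       [batchIndex](int applied) { return applied >= batchIndex - 1; });
}
```

Checking all appliers, rather than only the one that will receive the request, matters because key ranges assigned in different batches can overlap across appliers.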
Meng Xu
e011f39829
FastRestore:Add sanity check and trace events
2020-01-23 16:03:41 -08:00
Meng Xu
009fcdeb16
FastRestore:Sanity check each restore asset is processed exactly once
2020-01-21 17:17:45 -08:00
Meng Xu
022783b449
Start batches in reverse order for testing and code cleanup
2020-01-21 14:49:40 -08:00
Meng Xu
4ac92d223b
Cleanup batch buffer for each restore request
2020-01-21 14:49:36 -08:00
Meng Xu
1a130b0df3
FastRestore:Fix race condition on handleApplyToDBRequest
2020-01-17 17:01:09 -08:00
Meng Xu
bfbf2164c4
FastRestore:Applier buffer data for multiple batches
2020-01-17 17:01:01 -08:00
Meng Xu
67e913c3d5
Change LoadingParam struct and endVersion definition
...
1) Remove endVersion field because it has been included in RestoreAsset;
2) Ensure endVersion in VersionBatch and RestoreAsset is always exclusive;
3) Revise ASSERT in loader and applier in situations when the dummy commit version
is endVersion, to avoid false positive ASSERT failures.
2020-01-07 11:48:03 -08:00
Meng Xu
8d6f511816
FastRestore:Resolve review comment
...
Filter out range mutations that do not overlap with the restore range.
Small changes on format.
2019-12-22 20:09:10 -08:00
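The filtering step mentioned above reduces to an overlap test between two half-open key ranges. The sketch below is illustrative only: `KeyRange`, `overlaps`, and `filterRangeMutations` are hypothetical names, and plain strings stand in for keys.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical half-open key range [begin, end).
struct KeyRange {
    std::string begin;
    std::string end;
};

// Two half-open ranges overlap iff each one starts before the other ends.
bool overlaps(const KeyRange& a, const KeyRange& b) {
    return a.begin < b.end && b.begin < a.end;
}

// Keeps only the range mutations that touch the restore range.
std::vector<KeyRange> filterRangeMutations(const std::vector<KeyRange>& mutations,
                                           const KeyRange& restoreRange) {
    assert(restoreRange.begin <= restoreRange.end);
    std::vector<KeyRange> kept;
    for (const KeyRange& m : mutations) {
        if (overlaps(m, restoreRange)) {
            kept.push_back(m);
        }
    }
    return kept;
}
```

A range that merely touches the restore range at a boundary, such as [a, b) against [b, d), does not overlap and is dropped.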
Meng Xu
ddcf3fdd80
FastRestore:Apply clang format
2019-12-20 22:00:36 -08:00
Meng Xu
d888e3100b
FastRestore:Applier:Add invariant
2019-12-20 19:34:28 -08:00
Meng Xu
e98b2a0d1c
FastRestore:Introduce RestoreAsset
2019-12-20 18:00:10 -08:00
Meng Xu
1371db4cdc
FastRestore:Self code review and cleanup
...
1. Review memory use cases and improve:
Ensure state variables are initialized and
change unnecessary state variables to plain variables.
2. Remove debug code that is no longer useful;
3. Mute verbose debug.
2019-12-11 16:37:33 -08:00
Meng Xu
feb2a8c70c
FastRestore:Change RestoreSendMutationVectorVersionedRequest name
...
Change RestoreSendMutationVectorVersionedRequest to
RestoreSendVersionedMutationsRequest for better naming
2019-12-10 17:23:40 -08:00
Meng Xu
39a4f2372f
Change FASTRESTORE_SAMPLING_PERCENT to 0 to 100
2019-12-04 21:26:27 -08:00
Meng Xu
c6b36dbffb
FastRestore:Sampling:Resolve review comments
2019-12-04 17:35:11 -08:00
Meng Xu
2b987d1945
FastRestore:typedef Standalone<VectorRef<MutationRef>> MutationsVec
2019-12-04 11:39:55 -08:00
Meng Xu
9383c3f0a6
FastRestore:Sampling:Apply clang format
2019-12-03 21:27:06 -08:00
Meng Xu
3310f67e9e
Merge branch 'mengxu/fast-restore-fix-valgrind-PR' into mengxu/fast-restore-sampling-PR
2019-12-03 16:24:40 -08:00
Meng Xu
530b689299
Move state variable to the start of function
2019-11-26 11:17:59 -08:00
Meng Xu
474f0067c4
Remove unneeded state
2019-11-25 23:10:14 -08:00
Meng Xu
bb97307f08
FastRestore:Applier:Move state variables to the start of actor
2019-11-25 21:25:14 -08:00
Jingyu Zhou
ae7e42face
Merge pull request #2313 from xumengpanda/mengxu/fastrestore-applyToDB-bugfix-PR
...
Performant restore [8/XX]: Fix bugs in applyToDB logic and add more tests
2019-11-12 08:50:23 -08:00
Meng Xu
630c29d160
FastRestore:resolve review comments
...
1) wait on whenAtLeast;
2) Put BigEndian64 into the function call and the decoder to prevent
future people from making the same mistake.
2019-11-11 17:00:16 -08:00
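The second review point above suggests keeping the big-endian conversion inside a matched encoder and decoder, so callers cannot forget it. The sketch below assumes the value is a 64-bit version stored as 8 bytes; `encodeVersionKey` and `decodeVersionKey` are hypothetical names, not the functions touched by the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encodes a 64-bit version as 8 big-endian bytes, so that byte-wise key
// order matches numeric order.
std::string encodeVersionKey(uint64_t version) {
    std::string out(8, '\0');
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<char>(version & 0xff);
        version >>= 8;
    }
    return out;
}

// Decodes the 8-byte big-endian form produced by encodeVersionKey.
uint64_t decodeVersionKey(const std::string& key) {
    assert(key.size() == 8);
    uint64_t version = 0;
    for (unsigned char byte : key) {
        version = (version << 8) | byte;
    }
    return version;
}
```

With the byte order handled in one place, callers pass and receive plain integers and never see the encoded form.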