If the data at a single version is larger than FASTRESTORE_VERSIONBATCH_MAX_BYTES,
we should allow a version batch to include that version and ignore the
FASTRESTORE_VERSIONBATCH_MAX_BYTES limit, to avoid a false positive in simulation.
In a real environment, this situation reports a SevError asking the DBA to
increase the memory limit for a version batch.
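
The batching rule above can be sketched roughly as follows. This is a minimal illustration, not the actual restore code: `VersionBatchSketch`, `buildVersionBatches`, and the `overLimit` flag are hypothetical names, and the input is assumed to be a map from version to the bytes recorded at that version.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical type for illustration; not the FoundationDB VersionBatch.
struct VersionBatchSketch {
    int64_t beginVersion; // inclusive
    int64_t endVersion;   // exclusive
    int64_t bytes;        // total bytes of all versions in the batch
    bool overLimit;       // true when a single version alone exceeds maxBytes
};

// Groups consecutive versions into batches of at most maxBytes.
// A version whose data alone exceeds maxBytes still gets a batch of its own,
// so batching always makes progress; the caller can decide how to report it.
std::vector<VersionBatchSketch> buildVersionBatches(const std::map<int64_t, int64_t>& bytesAtVersion,
                                                    int64_t maxBytes) {
    std::vector<VersionBatchSketch> batches;
    VersionBatchSketch cur{ 0, 0, 0, false };
    bool open = false;
    for (const auto& kv : bytesAtVersion) {
        const int64_t version = kv.first;
        const int64_t bytes = kv.second;
        // Close the current batch if adding this version would exceed the limit.
        if (open && cur.bytes + bytes > maxBytes) {
            batches.push_back(cur);
            open = false;
        }
        if (!open) {
            cur = VersionBatchSketch{ version, version + 1, 0, false };
            open = true;
        }
        cur.bytes += bytes;
        cur.endVersion = version + 1;
        if (bytes > maxBytes) {
            cur.overLimit = true; // the limit is ignored for this single version
        }
    }
    if (open) {
        batches.push_back(cur);
    }
    return batches;
}
```

In this sketch an over-limit version always ends up alone in its batch, which makes the condition easy to detect and report separately from normal batches.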
Master should not start asking appliers to apply mutations at batchIndex
until all appliers have applied mutations at (batchIndex - 1).
Otherwise, mutations may not be applied in increasing order of versions,
because appliers at different batch indexes can have overlapping key ranges.
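
A minimal sketch of this ordering rule, assuming 0-based batch indexes and integer applier IDs. `BatchBarrierSketch` and its methods are hypothetical names for illustration; the real master tracks this state through its own actors and requests.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: records which appliers have finished applying each batch.
class BatchBarrierSketch {
public:
    explicit BatchBarrierSketch(std::size_t numAppliers) : numAppliers_(numAppliers) {}

    // Called when an applier reports that it has applied all mutations of batchIndex.
    void recordApplied(std::size_t batchIndex, int applierId) {
        if (acks_.size() <= batchIndex) {
            acks_.resize(batchIndex + 1);
        }
        acks_[batchIndex].insert(applierId);
    }

    // Batch batchIndex may start only after every applier finished batchIndex - 1.
    // The first batch (index 0, an assumption of this sketch) has no predecessor.
    bool canStartApplying(std::size_t batchIndex) const {
        if (batchIndex == 0) {
            return true;
        }
        const std::size_t prev = batchIndex - 1;
        return prev < acks_.size() && acks_[prev].size() == numAppliers_;
    }

private:
    std::size_t numAppliers_;
    std::vector<std::set<int>> acks_;
};
```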
1) Sort logfiles by endVersion
2) Exit program early when restore will not succeed
3) Do not increase nextVersion unnecessarily when
calculating version batches.
4) Change assert condition that ensures progress in
calculating version batches.
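
Item 1 could look roughly like the following. `LogFileSketch` and `sortByEndVersion` are hypothetical names, and the use of a stable sort (so files with equal endVersion keep their input order) is an assumption of this sketch rather than something the change specifies.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a log file; not the FoundationDB type.
struct LogFileSketch {
    std::string name;
    int64_t beginVersion;
    int64_t endVersion;
};

// Orders log files by ascending endVersion, keeping input order for ties.
void sortByEndVersion(std::vector<LogFileSketch>& files) {
    std::stable_sort(files.begin(), files.end(), [](const LogFileSketch& a, const LogFileSketch& b) {
        return a.endVersion < b.endVersion;
    });
}
```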
1) Remove endVersion field because it has been included in RestoreAsset;
2) Ensure endVersion in VersionBatch and RestoreAsset is always exclusive;
3) Revise the ASSERT in the loader and applier for the case where the dummy commit version
is endVersion, to avoid a false-positive ASSERT failure.
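
The exclusive-endVersion convention in item 2 amounts to treating every range as half-open, [beginVersion, endVersion). A small sketch, with the hypothetical names `VersionRangeSketch` and `adjacent`:

```cpp
#include <cstdint>

// Hypothetical half-open version range: beginVersion inclusive, endVersion exclusive.
struct VersionRangeSketch {
    int64_t beginVersion;
    int64_t endVersion;

    // A version equal to endVersion is NOT part of the range.
    bool contains(int64_t version) const { return beginVersion <= version && version < endVersion; }

    bool empty() const { return beginVersion >= endVersion; }
};

// Two ranges tile the version space without gap or overlap when the first
// ends exactly where the second begins.
inline bool adjacent(const VersionRangeSketch& a, const VersionRangeSketch& b) {
    return a.endVersion == b.beginVersion;
}
```

With this convention a boundary version belongs to exactly one of two adjacent ranges, so a check that compares a version against endVersion has to use a strict less-than.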
1. Review memory use cases and improve them:
Ensure state variables are initialized and
change unnecessary state variables to ordinary variables.
2. Remove debug code that is no longer useful;
3. Mute verbose debug.
1) Do not keep restore role data (e.g., masterData) in restore worker;
2) Change function parameter list by only passing in the needed variables in role data;
3) Remove the unnecessary files vector from masterData;
4) Fix typos in comments and some function names.
RestoreMaster may not receive all acks for the last command, i.e., finishRestore,
because RestoreLoaders and RestoreAppliers exit immediately after sending the ack.
If the ack is lost, it will not be resent.
This commit also removes some unneeded code.
This commit passes 50k random tests without errors.
1) Use runRYWTransaction for simple DB access
2) Replace some printf calls with TraceEvent
3) Remove printf calls that are not used for debugging
4) Avoid wait inside the condition in loop-choose-when for
the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
the file only contains functionality related to the restore worker.
Passed correctness test
Rewrite the code that collects files for a version batch and that
distributes the workload among loaders for the files in a version batch.
The new code is easier to understand and maintain.
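
As a rough illustration of the two steps, the sketch below collects the files whose version range overlaps a batch and then spreads them over loaders. The round-robin policy, the half-open file ranges, and the names `RestoreFileSketch`, `collectFilesForBatch`, and `assignToLoaders` are all assumptions made for this example; the rewritten code may use a different policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical file descriptor with a half-open version range [beginVersion, endVersion).
struct RestoreFileSketch {
    std::string name;
    int64_t beginVersion;
    int64_t endVersion;
};

// Step 1: keep the files whose range overlaps the batch range [batchBegin, batchEnd).
std::vector<RestoreFileSketch> collectFilesForBatch(const std::vector<RestoreFileSketch>& files,
                                                    int64_t batchBegin,
                                                    int64_t batchEnd) {
    std::vector<RestoreFileSketch> result;
    for (const auto& f : files) {
        if (f.beginVersion < batchEnd && batchBegin < f.endVersion) {
            result.push_back(f);
        }
    }
    return result;
}

// Step 2: assign the batch's files to loaders round-robin (assumed policy).
// Returns one list of file names per loader.
std::vector<std::vector<std::string>> assignToLoaders(const std::vector<RestoreFileSketch>& batchFiles,
                                                      std::size_t numLoaders) {
    std::vector<std::vector<std::string>> perLoader(numLoaders);
    if (numLoaders == 0) {
        return perLoader; // nothing to assign to
    }
    for (std::size_t i = 0; i < batchFiles.size(); ++i) {
        perLoader[i % numLoaders].push_back(batchFiles[i].name);
    }
    return perLoader;
}
```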