The same mutation can be present in overlapping mutation logs. Thus we cannot
assert its absence. This can be caused for multiple reasons. One possibility
is that new TLogs can copy mutations from old generation TLogs; another one
is backup worker is recruited without knowning previously saved progress.
Because each log version contains commit version and subsequence number, each
key can only have one mutation for its log version. This simplifies
StagingKey::add() a lot.
For some reason I am not sure why, there can be duplicated mutations added to
StagingKey, which needs to be filtered out. Otherwise, atomic operations can
result in corrupted data in database.
For partitioned logs, mutations of the same version may be sent to applier
out-of-order. If one loader advances to the next version, an applier may
receive later version mutations for different loaders. So, dropping of early
mutations is wrong.
The subsequence number is needed so that mutations of the same commit version
number, but from different partitioned logs can be correctly reassembled in
order.
For old backup files, the sub number is always 0. For partitioned mutation
logs, the actual sub number is used. For range files, the sub number is always
0.
Because retore roles run as workload in simulation,
they do not know when DB is destroyed by the backup and restore test workload.
So if DB is destroyed earlier than restore master unlocks DB, which is rare,
restore master should abort the unlocking DB step.
An actor is schedulable to run if the current worker has enough resourc, i.e.,
the worker's memory usage is below the threshold;
Exception: If the actor is working on the current version batch, we have to schedule
the actor to run to avoid dead-lock.
Future: When we release the actors that are blocked by memory usage, we should release them
in increasing order of their version batch.
Precompute mutations received by an applier;
Only apply the final result to the destination DB;
Execute multiple txns in parallel to apply final results to the destination DB.
atomicOp has an amplified performance overhead to the cluster,
for example, an ADD operation can be small, but SS has to load
the value to do the operation and the value can be large.
1. Review memory use cases and improve:
Ensure state varialble is initialized and
change unnecessary state variable to variable.
2. Remove debug code that is no longer useful;
3. Mute verbose debug.