If the data at a single version is larger than FASTRESTORE_VERSIONBATCH_MAX_BYTES,
a version batch should still include that version and ignore the
FASTRESTORE_VERSIONBATCH_MAX_BYTES limit, to avoid false positives in simulation.
In a real environment, this situation reports a SevError asking the DBA to
increase the memory limit for a version batch.
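
The batching rule above can be sketched as follows. This is only an
illustration, not the actual restore code: `kMaxBatchBytes`, `VersionBatch`
and `buildBatches` are hypothetical names, and the limit value is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the FASTRESTORE_VERSIONBATCH_MAX_BYTES knob.
constexpr int64_t kMaxBatchBytes = 100;

struct VersionBatch {
    std::vector<int64_t> versions;
    int64_t bytes = 0;
};

// Groups (version, bytes) pairs, already sorted by version, into batches.
std::vector<VersionBatch> buildBatches(const std::vector<std::pair<int64_t, int64_t>>& sizes) {
    std::vector<VersionBatch> batches;
    VersionBatch cur;
    for (const auto& entry : sizes) {
        const int64_t version = entry.first;
        const int64_t bytes = entry.second;
        // Close the current batch only if it already holds a version. An empty
        // batch always accepts the next version, even one above the limit.
        if (!cur.versions.empty() && cur.bytes + bytes > kMaxBatchBytes) {
            batches.push_back(cur);
            cur = VersionBatch{};
        }
        cur.versions.push_back(version);
        cur.bytes += bytes;
    }
    if (!cur.versions.empty()) batches.push_back(cur);
    return batches;
}
```

With this rule an oversized version ends up alone in its own batch instead of
blocking batch construction.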

The master should not start asking appliers to apply mutations at batchIndex
until all appliers have applied mutations at (batchIndex - 1).
Otherwise, mutations may not be applied in increasing order of versions,
because appliers at different batch indexes can have overlapping key ranges.
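
A minimal sketch of the ordering check described above. The function name and
the way per-applier progress is represented are assumptions made for
illustration, not the real master/applier interface.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// lastAppliedBatch[i] is the highest batch index applier i has fully applied
// (-1 if it has applied none). Name and representation are hypothetical.
bool canStartApplying(int batchIndex, const std::vector<int>& lastAppliedBatch) {
    // Every applier must have finished batchIndex - 1 before anyone starts
    // batchIndex, so versions are applied in increasing order.
    return std::all_of(lastAppliedBatch.begin(), lastAppliedBatch.end(),
                       [batchIndex](int applied) { return applied >= batchIndex - 1; });
}
```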

The `gettid()` function has been part of glibc since 2.30[1]. I decided to keep
the `gettid` implementation here under a different name to remain compatible
with older glibc versions.
[1] https://sourceware.org/ml/libc-alpha/2019-08/msg00029.html
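
For illustration, a renamed wrapper of this kind might look like the sketch
below. It is Linux-only, and the name `compatGetTid` is hypothetical; the
actual name used in the code is not stated here.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical name, chosen so it cannot clash with glibc's own gettid()
// on glibc >= 2.30 while still working on older versions.
inline pid_t compatGetTid() {
    // Invoke the raw system call directly instead of relying on a libc wrapper.
    return static_cast<pid_t>(syscall(SYS_gettid));
}
```

In the main thread of a process the returned thread ID equals `getpid()`.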

The storage queue is no longer going to cause test failures. The backup
worker life cycle is now tied to the backup, so the consistency check only
happens after the backup workload is done. Thus, we no longer need to save
backup progress while the consistency check is running.

The backup worker needs to use minKnownCommittedVersion for popping when
running in NOOP mode. An option is added to GetReadVersionRequest, and proxies
send back minKnownCommittedVersion if the option is set.
Also add a couple of knobs for backup workers.

The backup worker just blindly pops tags if the "backupStartedKey" is not set.
Note that the commit version from the TLog cannot be used as the pop version,
because in a single-region configuration the log router tags are used to
recover mutations during a recovery. The backup worker could pop mutations
that are needed for recovery, causing consistency errors. So the solution for
now is to use commit version - 5,000,000, which is a version guaranteed to be
persisted on all replicas.
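
The pop version computation can be sketched as below. The constant and
function names are hypothetical, and clamping at zero is an assumption added
so the sketch is well defined for small commit versions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// The 5,000,000-version lag mentioned above; the constant name is hypothetical.
constexpr int64_t kPopLagVersions = 5000000;

// Returns a pop version that trails the commit version by the lag.
// Clamping at 0 is an assumption of this sketch.
int64_t safePopVersion(int64_t commitVersion) {
    return std::max<int64_t>(0, commitVersion - kPopLagVersions);
}
```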

This bug was introduced when I added log router tags unconditionally to all
configurations. In newEpoch(), the wait for remote recovery is conditioned on
"logRouterTags == 0", which is now always false. Thus remote recovery was not
performed and remote TLogs did not copy data from the previous epoch's TLogs
(the previous epoch is a single-region configuration). As a result, storage
servers could not peek/get the data and did not pop tags. Thus,
waitForFullReplication() got stuck and the test eventually timed out.

When a value (i.e., mutations for a version) is large, it is split into
multiple key-value pairs. This was not handled previously, and fixing it also
consolidates the interface of DecodeProgress.

Each mutation is prefixed with its version, sub-version, and size in big-endian
representation. This is required, especially for the leading version field,
because we use 0xFF for padding. A little-endian version number can easily
collide with 0xFF, while big endian is guaranteed to have 0x00 as the first
byte.
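
The difference between the two byte orders can be seen in a small sketch. The
helper names are hypothetical and assume a 64-bit version number; the first
byte of the big-endian form is 0x00 for any version below 2^56.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Serializes v with the most significant byte first.
std::array<uint8_t, 8> toBigEndian64(uint64_t v) {
    std::array<uint8_t, 8> out{};
    for (int i = 0; i < 8; ++i) out[i] = static_cast<uint8_t>(v >> (8 * (7 - i)));
    return out;
}

// Serializes v with the least significant byte first, for comparison.
std::array<uint8_t, 8> toLittleEndian64(uint64_t v) {
    std::array<uint8_t, 8> out{};
    for (int i = 0; i < 8; ++i) out[i] = static_cast<uint8_t>(v >> (8 * i));
    return out;
}
```

For example, version 255 starts with 0xFF in little endian, which is
indistinguishable from padding, but starts with 0x00 in big endian.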