- ignorePopDeadline to have a higher limit in the simulator
to accommodate the buggify delays and let the snapshot succeed.
- introduce a new knob for auto resetting the disabling of tlog pop
tLogCommit expects no blocking between the duplicate check and
the setting of the new version; that constraint was broken
when the synchronous execProcessingHelper was introduced.
As a fix, execProcessingHelper was made asynchronous.
- wait for snapTLogFailKeys in a loop; otherwise a race
condition can trigger a false assert
- in single region, there does not seem to be a guarantee of
tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
address is not sufficient in some cases, so look up by both
primary and secondary address
- test make files to reflect the location of the new test cases
In tlogs, pops are disabled while taking snapshots. Earlier, tlogs
ignored any pop requests that arrived while pops were disabled.
With this change, instead of ignoring the pop, the tlog remembers
the list of pops in memory and replays them once popping is
enabled again.
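
A minimal sketch of the deferred-pop behavior described above, in Python
for brevity. The class and method names are hypothetical and are not taken
from the tlog code; it only illustrates "remember while disabled, replay on
enable".

```python
class DeferredPopTLog:
    """Hypothetical model of a tlog that defers pops while popping is disabled."""

    def __init__(self):
        self.pop_disabled = False
        self.popped = {}    # tag -> highest version popped so far
        self.deferred = []  # (tag, version) pop requests received while disabled

    def disable_pop(self):
        self.pop_disabled = True

    def pop(self, tag, version):
        if self.pop_disabled:
            # Remember the request instead of ignoring it.
            self.deferred.append((tag, version))
            return
        self._apply(tag, version)

    def enable_pop(self):
        self.pop_disabled = False
        pending, self.deferred = self.deferred, []
        # Replay the remembered pops in arrival order.
        for tag, version in pending:
            self._apply(tag, version)

    def _apply(self, tag, version):
        # A pop never moves the popped version backwards.
        self.popped[tag] = max(version, self.popped.get(tag, 0))
```

A pop received between disable_pop() and enable_pop() has no effect on
`popped` until enable_pop() runs.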
- exec operation to go to all the TLogs
- minor bug fix in tlog
- restore implementation for the simulator
- restore snap UID to be stored in restartInfo.ini
- test cases added
- indentation and trace file fixes
fix: we could incorrectly make data durable if eraseMessagesFromMemory was in progress while running updatePersistentData
the quiet database check now ensures that tlogs have no more than 30 seconds of versions unpopped from the disk queue
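
A rough sketch of what such a check could look like. The function and
parameter names are hypothetical, and the conversion rate of one million
versions per second is an assumption made for this example rather than
something stated in this message.

```python
VERSIONS_PER_SECOND = 1_000_000  # assumption for this sketch
MAX_UNPOPPED_SECONDS = 30        # limit named in the message above


def tlog_is_quiet(newest_version, oldest_unpopped_version):
    """Return True if the unpopped range in the disk queue is within the limit."""
    unpopped_versions = newest_version - oldest_unpopped_version
    return unpopped_versions <= MAX_UNPOPPED_SECONDS * VERSIONS_PER_SECOND
```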
If a peek is entirely fulfilled from spilled data, then it's likely that
the next peek will be also. It is thus wasteful for each of these peeks
to call peekMessagesFromMemory, which memcpy's excessively, and then
throw all that data away without using it.
Now, TLogs will give a hint back to peek cursors about whether the provided
reply was served entirely from the spilled data, which peek cursors then
feed back as the hint into their next request.
At some point, a cursor will send a request for only spilled data, get
an incomplete response, and then be told to send its next request as one
that peeks from memory as well, and then it will fully catch up.
This deprioritizes before calling peekMessagesFromMemory, which should
improve the memory usage of the TLog, and makes sure to keep txsTag
peeks at a high priority to help recoveries stay fast.
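
The hint feedback loop can be sketched roughly as follows. This is a
simplified Python model with hypothetical names (SketchTLog, drain,
only_spilled); messages are reduced to bare version numbers and a reply
holds at most `limit` of them. It does not model priorities or txsTag.

```python
class SketchTLog:
    """Hypothetical TLog holding spilled (on-disk) and in-memory versions."""

    def __init__(self, spilled, in_memory):
        self.spilled = spilled      # sorted versions already spilled to disk
        self.in_memory = in_memory  # sorted versions still held in memory
        self.memory_reads = 0       # how often the in-memory path was taken

    def peek(self, begin, only_spilled, limit=2):
        """Return (versions, hint); hint is True if spilled data filled the reply."""
        data = [v for v in self.spilled if v >= begin][:limit]
        if len(data) == limit:
            return data, True
        if only_spilled:
            # Incomplete reply: tell the cursor to peek from memory next time.
            return data, False
        self.memory_reads += 1
        data += [v for v in self.in_memory if v >= begin][: limit - len(data)]
        return data, False


def drain(tlog, begin=0):
    """Cursor loop that feeds each reply's hint into the next request."""
    only_spilled = False
    seen = []
    while True:
        data, only_spilled = tlog.peek(begin, only_spilled)
        seen.extend(data)
        if data:
            begin = data[-1] + 1
        elif not only_spilled:
            return seen
```

In this model the in-memory path is only taken once the cursor has been
told that spilled data alone no longer fills a reply.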
This changes the logic of pop operations from log routers (LG):
- LG pops tagLocalityLogRouterMapped from TLogs;
- TLog converts tagLocalityLogRouterMapped back to tagLocalityLogRouter before
popping.
Later, when we add more pseudo localities, the same pattern can be used.
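
A small sketch of the pseudo-to-real conversion pattern. The Tag tuple,
the numeric locality values, and the to_real_tag helper are all
hypothetical and chosen only for illustration; adding another pseudo
locality would mean adding one more entry to the mapping.

```python
from collections import namedtuple

Tag = namedtuple("Tag", ["locality", "id"])

# Placeholder values for this sketch, not the real constants.
TAG_LOCALITY_LOG_ROUTER = -2
TAG_LOCALITY_LOG_ROUTER_MAPPED = -6

# Pseudo locality -> real locality it stands for.
PSEUDO_TO_REAL = {TAG_LOCALITY_LOG_ROUTER_MAPPED: TAG_LOCALITY_LOG_ROUTER}


def to_real_tag(tag):
    """Convert a pseudo-locality tag back to its real tag before popping."""
    real_locality = PSEUDO_TO_REAL.get(tag.locality, tag.locality)
    return Tag(real_locality, tag.id)
```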
TLogServer was forked into OldTLogServer_6_0 at the same time that
3247d594 modified TLogServer, so the modification never made it into
OldTLogServer_6_0, resulting in a rare failure.
Manual code inspection revealed that there was also
78976161 that concurrently modified TLogServer, so that change was
copied to OldTLogServer_6_0 as well.