This fixes an issue introduced in the previous patch, where pop would
immediately set `poppedLocationNeedsUpdate`, but setting the popped
version was now delayed. This means that we could:
1. Run the spill loop and persist all popped versions
2. Receive a pop, and set the poppedLocationNeedsUpdate flag
3. Run the dq-pop loop, and clear the poppedLocationNeedsUpdate flag
At that point, when the persistentPopped version is updated again, the
flag is no longer set, so dq-pop doesn't know that it needs to scan the
spilled data again for the minLocation.
We could more carefully update the flag, but instead, I've just
converted it into a version that's kept in sync purely in the dq-pop
loop, to remove shared state between pop and the dq-pop loop.
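A minimal sketch of the difference, written in Python rather than the
TLog's own code. The class, method, and field names are hypothetical;
only the ordering of events is taken from the description above. A flag
set by pop and cleared by the dq-pop loop can be cleared before the
delayed persisted version catches up, whereas a version owned by the
dq-pop loop is compared against the persisted version on every pass.

```python
class FlagBased:
    """Hypothetical model: pop sets a flag, the dq-pop loop clears it."""

    def __init__(self):
        self.popped = 0              # most recent in-memory popped version
        self.persistent_popped = 0   # version persisted by the spill loop
        self.needs_update = False    # state shared between pop and dq-pop

    def pop(self, version):
        self.popped = version
        self.needs_update = True

    def spill_loop(self):
        self.persistent_popped = self.popped

    def dq_pop_loop(self):
        """Return True if this pass would rescan the spilled data."""
        rescan = self.needs_update
        self.needs_update = False
        return rescan


class VersionBased:
    """Hypothetical model: the dq-pop loop tracks the version it last saw."""

    def __init__(self):
        self.popped = 0
        self.persistent_popped = 0
        self.dq_pop_seen = 0         # only ever touched by the dq-pop loop

    def pop(self, version):
        self.popped = version

    def spill_loop(self):
        self.persistent_popped = self.popped

    def dq_pop_loop(self):
        rescan = self.persistent_popped > self.dq_pop_seen
        self.dq_pop_seen = self.persistent_popped
        return rescan


def run(model):
    model.spill_loop()             # 1. persist all popped versions
    model.pop(10)                  # 2. receive a pop
    first = model.dq_pop_loop()    # 3. dq-pop loop runs
    model.spill_loop()             # persisted popped version updated again
    second = model.dq_pop_loop()   # does dq-pop notice the update?
    return first, second
```

In this model the flag-based variant rescans on the first pass, before
the new popped version has been persisted, and then misses the pass
that matters. The version-based variant rescans exactly when the
persisted version has moved.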
This commit fixes a bug where popping a tag between
updatePersistentData and popDiskQueue can cause the TLog to recover to
an incorrect understanding of what data it has available.
The following series of events needs to happen to trigger this bug:
Tag 1:1 is popped to version 10
updatePersistentData is run...
updatePersistentPopped runs and persistentData stores 1:1 as popped to 10
A mutation is spilled for 1:1 at version 11 at location 1000
A mutation is spilled for 1:1 at version 21 at location 5000
updatePersistentData finishes and commits the btree changes
Tag 1:1 is popped to version 20
popDiskQueue runs
The btree is read for spilled mutations with version >=20
The minimum location required for the disk queue is found to be location 5000
The disk queue is popped to location 5000
The TLog crashes
The worker restarts, and reloads the TLog files from disk
restorePersistentPopped restores tag 1:1 as having been popped to version 10
Parallel peeks are received for tag 1:1 starting at version 0
The first peek starts below the popped version, so we respond with no data, and an end version of 10
The second peek starts at version 10, which is greater than the popped version
The btree is read for spilled mutations, and we find that there is a mutation at version 11 at location 1000
Location 1000 is read in the DiskQueue
The resulting page read at Location 1000 was popped pre-crash, and thus
might either (a) be corrupt or (b) have an incorrect sequence number.
The fix is to force popDiskQueue/updatePoppedLocation to use the
popped version that was persisted to disk, and not the most recently
popped version for the given tag.
This bug doesn't manifest in simulation, because we don't have any code
that peeks at a lower version than what has been popped.
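The location calculation in the scenario above can be sketched as
follows. This is an illustration in Python, not the TLog code:
`min_required_location` and the variable names are hypothetical, and
the versions and locations are the ones from the event list.

```python
def min_required_location(spilled, popped_version):
    """Return the smallest disk queue location still needed for a tag.

    spilled: list of (version, disk_queue_location) pairs for the tag.
    Entries below popped_version are treated as no longer needed.
    Returns None when nothing is needed.
    """
    needed = [loc for version, loc in spilled if version >= popped_version]
    return min(needed) if needed else None


# Spilled mutations for tag 1:1, taken from the event list above.
spilled_1_1 = [(11, 1000), (21, 5000)]

in_memory_popped = 20   # most recent pop, not yet persisted
persisted_popped = 10   # what restorePersistentPopped sees after a crash

# Using the in-memory version discards location 1000 too early.
buggy = min_required_location(spilled_1_1, in_memory_popped)

# Using the persisted version keeps everything a recovered TLog may read.
fixed = min_required_location(spilled_1_1, persisted_popped)
```

Here `buggy` is 5000 and `fixed` is 1000, so with the persisted version
the disk queue is not popped past a location that a post-recovery peek
could still be directed to.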
This guards against the case where:
1. Peeks with sequence numbers 0-39 are submitted
2. A 15min pause happens, in which timeout removes the peek tracker data
3. Peeks with sequence numbers 40-59 are submitted, with the same peekId
The second round of peeks would no longer have the tracker data that
allows peek 40 to start running immediately, and thus would hang for
10min until it gets cleaned up.
Also, guard against overflowing the sequence number.
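One way such a guard could look, as a rough Python sketch. Everything
here is an assumption made for illustration: the `PeekTracker` class,
the `PeekRejected` error, the `MAX_SEQUENCE` bound, and the choice to
reject the peek rather than handle it some other way are not taken
from the TLog code.

```python
MAX_SEQUENCE = 2**31 - 1  # assumed bound; the real limit is not stated above


class PeekRejected(Exception):
    """Raised when a peek cannot be ordered and should be restarted."""


class PeekTracker:
    """Hypothetical tracker of the next sequence number per peekId."""

    def __init__(self):
        self.next_sequence = {}  # peek_id -> next sequence allowed to run

    def expire(self, peek_id):
        # Models the timeout removing the tracker data for a peekId.
        self.next_sequence.pop(peek_id, None)

    def admit(self, peek_id, sequence):
        """Return True if the peek may run now, False if it must wait."""
        if sequence < 0 or sequence >= MAX_SEQUENCE:
            raise PeekRejected("sequence number out of range")
        if peek_id not in self.next_sequence:
            if sequence != 0:
                # Nothing is left that could ever unblock this peek, so
                # fail fast instead of letting it hang until cleanup.
                raise PeekRejected("tracker data expired")
            self.next_sequence[peek_id] = 0
        return sequence == self.next_sequence[peek_id]

    def complete(self, peek_id, sequence):
        self.next_sequence[peek_id] = sequence + 1
```

With this shape, peeks 0-39 run in order, and a peek with sequence 40
arriving after the tracker data was removed is rejected immediately
rather than waiting on state that no longer exists.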
When switching between spill_type or log_version, a new instance of a
SharedTLog is created in the transaction log processes. If this is done
in a saturated database, then doubling the amount of memory used to hold
mutations can push TLogs uncomfortably close to the 8GB OOM limit.
Instead, we now thread through which SharedTLog UID is active, and the
other TLogs spill out the majority of their mutations.
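A rough Python sketch of the idea, under stated assumptions: the
`SharedTLog` class here is a toy model, and the `normal_limit` and
`inactive_limit` parameters and their values are hypothetical. The text
above only says that the non-active instances spill most of their
mutations.

```python
class SharedTLog:
    """Toy model of one SharedTLog instance's in-memory mutation bytes."""

    def __init__(self, uid, bytes_in_memory, normal_limit, inactive_limit):
        self.uid = uid
        self.bytes_in_memory = bytes_in_memory
        self.normal_limit = normal_limit      # target while active
        self.inactive_limit = inactive_limit  # much lower target otherwise

    def memory_target(self, active_uid):
        if self.uid == active_uid:
            return self.normal_limit
        return self.inactive_limit

    def spill(self, active_uid):
        """Spill down to the target for this instance; return bytes spilled."""
        target = self.memory_target(active_uid)
        spilled = max(0, self.bytes_in_memory - target)
        self.bytes_in_memory -= spilled
        return spilled


# Two instances on one process after a spill_type or log_version switch.
old = SharedTLog("old", bytes_in_memory=1000, normal_limit=1500, inactive_limit=100)
new = SharedTLog("new", bytes_in_memory=1200, normal_limit=1500, inactive_limit=100)

spilled_old = old.spill(active_uid="new")
spilled_new = new.spill(active_uid="new")
```

In this toy run the inactive instance spills 900 bytes and keeps 100,
while the active instance spills nothing, so the combined in-memory
footprint stays near that of a single SharedTLog.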
This is a backport of #2213 (fef89aa1) to release-6.2
...and test it in simulation, but not combined yet.
It turns out that because of txsTag, we basically had to support
spill-by-value anyway. Thus, if we treat all tags like txsTag when
spilling and peeking, then we have an easy way to bring the two spilling
types back into one implementation.
The spilling type is now pulled out of the request, and then stored on
LogData for later access, and persisted in the tlog metadata per tlog
generation.
It turns out that serializing types as Unversioned is a bit wonky.