* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.
Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.
This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.
To simplify the problem, if we have two concurrent transaction loops:
retry {
if (rand() > .5) tr->set('x', rand());
if (rand() > .5) tr->set('y', rand());
}
and
retry {
x = tr->get('x')
y = tr->get('y')
if (x > y) {
tr->set('y', x)
}
tr->commit();
}
Is not guaranteed that x > y in the database after the second transaction
commits. This is because it could read an older snapshot of x and y, in which
x was greater than y, and thus not invoke set. This means that `tr` is now a
read-only transaction, which no-ops out of committing as an "optimization". If
we add any write conflict range to `tr`, it then will conflict checked and
committed, which would guarantee that x>y when it commits.
Replace the first transaction with dumpData, and the second with version
upgrade transaction, and you have the bug that we're fixing, why, and how.
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.
To simplify the problem, if we have two concurrent transaction loops:
retry {
if (rand() > .5) tr->set('x', rand());
if (rand() > .5) tr->set('y', rand());
}
and
retry {
x = tr->get('x')
y = tr->get('y')
if (x > y) {
tr->set('y', x)
}
tr->commit();
}
Is not guaranteed that x > y in the database after the second transaction
commits. This is because it could read an older snapshot of x and y, in which
x was greater than y, and thus not invoke set. This means that `tr` is now a
read-only transaction, which no-ops out of committing as an "optimization". If
we add any write conflict range to `tr`, it then will conflict checked and
committed, which would guarantee that x>y when it commits.
Replace the first transaction with dumpData, and the second with version
upgrade transaction, and you have the bug that we're fixing, why, and how.
This cleans up a bit of the VersionStamp DR work I did, and leaves hints and
advice for anyone who will be touching mutation applying code in the future.
Simulation identified the fact that we can violate the
VersionStamps-are-always-increasing promise via the following series of events:
1. On proxy 0, dumpData adds commit requests to proxy 0's commit promise stream
2. To any proxy, a client submits the first transaction of abortBackup, which stops further dumpData calls on proxy 0.
3. To any proxy that is not proxy 0, submit a transaction that checks if it needs to upgrade the destination version.
4. The transaction from (3) is committed
5. Transactions from (1) are committed
This is possible because the dumpData transactions have no read conflict
ranges, and thus it's impossible to make them abort due to "conflicting"
transactions. There's also no promise that if client C sends a commit to proxy
A, and later a client D sends a commit to proxy B, that B must log its commit
after A. (We only promise that if C is told it was committed before D is told
it was committed, then A committed before B.)
There was a failed attempt to fix this problem. We tried to add read conflict
ranges to dumpData transactions so that they could be aborted by "conflicting"
transactions. However, this failed because this now means that dumpData
transactions require conflict resolution, and the stale read version that they
use can cause them to be aborted with a transaction_too_old error.
(Transactions that don't have read conflict ranges will never return
transaction_too_old, because with no reads, the read snapshot version is
effectively meaningless.) This was never previously possible, so the existing
code doesn't retry commits, and to make things more complicated, the dumpData
commits must be applied in order. This would require either adding
dependencies to transactions (if A is going to commit then B must also be/have
committed), which would be complicated, or submitting transactions with a fixed
read version, and replaying the failed commits with a higher read version once
we get a transaction_too_old error, which would unacceptably slow down the
maximum throughput of dumpData.
Thus, we've instead elected to add a special transaction option that bypasses
proxy load balancing for commits, and always commits against proxy 0. We can
know for certain that after the transaction from (2) is committed, all of the
dumpData transactions that will be committed have been added to the commit
promise stream on proxy 0. Thus, if we enqueue another transaction against
proxy 0, we can know that it will be placed into the promise stream after all
of the dumpData transactions, thus providing the semantics that we require: no
dumpData transaction can commit after the destination version upgrade
transaction.
This fixes the occasional VersionStampBackupToDB failures, that were caused by
the version upgrade comarision happening before dumpData invocations were
stopped. Committing the first transaction stops dumpData, and thus we can then
do the primary vs secondary version check correctly.