* clean up audit progress metadata; tester issues audit requests directly to DD instead of CC
* address comments; fix test case where an audit request is issued to DD but DD is not present
Description
'Type' is a reserved keyword for TraceEvent. RESTKmsConnector
was using 'Type' to log CipherDetail or BlobMetadata type information,
which caused the 'fdbserver' binary to crash/restart.
The patch changes that usage so that TraceEvent compatibility is
maintained. It also makes the following test improvements:
1. Fix existing unit-test that 'should' have caught this issue.
2. Add more test coverage.
3. Add CODE_PROBE to ensure relevant corner cases are validated.
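The failure mode described above can be illustrated with a minimal sketch. MiniTraceEvent is a hypothetical stand-in, not the real TraceEvent class; the replacement key "CipherType" used in the usage note is also only an assumption, since the text does not say what the field was renamed to. Only 'Type' is listed as reserved because that is the only key the description confirms.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a trace event builder (NOT the real TraceEvent).
class MiniTraceEvent {
public:
    // The event name itself is stored under the "Type" field.
    explicit MiniTraceEvent(std::string type) { fields_["Type"] = std::move(type); }

    // Adds a key/value detail; rejects keys that the event already owns.
    MiniTraceEvent& detail(const std::string& key, const std::string& value) {
        if (reservedKeys().count(key) != 0) {
            throw std::invalid_argument("reserved trace field: " + key);
        }
        fields_[key] = value;
        return *this;
    }

    bool has(const std::string& key) const { return fields_.count(key) != 0; }

    const std::string& get(const std::string& key) const {
        assert(has(key)); // callers must only read fields that were set
        return fields_.at(key);
    }

private:
    static const std::set<std::string>& reservedKeys() {
        // Only "Type" is confirmed as reserved by the description above.
        static const std::set<std::string> keys = { "Type" };
        return keys;
    }

    std::map<std::string, std::string> fields_;
};

// Returns true if logging a detail under `key` would be rejected.
inline bool detailIsRejected(const std::string& key) {
    try {
        MiniTraceEvent("RESTKmsExample").detail(key, "cipher");
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}
```

In this sketch detailIsRejected("Type") is true, while a distinct key such as the hypothetical "CipherType" is accepted. A unit test that exercises the logging path with real field names is the kind of check that would surface the collision before it reaches a running server.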
Testing
RESTKmsConnectorUnit
prepare transactions
add DD Restore Preparing state; actor accepts blob migrator requests
Refactor DDEnabledState and PrepareBlobRestoreReply
Improve the initialization, remove unused DDSharedContext method
move prepareBlobRestore to moveKeys
Make context->lock and DataDistributor.lock share the reference; change checkMoveKeysLock to checkPersistenMoveKeysLock; Add more debug trace
fix requestId assignment bug
Add DDEnabledState::sameId method
Throw movekeys_conflict after hybrid restore preparation to force a reload
of the server list and shard mapping; format code; remove unused
methods, definitions and comments
fix rebase conflicts
Make DD only load initial Data Distribution after enabled
use dd_config_changed error and throw it in serveBlobMigratorRequests
move empty range check before blob restore to the transaction lock database
Rename BlobRestore*
We encountered a situation in simulation where the disk queue was in the following state:
+------------+------------+
|   page 1   |   page 2   |
+------------+------------+
|rec |.......|rec |.......|
+------------+------------+
0..85        4096..4181
^    ^            ^
|    |            pushed
|    committed
popped
and we attempted to pop up to 4096, i.e. everything before page 2. This triggered
one of the assertions in the disk queue code which was meant to catch tlog logic
bugs where we pop too much.
The issue, though, is the accounting of the commit location in the disk queue.
While we only pushed records through position 85, we committed the entire page.
Attempts to pop everything before page 2 should have succeeded since we're not
attempting to pop any uncommitted data.
The solution is to fix the commit location accounting in the disk queue to round
up to the next page, to reflect the reality that we only commit entire pages.
This bug was discovered in the first place by introducing a delay into the commit
queue loop during simulation testing. That delay is included in this change.
We also noticed that getNextCommitLocation() was incorrect. Since there are no
users of that function, we've removed it entirely.
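The commit-location fix described above can be sketched as follows. This is a minimal illustration, not the disk queue's actual code: QueueAccounting, commitThrough, and canPopTo are hypothetical names, and the 4096-byte page size is inferred from the offsets in the example.

```cpp
#include <cassert>
#include <cstdint>

// Page size implied by the offsets in the example (page 2 starts at 4096).
constexpr int64_t kPageSize = 4096;

// Rounds a byte offset up to a page boundary. An offset that already
// sits on a boundary is returned unchanged.
inline int64_t roundUpToPage(int64_t offset, int64_t pageSize = kPageSize) {
    assert(pageSize > 0 && offset >= 0);
    return ((offset + pageSize - 1) / pageSize) * pageSize;
}

// Hypothetical bookkeeping for the popped and committed locations.
struct QueueAccounting {
    int64_t popped = 0;
    int64_t committed = 0;

    // Records a commit of all data pushed through `pushedEnd`. Because
    // whole pages are written, the committed location is page-aligned.
    void commitThrough(int64_t pushedEnd) { committed = roundUpToPage(pushedEnd); }

    // A pop is legal only if it does not move backwards and does not
    // reach into uncommitted data.
    bool canPopTo(int64_t location) const {
        return location >= popped && location <= committed;
    }
};
```

With records pushed through position 85, commitThrough(85) sets the committed location to 4096, so canPopTo(4096) holds and the pop of everything before page 2 no longer trips the check. Without the rounding, the committed location would stay at 85 and the same pop would be rejected.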
This PR adds auto_tenant_assignment option to register/configure data clusters.
Setting auto_tenant_assignment to disabled means the data cluster is a dedicated one and won't be
used for auto tenant assignment. This option is enabled by default (allowing auto tenant assignment).
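The effect of the option can be sketched as a filter applied when a cluster is chosen for a new tenant. Everything below is a hypothetical illustration: the DataCluster struct, the availableTenantSlots field, and the "most free capacity" selection policy are assumptions, not the metacluster implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class AutoTenantAssignment { Enabled, Disabled };

// Hypothetical view of a registered data cluster.
struct DataCluster {
    std::string name;
    int availableTenantSlots = 0;
    // Enabled by default, matching the description above.
    AutoTenantAssignment autoTenantAssignment = AutoTenantAssignment::Enabled;
};

// Returns the eligible cluster with the most free slots, or nullptr if
// none qualifies. Clusters with auto assignment disabled are skipped,
// so they only receive tenants that are assigned to them explicitly.
inline const DataCluster* pickClusterForNewTenant(const std::vector<DataCluster>& clusters) {
    const DataCluster* best = nullptr;
    for (const DataCluster& c : clusters) {
        assert(c.availableTenantSlots >= 0); // capacity is never negative in this sketch
        if (c.autoTenantAssignment == AutoTenantAssignment::Disabled) continue;
        if (c.availableTenantSlots == 0) continue;
        if (best == nullptr || c.availableTenantSlots > best->availableTenantSlots) {
            best = &c;
        }
    }
    return best;
}
```

In this sketch a dedicated cluster with plenty of free capacity is still passed over in favor of a smaller cluster that allows auto assignment, which is the behavior the option is meant to provide.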
Test plan:
simulation tests and metacluster_fdbcli_tests.py
---------
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>