* acs framework
* code refactor and fix bugs
* add ss crash loop protector
* use sharedptr instead of raw pointer
* fix critical bugs and add private mutation acs to the framework
* enable ACS for all mutations except for clear serverTag mutation and fix bugs
* fix restarting tests
* refactor code and fix bugs
* fix AccumulativeChecksumState toString
* fix bugs
* allow all mutations in acs and fixed bugs
* fix bugs and code cleanup
* code clean up for adding recovery support
* simplify code and support recovery
* clear acs state at ss
* fix bug
* terminate validator if ss will be removed in the current batch
* simplify code
* add trace
* address comments
* optimize code
* deep copy when adding mutation to acs validator
* wrap encode and decode of persisted acs key
* make acstable private
* remove useless func
* remove useless func
* remove epoch in ACS validator
* add acs mutation counter in SS metrics
* code cleanup and make knob check better
* make mutation buffer global
* simplify code
* add comments
* make knob randomly set
* address comments
* ss reboot after acs mismatch found
Commit proxy needs to fetch additional cipher keys post-resolution, since tenant ids for raw access requests and cross-tenant clear ranges are calculated after resolution.
* Allow multiple keyranges in CheckpointRequest.
Include DataMove ID in CheckpointMetaData.
* Use UID dataMoveId instead of Optional<UID>.
* Implemented ShardedRocks::checkpoint().
* Implementing createCheckpoint().
* Attempted to change getCheckpointMetaData*() for a single keyrange.
* Added getCheckpointMetaDataForRange.
* Minor fixes for NativeAPI.actor.cpp.
* Replace UID CheckpointMetaData::ssId with std::vector<UID>
CheckpointMetaData::src;
* Implemented getCheckpointMetaData() and completed checkpoint creation
and fetch in test.
* Refactoring CheckpointRequest and CheckpointMetaData
rename `dataMoveId` as `actionId` and make it Optional.
* Fixed ctor of CheckpointMetaData.
* Implemented ShardedRocksDB::restore().
* Tested checkpoint restore, and added range check for restore, so that
the target ranges can be a subset of the checkpoint ranges.
* Added test to partially restore a checkpoint.
* Refactor: added checkpointRestore().
* Sort ranges for comparison.
* Cleanups.
* Check restore ranges are empty; Add ranges in main thread.
* Resolved comments.
* Fixed GetCheckpointMetaData range check issue.
* Fixed error description.
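
The restore range check described above (target ranges may be a subset of the checkpoint ranges) can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this change: `Range` and `coveredBy` are hypothetical names, keys are plain `std::string`s rather than FDB key types, and the sketch assumes each target range must fit inside a single checkpoint range (it does not coalesce adjacent checkpoint ranges).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a half-open key range [begin, end).
struct Range {
    std::string begin;
    std::string end;
};

// Returns true if every target range lies entirely inside some single
// checkpoint range. Assumption: adjacent checkpoint ranges are not merged.
bool coveredBy(const std::vector<Range>& targets, const std::vector<Range>& checkpoint) {
    return std::all_of(targets.begin(), targets.end(), [&](const Range& t) {
        assert(t.begin < t.end); // a restore range must be non-empty
        return std::any_of(checkpoint.begin(), checkpoint.end(), [&](const Range& c) {
            return c.begin <= t.begin && t.end <= c.end;
        });
    });
}
```

A restore request would be rejected when `coveredBy(restoreRanges, checkpointRanges)` is false, for example when a target range straddles two disjoint checkpoint ranges.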
Co-authored-by: He Liu <heliu@apple.com>
The logic to determine the validity of a process joining a cluster now
belongs on the worker and the cluster controller. It is no longer
restricted to tlogs and storages, but instead applies to all processes
(even stateless ones).
Changes:
1. Change `isEncryptionOpSupported` to check the ENABLE_ENCRYPTION server knob instead of `clientDBInfo.isEncryptionEnabled`. Until clientDBInfo has been broadcast to the workers, its content is uninitialized, so during that window some data (e.g. item 2) is not encrypted when it should be.
2. Fix CommitProxy not encrypting metadata mutations which are recovered from txnStateStore
3. Fix KeyValueStoreMemory (and thus TxnStateStore) not encrypting partial transactions coming from recovery
4. Add new CODE_PROBEs for the above fixes
5. Logging changes
* Introduce "default encryption domain"
Description
In the current FDB native encryption data at-rest implementation,
an entity being encrypted (mutation, KV and/or file) is categorized
into one of the following encryption domains:
1. Tenant domain, where encryption domain == tenant boundaries
2. FDB system keyspace - FDB metadata encryption domain
3. FDB Encryption Header domain - used to generate digest for
plaintext EncryptionHeader.
The scheme doesn't support encryption if an entity can't be categorized
into any of the above-mentioned encryption domains; for instance, non-tenant
mutations are NOT supported.
This patch extends encryption support to mutations for which the corresponding
Tenant information can't be obtained (key length shorter than TenantPrefix)
and/or mutations that do not belong to any valid Tenant
(FDB management cluster data) by mapping such mutations to a
"default encryption domain".
TODO
The CommitProxy-driven TLog encryption implementation requires every transaction
mutation to contain 1 KV, not crossing Tenant boundaries. The only exception to
this rule is ClearRange mutations. For now, ClearRange mutations are mapped
to the 'default encryption domain'; a subsequent patch shall propose
appropriate handling for ClearRange mutations.
Testing
devRunCorrectness - 100k
Configuration database data lives on the coordinators. When a change
coordinators command is issued, the data must be sent to the new
coordinators to keep the database consistent.