`recoveredDiskFiles` is a promise that is fulfilled once all local TLog and storage files in a process have been initialized. It was added so that a process waits on it before joining the cluster, which prevents a slowly recovering TLog from joining the cluster and slowing down cluster recovery.
With #7510, a process can join the cluster in a stateless role, while still being prevented from joining in a stateful role until its TLog and storage files are recovered. As such, the `recoveredDiskFiles` wait is no longer needed. This PR cleans up that logic.
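The gating described above can be sketched roughly as follows. This is a minimal illustration in Python's `asyncio`, not the fdbserver implementation: the function names and the event log are hypothetical, and an `asyncio.Future` stands in for the `recoveredDiskFiles` promise.

```python
import asyncio

async def recover_disk_files(recovered: asyncio.Future, log: list) -> None:
    # Hypothetical stand-in for initializing local TLog and storage files.
    await asyncio.sleep(0.01)
    log.append("disk files recovered")
    recovered.set_result(None)  # fulfil the promise

async def join_stateless(log: list) -> None:
    # Stateless roles do not depend on local disk state, so they do not wait.
    log.append("joined as stateless")

async def join_stateful(recovered: asyncio.Future, log: list) -> None:
    # Stateful roles remain gated on local recovery.
    await recovered
    log.append("joined as stateful")

async def run_process() -> list:
    log = []
    recovered = asyncio.get_running_loop().create_future()
    await asyncio.gather(
        recover_disk_files(recovered, log),
        join_stateless(log),
        join_stateful(recovered, log),
    )
    return log

events = asyncio.run(run_process())
```

In this sketch the stateless join completes before recovery finishes, while the stateful join only proceeds once the future is fulfilled.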
* [EaR]: Update KMS APIs to split encryption keys endpoints
Description
diff-1: Address review comments
Major changes proposed:
1. Extend fdbserver to allow parsing two endpoints for encryption-at-rest
support: getEncryptionKeys, getLatestEncryptionKeys
2. Update RESTKmsConnector to do the following:
2.1. Split the getLatest and getCipher requests.
2.2. Mark "domain_id" as optional for point lookups.
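As a rough illustration of the split, the two request kinds could be built separately as below. This is a sketch under stated assumptions: apart from "domain_id", every field name and the payload layout are hypothetical and do not describe the actual RESTKmsConnector wire format.

```python
import json
from typing import List, Optional, Tuple

def build_latest_keys_request(domain_ids: List[int]) -> str:
    # Latest-key lookup: keyed by domain only (hypothetical payload layout).
    details = [{"domain_id": d} for d in domain_ids]
    return json.dumps({"cipher_key_details": details})

def build_keys_by_id_request(lookups: List[Tuple[int, Optional[int]]]) -> str:
    # Point lookup: keyed by a cipher id, with "domain_id" included only
    # when the caller supplies one ("base_cipher_id" is a hypothetical name).
    details = []
    for cipher_id, domain_id in lookups:
        entry = {"base_cipher_id": cipher_id}
        if domain_id is not None:
            entry["domain_id"] = domain_id
        details.append(entry)
    return json.dumps({"cipher_key_details": details})

latest_body = build_latest_keys_request([1, 2])
point_body = build_keys_by_id_request([(100, 1), (101, None)])
```

Keeping the two builders separate mirrors the idea that each request type goes to its own endpoint.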
Testing
devRunCorrectness - 100K
* Fix Redwood tree height overgrowth when EaR and tenant page split are enabled, by removing the buildNewSubtree() logic.
* Fix the incorrect page upper bound for the last page created by writePages() without the buildNewSubtree() logic.
* Enable tenant page split if the encryption mode is domain-aware encryption.
* Related test fixes:
- In simulation, pass encryption mode to storage/Redwood via knobs. This is a workaround to enable testing with Redwood encryption before we correctly pass the encryption mode via db config. Also temporarily disable tenant page split for restart tests.
- Disable raw access in the FuzzApiCorrectness test if domain-aware encryption is enabled, to avoid a test timeout.
- Disable encryption for DrUpgradeRestart test, which is likely to fail due to a rare EKP deadlock issue blocking recovery. Will re-enable after the deadlock issue is fixed.
The unit test intentionally causes unexpected_encoding_type to be thrown, which would hit a simulation-only assert failure. Disable that assert in this case.
* Improved SHARD_ENCODE_LOCATION_METADATA migration.
* Cleanup.
* Make a data move cancel itself if it finds a conflicting data move. Fix a
transaction reset issue.
* Cancel data move in a retry loop to avoid corrupted mutations.
Co-authored-by: He Liu <heliu@apple.com>
Previously, with EaR we always enabled authentication (e.g. when encrypting Redwood pages). Authentication acts as a form of checksum, so a dedicated page checksum was not needed. This PR adds back the xxhash page checksum when authentication is disabled. It also changes the knob so that authentication is disabled by default.
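The choice between the two integrity mechanisms can be sketched as below. This is an illustration only, not Redwood's page format: `zlib.crc32` stands in for xxhash (which is not in the Python standard library), HMAC-SHA256 stands in for the authentication scheme, the payload is left unencrypted for brevity, and all names are hypothetical.

```python
import hashlib
import hmac
import zlib
from typing import Optional

AUTH_TAG_LEN = 32  # HMAC-SHA256 digest size in bytes
CHECKSUM_LEN = 4   # CRC32 size in bytes (stand-in for an xxhash checksum)

def _trailer(payload: bytes, auth_key: Optional[bytes]) -> bytes:
    if auth_key is not None:
        # Authentication enabled: the auth tag doubles as the integrity check.
        return hmac.new(auth_key, payload, hashlib.sha256).digest()
    # Authentication disabled: fall back to a plain page checksum.
    return zlib.crc32(payload).to_bytes(CHECKSUM_LEN, "little")

def seal_page(payload: bytes, auth_key: Optional[bytes] = None) -> bytes:
    return payload + _trailer(payload, auth_key)

def verify_page(page: bytes, auth_key: Optional[bytes] = None) -> bool:
    n = AUTH_TAG_LEN if auth_key is not None else CHECKSUM_LEN
    payload, trailer = page[:-n], page[-n:]
    return hmac.compare_digest(trailer, _trailer(payload, auth_key))

plain_page = seal_page(b"page contents")
auth_page = seal_page(b"page contents", auth_key=b"secret")
```

Either way every page carries exactly one integrity trailer, so a corrupted page is detected whether or not authentication is enabled.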