Description
Configurable encryption has been checked in and tested via
simulation for more than a month. To also avoid the penalty of accessing
KNOBS in the inline commit path, this patch retires the KNOB and makes
ConfigurationEncryption the default EaR mode for FDB.
BlobCipher still supports the old format header and encryption semantics;
the dead code will be removed in a follow-up PR.
Testing
devRunCorrectness - 100K
* Update main branch to 7.4
* Update API version to 740
* Make fdb_c_client_config_tests.py pass after API version update
* Remove from_7.3.0_until_7.4.0 and add from_7.3.0
* Update tests in fdb_c_client_config_tests.py
* EaR: Update KMS URL refresh policy and fix bugs
Description
RESTKmsConnector implements discovery and refresh semantics, i.e.
on bootstrap it discovers KMS URLs and periodically refreshes them
(to handle server upgrade scenarios). The current implementation
caches the URLs in a min-heap. While serving a request, the actor
pops elements from the min-heap and attempts to connect to each server;
on failure, the URL is temporarily stored in a stack, and at the end of
request processing the stack is merged back into the heap.
The code doesn't work as expected when multiple requests consume
the heap, causing the following issues:
1. The min-heap can retain old URLs already replaced by the latest refresh
(via the stack merge).
2. The URL discovery file is read more often than expected, as multiple
requests can empty the heap, causing the code to re-read URLs from the file.
The patch proposes the following policy to cache URLs and maintain their priority:
1. Unresponsiveness penalty: a flaky KMS connection or overload can cause
requests to time out or fail; each such instance updates the unresponsiveness
penalty of the associated URL context. The penalty is time-bound and
decays over time.
2. Cached URLs are sorted once a failure is encountered; the priority
order is:
2.1. Server(s) with an unresponsiveness penalty are least preferred.
2.2. Server(s) with high total-failures are less preferred.
2.3. Server(s) with high total-malformed responses are less preferred.
3. Update RESTClient to throw 'retryable' errors up to the client, such as
'connection_failed' and/or 'timeout'.
4. Extend RESTUrl to support the IPv6 format.
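The ordering in items 1 and 2 can be sketched as follows. This is an illustration only, not the RESTKmsConnector code: the class name `KmsUrlContext`, its fields, the `PENALTY_WINDOW_SECONDS` constant, and the linear decay are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical constant: seconds until a recorded penalty has fully decayed.
PENALTY_WINDOW_SECONDS = 30.0


@dataclass
class KmsUrlContext:
    """Hypothetical per-URL bookkeeping; names are illustrative only."""
    url: str
    penalty: float = 0.0
    penalty_updated_at: float = 0.0
    total_failures: int = 0
    total_malformed_responses: int = 0

    def current_penalty(self, now: float) -> float:
        # Assumed linear decay: the penalty shrinks to zero over the window.
        elapsed = max(0.0, now - self.penalty_updated_at)
        remaining = max(0.0, 1.0 - elapsed / PENALTY_WINDOW_SECONDS)
        return self.penalty * remaining

    def record_unresponsive(self, now: float) -> None:
        # A timeout or connection failure adds to whatever penalty is left.
        self.penalty = self.current_penalty(now) + 1.0
        self.penalty_updated_at = now


def sort_urls(contexts, now):
    """Most preferred first: lowest penalty, then fewest total failures,
    then fewest malformed responses."""
    return sorted(
        contexts,
        key=lambda c: (
            c.current_penalty(now),
            c.total_failures,
            c.total_malformed_responses,
        ),
    )


a = KmsUrlContext("https://kms-a.example")
b = KmsUrlContext("https://kms-b.example", total_failures=3)
c = KmsUrlContext("https://kms-c.example", total_malformed_responses=1)
a.record_unresponsive(now=100.0)

# Shortly after the timeout, kms-a is the least preferred server.
order_soon = [x.url for x in sort_urls([a, b, c], now=101.0)]
# Once the penalty has decayed, kms-a is preferred again.
order_later = [x.url for x in sort_urls([a, b, c], now=200.0)]
```

Sorting a flat list on a composite key avoids the pop-and-merge bookkeeping of the heap plus stack approach, so a refresh can simply replace the list.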
Testing
RESTUnit - 100K (new test added for coverage)
devRunCorrectness
This PR avoids "external timeout" failures for redwood correctness tests.
Update the logic in fdbserver.actor.cpp so that -1, instead of 0, is treated as the noUnseed value. If "noUnseed == true", then -1 is logged as "RandomUnseed" at the end of the trace.
Tweak the finish condition of redwood unit tests so that once wall clock time reaches a certain threshold, the test finishes and noUnseed is set to true.
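A minimal sketch of the two changes above, assuming a test loop driven by a `step` callback. The function names, the `clock` parameter, and the loop shape are hypothetical and do not mirror fdbserver.actor.cpp or the redwood unit tests; only the -1 sentinel comes from the description.

```python
import time

# Sentinel from the description above: -1 (not 0) marks "no unseed".
NO_UNSEED = -1


def run_until_done_or_deadline(step, max_wall_clock_seconds, clock=time.monotonic):
    """Call step() until it returns True or the wall-clock budget is spent.

    Returns the noUnseed flag: True only if the run was cut short by the
    wall-clock threshold.
    """
    start = clock()
    while True:
        if step():
            return False  # finished normally, unseed remains meaningful
        if clock() - start >= max_wall_clock_seconds:
            return True  # threshold reached: finish and set noUnseed


def unseed_to_log(computed_unseed, no_unseed):
    """Value to report as RandomUnseed at the end of the trace."""
    return NO_UNSEED if no_unseed else computed_unseed


# Fake clock so the example is deterministic: 0s, 1s, 2s, ...
ticks = iter([0.0, 1.0, 2.0, 3.0])
cut_short = run_until_done_or_deadline(lambda: False, 2.0, clock=lambda: next(ticks))

steps = iter([False, True])
finished = run_until_done_or_deadline(lambda: next(steps), 100.0, clock=lambda: 0.0)
```

With -1 as the sentinel, 0 is free to be logged as an ordinary unseed value.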
* EaR: REST based Simulated KMS Vault request handler interface
Description
diff-1: Address review comments
Improve unit test case coverage
diff-2: Extend RESTKmsConnectorUtil to generate HTTP::Header
EaR simulation testing is currently driven by the SimKmsConnector
interface, which exposes endpoints invoked directly by EKP to fetch
encryption keys. This approach bypasses the RESTKms communication
path, leaving it untested. The FDB codebase was recently extended with
an HTTPServer interface, whose absence had been the gap preventing
end-to-end testing of EaR code.
The patch proposes the following changes:
1. Refactor RESTKmsConnector to move common code and definitions
to the RESTKmsConnectorUtil namespace.
2. Introduce RESTSimKmsVault, which accepts HTTP format requests and
provides appropriate HTTP responses.
Testing
RESTUnit 100K + 5k valgrind
devRunCorrectness 100K
* Add networkoption to disable non-TLS connections
* add disable plaintext connection to fdbserver
* python doc
* Formatting
* Add tls disable plaintext connection to client api test
* review
* fix negative test
* formatting
* add TLS support to c client config tests
Adds support for TLS in the client and server separately
* add tests for disable_plaintext_connections
Test TLS and Plaintext Clusters and Clients
* Fix documentation
* Rename option to indicate it is client-only
* clearer formatting
* default to allowing plaintext connections
* add SetTLSDisablePlaintextConnection to go bindings
* Passes existing tests
* adding http unit test for wrong md5 sum
* Added new HTTPKeyValueStore workload to test long-running http clients
* fixing warnings
* fixing bugs with tenant_mode required on external clients and changing test to find them
* Update fdbcli/BlobKeyCommand.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
* Enable fault injection;
Ignore MAX_COORDINATOR_SNAPSHOT_FAULT_TOLERANCE in simulation
Fix the issue when there is no fdb.cluster file and disable randomMoveKeys in SnapTest workload
* Move snap tests to from_7.3.0
* Add comments for the change
* Change SevError to SevWarn when Tlog process is not found in the worker map