* Add an DD tenant-cache-assembly actor
* Add basic tenant list monitoring for tenant cache.
* Update DD tenant cache refresh to be more efficient and unit-testable
* Remove the DD prefix in the tenant cache class name (and associated impl and UT class names); there is nothing specific to DD in it; DD uses it; other modules may use it in the future
* Disable DD tenant awareness by default
This patch is to fix the compile error
/root/src/fdbclient/S3BlobStore.actor.cpp:410:9: error: moving a local
object in a return statement prevents copy elision
[-Werror,-Wpessimizing-move]
return std::move(resource);
^
/root/src/fdbclient/S3BlobStore.actor.cpp:410:9: note: remove std::move
call here
return std::move(resource);
^~~~~~~~~~ ~
1 error generated.
This is to fix an issue when recovery and change coordinator key happens
together. The issue will occur when:
1. Recovery starts
2. Coordinator key change transaction started
3. During the recovery the coordinator key is read from cluster file and
stored in the storage server
4. The cluster controller received `ChangeCoordiatorsRequest`, and
updated the cluster name with the new value.
at this stage, the value related to coordinator key in storage server and
the worker is inconsistent.
5. changeQuorumChecker is called, which will verify such consistency.
Since they are different, the call is returning failure and the
caller, which could be a TEST_CASE, fails.
This is a rare race issue, and it is also noticed that when the
recovery/coordinator key change process is done, the database is in a
proper state which allows changeQuorumChecker behave properly. In this
case, a retry mechanism should be sufficiently fix corresponding test
failures.
* Close trace file when error happens in runNetwork().
* Improve the bestCount algorithm in getLeader().
In the current implementation, if the nominees are [0,1], the chosen leader will be 1, which is an exception to other cases and our expectation that if 2 nominees have the same frequency, the one with lower id will be the leader.
* Remove unnecessary new statement.
stream will never be a nullptr.
* Move self->dnsCache out of lambda capture.
Member variables are not capture by default, thus, `host` and `service` are not captured. This somehow successfully compile, but throws std::bad_alloc or basic_string::_S_create exceptions when we call `host+":"+service` in dnsCache.remove().
* Revert unintended change.
* Address comments.
1. Support virtual hosting endpoint.
2. On-premise s3 compatible storage service may use IP instead of s3 form domain name,
especially for development/test environment.
Instead of parsing service and region from domain name,
1). Hard code "s3" as service name in v4 signature
2). Add new parameter to allow pass region name from url
3. Fix creating bucket issue on aws, adding request body.
* Don't fail test if log cursor times out during network partition
Also, exercise the codepath for handling timed_out in simulation, by
reverting this knob buggification behavior to that of 07976993e7.
* clang-format
* Remove unnecessary actorcompiler.h includes (from non-actor files)
* Make AsyncFileChaos a non-actor header file
* Add unactorcompiler.h include to the end of actor header files
* Add missing actorcompiler.h includes to actor header files
* Fix a few places we weren't doing exponential backoff
We re-create the transaction every iteration of each of these retry
loops, so we need to manage exponential backoff here ourselves.
Closes#7301
* Remove former Backoff definition
`ReadYourWritesTransaction` has memory allocated before being passed to
the main thread. This allows both threads to continue to access the
transaction object. Currently, the transaction gets allocated and
initialized on the foreign thread, and then re-initialized on the main
thread. This causes a bunch of extra, unnecessary work for each
`ReadYourWritesTransaction` where the temporary object gets destructed.
The fix is to only allocate memory for the `ReadYourWritesTransaction`
on the foreign thread, and then initialize it once on the main thread.
Adding GetEncryptCipherKeys and GetLatestCipherKeys helper actors, which encapsulate cipher key fetch logic: getting cipher keys from local BlobCipherKeyCache, and on cache miss fetch from EKP (encrypt key proxy). These helper actors also handles the case if EKP get shutdown in the middle, they listen on ServerDBInfo to wait for new EKP start and send new request there instead.
The PR also have other misc changes:
* EKP is by default started in simulation regardless of. ENABLE_ENCRYPTION knob, so that in restart tests, if ENABLE_ENCRYPTION is switch from on to off after restart, encrypted data will still be able to be read.
* API tweaks for BlobCipher
* Adding a ENABLE_TLOG_ENCRYPTION knob which will be used in later PRs. The knob should normally be consistent with ENABLE_ENCRYPTION knob, but could be used to disable TLog encryption alone.
This PR is split out from #6942.
* Add simulation test for 1 data hall + 1 machine failure case.
* Disable BUGGIFY for DEGRADED_RESET_INTERVAL.
A simulation test discovered a situation where machines attempting to connect
to a dead coordinator (with a well-known endpoint) were getting themselves
marked degraded. This flapping of the degraded state prevented recovery from
completing, as it started over any time it noticed that tlogs on degraded
hosts could be relocated to non-degraded ones.
bin/fdbserver -r simulation -f tests/rare/CycleWithDeadHall.toml -b on -s 276841956
* Add JWT support to TokenSign
* Encapsulate OpenSSL public/private key type
Type-safe passing around of keys without having to DER/PEM-serialize
(OpenSSL doesn't have distinct types for public and private key)
* Apply Clang format
* Add verify benchmark for JWT and FlatBuffers token
* Unit test base64url::{encode, decode}
* Make all payload fields optional
Let user code validate non-signature fields
* Make all payload fields optional
Completely defer field check to user code
* Move rapidjson from fdbclient to contrib
* Make fdbrpc's rapidjson linkage private
Currently only sources include them.
* Modify rapidjson path in apiversioner.py
* Algorithm::Unknown > Algorithm::UNKNOWN