Previously, we had Ref types outliving the arena's that owned them,
specifically encryptDomains in the getResolution actor. Refactor to use
Standalone's, which both fixes the memory error and makes this easier to
reason about.
Also fix a potential ODR violation.
Description
find_package was used to find and link `zlib` library needed to enable
boost::gzip compression filter. However, the code adds dynamic linkage
of zlib shared object with generated binaries (fdbserver for instance).
For now disable the ZLIB find code to effectively disable GZIP compression
support.
Testing
* Introduce "default encryption domain"
Description
In current FDB native encryption data at-rest implementation,
an entity getting encrypted (mutation, KV and/or file) is categorized
into one of following encryption domains:
1. Tenant domain, where, Encryption domain == Tenant boundaries
2. FDB system keyspace - FDB metadata encryption domain
3. FDB Encryption Header domain - used to generate digest for
plaintext EncryptionHeader.
The scheme doesn't support encryption if an entity can't be categorized
into any of above mentioned encryption domains, for instance, non-tenant
mutations are NOT supported.
Patch extend the encryption support for mutations for which corresponding
Tenant information can't be obtained (Key length shorter than TenantPrefix)
and/or mutations do not belong to any valid Tenant
(FDB management cluster data) by mapping such mutations to a
"default encryption domain".
TODO
CommitProxy driven TLog encryption implementation requires every transaction
mutation to contain 1 KV, not crossing Tenant-boundaries. Only exception to
this rule is ClearRange mutations. For now ClearRange mutations are mapped
to 'default encryption domain', in subsequent patch appropriate handling
for ClearRange mutations shall be proposed.
Testing
devRunCorrectness - 100k
Per #7952, content in /sys/fs/cgroup/cpu,cpuacct/cpu.stat is reported in
MachineMetrics.
If the file does not exist, reports `NoCpuStatFile`.
If the file is not parsable, reports `CpuStatFileParseError`.
Adding the following metrics:
* BlobCipherKeyCache hit/miss
* EKP: KMS requests latencies
* For each component that using encryption, they now need to pass a UsageType enum to the encryption helper methods (GetEncryptCipherKeys/GetLatestEncryptCipherKey/encrypt/decrypt) and those methods will help to log get cipher key latency samples and encryption/decryption cpu times accordingly.
* Assert that arena's appear last in serializer calls
* Fix all occurrences of Arena's not appearing last in serializer call
* Work around issue from Standalone inheriting from Arena privately
* Attempt to fix windows build
Use fb_ prefix instead of detail namespace to scope implementation
details in headers
* Remove API 720 guards for tenants (experimental feature) and the cluster ID special keys (no need to guard)
* Enable the relaxed special key access in transactions that need to use special key-space APIs introduced in 7.2
Valgrind is complaining about use of uninitialized memory in
absl::GetStackTrace, and the cases where it complains the backtraces are
incomplete. Note: this means that jemalloc heap profiling no longer
works out of the box. Advanced users who want to enable jemalloc heap
profiling will now have to revert this change and build from source.
* flow: add ApiVersion to replace hard coding api version
Instead of hard coding api value, let's rely on feature versions akin to
ProtocolVersion.
* ApiVersion: remove use of -1 for latest and use LATEST_VERSION
* Move arena members to the end of serializer calls
See
https://github.com/apple/foundationdb/tree/main/flow#flatbuffersobjectserializer
for why this is necessary.
* Fix a heap-use-after-free
Previously memory owned by
EncryptKeyProxyData::baseCipherDomainIdKeyIdCache was borrowed by a call
to EncryptKeyProxyData::insertIntoBaseDomainIdCache where it was
invalidated and then used. Now
EncryptKeyProxyData::insertIntoBaseDomainIdCache takes shared ownership
by taking a Standalone.
And also rename some types to end in Ref to follow the flow conventions
described here: https://github.com/apple/foundationdb/tree/main/flow#arenas
* Make envvar access in envvar-knob parsing cross-platform
Variable `environ' is defined for Linux,
does not exist in Mac, and is deprecated in Windows
* Prevent env knob strings from being copied at each iteration
* Add headers for {Get|Free}EnvironmentStrings()
* Adjust for different memory layout of environ vs GetEnvironmentStrings()
GetEnvironmentStrings returns a single byte string
with adjacent key=value strings delimited with null bytes,
whereas environ is an array of key=value strings.
* Fix missing return value for Windows build
* Have just one return statement
* Implement TenantCacheEntry in-memory cache
Description
diff-4: TraceEvent usage improvements
diff-3: Address review comments
diff-2: Add APIs to read counter values, test improvements
diff-1: Address review comments
Major changes includes:
1. Implements an actor that enables an in-memory caching of
TenantCacheEntry object, allowing the caller to embed custom
information along with TenantCacheEntry.
2. The cache follows read-through cache semantics where the entry
gets loaded from underlying database on a miss.
3. The cache implements a "periodic poller" to refresh known Tenants
by consulting the database. Once a database keyrange-watch feature is
available, cache shall be updated.
Bonus:
Implement a 'recurringAsync' addition to genericActors allowing caller
to schedule a periodic task registering an "actor functor"; the routine
'waits' for the actor unlike existing 'recurring' implementation.
Testing
TenantEntryCache workload
devCorrectnessRun - 100K
* Update network address in trace logs; Add system monitor for flowprocess
* Create a new trace file with the correct process address for flowprocess
* Remove unused debugging traces
* Add a new error lock_file_failure; Change please_reboot_remote_kv_store to please_reboot_kv_store; Add the code to only reboot the kv store but not the worker; Remove some unnecessay traces
* Add error handling for file_not_found in handleIOErrors
* Format worker.actor.cpp file
Description
FDB native encryption data at-rest supports two type of cipher-keys
in-memory caching:
1. Revocable keys - with a definite expiry (future timestamp)
2. Non-revocable keys - with or without expiry timestamp and/or
refreshAt timestamp.
Patch update BlobCipherKey in-memory cache to respect EKP/KMS
supplied 'refreshAt' and 'expireAt' timestamp. GetLatestCipher
validates `cipher key freshness' as well as GetCipherKey checks
for 'cipher key liveness' before replying details to the caller.
Patch also optimizes the BlobCipher module logging by taking
following measures:
1. BLOB_CIPHER_DEBUG macro to guard spammy log messages needed
mostly for debugging failures.
2. Minimize log volume by logging cipherKey details for any new
key added to the cache, key-refreshes are not logged.
3. Categorize logs into: debug, info and warn on per-usecase basis
Testing
devRunCorrectness - 100K
EncryptOps.toml - 100K
Currently GRV is reporting proxy_memory_limit_exceeded error which has
error message claiming Commit proxy failing. This split should remove
such confusion.
* Enable configuring the next future protocol version as the current protocol version in FDB client, fdbserver, and fdbcli
* Auto format python files used in upgrade tests
* Add a test for upgrading to a future FDB version
* Emphasize that the options for using future protocol version are intended for test purposes only
* Make the global variable for current protocol version visible only locally
* Refactirng to avoid using currentProtocolVersion() in static intialization
* Update go bindings
Currently it invokes pow(10, uniform_rand(log_e(range_begin), log_e(range_end))),
which may overflow beyond UINT32_MAX.
Fix it by preventing log_e(0) case and change pow base to M_E
* Don't build fdb c shim with ubsan
This avoids duplicate symbols when linking. It doesn't really make sense
to assemble files with -fsanitize=undefined anyway, since it won't
insert instrumentation.
* Consolidate boost_asan with boost_target
* Fix ASAN build
Description
-diff-2: Fix Mac build issues
-diff-1: Address review comments
Patch addresses the issue where ASAN build failed after introducing
BlobGranule compression.
Testing
ASAN build
As of version 7.1, knob parsing started using std::stol which will
partially consume its input and return success, even if that's not
what we want, e.g. std::stol("4GiB") returns 4.
This changes returns the behavior of 7.0, which would consider the
above example invalid.
* proof of concept
* use code-probe instead of test
* code probe working on gcc
* code probe implemented
* renamed TestProbe to CodeProbe
* fixed refactoring typo
* support filtered output
* print probes at end of simulation
* fix missed probes print
* fix deduplication
* Fix refactoring issues
* revert bad refactor
* make sure file paths are relative
* fix more wrong refactor changes
* Attempt to fix windows build
The windows build is complaining that some symbols in flow are
duplicate, which seems fair since we're currently compiling all the flow
sources and _also_ linking to flow for flowlinktest. This change makes
it so that we only link flow, and don't compile the flow source files
again.
* Remove source files from fdbclient and fdbrpc link tests
* Shard based move.
* Clean up.
* Clear results on retry in getInitialDataDistribution.
* Remove assertion on SHARD_ENCODE_LOCATION_METADATA for compatibility.
* Resolved comments.
Co-authored-by: He Liu <heliu@apple.com>
Currently, we have code in different folders like `flow/` and `fdbrpc/`
that should remain isolated. For example, `flow/` files should not
include functionality from any other modules. `fdbrpc/` files should
only be able to include functionality from itself and from `flow/`.
However, when creating a shared library, the linker doesn't complain
about undefined symbols -- this only happens when creating an
executable. Thus, for example, it is possible to forward declare an
`fdbclient` function in an `fdbrpc` file and then use it, and nothing
will break (when it should, because this is illegal).
This change adds dummy executables for a few modules (`flow`, `fdbrpc`,
`fdbclient`) that will cause a linker error if there are included
symbols which the linker can't resolve.
* Add JsonWebKeySet parser/stringifier
* Update header directory
* Make JWKS parser correctness clean for OpenSSL 1.x
Add RSA keygen support
* Make JWKS parser correctness clean for OpenSSL 3.x
+extend unique_ptr for scoped destruction of OpenSSL objects
* Use PKey::{sign|verify}() in TokenSign
* Apply AutoCPointer to MkCert
* Apply Clang format
* JWKS::toStringRef() returns StringRef > Optional<StringRef>
* Fix Mac/Windows build error
* Fix incorrect fix of Mac build
* Fix filename in license comment for AutoCPointer.h
* Refactor complex C macros into function templates
This PR add support for TLog encryption through commit proxy. The encryption is done on per-mutation basis. As CP writes mutations to TLog, it inserts encryption header alongside encrypted mutations. Storage server (and other consumers of TLog such as storage cache and backup worker) decrypts the mutations as they peek TLog.