Clients now poll the proxy for the latest global config at a specific
version. The proxy periodically requests the latest global
configuration data and stores it in memory, so it can respond
immediately to clients with the appropriate version.
Clients should avoid reading system keys unless authorized. Under global
config, each client reads from the system keyspace to check for new
global config keys. This commit moves these reads to a server role (the
GRV proxies) and sends the results back to GlobalConfig for an in-memory
update.
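The flow described above can be sketched roughly as follows. This is an illustrative Python model only, not the actual C++ implementation: the class `ProxyConfigCache` and its `refresh`/`poll` methods are hypothetical names, and the real proxy exchanges these values over RPC rather than through direct calls.

```python
import threading


class ProxyConfigCache:
    """Hypothetical stand-in for the in-memory cache kept by a GRV proxy."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0
        self._config = {}

    def refresh(self, version, config):
        """Store a newly read config snapshot; stale versions are ignored."""
        with self._lock:
            if version > self._version:
                self._version = version
                self._config = dict(config)

    def poll(self, client_version):
        """Return (version, config) if the cache is newer than the client's
        version, otherwise None so the client keeps what it has."""
        with self._lock:
            if self._version > client_version:
                return self._version, dict(self._config)
            return None
```

Only the proxy side (`refresh`) would read the system keyspace in this model; clients only ever call `poll`, which is the point of moving the reads to a server role.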
* Fix a few places we weren't doing exponential backoff
We re-create the transaction every iteration of each of these retry
loops, so we need to manage exponential backoff here ourselves.
Closes #7301
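A minimal sketch of the pattern, assuming a retry loop that builds a fresh transaction on every iteration and therefore cannot rely on the transaction object to remember its backoff state. The helper `run_with_backoff` and all of its parameters are hypothetical and exist only for illustration; the real change is in C++.

```python
import time


def run_with_backoff(make_transaction, body, is_retryable,
                     initial_delay=0.01, max_delay=1.0, growth=2.0,
                     sleep=time.sleep):
    """Retry `body` with exponential backoff managed by the loop itself."""
    delay = initial_delay
    while True:
        # A new transaction each iteration: any backoff state held by the
        # previous transaction object is lost, so we track `delay` here.
        tr = make_transaction()
        try:
            return body(tr)
        except Exception as e:
            if not is_retryable(e):
                raise
            sleep(delay)
            # Grow the delay for the next attempt, capped at max_delay.
            delay = min(delay * growth, max_delay)
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting.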
* Remove former Backoff definition
The special keys `\xff\xff/management/profiling/client_txn_sample_rate`
and `\xff\xff/management/profiling/client_txn_size_limit` are deprecated
in FDB 7.2. However, GlobalConfig was introduced in 7.0, and reading and
writing these keys through the special key space was broken in 7.0+.
This change modifies the profiling special keys to use GlobalConfig
behind the scenes, fixing the broken special keys.
The following Python script was used to make sure both GlobalConfig and
the profiling special key can be used to read/write/clear profiling
data:
```
import fdb
import time

fdb.api_version(710)

@fdb.transactional
def set_sample_rate(tr):
    tr.options.set_special_key_space_enable_writes()
    # Alternative way to write the key
    #tr[b'\xff\xff/global_config/config/fdb_client_info/client_txn_sample_rate'] = fdb.tuple.pack((5.0,))
    tr[b'\xff\xff/management/profiling/client_txn_sample_rate'] = b'5.0'

@fdb.transactional
def clear_sample_rate(tr):
    tr.options.set_special_key_space_enable_writes()
    # Alternative way to clear the key
    #tr.clear(b'\xff\xff/global_config/config/fdb_client_info/client_txn_sample_rate')
    tr[b'\xff\xff/management/profiling/client_txn_sample_rate'] = b'default'

@fdb.transactional
def get_sample_rate(tr):
    print(tr.get(b'\xff\xff/global_config/config/fdb_client_info/client_txn_sample_rate'))
    # Alternative way to read the key
    #print(tr.get(b'\xff\xff/management/profiling/client_txn_sample_rate'))

fdb.options.set_trace_enable()
fdb.options.set_trace_format('json')
db = fdb.open()

get_sample_rate(db)  # None (or 'default')
set_sample_rate(db)
time.sleep(1)  # Allow time for global config changes to propagate
get_sample_rate(db)  # 5.0
clear_sample_rate(db)
time.sleep(1)
get_sample_rate(db)  # None (or 'default')
```
It can be run with `PYTHONPATH=./bindings/python/ python profiling.py`,
and reads the `fdb.cluster` file in the current directory.
```
$ PYTHONPATH=./bindings/python/ python sps.py
None
5.000000
None
```
Currently, GlobalConfig is a singleton, meaning each process has only
one GlobalConfig object. This is a bug from the client's perspective, as a
client can keep connections to several databases. This patch tracks a
GlobalConfig for each database using an unordered_map in flowGlobals.
We discovered this bug while testing the multi-version client, where the client
got stuck. This was lucky, as normally it would just write the config to the
wrong database.
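The difference between a process-wide singleton and per-database tracking can be illustrated with a small Python sketch. The names `GlobalConfigRegistry`, `create`, and `get` are hypothetical, and a plain dict stands in for the unordered_map; the real change lives in C++.

```python
class GlobalConfig:
    """Hypothetical per-database config holder."""

    def __init__(self):
        self.values = {}


class GlobalConfigRegistry:
    """Maps a database identifier to its own GlobalConfig instance."""

    def __init__(self):
        self._configs = {}

    def create(self, db_id):
        # Create the config for this database once; later calls reuse it.
        if db_id not in self._configs:
            self._configs[db_id] = GlobalConfig()
        return self._configs[db_id]

    def get(self, db_id):
        # Raises KeyError if no config was created for this database.
        return self._configs[db_id]
```

With one entry per database, an update received on one connection can no longer be applied to the config of another.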
This is causing problems with the 5.2.0 restarting test. Removing this
line prevents fdbserver processes from receiving global config updates;
they instead require a restart to see them.
Fixes the following issues:
1. Use the right index when initializing the WriteOnlySet's vector of
atomics. Also switch to std::atomic_init to initialize each atomic in
the vector (cannot default construct the atomics in the vector
because std::atomic does not have a copy constructor).
2. Add failure check for when items cannot be inserted into the
WriteOnlySet due to capacity constraints. This situation occurs when
`copy` is not called on the WriteOnlySet, such as when sampling is
disabled. The `copy` function is what clears the WriteOnlySet.
3. Remove a global config feature I added to update the ClientDBInfo
object used by the global config listener function. This needs more
investigation, but the effect of this change could be that global
config changes are not correctly recognized on fdbserver processes.
4. Add various ASSERTs to verify data in WriteOnlySet.
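The capacity failure described in item 2 can be modelled with a simplified, single-threaded Python sketch. It is an assumption-laden illustration only: the real WriteOnlySet is a C++ structure built on a vector of atomics, and the class name `BoundedWriteOnlySet` and its method signatures are hypothetical.

```python
class BoundedWriteOnlySet:
    """Fixed-capacity set where insert can fail and copy() drains it."""

    def __init__(self, capacity):
        # None marks an empty slot, so None itself cannot be stored.
        self._slots = [None] * capacity

    def insert(self, item):
        """Place item in the first free slot; return False if the set is full."""
        assert item is not None
        for i, slot in enumerate(self._slots):
            if slot is None:
                self._slots[i] = item
                return True
        return False

    def copy(self):
        """Return the stored items and clear every slot."""
        items = [s for s in self._slots if s is not None]
        self._slots = [None] * len(self._slots)
        return items
```

If `copy` is never called, the slots are never freed and every later `insert` returns False, which is the situation the added failure check is meant to handle.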