Commit Graph

388 Commits

Author SHA1 Message Date
Dan Lambright a61d9d1cd2 Merge remote-tracking branch 'origin' into env 2022-08-24 13:06:47 -04:00
Chaoguang Lin 06aa6ee5ff
Add system monitor for flowprocess (#6925)
* Update network address in trace logs; Add system monitor for flowprocess

* Create a new trace file with the correct process address for flowprocess

* Remove unused debugging traces

* Add a new error lock_file_failure; Change please_reboot_remote_kv_store to please_reboot_kv_store; Add the code to only reboot the kv store but not the worker; Remove some unnecessay traces

* Add error handling for file_not_found in handleIOErrors

* Format worker.actor.cpp file
2022-08-24 00:40:38 -07:00
Dan Lambright 50be189a0f Merge remote-tracking branch 'origin' into env 2022-08-23 17:01:48 -04:00
Dan Lambright 5f490d3b85 Allow null knob values. Allow knob values with '='. Don't fail fdbserver on illegal input. 2022-08-23 17:00:10 -04:00
Vaidas Gasiunas 79571dd2b4
Testing upgrades to a future version of FDB (#7780)
* Enable configuring the next future protocol version as the current protocol version in FDB client, fdbserver, and fdbcli

* Auto format python files used in upgrade tests

* Add a test for upgrading to a future FDB version

* Emphasize that the options for using future protocol version are intended for test purposes only

* Make the global variable for current protocol version visible only locally

* Refactirng to avoid using currentProtocolVersion() in static intialization

* Update go bindings
2022-08-08 17:29:49 +02:00
Dan Lambright 4a25f8b692 Merge remote-tracking branch 'origin' into env 2022-08-02 14:51:50 -04:00
Dan Lambright 473feef246 Set knobs using environment variables 2022-08-02 14:29:37 -04:00
Junhyun Shim e2a3fedfc7
Merge branch 'main' into features/authz 2022-07-27 00:08:57 +02:00
Chaoguang Lin 5045844ea1
Add -r changeclusterkey tool for snapshot restore (#7687)
* Debug version of fdbserver -r changeclusterkey tool

* Remove debug symbols; move function to Coordinaton.actor.cpp

* Format and add traces

* fix comments
2022-07-25 22:12:28 -07:00
Junhyun Shim b521dc2f58 Bind DeterministicRandom to OpenSSL RNG 2022-07-20 16:03:09 +02:00
Markus Pilman 1de37afd52
Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
Ata E Husain Bohra 1da88d5d8c BlobFile Encryption and compression support
Fix formatting issues

Description

Testing
2022-07-14 22:46:14 -07:00
Ata E Husain Bohra 24b2de8de8 BlobFile Encryption and compression support
Description

Testing
2022-07-14 17:04:14 -07:00
Yi Wu 7d7ce0909f
Restart tests carry forward encryption knobs value (#7497)
Previously to get around the issue that EKP is not present when restart test switching encryption from on to off and read encrypted data, EKP was made to start in simulation regardless of encryption knob. This PR revert that change, and instead force restart test not to change encryption knob, by passing previous encryption knob through restartInfo.ini file. Also since we don't allow downgrading an encrypted cluster to previous version, disable encryption in downgrade tests.

Also adding an assert to allow reading encrypted mutations only if encryption knob is on. We may reconsider allowing switching encryption on/off for existing cluster, but for now we don't allow it.
2022-07-14 14:45:17 -07:00
Lukas Joswiak 407300bfa6 Disable testing of the remote key value store in simulation 2022-07-13 18:32:50 -07:00
Markus Pilman 03d913a1de Flow compiling 2022-06-27 17:05:55 -06:00
Markus Pilman 8af056e7b0 fdbclient now compiling 2022-06-23 18:05:36 -06:00
Markus Pilman ffaf15c12a moved wellknownendpoints and fixed some includes 2022-06-23 17:03:53 -06:00
Markus Pilman d141347500
Merge pull request #7282 from Doxense/fix-windows-tests
Fix windows tests
2022-06-08 08:18:47 -06:00
Evan Tschannen 473edf3d11
fix: the peek_using_streaming can cause memory corruption (#7286)
* fix: the peek_using_streaming can cause memory corruption

* changed how ALLOW_DANGEROUS_KNOBS is initialized

* only buggify streaming when simulated
2022-05-31 16:04:28 -07:00
Mohamed Oulmahdi 3c4c485ef2 Enable Windows tests 2022-05-31 12:01:57 +02:00
Renxuan Wang df4e0deb4d
coordinatorsKey should not always store IP addresses. (#7204)
* coordinatorsKey should not storing IP addresses.

Currently, when we do a commit of coordinator change, we are always converting hostnames to IP addresses and store the converted results in coordinatorsKey (\xff/coordinators). This result in ForwardRequest also sending IP addresses, and receivers will update their cluster files with IPs, then we lose the dynamic IP feature.

* Remove the legacy coordinators() function.

* Update async_resolve().

ip::basic_resolver::async_resolve(const query & q, ResolveHandler && handler) is deprecated.

* Clean code format.

* Fix typo.

* Remove SpecifiedQuorumChange and NoQuorumChange.
2022-05-23 11:42:56 -07:00
Ata E Husain Bohra 33ae398268
REST KmsConnector implementation (#6994)
* REST KmsConnector implementation

Description
  diff-1: Address review comments.
          Add utility interface to Platform namespace to
          create and operate on tmpfile
 diff-2: Address review comments
         Link Boost::filesystem to CMake build process

Major changes includes:
1. Implement REST based KmsConnector implementation.
2. Salient features of the connector:
 2.1. Two required configuration are:
   a. Discovery KMS URLs - enable KMS discovery on bootstrap
   b. Endpoint path configuration to construct URI to fetch/refresh
      encryption keys
   c. Configuration to provide "validationTokens" to connect with
      external KMS. Patch implements file-based token validation scheme.
 2.2. On startup, RESTKmsConnector discovers KMS Urls and caches
      them in-memory. Extracts "validationTokens" based on input config.
 2.3. Expose endpoints to allow fetch/refresh of encryption keys.
 2.4. Defines JSON format to interact with external KMS - request &
      response payload format.
3. Extend Platform namespace with an interface to create and operate on
   tmp files.
4. Update Platform 'readFileBytes' and 'writeFileBytes' to leverage
   fstream supported implementation.

NOTE: KMS URLs fetched after initial discovery will be persisted using
      DynamicKnobs. It is TODO at the moment and shall be completed
      once DynamicKnobs is feature complete

Testing

Unit test to validation following:
1. Parsing on "validation tokens" logic.
2. Construction and parsing of REST JSON request and response strings.
2022-05-07 13:18:35 -07:00
sfc-gh-tclinkenbeard 06825775db Fix formatting of lines with TLS_OPTION_FLAGS 2022-05-02 22:56:06 -07:00
sfc-gh-tclinkenbeard 7f05221cfe Removed TLS_DISABLED macro 2022-05-02 22:15:27 -07:00
Renxuan Wang c69a07a858
Check in the new Hostname logic. (#6926)
* Revert #6655.

20220407-031010-renxuan-c101052c21da8346           compressed=True data_size=31004844 duration=4310801 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=1:04:15 sanity=False started=100047 stopped=20220407-041425 submitted=20220407-031010 timeout=5400 username=renxuan

* Revert #6271.

20220407-051532-renxuan-470f0fe6aac1c217           compressed=True data_size=30982370 duration=3491067 ended=100002 fail_fast=10 max_runs=100000 pass=100002 priority=100 remaining=0 runtime=0:59:57 sanity=False started=100141 stopped=20220407-061529 submitted=20220407-051532 timeout=5400 username=renxuan

* Revert #6266.

Remove resolving-related functionalities in connection string. Connection string will be used for storing purpose only, and non-mutable.

20220407-175119-renxuan-55d30ee1a4b42c2f           compressed=True data_size=30970443 duration=5437659 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:59:31 sanity=False started=100154 stopped=20220407-185050 submitted=20220407-175119 timeout=5400 username=renxuan

* Add hostname to coordinator interfaces.

* Turn on the new hostname logic.

* Add the corresponding change in config txns.

The most notable change is before calling basicLoadBalance(), we need to call tryInitializeRequestStream() to initialize request streams first.

Passed correctness tests.

* Return error when hostnames cannot be resolved in coordinators command.

* Minor fixes.
2022-04-27 21:54:13 -07:00
Binglin Chang 408c0cf1c9
Fix compile errors on ubuntu 20.04 (#4931) 2022-04-20 10:00:46 -07:00
Markus Pilman c82701dd00 fixed merge bug 2022-04-07 14:40:28 -06:00
Markus Pilman bf956f5630 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-07 13:29:27 -06:00
Yi Wu 994b8c92f8
Add option to limit resident memory and remove default memory limit (#6719)
Changing `memory` option to limit resident memory instead of virtual memory, in config file and fdbserver/fdbbackup/fdbcli command-line argument. Since `rlimit` doesn't support limiting virtual memory, the current implementation have both of fdbmonitor and the fdbserver/fdbbackup process checking process RSS periodically and kill and restart the process if the limit is exceeded.

Adding a new `memory_vsize` option to limit virtual memory, if backward-compatible behavior is desired.

closes #6671, closes #6672
2022-04-06 20:06:24 -07:00
Jingyu Zhou f68fd28d73 Refactor duplicated code into IKnobCollection::setupKnobs() 2022-04-05 02:06:38 -07:00
Steve Atherton 6eb1c2ae48
Merge pull request #6574 from sfc-gh-satherton/redwood-rare-bugs
Rare correctness bug fixes in Redwood
2022-04-01 16:40:22 -07:00
Chaoguang Lin 7d365bd1bb
Remote ikvs debugging (#6465)
* initial structure for remote IKVS server

* moved struct to .h file, added new files to CMakeList

* happy path implementation, connection error when testing

* saved minor local change

* changed tracing to debug

* fixed onClosed and getError being called before init is finished

* fix spawn process bug, now use absolute path

* added server knob to set ikvs process port number

* added server knob for remote/local kv store

* implement simulator remote process spawning

* fixed bug for simulator timeout

* commit all changes

* removed print lines in trace

* added FlowProcess implementation by Markus

* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child

* temporary fix for process factory throwing segfault on create

* specify public address in command

* change remote kv store knob to false for jenkins build

* made port 0 open random unused port

* change remote store knob to true for benchmark

* set listening port to randomly opened port

* added print lines for jenkins run open kv store timeout debug

* removed most tracing and print lines

* removed tutorial changes

* update handleIOErrors error handling to handle remote-ikvs cases

* Push all debugging changes

* A version where worker bug exists

* A version where restarting tests fail

* Use both the name and the port to determine the child process

* Remove unnecessary update on local address

* Disable remote-kvs for DiskFailureCycle test

* A version where restarting stuck

* A version where most restarting tests green

* Reset connection with child process explicitly

* Remove change on unnecessary files

* Unify flags from _ to -

* fix merging unexpected changes

* fix trac.error to .errorUnsuppressed

* Add license header

* Remove unnecessary header in FlowProcess.actor.cpp

* Fix Windows build

* Fix Windows build, add missing ;

* Fix a stupid bug caused by code dropped by code merging

* Disable remote kvs by default

* Pass the conn_file path to the flow process, though not needed, but the buildNetwork is difficult to tune

* serialization change on readrange

* Update traces

* Refactor the RemoteIKVS interface

* Format files

* Update sim2 interface to not clog connections between parent and child processes in simulation

* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled

* Add comments, format files

* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections

* Commit the IConnection interface change, forgot in previous commit

* Fix the issue that onClosed request is cancelled by ActorCollection

* Enable the remote kv store knob

* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process

* Fix the bug where one process starts storage server more than once

* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally

* Remove unreachable code path and add comments

* Clang format the code

* Fix a simple wait error

* Clang format after merging the main branch

* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false

* Disable remote kvs for PhysicalShardMove which is for RocksDB

* Cleanup #include orders, remove debugging traces

* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build

Co-authored-by: “Lincoln <“lincoln.xiao@snowflake.com”>
2022-03-31 17:08:59 -07:00
Steve Atherton 16afeb43fa Avoid false positive for determinism check in DEBUG_DETERMINISM by avoiding use of shared memory. 2022-03-28 20:02:55 -07:00
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Markus Pilman 35f7843d84 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-03-21 12:33:53 +01:00
Andrew Noyes 68c03a7e32
Jemalloc integration fixes (#6626)
* Set default for USE_JEMALLOC initially in ConfigureCompiler

Instead of trying to change the value later on. This fixes the valgrind
build, which was previously incorrectly getting jemalloc involved.

* Check aligned_alloc result for null

And OOM if so - don't assert

* Check that we can allocate magazines with no internal fragmentation

We may want to do this so that the jemalloc heap profiler has some
knowledge of FastAlloc

* Populate TestFile field for noSim tests in TestHarness

* Remove handling for nonexistent "ActualRun"
2022-03-17 15:17:27 -07:00
Markus Pilman 8fac0081a8 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-03-09 11:00:00 +01:00
Renxuan Wang f7eb66441d Try eliminating warnings in macOS and Windows CI builds.
MacOS warnings are format warnings, e.g., `format specifies type 'long' but the argument has type 'Version' (aka 'long long')`.
Windows warnings are `ACTOR does not contain a wait() statement`.
2022-02-25 19:06:57 -08:00
A.J. Beamon 250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Markus Pilman cf31e14904 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-02-23 10:29:32 +01:00
Markus Pilman 102169ba33 Ran clang-format 2022-02-23 10:23:27 +01:00
Renxuan Wang 3c1394578b Address comments. 2022-02-22 16:29:59 -08:00
Renxuan Wang ea8b2bd314 Resolve hostnames synchronously when calling actor is impossible. 2022-02-22 16:29:59 -08:00
Renxuan Wang 481587a8c6 Turn on hostname logic. 2022-02-22 16:29:59 -08:00
Markus Pilman dc973fb67e Allow List and first test 2022-02-22 11:15:16 +01:00
Markus Pilman c7899b9d39 Implemented public endpoints, started allow list 2022-02-16 18:12:03 +01:00
sfc-gh-tclinkenbeard 5b8f4d9be1 Move SimpleIni.h to fdbclient 2022-02-07 13:31:04 -08:00
Ray Jenkins dd45805312
Merge branch 'apple:main' into threadname-issue-6064 2022-02-01 17:40:07 -06:00
Renxuan Wang f9f3735f73 Add resolveHostnamesBlocking() in ConnectionString and IClusterConnectionRecord.
Also, combine IClusterConnectionRecord::getConnectionString() and IClusterConnectionRecord::getMutableConnectionString() to IClusterConnectionRecord::getConnectionString(), and rename setConnectionString() to setAndPersistConnectionString().
2022-01-28 12:20:41 -08:00