Changing `memory` option to limit resident memory instead of virtual memory, in config file and fdbserver/fdbbackup/fdbcli command-line argument. Since `rlimit` doesn't support limiting virtual memory, the current implementation have both of fdbmonitor and the fdbserver/fdbbackup process checking process RSS periodically and kill and restart the process if the limit is exceeded.
Adding a new `memory_vsize` option to limit virtual memory, if backward-compatible behavior is desired.
closes#6671, closes#6672
* initial structure for remote IKVS server
* moved struct to .h file, added new files to CMakeList
* happy path implementation, connection error when testing
* saved minor local change
* changed tracing to debug
* fixed onClosed and getError being called before init is finished
* fix spawn process bug, now use absolute path
* added server knob to set ikvs process port number
* added server knob for remote/local kv store
* implement simulator remote process spawning
* fixed bug for simulator timeout
* commit all changes
* removed print lines in trace
* added FlowProcess implementation by Markus
* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child
* temporary fix for process factory throwing segfault on create
* specify public address in command
* change remote kv store knob to false for jenkins build
* made port 0 open random unused port
* change remote store knob to true for benchmark
* set listening port to randomly opened port
* added print lines for jenkins run open kv store timeout debug
* removed most tracing and print lines
* removed tutorial changes
* update handleIOErrors error handling to handle remote-ikvs cases
* Push all debugging changes
* A version where worker bug exists
* A version where restarting tests fail
* Use both the name and the port to determine the child process
* Remove unnecessary update on local address
* Disable remote-kvs for DiskFailureCycle test
* A version where restarting stuck
* A version where most restarting tests green
* Reset connection with child process explicitly
* Remove change on unnecessary files
* Unify flags from _ to -
* fix merging unexpected changes
* fix trac.error to .errorUnsuppressed
* Add license header
* Remove unnecessary header in FlowProcess.actor.cpp
* Fix Windows build
* Fix Windows build, add missing ;
* Fix a stupid bug caused by code dropped by code merging
* Disable remote kvs by default
* Pass the conn_file path to the flow process, though not needed, but the buildNetwork is difficult to tune
* serialization change on readrange
* Update traces
* Refactor the RemoteIKVS interface
* Format files
* Update sim2 interface to not clog connections between parent and child processes in simulation
* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled
* Add comments, format files
* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections
* Commit the IConnection interface change, forgot in previous commit
* Fix the issue that onClosed request is cancelled by ActorCollection
* Enable the remote kv store knob
* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process
* Fix the bug where one process starts storage server more than once
* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally
* Remove unreachable code path and add comments
* Clang format the code
* Fix a simple wait error
* Clang format after merging the main branch
* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false
* Disable remote kvs for PhysicalShardMove which is for RocksDB
* Cleanup #include orders, remove debugging traces
* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build
Co-authored-by: “Lincoln <“lincoln.xiao@snowflake.com”>
* Set default for USE_JEMALLOC initially in ConfigureCompiler
Instead of trying to change the value later on. This fixes the valgrind
build, which was previously incorrectly getting jemalloc involved.
* Check aligned_alloc result for null
And OOM if so - don't assert
* Check that we can allocate magazines with no internal fragmentation
We may want to do this so that the jemalloc heap profiler has some
knowledge of FastAlloc
* Populate TestFile field for noSim tests in TestHarness
* Remove handling for nonexistent "ActualRun"
MacOS warnings are format warnings, e.g., `format specifies type 'long' but the argument has type 'Version' (aka 'long long')`.
Windows warnings are `ACTOR does not contain a wait() statement`.
Also, combine IClusterConnectionRecord::getConnectionString() and IClusterConnectionRecord::getMutableConnectionString() to IClusterConnectionRecord::getConnectionString(), and rename setConnectionString() to setAndPersistConnectionString().
* Unify flags implementation and change help text in backup.actor.cpp
Description
Testing
* Keep LOG_GROUP unchanged
Description
Testing
* Transfer the hyphens to underscores for internal options and user's input, EXCEPT leading hyphens
Description
Testing
* Use a deep copy of the user's input flag to do the match
Description
Testing
* Convert the _ to - in Option arrays of backup.actor.cpp
Description
Testing
* Transter _ to - for files:
TLSConfig.actor.h, fdbcli.actor.cpp, fdbserver.actor.cpp, FileConverter.h, FileConverter.cpp
Description
Testing
* Change another way to unify flag: using SO_O_ICASE_HYPHEN_AND_UNDERSCORE to determine whether we do the conversion in function IsEqual
Description
Testing
* Change the config command's name from SO_O_ICASE_HYPHEN_AND_UNDERSCORE to SO_O_HYPHEN_TO_UNDERSCORE
Description
Testing
* Update the comment for the SO_O_HYPHEN_TO_UNDERSCORE
Description
Testing
* Fix left underscore in SOption arrays
Description
Testing
* Convert _ to - in several files for commands
Description
Testing
* Make the FDBService and fdbmonitor backward compatible
Description
Testing
* Fix bugs about pointers
Description
Testing
* Check underscore and hyphen at the same time for --knob_, --localily_ and --test_
And fix bugs in fdbmonitor and FDBService
Description
Testing
* Simplify the function in fdbmonitor and FDBService about retrieving arguments.
And fix some documents in masterserver.actor.cpp
Description
Testing
* Convert _ to - for knob in the setKnob functions
Description
Testing
* Convert - to _ in the setKnob functions
Description
Since key in the knob related maps only contain _
Testing
* Rename varialbe name in the fdbmonitor and FDBService for clarification
Description
Testing
Co-authored-by: Chang Liu <chang.liu@snowflake.com>