Commit Graph

353 Commits

Author SHA1 Message Date
Yi Wu 994b8c92f8
Add option to limit resident memory and remove default memory limit (#6719)
Changing `memory` option to limit resident memory instead of virtual memory, in config file and fdbserver/fdbbackup/fdbcli command-line argument. Since `rlimit` doesn't support limiting virtual memory, the current implementation have both of fdbmonitor and the fdbserver/fdbbackup process checking process RSS periodically and kill and restart the process if the limit is exceeded.

Adding a new `memory_vsize` option to limit virtual memory, if backward-compatible behavior is desired.

closes #6671, closes #6672
2022-04-06 20:06:24 -07:00
Jingyu Zhou f68fd28d73 Refactor duplicated code into IKnobCollection::setupKnobs() 2022-04-05 02:06:38 -07:00
Steve Atherton 6eb1c2ae48
Merge pull request #6574 from sfc-gh-satherton/redwood-rare-bugs
Rare correctness bug fixes in Redwood
2022-04-01 16:40:22 -07:00
Chaoguang Lin 7d365bd1bb
Remote ikvs debugging (#6465)
* initial structure for remote IKVS server

* moved struct to .h file, added new files to CMakeList

* happy path implementation, connection error when testing

* saved minor local change

* changed tracing to debug

* fixed onClosed and getError being called before init is finished

* fix spawn process bug, now use absolute path

* added server knob to set ikvs process port number

* added server knob for remote/local kv store

* implement simulator remote process spawning

* fixed bug for simulator timeout

* commit all changes

* removed print lines in trace

* added FlowProcess implementation by Markus

* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child

* temporary fix for process factory throwing segfault on create

* specify public address in command

* change remote kv store knob to false for jenkins build

* made port 0 open random unused port

* change remote store knob to true for benchmark

* set listening port to randomly opened port

* added print lines for jenkins run open kv store timeout debug

* removed most tracing and print lines

* removed tutorial changes

* update handleIOErrors error handling to handle remote-ikvs cases

* Push all debugging changes

* A version where worker bug exists

* A version where restarting tests fail

* Use both the name and the port to determine the child process

* Remove unnecessary update on local address

* Disable remote-kvs for DiskFailureCycle test

* A version where restarting stuck

* A version where most restarting tests green

* Reset connection with child process explicitly

* Remove change on unnecessary files

* Unify flags from _ to -

* fix merging unexpected changes

* fix trac.error to .errorUnsuppressed

* Add license header

* Remove unnecessary header in FlowProcess.actor.cpp

* Fix Windows build

* Fix Windows build, add missing ;

* Fix a stupid bug caused by code dropped by code merging

* Disable remote kvs by default

* Pass the conn_file path to the flow process, though not needed, but the buildNetwork is difficult to tune

* serialization change on readrange

* Update traces

* Refactor the RemoteIKVS interface

* Format files

* Update sim2 interface to not clog connections between parent and child processes in simulation

* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled

* Add comments, format files

* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections

* Commit the IConnection interface change, forgot in previous commit

* Fix the issue that onClosed request is cancelled by ActorCollection

* Enable the remote kv store knob

* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process

* Fix the bug where one process starts storage server more than once

* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally

* Remove unreachable code path and add comments

* Clang format the code

* Fix a simple wait error

* Clang format after merging the main branch

* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false

* Disable remote kvs for PhysicalShardMove which is for RocksDB

* Cleanup #include orders, remove debugging traces

* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build

Co-authored-by: “Lincoln <“lincoln.xiao@snowflake.com”>
2022-03-31 17:08:59 -07:00
Steve Atherton 16afeb43fa Avoid false positive for determinism check in DEBUG_DETERMINISM by avoiding use of shared memory. 2022-03-28 20:02:55 -07:00
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Andrew Noyes 68c03a7e32
Jemalloc integration fixes (#6626)
* Set default for USE_JEMALLOC initially in ConfigureCompiler

Instead of trying to change the value later on. This fixes the valgrind
build, which was previously incorrectly getting jemalloc involved.

* Check aligned_alloc result for null

And OOM if so - don't assert

* Check that we can allocate magazines with no internal fragmentation

We may want to do this so that the jemalloc heap profiler has some
knowledge of FastAlloc

* Populate TestFile field for noSim tests in TestHarness

* Remove handling for nonexistent "ActualRun"
2022-03-17 15:17:27 -07:00
Renxuan Wang f7eb66441d Try eliminating warnings in macOS and Windows CI builds.
MacOS warnings are format warnings, e.g., `format specifies type 'long' but the argument has type 'Version' (aka 'long long')`.
Windows warnings are `ACTOR does not contain a wait() statement`.
2022-02-25 19:06:57 -08:00
A.J. Beamon 250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Renxuan Wang 3c1394578b Address comments. 2022-02-22 16:29:59 -08:00
Renxuan Wang ea8b2bd314 Resolve hostnames synchronously when calling actor is impossible. 2022-02-22 16:29:59 -08:00
Renxuan Wang 481587a8c6 Turn on hostname logic. 2022-02-22 16:29:59 -08:00
sfc-gh-tclinkenbeard 5b8f4d9be1 Move SimpleIni.h to fdbclient 2022-02-07 13:31:04 -08:00
Ray Jenkins dd45805312
Merge branch 'apple:main' into threadname-issue-6064 2022-02-01 17:40:07 -06:00
Renxuan Wang f9f3735f73 Add resolveHostnamesBlocking() in ConnectionString and IClusterConnectionRecord.
Also, combine IClusterConnectionRecord::getConnectionString() and IClusterConnectionRecord::getMutableConnectionString() to IClusterConnectionRecord::getConnectionString(), and rename setConnectionString() to setAndPersistConnectionString().
2022-01-28 12:20:41 -08:00
gaozengqi 23ea6e2c7e Fix typo 2022-01-28 10:40:33 -08:00
gaozengqi f1bd0f16da Run clang-format 2022-01-28 10:40:33 -08:00
gaozengqi a0c0342e76 Fix usage 2022-01-28 10:40:33 -08:00
gaozengqi 697d075e20 Add role 'kvfiledump' to dump key-values from storage file 2022-01-28 10:40:33 -08:00
Ray Jenkins 1f87c67bc9 Add parentWatcher thread names. 2022-01-25 13:21:46 -06:00
A.J. Beamon ff1cb58174 Convert hyphens to underscores for all prefix-based arguments (e.g. --knob-, --locality-) 2021-12-14 12:01:44 -08:00
A.J. Beamon f24adc7b6a Fix a bunch of places where we used old-style arguments. Allow hyphens for profiler args. 2021-12-14 09:59:14 -08:00
A.J. Beamon f29f487823
Unify flags (#25)
* Unify flags implementation and change help text in backup.actor.cpp
Description

Testing

* Keep LOG_GROUP unchanged

Description

Testing

* Transfer the hyphens to underscores for internal options and user's input, EXCEPT leading hyphens

Description

Testing

* Use a deep copy of the user's input flag to do the match

Description

Testing

* Convert the _ to - in Option arrays of backup.actor.cpp

Description

Testing

* Transter _ to - for files:
        TLSConfig.actor.h, fdbcli.actor.cpp, fdbserver.actor.cpp, FileConverter.h, FileConverter.cpp

Description

Testing

* Change another way to unify flag: using SO_O_ICASE_HYPHEN_AND_UNDERSCORE to determine whether we do the conversion in function IsEqual

Description

Testing

* Change the config command's name from SO_O_ICASE_HYPHEN_AND_UNDERSCORE to SO_O_HYPHEN_TO_UNDERSCORE

Description

Testing

* Update the comment for the SO_O_HYPHEN_TO_UNDERSCORE

Description

Testing

* Fix left underscore in SOption arrays

Description

Testing

* Convert _ to - in several files for commands

Description

Testing

* Make the FDBService and fdbmonitor backward compatible

Description

Testing

* Fix bugs about pointers

Description

Testing

* Check underscore and hyphen at the same time for --knob_, --localily_ and --test_
And fix bugs in fdbmonitor and FDBService
Description

Testing

* Simplify the function in fdbmonitor and FDBService about retrieving arguments.
And fix some documents in masterserver.actor.cpp

Description

Testing

* Convert _ to - for knob in the setKnob functions

Description

Testing

* Convert - to _ in the setKnob functions

Description
Since key in the knob related maps only contain _

Testing

* Rename varialbe name in the fdbmonitor and FDBService for clarification

Description

Testing

Co-authored-by: Chang Liu <chang.liu@snowflake.com>
2021-12-14 08:44:39 -08:00
Evan Tschannen 37c9a1320c added --print_sim_time to print simulated time to stdout 2021-11-23 15:01:44 -08:00
sfc-gh-tclinkenbeard 13bb7838aa Enable clang -Wformat warning 2021-10-30 21:07:38 -07:00
A.J. Beamon e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Vaidas Gasiunas 16171d8252 Refactoring well-known endpoint registration
- List all well-known endpoints of FDB in a single enum
- Identify well-known endpoints by plain IDs
2021-09-21 11:05:31 -06:00
Xiaoge Su abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Mohamed Oulmahdi 13e55f7b7a Format code 2021-09-15 16:30:58 -06:00
Mohamed Oulmahdi 5df51ee979 Add output for ignored Windows tests 2021-09-15 16:30:58 -06:00
Mohamed Oulmahdi 906f7cf876 Ignore fdbserver tests on Windows 2021-09-15 16:30:58 -06:00
FDB Formatster 2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
sfc-gh-tclinkenbeard 3418c20867 Merge remote-tracking branch 'origin/master' into paxos-config-db 2021-08-16 10:49:47 -07:00
sfc-gh-tclinkenbeard 82546853c0 Rename UseConfigDB to ConfigDBType 2021-08-09 10:04:35 -07:00
Lukas Joswiak 59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
Lukas Joswiak e9a1679467 Disable sampling everywhere except fdbserver 2021-07-27 09:53:23 -07:00
Sajjad Rahnama 9d6402b759 Fault Injection Active/Deactivation - Edit usage in fdbserver 2021-07-26 11:04:18 -07:00
Sajjad Rahnama e04646e267 Fault Injection Active/Deactivation 2021-07-23 16:28:20 -07:00
Steve Atherton 81e83e2830
Update fdbserver/fdbserver.actor.cpp
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2021-07-18 11:08:27 -07:00
Steve Atherton 1af8538727 Simfdb cleanup did not recognize the unittests directory, fixed this and also made the cleanup code a bit cleaner. 2021-07-18 03:06:04 -07:00
Steve Atherton f596a81073 Rename ::TRUE and ::FALSE in BooleanParams to ::True and ::False so as to not conflict with the TRUE and FALSE macros provided by the Windows and MacOS SDKs. 2021-07-17 00:11:40 -07:00
Steve Atherton 9fd5f54992
Merge pull request #5142 from Daniel-B-Smith/threads-in-simulation
Enable IThreadPool in simulation
2021-07-12 14:24:34 -07:00
Daniel Smith 8efe3b296a Delete remaining extern declarations for noUnseed 2021-07-08 19:19:22 -04:00
sfc-gh-tclinkenbeard 8cc40e3a2b Expand use of BOOLEAN_PARAM 2021-07-02 21:41:50 -07:00
sfc-gh-tclinkenbeard 89eaa90b6d Explicitly initialize useConfigDB in fdbserver.actor.cpp 2021-06-18 20:18:59 -07:00
sfc-gh-tclinkenbeard 41c790b299 Merge remote-tracking branch 'origin/master' into config-db 2021-06-10 22:31:23 -07:00
sfc-gh-tclinkenbeard 13ee24f464 Add UseConfigDB class 2021-06-10 20:57:50 -07:00
sfc-gh-tclinkenbeard c272304e60 Manage global flow knobs with global knob collection 2021-06-09 22:33:00 -07:00
sfc-gh-tclinkenbeard 83a0e473e8 Refactor IKnobCollection code 2021-06-09 20:50:00 -07:00
sfc-gh-tclinkenbeard 371a38e6e5 Merge remote-tracking branch 'origin/master' into remove-extra-copies 2021-06-07 10:26:06 -07:00