327 Commits

Author SHA1 Message Date
Markus Pilman d8a0b57b6c clients have to listen on a port in simulation 2022-04-10 14:09:15 -06:00
Renxuan Wang e548c0d604 Add DNS cache. 2022-04-04 15:08:17 -07:00
Renxuan Wang ff934ca2ad Change MockDNS to DNSCache. 2022-04-04 15:08:17 -07:00
Chaoguang Lin 7d365bd1bb
Remote ikvs debugging (#6465)
* initial structure for remote IKVS server

* moved struct to .h file, added new files to CMakeList

* happy path implementation, connection error when testing

* saved minor local change

* changed tracing to debug

* fixed onClosed and getError being called before init is finished

* fix spawn process bug, now use absolute path

* added server knob to set ikvs process port number

* added server knob for remote/local kv store

* implement simulator remote process spawning

* fixed bug for simulator timeout

* commit all changes

* removed print lines in trace

* added FlowProcess implementation by Markus

* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child

* temporary fix for process factory throwing segfault on create

* specify public address in command

* change remote kv store knob to false for jenkins build

* made port 0 open random unused port

* change remote store knob to true for benchmark

* set listening port to randomly opened port

* added print lines for jenkins run open kv store timeout debug

* removed most tracing and print lines

* removed tutorial changes

* update handleIOErrors error handling to handle remote-ikvs cases

* Push all debugging changes

* A version where worker bug exists

* A version where restarting tests fail

* Use both the name and the port to determine the child process

* Remove unnecessary update on local address

* Disable remote-kvs for DiskFailureCycle test

* A version where restarting stuck

* A version where most restarting tests green

* Reset connection with child process explicitly

* Remove change on unnecessary files

* Unify flags from _ to -

* fix merging unexpected changes

* fix trac.error to .errorUnsuppressed

* Add license header

* Remove unnecessary header in FlowProcess.actor.cpp

* Fix Windows build

* Fix Windows build, add missing ;

* Fix a stupid bug caused by code dropped by code merging

* Disable remote kvs by default

* Pass the conn_file path to the flow process, though not needed, but the buildNetwork is difficult to tune

* serialization change on readrange

* Update traces

* Refactor the RemoteIKVS interface

* Format files

* Update sim2 interface to not clog connections between parent and child processes in simulation

* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled

* Add comments, format files

* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections

* Commit the IConnection interface change, forgot in previous commit

* Fix the issue that onClosed request is cancelled by ActorCollection

* Enable the remote kv store knob

* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process

* Fix the bug where one process starts storage server more than once

* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally

* Remove unreachable code path and add comments

* Clang format the code

* Fix a simple wait error

* Clang format after merging the main branch

* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false

* Disable remote kvs for PhysicalShardMove which is for RocksDB

* Cleanup #include orders, remove debugging traces

* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build

Co-authored-by: Lincoln <lincoln.xiao@snowflake.com>
2022-03-31 17:08:59 -07:00
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
sfc-gh-tclinkenbeard 8dcac2f76d Fix typos 2022-03-13 10:02:11 -03:00
Jingyu Zhou 1a5bf25b5c Update code base to use fmt 8.1.1 2022-03-04 15:52:06 -08:00
Andrew Noyes 7a9217a392
Add contrib/debug_determinism (#6389)
* Add contrib/debug_determinism

Add an instrumentation-based technique for debugging unseen mismatches. Also guard a few existing sources of nondeterminism that don't affect unseen with the DEBUG_DETERMINISM macro.

Also change the simulated run loop to not run as the only task inside the real run loop, since that was a source of nondeterminism.

Also fix nondeterminism from calling timer_int

* Add StorageMetadataType::currentTime

Basically a deterministic-in-simulation version of timer_int that we can
use instead of timer_int for StorageMetadataType::createdTime
2022-02-25 12:54:31 -08:00
A.J. Beamon 250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Renxuan Wang 2ea4146e1f Add resolveTCPEndpointBlocking() to resolve hostnames where async resolving is impossible. 2022-01-28 12:20:41 -08:00
Renxuan Wang 28832a99d6 Address comment. 2022-01-18 14:34:18 -08:00
Renxuan Wang b8bab06e16 Add the functions to set and get mock DNS.
These functions will be used in restarting tests, where mock DNS needs to be saved to and read from files.
2022-01-18 14:34:18 -08:00
sfc-gh-tclinkenbeard ec64890ac1 Remove some usages of PRId64 by using fmt library 2021-11-30 23:35:36 -08:00
Evan Tschannen 37c9a1320c added --print_sim_time to print simulated time to stdout 2021-11-23 15:01:44 -08:00
Steve Atherton 035e0d6e52
Merge branch 'master' into bit-flipping-workload 2021-11-16 14:42:22 -08:00
Renxuan Wang 4630b0ccea Move DNS mock from SimExternalConnection to Sim2.
This is a revise PR of #5934. In simulation, we don't have direct access to SimExternalConnection.
2021-11-15 17:02:51 -08:00
negoyal 1e7338b6c3 Merge branch 'master' into bit-flipping-workload 2021-10-28 14:24:49 -07:00
sfc-gh-tclinkenbeard 49a667c29b Improve const-correctness of INetwork 2021-10-25 14:42:31 -07:00
negoyal 3b34423248 Merge branch 'master' into bit-flipping-workload 2021-08-31 12:14:51 -07:00
FDB Formatster 2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
Josh Slocum 2c10118229 Force kill in killDatacenter didn't actually force kill always 2021-08-17 17:26:59 -05:00
Lukas Joswiak 5dc9a97230 Merge branch 'master' into fixes/alp6 2021-08-01 20:42:52 -07:00
negoyal 40b4f3b2f1 Merge branch 'master' into bit-flipping-workload 2021-07-28 18:06:07 -07:00
negoyal 050c218502 New Disk Delay Logic and ChaosMetrics. 2021-07-28 16:03:37 -07:00
sfc-gh-tclinkenbeard c74047c665 Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings 2021-07-28 11:51:02 -07:00
Evan Tschannen 81f93d794f Merge branch 'master' of https://github.com/apple/foundationdb into fix-ordered-delay 2021-07-27 13:56:35 -07:00
Evan Tschannen 256a18e43b Flow transport uses an ordered delay to avoid out of order reply promise stream messages 2021-07-27 12:01:32 -07:00
Lukas Joswiak 3eed4084e2 Merge branch 'master' into fixes/alp6 2021-07-27 11:26:53 -07:00
Lukas Joswiak 59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
Lukas Joswiak e9a1679467 Disable sampling everywhere except fdbserver 2021-07-27 09:53:23 -07:00
Steve Atherton 507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard e006e4fed4 Fix -Wreorder-ctor warnings in LogSystemPeekCursor.actor.cpp and several other files 2021-07-24 00:48:13 -07:00
sfc-gh-tclinkenbeard 64dc1dc185 Fix -Wreorder-ctor warnings in NativeAPI.actor.cpp and several other files 2021-07-24 00:23:06 -07:00
Sajjad Rahnama e04646e267 Fault Injection Active/Deactivation 2021-07-23 16:28:20 -07:00
Trevor Clinkenbeard 4b83d73f48
Merge pull request #5151 from sfc-gh-tclinkenbeard/fix-non-tls-build
Fix build with DISABLE_TLS=ON
2021-07-20 15:07:16 -07:00
negoyal 596ca92e2f Add missing files and rename some. 2021-07-19 11:13:57 -07:00
Trevor Clinkenbeard f914574f47
Merge pull request #5154 from sfc-gh-tclinkenbeard/send-serverdbinfo-updates-to-coord
Add LowLatencySingleClog test
2021-07-12 19:26:40 -07:00
negoyal 1b8b22decc Wrapper class to avoid adding overhead to all async disk calls 2021-07-12 17:51:01 -07:00
sfc-gh-tclinkenbeard fbc4f47882 Add LowLatencySingleClog.toml test 2021-07-10 17:30:20 -07:00
sfc-gh-tclinkenbeard 17fce0596c Expand use of ENCRYPTION_ENABLED macro 2021-07-09 21:42:42 -07:00
sfc-gh-tclinkenbeard ad03a4787a Fix non-TLS build 2021-07-09 21:06:15 -07:00
Daniel Smith 3fbd6b6143 Enable IThreadPool in simulation 2021-07-08 18:51:01 -04:00
negoyal df39c5a44e Implement Disk Throttling Chaos workload. 2021-06-30 17:05:04 -07:00
sfc-gh-tclinkenbeard a5ecc11bba Added AsyncFileEncrypted::mode field 2021-06-27 18:55:57 -07:00
sfc-gh-tclinkenbeard 81b8292094 Merge remote-tracking branch 'origin' into encrypt-backup-files 2021-06-25 12:21:12 -07:00
Evan Tschannen fcb8bd6475
Revert "Make the sim2 run loop match the behavior of the net2 run loop." 2021-06-22 14:50:01 -07:00
Evan Tschannen 08a5f17660 Merge branch 'master' of https://github.com/apple/foundationdb into feature-sim-time-batching
# Conflicts:
#	fdbserver/DataDistribution.actor.cpp
2021-06-08 10:04:06 -07:00
Evan Tschannen 52ef8b94fb added comments 2021-06-08 09:57:37 -07:00
Lukas Joswiak 153de33f57 Revert "Merge pull request #4802 from sfc-gh-ljoswiak/revert/actor-lineage"
This reverts commit 6499fa178e, reversing
changes made to 1512631957.
2021-06-04 13:31:55 -07:00
A.J. Beamon 69dbe04d42 Rename WeakFutureReference to UnsafeWeakFutureReference and add warning comment 2021-05-28 14:34:20 -07:00