Commit Graph

31 Commits

Author SHA1 Message Date
Markus Pilman 1de37afd52
Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
Lukas Joswiak 9ca8a3c683 Reenable status json for dynamic knobs, add unit test 2022-06-21 11:43:05 -07:00
Andrew Noyes 6f500b59c0
Fix a heap-use-after-free in PaxosConfigConsumer.actor.cpp (#7244)
* Fix a heap-use-after-free in PaxosConfigConsumer.actor.cpp

* Two more defensive local promises

* Two more defensive promise copies

* Fix latent logic error
2022-05-25 12:08:30 -07:00
Renxuan Wang 154de018ff
One place in PaxosConfigConsumer was missed out in #6926. (#7006)
* One place in PaxosConfigConsumer was missed out.

* Minor improvements.
2022-04-28 18:32:55 -07:00
Renxuan Wang c69a07a858
Check in the new Hostname logic. (#6926)
* Revert #6655.

20220407-031010-renxuan-c101052c21da8346           compressed=True data_size=31004844 duration=4310801 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=1:04:15 sanity=False started=100047 stopped=20220407-041425 submitted=20220407-031010 timeout=5400 username=renxuan

* Revert #6271.

20220407-051532-renxuan-470f0fe6aac1c217           compressed=True data_size=30982370 duration=3491067 ended=100002 fail_fast=10 max_runs=100000 pass=100002 priority=100 remaining=0 runtime=0:59:57 sanity=False started=100141 stopped=20220407-061529 submitted=20220407-051532 timeout=5400 username=renxuan

* Revert #6266.

Remove resolution-related functionality from the connection string. The connection string will be used for storage only and will be immutable.

20220407-175119-renxuan-55d30ee1a4b42c2f           compressed=True data_size=30970443 duration=5437659 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:59:31 sanity=False started=100154 stopped=20220407-185050 submitted=20220407-175119 timeout=5400 username=renxuan

* Add hostname to coordinator interfaces.

* Turn on the new hostname logic.

* Add the corresponding change in config txns.

The most notable change is that, before calling basicLoadBalance(), we need to call tryInitializeRequestStream() to initialize the request streams.

Passed correctness tests.

* Return error when hostnames cannot be resolved in coordinators command.

* Minor fixes.
2022-04-27 21:54:13 -07:00
Chaoguang Lin af9deeabc2 Move the Promise<QuorumVersion> before the Future vector to be destroyed after the vector 2022-03-22 16:12:41 -07:00
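The reordering in this commit relies on a C++ rule: non-static data members are destroyed in reverse order of declaration, so a member declared earlier outlives the members declared after it. A minimal sketch of that rule follows. `Tracked`, `QuorumState`, and `destructionOrder` are hypothetical stand-ins for illustration, not Flow's `Promise`/`Future` types or the actual class touched by the commit.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in that records its name in a log when destroyed.
struct Tracked {
    std::string name;
    std::vector<std::string>* log;
    Tracked(std::string n, std::vector<std::string>* l) : name(std::move(n)), log(l) {}
    ~Tracked() { log->push_back(name); }
};

// Assumed layout, mirroring the commit message: the promise-like member is
// declared before the futures-like member.
struct QuorumState {
    Tracked promise;
    Tracked futures;
    explicit QuorumState(std::vector<std::string>* log)
      : promise("promise", log), futures("futures", log) {}
};

// Returns the order in which the two members were destroyed.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    {
        QuorumState state(&log);
    } // state leaves scope here: members are destroyed in reverse declaration order
    return log;
}
```

Calling `destructionOrder()` yields `futures` first and `promise` second, which is the property the commit depends on: anything held by the futures is torn down while the promise is still alive.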
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
sfc-gh-tclinkenbeard 0e7dc83f25 Fix compilation issues with ModelInterface construction in configuration database code 2022-03-16 14:25:32 -07:00
Lukas Joswiak c3e48fff9f
Update fdbserver/PaxosConfigConsumer.actor.cpp
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2022-03-16 08:59:12 -07:00
Lukas Joswiak 582ba5d519 Fix issue with stuck config nodes
In rare circumstances where the cluster controller dies / moves to a new
machine, sometimes only a minority of `ConfigNode`s received messages
telling them they were registered. When the `ConfigNode`s attempt to
register with the new broadcaster (on the new cluster controller), the
knob system would get stuck because only a minority would be registered.
Part of this change allows registration of unregistered `ConfigNode`s if
there is no path to a majority of registered nodes.
2022-03-15 11:42:58 -07:00
Lukas Joswiak d0da6c63c1 Rollforward out of date nodes, compaction fixes 2022-03-14 11:20:56 -07:00
Lukas Joswiak a8828db58e Load balance dynamic knob requests
This commit also removes an attempt to read the latest configuration
snapshot when a rollforward timeout occurs. The normal retry loop will
eventually fetch an up to date snapshot and the rollforward will be
retried.
2022-02-22 10:53:48 -08:00
Lukas Joswiak e8354d82bd Fix timeout issue when using >3 coordinators
The calculation to determine how many non-timeout replies had been
received was incorrect, so rollback/rollforward requests were not
sent and the dynamic knob subsystem got stuck.
2022-02-09 13:43:33 -08:00
Lukas Joswiak 7fc4f0d649 Reuse existing quorum timeout error code 2022-02-09 13:43:33 -08:00
Lukas Joswiak d5a562e6b8 Fix dynamic knobs correctness issues 2022-02-09 13:43:32 -08:00
Lukas Joswiak 30b525a607 Add assertions to check rollback 2021-10-25 12:03:22 -07:00
Lukas Joswiak c96f560cbe Verify rollback of a single version in simulation, other small fixes 2021-10-25 12:03:22 -07:00
Lukas Joswiak 6078664792 clang-format 2021-10-25 12:03:22 -07:00
Lukas Joswiak 57c2cf4a24 Retry messages to well known endpoints, add notes for future work 2021-10-25 12:03:22 -07:00
Lukas Joswiak 92998fd20b Merge rollback message into rollforward message 2021-10-25 12:03:22 -07:00
Lukas Joswiak 7357d7714c Retry with well known endpoints, move last committed check to consumer 2021-10-25 12:03:22 -07:00
Lukas Joswiak 1631a1b352 Update fdbserver/PaxosConfigConsumer.actor.cpp
Co-authored-by: Trevor Clinkenbeard <trevor.clinkenbeard@snowflake.com>
2021-10-25 12:03:22 -07:00
Lukas Joswiak e79c6c7456 Fix issue where previous commit messages were reused
Fixes an issue where commit versions from previous requests sent to
ConfigNodes were being reused when a new quorum of commit versions was
requested. This was occurring due to a failure to reset the state of
GetCommittedVersionQuorum after a full snapshot request.
2021-10-25 12:03:22 -07:00
Lukas Joswiak 9d78604c5b Add rollback and rollforward logic to ConfigBroadcaster 2021-10-25 12:03:22 -07:00
Lukas Joswiak 9a39da85b1 Fix issue where previous commit messages were reused
Fixes an issue where commit versions from previous requests sent to
ConfigNodes were being reused when a new quorum of commit versions was
requested. This was occurring due to a failure to reset the state of
GetCommittedVersionQuorum after a full snapshot request.
2021-10-25 12:03:22 -07:00
Lukas Joswiak 48dc91dd7f Add rollback and rollforward logic to ConfigBroadcaster 2021-10-25 12:03:22 -07:00
sfc-gh-tclinkenbeard b15daf1886 Added PImpl class
This class propagates the constness of methods to their pimpl
implementations
2021-08-09 10:04:34 -07:00
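For context, a plain `std::unique_ptr<T>` member does not propagate constness: its `operator->() const` returns a non-const `T*`, so a `const` method of the owning class can still mutate the implementation. The sketch below shows one way a wrapper can close that gap. It is an illustration under assumptions, not FoundationDB's actual `PImpl` class; `ConstPropagatingPImpl`, `CounterImpl`, `Counter`, and `incrementTwice` are hypothetical names.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical const-propagating pimpl holder (not FoundationDB's PImpl).
template <class T>
class ConstPropagatingPImpl {
    std::unique_ptr<T> impl;

public:
    ConstPropagatingPImpl() : impl(std::make_unique<T>()) {}

    // Non-const access yields a mutable implementation...
    T* operator->() { return impl.get(); }
    T& operator*() { return *impl; }
    // ...while const access yields a const implementation.
    T const* operator->() const { return impl.get(); }
    T const& operator*() const { return *impl; }
};

// Hypothetical implementation class hidden behind the pimpl.
struct CounterImpl {
    int value = 0;
    void increment() { ++value; }
    int get() const { return value; }
};

class Counter {
    ConstPropagatingPImpl<CounterImpl> impl;

public:
    void increment() { impl->increment(); }
    // Calling impl->increment() inside this const method would not compile.
    int get() const { return impl->get(); }
};

// Through a const wrapper, operator-> exposes only a pointer to const.
static_assert(std::is_same<decltype(std::declval<ConstPropagatingPImpl<CounterImpl> const&>().operator->()),
                           CounterImpl const*>::value,
              "const access must yield a pointer to const");

int incrementTwice() {
    Counter c;
    c.increment();
    c.increment();
    return c.get();
}
```

The two overload pairs are the whole mechanism: the compiler picks the `const` overload whenever the owning object is `const`, so const-correctness of the outer class is enforced on the implementation as well.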
sfc-gh-tclinkenbeard 9cfd6ed955 Add simple implementation to PaxosConfigConsumer 2021-07-18 17:07:10 -07:00
sfc-gh-tclinkenbeard 748a3ebfbe Add GetSnapshotAndChangesRequest type 2021-05-18 15:28:44 -07:00
sfc-gh-tclinkenbeard ea8396c9be Improve decoupling of configuration database interfaces and implementations 2021-05-17 15:31:03 -07:00
sfc-gh-tclinkenbeard 32f38394b1 Added dummy PaxosConfigConsumer implementation 2021-05-17 13:41:50 -07:00