Commit Graph

87 Commits

Author SHA1 Message Date
A.J. Beamon cdebda35ab
Merge pull request #5725 from sfc-gh-jfu/jfu-grv-cache
Add transaction option for clients to use cached read versions
2022-03-04 09:17:27 -08:00
A.J. Beamon 250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Jon Fu ae071a7211 clean up debug trace lines 2022-02-22 17:16:11 -05:00
Jon Fu e0d3b0a488 format times in trace event 2022-02-16 16:55:14 -05:00
Jon Fu 6199faadc3 fix bug which constantly overwrote start time of last throttled queue 2022-02-16 16:52:38 -05:00
Jon Fu 5846dda410 temporary changes and extra traces for debugging 2022-02-16 15:05:25 -05:00
Jon Fu 6f1c3d50bb add debug traces for testing 2022-02-15 15:08:53 -05:00
Jon Fu 8129c4e21c simplify sidebandsingle workload and be stricter with batch throttling on rk 2022-02-14 13:58:56 -05:00
Jon Fu a63d218e9d simplify test workload and adjust ratekeeper throttling strategy 2022-02-11 16:41:14 -05:00
Jon Fu 458e708272 addressed code review comments: renamed variables, small functional changes, style changes 2022-02-10 16:17:54 -05:00
Jon Fu ec2bbf0343 clean up some more trace lines and leftover code snippets 2022-02-07 14:50:04 -05:00
Jon Fu d8e7fea421 clean up some comments and debug changes 2022-02-02 14:03:32 -05:00
Jon Fu 915e2f6c1c Merge branch 'main' of github.com:apple/foundationdb into jfu-grv-cache 2022-01-20 16:17:20 -05:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to track ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
   prefix. For now it stays "Master", but the knob should allow a
   smooth transition to the more appropriate "Cluster" prefix.
2022-01-06 12:15:51 -08:00
Aaron Molitor 30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff.
2021-12-24 11:25:51 -08:00
Aaron Molitor d174bb2e06 Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit abd2959702.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra abd2959702 Refactor: ClusterController driving cluster-recovery state machine
diff-1: Address Jingyu's review comments

At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine,
   which is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. A simplified design in which the ClusterController recruits the
   "sequencer" process like any other worker process, whereas in the
   current scheme the "sequencer" process gets special treatment. In
   the new scheme the sequencer is responsible for maintaining/providing
   the "committed version" (as expected).
2. The ClusterController is already responsible for recruiting worker
   processes. In the current scheme the sequencer orchestrates the
   recovery state machine, yet it still needs to reach out to the
   ClusterController to recruit worker processes etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; along the
way, necessary updates were made for both functional and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Ata E Husain Bohra dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine,
   which is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. A simplified design in which the ClusterController recruits the
   "sequencer" process like any other worker process, whereas in the
   current scheme the "sequencer" process gets special treatment. In
   the new scheme the sequencer is responsible for maintaining/providing
   the "committed version" (as expected).
2. The ClusterController is already responsible for recruiting worker
   processes. In the current scheme the sequencer orchestrates the
   recovery state machine, yet it still needs to reach out to the
   ClusterController to recruit worker processes etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; along the
way, necessary updates were made for both functional and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Jon Fu 2b0ade5250 Change throttling threshold to count loop iterations instead of time 2021-11-25 13:31:55 -05:00
Jon Fu beae9ccfa1 tweak some knob default settings and trace formatting 2021-11-23 14:58:17 -05:00
Jon Fu 33ee5fa372 add tracing to proxy throttling check codepath 2021-11-22 13:00:53 -05:00
Jon Fu 3f24128da4 Merge branch 'master' of github.com:apple/foundationdb into jfu-grv-cache 2021-11-19 14:46:55 -05:00
Jon Fu e9c58d9f86 Check for sustained throttling in the proxy to lower threshold time and avoid false positives 2021-11-19 14:33:06 -05:00
Lukas Joswiak 28b72550f3 Remove additional unused tracing 2021-11-10 13:33:49 -08:00
Lukas Joswiak c93052121f Fix issue where transaction spans would not be recorded 2021-11-10 13:33:49 -08:00
Jon Fu 6d74239760 Track throttling by measuring time spent left in queue on the proxy 2021-10-22 13:55:01 -04:00
Jon Fu f1c8d3fbc8 Add code to disable cache when ratekeeper begins throttling 2021-10-20 15:52:43 -04:00
Jon Fu 44a854772f Merge branch 'master' of github.com:apple/foundationdb into jfu-grv-cache 2021-10-05 12:55:02 -04:00
Jon Fu d560eb1fea debug time bounds using sim_validation 2021-10-04 14:12:31 -04:00
Xiaoge Su abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
FDB Formatster 2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
Lukas Joswiak a605fb3852
Merge pull request #5026 from sfc-gh-ljoswiak/fixes/alp6
Actor sampling
2021-08-11 13:44:17 -07:00
Yao Xiao f3baedd27f Apply suggestions. 2021-08-04 13:51:56 -07:00
Lukas Joswiak 5dc9a97230 Merge branch 'master' into fixes/alp6 2021-08-01 20:42:52 -07:00
Yao Xiao 82be496ba3 Updated grvRawDist to grvGetCommittedVersionRpcDist. 2021-07-30 18:01:27 -07:00
Yao Xiao 74a7da0179 Add histogram in GrvProxyServer. 2021-07-30 17:54:51 -07:00
sfc-gh-tclinkenbeard c74047c665 Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings 2021-07-28 11:51:02 -07:00
Lukas Joswiak 3eed4084e2 Merge branch 'master' into fixes/alp6 2021-07-27 11:26:53 -07:00
Lukas Joswiak 59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
Steve Atherton 507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard 64dc1dc185 Fix -Wreorder-ctor warnings in NativeAPI.actor.cpp and several other files 2021-07-24 00:23:06 -07:00
sfc-gh-tclinkenbeard 6f81155784 Merge remote-tracking branch 'origin/master' into const-serverdbinfo 2021-07-20 10:18:40 -07:00
Steve Atherton f596a81073 Rename ::TRUE and ::FALSE in BooleanParams to ::True and ::False so as to not conflict with the TRUE and FALSE macros provided by the Windows and MacOS SDKs. 2021-07-17 00:11:40 -07:00
sfc-gh-tclinkenbeard b2bbdf0d7f Prevent grvProxyServer from modifying ServerDBInfo object 2021-07-11 23:29:36 -07:00
sfc-gh-tclinkenbeard 79ff07a071 Added *BOOLEAN_PARAM macros to enforce documentation of boolean parameters 2021-07-02 15:04:42 -07:00
Renxuan Wang 00599069d2 Add a few ProxyStats fields.
1. Add the count of # of getRateInfo() and # of leaseTimeout;
2. Add SystemGRVQueueSize, DefaultGRVQueueSize and BatchGRVQueueSize.
2021-06-22 11:45:58 -07:00
RenxuanW fe936207a9 Replace lower priority txn request when limit is hit. 2021-06-15 14:00:06 -07:00
Lukas Joswiak 4eca095644 Remove scoped lineage 2021-06-15 11:08:57 -07:00
RenxuanW f19d256e0d Bug fix: grvLatencyBands should take "GRVLatencyBands" as name. 2021-06-14 17:13:22 -07:00
RenxuanW 29cb735881 Fix batch txn throttling. 2021-06-09 12:51:44 -07:00