Commit Graph

75 Commits

Author SHA1 Message Date
Dan Lambright 9544379cdf rebase 2022-01-20 11:12:33 -05:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to use the ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
   prefix; for now it is kept as "Master", but the knob should allow a
   smooth transition to the "Cluster" prefix, which seems more appropriate.
2022-01-06 12:15:51 -08:00
Aaron Molitor 30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff.
2021-12-24 11:25:51 -08:00
Aaron Molitor d174bb2e06 Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit abd2959702.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra abd2959702 Refactor: ClusterController driving cluster-recovery state machine
diff-1: Address Jingyu's review comments

At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine
   and is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits the "sequencer"
   process like other worker processes, compared to the current scheme
   where the "sequencer" process gets special treatment. In the newer
   scheme the sequencer is responsible for maintaining/providing the
   "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes, etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; however,
necessary updates were made for both functionality and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Ata E Husain Bohra dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine
   and is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits the "sequencer"
   process like other worker processes, compared to the current scheme
   where the "sequencer" process gets special treatment. In the newer
   scheme the sequencer is responsible for maintaining/providing the
   "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes, etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; however,
necessary updates were made for both functionality and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Lukas Joswiak 28b72550f3 Remove additional unused tracing 2021-11-10 13:33:49 -08:00
Lukas Joswiak c93052121f Fix issue where transaction spans would not be recorded 2021-11-10 13:33:49 -08:00
Dan Lambright 46044f544e Add todo w.r.t. adding vv delta sizes to json output 2021-10-15 14:42:55 -04:00
Dan Lambright dbbda24879 Address review comments 2021-10-15 12:17:39 -04:00
Dan Lambright 212e32a600 stats on version vector size 2021-10-15 11:39:35 -04:00
Sreenath Bodagala 2aa3b44d4e Merge remote-tracking branch 'apple-upstream/master' into version-vector-prototype
- Conflicts:
	fdbserver/LogSystem.h
	fdbserver/LogSystemConfig.h
	fdbserver/TagPartitionedLogSystem.actor.cpp

- Files modified during merge:

modified:   fdbserver/LogSystem.cpp
modified:   fdbserver/LogSystemConfig.cpp
2021-09-17 19:36:18 +00:00
Xiaoge Su abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Dan Lambright 8689e1f106 merge with master 2021-08-30 15:29:08 -04:00
FDB Formatster 2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
Lukas Joswiak a605fb3852
Merge pull request #5026 from sfc-gh-ljoswiak/fixes/alp6
Actor sampling
2021-08-11 13:44:17 -07:00
Sreenath Bodagala a081c0baa5 Merge remote-tracking branch 'apple-upstream/master' into version-vector-prototype 2021-08-05 22:40:32 +00:00
Yao Xiao f3baedd27f Apply suggestions. 2021-08-04 13:51:56 -07:00
Lukas Joswiak 5dc9a97230 Merge branch 'master' into fixes/alp6 2021-08-01 20:42:52 -07:00
Yao Xiao 82be496ba3 Updated grvRawDist to grvGetCommittedVersionRpcDist. 2021-07-30 18:01:27 -07:00
Yao Xiao 74a7da0179 Add histogram in GrvProxyServer. 2021-07-30 17:54:51 -07:00
sfc-gh-tclinkenbeard c74047c665 Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings 2021-07-28 11:51:02 -07:00
Lukas Joswiak 3eed4084e2 Merge branch 'master' into fixes/alp6 2021-07-27 11:26:53 -07:00
Lukas Joswiak 59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
Steve Atherton 507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard 64dc1dc185 Fix -Wreorder-ctor warnings in NativeAPI.actor.cpp and several other files 2021-07-24 00:23:06 -07:00
sfc-gh-tclinkenbeard 6f81155784 Merge remote-tracking branch 'origin/master' into const-serverdbinfo 2021-07-20 10:18:40 -07:00
Steve Atherton f596a81073 Rename ::TRUE and ::FALSE in BooleanParams to ::True and ::False so as to not conflict with the TRUE and FALSE macros provided by the Windows and MacOS SDKs. 2021-07-17 00:11:40 -07:00
Sreenath Bodagala 81001edb2e - Propagate version vector deltas between processes 2021-07-15 19:49:20 +00:00
Sreenath Bodagala b6f89df060 - Propagate the latest commit version of a storage server as part of read request.
Make storage server read at the specified version.
2021-07-15 19:48:42 +00:00
sfc-gh-tclinkenbeard b2bbdf0d7f Prevent grvProxyServer from modifying ServerDBInfo object 2021-07-11 23:29:36 -07:00
Sreenath Bodagala baceacacf4 - Maintain a cache of the version vector on both the Grv proxy and
the client. Update these caches on the (transaction) read path.
2021-07-07 19:35:26 +00:00
sfc-gh-tclinkenbeard 79ff07a071 Added *BOOLEAN_PARAM macros to enforce documentation of boolean parameters 2021-07-02 15:04:42 -07:00
Renxuan Wang 00599069d2 Add a few ProxyStats fields.
1. Add the count of # of getRateInfo() and # of leaseTimeout;
2. Add SystemGRVQueueSize, DefaultGRVQueueSize and BatchGRVQueueSize.
2021-06-22 11:45:58 -07:00
RenxuanW fe936207a9 Replace lower priority txn request when limit is hit. 2021-06-15 14:00:06 -07:00
Lukas Joswiak 4eca095644 Remove scoped lineage 2021-06-15 11:08:57 -07:00
RenxuanW f19d256e0d Bug fix: grvLatencyBands should take "GRVLatencyBands" as name. 2021-06-14 17:13:22 -07:00
RenxuanW 29cb735881 Fix batch txn throttling. 2021-06-09 12:51:44 -07:00
Lukas Joswiak 486a04659f Lazy initialization 2021-06-04 15:01:18 -07:00
Lukas Joswiak 153de33f57 Revert "Merge pull request #4802 from sfc-gh-ljoswiak/revert/actor-lineage"
This reverts commit 6499fa178e, reversing
changes made to 1512631957.
2021-06-04 13:31:55 -07:00
Andrew Noyes ce25a99000 Disallow conversion from float in specialCounter 2021-06-04 12:09:13 -07:00
Andrew Noyes 6992f5814b Avoid casting NaN to int64_t
If the queue is empty, consider the queue to be 100% processed
2021-06-04 09:44:49 -07:00
Lukas Joswiak 4ea760b2a9 Revert "Merge pull request #4136 from sfc-gh-mpilman/features/actor-lineage"
This reverts commit da41534618, reversing
changes made to e6300905d6.
2021-05-10 20:26:12 -07:00
Markus Pilman 54919d4f3b Merge remote-tracking branch 'sfc/features/actor-lineage' into features/actor-lineage 2021-04-28 09:22:14 -06:00
Dan Lambright 715c98572c bit more documentation 2021-04-21 10:48:35 -04:00
Markus Pilman d76b32da18 Annotate read paths on the server side 2021-04-20 15:10:01 -06:00
Dan Lambright a95e845200 document changes 2021-04-06 14:56:58 -04:00
Dan Lambright 7faca702d2 Fix bug in writes to json objects. 2021-04-06 13:05:09 -04:00
Dan Lambright cabf192f57 Respond to review comments 3/23 2021-04-06 13:05:09 -04:00
Dan Lambright 48a475366c Log latency metrics for batch GRV requests 2021-04-06 13:05:09 -04:00