Balachandar Namasivayam
14e54f44b3
Address review comments.
2019-07-18 12:32:35 -07:00
Balachandar Namasivayam
406bcebdc4
Ratekeeper to throttle tpsLimit to 1 if it is not able to fetch storage server list for some configurable amount of time.
2019-07-17 18:08:17 -07:00
Evan Tschannen
db5b4a6331
avoid going to unlimited immediately after going below the durabilityLagTargetVersion
2019-07-12 18:50:56 -07:00
Evan Tschannen
6e34e16699
durable version needs more smoothing because it will be updated in bursts
2019-07-12 18:50:56 -07:00
Evan Tschannen
b2b2e25324
the durabilityLagLimit needs to be tracked separately for batch priority and normal priority
2019-07-12 18:50:56 -07:00
Evan Tschannen
fef58e13a4
adding logging for durability lag in ratekeeper
2019-07-12 18:50:56 -07:00
Evan Tschannen
1a18c859c7
knobified the durability lag rate controls
2019-07-12 18:50:56 -07:00
Evan Tschannen
c5fb5494f5
a better attempt at ratekeeper control on durability lag
2019-07-12 18:50:56 -07:00
Evan Tschannen
dc171b3eae
fixed compiler error
2019-07-12 18:50:56 -07:00
Evan Tschannen
e85c05c906
experimental slow control on durability lag
2019-07-12 18:50:56 -07:00
Jingyu Zhou
50e7593c5b
Merge pull request #1796 from ajbeamon/remove-trace-event-underscores
Remove trace event underscores
2019-07-05 21:45:55 -07:00
A.J. Beamon
9f4b6fd770
Remove additional underscores
2019-07-05 08:12:25 -07:00
Alex Miller
8e1ab6e7db
Merge remote-tracking branch 'upstream/master' into flowlock-api
2019-06-28 17:32:54 -07:00
Evan Tschannen
5041ff38b1
removed unneeded description
2019-06-28 16:54:22 -07:00
Evan Tschannen
a124fc6e8a
fixed compiler error
2019-06-28 16:54:22 -07:00
Evan Tschannen
b9a6271375
local ratekeeper no longer globally limits
2019-06-28 16:54:22 -07:00
Evan Tschannen
f539b5f09a
fix: a large targetRateRatio means limiting more
2019-06-28 16:54:22 -07:00
Evan Tschannen
db413c37f7
restored the STORAGE_DURABILITY_LAG_SOFT_MAX knob and made the rk target slightly smaller than the soft limit, to avoid inaccuracies in ratekeeper control causing behavior changes on the storage servers
2019-06-28 16:54:22 -07:00
Evan Tschannen
a97940a10b
fixed compiler error
2019-06-28 16:54:22 -07:00
Evan Tschannen
92b32855ca
ratekeeper’s control algorithm would oscillate when limited by local ratekeeper
2019-06-28 16:54:22 -07:00
Alex Miller
7a500cd37f
A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and make it easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
Evan Tschannen
dccb9bc26d
fixed a number of correctness problems
2019-06-12 19:40:50 -07:00
Trevor Clinkenbeard
8144882d7b
Merge branch 'apple-master' into features/local-rk
2019-06-10 19:40:25 -07:00
A.J. Beamon
5f55f3f613
Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.
2019-05-10 14:01:52 -07:00
mpilman
bdba8e22eb
Added test and bugfixes
2019-04-08 11:05:29 -07:00
mpilman
207049e852
fixed serialization
2019-04-08 11:04:44 -07:00
mpilman
32393ec4c9
Prototype of local ratekeeper
2019-04-08 11:04:44 -07:00
A.J. Beamon
91014d4529
Add file changes that I accidentally failed to commit; fix naming issue in worker.
2019-03-27 08:41:19 -07:00
Evan Tschannen
36ab852bb1
Merge branch 'master' into ratekeeper
# Conflicts:
# fdbserver/ClusterController.actor.cpp
2019-03-22 18:41:00 -07:00
Evan Tschannen
3ced178348
maxVersionDifference is a copy of a knob which is a double
2019-03-21 12:58:48 -07:00
Jingyu Zhou
99d521ef4f
Monitor Ratekeeper and DataDistributor to use stateless processes
Since Ratekeeper and DataDistributor are no longer running with Master, they
might be running with stateful processes before a new Master becomes alive,
which is undesirable.
This PR adds monitoring of both Ratekeeper and DataDistributor at Cluster
Controller -- if Master runs on a stateless class and RK/DD runs at a worse
class, then RK/DD will be killed. I.e., RK/DD should be running at their own
classes or on the same stateless process as Master. After restart, RK/DD should
be running at a better process class.
2019-03-14 15:00:57 -07:00
Jingyu Zhou
2b0139670e
Fix review comment for PR 1176
2019-03-12 12:02:30 -07:00
Jingyu Zhou
cdfe906c30
Data distributor pulls batch limited info from proxy
Add a flag in HealthMetrics to indicate that batch priority is rate limited.
Data distributor pulls this flag from proxy to know roughly when rate limiting
happens.
DD uses this information to determine when to do the rebalance in the background,
i.e., moving data from heavily loaded servers to lighter ones. If the cluster is
currently rate limited for batch commits, then the rebalance will use longer
time intervals, otherwise use shorter intervals. See BgDDMountainChopper() and
BgDDValleyFiller() in DataDistributionQueue.actor.cpp.
2019-03-07 13:16:20 -08:00
Jingyu Zhou
f43277e819
Format Ratekeeper.actor.cpp code
2019-03-07 13:16:20 -08:00
Jingyu Zhou
dc129207a9
Minor fix after rebase.
2019-03-07 13:16:20 -08:00
Jingyu Zhou
517966fce2
Remove lastLimited from rate keeper
Refactor code to make IDE happy.
2019-03-07 13:16:20 -08:00
Jingyu Zhou
b2ee41ba33
Remove lastLimited from data distribution
Fix a serialization bug in ServerDBInfo, which causes test failures.
2019-03-07 13:16:20 -08:00
Jingyu Zhou
e6ac3f7fe8
Minor fix on ratekeeper work registration.
2019-03-07 13:16:20 -08:00
Jingyu Zhou
3c86643822
Separate Ratekeeper from data distribution.
Add a new role for ratekeeper.
Remove StorageServerChanges from data distribution.
Ratekeeper monitors storage servers, which borrows the idea from
DataDistribution.
2019-03-07 13:16:20 -08:00
Trevor Clinkenbeard
39f612d132
Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics
2019-03-02 17:07:00 -08:00
Trevor Clinkenbeard
2940b8d5fd
Update all per-process storage server health metrics at once
Ratekeeper updates all storage servers' health metrics in updateRate
with only a single map lookup
2019-03-02 16:08:28 -08:00
A.J. Beamon
655c9d82c7
Various cleanup from review
2019-03-01 14:06:47 -08:00
A.J. Beamon
93f7849261
Fix typo
2019-02-28 12:02:47 -08:00
A.J. Beamon
3e6a6a6569
Update status schema for correctness. Send the count of batch transactions started back to ratekeeper so that it can be logged with other ratekeeper metrics.
2019-02-28 12:00:58 -08:00
A.J. Beamon
eb629d87a5
Add information about batch ratekeeper to status. Make it possible to track latencies in the ReadWrite workload for concurrently run instances separately.
2019-02-28 09:53:16 -08:00
Trevor Clinkenbeard
3f59f82670
Calculate durabilityLag instead of NDV for health metrics
2019-02-27 16:30:01 -08:00
A.J. Beamon
a051055caf
Initial implementation of adding separate limits for batch priority in ratekeeper
2019-02-27 10:31:56 -08:00
Trevor Clinkenbeard
abfe057805
Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics
2019-02-25 13:47:16 -08:00
Trevor Clinkenbeard
07f800eeee
Got rid of detailed field in GetRateInfoReply message
2019-02-23 17:52:11 -08:00
Trevor Clinkenbeard
f3a73963b4
Got rid of detailedLeaseDuration in GetRateInfoReply message
2019-02-23 16:42:11 -08:00