6665 Commits

Author SHA1 Message Date
A.J. Beamon fa6e45a852 Separate AsioReactor sleep and react into two different functions. Track slow tasks and time spent in react, track time spent in launch. Don't track react time at priority 0. 2019-08-28 14:35:48 -07:00
sramamoorthy 7a9097ea01 make fdbcli --exec 'snapshot create.sh' to succeed 2019-08-27 16:44:19 -07:00
sramamoorthy 5d87443323 improved error msgs for snapshot cmd 2019-08-27 16:43:52 -07:00
sramamoorthy 64000eafb2 Fixes #2020 - snap binpath not to be passed as arg 2019-08-27 11:49:12 -07:00
A.J. Beamon e0824f4915
Merge pull request #2013 from etschannen/feature-dd-logging
Warn when different parts of shard relocations take more than 10 minutes
2019-08-27 08:55:53 -07:00
A.J. Beamon fff0d37595
Merge pull request #2019 from etschannen/feature-remote-load-balance
The Load balancing algorithm will use remote replicas when the primary is overloaded
2019-08-27 08:42:06 -07:00
Vishesh Yadav 2b941f51bd Revert "fix: Use getReply* instead of tryGetReply in `monitorProxies`"
This reverts commit e7c94a2411.
2019-08-26 18:31:08 -07:00
Vishesh Yadav e7c94a2411 fix: Use getReply* instead of tryGetReply in `monitorProxies`
`tryGetReply` is unreliable, and since `monitorProxies` expects reply
after long period, the connection to coordinator gets closed due to
idle timeout, only to get reopened again in next loop to make
`openDatabase` request.

When using `getReply` our reliable message queue won't be empty and
connection will stay open.
2019-08-26 18:24:49 -07:00
A.J. Beamon 0b1fc91a9c Revert "Don't grow the budget deficit once it's exceeded some number of seconds of transactions. Decay the deficit if the rate changes and it exceeds the new limit."
This reverts commit 90cb73d472.
2019-08-22 10:05:29 -07:00
Evan Tschannen 00424a5108 changed the rate at which the coordinators register with the cluster controller and the clients register with the coordinator so the connected client number in status will be much more accurate 2019-08-21 15:02:09 -07:00
Evan Tschannen 41b908752e increased move keys parallelism to be less of a decrease just in case lowering this could affect normal data distribution
raised target durability lag versions to give more time for batch limiting to come into play before this limit is hit
changed max_bad_options to better reflect the name
2019-08-21 14:55:21 -07:00
Evan Tschannen 0b0c9fe0ff data distribution status was combined into regular status 2019-08-21 14:44:15 -07:00
Evan Tschannen ac68c8e4fd added source servers to the warning message 2019-08-21 11:48:29 -07:00
Andrew Noyes 692df4d7f5 Update CMakeLists.txt
Co-Authored-By: Alex Miller <35046903+alexmiller-apple@users.noreply.github.com>
2019-08-20 15:54:51 -07:00
Andrew Noyes cd7acb50fb Update cmake version 2019-08-20 15:54:51 -07:00
A.J. Beamon b7a4540a35
Merge pull request #2006 from ajbeamon/add-coordinator-to-status-roles-list
Add 'coordinator' to the list of roles that a process can have in status
2019-08-20 11:24:12 -07:00
A.J. Beamon 2b80d836f4 Merge branch 'release-6.2' into add-coordinator-to-status-roles-list
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2019-08-19 15:03:59 -07:00
A.J. Beamon 1ae01cdab1 Merge branch 'release-6.2' into fix-proxy-grv-budgeting
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2019-08-19 15:00:25 -07:00
A.J. Beamon 90cb73d472 Don't grow the budget deficit once it's exceeded some number of seconds of transactions. Decay the deficit if the rate changes and it exceeds the new limit. 2019-08-19 14:56:59 -07:00
Evan Tschannen 3965a959b4
Merge pull request #2017 from ajbeamon/fix-fileconfigure-error-cases
Fix some validation logic when using the fdbcli fileconfigure command
2019-08-19 14:49:51 -07:00
Bhaskar Muppana 62ab3fa70b
Merge pull request #2018 from ajbeamon/add-loggroup-help-text
Add --loggroup to fdbserver and fdbbackup help text.
2019-08-19 14:33:22 -07:00
Evan Tschannen 1f2499c74f
Merge pull request #2012 from ajbeamon/rk-durability-lag-considers-mvcc-window
Ratekeeper ignores intentionally non-durable versions on the SS for durability lag computations
2019-08-19 14:24:21 -07:00
Evan Tschannen 2bd59d1055
Merge pull request #2003 from ajbeamon/add-rk-durability-lag-to-status
Add ratekeeper's durability lag statistics to status
2019-08-19 14:19:59 -07:00
Evan Tschannen f65eb0e9d4
Merge pull request #2002 from ajbeamon/remove-unused-variables
Remove unused local rate limit variables in ratekeeper.
2019-08-19 14:17:05 -07:00
Evan Tschannen 54282288cb disabled zone_id load balancing, because it can cause hot read shards 2019-08-19 14:04:21 -07:00
Evan Tschannen 37e2fc86de Increase the target durability lag versions to be larger than the soft max, so that storage servers will respond with a penalty to clients before ratekeeper controls on the lag 2019-08-19 14:03:42 -07:00
Evan Tschannen 9318b494ad reduce the DD move keys parallelism to avoid a hot read shard when transitioning from triple replication to double replication 2019-08-19 14:02:18 -07:00
Evan Tschannen 51cedd24c8 load balance will send reads to remote servers if more than one alternative is failed or overloaded 2019-08-19 13:59:49 -07:00
A.J. Beamon f02799455e Add --loggroup to fdbserver and fdbbackup help text. 2019-08-19 12:59:14 -07:00
A.J. Beamon eeadbaf1f6
Update release-notes.rst 2019-08-19 11:41:15 -07:00
A.J. Beamon ba0941ec4c
Update release-notes.rst 2019-08-19 11:40:11 -07:00
A.J. Beamon 7953545331 Fix an unknown_error when the file passed to fileconfigure doesn't contain a valid object (e.g. if you omit the enclosing {} of your object).
Fix an internal error when configuring regions with some storage servers that don't have a datacenter set.
2019-08-19 11:28:15 -07:00
Evan Tschannen 2a436d5f6f fix: do not block fdbcli from starting if DataDistributionStatus is not available 2019-08-16 18:15:02 -07:00
Evan Tschannen d8ab48ce7f added a sleep command to fdbcli 2019-08-16 18:13:35 -07:00
A.J. Beamon 0815b09629 Reorganize section based on review feedback 2019-08-16 16:05:20 -07:00
A.J. Beamon a4c3a435e0 Add documentation for missing fdbmonitor [general] parameters. 2019-08-16 16:05:20 -07:00
Evan Tschannen d30d4cb955 Added a duration to regular relocateShard trace events 2019-08-16 15:15:36 -07:00
Evan Tschannen 297b65236f added additional trace events to warn when different parts of shard relocations take more than 10 minutes 2019-08-16 14:56:58 -07:00
A.J. Beamon 0bd74d55a5 Update release notes. 2019-08-16 14:50:46 -07:00
A.J. Beamon ac2f310104 Ratekeeper ignores intentionally non-durable versions on the SS for durability lag computations. 2019-08-16 14:46:44 -07:00
A.J. Beamon a148ddc7d5 Fix spacing 2019-08-15 14:45:36 -07:00
A.J. Beamon dc534aea1a Fix spacing 2019-08-15 14:44:39 -07:00
A.J. Beamon bc54ee5a53 Add release note 2019-08-15 14:43:54 -07:00
A.J. Beamon b8e57f37d7 Add 'coordinator' to the list of roles that a process can have in status. 2019-08-15 14:42:49 -07:00
A.J. Beamon bb72cdd36a Report lag with the usual "seconds" and "versions" fields. Rename and deprecate the qos.*version_lag_storage_server fields. 2019-08-15 13:42:39 -07:00
A.J. Beamon 02ba73917b Add release note. 2019-08-15 11:08:08 -07:00
A.J. Beamon 6581161dd3 Add ratekeeper's durability lag statistics to status 2019-08-15 11:07:04 -07:00
A.J. Beamon f6ba8509ae Remove unused local rate limit variables in ratekeeper. 2019-08-15 10:08:28 -07:00
A.J. Beamon b2af17fb08 Simplify logic by removing an unneeded condition. 2019-08-15 08:23:13 -07:00
A.J. Beamon 155bed9762 Add release note. 2019-08-14 15:05:37 -07:00