Commit Graph

85 Commits

Author SHA1 Message Date
Chaoguang Lin 5bc2e2e595 update comments, make isAsync() virtual, remove unused code 2020-06-12 10:12:44 -07:00
Chaoguang Lin 980bee1d13 Fix fetchShardMetricsList_impl and add read cache in special key space 2020-06-12 10:12:44 -07:00
Chaoguang Lin 298b94a044 clang-format the change 2020-05-19 20:51:02 -07:00
Chaoguang Lin c230c23a3a Add test for smaller range read in dd-stats 2020-05-19 20:49:15 -07:00
Chaoguang Lin 93ca130602 Delete the unnecessary endKey field in DDMetricsRef, since each begin key is the previous end key 2020-05-19 13:32:42 -07:00
Chaoguang Lin ef724bf939 Merge remote-tracking branch 'upstream/master' into add-data-distribution-metrics 2020-05-08 18:39:28 -07:00
chaoguang e8b62e48f4 Rename DDMetrics to DDMetricsRef 2020-05-08 17:17:27 -07:00
Xin Dong 1927f15898 Now the ReadHotDetection tests can be run reliably. Added it to nightly bundle. 2020-04-28 13:26:02 -07:00
Xin Dong 76eb1c4376 Address review comments from Meng 2020-04-21 11:00:19 -07:00
Xin Dong 7dd7406c59 Merge branch 'master' into feature/hot-read-key-detection-part-2 2020-04-16 14:54:05 -07:00
A.J. Beamon b1172417f5 Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/Net2.actor.cpp
2020-04-14 14:22:12 -07:00
tclinken 247ab84323 Merge branch 'master' of https://github.com/apple/foundationdb into add-data-distribution-metrics 2020-03-23 17:01:17 -07:00
Evan Tschannen e08f0201f1 merge release 6.2 into master 2020-03-17 12:51:47 -07:00
Evan Tschannen 7adc916e18 Merge pull request #2806 from ajbeamon/improve-team-request-performance
Improve performance of get team requests.
2020-03-16 11:56:45 -07:00
Evan Tschannen 12f2b32770 added additional logging in data distribution 2020-03-13 15:19:33 -07:00
A.J. Beamon 555db50cd1 Avoid calling into SABTF so frequently. Use a cheaper call that only checks that shards exist. 2020-03-12 11:22:03 -07:00
A.J. Beamon abb75f7eb7 Add logging to indicate the time spent at each priority that exceeds some minimum busyness threshold 2020-02-07 14:34:24 -08:00
tclinken c9363e7e28 Merge branch 'master' of https://github.com/apple/foundationdb into add-data-distribution-metrics 2020-01-22 21:02:21 -08:00
Xin Dong b0a1af1288 Added the actual read hot detection algorithm and logging mechanism.
- When a shard has a read bandwidth larger than a threshold value (configurable via knob), and its read-bandwidth/byte-size ratio is also larger than a threshold (configurable via knob), the corresponding shard tracker will run the algorithm
- The algorithm will divide the shard into 10MB (configurable via knob) chunks and try to find the chunk(s) whose aforementioned ratio is large
- Those ranges will then be logged in TraceEvents. Later, this will do more, such as actually caching them.
2020-01-21 11:19:52 -08:00
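The two-stage check described in commit b0a1af1288 can be sketched roughly as follows. This is only an illustration of the idea in the commit message, not the FoundationDB implementation: the names `Chunk`, `shard_is_read_hot`, and `find_read_hot_chunks`, as well as every threshold value, are hypothetical, and the sketch assumes per-chunk size and read-bandwidth samples are already available.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    """A contiguous key range inside a shard (hypothetical structure)."""
    begin: bytes
    end: bytes
    size_bytes: int
    read_bytes_per_sec: float


def shard_is_read_hot(size_bytes: int, read_bytes_per_sec: float,
                      min_read_bandwidth: float, min_ratio: float) -> bool:
    """Stage 1: the shard must exceed both the absolute read-bandwidth
    threshold and the read-bandwidth/byte-size ratio threshold.
    Both thresholds stand in for knobs; the values are assumptions."""
    if size_bytes <= 0:
        return False
    ratio = read_bytes_per_sec / size_bytes
    return read_bytes_per_sec > min_read_bandwidth and ratio > min_ratio


def find_read_hot_chunks(chunks: List[Chunk],
                         min_ratio: float) -> List[Tuple[bytes, bytes]]:
    """Stage 2: given the shard already divided into chunks (about 10MB
    each per the commit message), return the key ranges whose
    read-bandwidth/byte-size ratio exceeds the threshold."""
    return [
        (c.begin, c.end)
        for c in chunks
        if c.size_bytes > 0 and c.read_bytes_per_sec / c.size_bytes > min_ratio
    ]
```

In this sketch a caller would run `find_read_hot_chunks` only for shards where `shard_is_read_hot` returns true, then log the returned ranges; how the real shard tracker samples chunks and emits TraceEvents is not shown in the commit message and is left out here.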
Xin Dong 1d6cd1007b Instead of using absolute value as the max bytesReadPerKSec threshold, use a pre-defined read traffic to byte size ratio to decide that value dynamically based on the actual size of the shard. 2020-01-21 11:15:52 -08:00
Evan Tschannen 3f9d9d8b84 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	cmake/FlowCommands.cmake
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/StorageServerInterface.h
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/Knobs.h
#	flow/Platform.cpp
#	versions.target
2020-01-16 18:37:47 -08:00
tclinken 1d6ac716a1 Merge remote-tracking branch 'origin' into add-data-distribution-metrics 2020-01-15 13:20:04 -08:00
Evan Tschannen fd5705a451 fixed capitalization 2020-01-15 09:35:57 -08:00
Evan Tschannen c93ca04ea6 Do not merge more than 100 shards together to avoid creating untrackable shards 2020-01-15 09:33:27 -08:00
Evan Tschannen b331c5dafe wantsToMerge was created before the shardEvaluator has a chance to update it based on shardSize changes 2020-01-10 17:23:56 -08:00
Evan Tschannen fde53cbeef HasBeenTrueFor was ready immediately after a previous shard merge 2020-01-10 16:28:56 -08:00
Evan Tschannen 9b80498180 Added a trace event to warn if a shard is merged before enough time has elapsed from becoming low bandwidth 2020-01-10 14:58:38 -08:00
Evan Tschannen e4fa4ad0c9 Data distribution will not merge a shard unless it has been low bandwidth for 5 minutes 2020-01-09 17:02:49 -08:00
Evan Tschannen 4de60fc437 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/TLogServer.actor.cpp
2019-11-01 15:48:04 -07:00
Evan Tschannen 8f0348d5e0 fix: merges which cross over systemKeys.begin did not properly decrement the systemSizeEstimate 2019-10-31 16:38:33 -07:00
Meng Xu e676348710 Merge pull request #1955 from fzhjon/mark-ss-failed
Add fdbcli and API command to mark storage servers as permanently failed
2019-10-22 23:36:30 -07:00
A.J. Beamon 29a0014b41 Fix "bandwith" typo 2019-10-22 09:51:59 -07:00
Xin Dong fca9aab17a Merge pull request #2046 from dongxinEric/feature/hot-read-key-detection
Added metrics for read hot key detection
2019-10-21 14:31:48 -07:00
Jon Fu d2b6626d5c Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-21 13:47:06 -07:00
Xin Dong 9a81948843 Accept review suggestions.
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2019-10-21 10:08:43 -07:00
Xin Dong 6a40ef25e5 Credit to Evan for pointing out the missing line which costs me weeks debugging some weird behaviors. 2019-10-18 16:46:19 -07:00
Jon Fu b1fd6b4443 addressed review comments 2019-10-18 09:43:25 -07:00
Evan Tschannen 86bcb84b45 Raised the data distribution priority of splitting shards above restoring fault tolerance to avoid hot write shards 2019-10-11 17:50:43 -07:00
Xin Dong 41aae9cbd9 Fix compiler errors 2019-10-10 13:08:59 -07:00
Xin Dong 795ce59fbb Resolved conflict with master 2019-10-09 16:45:11 -07:00
Xin Dong 62ffdd54a3 Updated some comments to reflect the correct knob value and also used a more appropriate value for read bandwidth. Set the default value for read bandwidth in some cases. 2019-10-09 16:42:42 -07:00
Xin Dong cd4757b06c Address review comments 2019-10-09 16:42:42 -07:00
Xin Dong 6b0f771cc0 Fixed a typo in knobs. Addressed some review comments. Added code for actual metric collecting. 2019-10-09 16:42:42 -07:00
Xin Dong 12293d5497 Added metrics for read hot key detection 2019-10-09 16:42:42 -07:00
A.J. Beamon 909855bcec Fix: the keys argument to changeSizes was passed as a reference, but when used after the first wait(), it may no longer be valid. 2019-10-09 14:07:48 -07:00
Jon Fu d96a7b2c69 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-03 09:47:45 -07:00
Evan Tschannen 045175bd0e added tracking for the size of the system keyspace 2019-09-27 22:39:19 -07:00
Evan Tschannen 3bb62e008c lowered the priority of some delays in data distribution so that the process will prefer other work 2019-09-27 18:33:13 -07:00
Jon Fu 00c2025d4b fixed removeKeys impl, adjusted test workload, and introduced extra safety checks to NativeAPI and proxy 2019-08-27 14:39:44 -07:00
Jon Fu 66bba51988 Implemented direct removal of failed storage server from system keyspace 2019-08-27 14:39:43 -07:00