Chaoguang Lin
5bc2e2e595
update comments, make isAsync() virtual, remove unused code
2020-06-12 10:12:44 -07:00
Chaoguang Lin
980bee1d13
Fix fetchShardMetricsList_impl and add read cache in special key space
2020-06-12 10:12:44 -07:00
Chaoguang Lin
298b94a044
clang-format the change
2020-05-19 20:51:02 -07:00
Chaoguang Lin
c230c23a3a
Add test for smaller range read in dd-stats
2020-05-19 20:49:15 -07:00
Chaoguang Lin
93ca130602
Delete the unnecessary endKey field in DDMetricsRef, since each begin key is the previous end key
2020-05-19 13:32:42 -07:00
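The reasoning in the commit above (each begin key is the previous end key) can be illustrated with a minimal sketch. This is not the FoundationDB code: the function name and the `final_end` parameter are hypothetical, and the log does not say how the end of the last range is determined, so it is passed in explicitly here.

```python
from typing import List, Tuple


def ranges_from_begin_keys(
    begin_keys: List[bytes], final_end: bytes
) -> List[Tuple[bytes, bytes]]:
    """Rebuild (begin, end) ranges from a sorted list of begin keys.

    Hypothetical helper: each range ends where the next one begins, so only
    the end of the last range (final_end, an assumption) must be supplied.
    """
    ends = begin_keys[1:] + [final_end]
    return list(zip(begin_keys, ends))
```

With contiguous shards, storing an end key per entry would only duplicate the next entry's begin key, which is why the field can be dropped.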
Chaoguang Lin
ef724bf939
Merge remote-tracking branch 'upstream/master' into add-data-distribution-metrics
2020-05-08 18:39:28 -07:00
chaoguang
e8b62e48f4
Rename DDMetrics to DDMetricsRef
2020-05-08 17:17:27 -07:00
Xin Dong
1927f15898
Now the ReadHotDetection tests can be run reliably. Added it to nightly bundle.
2020-04-28 13:26:02 -07:00
Xin Dong
76eb1c4376
Address review comments from Meng
2020-04-21 11:00:19 -07:00
Xin Dong
7dd7406c59
Merge branch 'master' into feature/hot-read-key-detection-part-2
2020-04-16 14:54:05 -07:00
A.J. Beamon
b1172417f5
Merge branch 'master' into per-priority-busy-logging
...
# Conflicts:
# flow/Knobs.cpp
# flow/Knobs.h
# flow/Net2.actor.cpp
2020-04-14 14:22:12 -07:00
tclinken
247ab84323
Merge branch 'master' of https://github.com/apple/foundationdb into add-data-distribution-metrics
2020-03-23 17:01:17 -07:00
Evan Tschannen
e08f0201f1
merge release 6.2 into master
2020-03-17 12:51:47 -07:00
Evan Tschannen
7adc916e18
Merge pull request #2806 from ajbeamon/improve-team-request-performance
...
Improve performance of get team requests.
2020-03-16 11:56:45 -07:00
Evan Tschannen
12f2b32770
added additional logging in data distribution
2020-03-13 15:19:33 -07:00
A.J. Beamon
555db50cd1
Avoid calling into SABTF so frequently. Use a cheaper call that only checks that shards exist.
2020-03-12 11:22:03 -07:00
A.J. Beamon
abb75f7eb7
Add logging to indicate the time spent at each priority that exceeds some minimum busyness threshold
2020-02-07 14:34:24 -08:00
tclinken
c9363e7e28
Merge branch 'master' of https://github.com/apple/foundationdb into add-data-distribution-metrics
2020-01-22 21:02:21 -08:00
Xin Dong
b0a1af1288
Added the actual read hot detection algorithm and logging mechanism.
...
- When a shard's read bandwidth exceeds a threshold (configurable via knob) and its read-bandwidth/byte-size ratio also exceeds a threshold (configurable via knob), the corresponding shard tracker runs the algorithm.
- The algorithm divides the shard into 10MB (configurable via knob) chunks and tries to find the chunk(s) with a large read-bandwidth/byte-size ratio.
- Those ranges are then logged in TraceEvents. Later work will do more with them, such as actually caching them.
2020-01-21 11:19:52 -08:00
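The detection steps described in the commit above can be sketched roughly as follows. This is a simplified illustration, not the FoundationDB implementation: the `Chunk` type, the function names, and the parameter names are all hypothetical, and reusing a single ratio threshold for both the shard check and the chunk check is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical per-chunk sample: key range, size, and read bandwidth."""
    begin: bytes
    end: bytes
    size_bytes: int
    read_bytes_per_sec: float


def shard_is_read_hot(
    read_bytes_per_sec: float, size_bytes: int, min_bandwidth: float, min_ratio: float
) -> bool:
    """Both conditions must hold: high read bandwidth and a high bandwidth/size ratio."""
    if size_bytes <= 0:
        return False
    return (
        read_bytes_per_sec > min_bandwidth
        and read_bytes_per_sec / size_bytes > min_ratio
    )


def find_hot_chunks(chunks: List[Chunk], min_ratio: float) -> List[Chunk]:
    """Return the chunks whose read-bandwidth/byte-size ratio exceeds min_ratio."""
    return [
        c
        for c in chunks
        if c.size_bytes > 0 and c.read_bytes_per_sec / c.size_bytes > min_ratio
    ]
```

In this sketch the caller would run `find_hot_chunks` only for shards where `shard_is_read_hot` returns true, then log the returned ranges.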
Xin Dong
1d6cd1007b
Instead of using absolute value as the max bytesReadPerKSec threshold, use a pre-defined read traffic to byte size ratio to decide that value dynamically based on the actual size of the shard.
2020-01-21 11:15:52 -08:00
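The change described in the commit above amounts to scaling the threshold with shard size. A minimal sketch, assuming the threshold is simply the predefined ratio multiplied by the shard's byte size (the function and parameter names are hypothetical, and the real code may apply additional bounds):

```python
def max_bytes_read_per_ksec(shard_size_bytes: int, read_to_size_ratio: float) -> float:
    """Derive the read threshold from the shard's actual size.

    Assumption: threshold = ratio * size, with no floor or ceiling applied.
    """
    return read_to_size_ratio * shard_size_bytes
```

Under this assumption a larger shard tolerates proportionally more read traffic before it is treated as read hot.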
Evan Tschannen
3f9d9d8b84
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# cmake/FlowCommands.cmake
# documentation/sphinx/source/release-notes.rst
# fdbclient/StorageServerInterface.h
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# flow/Knobs.h
# flow/Platform.cpp
# versions.target
2020-01-16 18:37:47 -08:00
tclinken
1d6ac716a1
Merge remote-tracking branch 'origin' into add-data-distribution-metrics
2020-01-15 13:20:04 -08:00
Evan Tschannen
fd5705a451
fixed capitalization
2020-01-15 09:35:57 -08:00
Evan Tschannen
c93ca04ea6
Do not merge more than 100 shards together to avoid creating untrackable shards
2020-01-15 09:33:27 -08:00
Evan Tschannen
b331c5dafe
wantsToMerge was created before the shardEvaluator has a chance to update it based on shardSize changes
2020-01-10 17:23:56 -08:00
Evan Tschannen
fde53cbeef
HasBeenTrueFor was ready immediately after a previous shard merge
2020-01-10 16:28:56 -08:00
Evan Tschannen
9b80498180
Added a trace event to warn if a shard is merged before enough time has elapsed since it became low bandwidth
2020-01-10 14:58:38 -08:00
Evan Tschannen
e4fa4ad0c9
Data distribution will not merge a shard unless it has been low bandwidth for 5 minutes
2020-01-09 17:02:49 -08:00
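The rule in the commit above (merge only after the shard has been low bandwidth for 5 minutes) can be sketched with a small timer. This is an illustrative Python stand-in, not the FoundationDB class: the name echoes the HasBeenTrueFor mentioned elsewhere in this log, but the methods and the explicit `now` argument are assumptions made to keep the example testable.

```python
from typing import Optional

LOW_BANDWIDTH_MERGE_DELAY_SECONDS = 5 * 60  # the 5 minutes from the commit message


class HasBeenTrueFor:
    """Tracks how long a condition has been continuously true (hypothetical sketch)."""

    def __init__(self) -> None:
        self._true_since: Optional[float] = None

    def set(self, value: bool, now: float) -> None:
        """Record the current value; any false observation resets the timer."""
        if value:
            if self._true_since is None:
                self._true_since = now
        else:
            self._true_since = None

    def has_been_true_for(self, duration: float, now: float) -> bool:
        """True only if the condition has held continuously for at least duration."""
        return self._true_since is not None and now - self._true_since >= duration
```

A tracker following this sketch would call `set(is_low_bandwidth, now)` on each metrics update and allow a merge only when `has_been_true_for(LOW_BANDWIDTH_MERGE_DELAY_SECONDS, now)` is true.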
Evan Tschannen
4de60fc437
Merge branch 'release-6.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbserver/TLogServer.actor.cpp
2019-11-01 15:48:04 -07:00
Evan Tschannen
8f0348d5e0
fix: merges which cross over systemKeys.begin did not properly decrement the systemSizeEstimate
2019-10-31 16:38:33 -07:00
Meng Xu
e676348710
Merge pull request #1955 from fzhjon/mark-ss-failed
...
Add fdbcli and API command to mark storage servers as permanently failed
2019-10-22 23:36:30 -07:00
A.J. Beamon
29a0014b41
Fix "bandwith" typo
2019-10-22 09:51:59 -07:00
Xin Dong
fca9aab17a
Merge pull request #2046 from dongxinEric/feature/hot-read-key-detection
...
Added metrics for read hot key detection
2019-10-21 14:31:48 -07:00
Jon Fu
d2b6626d5c
Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed
2019-10-21 13:47:06 -07:00
Xin Dong
9a81948843
Accept review suggestions.
...
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2019-10-21 10:08:43 -07:00
Xin Dong
6a40ef25e5
Credit to Evan for pointing out the missing line, which cost me weeks of debugging some weird behaviors.
2019-10-18 16:46:19 -07:00
Jon Fu
b1fd6b4443
addressed review comments
2019-10-18 09:43:25 -07:00
Evan Tschannen
86bcb84b45
Raised the data distribution priority of splitting shards above restoring fault tolerance to avoid hot write shards
2019-10-11 17:50:43 -07:00
Xin Dong
41aae9cbd9
Fix compiler errors
2019-10-10 13:08:59 -07:00
Xin Dong
795ce59fbb
Resolved conflict with master
2019-10-09 16:45:11 -07:00
Xin Dong
62ffdd54a3
Updated some comments to reflect the correct knob value and also used a more appropriate value for read bandwidth. Set the default value for read bandwidth in some cases.
2019-10-09 16:42:42 -07:00
Xin Dong
cd4757b06c
Address review comments
2019-10-09 16:42:42 -07:00
Xin Dong
6b0f771cc0
Fixed a typo in knobs. Addressed some review comments. Added code for actual metric collecting.
2019-10-09 16:42:42 -07:00
Xin Dong
12293d5497
Added metrics for read hot key detection
2019-10-09 16:42:42 -07:00
A.J. Beamon
909855bcec
Fix: the keys argument to changeSizes was passed as a reference, but when used after the first wait(), it may no longer be valid.
2019-10-09 14:07:48 -07:00
Jon Fu
d96a7b2c69
Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed
2019-10-03 09:47:45 -07:00
Evan Tschannen
045175bd0e
added tracking for the size of the system keyspace
2019-09-27 22:39:19 -07:00
Evan Tschannen
3bb62e008c
lowered the priority of some delays in data distribution so that the process will prefer other work
2019-09-27 18:33:13 -07:00
Jon Fu
00c2025d4b
fixed removeKeys impl, adjusted test workload, and introduced extra safety checks to NativeAPI and proxy
2019-08-27 14:39:44 -07:00
Jon Fu
66bba51988
Implemented direct removal of failed storage server from system keyspace
2019-08-27 14:39:43 -07:00