Evan Tschannen
f298436db7
made the style consistent
2020-05-09 15:17:22 -07:00
Evan Tschannen
d497910fa2
fixed merge conflict
2020-05-09 13:34:51 -07:00
Evan Tschannen
69affebe40
merge master
2020-05-09 13:29:18 -07:00
Evan Tschannen
2dfae85dc7
the delay for reads is about 15% of the total cost of the read, so start multiple reads with the same delay
2020-05-09 13:26:38 -07:00
Evan Tschannen
a4fc593a3a
refactored some actors in the storage server to improve performance
2020-05-08 21:08:52 -07:00
Markus Pilman
5f9b127e56
Emit traces regularly about role assignment
...
We are currently emitting Role transition traces when a role starts and
when it ends. While this is useful for debugging, it doesn't work well
with tools that inject data and might potentially miss some trace lines.
We do decorate each trace line with the roles assigned to that
particular process, however, this is not sufficient for tools that can
make use of the UID -> Role mapping.
2020-05-08 16:27:57 -07:00
negoyal
749fcd13b0
Merge branch 'master' into fdb_cache_wo_allocator
2020-05-08 16:23:29 -07:00
negoyal
116e186af6
Add code to fetch cached ranges when a cache server starts up.
2020-05-06 18:56:42 -07:00
A.J. Beamon
b1055a8501
Merge branch 'master' into transaction-tagging
2020-05-05 16:03:39 -07:00
Evan Tschannen
f329164fb4
Merge pull request #2532 from dongxinEric/feature/hot-read-key-detection-part-2
...
Feature/hot read key detection part 2
2020-05-05 14:33:34 -07:00
negoyal
dd033736ed
Merge branch 'master' into fdb_cache_subfeature2
2020-05-04 17:29:43 -07:00
Evan Tschannen
b68980d686
Merge pull request #2028 from negoyal/cache_storageq_results
...
Cache storageq results
2020-05-04 11:27:02 -07:00
A.J. Beamon
36454bb3b8
Merge branch 'master' into transaction-tagging
...
# Conflicts:
# fdbclient/MasterProxyInterface.h
# fdbclient/NativeAPI.actor.cpp
2020-05-04 10:23:25 -07:00
A.J. Beamon
bb3d4b6b89
Add a bunch of TEST macros and some other little things
2020-05-04 10:11:36 -07:00
negoyal
4fa04c5891
Fixed a corruption bug.
2020-05-02 09:22:15 -07:00
Evan Tschannen
aed2d34bcb
Merge branch 'master' into feature-proxy-load-balance
...
# Conflicts:
# fdbclient/NativeAPI.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# flow/Knobs.cpp
2020-05-01 09:19:39 -07:00
A.J. Beamon
41c517a5dd
Merge branch 'master' into transaction-tagging
...
# Conflicts:
# fdbclient/NativeAPI.actor.cpp
2020-04-27 13:05:24 -07:00
A.J. Beamon
9bf5c06d15
Adjust and knobify cost function for ops on the storage server
2020-04-22 14:39:32 -07:00
negoyal
2fa7d485f5
Merge branch 'master' into cache_storageq_results
2020-04-21 17:28:17 -07:00
A.J. Beamon
6619a1a36a
Rename transaction tag map.
2020-04-17 09:06:45 -07:00
Xin Dong
7dd7406c59
Merge branch 'master' into feature/hot-read-key-detection-part-2
2020-04-16 14:54:05 -07:00
A.J. Beamon
0fba8c47be
Checkpoint: Ratekeeper sets absolute limits for tag throttles and enforces them by distributing requests to proxies, who distribute them to clients.
...
A few refactorings.
2020-04-16 14:43:22 -07:00
negoyal
7a9bcf8222
Review comments.
2020-04-14 17:45:34 -07:00
Neelam Goyal
cd4f3f84b2
Update fdbserver/storageserver.actor.cpp
...
Co-Authored-By: Trevor Clinkenbeard <trevorclinkenbeard@gmail.com>
2020-04-13 12:19:17 -07:00
A.J. Beamon
6508c891fc
Make the TagSet sent to the storage servers optional so we can distinguish no tags from unsampled.
2020-04-10 13:29:28 -07:00
A.J. Beamon
29b2c2f3aa
Add hash to StringRef. Use unordered maps for storing tags. Create some helpful typedefs.
2020-04-10 12:54:59 -07:00
A.J. Beamon
ebeca10bce
Change the serialization of tags sent in some messages. Add communication of the sampling rate from cluster to clients.
2020-04-09 16:55:56 -07:00
Balachandar Namasivayam
73272fc72e
Version difference is now the diff between TLog versions and SS version.
2020-04-03 19:04:43 -07:00
Balachandar Namasivayam
ad1dd4fd9b
Mark storage servers that are continually lagging as unhealthy, which gives the Data Distributor the chance to move data off those servers.
2020-03-31 18:25:39 -07:00
Alex Miller
72e5891058
Clean up and rework the debugMutation API.
...
As a relatively unknown debugging tool for simulation tests, one could
have simulation print when a particular key is handled in various stages
of the commit process. This functionality was enabled by changing a 0
to a 1 in an #if, and changing a constant to the key in question.
As a proxy and storage server handle mutations, they call debugMutation
or debugKeyRange, which then checks the mutation against the key
in question, and logs if they match. A mixture of printfs and
TraceEvents would then be emitted, and for this to actually be usable,
one also needs to comment out some particularly spammy debugKeyRange()
calls.
This PR reworks the API of debugMutation/debugKeyRange, pulls it out
into its own file, and trims what is logged by default into something
useful and understandable:
* debugMutation() now returns a TraceEvent, that one can add more details to before it is logged.
* Data distribution and storage server cleanup operations are no longer logged by default
2020-03-27 03:30:28 -07:00
negoyal
8abac91033
Fixed a bug in cache server while peeking at a version lower than popped version and added some logging.
2020-03-26 12:39:07 -07:00
A.J. Beamon
26b7e02d4c
Some initial work to support tagging transactions and passing them around.
2020-03-20 11:23:11 -07:00
negoyal
99a5cb0572
Fix a corruption bug.
2020-03-13 18:52:34 -07:00
negoyal
3acd3ad3af
Some bugfixes and cleanup.
2020-03-02 17:11:23 -08:00
negoyal
cd949eca71
Merge branch 'master' into fdb_cache_subfeature2
2020-02-26 11:22:08 -08:00
negoyal
0cbd253967
Reduce the amount of tracing in cache server.
2020-02-25 17:32:24 -08:00
Evan Tschannen
96258b9809
Merge branch 'release-6.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbcli/fdbcli.actor.cpp
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/FlowTransport.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistribution.actor.h
# fdbserver/DataDistributionQueue.actor.cpp
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/QuietDatabase.actor.cpp
# fdbserver/SkipList.cpp
# fdbserver/StorageMetrics.actor.h
# fdbserver/TLogServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/KVStoreTest.actor.cpp
# flow/CMakeLists.txt
# flow/Knobs.cpp
# flow/Knobs.h
# flow/genericactors.actor.cpp
# flow/serialize.h
2020-02-21 19:09:16 -08:00
A.J. Beamon
5586e6f6d8
Merge pull request #2697 from etschannen/feature-correctness-fixes
...
A variety of correctness fixes
2020-02-20 13:32:18 -08:00
negoyal
a21f29b586
Bugfixes. Identify the cached range correctly.
2020-02-19 16:27:06 -08:00
Evan Tschannen
4326984b1d
fix: wait metrics can take a really long time to detect that two shards have been merged into one if both shards are assigned to the same team. Additional information should be added to the request to improve this.
2020-02-19 15:20:38 -08:00
Meng Xu
132f5aa9ba
FastRestore:Improve trace name and cosmetic change
2020-02-18 16:41:19 -08:00
A.J. Beamon
60f6b928f6
Slight reorganization of code to make it clearer.
2020-02-12 14:07:02 -08:00
A.J. Beamon
529200018a
Merge branch 'release-6.2' into fix-reverse-range-read-byte-limit-bug
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2020-02-10 12:23:52 -08:00
mpilman
5a9d420cb7
Merge remote-tracking branch 'upstream/release-6.2' into release-merges/20200210
2020-02-10 10:02:05 -08:00
A.J. Beamon
fa920a6cef
Step 5 of fixing storage server range reads: update the logic of reverse range reads to match forward range reads
2020-02-07 10:02:52 -08:00
A.J. Beamon
16167b07d5
Step 4 of fixing storage server range reads: remove another unneeded iteration case in the forward direction when we don't exhaust our limits in the disk read. This also hopefully makes the code a bit clearer.
2020-02-06 13:27:04 -08:00
A.J. Beamon
df2b0452b4
Step 3 of fixing storage server range reads: change return type of readRange from VectorRef<KeyValueRef> to RangeResultRef.
2020-02-06 13:19:24 -08:00
A.J. Beamon
1c61957ca1
Step 2 of fixing storage server range reads: eliminate some unnecessary iterations in the forward case
2020-02-06 12:58:59 -08:00
A.J. Beamon
7037edc3f8
Step 1 of fixing storage server range reads: cleanup of the forward direction. This should not change any behavior.
2020-02-06 12:49:02 -08:00
negoyal
85cc35e81e
Merge branch 'master' into HEAD
2020-02-05 14:59:55 -08:00
A.J. Beamon
f32d515fda
Reverse range reads on the storage server would not pass the specified byte limit to the storage engine but would apply it to the results returned, causing a potentially significant amount of wasted reading.
2020-02-05 11:16:40 -08:00
mpilman
d09e07f1f5
Merge remote-tracking branch 'upstream/master' into features/icc
2020-02-04 10:26:18 -08:00
Alex Miller
ee6490c9d1
Merge pull request #2314 from mengranwo/memory-engine
...
New Radix-Tree based Memory Storage Engine
2020-01-30 16:20:13 -08:00
A.J. Beamon
d1b87f8b7f
The storage server could fail to update its version to the latest processed if the peeked data contained a non-empty commit and ended with an empty commit.
2020-01-29 13:17:58 -08:00
Xin Dong
b0a1af1288
Added the actual read hot detection algorithm and logging mechanism.
...
- When a shard has a read bandwidth larger than a threshold value (configurable via knob), and its read-bandwidth/byte-size ratio is also larger than a threshold (configurable via knob), the corresponding shard tracker will run the algorithm.
- The algorithm will divide the shard into 10MB (configurable via knob) chunks and try to find the chunk(s) that have a large read-bandwidth/byte-size ratio.
- Those ranges will then be logged in TraceEvents. Later work will do more with them, such as actually caching them.
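
The detection steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `ChunkSample`, `findReadHotChunks`, `shardNeedsScan`, and the threshold parameters are hypothetical names standing in for the knob-driven logic the message describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-chunk sample; field names are illustrative only.
struct ChunkSample {
    int64_t bytes;          // sampled size of the chunk
    double readBytesPerSec; // sampled read bandwidth of the chunk
};

// Shard-level gate: scan the shard only when both its read bandwidth and its
// read-bandwidth/byte-size ratio exceed their (assumed) thresholds.
bool shardNeedsScan(int64_t shardBytes, double shardReadBytesPerSec,
                    double minBandwidth, double minDensity) {
    if (shardBytes <= 0) return false;
    return shardReadBytesPerSec > minBandwidth &&
           shardReadBytesPerSec / static_cast<double>(shardBytes) > minDensity;
}

// Returns the indices of chunks whose read-bandwidth/byte-size ratio exceeds
// minDensity. Empty chunks are skipped to avoid dividing by zero.
std::vector<std::size_t> findReadHotChunks(const std::vector<ChunkSample>& chunks,
                                           double minDensity) {
    std::vector<std::size_t> hot;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        if (chunks[i].bytes <= 0) continue;
        double density = chunks[i].readBytesPerSec / static_cast<double>(chunks[i].bytes);
        if (density > minDensity) hot.push_back(i);
    }
    return hot;
}
```

In this sketch the caller would first apply `shardNeedsScan` to the whole shard, then split it into fixed-size chunks and log whatever `findReadHotChunks` returns.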
2020-01-21 11:19:52 -08:00
Xin Dong
33456e7276
Did the plumbing work on the SSI side.
2020-01-21 11:15:52 -08:00
Evan Tschannen
3f9d9d8b84
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# cmake/FlowCommands.cmake
# documentation/sphinx/source/release-notes.rst
# fdbclient/StorageServerInterface.h
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# flow/Knobs.h
# flow/Platform.cpp
# versions.target
2020-01-16 18:37:47 -08:00
mengranwo
f597aa7e18
WIP : deployable/stable version since Nov 3. Start rebase to master branch
2020-01-15 13:49:45 -08:00
Evan Tschannen
4b90487b90
occasionally throw wrong_shard_server when waitMetrics times out so that the waitMetrics request can get the correct number of shards if two shards have been merged or split and the same storage server owns all the chunks
2020-01-15 13:22:18 -08:00
negoyal
34adf9c92c
Storages now return a bool flag to indicate if the key[s] might be cached.
2020-01-14 12:33:17 -08:00
Evan Tschannen
855f03a41f
ratekeeper needed to check remoteDC in another location
...
the storage server scoped a transaction incorrectly
2020-01-10 15:58:36 -08:00
Evan Tschannen
4aab9b7bc8
fix: clients would waste time attempting to read from a remote region when it was in the process of catching up
2020-01-10 12:23:59 -08:00
Evan Tschannen
83ad9caf54
implemented a load balancing algorithm which evens out the number of requests processed by each proxy
2020-01-08 01:59:01 -08:00
Alvin Moore
0373b1af91
Added missing braces
2019-12-12 07:36:19 -08:00
Alvin Moore
3bf971ba8b
Merge branch 'release-6.2' of github.com:apple/foundationdb into release_6.2_merge
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbserver/storageserver.actor.cpp
2019-12-12 07:13:12 -08:00
Andrew Noyes
9ef1f4da5c
Update fdbserver/storageserver.actor.cpp
...
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2019-12-04 09:21:05 -08:00
Andrew Noyes
854c94c5ad
Fix another "binding reference to nullptr"
2019-12-03 17:39:17 -08:00
Andrew Noyes
f320f6c174
Fix occurrence of undefined behavior
...
UBSAN has this to say:
flow/Arena.h:982:10: runtime error: reference binding to null pointer of type 'KeyValueRef'
After this change UBSAN no longer complains about this occurrence
2019-11-26 21:34:24 -08:00
Xin Dong
c95fa062b2
For the read sampling, use a specialized notify function to avoid unnecessary stack object allocation and a lot of branch misses.
2019-11-22 15:21:09 -08:00
Xin Dong
b6e1839d84
Code clean up
2019-11-21 13:39:19 -08:00
Xin Dong
b282e180d5
Added a knob to disable read sampling
2019-11-20 14:03:20 -08:00
Xin Dong
25fb63e68a
For performance concerns, change the read sampling when doing a range read. Now it bills the total cost of a range read to the start key of the range returned.
2019-11-20 14:03:20 -08:00
Xin Dong
3d3e186c83
Removed a place where it's essentially double logging the read size
2019-11-20 14:03:20 -08:00
negoyal
a4a0bf18f9
Merging with Master.
2019-11-12 13:01:29 -08:00
Xin Dong
199a34b827
Defined a minimum read cost (a penalty) for empty reads and for reads smaller than that minimum. Fixed several review comments.
2019-10-30 10:04:19 -07:00
Xin Dong
fe54a4bde1
- Changed SHARD_MAX_BYTES_READ_PRE_KEYSEC to be equivalent to 8MiB/s, which, multiplied by the sample expire interval (120 seconds), yields 960MiB. A shard whose sampled reads exceed that within the interval will be marked as read-hot. The number 960MiB was chosen to be roughly twice the max allowed shard size, to avoid wrongly marking a shard as read-hot when doing a table scan on it.
...
- Also tuned down the empty key sampling percentage to be 5%.
2019-10-23 12:00:19 -07:00
Xin Dong
e6f5748791
Use a large value for read sampling size threshold. Also at sampling site, don't round up small values to avoid sampling every key.
2019-10-22 13:47:58 -07:00
Xin Dong
3efeff04e6
Remove iosPerKSecond metric increment.
2019-10-09 16:42:42 -07:00
Xin Dong
cd4757b06c
Address review comments
2019-10-09 16:42:42 -07:00
Xin Dong
6b0f771cc0
Fixed a typo in knobs. Addressed some review comments. Added code for actual metric collecting.
2019-10-09 16:42:42 -07:00
Meng Xu
d0147e5e5d
Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3
...
Resolved Conflicts:
documentation/sphinx/source/release-notes.rst
fdbserver/DataDistribution.actor.cpp
versions.target
2019-10-02 13:22:56 -07:00
Evan Tschannen
d0e5b0d3a1
Added a buggify
2019-09-30 13:24:28 -07:00
Evan Tschannen
27db0ca530
Update fdbserver/storageserver.actor.cpp
...
Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-09-30 13:16:31 -07:00
Evan Tschannen
eee4404e4e
fix: when the shard pointer is replaced with a new AddingShard, we need to restart the warningLogger because the old one will have a pointer to the deleted AddingShard
2019-09-27 19:11:34 -07:00
Evan Tschannen
b495cc697b
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# versions.target
2019-09-13 09:25:08 -07:00
A.J. Beamon
e6fbc602df
Add metric to track empty reads.
2019-09-12 15:09:22 -07:00
A.J. Beamon
41fd3d9467
Merge branch 'master' into remove-unused-ssi-get-version
...
# Conflicts:
# fdbclient/StorageServerInterface.h
# fdbserver/storageserver.actor.cpp
2019-09-10 08:11:43 -07:00
A.J. Beamon
3d5f769ea3
Add a storage server metric for bytes cleared based on the byte sample.
2019-09-05 11:31:26 -07:00
Meng Xu
c2355f721e
Merge branch 'master' into mengxu/performant-restore-PR
2019-09-04 17:11:42 -07:00
Meng Xu
d160810662
FastRestore:Resolve review comments
2019-09-04 16:48:43 -07:00
negoyal
042fb62771
Code cleanup
2019-09-04 16:44:19 -07:00
negoyal
468cffa5d0
More cleanup around const versions of the VectorRef operators.
...
This PR changes the readRange() to cache the PTree results during the first traversal.
So that we avoid the second traversal during the merge between PTree data and Storage engine data.
I microbenchmarked the VersionedMap standalone. i.e. the set and clear-range mutations were
performed solely at the in-memory storage queue. No other FDB components were involved
in this test. And hence the numbers presented here are the best case numbers.
Test setup:
- 100M mutations: about 5% clearRange and 95% set mutations
- 100M rangeReads
- Keys/Values generated using deterministicRandom()
- Numbers presented below are for two setups, one with a single version generated for all set of mutations
and another where a new version is generated for each mutation (i.e. it's an extreme version test)
                                         Single Versioned    Extreme Versioned
Time to regular readRange                175.15              202.365
Time to readRange with cached results    158.798             184.423
2019-09-04 15:39:21 -07:00
negoyal
04405d910f
Use the same arena as the final readRange result for resultCache.
2019-09-03 17:01:24 -07:00
Evan Tschannen
a7237c4302
Merge pull request #2045 from atn34/disallow-scalar-network-messages
...
Disallow scalar network messages
2019-08-30 13:38:54 -07:00
A.J. Beamon
1fdabe62c2
Merge pull request #2048 from etschannen/feature-fix-connections
...
Fixed two different ways useful connections were being closed
2019-08-30 11:05:02 -07:00
Evan Tschannen
1c0484cffc
fix: do not close connections which have outstanding tryGetReplies with the peer
2019-08-29 16:49:57 -07:00
Andrew Noyes
6aa0ada7b1
Replace scalar root types with proper messages
2019-08-28 14:40:50 -07:00
A.J. Beamon
64ce0c3285
Remove the unused getVersion from StorageServerInterface.
2019-08-26 13:53:54 -07:00
negoyal
4a8c301de1
Cache the storage queue results during first pass in readRange.
...
Currently, we may traverse the PTree backing the storage queue more than once
during the rangeRead operation. This is an attempt to cache the results during
the first traversal and avoid multiple PTree traversals in turn.
2019-08-22 15:23:16 -07:00
negoyal
4ba1725bb5
Cache the storage queue results during first pass in readRange.
...
Currently, we may traverse the PTree backing the storage queue more than once
during the rangeRead operation. This is an attempt to cache the results during
the first traversal and avoid multiple PTree traversals in turn.
2019-08-22 14:08:22 -07:00
Evan Tschannen
297b65236f
added additional trace events to warn when different parts of shard relocations take more than 10 minutes
2019-08-16 14:56:58 -07:00
Meng Xu
7ccaeddf05
Merge branch 'master' into mengxu/performant-restore-PR
2019-08-01 13:23:17 -07:00
Andrew Noyes
1bad0fd44e
Make requestTime private
2019-07-31 17:59:35 -07:00
A.J. Beamon
14648e20f9
Merge pull request #1901 from ajbeamon/data-distribution-receives-bytes-input-rate
...
Send bytes input rate to data distribution
2019-07-30 15:01:36 -07:00
Evan Tschannen
3ad1d95049
Merge pull request #1894 from ajbeamon/trace-file-detail-rename
...
Expand undefined acronym in trace event detail
2019-07-26 13:34:45 -07:00
Evan Tschannen
8149b5b352
Merge pull request #1413 from atn34/change-connection-file
...
Switch cluster file feature
2019-07-26 13:27:37 -07:00
Meng Xu
1706aaf199
Merge branch 'master' into mengxu/performant-restore-PR
...
Fix conflict in TlogServer.actor.cpp by accepting master changes
2019-07-26 11:46:27 -07:00
sramamoorthy
9afd162e2f
remove snap v1 related code
2019-07-25 17:29:31 -07:00
A.J. Beamon
b91795d288
Send bytes input rate to DD.
2019-07-25 16:27:32 -07:00
Meng Xu
45083edf74
Merge branch 'master' into mengxu/performant-restore-PR
...
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
sramamoorthy
8f1f0c0435
snap v2: worker and other helper related changes
2019-07-24 15:36:28 -07:00
Trevor Clinkenbeard
9ad9bd4c1f
Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file
2019-07-24 15:22:26 -07:00
A.J. Beamon
639df02f20
Expand undefined acronym in trace event detail
2019-07-24 08:38:36 -07:00
Evan Tschannen
3045826e3c
Merge pull request #1819 from mpilman/flatbuffers-fixes2
...
Flatbuffers fixes2
2019-07-19 16:33:50 -07:00
Alex Miller
c3a8ae4752
Merge pull request #1791 from fzhjon/fetch-keys-requests-priority
...
Introduce priority to fetchKeys requests from data distribution
2019-07-19 14:54:51 -07:00
Alex Miller
9863ace96c
Replace usages with initialization lists.
...
But C++ needs a bit of help to infer through the templates.
2019-07-18 22:27:36 -07:00
mpilman
1ac2d01b03
Merge remote-tracking branch 'upstream/master' into flatbuffers-fixes2
2019-07-18 09:50:08 -07:00
A.J. Beamon
2cd05e9ac9
Merge pull request #1712 from tclinken/add-local-rk-to-status
...
Track the local ratekeeper rate in status
2019-07-15 15:17:11 -07:00
mpilman
54416f46fd
Pass type as param to VectorRef instead of bool
2019-07-15 15:08:49 -07:00
Trevor Clinkenbeard
e1541778ab
Added readsRejected counter to storage server
2019-07-15 10:53:19 -07:00
Jon Fu
4b0fdabae5
mark test file as IGNORE and comment out dead placeholder code
2019-07-15 09:45:16 -07:00
mpilman
b68f2d925f
Serialize range result to string for speed
2019-07-11 23:03:31 -07:00
Jon Fu
652cd77689
fixed merge conflicts and use new TaskPriority enum class
2019-07-11 09:56:59 -07:00
Jon Fu
1e9d31597c
removed extra parameter from getRange, added knob to guard new changes, and adjusted style/formatting in several places
2019-07-11 09:56:58 -07:00
Jon Fu
f707d186fe
added new priority for fetchkeys requests and adjusted ddmetrics workload to run parallel with mako
2019-07-11 09:56:58 -07:00
Trevor Clinkenbeard
1582a2a24d
Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file
2019-07-09 13:41:54 -07:00
Trevor Clinkenbeard
1bac04509e
Track the local ratekeeper rate as a percentage
...
This value is reported in status for each storage server.
2019-07-09 12:46:53 -07:00
Evan Tschannen
310a5fe9a3
fix: we cannot reject 100% of requests, because a storage server which is stuck needs to get a future version error to trigger an all alternatives failed message from load balance so that clients will re-grab storage server interfaces from the proxy
2019-07-05 17:28:22 -07:00
Evan Tschannen
15e894c724
Merge in master
2019-07-05 15:49:24 -07:00
Evan Tschannen
e7c0ecf729
fix: we cannot reject 100% of requests, because a storage server which is stuck needs to get a future version error to trigger an all alternatives failed message from load balance so that clients will re-grab storage server interfaces from the proxy
2019-07-05 15:46:16 -07:00
Alex Miller
ea6898144d
Merge remote-tracking branch 'upstream/master' into flowlock-api
2019-07-03 20:44:15 -07:00
Evan Tschannen
3fb0999e10
revert storage server priority changes
2019-07-02 16:54:47 -07:00
mengranwo
e54eedf0e2
Address pr comments, remove wait(tr.commit()) for read-only txn
2019-07-01 16:09:51 -07:00
mengranwo
0ad151e70a
style formatting
2019-07-01 16:09:51 -07:00
mengranwo
819b6e3d6d
fix compiling error
2019-07-01 16:09:51 -07:00
mengranwo
c7148bbb14
address cr comments:
2019-07-01 16:09:51 -07:00
mengranwo
d96cdacdd5
fix format issue
2019-07-01 16:09:51 -07:00
mengranwo
11161746f8
add try catch block around tx.onerror()
2019-07-01 16:09:51 -07:00
mengranwo
6b61b0e030
fix syntax error, pass compile
2019-07-01 16:09:51 -07:00
mengranwo
0b9cd18fb4
checking cluster is healthy or not during recovery process(for storage engine), if healthy, delete data files and join as new
2019-07-01 16:09:51 -07:00
Alex Miller
8e1ab6e7db
Merge remote-tracking branch 'upstream/master' into flowlock-api
2019-06-28 17:32:54 -07:00
Evan Tschannen
4cef1d3937
Experimental change of storage write priority
2019-06-28 16:54:22 -07:00
Evan Tschannen
18d5fbf1e0
Avoid jumping from rejecting 0% of requests directly to 20% of requests
2019-06-28 16:54:22 -07:00
Evan Tschannen
db413c37f7
restored the STORAGE_DURABILITY_LAG_SOFT_MAX knob and made the rk target slightly smaller than the soft limit, to avoid inaccuracies in ratekeeper control causing behavior changes on the storage servers
2019-06-28 16:54:22 -07:00
Evan Tschannen
92b32855ca
ratekeeper’s control algorithm would oscillate when limited by local ratekeeper
2019-06-28 16:54:22 -07:00
Evan Tschannen
cfce1e1705
fix: buffered peek cursor would advance very slowly through large ranges of empty versions
2019-06-28 15:54:08 -07:00
Alex Miller
7a500cd37f
A giant translation of TaskFooPriority -> TaskPriority::Foo
...
This is so that APIs that take priorities don't take ints, which are
common and make it easy to accidentally pass the wrong thing.
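
As a rough illustration of why a scoped enum helps here: the priority names, values, and the `schedule` function below are placeholders for this sketch, not the actual Flow definitions.

```cpp
#include <type_traits>

// Placeholder priorities; the real names and values may differ.
enum class TaskPriority { Low = 1000, Default = 5000, High = 9000 };

// Accepts only the scoped enum, so passing a bare int (say, a byte count)
// is a compile error rather than a silent bug.
inline int schedule(TaskPriority priority) {
    return static_cast<int>(priority);
}

// A plain int does not implicitly convert to the scoped enum.
static_assert(!std::is_convertible<int, TaskPriority>::value,
              "ints must not convert to TaskPriority implicitly");
```

With the old `TaskFooPriority` int constants, a call such as `schedule(42)` would have compiled; with the enum class it does not.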
2019-06-25 02:47:35 -07:00
Stephen Atherton
f1f1081202
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbserver/VersionedBTree.actor.cpp
2019-06-24 20:17:49 -07:00
Trevor Clinkenbeard
afb0dbcd1c
Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file
2019-06-20 19:11:29 -07:00
mpilman
844dd60202
FDB compiling with intel compiler
2019-06-20 09:29:01 -07:00
mpilman
68ce9a5e75
ProtocolVersion type - second try
2019-06-18 17:55:27 -07:00
mpilman
8576665a90
Revert "Revert "Make protocol version a type""
...
This reverts commit 455bf3b3ec.
2019-06-18 14:49:04 -07:00
Alex Miller
455bf3b3ec
Revert "Make protocol version a type"
2019-06-18 10:59:17 -07:00
mpilman
da53a92bec
Make protocol version a type
...
This fixes #1214
The basic idea is that ProtocolVersion is now its own type. This
alone is an improvement as it makes many things more typesafe. For
each version, we can now add breaking features (for example Fearless).
After that, there's no need to test against actual (confusing) version
numbers. Instead a developer can simply test
`protocolVersion->hasFearless()` and this will return true iff the
protocolVersion is newer than the newest version that didn't support
fearless.
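
A minimal sketch of the idea, under stated assumptions: the constant `kFearlessIntroduced` and its value are invented for illustration and are not FoundationDB's real protocol version numbers, and the real type may be structured differently.

```cpp
#include <cstdint>

// Hypothetical: first protocol version that supports the Fearless feature.
constexpr uint64_t kFearlessIntroduced = 0x0600;

class ProtocolVersion {
    uint64_t version_;

public:
    explicit constexpr ProtocolVersion(uint64_t v) : version_(v) {}
    constexpr uint64_t version() const { return version_; }

    // True iff this version is at least the one that introduced the feature,
    // i.e. newer than the newest version that did not support it.
    constexpr bool hasFearless() const { return version_ >= kFearlessIntroduced; }
};
```

Callers then ask `pv.hasFearless()` instead of comparing against a raw version number, and the `explicit` constructor keeps plain integers from being passed where a `ProtocolVersion` is expected.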
2019-06-16 09:59:15 -07:00
Evan Tschannen
20e3edeb0a
Merge branch 'release-6.1'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbserver/storageserver.actor.cpp
# versions.target
2019-06-14 12:42:59 -07:00
Evan Tschannen
924f92e5aa
Prevent the byte sample recovery from interfering with storage server recovery
2019-06-13 15:55:25 -07:00
Evan Tschannen
55f7e7d372
fix: The delay inside the disabledMap was causing the storage server updateStorage actor to run on the client process
2019-06-13 14:28:30 -07:00
Evan Tschannen
dccb9bc26d
fixed a number of correctness problems
2019-06-12 19:40:50 -07:00
Trevor Clinkenbeard
1e8f7e5b82
Refactor NextFastAllocatedSize to be constexpr function
2019-06-11 15:55:23 -07:00
Andrew Noyes
bc03421d05
Open Database as switchable only for client
2019-06-11 13:58:22 -07:00
Andrew Noyes
02e173b601
Add changeConnectionFile method to Transaction
...
Also add tests
2019-06-11 13:58:22 -07:00
Trevor Clinkenbeard
cb420ea4bd
Only construct waitDescription in simulator
2019-06-11 12:43:39 -07:00
Trevor Clinkenbeard
8144882d7b
Merge branch 'apple-master' into features/local-rk
2019-06-10 19:40:25 -07:00
Trevor Clinkenbeard
8dbb231f33
Don't reject read requests until the storage server durability lag gets large enough
2019-06-05 15:42:58 -07:00
Trevor Clinkenbeard
d1d98f298a
Changed storage server getPenalty calculation.
...
Penalty should always be >= 1.0
2019-06-05 14:14:40 -07:00
sramamoorthy
4083af0b01
Avoid using trackLatest for TLog pop test cases
2019-05-28 22:07:46 -07:00
sramamoorthy
ec7834e2f7
code reorganization and address comments
2019-05-28 22:07:46 -07:00
sramamoorthy
b6e037ffbc
Replace fork with boost::process::child
2019-05-28 22:07:46 -07:00
sramamoorthy
61e93a9304
Address review comments and minor fixes
2019-05-28 22:07:46 -07:00
sramamoorthy
9e3104c2d4
Fix: races in async exec leading to bad backup
2019-05-28 22:07:46 -07:00
sramamoorthy
090bb53034
ShardInfo::addMutation to handle exec mutation
2019-05-28 22:07:46 -07:00
sramamoorthy
cfdad0c5e6
tlog to snapshot exactly at exec version
2019-05-28 22:07:46 -07:00
sramamoorthy
17ecba8313
trace cleanup and other indentation changes
2019-05-28 22:07:46 -07:00
sramamoorthy
aa79480d69
changes to make fdbfork asynchronous
2019-05-28 22:07:46 -07:00
sramamoorthy
72dd067173
Trace message changes and fix few FIXMEs
2019-05-28 22:07:46 -07:00
sramamoorthy
69edefe68b
Snapshot based backup and restore implementation
2019-05-28 22:07:46 -07:00
A.J. Beamon
603721e125
Merge branch 'master' into thread-safe-random-number-generation
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/AsyncFileCached.actor.h
# fdbrpc/genericactors.actor.cpp
# fdbrpc/sim2.actor.cpp
# fdbserver/DiskQueue.actor.cpp
# fdbserver/workloads/BulkSetup.actor.h
# flow/ActorCollection.actor.cpp
# flow/Net2.actor.cpp
# flow/Trace.cpp
# flow/flow.cpp
2019-05-23 08:35:47 -07:00
Stephen Atherton
ebc96a7e0e
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbserver/VersionedBTree.actor.cpp
2019-05-21 23:49:27 -07:00
A.J. Beamon
a8b9d8e34b
Merge pull request #1336 from tclinken/fast-allocate-ptree-nodes
...
Create 96-byte fast allocator for storage queue PTree nodes
2019-05-17 14:22:46 -07:00
Jingyu Zhou
b8e7fc1b84
Refactor: add std:: qualifier and use emplace_back
2019-05-17 09:38:50 -10:00
A.J. Beamon
5f55f3f613
Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.
2019-05-10 14:01:52 -07:00
Austin Seipp
bf378952cb
fdbserver: fix some print/scan format warnings
...
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2019-05-06 13:35:29 -07:00
Trevor Clinkenbeard
d339becd7c
Fix currentRate calculation for local ratekeeper
2019-04-25 15:35:34 -07:00
Meng Xu
529ce66b6c
Merge branch 'apple/master' into mengxu/performant-restore-PR
2019-04-18 18:02:45 -07:00
Andrew Noyes
ef04471a66
Fix more unused-variable warnings
2019-04-17 16:04:10 -07:00
Trevor Clinkenbeard
3426205167
Fixed readGuard usage bug
2019-04-16 15:05:57 -07:00
Trevor Clinkenbeard
1d921da170
readGuard sends server_overloaded error if request is rejected
2019-04-16 11:29:01 -07:00
Trevor Clinkenbeard
0594154644
Fixed getPenalty calculation
2019-04-16 10:17:41 -07:00
Evan Tschannen
cd5c9d91fa
Merge pull request #1443 from etschannen/master
...
Merge 6.1 into master
2019-04-10 17:43:07 -07:00
A.J. Beamon
538b431656
Apply suggestions from code review
2019-04-08 14:55:58 -07:00
A.J. Beamon
a7288e1325
Throw process_behind instead of future_version when all storage nodes on a team are behind. process_behind gets the same backoff behavior as not_committed. Add proxy_memory_limit_exceeded to the retryable predicate.
2019-04-08 14:21:24 -07:00
mpilman
d2e74cb2c0
Fix stupid rounding error
2019-04-08 11:05:29 -07:00
mpilman
aaa8f73bdc
fixed missing refactoring code
2019-04-08 11:05:29 -07:00
mpilman
bdba8e22eb
Added test and bugfixes
2019-04-08 11:05:29 -07:00
mpilman
b944e0b116
generalized read guards, allow for penalty+error
2019-04-08 11:04:44 -07:00
mpilman
32393ec4c9
Prototype of local ratekeeper
2019-04-08 11:04:44 -07:00
mpilman
d01cbf3455
Addressed code review comments
2019-04-05 13:12:20 -07:00
mpilman
1c16f87a4e
Remove trace-calls to printable (in non-workloads)
2019-04-05 13:12:19 -07:00
Meng Xu
70d7c289f4
Merge branch 'master' into mengxu/restore/parallel-v7
2019-03-30 22:13:10 -07:00
Stephen Atherton
d5c8b6b083
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbserver/VersionedBTree.actor.cpp
# flow/flow.h
2019-03-27 13:37:15 -07:00