Evan Tschannen
a3fe3d4324
Merge pull request #1923 from xumengpanda/mengxu/evan-dd-improvement-minor-improvement
...
DD:Change condition for lastBuildTeamsFailed
2019-07-30 16:54:42 -07:00
Evan Tschannen
2d7ec54d3e
fix: some exclude workloads would cause both the primary and remote datacenter to be considered dead
2019-07-30 16:35:52 -07:00
Evan Tschannen
aaeeb605b2
Changes to degraded can cause master recoveries, which are not supposed to happen when speedUpSimulation is true
2019-07-30 16:33:40 -07:00
A.J. Beamon
b3f7875673
Merge pull request #1925 from ajbeamon/rename-fault-tolerance-status-fields
...
Rename max_machine_failures_without_losing_X to max_zone_failures_without_losing_X in status.
2019-07-30 16:25:48 -07:00
A.J. Beamon
b5d2234a13
Merge branch 'release-6.1' into merge-release-6.1-into-master
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbbackup/backup.actor.cpp
# fdbserver/MoveKeys.actor.cpp
# flow/FastAlloc.h
# versions.target
2019-07-30 16:23:42 -07:00
A.J. Beamon
a731adeb8f
--machine_id now sets locality_machineid
2019-07-30 16:11:09 -07:00
Evan Tschannen
06fc8cb904
Merge pull request #1919 from etschannen/feature-buffered-popped
...
Implement popped on bufferedCursor
2019-07-30 15:56:41 -07:00
A.J. Beamon
25f93f7f1b
Revert change to machine_id documentation (to be fixed in separate PR).
2019-07-30 15:20:57 -07:00
A.J. Beamon
08cdd8e788
Merge branch 'master' into move-fdbserver-arguments
2019-07-30 15:13:44 -07:00
A.J. Beamon
59dd5916c5
Merge branch 'master' into rename-fault-tolerance-status-fields
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2019-07-30 15:11:25 -07:00
A.J. Beamon
41605735f5
Merge pull request #1916 from ajbeamon/merge-onto-new-servers
...
Add knob to control whether merges request new servers or not.
2019-07-30 15:04:37 -07:00
A.J. Beamon
14648e20f9
Merge pull request #1901 from ajbeamon/data-distribution-receives-bytes-input-rate
...
Send bytes input rate to data distribution
2019-07-30 15:01:36 -07:00
Evan Tschannen
7ac7eb82f2
fix: buffered cursor would start multiple bufferedGetMore actors
...
advance all of the cursors to the poppedVersion
2019-07-30 14:42:05 -07:00
A.J. Beamon
924c51274d
Move memory and locality arguments from --dev-help to --help. Also update -i/--machine_id to note that it modifies the zone identifier key (despite the name of the parameter, which I'm not changing now).
2019-07-30 14:34:27 -07:00
A.J. Beamon
438bc636d5
Rename max_machine_failures_without_losing_X to max_zone_failures_without_losing_X in status.
2019-07-30 14:02:31 -07:00
Evan Tschannen
b5cb7919b6
fix: canDiscardPopped was not reset when necessary in all cases
2019-07-30 13:44:44 -07:00
Evan Tschannen
9e3ec2cb33
fix: when resetting the peekCursor, we cannot discard the popped data if the adapter has already processed data
2019-07-30 13:25:25 -07:00
Evan Tschannen
2301728903
fix compiler error
2019-07-30 13:00:48 -07:00
Evan Tschannen
8f887ccaa5
fix: the cursor was not reset when the disk adapter was reset
...
added a buggify to cause reset to happen more often in simulation
2019-07-30 12:58:18 -07:00
Evan Tschannen
1d326e3dc8
removed debugging message
2019-07-30 12:42:50 -07:00
Evan Tschannen
5d79e4141f
fix: buffered cursor messageVersion should be set to the version we will be at after exhausting everything in messages
2019-07-30 12:38:44 -07:00
Evan Tschannen
6977e7d2e8
do not return recovered version as popped for txsTags because it could cause recovery to start over
...
optimized how buffered peek cursor discards popped data
2019-07-30 12:21:48 -07:00
Meng Xu
0e50656c7f
DD:Change condition for lastBuildTeamsFailed
...
Change the threshold team number per server that should set lastBuildTeamsFailed
from DESIRED_TEAMS_PER_SERVER to
(SERVER_KNOBS->DESIRED_TEAMS_PER_SERVER * (configuration.storageTeamSize + 1)) / 2;
2019-07-30 11:07:02 -07:00
Evan Tschannen
7a932479dd
throw away state if we ever read popped data from the disk queue adapter
2019-07-30 10:14:39 -07:00
Evan Tschannen
45f7b41b48
fix: multi-cursor could discard popped commits after already returning data
2019-07-29 21:36:42 -07:00
Evan Tschannen
5bb322b483
implement popped on bufferedCursor
2019-07-29 21:19:47 -07:00
Evan Tschannen
a0f26b604c
Merge pull request #1907 from etschannen/master
...
A number of bug fixes for rare problems found by correctness testing
2019-07-29 21:04:38 -07:00
sramamoorthy
5a56f6b456
minor snap create client improvement and bug fixes
2019-07-29 20:28:22 -07:00
A.J. Beamon
bc536757df
Add knob to control whether merges request new servers or not. Set the default to request new servers in \xff but not in main key space.
2019-07-29 15:47:34 -07:00
Evan Tschannen
6b5e683de5
The mountainChopper and valleyFiller only move larger than average shards, to avoid moving high bandwidth shards which are generally smaller.
2019-07-28 23:50:42 -07:00
Evan Tschannen
cc4481b71a
team builders prefer to make teams which overlap less with existing teams
2019-07-28 23:44:23 -07:00
Evan Tschannen
d8b14fe372
we cannot buggify replace content bytes because it takes too long to recover when the txnStateStore is too large
2019-07-28 19:34:17 -07:00
Evan Tschannen
9a0db74230
fix: forced recovery did not copy txsTags properly
2019-07-28 19:31:53 -07:00
Evan Tschannen
7e97bd181a
fix: we need to build teams when a server becomes healthy if it is possible another server does not have enough teams
2019-07-28 19:31:21 -07:00
Evan Tschannen
13203da199
fix: do not set the popped version of txsTag because it could be copied over at the recoveredAt version
2019-07-27 22:36:06 -07:00
Evan Tschannen
cfc985cdf1
re-enabled flat buffers, fixed the latencyBandConfig serialization
2019-07-27 17:48:24 -07:00
Evan Tschannen
5c98dcce6d
revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators
2019-07-27 16:46:22 -07:00
Evan Tschannen
9871045cc7
flat buffers is causing an infinite loop when serializing LatencyBandConfig::GrvConfig
2019-07-27 16:34:18 -07:00
Evan Tschannen
b509a441e7
Merge branch 'master' into feature-skip-confirm
...
# Conflicts:
# bindings/flow/tester/Tester.actor.cpp
# bindings/go/src/_stacktester/stacktester.go
# bindings/java/src/test/com/apple/foundationdb/test/AsyncStackTester.java
# bindings/java/src/test/com/apple/foundationdb/test/StackTester.java
# bindings/python/tests/tester.py
# bindings/ruby/tests/tester.rb
# documentation/sphinx/source/api-c.rst
# documentation/sphinx/source/api-python.rst
# documentation/sphinx/source/api-ruby.rst
# documentation/sphinx/source/data-modeling.rst
# documentation/sphinx/source/developer-guide.rst
# fdbclient/vexillographer/fdb.options
# fdbserver/MasterProxyServer.actor.cpp
2019-07-27 15:08:13 -07:00
Evan Tschannen
c2739e510d
set uninitialized variable
2019-07-27 14:30:56 -07:00
Evan Tschannen
ee94e8a062
removed a trace event which was causing valgrind errors
2019-07-27 13:51:59 -07:00
Evan Tschannen
90e3b50213
Merge branch 'master' into feature-coordinator-connection
...
# Conflicts:
# fdbclient/DatabaseContext.h
# fdbclient/NativeAPI.actor.cpp
# fdbclient/NativeAPI.actor.h
# fdbserver/workloads/KillRegion.actor.cpp
2019-07-26 15:05:02 -07:00
Evan Tschannen
3ad1d95049
Merge pull request #1894 from ajbeamon/trace-file-detail-rename
...
Expand undefined acronym in trace event detail
2019-07-26 13:34:45 -07:00
Evan Tschannen
1a4ca05a04
Merge pull request #1889 from ajbeamon/add-cache-memory-parameter
...
Add cache_memory parameter to fdbserver
2019-07-26 13:34:24 -07:00
Evan Tschannen
04dd293af0
Merge pull request #1874 from xumengpanda/mengxu/DD-code-read
...
DataDistribution:Add comments to help understand the code
2019-07-26 13:30:44 -07:00
Evan Tschannen
98c3b24036
Merge pull request #1869 from alexmiller-apple/sharded-txs-performance
...
Raise the priority of TLogRejoin above TLogPeekReply
2019-07-26 13:30:13 -07:00
Evan Tschannen
28df2c35bb
Merge pull request #1855 from alexmiller-apple/sharded-txs-safe-upgrade
...
Make sharded txsTag upgradeable and downgradeable
2019-07-26 13:29:39 -07:00
Evan Tschannen
2123fa1c3a
Merge pull request #1853 from xumengpanda/mengxu/redundantTeamRemoverPriority-PR
...
Lower the RelocateShard priority for removing redundant teams
2019-07-26 13:28:42 -07:00
Evan Tschannen
8149b5b352
Merge pull request #1413 from atn34/change-connection-file
...
Switch cluster file feature
2019-07-26 13:27:37 -07:00
Evan Tschannen
ee92f0574f
fix: lastRequestTime was not updated
...
fix: COORDINATOR_REGISTER_INTERVAL was not set
fixed review comments
2019-07-26 13:23:56 -07:00
A.J. Beamon
7982e55ccd
Merge pull request #1898 from jzhou77/remove-monitorServerInfoConfig
...
Add a monitorServerInfoConfig() call back
2019-07-26 08:34:51 -07:00
sramamoorthy
9afd162e2f
remove snap v1 related code
2019-07-25 17:29:31 -07:00
Evan Tschannen
be5d144b8b
added status information on connected clients
2019-07-25 17:15:31 -07:00
A.J. Beamon
b91795d288
Send bytes input rate to DD.
2019-07-25 16:27:32 -07:00
Balachandar Namasivayam
bf87d906f6
Fix a crash.
2019-07-25 16:15:28 -07:00
Jingyu Zhou
bbeaf0ebbb
Add a monitorServerInfoConfig() call back
...
This was deleted during a code refactor in ef868f5. Because no tests were
complaining, we didn't find this until now.
2019-07-25 15:17:26 -07:00
A.J. Beamon
a92b6cd3d1
Merge branch 'master' into add-cache-memory-parameter
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2019-07-25 13:41:57 -07:00
senthil-ram
edeec8a622
Update fdbserver/DataDistribution.actor.cpp
...
Co-Authored-By: Alex Miller <35046903+alexmiller-apple@users.noreply.github.com>
2019-07-24 15:36:28 -07:00
sramamoorthy
31a1e6858b
remove un-necessary state variables in getCoord
2019-07-24 15:36:28 -07:00
sramamoorthy
a65c9f92ed
get rid of all timeouts and other changes
2019-07-24 15:36:28 -07:00
sramamoorthy
a2f2ad96ff
code review comments and merge to master changes
2019-07-24 15:36:28 -07:00
sramamoorthy
4f2bb561de
snapshot only local tlogs and not the satellite
2019-07-24 15:36:28 -07:00
sramamoorthy
021c949801
increase snap timeout to 15s for simulator
2019-07-24 15:36:28 -07:00
sramamoorthy
869f77aef1
Few cosmetic edits and fixes
2019-07-24 15:36:28 -07:00
sramamoorthy
ddd4523816
bug fix in timeout & header file re-arrange in DD
2019-07-24 15:36:28 -07:00
sramamoorthy
c18558cf55
enable DD mode in restore based on test spec
2019-07-24 15:36:28 -07:00
sramamoorthy
31c010b393
few minor fixes
2019-07-24 15:36:28 -07:00
sramamoorthy
33c2801944
adjust versions to handle KCV > recoveryVersion
2019-07-24 15:36:28 -07:00
sramamoorthy
62c14dae72
disable dd during snap and enable in restore
2019-07-24 15:36:28 -07:00
sramamoorthy
c73bdfad9f
do not pop txsTag
2019-07-24 15:36:28 -07:00
sramamoorthy
a335ed2011
includeCancelled for tLogSnapCreate
2019-07-24 15:36:28 -07:00
sramamoorthy
080b3da322
includeCancelled for workerSnapCreate
2019-07-24 15:36:28 -07:00
sramamoorthy
61cd690add
enable/disable pop req with UID mis-match to fail
2019-07-24 15:36:28 -07:00
sramamoorthy
d90b678f6f
storage worker to throw in case of failures
2019-07-24 15:36:28 -07:00
sramamoorthy
95d6807740
tryGetReply instead of getReply for ddSnapReq
2019-07-24 15:36:28 -07:00
sramamoorthy
7ec8fe6e74
snap v2: implement get only local storage workers
2019-07-24 15:36:28 -07:00
sramamoorthy
671c98fa3d
snap v2: test files changes
2019-07-24 15:36:28 -07:00
sramamoorthy
a954cf4e06
snap v2: restore related changes for the simulator
2019-07-24 15:36:28 -07:00
sramamoorthy
7e04e3c8be
snap v2: knobs for max snap create timeout
2019-07-24 15:36:28 -07:00
sramamoorthy
8f1f0c0435
snap v2: worker and other helper related changes
2019-07-24 15:36:28 -07:00
sramamoorthy
f4e257e464
snap v2: TLog related changes
2019-07-24 15:36:28 -07:00
sramamoorthy
ba6bccce73
snap v2: DD changes - snapshot orchestration logic
2019-07-24 15:36:28 -07:00
sramamoorthy
d0793f5ca2
snap v2: master proxy related changes
2019-07-24 15:36:28 -07:00
Trevor Clinkenbeard
9ad9bd4c1f
Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file
2019-07-24 15:22:26 -07:00
Evan Tschannen
8b73a1c998
removed verbose trace messages
2019-07-24 15:07:41 -07:00
Evan Tschannen
2434d06726
fix: The coordinators did not properly track hasConnectedClients
2019-07-24 14:41:12 -07:00
A.J. Beamon
639df02f20
Expand undefined acronym in trace event detail
2019-07-24 08:38:36 -07:00
Evan Tschannen
b303ab4e6c
fix: DR agents need to be clients because their failure monitoring information needs to come from two different cluster controllers
2019-07-23 19:24:07 -07:00
Evan Tschannen
4a866290b7
Clients keep a persistent connection open with coordinators to get updates to the list of proxies
...
Status still needs to be updated with client information from the coordinators
2019-07-23 19:22:44 -07:00
A.J. Beamon
94be9560ea
Add cache_memory parameter to fdbserver to control the size of the (4K) page cache. Change the default slightly from 2000 MiB to 2 GiB.
2019-07-23 15:05:21 -07:00
Meng Xu
e582219ec5
Remove unnecessary condition in DDQueue
...
Resolve the review comment.
2019-07-22 17:00:37 -07:00
Vishesh Yadav
2f9f3c184f
Merge pull request #1870 from alexmiller-apple/txnStateStore-workload
...
Add ability to bulk load data into TxnStateStore
2019-07-22 13:20:39 -07:00
A.J. Beamon
e29a6ea280
Merge pull request #1871 from bnamasivayam/tr-priority-add-client-log
...
Track the priority of sampled Transaction as part of GetReadVersion e…
2019-07-22 13:04:52 -07:00
Balachandar Namasivayam
df652155fc
Addressed review comments
2019-07-22 12:17:05 -07:00
Meng Xu
b7478f5dd3
DD:Add comments to help understand code
...
Add comments to explain the functionalities of some code.
2019-07-22 11:23:16 -07:00
Meng Xu
378db79441
Resolve conflict when merge with master
2019-07-22 10:56:20 -07:00
Meng Xu
dae4436a3d
TC:UnitTest:Change invariant due to alg change
2019-07-20 21:06:54 -07:00
Meng Xu
612a51fe00
Apply Clang format to PRIORITY_TEAM_REDUNDANT
2019-07-19 18:32:22 -07:00
Meng Xu
ea76451f15
Count PRIORITY_TEAM_REDUNDANT as count PRIORITY_TEAM_UNHEALTHY
2019-07-19 18:30:01 -07:00
Alex Miller
4ac1a0f557
Add ability to bulk load data into TxnStateStore
...
* Changes BulkLoad workload to support a specific volume of data to load
* Changes BulkLoad and Cycle to correctly handle \xff in keyPrefix
* Adds BulkLoad to TxnStateStoreCycleTest
2019-07-19 18:01:24 -07:00