Commit Graph

2412 Commits

Author SHA1 Message Date
Evan Tschannen a3fe3d4324
Merge pull request #1923 from xumengpanda/mengxu/evan-dd-improvement-minor-improvement
DD:Change condition for lastBuildTeamsFailed
2019-07-30 16:54:42 -07:00
Evan Tschannen 2d7ec54d3e fix: some exclude workloads would cause both the primary and remote datacenter to be considered dead 2019-07-30 16:35:52 -07:00
Evan Tschannen aaeeb605b2 Changes to degraded can cause master recoveries, which are not supposed to happen when speedUpSimulation is true 2019-07-30 16:33:40 -07:00
A.J. Beamon b3f7875673
Merge pull request #1925 from ajbeamon/rename-fault-tolerance-status-fields
Rename max_machine_failures_without_losing_X to max_zone_failures_without_losing_X in status.
2019-07-30 16:25:48 -07:00
A.J. Beamon b5d2234a13 Merge branch 'release-6.1' into merge-release-6.1-into-master
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbbackup/backup.actor.cpp
#	fdbserver/MoveKeys.actor.cpp
#	flow/FastAlloc.h
#	versions.target
2019-07-30 16:23:42 -07:00
A.J. Beamon a731adeb8f --machine_id now sets locality_machineid 2019-07-30 16:11:09 -07:00
Evan Tschannen 06fc8cb904
Merge pull request #1919 from etschannen/feature-buffered-popped
Implement popped on bufferedCursor
2019-07-30 15:56:41 -07:00
A.J. Beamon 25f93f7f1b Revert change to machine_id documentation (to be fixed in separate PR). 2019-07-30 15:20:57 -07:00
A.J. Beamon 08cdd8e788 Merge branch 'master' into move-fdbserver-arguments 2019-07-30 15:13:44 -07:00
A.J. Beamon 59dd5916c5 Merge branch 'master' into rename-fault-tolerance-status-fields
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2019-07-30 15:11:25 -07:00
A.J. Beamon 41605735f5
Merge pull request #1916 from ajbeamon/merge-onto-new-servers
Add knob to control whether merges request new servers or not.
2019-07-30 15:04:37 -07:00
A.J. Beamon 14648e20f9
Merge pull request #1901 from ajbeamon/data-distribution-receives-bytes-input-rate
Send bytes input rate to data distribution
2019-07-30 15:01:36 -07:00
Evan Tschannen 7ac7eb82f2 fix: buffered cursor would start multiple bufferedGetMore actors
advance all of the cursors to the poppedVersion
2019-07-30 14:42:05 -07:00
A.J. Beamon 924c51274d Move memory and locality arguments from --dev-help to --help. Also update -i/--machine_id to note that it modifies the zone identifier key (despite the name of the parameter, which I'm not changing now). 2019-07-30 14:34:27 -07:00
A.J. Beamon 438bc636d5 Rename max_machine_failures_without_losing_X to max_zone_failures_without_losing_X in status. 2019-07-30 14:02:31 -07:00
Evan Tschannen b5cb7919b6 fix: canDiscardPopped was not reset when necessary in all cases 2019-07-30 13:44:44 -07:00
Evan Tschannen 9e3ec2cb33 fix: when resetting the peekCursor, we cannot discard the popped data if the adapter has already processed data 2019-07-30 13:25:25 -07:00
Evan Tschannen 2301728903 fix compiler error 2019-07-30 13:00:48 -07:00
Evan Tschannen 8f887ccaa5 fix: the cursor was not reset when the disk adapter was reset
added a buggify to cause reset to happen more often in simulation
2019-07-30 12:58:18 -07:00
Evan Tschannen 1d326e3dc8 removed debugging message 2019-07-30 12:42:50 -07:00
Evan Tschannen 5d79e4141f fix: buffered cursor messageVersion should be set to the version we will be at after exhausting everything in messages 2019-07-30 12:38:44 -07:00
Evan Tschannen 6977e7d2e8 do not return recovered version as popped for txsTags because it could cause recovery to start over
optimized how buffered peek cursor discards popped data
2019-07-30 12:21:48 -07:00
Meng Xu 0e50656c7f DD:Change condition for lastBuildTeamsFailed
Change the threshold team number per server that should set lastBuildTeamsFailed
from DESIRED_TEAMS_PER_SERVER to
(SERVER_KNOBS->DESIRED_TEAMS_PER_SERVER * (configuration.storageTeamSize + 1)) / 2;
2019-07-30 11:07:02 -07:00
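
The threshold change described in the commit above can be worked through with a small sketch. The function name and the knob value of 5 are hypothetical and chosen only for illustration; the formula itself is copied from the commit message, with Python's `//` standing in for C++ integer division.

```python
def build_teams_failed_threshold(desired_teams_per_server: int,
                                 storage_team_size: int) -> int:
    """Threshold formula quoted in the commit message.

    Mirrors (DESIRED_TEAMS_PER_SERVER * (storageTeamSize + 1)) / 2
    using integer division, as the C++ expression would.
    """
    return (desired_teams_per_server * (storage_team_size + 1)) // 2


# Hypothetical knob value, not taken from the repository.
DESIRED_TEAMS_PER_SERVER = 5

for team_size in (1, 2, 3):
    threshold = build_teams_failed_threshold(DESIRED_TEAMS_PER_SERVER, team_size)
    print(f"storageTeamSize={team_size}: threshold={threshold}")
```

With a team size of 1 the new threshold equals the old one (DESIRED_TEAMS_PER_SERVER); for larger team sizes it grows with the team size.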
Evan Tschannen 7a932479dd throw away state if we ever read popped data from the disk queue adapter 2019-07-30 10:14:39 -07:00
Evan Tschannen 45f7b41b48 fix: multi-cursor could discard popped commits after already returning data 2019-07-29 21:36:42 -07:00
Evan Tschannen 5bb322b483 implement popped on bufferedCursor 2019-07-29 21:19:47 -07:00
Evan Tschannen a0f26b604c
Merge pull request #1907 from etschannen/master
A number of bug fixes for rare problems found by correctness testing
2019-07-29 21:04:38 -07:00
sramamoorthy 5a56f6b456 minor snap create client improvement and bug fixes 2019-07-29 20:28:22 -07:00
A.J. Beamon bc536757df Add knob to control whether merges request new servers or not. Set the default to request new servers in \xff but not in main key space. 2019-07-29 15:47:34 -07:00
Evan Tschannen 6b5e683de5 The mountainChopper and valleyFiller only move larger than average shards, to avoid moving high bandwidth shards which are generally smaller. 2019-07-28 23:50:42 -07:00
Evan Tschannen cc4481b71a team builders prefer to make teams which overlap less with existing teams 2019-07-28 23:44:23 -07:00
Evan Tschannen d8b14fe372 we cannot buggify replace content bytes because it takes too long to recover when the txnStateStore is too large 2019-07-28 19:34:17 -07:00
Evan Tschannen 9a0db74230 fix: forced recovery did not copy txsTags properly 2019-07-28 19:31:53 -07:00
Evan Tschannen 7e97bd181a fix: we need to build teams when a server becomes healthy if it is possible another server does not have enough teams 2019-07-28 19:31:21 -07:00
Evan Tschannen 13203da199 fix: do not set the popped version of txsTag because it could be copied over at the recoveredAt version 2019-07-27 22:36:06 -07:00
Evan Tschannen cfc985cdf1 re-enabled flat buffers, fixed the latencyBandConfig serialization 2019-07-27 17:48:24 -07:00
Evan Tschannen 5c98dcce6d revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators 2019-07-27 16:46:22 -07:00
Evan Tschannen 9871045cc7 flat buffers is causing an infinite loop when serializing LatencyBandConfig::GrvConfig 2019-07-27 16:34:18 -07:00
Evan Tschannen b509a441e7 Merge branch 'master' into feature-skip-confirm
# Conflicts:
#	bindings/flow/tester/Tester.actor.cpp
#	bindings/go/src/_stacktester/stacktester.go
#	bindings/java/src/test/com/apple/foundationdb/test/AsyncStackTester.java
#	bindings/java/src/test/com/apple/foundationdb/test/StackTester.java
#	bindings/python/tests/tester.py
#	bindings/ruby/tests/tester.rb
#	documentation/sphinx/source/api-c.rst
#	documentation/sphinx/source/api-python.rst
#	documentation/sphinx/source/api-ruby.rst
#	documentation/sphinx/source/data-modeling.rst
#	documentation/sphinx/source/developer-guide.rst
#	fdbclient/vexillographer/fdb.options
#	fdbserver/MasterProxyServer.actor.cpp
2019-07-27 15:08:13 -07:00
Evan Tschannen c2739e510d set uninitialized variable 2019-07-27 14:30:56 -07:00
Evan Tschannen ee94e8a062 removed a trace event which was causing valgrind errors 2019-07-27 13:51:59 -07:00
Evan Tschannen 90e3b50213 Merge branch 'master' into feature-coordinator-connection
# Conflicts:
#	fdbclient/DatabaseContext.h
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/workloads/KillRegion.actor.cpp
2019-07-26 15:05:02 -07:00
Evan Tschannen 3ad1d95049
Merge pull request #1894 from ajbeamon/trace-file-detail-rename
Expand undefined acronym in trace event detail
2019-07-26 13:34:45 -07:00
Evan Tschannen 1a4ca05a04
Merge pull request #1889 from ajbeamon/add-cache-memory-parameter
Add cache_memory parameter to fdbserver
2019-07-26 13:34:24 -07:00
Evan Tschannen 04dd293af0
Merge pull request #1874 from xumengpanda/mengxu/DD-code-read
DataDistribution:Add comments to help understand the code
2019-07-26 13:30:44 -07:00
Evan Tschannen 98c3b24036
Merge pull request #1869 from alexmiller-apple/sharded-txs-performance
Raise the priority of TLogRejoin above TLogPeekReply
2019-07-26 13:30:13 -07:00
Evan Tschannen 28df2c35bb
Merge pull request #1855 from alexmiller-apple/sharded-txs-safe-upgrade
Make sharded txsTag upgradeable and downgradeable
2019-07-26 13:29:39 -07:00
Evan Tschannen 2123fa1c3a
Merge pull request #1853 from xumengpanda/mengxu/redundantTeamRemoverPriority-PR
Lower the RelocateShard priority for removing redundant teams
2019-07-26 13:28:42 -07:00
Evan Tschannen 8149b5b352
Merge pull request #1413 from atn34/change-connection-file
Switch cluster file feature
2019-07-26 13:27:37 -07:00
Evan Tschannen ee92f0574f fix: lastRequestTime was not updated
fix: COORDINATOR_REGISTER_INTERVAL was not set
fixed review comments
2019-07-26 13:23:56 -07:00
A.J. Beamon 7982e55ccd
Merge pull request #1898 from jzhou77/remove-monitorServerInfoConfig
Add a monitorServerInfoConfig() call back
2019-07-26 08:34:51 -07:00
sramamoorthy 9afd162e2f remove snap v1 related code 2019-07-25 17:29:31 -07:00
Evan Tschannen be5d144b8b added status information on connected clients 2019-07-25 17:15:31 -07:00
A.J. Beamon b91795d288 Send bytes input rate to DD. 2019-07-25 16:27:32 -07:00
Balachandar Namasivayam bf87d906f6 Fix a crash. 2019-07-25 16:15:28 -07:00
Jingyu Zhou bbeaf0ebbb Add a monitorServerInfoConfig() call back
This was deleted during a code refactor in ef868f5. Because no tests were
complaining, we didn't find this until now.
2019-07-25 15:17:26 -07:00
A.J. Beamon a92b6cd3d1 Merge branch 'master' into add-cache-memory-parameter
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2019-07-25 13:41:57 -07:00
senthil-ram edeec8a622 Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: Alex Miller <35046903+alexmiller-apple@users.noreply.github.com>
2019-07-24 15:36:28 -07:00
sramamoorthy 31a1e6858b remove un-necessary state variables in getCoord 2019-07-24 15:36:28 -07:00
sramamoorthy a65c9f92ed get rid of all timeouts and other changes 2019-07-24 15:36:28 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy 4f2bb561de snapshot only local tlogs and not the satellite 2019-07-24 15:36:28 -07:00
sramamoorthy 021c949801 increase snap timeout to 15s for simulator 2019-07-24 15:36:28 -07:00
sramamoorthy 869f77aef1 Few cosmetic edits and fixes 2019-07-24 15:36:28 -07:00
sramamoorthy ddd4523816 bug fix in timeout & header file re-arrange in DD 2019-07-24 15:36:28 -07:00
sramamoorthy c18558cf55 enable DD mode in restore based on test spec 2019-07-24 15:36:28 -07:00
sramamoorthy 31c010b393 few minor fixes 2019-07-24 15:36:28 -07:00
sramamoorthy 33c2801944 adjust versions to handle KCV > recoveryVersion 2019-07-24 15:36:28 -07:00
sramamoorthy 62c14dae72 disable dd during snap and enable in restore 2019-07-24 15:36:28 -07:00
sramamoorthy c73bdfad9f do not pop txsTag 2019-07-24 15:36:28 -07:00
sramamoorthy a335ed2011 includeCancelled for tLogSnapCreate 2019-07-24 15:36:28 -07:00
sramamoorthy 080b3da322 includeCancelled for workerSnapCreate 2019-07-24 15:36:28 -07:00
sramamoorthy 61cd690add enable/disable pop req with UID mis-match to fail 2019-07-24 15:36:28 -07:00
sramamoorthy d90b678f6f storage worker to throw in case of failures 2019-07-24 15:36:28 -07:00
sramamoorthy 95d6807740 tryGetReply instead of getReply for ddSnapReq 2019-07-24 15:36:28 -07:00
sramamoorthy 7ec8fe6e74 snap v2: implement get only local storage workers 2019-07-24 15:36:28 -07:00
sramamoorthy 671c98fa3d snap v2: test files changes 2019-07-24 15:36:28 -07:00
sramamoorthy a954cf4e06 snap v2: restore related changes for the simulator 2019-07-24 15:36:28 -07:00
sramamoorthy 7e04e3c8be snap v2: knobs for max snap create timeout 2019-07-24 15:36:28 -07:00
sramamoorthy 8f1f0c0435 snap v2: worker and other helper related changes 2019-07-24 15:36:28 -07:00
sramamoorthy f4e257e464 snap v2: TLog related changes 2019-07-24 15:36:28 -07:00
sramamoorthy ba6bccce73 snap v2: DD changes - snapshot orchestration logic 2019-07-24 15:36:28 -07:00
sramamoorthy d0793f5ca2 snap v2: master proxy related changes 2019-07-24 15:36:28 -07:00
Trevor Clinkenbeard 9ad9bd4c1f Merge branch 'master' of https://github.com/apple/foundationdb into change-connection-file 2019-07-24 15:22:26 -07:00
Evan Tschannen 8b73a1c998 removed verbose trace messages 2019-07-24 15:07:41 -07:00
Evan Tschannen 2434d06726 fix: The coordinators did not properly track hasConnectedClients 2019-07-24 14:41:12 -07:00
A.J. Beamon 639df02f20 Expand undefined acronym in trace event detail 2019-07-24 08:38:36 -07:00
Evan Tschannen b303ab4e6c fix: DR agents need to be clients because their failure monitoring information needs to come from two different cluster controllers 2019-07-23 19:24:07 -07:00
Evan Tschannen 4a866290b7 Clients keep a persistent connection open with coordinators to get updates to the list of proxies
Status still needs to be updated with client information from the coordinators
2019-07-23 19:22:44 -07:00
A.J. Beamon 94be9560ea Add cache_memory parameter to fdbserver to control the size of the (4K) page cache. Change the default slightly from 2000 MiB to 2 GiB. 2019-07-23 15:05:21 -07:00
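
To see how small the change of default in the commit above is, the arithmetic can be sketched as follows. This assumes that "4K" means 4096-byte pages and uses binary units (1 MiB = 1024² bytes, 1 GiB = 1024³ bytes); the variable names are illustrative and not taken from the fdbserver source.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
PAGE_SIZE = 4096  # assumption: "4K" pages are 4096 bytes

old_default = 2000 * MIB  # previous default page cache size, in bytes
new_default = 2 * GIB     # new default page cache size, in bytes

extra_bytes = new_default - old_default
extra_pages = extra_bytes // PAGE_SIZE

print(f"old default: {old_default} bytes ({old_default // PAGE_SIZE} pages)")
print(f"new default: {new_default} bytes ({new_default // PAGE_SIZE} pages)")
print(f"increase:    {extra_bytes // MIB} MiB ({extra_pages} pages)")
```

Under these assumptions the new default is 48 MiB larger than the old one.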
Meng Xu e582219ec5 Remove unnecessary condition in DDQueue
Resolve the review comment.
2019-07-22 17:00:37 -07:00
Vishesh Yadav 2f9f3c184f
Merge pull request #1870 from alexmiller-apple/txnStateStore-workload
Add ability to bulk load data into TxnStateStore
2019-07-22 13:20:39 -07:00
A.J. Beamon e29a6ea280
Merge pull request #1871 from bnamasivayam/tr-priority-add-client-log
Track the priority of sampled Transaction as part of GetReadVersion e…
2019-07-22 13:04:52 -07:00
Balachandar Namasivayam df652155fc Addressed review comments 2019-07-22 12:17:05 -07:00
Meng Xu b7478f5dd3 DD:Add comments to help understand code
Add comments to explain the functionalities of some code.
2019-07-22 11:23:16 -07:00
Meng Xu 378db79441 Resolve conflict when merge with master 2019-07-22 10:56:20 -07:00
Meng Xu dae4436a3d TC:UnitTest:Change invariant due to alg change 2019-07-20 21:06:54 -07:00
Meng Xu 612a51fe00 Apply Clang format to PRIORITY_TEAM_REDUNDANT 2019-07-19 18:32:22 -07:00
Meng Xu ea76451f15 Count PRIORITY_TEAM_REDUNDANT as count PRIORITY_TEAM_UNHEALTHY 2019-07-19 18:30:01 -07:00
Alex Miller 4ac1a0f557 Add ability to bulk load data into TxnStateStore
* Changes BulkLoad workload to support a specific volume of data to load
* Changes BulkLoad and Cycle to correctly handle \xff in keyPrefix
* Adds BulkLoad to TxnStateStoreCycleTest
2019-07-19 18:01:24 -07:00