Commit Graph

1091 Commits

Author SHA1 Message Date
mpilman da53a92bec Make protocol version a type
This fixes #1214

The basic idea is that ProtocolVersion is now its own type. This
alone is an improvement as it makes many things more typesafe. For
each version, we can now add breaking features (for example Fearless).
After that, there's no need to test against actual (confusing) version
numbers. Instead a developer can simply test
`protocolVersion->hasFearless()` and this will return true iff the
protocolVersion is newer than the newest version that didn't support
fearless.
2019-06-16 09:59:15 -07:00
Vishesh Yadav f9bfd74dd0 Monitor leader only when proxies are unknown or any dies
FDB clients talk to coordinators to monitor the leader, and then ask
them for proxies. Once proxies are known, the clients still keep
talking to coordinators. This patch, stops this monitoring and
reconnects to coordinators only if one of the proxy is no longer
available.
2019-06-14 13:27:57 -07:00
Evan Tschannen 6ececa94ce
Merge pull request #1640 from vishesh/task/client-failmon
Clients will no longer get failure monitoring info from cluster controller
2019-06-13 17:31:17 -07:00
Evan Tschannen dccb9bc26d fixed a number of correctness problems 2019-06-12 19:40:50 -07:00
Trevor Clinkenbeard 1e8f7e5b82 Refactor NextFastAllocatedSize to be constexpr function 2019-06-11 15:55:23 -07:00
Trevor Clinkenbeard 8144882d7b Merge branch 'apple-master' into features/local-rk 2019-06-10 19:40:25 -07:00
Vishesh Yadav a8e408e268 run clang-format on changes 2019-06-10 14:10:24 -07:00
Vishesh Yadav 4316ef9ec6 failMon: For clients remove expireFailure and report failures only during connect 2019-06-09 00:43:38 -07:00
Vishesh Yadav 6fa7081a21 net: Don't make FailureMonitoring requests from client
This patch removes the need for clients to continuously contact
cluster coordinator for failure monitoring information. Instead, it
uses the FlowTransport to monitor the statuses of peers and update
FailureMonitor accordingly.
2019-06-09 00:43:38 -07:00
Vishesh Yadav 6b4d30c3ae failmon: Identify client vs server when starting failure monitoring client 2019-06-09 00:43:12 -07:00
Evan Tschannen 5bdf5aaeb6
Merge pull request #1662 from etschannen/master
Merge 6.1 into master
2019-06-06 13:57:34 -07:00
A.J. Beamon dbfa746494 Don't set the deterministic random seed with platform::getRandomSeed since that is what it does by default. 2019-06-05 11:48:29 -07:00
A.J. Beamon 31e82a6d6d Clients should seed the deterministic random number generator on the network thread. 2019-06-05 11:48:29 -07:00
Evan Tschannen 29b96414e2 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/NativeAPI.actor.cpp
#	fdbserver/Coordination.actor.cpp
#	flow/Arena.h
#	versions.target
2019-06-03 18:49:35 -07:00
Stephen Atherton d0011c3844 Merge branch 'release-6.1' of https://github.com/apple/foundationdb into reduce-http-error-noise 2019-06-03 16:51:21 -07:00
A.J. Beamon 773bce9e32
Merge pull request #1643 from etschannen/feature-cc-mem-leak
Fixed a memory leak on the cluster controller
2019-06-03 15:02:36 -07:00
Stephen Atherton 7ac62fb40f For HTTP requests, a missing request ID in the response is ignored if the response code is 5xx indicating a server error. 2019-05-31 12:33:39 -07:00
Evan Tschannen 7c333dbc16 If a process receives a message in its clusterControllerInterface before becoming the cluster controller, if the process does not become the cluster controller in the next minute it should destroy the interface to prevent a memory leak. 2019-05-29 16:57:13 -07:00
Evan Tschannen 362c2bf1e6 improved the cpu efficiency of printable 2019-05-29 14:55:45 -07:00
sramamoorthy 4bcb590f12 g_random -> deterministicRandom() 2019-05-28 22:07:46 -07:00
sramamoorthy 2a68b28590 rebase related changes 2019-05-28 22:07:46 -07:00
sramamoorthy b17ad85497 exec op not supported when log_anti_quorum > 0 2019-05-28 22:07:46 -07:00
sramamoorthy dcb99c5138 txn to disable tlogPop to be timedout
If the disable tlog pop txn takes more than 30 seconds then
tlog will automatically start enabling pop, fix is to timeout
the txn if it takes more than 10 seconds and retry a new
txn to disable tlog pop.
2019-05-28 22:07:46 -07:00
sramamoorthy d3a179b6f9 Multiple bug fixes
- wait for snapTLogFailKeys in a loop, otherwise in some race
  condition it can cause a false assert
- in single region, there does not seem to be a guarantee of
  tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
  address is not sufficient in some cases, hence looking by both
  primary and secondary address
- test make files to reflect the location of the new test cases
2019-05-28 22:07:46 -07:00
sramamoorthy bb474dc323 if recovery < fully_recovered then fail the exec
Will do more cleanup, pushing it for a test run in CI
2019-05-28 22:07:46 -07:00
sramamoorthy 925499954b New status cluster_not_fully_recovered 2019-05-28 22:07:46 -07:00
sramamoorthy 591ff96b93 increase retry and use eat instead of parsing 2019-05-28 22:07:46 -07:00
sramamoorthy 4083af0b01 Avoid using trackLatest for TLog pop test cases 2019-05-28 22:07:46 -07:00
sramamoorthy 936ffc2dde rebase related changes 2019-05-28 22:07:46 -07:00
sramamoorthy ec7834e2f7 code re-orgnaization and address comments 2019-05-28 22:07:46 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 89b7a052f5 Bug fixes for snapping coordinators 2019-05-28 22:07:46 -07:00
sramamoorthy 17ecba8313 trace cleanup and other indentation changes 2019-05-28 22:07:46 -07:00
sramamoorthy 898bed66c1 Allow only whitelisted binary path for exec op 2019-05-28 22:07:46 -07:00
sramamoorthy aa79480d69 changes to make fdbfork asynchronous 2019-05-28 22:07:46 -07:00
sramamoorthy f7ba0635ef Make Exec op the first op in the batch 2019-05-28 22:07:46 -07:00
sramamoorthy 382b246930 trace change and retain fitness file after restore 2019-05-28 22:07:46 -07:00
sramamoorthy 3d5998e9dd tlog: when pops are disabled, store them & replay
In Tlogs, disable pop is done whlie taking snapshots. Earlier, tlogs
were ignoring the pops if it got pop requests when pops were
disabled. In this change, instead of ignoring the pop - it remembers
the list of pops in-memory and plays them once the popping is
enabled.
2019-05-28 22:07:46 -07:00
sramamoorthy 72dd067173 Trace message changes and fix few FIXMEs 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and resotre implementation 2019-05-28 22:07:46 -07:00
A.J. Beamon 20d83d61db Merge branch 'master' into thread-safe-random-number-generation 2019-05-23 11:07:08 -07:00
Evan Tschannen b451c2cd56
Merge pull request #1497 from alexmiller-apple/fastrecovery
Add an \xff keyrange that is backed by the txnStateStore.
2019-05-23 10:52:35 -07:00
A.J. Beamon e5381e0612 Fix some new usages of g_random 2019-05-23 09:23:27 -07:00
A.J. Beamon 603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
Evan Tschannen f4fbaac6b0 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-05-19 10:27:59 -07:00
A.J. Beamon 52b0fcb946
Merge pull request #1602 from etschannen/release-6.1
Increase the configure timeout to 60 seconds
2019-05-17 15:09:30 -07:00
A.J. Beamon a8b9d8e34b
Merge pull request #1336 from tclinken/fast-allocate-ptree-nodes
Create 96-byte fast allocator for storage queue PTree nodes
2019-05-17 14:22:46 -07:00
Jingyu Zhou b8e7fc1b84 Refactor: add std:: qualifier and use emplace_back 2019-05-17 09:38:50 -10:00
Evan Tschannen 4166ecdfe7 Increase the configure timeout to 60 seconds, to avoid spurious configuration failures 2019-05-16 18:03:16 -07:00
Alvin Moore 3acaa7343e Enabled C++17 for all Windows projects
Set Visual Studio version to 2017 (first version to support C++17)
2019-05-16 17:44:13 -07:00