Commit Graph

67 Commits

Author SHA1 Message Date
Evan Tschannen 048201717c Fixed a number of problems with monitorLeaderRemotely 2020-05-10 14:20:50 -07:00
Alex Miller 5f51d97444 Remove unneeded module violation. 2020-05-07 15:40:22 -07:00
Alex Miller 38fa218a00 Remove unneeded duplicated code. 2020-05-05 01:02:01 -07:00
Alex Miller 1117eae2b5 Rework to make ElectionResult code similar to OpenDatabase code.
And also restore and fix the delayed cluster controller code.
2020-05-05 01:00:17 -07:00
Alex Miller 43a63452d8 YOLO at reducing TLS connection count via doing monitorLeader on coordinators 2020-05-01 14:40:21 -07:00
Evan Tschannen 303df197cf Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	bindings/c/test/mako/mako.c
#	documentation/sphinx/source/release-notes.rst
#	fdbbackup/backup.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/LogRouter.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/flow.vcxproj
#	flow/flow.vcxproj.filters
#	versions.target
2020-03-06 18:22:46 -08:00
Evan Tschannen c11c24b79d removed the fdbrpc version of platform.h 2020-02-28 14:56:10 -08:00
A.J. Beamon d1e1fea42d Our binaries that act like clients (fdbcli, backup and DR binaries) were reporting an unknown client version. Clients did not react if the list of supported versions changed. 2020-02-28 09:35:21 -08:00
Evan Tschannen 924d335aa7 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	flow/Knobs.cpp
#	flow/Knobs.h
2020-02-25 18:25:19 -08:00
Evan Tschannen 13a523a355 fix: commit on first proxy did not always commit to the first proxy 2020-02-25 12:34:31 -08:00
Andrew Noyes 1248d2b8b4 Remove USE_OBJECT_SERIALIZER knob 2020-02-12 10:41:52 -08:00
Evan Tschannen 855f03a41f ratekeeper needed to check remoteDC in another location
the storage server scoped a transaction incorrectly
2020-01-10 15:58:36 -08:00
Evan Tschannen da1be272cb fix: servers which opened the database would use the full list of proxies 2020-01-10 12:20:30 -08:00
Evan Tschannen 9b4f7626bb cache the serialization of clientDBInfo 2019-09-11 15:19:42 -07:00
Vishesh Yadav 2b941f51bd Revert "fix: Use getReply* instead of tryGetReply in `monitorProxies`"
This reverts commit e7c94a2411.
2019-08-26 18:31:08 -07:00
Vishesh Yadav e7c94a2411 fix: Use getReply* instead of tryGetReply in `monitorProxies`
`tryGetReply` is unreliable, and since `monitorProxies` expects reply
after long period, the connection to coordinator gets closed due to
idle timeout, only to get reopened again in next loop to make
`openDatabase` request.

When using `getReply` our reliable message queue won't be empty and
connection will stay open.
2019-08-26 18:24:49 -07:00
Evan Tschannen 70ce678879 fix: max_protocol_clients were being added to the connected_clients list
fix: the clientCount was included clients with unknown protocol versions. This has been changed back to the pre-6.2 behavior where it is just a count of clients with known versions, and now clients with unknown versions are tracked explicitly as its own supported_version section
2019-08-13 15:54:40 -07:00
mpilman 370ba8b841 Remove --object-serializer flag from executables 2019-08-06 09:25:40 -07:00
Evan Tschannen 0063ef62ea fix: the client would not shrink the proxy list in all cases 2019-07-31 16:06:51 -07:00
Evan Tschannen 54df2abe8e fix: trace event did not compile 2019-07-30 17:52:53 -07:00
Evan Tschannen 85767f2034
Update fdbclient/MonitorLeader.actor.cpp 2019-07-30 17:19:33 -07:00
Evan Tschannen 7aece7398b fix: it was reducing the list of proxies on the coordinators, which would have made all the clients talking to that coordinator connect to the same set of proxies
optimized the code to avoid re-randomizing the same list of proxies
2019-07-30 17:15:24 -07:00
Evan Tschannen 8425f53fc5 clients only connect to three proxies 2019-07-28 23:52:29 -07:00
Evan Tschannen 27d9ce1143 fixed merge conflict 2019-07-26 15:23:36 -07:00
Evan Tschannen 90e3b50213 Merge branch 'master' into feature-coordinator-connection
# Conflicts:
#	fdbclient/DatabaseContext.h
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/workloads/KillRegion.actor.cpp
2019-07-26 15:05:02 -07:00
Evan Tschannen ee92f0574f fix: lastRequestTime was not updated
fix: COORDINATOR_REGISTER_INTERVAL was not set
fixed review comments
2019-07-26 13:23:56 -07:00
Evan Tschannen be5d144b8b added status information on connected clients 2019-07-25 17:15:31 -07:00
Evan Tschannen 8b73a1c998 removed verbose trace messages 2019-07-24 15:07:41 -07:00
Evan Tschannen 4a866290b7 Clients keep a persistent connection open with coordinators to get updates to the list of proxies
Status still needs to be updated with client information with information from the coordinators
2019-07-23 19:22:44 -07:00
Andrew Noyes e7e48a40ce Fix a few IncludeVersion bugs 2019-07-16 13:28:29 -07:00
mpilman 75d4b612cf Make object serializer versioned 2019-07-12 11:53:14 -07:00
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
A.J. Beamon e5381e0612 Fix some new usages of g_random 2019-05-23 09:23:27 -07:00
A.J. Beamon 603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
mpilman 44db3450ec Several flatbuffers bug fixes 2019-05-13 14:15:23 -07:00
mpilman 47fdb78782 use EnsureTable for enums of enums 2019-05-13 14:15:22 -07:00
mpilman 7affe73774 Added more low-level tests and implemented enums 2019-05-13 14:15:22 -07:00
mpilman 8df7818b17 Added a test to reproduce current failure
Currently, enabling the ObjectSerializer results
in failures. These are hard to debug. Therefore I added
a unit test that reproduces the problem.
2019-05-13 14:15:22 -07:00
mpilman 92bad76479 Wrap ClusterClientInterface into its own type
When a process joins a cluster it fetches the cluster
interface. However, not the whole interface is exposed
to the client. This mechanism relies on the fact that
the serializer keeps the field ordering and doesn't
verify the message before parsing it.

To make this work, we provide a client type with one
member (the ClusterInterface which is exposed to the
client and the server). This client interface has the
same FileIdentifier as the ClusterControllerFullInterface
which has the same first member. This works because
FlatBuffers allows for members to be missing.
2019-05-13 14:15:22 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Meng Xu 85c24b0067 Merge branch 'master' into mengxu/tls-switch-status-PR 2019-03-12 15:20:54 -07:00
Meng Xu 22f5624494 TLS Status: Reduce cluster controller load
When the coordinator changes, we use delayedAsyncVar() to reduce
the frequency for cluster controller to send the updated connectedCoordinatorsNumDelayed
to clients.
This help reduce the cluster controllers workload
2019-03-12 15:08:10 -07:00
Meng Xu 46f4b02807 TLS Status: Resolve review comments
Use connectedCoordinatorsNumDelayed to reduce the load on cluster controller;
Set connectedCoordinatorsNum to null by default for monitorLeader()
2019-03-11 17:10:08 -07:00
Vishesh Yadav ed49d603a0 Allows cluster string to contain coordinators with different TLS states
During live TLS upgrades, we can hence switch one coordinator at a time
to TLS than all of them together.
2019-03-06 16:05:10 -08:00
Meng Xu 04880e3d4d Merge branch 'master' into mengxu/tls-switch-status-PR 2019-03-06 13:41:16 -08:00
Meng Xu b7a52e81e2 Status: Count connected coordinators per client
A client will always try to connect all coordinators.
This commit let Status track the number of connected coordinators
for each client.

This allows us to do canary in coordinators. For example,
when we switch from non-TLS to TLS, we can switch 1 coordinator
from non-TLS to TLS. This can help check if a client has the ability
to connect through TLS.
We can make the non-TLS to TLS switch for each coordinators
one by one. This avoid the risk of losing connection in the switch.
2019-03-05 21:21:23 -08:00
Vishesh Yadav 57832e625d net: Support IPv6 #963
- NetworkAddress now contains IPAddress object which can be either
IPv4 or IPv6 address. 128bits are used even for IPv4 addresses,
however only 32bits are used when using/serializing IPv4 address.

- ConnectPacket is updated to store IPv6 address. Backward compatible
with old format since the first 32bits of IP address field is used
for serialization of IPv4.

- Mainly updates rest of the code to use IPAddress structure instead
of plain uint32_t.

- IPv6 address/pair ports should be represented as `[ip]:port` as per
convention. This applies to both cluster files and command line
arguments.
2019-03-04 14:12:41 -08:00
mpilman 479a4d7c22 Minor fixes in fdbclient for intellisense 2019-02-19 15:16:59 -08:00
Vishesh Yadav 3eb9b23024 Listen to multiple addresses and start using vector<NetworkAdddress> in Endpoint
- This patch will make FDB listen to multiple addresses given via
  command line. Although, we'll still use first address in most places,
  this patch starts using vector<NetworkAddress> in Endpoint at some basic
  places.
- When sending packets to an endpoint, pick a random network address in
  endpoints
- Renames Endpoint::address to Endpoint::addresses since it
  now holds a vector of addresses.
2018-12-13 13:36:52 -08:00
Vishesh Yadav 43e5a46f9b Change Endpoint::address(NetworkAddress) to vector<NetworkAddress>
Extend `Endpoint` class to take multiple NetworkAddresses instead of
just one. Hence, to talk to an endpoint instead of one IP:PORT, we'll
have multiple IP:PORT pairs.

This patch simply adds the field and makes changes to compile the
codebase. The first element of of `address` field is used everywhere.
Hence the way we talk to remains same with this patch.

NOTE:

Directly accessing the first memeber of Endpoint::address is unsafe
as Endpoint() doesn't enforces non-empty address list. However, since
the correctness test pass for now and are anyway replacing all those
unsafe accesses with ones considering the whole vector, this patch
ignores to access them in safe way.
2018-12-13 13:36:52 -08:00