Commit Graph

65 Commits

Author SHA1 Message Date
A.J. Beamon e61cac4ed4 Fix spacing issue; rename fdbrpc/Stats.h to fdbrpc/TimedRequest.h 2019-08-01 08:39:52 -07:00
mpilman 7d247af500 Two minor bug fixes from recent optimizations 2019-07-31 19:14:11 -07:00
mpilman dabe516320 Avoid unnecessary timer calls 2019-07-31 17:59:35 -07:00
Evan Tschannen 5c98dcce6d revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators 2019-07-27 16:46:22 -07:00
Evan Tschannen 90e3b50213 Merge branch 'master' into feature-coordinator-connection
# Conflicts:
#	fdbclient/DatabaseContext.h
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/workloads/KillRegion.actor.cpp
2019-07-26 15:05:02 -07:00
sramamoorthy 9afd162e2f remove snap v1 related code 2019-07-25 17:29:31 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy d0793f5ca2 snap v2: master proxy related changes 2019-07-24 15:36:28 -07:00
Evan Tschannen 4a866290b7 Clients keep a persistent connection open with coordinators to get updates to the list of proxies
Status still needs to be updated with client information with information from the coordinators
2019-07-23 19:22:44 -07:00
Andrew Noyes 1968db17e3 Initialize in default constructor for GetReadVersionReply
==10473== Uninitialised byte(s) found during client check request
==10473==    at 0x1BA9ACE: sendPacket(TransportData*, ISerializeSource const&, Endpoint const&, bool, bool) (FlowTransport.actor.cpp:1252)
==10473==    by 0x877C05: (anonymous namespace)::NetworkSenderActorState<GetReadVersionReply, (anonymous namespace)::NetworkSenderActor<GetReadVersionReply> >::a_body1cont2(GetReadVersionReply const&, int) [clone .isra.0] (networksender.actor
.h:40)
==10473==    by 0x877CC6: a_body1when1 (networksender.actor.g.h:147)
==10473==    by 0x877CC6: a_callback_fire (networksender.actor.g.h:161)
==10473==    by 0x877CC6: ActorCallback<(anonymous namespace)::NetworkSenderActor<GetReadVersionReply>, 0, GetReadVersionReply>::fire(GetReadVersionReply const&) (flow.h:894)
==10473==    by 0xC343A7: send<GetReadVersionReply&> (flow.h:343)
==10473==    by 0xC343A7: send<GetReadVersionReply&> (fdbrpc.h:124)
==10473==    by 0xC343A7: (anonymous namespace)::ForwardProxyActorState<(anonymous namespace)::ForwardProxyActor>::a_body1loopBody1when2(ReplyPromise<GetReadVersionReply> const&, int) (MasterProxyServer.actor.cpp:1814)
==10473==    by 0xC33C10: (anonymous namespace)::ForwardProxyActorState<(anonymous namespace)::ForwardProxyActor>::a_body1loopBody1(int) (MasterProxyServer.actor.g.cpp:8167)
==10473==    by 0xC35434: a_body1loopHead1 (MasterProxyServer.actor.g.cpp:8152)
==10473==    by 0xC35434: a_body1loopBody1cont2 (MasterProxyServer.actor.g.cpp:8327)
==10473==    by 0xC35434: a_body1loopBody1cont1when1 (MasterProxyServer.actor.g.cpp:8333)
==10473==    by 0xC35434: a_body1loopBody1cont1when1 (MasterProxyServer.actor.g.cpp:8331)
==10473==    by 0xC35434: a_callback_fire (MasterProxyServer.actor.g.cpp:8347)
==10473==    by 0xC35434: ActorCallback<(anonymous namespace)::ForwardProxyActor, 3, Void>::fire(Void const&) (flow.h:894)
==10473==    by 0x7E7BE7: SAV<Void>::finishSendAndDelPromiseRef() (flow.h:375)
==10473==    by 0x8319FD: a_body1when1 (genericactors.actor.g.h:10892)
==10473==    by 0x8319FD: a_callback_fire (genericactors.actor.g.h:10920)
==10473==    by 0x8319FD: ActorCallback<(anonymous namespace)::ChooseActorActor<Void>, 0, Void>::fire(Void const&) (flow.h:894)
==10473==    by 0x891917: void SAV<Void>::send<Void>(Void&&) (flow.h:343)
==10473==    by 0x1C47ADC: send<Void> (flow.h:674)
==10473==    by 0x1C47ADC: execTask (sim2.actor.cpp:1632)
==10473==    by 0x1C47ADC: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1cont1(Void const&, int) (sim2.actor.cpp:975)
==10473==    by 0x1C47FF2: a_body1loopBody1when1 (sim2.actor.g.cpp:5092)
==10473==    by 0x1C47FF2: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1(int) (sim2.actor.g.cpp:5037)
==10473==    by 0x1C47A6C: a_body1loopHead1 (sim2.actor.g.cpp:5020)
==10473==    by 0x1C47A6C: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1cont1(Void const&, int) (sim2.actor.g.cpp:5086)
==10473==  Address 0x12db1ba1 is 2,977 bytes inside a recently re-allocated block of size 4,096 alloc'd
==10473==    at 0x1CC5D7F: FastAllocator<4096>::allocate() (FastAlloc.cpp:290)
==10473==    by 0x1CFAA68: operator new (FastAlloc.h:193)
==10473==    by 0x1CFAA68: PacketWriter::nextBuffer() (Net2Packet.cpp:59)
==10473==    by 0x1CFABD6: PacketWriter::writeAhead(int, SplitBuffer*) (Net2Packet.cpp:81)
==10473==    by 0x1BA97EB: sendPacket(TransportData*, ISerializeSource const&, Endpoint const&, bool, bool) (FlowTransport.actor.cpp:1199)
==10473==    by 0x7DEAD1: a_body1cont2 (networksender.actor.h:40)
==10473==    by 0x7DEAD1: a_body1when1 (networksender.actor.g.h:147)
==10473==    by 0x7DEAD1: a_callback_fire (networksender.actor.g.h:161)
==10473==    by 0x7DEAD1: ActorCallback<(anonymous namespace)::NetworkSenderActor<GetValueReply>, 0, GetValueReply>::fire(GetValueReply const&) (flow.h:894)
==10473==    by 0xF22767: send<GetValueReply&> (flow.h:343)
==10473==    by 0xF22767: send<GetValueReply&> (fdbrpc.h:124)
==10473==    by 0xF22767: (anonymous namespace)::GetValueQActorState<(anonymous namespace)::GetValueQActor>::a_body1cont5(int) [clone .isra.0] (storageserver.actor.cpp:890)
==10473==    by 0xF2305C: (anonymous namespace)::GetValueQActorState<(anonymous namespace)::GetValueQActor>::a_body1cont3(int) [clone .isra.0] (storageserver.actor.g.cpp:1592)
==10473==    by 0xF23447: a_body1cont2when1 (storageserver.actor.g.cpp:1627)
==10473==    by 0xF23447: (anonymous namespace)::GetValueQActorState<(anonymous namespace)::GetValueQActor>::a_body1cont2(Void const&, int) [clone .isra.0] (storageserver.actor.g.cpp:1512)
==10473==    by 0xF23507: a_body1when1 (storageserver.actor.g.cpp:1523)
==10473==    by 0xF23507: a_callback_fire (storageserver.actor.g.cpp:1537)
==10473==    by 0xF23507: ActorCallback<(anonymous namespace)::GetValueQActor, 0, Void>::fire(Void const&) (flow.h:894)
==10473==    by 0x891917: void SAV<Void>::send<Void>(Void&&) (flow.h:343)
==10473==    by 0x1C47ADC: send<Void> (flow.h:674)
==10473==    by 0x1C47ADC: execTask (sim2.actor.cpp:1632)
==10473==    by 0x1C47ADC: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1cont1(Void const&, int) (sim2.actor.cpp:975)
==10473==    by 0x1C47FF2: a_body1loopBody1when1 (sim2.actor.g.cpp:5092)
==10473==    by 0x1C47FF2: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1(int) (sim2.actor.g.cpp:5037)
==10473==  Uninitialised value was created by a stack allocation
==10473==    at 0xC342D0: (anonymous namespace)::ForwardProxyActorState<(anonymous namespace)::ForwardProxyActor>::a_body1loopBody1when2(ReplyPromise<GetReadVersionReply> const&, int) (MasterProxyServer.actor.g.cpp:8213)
2019-07-22 13:54:52 -07:00
Evan Tschannen c348b3da51 After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies 2019-07-08 12:53:40 -07:00
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
sramamoorthy 2a68b28590 rebase related changes 2019-05-28 22:07:46 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 72dd067173 Trace message changes and fix few FIXMEs 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and resotre implementation 2019-05-28 22:07:46 -07:00
Jingyu Zhou b8e7fc1b84 Refactor: add std:: qualifier and use emplace_back 2019-05-17 09:38:50 -10:00
mpilman 96aaa31a6c Compiling on clang again 2019-05-13 14:15:23 -07:00
mpilman 69fa3d3903 fixed compilation issues after rebase 2019-05-13 14:15:23 -07:00
mpilman 6afce01744 Implementation complete (not yet working) 2019-05-13 14:15:22 -07:00
Evan Tschannen 4fa1c008f9 Highly prioritize storageServerRejoin messages on the proxy, so that storage servers can rejoin the cluster even when a proxy is CPU saturated 2019-04-23 20:56:01 -07:00
Evan Tschannen b6008558d3 renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen f9aad46573 made use_provisional_proxies a transaction option 2019-03-19 18:44:37 -07:00
Evan Tschannen 5b9c45ea0b clients do not attempt to connect to provisional proxies 2019-03-19 13:37:50 -07:00
A.J. Beamon 85b3f11e71 Fix various compiler warnings 2019-03-15 10:34:57 -07:00
Evan Tschannen 988add9fb5 cache the metadataVersion for commits, so that doing setVersion with the commit version of a different transaction will 2019-03-04 16:48:34 -08:00
Evan Tschannen 075fdef31a Merge branch 'master' into feature-metadata-version
# Conflicts:
#	fdbclient/DatabaseContext.h
2019-03-03 22:58:45 -08:00
Evan Tschannen 9d53305516
Make GetHealthMetricsRequest constructor explicit
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:30:10 -08:00
Evan Tschannen de577a6ca9
Modified GetHealthMetricsReply constructor
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:29:16 -08:00
Evan Tschannen 3a91fb287b
Use const HealthMetrics& parameter in update
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:17:20 -08:00
Evan Tschannen 3da85f3acd implemented the \xff/metadataVersion key, which can be used by layers to help them cheaply cache metadata and know when their cache is invalid 2019-02-28 17:45:00 -08:00
Trevor Clinkenbeard ff9a7cb2f1 Combined proxy health metrics replies into single message type 2019-02-23 10:13:43 -08:00
Trevor Clinkenbeard fa96b8dd33 Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics 2019-02-20 16:56:16 -08:00
Vishesh Yadav e05b53d755 Merge remote-tracking branch 'apple/master' into task/tls-upgrade 2019-02-15 20:37:07 -08:00
A.J. Beamon d4349293b9 Reworked the way latency counters are tracked. Report the latency bands in separate events from StorageMetrics and ProxyMetrics. Fix a problem when the latency band configuration was changed. Add correctness testing. 2019-02-07 13:39:22 -08:00
Trevor Clinkenbeard eaaa93a939 Use IncludeVersion() for GetDetailedHealthMetricsReply pre-serialization 2019-02-01 11:45:47 -08:00
Trevor Clinkenbeard 4daf49ff4d Proxy runs healthMetricsRequestServer to handle incoming health metrics requests 2019-02-01 10:58:42 -08:00
Trevor Clinkenbeard 03e5e3ccbc Proxies periodically request health metrics from Ratekeeper in the getRate function. Occassionally (determined by DETAILED_METRIC_UPDATE_RATE), requests are for detailed per-process metrics. 2019-01-31 13:25:57 -08:00
Evan Tschannen 1d7fec3074 Merge commit '048bfc5c368063d9e009513078dab88be0cbd5b0' into task/tls-upgrade-2
# Conflicts:
#	.gitignore
2019-01-24 17:43:06 -08:00
A.J. Beamon 2198d24ce1 Merge commit '3b2700d25334c53d13496ca16682642aac951beb' into track-server-request-latencies
# Conflicts:
#	fdbclient/MasterProxyInterface.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/ServerDBInfo.h
#	fdbserver/Status.actor.cpp
#	fdbserver/fdbserver.vcxproj
#	fdbserver/storageserver.actor.cpp
2019-01-24 11:43:26 -08:00
anoyes 6a4d87802b Replace & operator with variadic function 2018-12-28 11:33:42 -08:00
Vishesh Yadav 3eb9b23024 Listen to multiple addresses and start using vector<NetworkAdddress> in Endpoint
- This patch will make FDB listen to multiple addresses given via
  command line. Although, we'll still use first address in most places,
  this patch starts using vector<NetworkAddress> in Endpoint at some basic
  places.
- When sending packets to an endpoint, pick a random network address in
  endpoints
- Renames Endpoint::address to Endpoint::addresses since it
  now holds a vector of addresses.
2018-12-13 13:36:52 -08:00
Vishesh Yadav 43e5a46f9b Change Endpoint::address(NetworkAddress) to vector<NetworkAddress>
Extend `Endpoint` class to take multiple NetworkAddresses instead of
just one. Hence, to talk to an endpoint instead of one IP:PORT, we'll
have multiple IP:PORT pairs.

This patch simply adds the field and makes changes to compile the
codebase. The first element of of `address` field is used everywhere.
Hence the way we talk to remains same with this patch.

NOTE:

Directly accessing the first memeber of Endpoint::address is unsafe
as Endpoint() doesn't enforces non-empty address list. However, since
the correctness test pass for now and are anyway replacing all those
unsafe accesses with ones considering the whole vector, this patch
ignores to access them in safe way.
2018-12-13 13:36:52 -08:00
A.J. Beamon eb2f27b8e5 Work in progress implementation of server-side latency tracking. The intent of this is to be able to measure the number of requests that achieve certain latency targets across the system relative to the total number of requests. 2018-11-30 10:46:04 -08:00
Evan Tschannen 4b5d0b4e2c Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/AsyncFileBlobStore.actor.cpp
#	fdbclient/AsyncFileBlobStore.actor.h
#	fdbclient/BlobStore.actor.cpp
#	fdbclient/BlobStore.h
#	fdbclient/HTTP.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbrpc/LoadBalance.actor.h
#	fdbrpc/batcher.actor.h
#	fdbrpc/fdbrpc.vcxproj
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/SimulatedCluster.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
Evan Tschannen bf6545a9cf clients cache storage server interfaces individually, instead of as a team. This is needed because in fearless every shard has storage servers from two separate teams, leading to a lot of possible combinations
allAlternatives failed logic was simplified, because we are already doing a global rate limiting, so a per shard limit is unnecessary
reduced unnecessary state variables in waitMetrics requests
2018-11-02 13:15:09 -07:00
Evan Tschannen 18509ac4dd account for the overhead of tags on a message when batching transactions into a version
increased the default location cache size
2018-11-02 12:52:34 -07:00
Robert Escriva 268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Evan Tschannen 65057b4788 Merge branch 'release-5.2' into release-6.0
# Conflicts:
#	documentation/sphinx/source/downloads.rst
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/MasterProxyInterface.h
#	packaging/msi/FDBInstaller.wxs
2018-08-02 11:29:40 -07:00
Evan Tschannen a361a785e8 fix: clients which cannot talk to storage servers poll the proxy for new storage server interfaces. If there are too many clients polling, we saturate the proxies with these requests, prevents the storage servers from updating their interfaces. 2018-07-31 16:57:23 -07:00