A.J. Beamon
e61cac4ed4
Fix spacing issue; rename fdbrpc/Stats.h to fdbrpc/TimedRequest.h
2019-08-01 08:39:52 -07:00
mpilman
7d247af500
Two minor bug fixes from recent optimizations
2019-07-31 19:14:11 -07:00
mpilman
dabe516320
Avoid unnecessary timer calls
2019-07-31 17:59:35 -07:00
Evan Tschannen
5c98dcce6d
revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators
2019-07-27 16:46:22 -07:00
Evan Tschannen
90e3b50213
Merge branch 'master' into feature-coordinator-connection
...
# Conflicts:
# fdbclient/DatabaseContext.h
# fdbclient/NativeAPI.actor.cpp
# fdbclient/NativeAPI.actor.h
# fdbserver/workloads/KillRegion.actor.cpp
2019-07-26 15:05:02 -07:00
sramamoorthy
9afd162e2f
remove snap v1 related code
2019-07-25 17:29:31 -07:00
sramamoorthy
a2f2ad96ff
code review comments and merge to master changes
2019-07-24 15:36:28 -07:00
sramamoorthy
d0793f5ca2
snap v2: master proxy related changes
2019-07-24 15:36:28 -07:00
Evan Tschannen
4a866290b7
Clients keep a persistent connection open with coordinators to get updates to the list of proxies
...
Status still needs to be updated with client information with information from the coordinators
2019-07-23 19:22:44 -07:00
Andrew Noyes
1968db17e3
Initialize in default constructor for GetReadVersionReply
...
==10473== Uninitialised byte(s) found during client check request
==10473== at 0x1BA9ACE: sendPacket(TransportData*, ISerializeSource const&, Endpoint const&, bool, bool) (FlowTransport.actor.cpp:1252)
==10473== by 0x877C05: (anonymous namespace)::NetworkSenderActorState<GetReadVersionReply, (anonymous namespace)::NetworkSenderActor<GetReadVersionReply> >::a_body1cont2(GetReadVersionReply const&, int) [clone .isra.0] (networksender.actor
.h:40)
==10473== by 0x877CC6: a_body1when1 (networksender.actor.g.h:147)
==10473== by 0x877CC6: a_callback_fire (networksender.actor.g.h:161)
==10473== by 0x877CC6: ActorCallback<(anonymous namespace)::NetworkSenderActor<GetReadVersionReply>, 0, GetReadVersionReply>::fire(GetReadVersionReply const&) (flow.h:894)
==10473== by 0xC343A7: send<GetReadVersionReply&> (flow.h:343)
==10473== by 0xC343A7: send<GetReadVersionReply&> (fdbrpc.h:124)
==10473== by 0xC343A7: (anonymous namespace)::ForwardProxyActorState<(anonymous namespace)::ForwardProxyActor>::a_body1loopBody1when2(ReplyPromise<GetReadVersionReply> const&, int) (MasterProxyServer.actor.cpp:1814)
==10473== by 0xC33C10: (anonymous namespace)::ForwardProxyActorState<(anonymous namespace)::ForwardProxyActor>::a_body1loopBody1(int) (MasterProxyServer.actor.g.cpp:8167)
==10473== by 0xC35434: a_body1loopHead1 (MasterProxyServer.actor.g.cpp:8152)
==10473== by 0xC35434: a_body1loopBody1cont2 (MasterProxyServer.actor.g.cpp:8327)
==10473== by 0xC35434: a_body1loopBody1cont1when1 (MasterProxyServer.actor.g.cpp:8333)
==10473== by 0xC35434: a_body1loopBody1cont1when1 (MasterProxyServer.actor.g.cpp:8331)
==10473== by 0xC35434: a_callback_fire (MasterProxyServer.actor.g.cpp:8347)
==10473== by 0xC35434: ActorCallback<(anonymous namespace)::ForwardProxyActor, 3, Void>::fire(Void const&) (flow.h:894)
==10473== by 0x7E7BE7: SAV<Void>::finishSendAndDelPromiseRef() (flow.h:375)
==10473== by 0x8319FD: a_body1when1 (genericactors.actor.g.h:10892)
==10473== by 0x8319FD: a_callback_fire (genericactors.actor.g.h:10920)
==10473== by 0x8319FD: ActorCallback<(anonymous namespace)::ChooseActorActor<Void>, 0, Void>::fire(Void const&) (flow.h:894)
==10473== by 0x891917: void SAV<Void>::send<Void>(Void&&) (flow.h:343)
==10473== by 0x1C47ADC: send<Void> (flow.h:674)
==10473== by 0x1C47ADC: execTask (sim2.actor.cpp:1632)
==10473== by 0x1C47ADC: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1cont1(Void const&, int) (sim2.actor.cpp:975)
==10473== by 0x1C47FF2: a_body1loopBody1when1 (sim2.actor.g.cpp:5092)
==10473== by 0x1C47FF2: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1(int) (sim2.actor.g.cpp:5037)
==10473== by 0x1C47A6C: a_body1loopHead1 (sim2.actor.g.cpp:5020)
==10473== by 0x1C47A6C: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1cont1(Void const&, int) (sim2.actor.g.cpp:5086)
==10473== Address 0x12db1ba1 is 2,977 bytes inside a recently re-allocated block of size 4,096 alloc'd
==10473== at 0x1CC5D7F: FastAllocator<4096>::allocate() (FastAlloc.cpp:290)
==10473== by 0x1CFAA68: operator new (FastAlloc.h:193)
==10473== by 0x1CFAA68: PacketWriter::nextBuffer() (Net2Packet.cpp:59)
==10473== by 0x1CFABD6: PacketWriter::writeAhead(int, SplitBuffer*) (Net2Packet.cpp:81)
==10473== by 0x1BA97EB: sendPacket(TransportData*, ISerializeSource const&, Endpoint const&, bool, bool) (FlowTransport.actor.cpp:1199)
==10473== by 0x7DEAD1: a_body1cont2 (networksender.actor.h:40)
==10473== by 0x7DEAD1: a_body1when1 (networksender.actor.g.h:147)
==10473== by 0x7DEAD1: a_callback_fire (networksender.actor.g.h:161)
==10473== by 0x7DEAD1: ActorCallback<(anonymous namespace)::NetworkSenderActor<GetValueReply>, 0, GetValueReply>::fire(GetValueReply const&) (flow.h:894)
==10473== by 0xF22767: send<GetValueReply&> (flow.h:343)
==10473== by 0xF22767: send<GetValueReply&> (fdbrpc.h:124)
==10473== by 0xF22767: (anonymous namespace)::GetValueQActorState<(anonymous namespace)::GetValueQActor>::a_body1cont5(int) [clone .isra.0] (storageserver.actor.cpp:890)
==10473== by 0xF2305C: (anonymous namespace)::GetValueQActorState<(anonymous namespace)::GetValueQActor>::a_body1cont3(int) [clone .isra.0] (storageserver.actor.g.cpp:1592)
==10473== by 0xF23447: a_body1cont2when1 (storageserver.actor.g.cpp:1627)
==10473== by 0xF23447: (anonymous namespace)::GetValueQActorState<(anonymous namespace)::GetValueQActor>::a_body1cont2(Void const&, int) [clone .isra.0] (storageserver.actor.g.cpp:1512)
==10473== by 0xF23507: a_body1when1 (storageserver.actor.g.cpp:1523)
==10473== by 0xF23507: a_callback_fire (storageserver.actor.g.cpp:1537)
==10473== by 0xF23507: ActorCallback<(anonymous namespace)::GetValueQActor, 0, Void>::fire(Void const&) (flow.h:894)
==10473== by 0x891917: void SAV<Void>::send<Void>(Void&&) (flow.h:343)
==10473== by 0x1C47ADC: send<Void> (flow.h:674)
==10473== by 0x1C47ADC: execTask (sim2.actor.cpp:1632)
==10473== by 0x1C47ADC: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1cont1(Void const&, int) (sim2.actor.cpp:975)
==10473== by 0x1C47FF2: a_body1loopBody1when1 (sim2.actor.g.cpp:5092)
==10473== by 0x1C47FF2: Sim2::RunLoopActorState<Sim2::RunLoopActor>::a_body1loopBody1(int) (sim2.actor.g.cpp:5037)
==10473== Uninitialised value was created by a stack allocation
==10473== at 0xC342D0: (anonymous namespace)::ForwardProxyActorState<(anonymous namespace)::ForwardProxyActor>::a_body1loopBody1when2(ReplyPromise<GetReadVersionReply> const&, int) (MasterProxyServer.actor.g.cpp:8213)
2019-07-22 13:54:52 -07:00
Evan Tschannen
c348b3da51
After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies
2019-07-08 12:53:40 -07:00
Alex Miller
7a500cd37f
A giant translation of TaskFooPriority -> TaskPriority::Foo
...
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
sramamoorthy
2a68b28590
rebase related changes
2019-05-28 22:07:46 -07:00
sramamoorthy
61e93a9304
Address review comments and minor fixes
2019-05-28 22:07:46 -07:00
sramamoorthy
72dd067173
Trace message changes and fix few FIXMEs
2019-05-28 22:07:46 -07:00
sramamoorthy
69edefe68b
Snapshot based backup and resotre implementation
2019-05-28 22:07:46 -07:00
Jingyu Zhou
b8e7fc1b84
Refactor: add std:: qualifier and use emplace_back
2019-05-17 09:38:50 -10:00
mpilman
96aaa31a6c
Compiling on clang again
2019-05-13 14:15:23 -07:00
mpilman
69fa3d3903
fixed compilation issues after rebase
2019-05-13 14:15:23 -07:00
mpilman
6afce01744
Implementation complete (not yet working)
2019-05-13 14:15:22 -07:00
Evan Tschannen
4fa1c008f9
Highly prioritize storageServerRejoin messages on the proxy, so that storage servers can rejoin the cluster even when a proxy is CPU saturated
2019-04-23 20:56:01 -07:00
Evan Tschannen
b6008558d3
renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
...
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen
f9aad46573
made use_provisional_proxies a transaction option
2019-03-19 18:44:37 -07:00
Evan Tschannen
5b9c45ea0b
clients do not attempt to connect to provisional proxies
2019-03-19 13:37:50 -07:00
A.J. Beamon
85b3f11e71
Fix various compiler warnings
2019-03-15 10:34:57 -07:00
Evan Tschannen
988add9fb5
cache the metadataVersion for commits, so that doing setVersion with the commit version of a different transaction will
2019-03-04 16:48:34 -08:00
Evan Tschannen
075fdef31a
Merge branch 'master' into feature-metadata-version
...
# Conflicts:
# fdbclient/DatabaseContext.h
2019-03-03 22:58:45 -08:00
Evan Tschannen
9d53305516
Make GetHealthMetricsRequest constructor explicit
...
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:30:10 -08:00
Evan Tschannen
de577a6ca9
Modified GetHealthMetricsReply constructor
...
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:29:16 -08:00
Evan Tschannen
3a91fb287b
Use const HealthMetrics& parameter in update
...
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:17:20 -08:00
Evan Tschannen
3da85f3acd
implemented the \xff/metadataVersion key, which can be used by layers to help them cheaply cache metadata and know when their cache is invalid
2019-02-28 17:45:00 -08:00
Trevor Clinkenbeard
ff9a7cb2f1
Combined proxy health metrics replies into single message type
2019-02-23 10:13:43 -08:00
Trevor Clinkenbeard
fa96b8dd33
Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics
2019-02-20 16:56:16 -08:00
Vishesh Yadav
e05b53d755
Merge remote-tracking branch 'apple/master' into task/tls-upgrade
2019-02-15 20:37:07 -08:00
A.J. Beamon
d4349293b9
Reworked the way latency counters are tracked. Report the latency bands in separate events from StorageMetrics and ProxyMetrics. Fix a problem when the latency band configuration was changed. Add correctness testing.
2019-02-07 13:39:22 -08:00
Trevor Clinkenbeard
eaaa93a939
Use IncludeVersion() for GetDetailedHealthMetricsReply pre-serialization
2019-02-01 11:45:47 -08:00
Trevor Clinkenbeard
4daf49ff4d
Proxy runs healthMetricsRequestServer to handle incoming health metrics requests
2019-02-01 10:58:42 -08:00
Trevor Clinkenbeard
03e5e3ccbc
Proxies periodically request health metrics from Ratekeeper in the getRate function. Occassionally (determined by DETAILED_METRIC_UPDATE_RATE), requests are for detailed per-process metrics.
2019-01-31 13:25:57 -08:00
Evan Tschannen
1d7fec3074
Merge commit '048bfc5c368063d9e009513078dab88be0cbd5b0' into task/tls-upgrade-2
...
# Conflicts:
# .gitignore
2019-01-24 17:43:06 -08:00
A.J. Beamon
2198d24ce1
Merge commit '3b2700d25334c53d13496ca16682642aac951beb' into track-server-request-latencies
...
# Conflicts:
# fdbclient/MasterProxyInterface.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/ServerDBInfo.h
# fdbserver/Status.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/storageserver.actor.cpp
2019-01-24 11:43:26 -08:00
anoyes
6a4d87802b
Replace & operator with variadic function
2018-12-28 11:33:42 -08:00
Vishesh Yadav
3eb9b23024
Listen to multiple addresses and start using vector<NetworkAdddress> in Endpoint
...
- This patch will make FDB listen to multiple addresses given via
command line. Although, we'll still use first address in most places,
this patch starts using vector<NetworkAddress> in Endpoint at some basic
places.
- When sending packets to an endpoint, pick a random network address in
endpoints
- Renames Endpoint::address to Endpoint::addresses since it
now holds a vector of addresses.
2018-12-13 13:36:52 -08:00
Vishesh Yadav
43e5a46f9b
Change Endpoint::address(NetworkAddress) to vector<NetworkAddress>
...
Extend `Endpoint` class to take multiple NetworkAddresses instead of
just one. Hence, to talk to an endpoint instead of one IP:PORT, we'll
have multiple IP:PORT pairs.
This patch simply adds the field and makes changes to compile the
codebase. The first element of of `address` field is used everywhere.
Hence the way we talk to remains same with this patch.
NOTE:
Directly accessing the first memeber of Endpoint::address is unsafe
as Endpoint() doesn't enforces non-empty address list. However, since
the correctness test pass for now and are anyway replacing all those
unsafe accesses with ones considering the whole vector, this patch
ignores to access them in safe way.
2018-12-13 13:36:52 -08:00
A.J. Beamon
eb2f27b8e5
Work in progress implementation of server-side latency tracking. The intent of this is to be able to measure the number of requests that achieve certain latency targets across the system relative to the total number of requests.
2018-11-30 10:46:04 -08:00
Evan Tschannen
4b5d0b4e2c
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/AsyncFileBlobStore.actor.cpp
# fdbclient/AsyncFileBlobStore.actor.h
# fdbclient/BlobStore.actor.cpp
# fdbclient/BlobStore.h
# fdbclient/HTTP.actor.cpp
# fdbclient/ManagementAPI.actor.cpp
# fdbclient/NativeAPI.actor.cpp
# fdbrpc/LoadBalance.actor.h
# fdbrpc/batcher.actor.h
# fdbrpc/fdbrpc.vcxproj
# fdbrpc/sim2.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
Evan Tschannen
bf6545a9cf
clients cache storage server interfaces individually, instead of as a team. This is needed because in fearless every shard has storage servers from two separate teams, leading to a lot of possible combinations
...
allAlternatives failed logic was simplified, because we are already doing a global rate limiting, so a per shard limit is unnecessary
reduced unnecessary state variables in waitMetrics requests
2018-11-02 13:15:09 -07:00
Evan Tschannen
18509ac4dd
account for the overhead of tags on a message when batching transactions into a version
...
increased the default location cache size
2018-11-02 12:52:34 -07:00
Robert Escriva
268093a96d
Adjust all includes to be relative to the root.
...
Remove the use of relative paths. A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h". Adjust so that every include references such a header with the
latter form.
Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Evan Tschannen
65057b4788
Merge branch 'release-5.2' into release-6.0
...
# Conflicts:
# documentation/sphinx/source/downloads.rst
# documentation/sphinx/source/release-notes.rst
# fdbclient/MasterProxyInterface.h
# packaging/msi/FDBInstaller.wxs
2018-08-02 11:29:40 -07:00
Evan Tschannen
a361a785e8
fix: clients which cannot talk to storage servers poll the proxy for new storage server interfaces. If there are too many clients polling, we saturate the proxies with these requests, prevents the storage servers from updating their interfaces.
2018-07-31 16:57:23 -07:00