foundationdb/fdbrpc/FailureMonitor.h

/*
 * FailureMonitor.h
 *
 * This source file is part of the FoundationDB open source project
 *
 * Copyright 2013-2018 Apple Inc. and the FoundationDB project authors
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#ifndef FLOW_FAILUREMONITOR_H
#define FLOW_FAILUREMONITOR_H
#pragma once

#include "flow/flow.h"
#include "flow/IndexedSet.h"
#include "fdbrpc/FlowTransport.h" // Endpoint
#include <unordered_map>

using std::vector;

/*

IFailureMonitor is used by load balancing, data distribution and other components
to report on which other machines are unresponsive or experiencing other failures.
This is vital both to reconfigure the system in response to failures and to prevent
actors from waiting forever for replies from remote machines that are no longer
available.  When waiting for a reply, clients should generally stop waiting and
try an alternative server when a failure is reported, rather than relying on timeouts.

The information tracked for each machine is a FailureStatus, which
for the moment is just a boolean but might be richer in the future.

Get an IFailureMonitor by calling IFailureMonitor::failureMonitor(), which looks it up via
g_network->global(INetwork::enFailureMonitor); the simulator keeps one for each simulated
machine and ASIONetwork keeps one for each process.

The system attempts to ensure that failures are reported quickly, but may occasionally
report a working system as failed temporarily.  Clients that intend to take very costly
actions as a result of a failure should probably wait a while to see if a machine becomes
unfailed first.  If possible use onFailedFor() which in the future may react to 'permanent'
failures immediately.
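
As a rough illustration of the onFailedFor() threshold documented further down
(sustainedFailureDuration + elapsedTime*sustainedFailureSlope), the arithmetic can be
sketched as follows. The helper names are hypothetical and not part of this header, and
reading elapsedTime as the time since the caller began waiting is an assumption.

```cpp
#include <cassert>

// Hypothetical helper (not part of FailureMonitor.h): how many seconds an endpoint
// must stay continuously failed before a waiter would be released.
// Assumption: elapsedTime is measured from when the caller started waiting.
inline double requiredFailedSeconds(double sustainedFailureDuration,
                                    double sustainedFailureSlope,
                                    double elapsedTime) {
    return sustainedFailureDuration + elapsedTime * sustainedFailureSlope;
}

// Hypothetical predicate: true once the endpoint has been failed for at least
// the required time.
inline bool sustainedFailure(double failedFor, double duration, double slope, double elapsed) {
    return failedFor >= requiredFailedSeconds(duration, slope, elapsed);
}
```

With the default slope of 0.0 the threshold is simply sustainedFailureDuration; a positive
slope makes the caller more patient the longer it has already been waiting.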

In older FDB, information reported through this interface was actually actively supplied by
failureMonitorClient, which exchanges FailureMonitoringRequest/Reply pairs with the
failureDetectionServer actor on the ClusterController.

Now it is done locally by each process with the help of FlowTransport. Whenever a network
connection is established or fails, the address is marked as available or failed accordingly. We
do however take an optimistic approach of assuming every newly discovered address
(when deserializing an endpoint) is healthy by default.

In the future it may be augmented with locally available information about failures (e.g.
TCP connection loss in ASIONetwork or unexpectedly long response times for application requests).

Communications failures are tracked at NetworkAddress granularity.  When a request is made to
a missing endpoint on a non-failed machine, this information is reported back to the requesting
machine and tracked at the endpoint level.
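
The two levels of tracking can be pictured with a small standalone model. This is only a
sketch: Address, EndpointKey and ToyFailureMonitor are hypothetical stand-ins rather than
the real NetworkAddress, Endpoint and SimpleFailureMonitor, and treating an unknown address
as failed is an assumption that mirrors the FailureStatus() default.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-ins for NetworkAddress and Endpoint.
using Address = std::string;
struct EndpointKey {
    Address address;
    int token;
    bool operator==(const EndpointKey& r) const { return address == r.address && token == r.token; }
};
struct EndpointKeyHash {
    std::size_t operator()(const EndpointKey& e) const {
        return std::hash<std::string>()(e.address) ^ (std::hash<int>()(e.token) << 1);
    }
};

class ToyFailureMonitor {
public:
    // Address level: updated when a connection is established or fails.
    void setAddressFailed(const Address& a, bool failed) { addressFailed[a] = failed; }
    // Endpoint level: the machine answered, but the endpoint is missing.
    void endpointNotFound(const EndpointKey& e) { endpointKnownFailed.insert(e); }

    bool isAddressFailed(const Address& a) const {
        auto it = addressFailed.find(a);
        // Assumption: an address with no recorded status counts as failed.
        return it == addressFailed.end() || it->second;
    }
    // An endpoint is failed if it is known to be missing or its address is failed.
    bool isFailed(const EndpointKey& e) const {
        return endpointKnownFailed.count(e) > 0 || isAddressFailed(e.address);
    }
    // True if the endpoint is failed while its address is still reachable.
    bool onlyEndpointFailed(const EndpointKey& e) const {
        return endpointKnownFailed.count(e) > 0 && !isAddressFailed(e.address);
    }

private:
    std::unordered_map<Address, bool> addressFailed;
    std::unordered_set<EndpointKey, EndpointKeyHash> endpointKnownFailed;
};
```

In this model, marking an address available makes all of its endpoints available until
endpointNotFound() is reported for a specific one.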

*/

struct FailureStatus {
	bool failed;

	FailureStatus() : failed(true) {}
	explicit FailureStatus(bool failed) : failed(failed) {}
	bool isFailed() const { return failed; }
	bool isAvailable() const { return !failed; }

	bool operator == (FailureStatus const& r) const { return failed == r.failed; }
	bool operator != (FailureStatus const& r) const { return failed != r.failed; }
	template <class Ar>
	void serialize(Ar& ar) {
		serializer(ar, failed);
	}
};

class IFailureMonitor {
public:
	// Returns the currently known status for the endpoint
	virtual FailureStatus getState( Endpoint const& endpoint ) = 0;

	// Returns the currently known status for the address
	virtual FailureStatus getState( NetworkAddress const& address ) = 0;

	// Only use this function when the endpoint is known to be failed
	virtual void endpointNotFound( Endpoint const& ) = 0;

	// Returns a future that becomes ready the next time the known status for the endpoint changes.
	virtual Future<Void> onStateChanged( Endpoint const& endpoint ) = 0;

	// Returns when onFailed(endpoint) || transport().onDisconnect( endpoint.getPrimaryAddress() ), but more efficiently
	virtual Future<Void> onDisconnectOrFailure( Endpoint const& endpoint ) = 0;

	// Returns true if the endpoint is failed but the address of the endpoint is not failed.
	virtual bool onlyEndpointFailed( Endpoint const& endpoint ) = 0;

	// Returns true if the endpoint will never become available.
	virtual bool permanentlyFailed( Endpoint const& endpoint ) = 0;

	// Called by FlowTransport when a connection closes and a prior request or reply might be lost
	virtual void notifyDisconnect( NetworkAddress const& ) = 0;

	// Called to update the failure status of a network address directly when running as a client.
	virtual void setStatus(NetworkAddress const& address, FailureStatus const& status) = 0;

	// Returns when the known status of endpoint is next equal to status.  Returns immediately
	//   if appropriate.
	Future<Void> onStateEqual( Endpoint const& endpoint, FailureStatus status );

	// Returns when the status of the given endpoint is next considered "failed"
	Future<Void> onFailed( Endpoint const& endpoint ) {
		return onStateEqual( endpoint, FailureStatus() );
	}

	// Returns when the status of the given endpoint has continuously been "failed" for sustainedFailureDuration + (elapsedTime*sustainedFailureSlope)
	Future<Void> onFailedFor( Endpoint const& endpoint, double sustainedFailureDuration, double sustainedFailureSlope = 0.0 );

	// Returns the failure monitor that the calling machine should use
	static IFailureMonitor& failureMonitor() {
		return *static_cast<IFailureMonitor*>((void*)g_network->global(INetwork::enFailureMonitor));
	}
};

// SimpleFailureMonitor is the sole implementation of IFailureMonitor.  It has no
//   failure detection logic; it just implements the interface and reacts to setStatus() etc.
// Initially all addresses are considered failed, but all endpoints of a non-failed address are considered OK.

class SimpleFailureMonitor : public IFailureMonitor {
public:
	SimpleFailureMonitor() : endpointKnownFailed() { }
	void setStatus( NetworkAddress const& address, FailureStatus const& status );
	void endpointNotFound( Endpoint const& );
	virtual void notifyDisconnect( NetworkAddress const& );

	virtual Future<Void> onStateChanged( Endpoint const& endpoint );
	virtual FailureStatus getState( Endpoint const& endpoint );
	virtual FailureStatus getState( NetworkAddress const& address );
	virtual Future<Void> onDisconnectOrFailure( Endpoint const& endpoint );
	virtual bool onlyEndpointFailed( Endpoint const& endpoint );
	virtual bool permanentlyFailed( Endpoint const& endpoint );

	void reset();
private:
	std::unordered_map< NetworkAddress, FailureStatus > addressStatus;
	YieldedAsyncMap< Endpoint, bool > endpointKnownFailed;

	friend class OnStateChangedActorActor;
};

#endif