fixup! Update documentation per comments
This commit is contained in:
parent
6e7716d078
commit
bb05c83e19
|
@ -1,8 +1,16 @@
|
|||
# Load Balancing in FoundationDB
|
||||
|
||||
In FoundationDB, often multiple *interface*s are available for the same type of *request*s. A load balancer can be used to distribute the requests to those interfaces, while awaring the possible failures.
|
||||
## Introduction
|
||||
|
||||
Two load balancer are provided: `basicLoadBalance` and `loadBalance`, both defined in `LoadBalance.actor.h`. The `basicLoadBalance` is a simple
|
||||
FoundationDB is a distributed key-value database. A FoundationDB cluster is constituted by one or more processes over one or more physical machines, where each process is a *worker* and takes certain *role*s, such as coordinator, proxy, TLog, storage server, etc., in the system.
|
||||
|
||||
The interpocess communications (IPC) between the processes are supported by the [`flow`](https://github.com/apple/foundationdb/tree/main/flow) infrastructure. In the `flow` context, each process will expose one or more *interface*(s). Each interface is able to accept given type of *request*s, and *reply* `Void`, requested data or error. The interfaces and the corresponding request/reply pairs forms the IPC protocol of FoundationDB.
|
||||
|
||||
In many cases, the same request can be proceed by multiple processes, e.g. all commit proxies can accept commit requests, and multiple storage server processes can provide values for a given key in double/triple redundancy mode. A load balancer (LB) can be used to distribute the requests over the possible interfaces, preventing one or a few processes getting overloaded. The interface candidates are also referred as *alternative*s. The LB is also able to react when one or more interfaces are (temporarily) unavailable by retrying, or re-routing the request to other candidates. The interface candidates are also known as *alternative*s.
|
||||
|
||||
Two LBs are provided in FoundationDB: `basicLoadBalance` and `loadBalance`, both defined in [`LoadBalance.actor.h`](https://github.com/apple/foundationdb/blob/main/fdbrpc/include/fdbrpc/LoadBalance.actor.h). The `basicLoadBalance` is a simple load balancer which each interface is equally chosen; while the `loadBalance` accepts a model object, which provides [datacenter](https://apple.github.io/foundationdb/configuration.html#configuring-regions) (DC) awaring balancing algorithms, allowing requests being sent to interfaces in the same DC.
|
||||
|
||||
In the following sections, the two LBs will be discussed in details.
|
||||
|
||||
## `basicLoadBalance`
|
||||
|
||||
|
@ -12,12 +20,12 @@ Two load balancer are provided: `basicLoadBalance` and `loadBalance`, both defin
|
|||
* GetReadVersion proxy interface
|
||||
* ConfigFollower interface
|
||||
|
||||
The interface is assumed to be always *fresh*, i.e. the list of the servers is fixed.
|
||||
Here, the interfaces are assumed to be always *fresh*, i.e. the list of the servers is fixed.
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
H0{Has alternatives?}
|
||||
H1[Pick up an alternative]
|
||||
H1[Pick an alternative]
|
||||
H2[Backoff]
|
||||
H3[Request]
|
||||
H4([Reply])
|
||||
|
@ -26,24 +34,24 @@ graph LR
|
|||
H((Start)) --> H0
|
||||
H0 --No--> H6
|
||||
H0 --Yes--> H1
|
||||
H1 --No healthy alternatives--> H2 --> H1
|
||||
H1 --No healthy alternatives--> H2 --Retry--> H1
|
||||
H1 --Has alternative--> H3 --Success--> H4
|
||||
H3 --Exception--> H5
|
||||
H3 --Broken Promise --> H2
|
||||
```
|
||||
|
||||
### Alternative pick up algorithm
|
||||
### Alternative pick algorithm
|
||||
|
||||
In `basicLoadBalance`, a *best* alternative is picked up and used at the beginning. At this stage, this alternative is randomly picked up among all alternatives. If the best alternative does not work, it will iteratively try other interfaces, see [here](#picking-up-an-alternative-in-basic-load-balancing-algorithm).
|
||||
In `basicLoadBalance`, a *best* alternative is picked and used at the beginning. At this stage, this alternative is randomly picked among all alternatives. If the best alternative does not work, it will iteratively try other interfaces, see [here](#picking-up-an-alternative-in-basic-load-balancing-algorithm).
|
||||
|
||||
## `loadBalance`
|
||||
|
||||
`loadBalance` provides a more sophisticated implementation of load balancing. In addition to the basic load balancing, it also provides a variety of features, such as
|
||||
`loadBalance` provides a more sophisticated implementation of load balancing. In addition of the basic load balancing, it also provides a variety of features:
|
||||
|
||||
* Support for Test Storage Server ([TSS](https://github.com/apple/foundationdb/blob/main/documentation/sphinx/source/tss.rst))
|
||||
* Distance-based candidate election
|
||||
* Able to handle timeouts and exceptions with retries
|
||||
* etc.
|
||||
* Datacenter awaring alternative election
|
||||
* Recording the latency and penalty from interfaces, and [prioritize the interfaces based on previously stored data](#with-queuemodel).
|
||||
* Able to handle timeouts and SS exceptions with retries.
|
||||
|
||||
Currently it is used for
|
||||
|
||||
|
@ -58,7 +66,7 @@ graph LR
|
|||
H0{Has alternatives?}
|
||||
H1[Choose initial candidates]
|
||||
H4([Never])
|
||||
H5[Pick up an alternative]
|
||||
H5[pick an alternative]
|
||||
H6[Send request]
|
||||
H7[Wait for available alternative]
|
||||
H8([Response])
|
||||
|
@ -77,34 +85,32 @@ graph LR
|
|||
|
||||
Note:
|
||||
|
||||
* Response could be an exception, e.g. `process_behind` or `request_maybe_delivered`, and will be delivered as `Error` to the caller.
|
||||
* The response could be either a reply, or an `Error`, e.g. `process_behind` or `request_maybe_delivered`.
|
||||
|
||||
### Choose initial candidates
|
||||
|
||||
Two initial candidates will be picked up before the requests start. They will be selected as the first two alternatives for the load balancer. If both of them failed, other alternatives are used in a round-robin way.
|
||||
Two initial candidates will be picked before the requests start. They will be selected as the first two alternatives for the load balancer. If both of them failed, other alternatives are used in a round-robin way.
|
||||
|
||||
#### No `QueueModel`
|
||||
|
||||
If no `QueueModel` is provided, the initial candidates are picked up randomly. The first candidate, or the *best* alternative, will always be one of local workers.
|
||||
If no `QueueModel` is provided, the initial candidates are picked randomly. The first candidate, or the *best* alternative, will be the one that in the same DC, if possible.
|
||||
|
||||
#### With `QueueModel`
|
||||
|
||||
`QueueModel` holds information about each candidate related to future version, latency and penalty.
|
||||
|
||||
* If the storage server is returning a future version error, it is marked as not available until some certain time.
|
||||
* Penalty is reported by storage server in each response (see `storageserver.actor.cpp:StorageServer::getPenalty`). It is determined by the write queue length and the version lagging.
|
||||
* Penalty is reported by storage server in each response (see `storageserver.actor.cpp:StorageServer::getPenalty`). It is determined by the write queue length and the durability lagging.
|
||||
|
||||
If `QueueModel` exists, the candidates will be picked base on the penalty. Workers with high penalties will be avoided when picking up the first two candidates.
|
||||
If `QueueModel` exists, the candidates will be picked base on the penalty. Workers with high penalties will be avoided when picking the first two candidates.
|
||||
|
||||
### Pick up an alternative
|
||||
### Pick an alternative
|
||||
|
||||
As mentioned above, the alternatives are chosen in the round-robin way when the first two candidates failed.
|
||||
|
||||
If all alternatives failed, a flag is set, so if the next request fails with `process_behind`, the caller will receive the `process_behind` error.
|
||||
The alternatives are chosen in the round-robin way when the first two candidates failed. If all alternatives failed, a flag is set, and if the next request fails with `process_behind`, the caller will receive the `process_behind` error.
|
||||
|
||||
### Send requests to workers
|
||||
|
||||
Here it is assumed that there is at least one alternative available.
|
||||
Here it is assumed that there are at least one alternative available. If no alternative is available, the LB will wait.
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
|
@ -126,7 +132,7 @@ graph LR
|
|||
H4 --Additional request failed--> H3
|
||||
```
|
||||
|
||||
If the first request failed, it is reset and the next request will be considered as the first request. Certain types of errors can also be returned as response, e.g. `request_may_be_delivered` or `process_behind`, which may not trigger a load-balancer retry.
|
||||
The first request has a timeout option. If the LB is not able to retrieve the response within the timout, more requests will be sent to secondary and other available interfaces. If the first request failed, it is reset and the next request will be considered as the first request. Certain types of errors can also be returned as response, e.g. `request_may_be_delivered` or `process_behind`, which may not trigger a load-balancer retry.
|
||||
|
||||
### Wait for available alternative
|
||||
|
||||
|
@ -151,7 +157,7 @@ graph LR
|
|||
H2 --Failed-->H4
|
||||
```
|
||||
|
||||
Note that "Wait for alternatives" will only timeout if the alternatives are not always fresh, i.e. this only happens when accessing storage servers.
|
||||
Note that "Wait for alternatives" will only timeout if the alternatives are always not fresh, i.e. this only happens when accessing storage servers. LB will throw `all_alternatives_failed` when timeout in this case.
|
||||
|
||||
#### Requests
|
||||
|
||||
|
@ -163,9 +169,9 @@ Original requests in `loadBalancer` are wrapped by `LoadBalance.actor.h:RequestD
|
|||
|
||||
## Appendix
|
||||
|
||||
### Picking up an alternative in load balancing algorithm
|
||||
### Picking an alternative in basic load balancing algorithm
|
||||
|
||||
The following script simulates the alternative picking up algorithm. The chosen alternatives will be printed out one-by-one.
|
||||
The following script simulates the alternative picking up algorithm. The chosen alternatives will be printed out one-by-one. The `loadBalance` function uses a similar approach, though the interfaces in the same DC are used firstly.
|
||||
|
||||
```python
|
||||
#! /usr/bin/env python3
|
||||
|
|
Loading…
Reference in New Issue