Backup workers on old epochs notify the master once their work is done. The
master then removes them from the log system and acknowledges back to the
backup workers so that they can gracefully shut down.
The popping of a backup worker is stalled if there are workers from older
epochs still working; otherwise, workers from old epochs would lose data.
However, allowing a newer epoch to start backup can cause holes in the version
ranges. The restore process must verify the backup progress to make sure there
are no holes; otherwise it has to wait.
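As a rough illustration of the hole check (a minimal sketch with hypothetical names, not the actual FDB restore code), the saved progress can be treated as a set of [begin, end) version ranges that must cover the full range being restored:

    #include <algorithm>
    #include <cstdint>
    #include <utility>
    #include <vector>

    // Hypothetical check: restore may only proceed if the recorded
    // [begin, end) version ranges cover [beginVersion, endVersion) without gaps.
    bool coversWithoutHoles(std::vector<std::pair<int64_t, int64_t>> ranges,
                            int64_t beginVersion, int64_t endVersion) {
        std::sort(ranges.begin(), ranges.end());
        int64_t covered = beginVersion;
        for (const auto& [b, e] : ranges) {
            if (b > covered) return false;        // a hole before this range
            covered = std::max(covered, e);
            if (covered >= endVersion) return true;
        }
        return covered >= endVersion;
    }

If a hole is found, restore waits and re-checks the progress later.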
With sharded txs tags, the master now receives data from the transaction
logs at an order of magnitude higher rate. This is the intended result of
sharding the txs tag. With a sufficient number of TLogs, the master will
saturate its CPU time handling the peek responses.
Performance tests showed unstable recovery times, which were eventually
root-caused to a priority inversion between TLogRejoin requests and TLog
peek replies.
Once peek replies saturated the CPU, the master would ignore further
TLogRejoin messages. TLogRejoin is what marks a TLog as available to the
failure monitor, which in turn decides between a ServerPeekCursor and a
MergePeekCursor for a SetPeekCursor. Ignoring TLogRejoins meant that the
sharded txs locality tags for those servers would be merge-peeked over all
TLogs, which is much less efficient than peeking one copy of the data from
the one preferred server.
Depending on the race between TLog peek replies saturating the CPU and
TLogRejoin requests being submitted, a variable number of tags would be
affected, and thus the performance test results would vary.
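The cursor choice described above can be modeled roughly as follows (a simplified sketch with stand-in types, not the actual SetPeekCursor logic): if the failure monitor knows about the preferred TLog, peek just that one server; otherwise fall back to merge-peeking across all TLogs.

    #include <memory>
    #include <vector>

    // Stand-in types for illustration; not the real FDB classes.
    struct TLog { bool knownToFailureMonitor; };
    struct PeekCursor { virtual ~PeekCursor() = default; };
    struct ServerPeekCursor : PeekCursor { explicit ServerPeekCursor(const TLog&) {} };
    struct MergePeekCursor : PeekCursor { explicit MergePeekCursor(const std::vector<TLog>&) {} };

    // If the preferred server's TLogRejoin was processed, peek one copy from
    // it; otherwise merge-peek the tag across every TLog, which is far more
    // expensive.
    std::unique_ptr<PeekCursor> makeCursor(const TLog& preferred, const std::vector<TLog>& all) {
        if (preferred.knownToFailureMonitor)
            return std::make_unique<ServerPeekCursor>(preferred);
        return std::make_unique<MergePeekCursor>(all);
    }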
Let the cluster controller start a new data distributor role by sending a
message to a chosen worker.
Change MasterInterface usage in DataDistribution to masterId
Add DataDistributor rejoin handling.
This allows the data distributor to tell the new cluster controller of its
existence so that the controller doesn't spawn a new one. I.e., there should
be only ONE data distributor in the cluster.
If the DataDistributor (DD) doesn't rejoin for a while, the ClusterController
(CC) tries to recruit a new one. The CC also monitors the DD and restarts it
if it fails. The Proxy also monitors the DD; if the DD fails, the Proxy asks
the CC for the new DD.
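A rough sketch of the recruit/monitor loop on the cluster controller side (hypothetical names and plain C++ stubs standing in for the actual Flow actors):

    #include <chrono>
    #include <optional>

    // Hypothetical stand-in for the cluster controller's view of the data
    // distributor; the real code uses Flow actors and different names.
    struct DataDistributorInterface { int workerId = 0; };

    // Placeholder hooks: in the real system these wait on network events.
    std::optional<DataDistributorInterface> waitForRejoin(std::chrono::seconds) { return std::nullopt; }
    DataDistributorInterface recruitDataDistributor() { return DataDistributorInterface{ 1 }; }
    void waitForFailure(const DataDistributorInterface&) {}  // returns once the DD is reported failed

    // Give an existing distributor a chance to rejoin; if none shows up,
    // recruit one on a chosen worker; whenever the current one fails, recruit
    // a replacement. At most one distributor exists at a time.
    void monitorDataDistributor() {
        std::optional<DataDistributorInterface> dd = waitForRejoin(std::chrono::seconds(5));
        while (true) {
            if (!dd) dd = recruitDataDistributor();
            waitForFailure(*dd);  // wait until the distributor dies
            dd.reset();           // then recruit a new one on the next iteration
        }
    }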
Add a GetRecoveryInfo RPC to the master server, which is called by the data
distributor to obtain the recovery transaction version from the master server.
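For illustration, the request/reply could look roughly like this (hypothetical shapes and names; the actual FDB structs and the Flow RPC machinery differ):

    #include <cstdint>
    #include <future>

    struct GetRecoveryInfoReply {
        int64_t recoveryTransactionVersion = 0;
    };

    struct GetRecoveryInfoRequest {
        std::promise<GetRecoveryInfoReply> reply;  // fulfilled by the master
    };

    // Master side: answer with the version chosen during recovery.
    void handleGetRecoveryInfo(GetRecoveryInfoRequest& req, int64_t recoveryTxnVersion) {
        req.reply.set_value(GetRecoveryInfoReply{ recoveryTxnVersion });
    }

    // Data distributor side: send the request and wait for the reply
    // (the direct call below stands in for the RPC round trip).
    int64_t fetchRecoveryVersion(int64_t masterVersion) {
        GetRecoveryInfoRequest req;
        std::future<GetRecoveryInfoReply> f = req.reply.get_future();
        handleGetRecoveryInfo(req, masterVersion);
        return f.get().recoveryTransactionVersion;
    }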
- This patch makes FDB listen to multiple addresses given via the
  command line. Although we'll still use the first address in most places,
  this patch starts using vector<NetworkAddress> in Endpoint in some basic
  places.
- When sending packets to an endpoint, pick a random network address from
  the endpoint's address list (see the sketch after this list).
- Renames Endpoint::address to Endpoint::addresses since it now holds a
  vector of addresses.
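A minimal sketch of the random pick when sending (simplified stand-in types; the real NetworkAddress and transport code differ):

    #include <cstdlib>
    #include <string>
    #include <vector>

    struct NetworkAddress { std::string ip; int port; };
    struct Endpoint { std::vector<NetworkAddress> addresses; };

    // Pick one of the endpoint's addresses at random when sending a packet.
    // Assumes the endpoint has at least one address.
    const NetworkAddress& pickAddress(const Endpoint& e) {
        return e.addresses[std::rand() % e.addresses.size()];
    }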
Extend the `Endpoint` class to hold multiple NetworkAddresses instead of
just one. Hence, to talk to an endpoint we'll have multiple IP:PORT pairs
instead of a single IP:PORT.
This patch simply adds the field and makes the changes needed to compile the
codebase. The first element of the `address` field is used everywhere, so
the way we talk to an endpoint remains the same with this patch.
NOTE:
Directly accessing the first member of Endpoint::address is unsafe, as
Endpoint() doesn't enforce a non-empty address list. However, since the
correctness tests pass for now and we are anyway going to replace all those
unsafe accesses with ones that consider the whole vector, this patch does
not guard those accesses.
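The unsafe access mentioned in the NOTE could eventually be guarded along these lines (hypothetical accessor on the simplified Endpoint shape from the earlier sketch):

    #include <stdexcept>
    #include <string>
    #include <vector>

    struct NetworkAddress { std::string ip; int port; };
    struct Endpoint {
        std::vector<NetworkAddress> addresses;

        // addresses.front() is undefined behaviour if the list is empty, since
        // Endpoint() does not enforce a non-empty address list; a guarded
        // accessor like this (hypothetical) one makes that failure explicit.
        const NetworkAddress& getFirstAddress() const {
            if (addresses.empty()) throw std::runtime_error("Endpoint has no addresses");
            return addresses.front();
        }
    };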
Remove the use of relative paths. A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h". Adjust so that every include references such a header with the
latter form.
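For example:

    // Before (only resolved from files under foo/):
    #include "bar.h"

    // After (the same header, included identically from anywhere in the tree):
    #include "foo/bar.h"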
Signed-off-by: Robert Escriva <rescriva@dropbox.com>