foundationdb

Commit Graph

Author	SHA1	Message	Date
Jingyu Zhou	0259a243ae	Switch DC if log router peek becomes stuck Trying to a different DC if this happens.	2023-03-06 17:41:56 -08:00
Zhe Wu	087d37d10b	Add event for txn server initialization and a warning for TLog slow catching up	2023-01-11 10:02:06 -08:00
sfc-gh-tclinkenbeard	3c97f43138	Change Histogram::Unit::microseconds to milliseconds	2022-11-21 08:03:56 -08:00
sfc-gh-tclinkenbeard	c9968f5c0c	Mark several methods of ILogSystem, IPeekCursor and LogPushData const	2022-11-15 14:57:32 -08:00
Zhe Wu	32bc9b6ebb	Fix a race condition between batched peek and pop, where the server removal pop may be lost	2022-11-04 15:05:37 -07:00
Lukas Joswiak	9d3c3b1efe	Remove cluster ID logic from individual roles The logic to determine the validity of a process joining a cluster now belongs on the worker and the cluster controller. It is no longer restricted to tlogs and storages, but instead applies to all processes (even stateless ones).	2022-10-27 13:56:13 -07:00
Jingyu Zhou	a8391caf23	Revert "Data loss protection v2"	2022-10-20 18:09:58 -05:00
Lukas Joswiak	72bc89cf39	Remove cluster ID logic from individual roles The logic to determine the validity of a process joining a cluster now belongs on the worker and the cluster controller. It is no longer restricted to tlogs and storages, but instead applies to all processes (even stateless ones).	2022-10-18 21:37:42 -07:00
sfc-gh-tclinkenbeard	82adc1e856	Make g_simulator a pointer	2022-09-15 09:00:33 -07:00
Zhe Wu	bd99f4fa3b	Log tlog initialization	2022-07-21 13:54:44 -07:00
Markus Pilman	1de37afd52	Make TEST macros C++ only (#7558) * proof of concept * use code-probe instead of test * code probe working on gcc * code probe implemented * renamed TestProbe to CodeProbe * fixed refactoring typo * support filtered output * print probes at end of simulation * fix missed probes print * fix deduplication * Fix refactoring issues * revert bad refactor * make sure file paths are relative * fix more wrong refactor changes	2022-07-19 13:15:51 -07:00
Jingyu Zhou	b2fded5c51	CC sends recovery txn version during TLog recruitment This simplifies the logic for TLog to wait for recovery txn before replying back to peeks.	2022-05-24 14:57:55 -07:00
Zhe Wu	cb73352e36	Don't pop every generation of old log router	2022-05-24 08:47:57 -07:00
Ray Jenkins	dc9e782ccc	OpenTelemetry Tracing Perf Fixes (#6990)	2022-05-02 14:56:51 -05:00
Evan Tschannen	a825eb8a8c	fix: when more tlogs are absent than the replication factor we would access invalid memory	2022-04-27 16:53:30 -07:00
Ray Jenkins	1c5bf135d5	Revert "Migrate to OpenTelemetry tracing. (#6855)" (#6941) This reverts commit `5df3bac110`.	2022-04-25 09:29:56 -05:00
Ray Jenkins	5df3bac110	Migrate to OpenTelemetry tracing. (#6855)	2022-04-20 09:26:37 -05:00
Jingyu Zhou	cfcf0f152c	Merge branch 'main-4a085fc84' into vv Fix Conflicts: fdbclient/NativeAPI.actor.cpp fdbserver/ClusterRecovery.actor.cpp fdbserver/MasterInterface.h fdbserver/masterserver.actor.cpp flow/error_definitions.h	2022-03-30 22:28:06 -07:00
Dan Lambright	f867474b05	Respond to Jingyu's comments 3/24	2022-03-25 10:50:41 -04:00
Dan Lambright	12e88a8ef5	Error in previous commit	2022-03-23 14:16:45 -04:00
Dan Lambright	f23f451cc4	Fix bug computing tlog count per log group	2022-03-22 14:12:26 -04:00
sfc-gh-tclinkenbeard	a71099471b	Update copyright header dates	2022-03-21 13:36:23 -07:00
Dan Lambright	d69aa8ae92	retain tlog count per log group, add fix dropped in previous rebase	2022-03-21 15:08:13 -04:00
Dan Lambright	6e507b8c07	refactor unicast recovery	2022-03-17 12:25:50 -04:00
Dan Lambright	b529801407	Respond to Jingyu's comments	2022-03-15 19:17:54 -04:00
Dan Lambright	de73fc03dc	fix recovery algorithm	2022-03-14 08:59:58 -04:00
Dan Lambright	9544379cdf	rebase	2022-01-20 11:12:33 -05:00
Dan Lambright	adc9055097	Do not restart recovery unless min DV of recovered tlog set goes down	2022-01-11 12:52:05 -05:00
Ata E Husain Bohra	936bf5336a	Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191) * Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine"" Major changes includes: 1. Re-revert Sequencer refactor commits listed below (in listed order): 1.a. This reverts commit `bb17e194d9`. 1.b. This reverts commit `d174bb2e06`. 1.c. This reverts commit `30b05b469c`. 2. Update Status.actor to track ClusterController interface to track recovery status. 3. Introduce a ServerKnob to define "cluster recovery trace event" prefix; for now keeping it as "Master", however, it should allow smooth transition to "Cluster" prefix as it seems more appropriate.	2022-01-06 12:15:51 -08:00
Dan Lambright	49e89571fa	Set recoverAt to max(all tlogs rv) for recovered (crashed) tLogs in UNICAST mode.	2022-01-04 12:27:20 -05:00
Aaron Molitor	30b05b469c	Revert "Refactor: ClusterController driving cluster-recovery state machine" This reverts commit `dfe9d184ff`.	2021-12-24 11:25:51 -08:00
Ata E Husain Bohra	dfe9d184ff	Refactor: ClusterController driving cluster-recovery state machine At present, cluster recovery process consists of following steps: 1. ClusterController clusterWatchDatabase actor recruits master/sequencer process. 2. Sequencer process implements the cluster recovery state machine, responsible to recruit all other processes as well restore the cluster state. Patch proposes a scheme where the cluster recovery state machine is implemented and driven by the ClusterController process instead of the Sequencer process. Advantages of the scheme could be: 1. Simplified design where ClusterController recruits "sequencer" process like other worker processes compared to current scheme where "sequencer" process gets special treatment. In newer scheme sequencer is responsible for maintaining/providing "committed version" (as expected). 2. ClusterController is responsible for worker processes recruitment, the sequencer though orchestrating the recovery state machine, it need to reachout to the ClusterController for recruiting worker processes etc. NOTE: Patch has moved the recovery state machine code from 'sequencer' -> 'cluster-controller' process, however, necessary updates were done for both functionality as well as performance improvement reasons. Next Steps: Cluster recovery documentation will be updated in near future.	2021-12-22 14:06:27 -08:00
Dan Lambright	792d7d288d	address review comments	2021-12-19 12:50:59 -05:00
Dan Lambright	0222d8669d	fix simulation failures	2021-12-10 09:56:21 -05:00
Dan Lambright	faef404279	system rv is max of tlog's rv	2021-11-15 09:42:01 -05:00
Dan Lambright	4979ccb889	commits recovered if written to every tlog minus failure tolerance.	2021-11-12 12:10:04 -05:00
Dan Lambright	0f99ad582b	first cut unicast recovery	2021-11-10 12:31:16 -05:00
Lukas Joswiak	3988b11fd6	Cleanup	2021-11-09 12:29:48 -08:00
Lukas Joswiak	3e2c65bb11	Allow tlog to join another cluster but retain its data	2021-11-09 12:29:48 -08:00
Lukas Joswiak	30867750b5	Add protection against storage and tlog data deletion when joining a new cluster	2021-11-09 12:29:47 -08:00
Dan Lambright	58e1888d8e	remove network hop by getting previous commit versions in GetCommitVersionRequest	2021-09-30 11:51:57 -04:00
Sreenath Bodagala	184c134b8a	- Resolve merge conflicts	2021-09-17 20:25:16 +00:00
Sreenath Bodagala	2aa3b44d4e	Merge remote-tracking branch 'apple-upstream/master' into version-vector-prototype - Conflicts: fdbserver/LogSystem.h fdbserver/LogSystemConfig.h fdbserver/TagPartitionedLogSystem.actor.cpp - Files modified during merge: modified: fdbserver/LogSystem.cpp modified: fdbserver/LogSystemConfig.cpp	2021-09-17 19:36:18 +00:00
Xiaoge Su	abf73047ca	Enforce std:: specifier rather than using namespace	2021-09-16 19:40:28 -07:00
Xiaoge Su	c32c3b6ec4	fixup! Reformat the code per github's requirement	2021-09-12 14:17:19 -07:00
Xiaoge Su	40648dbb31	fixup! Update code per comment Also fix the issue that TagPartitionedLogSystem.actor.cpp should include TagPartitionedLogSystem.actor.h	2021-09-12 14:17:19 -07:00
Xiaoge Su	ecca4edeb4	Create TagPartitionedLogSystem.actor.h TagPartitionedLogSystem.actor.h contains the struct of TagPartitionedLogSystem.	2021-09-12 14:17:19 -07:00
Sreenath Bodagala	a081c0baa5	Merge remote-tracking branch 'apple-upstream/master' into version-vector-prototype	2021-08-05 22:40:32 +00:00
yao-xiao-github	8609b45354	Add histograms to CommitProxyServer. (#5299)	2021-08-05 09:17:37 -07:00
Andrew Noyes	353efe7db2	Merge pull request #5264 from sfc-gh-tclinkenbeard/fix-more-clang-warnings Enable more warnings for `clang`	2021-07-29 15:43:54 -07:00

381 Commits