foundationdb

Commit Graph

Author	SHA1	Message	Date
Evan Tschannen	3c0c03c004	fix: remote logs should reply until they have recovered through recoverAt	2018-04-16 17:25:49 -07:00
Evan Tschannen	3018a7b1b3	fix: the known committed version of a newly initialized log is 1, since by definition the first commit must have succeeded	2018-04-16 10:42:48 -07:00
Evan Tschannen	5533016f1e	fix: tlogs are now initialized immediately, instead of when starting the core, this must be done to pop the log routers during recovery; fix: log router start version must be the same as remote log start version	2018-04-15 14:33:07 -07:00
Evan Tschannen	65e69620a7	fix: unrecoveredBefore on a new log is at minimum 1	2018-04-13 10:41:30 -07:00
Evan Tschannen	1af5ac0d9d	fix: a number of different problems prevented tlogs from using log routers during recovery	2018-04-12 15:20:54 -07:00
Evan Tschannen	a738c4bec1	fix: if the known committed version is equal to the recovery version we do not need to copy any data	2018-04-09 20:48:55 -07:00
Evan Tschannen	419951f601	fix: need to initialize tlog versions to less than the startVersion	2018-04-09 17:17:11 -07:00
Evan Tschannen	4c89f721cd	fix: do not include logRouter tags in lock results	2018-04-09 10:48:57 -07:00
Evan Tschannen	7af892f50b	first working version of non-copying recovery working with fearless configurations	2018-04-08 21:24:05 -07:00
Evan Tschannen	331e707684	fix: pop all tags that did not have data at the recovery version because fully popped tags may come back when pullAsyncData re-indexes the mutations	2018-03-31 16:47:56 -07:00
Evan Tschannen	96fffe2cea	fix: do not update version if the log has been stopped	2018-03-30 22:11:42 -07:00
Evan Tschannen	1a4ded1c99	support upgrades by merging tags associated with the different peek requests	2018-03-29 17:54:08 -07:00
Evan Tschannen	b36e08f08f	first version of non-copying recovery. Upgrades are broken, and it has not been tested using fearless configurations yet	2018-03-29 15:12:38 -07:00
Evan Tschannen	0746fe4d56	optimized tag lookups on the tlog by removing one level of vectors	2018-03-20 10:41:42 -07:00
Evan Tschannen	d8e064d8bb	fix: when a new log is recruited on a shared log, all outstanding commits need to be notified that they are stopped, because there is no longer a guarantee that their queueCommittedVersion will advance	2018-03-19 17:48:28 -07:00
Evan Tschannen	54be14000d	do not deserialize tags	2018-03-17 11:24:18 -07:00
Evan Tschannen	9c8cb445d6	optimized the tlog to use a vector for tags instead of a map	2018-03-17 10:36:19 -07:00
Evan Tschannen	fecfea0f7d	fix: messages vector was not cleared	2018-03-17 10:24:44 -07:00
Evan Tschannen	ccd70fd005	The tlog uses the tags embedded in the message instead of a separate vector of locations; optimized remote tlog committing to avoid re-serializing the message	2018-03-16 16:47:05 -07:00
Evan Tschannen	820382ea68	optimized the log router commit path to avoid re-serializing the data	2018-03-16 11:40:21 -07:00
Evan Tschannen	f6a22c1035	fix: the recovery actor was holding a copy of the tlogInterface after the tlog was removed	2018-03-12 16:56:34 -07:00
A.J. Beamon	b25810711c	Merge branch 'master' into release-5.2	2018-03-05 10:32:57 -08:00
Balachandar Namasivayam	8ae640c062	Addressed review comments.	2018-03-02 17:56:49 -08:00
Balachandar Namasivayam	11df1aeabf	Add new api to get shared tlogs id and address	2018-03-02 16:50:30 -08:00
Evan Tschannen	e3c6b66240	fix: do not commit more data after being stopped; fix: prioritize dc locality above exclusion to prevent being stuck after excluding all machines in a data center	2018-02-26 13:13:37 -08:00
Evan Tschannen	37a6a81634	Merge commit '7f6fc3e039c911cd84b8540f7f799fc38a1c1822' into feature-remote-logs # Conflicts: # fdbserver/workloads/RestartRecovery.actor.cpp	2018-02-23 12:33:28 -08:00
Evan Tschannen	ddb484143c	fix: do not peek from remote logs if they are not fully recovered	2018-02-21 14:06:44 -08:00
Alec Grieser	0bae9880f1	remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py	2018-02-21 10:25:11 -08:00
Evan Tschannen	1dc6a8d4bd	fix: the tlog can peek from log systems that have been recovered even if it does not match its recoverFrom set	2018-02-20 14:50:13 -08:00
Evan Tschannen	31b89a638f	added satellite_none and remote_none options to unconfigure from a fearless setup; fix: log_router configuration was broken	2018-02-17 13:51:17 -08:00
Evan Tschannen	1fedcba890	fix: do not use log router tags when configured without remote logs; fix: data distribution tracks undesired storage servers; re-enabled consistency check	2018-02-13 17:01:34 -08:00
Evan Tschannen	6b54d56ca7	gracefully exit if attempting to upgrade from 4.X versions	2018-01-30 17:10:50 -08:00
Evan Tschannen	29c5d4ad3d	upgrades from 5.X mostly supported, still some remaining correctness problems	2018-01-28 11:52:54 -08:00
Evan Tschannen	66b2218989	added tlog support for upgrading from 5.X clusters. Does not support upgrading from 4.X or earlier. Untested, storage servers still need the ability to change their tag.	2018-01-21 12:21:46 -08:00
Evan Tschannen	21482a45e1	Merge branch 'master' into feature-remote-logs # Conflicts: # fdbserver/DBCoreState.h # fdbserver/LogSystem.h # fdbserver/LogSystemPeekCursor.actor.cpp # fdbserver/TLogServer.actor.cpp	2018-01-14 13:40:24 -08:00
Evan Tschannen	be643d6937	fix: the tlog did not cancel recovery properly when stopped	2018-01-12 17:18:14 -08:00
Evan Tschannen	de119f192d	fixed a priority inversion where the tlog would prefer to copy data from the previous generation rather than make data durable (leading to being ratekeeper controlled)	2018-01-11 16:09:49 -08:00
Evan Tschannen	30710f7493	syncLogId was not necessary	2018-01-06 14:52:39 -08:00
Evan Tschannen	10c3fc165e	fix: after recovering from disk, only allow peeking data the was fully recovered	2018-01-06 13:49:13 -08:00
Evan Tschannen	63751fb0e2	fix: remote logs are not in the log system until the recovery is complete so they cannot be used to determine if this is the correct log system to recover from	2018-01-05 14:15:25 -08:00
Evan Tschannen	5ac4f73978	Merge branch 'release-5.1' into feature-remote-logs # Conflicts: # fdbclient/NativeAPI.actor.cpp # fdbrpc/Locality.h # fdbrpc/simulator.h # fdbserver/ApplyMetadataMutation.h # fdbserver/ClusterController.actor.cpp # fdbserver/LogSystemPeekCursor.actor.cpp # fdbserver/MasterProxyServer.actor.cpp # fdbserver/SimulatedCluster.actor.cpp # fdbserver/TLogServer.actor.cpp # fdbserver/TagPartitionedLogSystem.actor.cpp # fdbserver/WorkerInterface.h # fdbserver/masterserver.actor.cpp # flow/Net2.actor.cpp # tests/fast/SidebandWithStatus.txt # tests/rare/LargeApiCorrectnessStatus.txt # tests/slow/DDBalanceAndRemoveStatus.txt	2018-01-05 11:33:42 -08:00
Evan Tschannen	f2c4beed9f	fix: tlogFitness did not consider it better to have one tlog of a better fitness; fix: checkStable was not used in all places in better master exists; fix: we need to call checkOutstanding on worker registration in all cases; fix: in case persistentData is keyValueStoreMemory, we need to make sure it is fully recovered before writing to it	2018-01-04 11:33:02 -08:00
Alex Miller	c7dbd31a1e	Refactoring: Create a common prefixRange and do UID->Key once in backup.	2017-12-19 17:17:50 -08:00
Evan Tschannen	8c51bc4ac4	fixed low latency tests in a way that gives us better test coverage	2017-11-28 18:20:29 -08:00
Evan Tschannen	dc624a54dc	fix: avoid flushing large queues in simulation when checking latency	2017-11-27 17:23:20 -08:00
Evan Tschannen	df74e2a373	re-added support for non-copying tlog recovery	2017-10-24 15:09:31 -07:00
Evan Tschannen	15962cf079	Merge branch 'master' into feature-remote-logs # Conflicts: # fdbrpc/Locality.cpp # fdbrpc/Locality.h # fdbserver/ClusterController.actor.cpp # fdbserver/ClusterRecruitmentInterface.h # fdbserver/TLogServer.actor.cpp # fdbserver/TagPartitionedLogSystem.actor.cpp # fdbserver/WorkerInterface.h # fdbserver/fdbserver.vcxproj.filters # fdbserver/masterserver.actor.cpp # fdbserver/worker.actor.cpp # flow/error_definitions.h	2017-10-05 17:09:44 -07:00
Balachandar Namasivayam	0e153cdd35	Throttle Spammy logs. Three knobs are added. Trace Events are sampled and cached with an expiration set. Every TraceEvent above SevDebug is checked against this cache to see if it exceeded a set threshold. If yes, then throttle the TraceEvent. If a TraceEvent is throttled, a warning msg is logged.	2017-10-02 18:43:11 -07:00
Evan Tschannen	f75dfc3153	do not register with the master until recovery of the queue is complete, to avoid having the master wait a long time for a peek response	2017-09-18 17:39:12 -07:00
Evan Tschannen	36c98f18e9	do not register a worker with the cluster controller until it has finished recovering all files from disk	2017-09-15 10:57:58 -07:00

76 Commits