foundationdb

Commit Graph

Author	SHA1	Message	Date
Evan Tschannen	e0be631414	shard the txs tag so that more transaction logs are involved in its recovery	2019-06-19 18:15:09 -07:00
Alex Miller	51fd42a4d2	Merge remote-tracking branch 'upstream/master' into spilled-only-peek	2019-06-18 17:33:52 -07:00
sramamoorthy	1190f2f33d	rebased related changes	2019-05-28 22:07:46 -07:00
sramamoorthy	b43c100e57	TLog bug fixes	2019-05-28 22:07:46 -07:00
sramamoorthy	3877f87481	comment change in tLogCommit	2019-05-28 22:07:46 -07:00
sramamoorthy	31b6c86650	ignorePopDeadline to have high limit in simulator - ignorePopDeadline to have higher limit in simulator to accommodate for the buggify delays and make snapshot succeed. - introduce a new knob for auto resetting the disabling of tlog pop	2019-05-28 22:07:46 -07:00
sramamoorthy	b1b96946af	logData->stop check right after execOpHold wait	2019-05-28 22:07:46 -07:00
sramamoorthy	5749e220bd	use FlowLock for implementing critical section Instead of using Promises and future to implement critical section use FlowLock	2019-05-28 22:07:46 -07:00
sramamoorthy	e6c0b87a4d	remove unused variable	2019-05-28 22:07:46 -07:00
sramamoorthy	f27a40f118	execProcessingHelper made synchronous tLogCommit expects no blocking between duplicate check and setting of the new version, that constraint was broken when synchronous execProcessingHelper was introduced. As a fix, execProcessingHelper was made asynchronous.	2019-05-28 22:07:46 -07:00
sramamoorthy	d3a179b6f9	Multiple bug fixes - wait for snapTLogFailKeys in a loop, otherwise in some race condition it can cause a false assert - in single region, there does not seem to be a guarantee of tagLocalityListKey for a given DC ID, avoiding that assert for now - to find the workers that are coordinators, looking up by primary address is not sufficient in some cases, hence looking by both primary and secondary address - test make files to reflect the location of the new test cases	2019-05-28 22:07:46 -07:00
sramamoorthy	dcd2d96751	make spawnProcess predictable in the simulator	2019-05-28 22:07:46 -07:00
sramamoorthy	4083af0b01	Avoid using trackLatest for TLog pop test cases	2019-05-28 22:07:46 -07:00
sramamoorthy	ec7834e2f7	code reorganization and address comments	2019-05-28 22:07:46 -07:00
sramamoorthy	b6e037ffbc	Replace fork with boost::process::child	2019-05-28 22:07:46 -07:00
sramamoorthy	e91c76834e	tlog: move snap create part to independent funcs	2019-05-28 22:07:46 -07:00
sramamoorthy	61e93a9304	Address review comments and minor fixes	2019-05-28 22:07:46 -07:00
sramamoorthy	9e3104c2d4	Fix: races in async exec leading to bad backup	2019-05-28 22:07:46 -07:00
sramamoorthy	cfdad0c5e6	tlog to snapshot exactly at exec version	2019-05-28 22:07:46 -07:00
sramamoorthy	539e65efad	Skip parsing mutations if it is tagged for TxsTag In Tlog, if a mutation is targeted for TxsTag then skip from parsing them.	2019-05-28 22:07:46 -07:00
sramamoorthy	17ecba8313	trace cleanup and other indentation changes	2019-05-28 22:07:46 -07:00
sramamoorthy	aa79480d69	changes to make fdbfork asynchronous	2019-05-28 22:07:46 -07:00
sramamoorthy	4016f16c76	Fix few compilation and bugs in rebase	2019-05-28 22:07:46 -07:00
sramamoorthy	3d5998e9dd	tlog: when pops are disabled, store them & replay In Tlogs, disable pop is done while taking snapshots. Earlier, tlogs were ignoring the pops if it got pop requests when pops were disabled. In this change, instead of ignoring the pop - it remembers the list of pops in-memory and plays them once the popping is enabled.	2019-05-28 22:07:46 -07:00
sramamoorthy	4bc4c615da	exec op to all tlog, restore change in test & other - exec operation to go to all the TLogs - minor bug fix in tlog - restore implementation for the simulator - restore snap UID to be stored in restartInfo.ini - test cases added - indentation and trace file fixes	2019-05-28 22:07:46 -07:00
sramamoorthy	72dd067173	Trace message changes and fix few FIXMEs	2019-05-28 22:07:46 -07:00
sramamoorthy	69edefe68b	Snapshot based backup and restore implementation	2019-05-28 22:07:46 -07:00
A.J. Beamon	f417e60264	Merge branch 'merge-release-6.1-into-master' into thread-safe-random-number-generation # Conflicts: # fdbserver/QuietDatabase.actor.cpp	2019-05-23 09:52:00 -07:00
A.J. Beamon	d29c7e4c9b	Merge branch 'release-6.1' into merge-release-6.1-into-master # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/QuietDatabase.actor.cpp # versions.target	2019-05-23 09:28:45 -07:00
Evan Tschannen	4059d68348	fix: the tlog would not pop data from the disk queue after a storage server was removed, because the tag still exists in memory on the logs; fix: we could incorrectly make data durable if eraseMessagesFromMemory was in progress while running updatePersistentData; the quiet database check now ensures that tlogs have no more than 30 seconds of versions unpopped from the disk queue	2019-05-20 23:58:45 -07:00
Alex Miller	4eb4c03ce5	Save TLog resources by letting peek request only spilled data. If a peek is entirely fulfilled from spilled data, then it's likely that the next peek will be also. It is thus wasteful for each of these peeks to call peekMessagesFromMemory, which memcpy's excessively, and then throw all that data away without using it. Now, TLogs will give a hint back to peek cursors about if the provided reply was served entirely from the spilled data, which peek cursors then feed back as the hint into their next request. At some point, a cursor will send a request for only spilled data, get an incomplete response, and then be told to send its next request as one that peeks from memory as well, and then it will fully catch up.	2019-05-14 15:38:48 -10:00
A.J. Beamon	5f55f3f613	Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.	2019-05-10 14:01:52 -07:00
Evan Tschannen	22499666d0	Merge branch 'release-6.1' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/LogRouter.actor.cpp # flow/Trace.cpp # versions.target	2019-05-08 18:19:35 -07:00
Evan Tschannen	93eb2a9395	Merge pull request #1527 from alexmiller-apple/tstlog-6.1 Spill-by-reference knob + TLog6.0 Spilled Peek deprioritization	2019-05-03 17:19:45 -07:00
Alex Miller	c918b21137	Deprioritize spilled peeks in spill-by-value, and improve its logic. This deprioritizes before calling peekMessagesFromMemory, which should improve the memory usage of the TLog, and makes sure to keep txsTag peeks at a high priority to help recoveries stay fast.	2019-05-03 15:27:11 -07:00
Evan Tschannen	8590b710bf	added additional logging on the logs and log routers	2019-05-02 17:24:39 -07:00
Jingyu Zhou	8b5449e608	Fix review comments for PR #1473	2019-04-29 16:45:42 -07:00
Jingyu Zhou	5462f560e7	Add pseudo locality for log routers and tlogs This changes the logic of pop operations from log routers (LG): - LG pops tagLocalityLogRouterMapped from TLogs; - TLog converts tagLocalityLogRouterMapped back to tagLocalityLogRouter before popping. Later when we add more pseudo localities, the same pattern can be used.	2019-04-23 21:35:56 -07:00
Jingyu Zhou	0b1984978a	Small code refactoring.	2019-04-21 10:41:07 -07:00
Jingyu Zhou	ec1bc5cfca	Add LogSystemType enum	2019-04-21 10:41:07 -07:00
mpilman	1c16f87a4e	Remove trace-calls to printable (in non-workloads)	2019-04-05 13:12:19 -07:00
Evan Tschannen	31ed73d9f5	Ported the bug fix https://github.com/apple/foundationdb/pull/1379 to OldTLogServer_6_0	2019-04-02 15:27:37 -07:00
Evan Tschannen	b6008558d3	renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>(); eliminated an unnecessary copy from the proxy commit path; eliminated an unnecessary copy from buffered peek cursor	2019-03-28 11:52:50 -07:00
Evan Tschannen	1c6ad6d307	fix: change the location where stopped is checked, because a yield could cause stopped to be set after the existing check	2019-03-20 19:33:09 -07:00
Evan Tschannen	5873705228	tlog commits very rarely take an additional 6 seconds	2019-03-11 12:11:17 -07:00
Evan Tschannen	80c3f2f8e2	added status fields detailing which processes are degraded, and also the total number of degraded processes	2019-03-10 22:58:15 -07:00
Evan Tschannen	044b6b4f8a	Merge branch 'master' into feature-degraded-tlog # Conflicts: # fdbserver/ClusterController.actor.cpp	2019-03-08 22:50:41 -05:00
Evan Tschannen	53f16b5347	when a tlog queue commit takes longer than 5 seconds, its process is marked as degraded	2019-03-08 11:46:34 -05:00
Alex Miller	c6a65389ae	Remove noexcept macro and replace with BOOST_NOEXCEPT. BOOST_NOEXCEPT does what the noexcept macro was supposed to do, but in a way that is correctly maintained over time.	2019-03-05 22:06:12 -08:00
Alex Miller	2aa527c0ef	Fix a bug resulting from concurrent TLog changes. TLogServer was forked into OldTLogServer_6_0 at the same time that `3247d594` modified TLogServer, so the modification never made it into OldTLogServer_6_0, resulting in a rare failure. Manual code inspection revealed that there was also `78976161` that concurrently modified TLogServer, so that change was copied to OldTLogServer_6_0 as well.	2019-03-04 01:42:38 -08:00
Alex Miller	fa1bfbc0c5	Replace TLogSpillType with TLogVersion in worker and filenames.	2019-02-19 22:30:15 -08:00
Alex Miller	df61bd07db	Save an (unused) copy of the previous TLog.	2019-02-19 22:18:10 -08:00

152 Commits