Commit Graph

65 Commits

Author SHA1 Message Date
Oleg Samarin 4c9df78076 dstonly 2020-07-10 15:13:42 +03:00
Evan Tschannen ced65cd30b finished explicitly versioning everything stored in the database 2020-05-22 17:14:21 -07:00
Evan Tschannen 0420b3e786 fix compile error 2020-04-29 14:05:53 -07:00
Evan Tschannen 7cebe743f9 A number of bug fixes of rare correctness errors 2020-04-29 13:50:13 -07:00
Evan Tschannen a486ec2de0 pipelined fdbdr status 2020-02-25 15:48:00 -08:00
Evan Tschannen daee15cbb5 fix: starting a DR should do the commit on the first proxy to ensure all mutations from previous backups have been flushed 2020-02-25 12:35:24 -08:00
Evan Tschannen 73ad702d14 Clients which fetch status should not disconnect from the coordinators and cluster controller between each retrieval 2020-01-22 15:41:22 -08:00
Steve Atherton 4ff058e86b Backup and DR layer status document generation now uses snapshot reads for all keys read to avoid unnecessary conflicts when read during a status update or cleanup transaction. Since many of the keys read use wrapper functions, all of the underlying functions in BackupAgentBase and its two implementations also required a snapshot mode argument. All snapshot arguments default to false to match the underlying FDB API get/getrange methods. 2019-12-19 00:29:35 -08:00
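A minimal sketch of the snapshot-read pattern this commit applies throughout BackupAgentBase; the status key and the surrounding actor are illustrative, not the real layer-status schema:

    ACTOR Future<Optional<Value>> readStatusField(Database cx, Key statusKey) {
        state ReadYourWritesTransaction tr(cx);
        loop {
            try {
                // snapshot = true: no read conflict range is added, so a concurrent
                // backup/DR task touching this key will not abort the status read.
                Optional<Value> v = wait(tr.get(statusKey, /* snapshot = */ true));
                return v;
            } catch (Error& e) {
                wait(tr.onError(e));
            }
        }
    }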
Evan Tschannen 628b4e0220 added a warning if multiple log ranges exist for the same range 2019-10-02 17:06:19 -07:00
Evan Tschannen 9ec9f41d34 backup and DR would not share mutations if they were started on different versions of FDB 2019-10-01 18:52:07 -07:00
Evan Tschannen ef01ad2ed8 optimized log range clearing to clear everything for each possible hash (256 clears) if that would be more efficient than one clear per elapsed second
aborting a DR without the --cleanup flag will still attempt to clean up for 30 seconds before giving up
added a cleanup command to fdbbackup which can remove mutations from orphaned DRs that were stopped without the --cleanup flag
2019-09-27 18:32:27 -07:00
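A hedged sketch of the clearing decision described in the commit above; logRangeForHash and logRangeForSecond are hypothetical helpers standing in for the real backup log key schema in BackupAgentBase:

    void clearLogRange(Transaction& tr, UID logUid, Version beginVersion, Version endVersion) {
        const int64_t versionsPerSecond = 1000000; // FDB advances roughly 1M versions per second
        int64_t elapsedSeconds = (endVersion - beginVersion) / versionsPerSecond + 1;
        if (elapsedSeconds > 256) {
            // Cheaper to issue one clear per possible one-byte hash (256 clears total).
            for (int h = 0; h < 256; ++h)
                tr.clear(logRangeForHash(logUid, h));          // hypothetical helper
        } else {
            // Otherwise issue one clear per elapsed second of log data.
            for (int64_t s = 0; s < elapsedSeconds; ++s)
                tr.clear(logRangeForSecond(logUid, beginVersion + s * versionsPerSecond)); // hypothetical helper
        }
    }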
Jingyu Zhou 3a63d053e9 Address review comments for PR#1725 2019-06-20 14:06:32 -07:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
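An illustrative stand-in for the accessor shape this commit introduces; FDB's actual DeterministicRandom is its own generator type, not std::mt19937_64:

    #include <random>

    // One generator per thread. The deterministic generator must be seeded explicitly
    // on each thread that uses it; the nondeterministic one may self-seed.
    std::mt19937_64& deterministicRandomSketch() {
        thread_local std::mt19937_64 generator;
        return generator;
    }

    void seedDeterministicRandomSketch(uint64_t seed) {
        deterministicRandomSketch().seed(seed);
    }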
Andrew Noyes 781b6ece77 Fix OPEN_FOR_IDE -Wunused-variable warnings
CC #1255, #1173
2019-04-16 15:28:01 -07:00
mpilman 1c16f87a4e Remove trace-calls to printable (in non-workloads) 2019-04-05 13:12:19 -07:00
Steve Atherton 8aab719c22 Merge branch 'master' into feature-backup-json 2019-03-12 18:23:16 -07:00
Stephen Atherton 023bbb566f Renamed backup state enums for clarity, added backup state names. Changed Epochs to EpochSeconds in backup JSON along with some other renaming/moving of fields, and added information about snapshot dispatch. Changed timestamp format for input/output in all backup/restore contexts to be a fully qualified time with timezone offset. Added information about the last snapshot dispatch to backup config and status (not yet populated). 2019-03-10 16:00:01 -07:00
Evan Tschannen f1897f3eb6 Merge branch 'master' into feature-metadata-version
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
2019-03-04 21:06:16 -08:00
Balachandar Namasivayam a258df32f6 Skip switchover checks for force option. 2019-03-04 15:58:36 -08:00
Balachandar Namasivayam 7cf71b0931 Add some basic checks before doing an atomic switchover. 2019-03-01 14:49:04 -08:00
Evan Tschannen 3da85f3acd implemented the \xff/metadataVersion key, which can be used by layers to help them cheaply cache metadata and know when their cache is invalid 2019-02-28 17:45:00 -08:00
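A hedged sketch of how a layer might use the new key; the layer's own metadata key and the LayerMetadataCache structure are hypothetical:

    ACTOR Future<Value> getLayerMetadata(Database cx, Reference<LayerMetadataCache> cache) {
        state ReadYourWritesTransaction tr(cx);
        loop {
            try {
                // Reading \xff/metadataVersion is cheap: it arrives with the read
                // version and does not require a storage server round trip.
                state Optional<Value> mv = wait(tr.get(LiteralStringRef("\xff/metadataVersion")));
                if (cache->valid && mv == cache->metadataVersion)
                    return cache->metadata;                    // cache is still valid
                // Cache is stale: re-read the layer's metadata (hypothetical key).
                Optional<Value> metadata = wait(tr.get(LiteralStringRef("/myLayer/metadata")));
                cache->metadata = metadata.orDefault(Value());
                cache->metadataVersion = mv;
                cache->valid = true;
                return cache->metadata;
            } catch (Error& e) {
                wait(tr.onError(e));
            }
        }
    }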
mpilman c1592b4c3a Several minor fixes within fdbclient 2019-02-19 15:16:59 -08:00
mpilman 78dd80ea8a Proper fwd decl in BackupAgent
Also BackupAgent.h -> BackupAgent.actor.h
2019-02-19 15:16:59 -08:00
mpilman 3cb2391b58 use proper fwd declarations in ManagementAPI
Also ManagementAPI.h -> ManagementAPI.actor.h
2019-02-19 15:16:59 -08:00
mpilman 479a4d7c22 Minor fixes in fdbclient for intellisense 2019-02-19 15:16:59 -08:00
Andrew Noyes 067a445e06 Replace unused _ variables with wait(success(...)) 2019-02-12 17:30:30 -08:00
A.J. Beamon d4d5740282 * Add Optional.map and ErrorOr.map.
* Rename Optional/ErrorOr cast_to to castTo.
* Make printable(Optional<T>) templated rather than restricted to StringRef types.
* Fixes bug in (unused) ErrorOr.castTo where an ErrorOr that was not set would lose its error.
2019-01-11 09:03:38 -08:00
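A usage sketch under the assumption that the signatures match the description above (flow's Optional; the explicit template argument on map may differ between FDB versions):

    Optional<int> bytes = 4096;
    // map applies the function only when a value is present; an empty Optional stays empty.
    Optional<std::string> text = bytes.map<std::string>([](int b) { return std::to_string(b) + " bytes"; });
    // castTo (formerly cast_to) converts the contained type, staying empty if unset.
    Optional<double> asDouble = bytes.castTo<double>();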
Robert Escriva 268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
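Using the commit's own foo/bar.h example, the change is simply:

    // Before: only resolvable from files under foo/
    #include "bar.h"

    // After: resolvable from anywhere in the tree, relative to the repository root
    #include "foo/bar.h"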
Alex Miller 535b5701e5 Rewrite all `Void _ = wait(...)` -> `wait(...)`.
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorcompiled source through clang.
2018-08-14 15:50:26 -07:00
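The mechanical rewrite, shown on a representative line (tr->commit() is just an example callee):

    // Before:
    Void _ = wait(tr->commit());
    // After:
    wait(tr->commit());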
A.J. Beamon 99c9958db7 Some more trace event normalization 2018-06-08 13:57:00 -07:00
A.J. Beamon e5488419cc Attempt to normalize trace events:
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.

Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.

This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
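An illustrative before/after for the rules above; the event and detail names are made up for the example:

    double elapsed = 1.5; // example value, in seconds
    // Before: lowercase/underscored names, milliseconds in details
    TraceEvent("recovery_duration").detail("elapsed_ms", elapsed * 1000);
    // After: names start uppercase with no underscores, details use seconds
    TraceEvent("RecoveryDuration").detail("ElapsedSeconds", elapsed);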
Evan Tschannen e82985aea2 fix: continue setting beginVersion so that versions between 5.2.0 and 5.2.2 do not crash when decoding tasks created by 5.2.3 2018-06-06 13:34:22 -07:00
Evan Tschannen 4120062bb9 fix: backup initialized its begin version at 1 instead of the read version of the starting transaction
fix: erasing log ranges did not properly divide the work between transactions, so some transactions became too large
2018-06-06 13:05:53 -07:00
Evan Tschannen 8930c2e3db DR upgrade tests now test the durability of the data. 2018-05-09 15:11:05 -07:00
Yichi Chiang c721ab6854 Fix review comments 2018-04-27 13:54:34 -07:00
Yichi Chiang 6bddf8aefa Upgrade DR from 5.1 to 5.2 2018-04-26 17:24:40 -07:00
Evan Tschannen 57d650062a merge 5.1 into 5.2 2018-04-18 20:44:31 -07:00
Evan Tschannen 77d100e1e6 fix: if a DR cluster has a much lower version than the primary database, it would take a long time to process the empty versions. Bump the version of the DR cluster before starting the DR to avoid this problem. 2018-04-18 19:37:24 -07:00
yichic ede5cab192 Merge pull request #89 from yichic/share-log-mutations-5.2
Share log mutations 5.2
2018-03-19 12:01:26 -07:00
Yichi Chiang ec02e54f64 Refactor EraseLogData() 2018-03-19 11:56:01 -07:00
Yichi Chiang 1f2602d2b3 Fix all review comments 2018-03-19 11:33:33 -07:00
Yichi Chiang d6559b144f Share log mutations between backups and DRs which have the same backup range 2018-03-19 11:32:50 -07:00
Stephen Atherton dcf5b2e35d All readCommitted() functions now use Transaction instead of ReadYourWritesTransaction to reduce memory consumption in Backup and DR. Also removed one readCommitted() variant as it is just a special case of another definition. 2018-03-07 13:56:34 -08:00
Alec Grieser 0bae9880f1 remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
Alex Miller f021934792 Fix yet another VersionStamp DR bug.
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.

To simplify the problem, if we have two concurrent transaction loops:

    retry {
      if (rand() > .5) tr->set('x', rand());
      if (rand() > .5) tr->set('y', rand());
    }

and

    retry {
      x = tr->get('x')
      y = tr->get('y')
      if (x > y) {
        tr->set('y', x)
      }
      tr->commit();
    }

It is not guaranteed that y >= x in the database after the second transaction
commits.  This is because it could read an older snapshot of x and y, in which
x was not greater than y, and thus not invoke set.  This means that `tr` is now a
read-only transaction, which no-ops out of committing as an "optimization".  If
we add any write conflict range to `tr`, it will then be conflict checked and
committed, which would guarantee that y >= x when it commits.

Replace the first transaction with dumpData, and the second with version
upgrade transaction, and you have the bug that we're fixing, why, and how.
2018-01-05 14:23:11 -08:00
Alex Miller b264a98aea Fix yet another VersionStamp DR bug.
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.

To simplify the problem, if we have two concurrent transaction loops:

    retry {
      if (rand() > .5) tr->set('x', rand());
      if (rand() > .5) tr->set('y', rand());
    }

and

    retry {
      x = tr->get('x')
      y = tr->get('y')
      if (x > y) {
        tr->set('y', x)
      }
      tr->commit();
    }

It is not guaranteed that y >= x in the database after the second transaction
commits.  This is because it could read an older snapshot of x and y, in which
x was not greater than y, and thus not invoke set.  This means that `tr` is now a
read-only transaction, which no-ops out of committing as an "optimization".  If
we add any write conflict range to `tr`, it will then be conflict checked and
committed, which would guarantee that y >= x when it commits.

Replace the first transaction with dumpData, and the second with version
upgrade transaction, and you have the bug that we're fixing, why, and how.
2018-01-04 17:29:43 -08:00
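As the two messages above explain, the fix amounts to making the second loop carry a write conflict range even on its read-only path; in the same pseudocode register:

    retry {
      x = tr->get('x')
      y = tr->get('y')
      if (x > y) {
        tr->set('y', x)
      } else {
        // Force conflict checking even when nothing is written, so a concurrent
        // change to 'x' or 'y' aborts this transaction instead of letting the
        // stale snapshot slip through.
        tr->addWriteConflictRange('y')
      }
      tr->commit();
    }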
Alex Miller f70e3b9fe8 Add or change a bunch of comments to provide descriptions of function contracts.
This cleans up a bit of the VersionStamp DR work I did, and leaves hints and
advice for anyone who will be touching mutation applying code in the future.
2017-12-20 16:57:14 -08:00
Evan Tschannen 38cff7d4a5 every transaction which clears applyMutation keys does so on the first proxy 2017-12-20 15:41:47 -08:00
Evan Tschannen 982f0dcb1e Merge pull request #222 from cie/alexmiller/drtimefix2
Fix yet another VersionStamp DR issue.
2017-12-20 15:09:23 -08:00
Alex Miller b5a6bc0ab7 Fix VersionStamp problems by instead adding a COMMIT_ON_FIRST_PROXY transaction option.
Simulation identified the fact that we can violate the
VersionStamps-are-always-increasing promise via the following series of events:

1. On proxy 0, dumpData adds commit requests to proxy 0's commit promise stream
2. To any proxy, a client submits the first transaction of abortBackup, which stops further dumpData calls on proxy 0.
3. To any proxy that is not proxy 0, submit a transaction that checks if it needs to upgrade the destination version.
4. The transaction from (3) is committed
5. Transactions from (1) are committed

This is possible because the dumpData transactions have no read conflict
ranges, and thus it's impossible to make them abort due to "conflicting"
transactions.  There's also no promise that if client C sends a commit to proxy
A, and later a client D sends a commit to proxy B, that B must log its commit
after A.  (We only promise that if C is told it was committed before D is told
it was committed, then A committed before B.)

There was a failed attempt to fix this problem.  We tried to add read conflict
ranges to dumpData transactions so that they could be aborted by "conflicting"
transactions.  However, this failed because this now means that dumpData
transactions require conflict resolution, and the stale read version that they
use can cause them to be aborted with a transaction_too_old error.
(Transactions that don't have read conflict ranges will never return
transaction_too_old, because with no reads, the read snapshot version is
effectively meaningless.)  This was never previously possible, so the existing
code doesn't retry commits, and to make things more complicated, the dumpData
commits must be applied in order.  This would require either adding
dependencies to transactions (if A is going to commit then B must also be/have
committed), which would be complicated, or submitting transactions with a fixed
read version, and replaying the failed commits with a higher read version once
we get a transaction_too_old error, which would unacceptably slow down the
maximum throughput of dumpData.

Thus, we've instead elected to add a special transaction option that bypasses
proxy load balancing for commits, and always commits against proxy 0.  We can
know for certain that after the transaction from (2) is committed, all of the
dumpData transactions that will be committed have been added to the commit
promise stream on proxy 0.  Thus, if we enqueue another transaction against
proxy 0, we can know that it will be placed into the promise stream after all
of the dumpData transactions, thus providing the semantics that we require:  no
dumpData transaction can commit after the destination version upgrade
transaction.
2017-12-20 15:04:04 -08:00
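A hedged sketch of the resulting usage; destVersionKey and the surrounding actor are illustrative, but the transaction option is the one this commit adds:

    ACTOR Future<Void> upgradeDestinationVersion(Database cx, Key destVersionKey, Version v) {
        state Transaction tr(cx);
        loop {
            try {
                // Bypass proxy load balancing for this commit: it always goes to
                // proxy 0, so it is enqueued after every dumpData commit already
                // in that proxy's commit promise stream.
                tr.setOption(FDBTransactionOptions::COMMIT_ON_FIRST_PROXY);
                tr.set(destVersionKey, BinaryWriter::toValue(v, Unversioned()));
                wait(tr.commit());
                return Void();
            } catch (Error& e) {
                wait(tr.onError(e));
            }
        }
    }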