Commit Graph

835 Commits

Author SHA1 Message Date
Stephen Atherton aa8b4c52d5 Removed backup URL from trace events. 2017-12-21 18:22:14 -08:00
Stephen Atherton 93e426ccd2 Merge branch 'master' of github.com:apple/foundationdb 2017-12-21 17:21:20 -08:00
Stephen Atherton e8f9568bbe Simulation improvement, readCommitted() calls that run for a long time would sometimes go too slow depending on the buggified limit, so now the limit is updated for each fetch loop. 2017-12-21 17:21:05 -08:00
Evan Tschannen 69f7409c37 fix: latestRestorable was incorrect 2017-12-21 17:09:21 -08:00
Evan Tschannen 5ed080721d fix: atomic restore must wait for the restorable version is greater than the lock version
fix: latestRestorableVersion calculation was wrong
2017-12-21 15:45:10 -08:00
Yichi Chiang e750682b74 Merge pull request #227 from cie/add-datetime-conversion-error-message
Add datetime conversion error message in backup
2017-12-21 14:51:04 -08:00
Yichi Chiang 3616035415 Add datetime conversion error message in backup 2017-12-21 14:45:05 -08:00
Evan Tschannen 95b502e1d7 fix: we did not restore to the target version in all cases 2017-12-21 14:11:44 -08:00
Stephen Atherton e0ef5a9a20 Whitespace normalization. 2017-12-21 12:07:29 -08:00
Stephen Atherton 9dc952e3b2 Added blob credentials details to fdbbackup help. 2017-12-21 02:50:02 -08:00
Stephen Atherton ec28c77353 Merge branch 'master' of github.com:apple/foundationdb 2017-12-21 01:58:47 -08:00
Stephen Atherton e3aee45a74 Backup tools and agent now accept blob account credentials via files containing JSON which are specified using command line arguments and/or an environment variable. Improved fdbbackup help, clarifying which options are for which operations. Fdbbackup operations which do not need to use a database no longer require a cluster file parameter. Added eat() commands to StringRef for incrementally tokenizing strings using separator strings. 2017-12-21 01:58:15 -08:00
Evan Tschannen 86958cb08d Merge pull request #226 from cie/fix-taskBucket-unblockFuture
Modify TaskBucketCorrectness to support chain and multiple tasks
2017-12-20 18:00:54 -08:00
Yichi Chiang 91e5abeaa6 Modify TaskBucketCorrectness to support chain and multiple tasks 2017-12-20 17:02:49 -08:00
Alex Miller f70e3b9fe8 Add or change a bunch of comments to provide descriptions of function contracts.
This cleans up a bit of the VersionStamp DR work I did, and leaves hints and
advice for anyone who will be touching mutation applying code in the future.
2017-12-20 16:57:14 -08:00
Evan Tschannen 38cff7d4a5 every transaction which clears applyMutation keys does so on the first proxy 2017-12-20 15:41:47 -08:00
Evan Tschannen 982f0dcb1e Merge pull request #222 from cie/alexmiller/drtimefix2
Fix yet another VersionStamp DR issue.
2017-12-20 15:09:23 -08:00
Alex Miller b5a6bc0ab7 Fix VersionStamp problems by instead adding a COMMIT_ON_FIRST_PROXY transaction option.
Simulation identified the fact that we can violate the
VersionStamps-are-always-increasing promise via the following series of events:

1. On proxy 0, dumpData adds commit requests to proxy 0's commit promise stream
2. To any proxy, a client submits the first transaction of abortBackup, which stops further dumpData calls on proxy 0.
3. To any proxy that is not proxy 0, submit a transaction that checks if it needs to upgrade the destination version.
4. The transaction from (3) is committed
5. Transactions from (1) are committed

This is possible because the dumpData transactions have no read conflict
ranges, and thus it's impossible to make them abort due to "conflicting"
transactions.  There's also no promise that if client C sends a commit to proxy
A, and later a client D sends a commit to proxy B, that B must log its commit
after A.  (We only promise that if C is told it was committed before D is told
it was committed, then A committed before B.)

There was a failed attempt to fix this problem.  We tried to add read conflict
ranges to dumpData transactions so that they could be aborted by "conflicting"
transactions.  However, this failed because this now means that dumpData
transactions require conflict resolution, and the stale read version that they
use can cause them to be aborted with a transaction_too_old error.
(Transactions that don't have read conflict ranges will never return
transaction_too_old, because with no reads, the read snapshot version is
effectively meaningless.)  This was never previously possible, so the existing
code doesn't retry commits, and to make things more complicated, the dumpData
commits must be applied in order.  This would require either adding
dependencies to transactions (if A is going to commit then B must also be/have
committed), which would be complicated, or submitting transactions with a fixed
read version, and replaying the failed commits with a higher read version once
we get a transaction_too_old error, which would unacceptably slow down the
maximum throughput of dumpData.

Thus, we've instead elected to add a special transaction option that bypasses
proxy load balancing for commits, and always commits against proxy 0.  We can
know for certain that after the transaction from (2) is committed, all of the
dumpData transactions that will be committed have been added to the commit
promise stream on proxy 0.  Thus, if we enqueue another transaction against
proxy 0, we can know that it will be placed into the promise stream after all
of the dumpData transactions, thus providing the semantics that we require:  no
dumpData transaction can commit after the destination version upgrade
transaction.
2017-12-20 15:04:04 -08:00
Evan Tschannen 07efdc70c8 more fixes for windows compile 2017-12-20 14:39:23 -08:00
Evan Tschannen c51de3bb88 fixed windows compile issues 2017-12-20 13:48:31 -08:00
Stephen Atherton c1958b335a Compile fix on windows, can't access protected parent class member from static function, apparently. 2017-12-20 12:13:25 -08:00
Evan Tschannen 0ab0cf51a3 fix: snapshotDispatch signaled completion after the first snapshot finished 2017-12-20 12:07:35 -08:00
Evan Tschannen 50bc25d3c7 Merge pull request #225 from cie/continuous-backup
Continuous backup
2017-12-20 11:13:36 -08:00
Stephen Atherton b77276d2f0 First snapshot of a backup should go as fast as possible instead of using the configured snapshot interval. 2017-12-20 01:07:03 -08:00
Stephen Atherton 7caa012fbf Added snapshot interval option to "fdbbackup start" which defaults to a new knob's value. Added snapshot info to backup status text. Improvements to fdbbackup help. 2017-12-20 00:49:08 -08:00
Stephen Atherton d87aa521e9 Merge branch 'backup-container-refactor' into continuous-backup 2017-12-19 23:39:00 -08:00
Stephen Atherton 193c216f52 Merge pull request #224 from cie/add-fdbbackup-interface
Add fdbbackup interface
2017-12-19 23:33:17 -08:00
Stephen Atherton e0d9cea008 Merge branch 'master' into continuous-backup
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
#	fdbrpc/BlobStore.actor.cpp
2017-12-19 23:02:14 -08:00
Stephen Atherton 2cd1ff6aae Bug fix, in restore dispatch the apply lag was being retrieved before updating the apply end version which would make it look like mutations were finished applying early. 2017-12-19 18:11:40 -08:00
Stephen Atherton 61a043ebfa Added tr->reset() to prevent initial transaction loop attempts from having a higher chance of expiring. 2017-12-19 17:33:45 -08:00
Alex Miller c7dbd31a1e Refactoring: Create a common prefixRange and do UID->Key once in backup. 2017-12-19 17:17:50 -08:00
Alex Miller 1488c12c18 Simulation will return and error and print if any non-suppressed SevError events were logged.
This means that loops like `seed=1; while ./fdbserver -r simulation -s $seed;
do seed=$(($seed+1)); done` to find an example of an often failing test.  This
also means joshua will report ExitCode errors on anything that has a SevError
in the log.

As a part of this, we also implicitly downgrade any injected errors to SevWarnAlways.
2017-12-19 17:17:50 -08:00
Stephen Atherton aa5169bd3c Removed unnecessary trace event. 2017-12-19 15:29:22 -08:00
Stephen Atherton e28641886d TraceEvent improvements. Minor bug fix, restore log writing tasks didn't have the log file endVersion but it's only for logging purposes. 2017-12-19 15:27:04 -08:00
Stephen Atherton a276985baf Bug fix, if there are range files in a restore which begin at exactly the restore version they will be repeatedly dispatched forever. 2017-12-18 17:48:18 -08:00
Stephen Atherton 005a4a0706 Restore status bug fix, during restore the apply lag would appear as a large negative number until the first restore batch is completed. Test improvement, snapshot dispatch now chooses a random number of tasks to dispatch per commit. 2017-12-18 15:56:57 -08:00
Evan Tschannen a5601877b3 fix: valgrind issue with destruction ordering 2017-12-18 15:31:59 -08:00
Stephen Atherton 937fa75bec Bug fix, if target snapshot end version is at or before the begin version then no progress would be made. 2017-12-18 00:13:25 -08:00
Stephen Atherton d32a770648 Bug fix, backup never went to differential mode once it was restorable which caused waitBackup to only return once the backup was discontinued. 2017-12-17 23:22:18 -08:00
Stephen Atherton 2b92815e8c Bug fix. The snapshot dispatch add task retry loop was incorrectly deciding that the second and further transaction of an execution was already committed and therefore skipping it, resulting in missing ranges in the snapshot. 2017-12-17 21:01:31 -08:00
Stephen Atherton afd2603576 Refactored backup task flow and config to support ongoing snapshots and allow stopping the backup cleanly between snapshots. The previously separate tasks for initial and differential mode log dispatching have been merged into BackupLogsDispatchTask. 2017-12-17 14:29:57 -08:00
Evan Tschannen 1dc9eceb6d optimize GetKeyLocationRequests on the proxy so they only require a single map lookup, instead of doing 3 + (3* [number of ranges]) lookups 2017-12-15 20:13:44 -08:00
Alec Grieser f2221cd16e updated documentation to reflect startNetwork starting a thread 2017-12-15 15:59:51 -08:00
A.J. Beamon 11dba3e8ef Update a bunch of tests and some documentation to use dispose. 2017-12-15 15:19:23 -08:00
Alec Grieser 34bdd8de28 Merge branch 'master' of github.com:apple/foundationdb 2017-12-15 12:23:08 -08:00
Alec Grieser 916105cd35 java now names the network thread "fdb-network-thread" 2017-12-15 12:23:01 -08:00
Stephen Atherton a3820b8b33 Merge pull request #196 from cie/updated-fdbmonitor-logging
Updates to fdbmonitor logging for better Splunk support
2017-12-15 12:22:30 -08:00
A.J. Beamon 83b21cc57b Set the thread name for threads created by our default executor in the Java bindings. 2017-12-15 11:00:29 -08:00
A.J. Beamon 33558e2757 Fix links to general FDB documentation. De-double-pluralize Transaction. 2017-12-15 09:19:01 -08:00
A.J. Beamon 8b84d5e7d9 Testing the removal of some pre/post build events in our fdb_java vcxproj file. These built the Java bindings jar, but only for the old bindings that no longer exist. 2017-12-15 08:28:29 -08:00