908 Commits

Author SHA1 Message Date
Stephen Atherton 93a7eb7474 Fdbbackup restore will now default to the latest usable version in a backup if a specific restore version was not given. Expire will make sure a cluster is provided if either of its timestamp options are given. 2018-01-19 12:19:32 -08:00
Stephen Atherton 6e96d3c30c Bug fix, backup snapshots could take unexpectedly long if the desired snapshot interval is less than the configured snapshotDispatch interval. 2018-01-19 12:14:04 -08:00
Stephen Atherton da02099c4c Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1 2018-01-19 11:02:48 -08:00
Stephen Atherton fc16bb94ab Discontinuing a backup that is already restorable now stops immediately and aborts (via validation key) any tasks scheduled to run later. 2018-01-19 11:02:43 -08:00
Evan Tschannen 570f72ba40 fix: nextDispatchVersion was being set too large if the snapshot interval was small 2018-01-19 10:53:58 -08:00
A.J. Beamon 35b91bfb55 Add back (in different form) some ratekeeper trace events when a storage server or log doesn't respond. Add actualTPS (named TPSBasis) to RkUpdate. 2018-01-18 14:51:38 -08:00
Evan Tschannen b78e0a362a fix: do not pause when running multiple backup tests simultaneously 2018-01-18 12:24:33 -08:00
Stephen Atherton b6dd06d945 Bug fix in version to timestamp conversion. 2018-01-18 02:54:12 -08:00
Stephen Atherton 307e04c0ad Updated backup container unit test to match new safer behavior of expireData(). Rewrote BackupContainerLocalDirectory::deleteContainer() to actually delete the whole directory but only if it appears to be a backup with either log or snapshot data. 2018-01-18 00:36:28 -08:00
Stephen Atherton cdd1e784dc Added yields to writing backup snapshot manifests to avoid slow tasks. 2018-01-17 13:28:56 -08:00
Stephen Atherton f6f0816bc1 Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1 2018-01-17 12:12:12 -08:00
Stephen Atherton 8fece71662 Bug fix in backup metadata handling if logEnd becomes less than logBegin, which can happen if an expire is done without logEnd being updated. 2018-01-17 12:12:04 -08:00
Stephen Atherton d7f8fe218a Bug fix, resolving versions to timestamps for use in backup descriptions did not work on a locked database. Added some trace events to AtomicRestore to show progress. 2018-01-17 12:03:19 -08:00
A.J. Beamon 4bfbdbf454 Extract getLocalTime to platform.cpp 2018-01-17 11:35:34 -08:00
Stephen Atherton a6fc30209e Compile fix on windows, localtime_r is not available there. 2018-01-17 09:55:25 -08:00
Stephen Atherton 93b34a945f Major usability and performance improvements to backup management. Backup descriptions now calculate and display timestamps using TimeKeeper data (if given a cluster) and restorability of snapshots. Expire now requires a --force option to leave a backup unrestorable or unrestorable after a given point in time, specified by version or timestamp. BackupContainerFilesystem now maintains metadata on key version boundaries in order to avoid large list operations for describe and expire operations. Blob parallel recursive list operations can now take a path (aka prefix) filter function. New describe and expire options are available in fdbbackup. 2018-01-17 04:09:43 -08:00
Stephen Atherton f955547796 Bug fixes. Local var int i being declared over a state variable, and iterators being initialized incorrectly. 2018-01-16 11:41:49 -08:00
Stephen Atherton 897ff6f676 Added new knob for how many tasks to add per transaction in backup dispatch, instead of using the value for restore which has much lower overhead per task. 2018-01-16 10:45:21 -08:00
Stephen Atherton 02d72ca4b8 Added yields to CPU-heavy operations in FileBackup's snapshot range dispatcher. 2018-01-16 10:31:44 -08:00
Alec Grieser 252fb2b152 gotta bump up that version number!
those are rookie version numbers!
2018-01-16 09:39:58 -08:00
Evan Tschannen 645dc5ead6 warmRange needs to get a read version occasionally to prevent it from overwhelming the proxy
quietDatabase waits for all data distribution to be completely finished so that databases are cached in a cleaner state
2018-01-14 12:50:52 -08:00
Evan Tschannen be643d6937 fix: the tlog did not cancel recovery properly when stopped 2018-01-12 17:18:14 -08:00
Alvin Moore 2e6ce03224 Merge pull request #232 from cie/build-dont-compile-hpp
Filter out .hpp files from *_BUILD_SOURCES (like we do with .h files)…
2018-01-12 14:09:25 -08:00
Evan Tschannen 660cee0254 increased the priority of getKeyServersLocations, because once a client gets a read version, answering their reads should be higher priority than starting new transactions 2018-01-12 13:46:20 -08:00
Evan Tschannen 3915d6825c we need to check the server list at a higher priority, because if we do not notice a storage server interface change for a long period of time, we will mark it as failed 2018-01-12 12:51:07 -08:00
Evan Tschannen 721a891d1f fix: never request more than 100 shards from the proxy at a time to avoid large packets 2018-01-12 10:51:53 -08:00
Evan Tschannen de119f192d fixed a priority inversion where the tlog would prefer to copy data from the previous generation rather than make data durable (leading to being ratekeeper controlled) 2018-01-11 16:09:49 -08:00
Evan Tschannen 29ebb19388 Merge branch 'release-5.0' into release-5.1 2018-01-11 15:43:37 -08:00
Evan Tschannen 22e5a0b257 formatting 2018-01-11 14:44:09 -08:00
Evan Tschannen 173a8de3ed DBCoreState supports upgrades from 3.0 versions 2018-01-11 14:39:51 -08:00
Evan Tschannen 02bd83ff76 changed incompatibleDataRead to an asyncTrigger 2018-01-11 13:35:56 -08:00
A.J. Beamon 80b84c23ac Filter out .hpp files from *_BUILD_SOURCES (like we do with .h files). Add xml2json.hpp to our fdbrpc project. 2018-01-10 13:51:57 -08:00
A.J. Beamon ce93d98b50 Temporarily remove xml2json.hpp from fdbrpc vcxproj 2018-01-10 10:18:44 -08:00
A.J. Beamon 2f5073d00f Some visual studio project cleanup. 2018-01-10 10:07:18 -08:00
Evan Tschannen 022df3b91b backup and restore sometimes took too long in simulation 2018-01-09 17:26:42 -08:00
Evan Tschannen 645f68212b make timekeeper priority system immediate 2018-01-08 18:21:00 -08:00
Evan Tschannen 370e8a9903 fix: split metrics could fail an assert in a very rare scenario 2018-01-08 18:20:22 -08:00
Stephen Atherton 0e7d538c94 Bug fix, in recursive blob folder listings the recent removal of common prefixes from the result stream caused the list marker to not be set correctly when a folder level requires multiple requests due to folder size. 2018-01-06 20:58:48 -08:00
Stephen Atherton 96cb06cbc7 Bug fixes. Fdbbackup delete was broken. Blobstore backup container deletion would do too much listing before deletions began due to list operations queueing up ahead of and starving the delete operations. Created new knob and blob endpoint limit for concurrent list operations to fix this. Increased blob request timeout default because some requests were taking longer. Crash fixes in blobstore doRequest() which wasn't checking that response object is valid before using it in error conditions. Filesystem-like backup container class (covering blobstore and local dirs) now ignores unrecognized filenames for describe() and expire() operations. 2018-01-05 23:06:39 -08:00
Stephen Atherton b86f68ceb8 Added new test that combines atomic backup/restore. Added randomization to delays in AtomicRestore workload. 2018-01-05 14:43:21 -08:00
Stephen Atherton 236799c77f Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1 2018-01-05 14:38:06 -08:00
Stephen Atherton cbecc44ee2 Merge pull request #231 from cie/fdbmonitor-deconfigure-on-delete
Fdbmonitor deconfigure on delete
2018-01-05 14:23:34 -08:00
Alex Miller f021934792 Fix yet another VersionStamp DR bug.
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.

To simplify the problem, if we have two concurrent transaction loops:

    retry {
      if (rand() > .5) tr->set('x', rand());
      if (rand() > .5) tr->set('y', rand());
    }

and

    retry {
      x = tr->get('x');
      y = tr->get('y');
      if (x > y) {
        tr->set('y', x);
      }
      tr->commit();
    }

It is not guaranteed that x <= y in the database after the second transaction
commits.  This is because it could read an older snapshot of x and y, in which
x was not greater than y, and thus not invoke set.  This means that `tr` is now
a read-only transaction, which no-ops out of committing as an "optimization".
If we add any write conflict range to `tr`, it will then be conflict checked
and committed, which would guarantee that x <= y when it commits.

Replace the first transaction with dumpData, and the second with the version
upgrade transaction, and you have the bug that we're fixing, why, and how.
2018-01-05 14:23:11 -08:00
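The hazard described in the commit message above can be reproduced with a small toy model. This is only a sketch under stated assumptions: `ToyDB`, `Txn`, `add_write_conflict_key`, and the `fixup_marker` key are hypothetical names invented for illustration and are not FoundationDB's API. The model assumes that a transaction with no writes and no write conflict ranges skips conflict checking at commit, as the message describes. The second loop raises y to x whenever it sees x > y, and a concurrent writer changes x between its read and its commit.

```python
class Conflict(Exception):
    """Raised when a key read by the transaction changed after its snapshot."""


class ToyDB:
    """Hypothetical in-memory store with optimistic concurrency control."""

    def __init__(self):
        self.version = 0
        self.data = {}        # key -> committed value
        self.last_write = {}  # key -> version of the last committed write


class Txn:
    def __init__(self, db):
        self.db = db
        self.read_version = db.version
        self.snapshot = dict(db.data)  # reads see this frozen copy
        self.reads = set()
        self.writes = {}
        self.write_conflict_keys = set()

    def get(self, key):
        self.reads.add(key)
        return self.writes.get(key, self.snapshot.get(key, 0))

    def set(self, key, value):
        self.writes[key] = value
        self.write_conflict_keys.add(key)

    def add_write_conflict_key(self, key):
        # Hypothetical stand-in for adding a write conflict range.
        self.write_conflict_keys.add(key)

    def commit(self):
        # Assumed "optimization": nothing to write, so nothing is checked.
        if not self.writes and not self.write_conflict_keys:
            return False
        for key in self.reads:
            if self.db.last_write.get(key, 0) > self.read_version:
                raise Conflict(key)
        self.db.version += 1
        self.db.data.update(self.writes)
        for key in self.write_conflict_keys:
            self.db.last_write[key] = self.db.version
        return True


def scenario(force_conflict_check):
    db = ToyDB()
    init = Txn(db)
    init.set("x", 1)
    init.set("y", 2)
    init.commit()

    attempts = 0
    while True:  # retry loop for the second transaction
        attempts += 1
        tr = Txn(db)
        x, y = tr.get("x"), tr.get("y")
        if x > y:
            tr.set("y", x)
        if force_conflict_check:
            tr.add_write_conflict_key("fixup_marker")
        if attempts == 1:
            # A concurrent writer commits between the read and the commit.
            writer = Txn(db)
            writer.set("x", 5)
            writer.commit()
        try:
            tr.commit()
            return db.data["x"], db.data["y"], attempts
        except Conflict:
            continue


print(scenario(False))  # read-only commit is never checked
print(scenario(True))   # conflict detected, loop retries and fixes y
```

In this model the unchecked run finishes after one attempt with x = 5 and y = 2, while the run that always adds a write conflict key is rejected once, retries against the newer snapshot, and ends with y raised to 5.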
Stephen Atherton cbeff0f789 Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1 2018-01-05 14:13:27 -08:00
Stephen Atherton 2763713cbc Bug fix, backup snapshot dispatch was calculating that all shards must be done immediately. 2018-01-05 14:12:00 -08:00
Alec Grieser 70cc02d572 Merge pull request #230 from cie/vexillographer-binding-specific-disables
Vexillographer binding specific disables
2018-01-05 13:26:12 -08:00
A.J. Beamon 30067d2f53 Whitespace fixes and removal of change to java's AbstractTester 2018-01-05 13:21:54 -08:00
A.J. Beamon 9f2e6bfbd1 Merge branch 'release-5.1' into vexillographer-binding-specific-disables
# Conflicts:
#	fdbclient/vexillographer/fdb.options
2018-01-05 13:16:41 -08:00
Evan Tschannen d10d42a015 Merge pull request #229 from cie/status-generalize-incorrect-cluster-file
Status generalize incorrect cluster file
2018-01-05 12:42:50 -08:00
A.J. Beamon 5015119115 Generalize the message that gets displayed in status if a cluster file's contents are incorrect. 2018-01-05 10:29:47 -08:00