225 Commits

Author SHA1 Message Date
Stephen Atherton bb2de3479b Updated unit test to new backup container behavior. 2019-01-08 16:29:00 -08:00
Stephen Atherton 2951369006 When starting a restore, if range files are missing their names will be logged before an error is thrown. The error thrown is now restore_missing_data instead of restore_corrupted_data. 2019-01-08 16:28:40 -08:00
Stephen Atherton 928876be81 Backup dumpFileList() can now take a version range. 2018-12-21 22:42:29 -08:00
Stephen Atherton f64d5321e9 Backup describe, expire, and delete will now clearly indicate when there is no backup at the given URL. 2018-12-20 18:05:23 -08:00
Stephen Atherton 354abebf64 Added progress reporting to backup expiration. Simplified backup delete progress. 2018-12-20 00:23:26 -08:00
Stephen Atherton fcb34a8768 In backup describe, when not using the original cluster to resolve versions to date/time strings the versions will be converted to approximate day deltas from maxLogEndVersion (if available) using core_versionspersecond. 2018-12-19 16:53:39 -08:00
Stephen Atherton 00568f4011 Error code was misleading, added comment. 2018-12-19 13:14:48 -08:00
Stephen Atherton fd4a62fbfd Backup expire will fail earlier if force option is needed but not specified. 2018-12-19 10:36:25 -08:00
Stephen Atherton 172c3f2021 Bug fix in backup describe, do not update log begin/end metadata in backup if describe was given a start version override as it can result in incorrect metadata. 2018-12-19 10:35:06 -08:00
Stephen Atherton afa243bc97 Added fdbbackup expire options to calculate approximate version boundaries based on a number of days prior to the latest log file found in the backup container. This enables expiration operations based on time (with reasonable precision) without accessing the source cluster. 2018-12-18 18:55:44 -08:00
Stephen Atherton 69d847dbbd Backup describe can now analyze a backup assuming log data starts at a given version. This is used by expire both as an optimization and because doing so enables expire to repair a bad metadata state that can result from a cancelled or failed expire operation from fdbbackup <= 6.0.17. ListLogFiles() and ListRangeFiles() no longer sort results as in most cases the sort is not required and the result set can be very large. Describe will now update minLogBegin metadata to the beginning of the log range known to be present when possible. Several serial log and range list pairings are now done in parallel. 2018-12-18 04:33:37 -08:00
Stephen Atherton 9ef9041fba Bug fix, metadata read futures were moved to a vector but then not waited on. 2018-12-17 13:13:35 -08:00
Stephen Atherton dac1827d23 Backup describe's "deep scan" mode should only ignore log begin/end versions, not expire and unreliable end versions. 2018-12-16 00:41:38 -08:00
Stephen Atherton 5951e9d577 Added backup URL and exceptions to trace events where applicable. 2018-12-16 00:33:30 -08:00
Stephen Atherton 223b19f5ba Rewrote backup metadata scheme to fix several design flaws involving cancelled or concurrent operations. Most importantly, before an expire deletes any data it marks the end of the range being deleted as 'unreliable' so further reads of the backup will treat versions before that point as having data holes. 2018-12-16 00:18:13 -08:00
Evan Tschannen 1f3b6e8bdf Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/BackupContainer.actor.cpp
#	fdbclient/BlobStore.actor.cpp
#	versions.target
2018-11-27 14:41:46 -08:00
A.J. Beamon 975711c389 Merge branch 'release-6.0' of github.com:apple/foundationdb
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
2018-11-27 09:50:39 -08:00
Stephen Atherton 3d68d6b994 Bug fix, clarified a comment. 2018-11-24 18:41:39 -08:00
Stephen Atherton 0610a19e4d Rewrote backup container unit test to use randomness to cover a wider variety of data patterns and edge cases. 2018-11-24 17:24:54 -08:00
Stephen Atherton aa648daabf Compile fix for linux, can't make a move iterator from a const container reference. 2018-11-23 12:49:10 -08:00
Stephen Atherton ec9410492d Changed backup folder scheme to use a much smaller number of unique folders for kv range files, which will speed up list and expire operations. The old scheme is still readable but will no longer be written except in simulation in order to test backward compatibility. 2018-11-23 05:23:56 -08:00
Stephen Atherton 1f2223fcf5 Bug fix in backup expiration. After the range file scan, it was being asserted that the range files found have a version < expiration version but this isn't guaranteed because the expiration version is likely shifted backward a bit after the file scan based on the log files found. 2018-11-15 02:15:25 -08:00
Evan Tschannen e45952bc53 Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/BackupContainer.actor.cpp
#	fdbclient/BlobStore.actor.cpp
#	fdbclient/HTTP.actor.cpp
#	tests/BlobStore.txt
#	versions.target
2018-11-13 16:06:39 -08:00
Stephen Atherton 3125b807b3 Added documentation of the 'bucket' URL parameter for blobstore:// backup URLs. 2018-11-13 06:23:58 -08:00
Stephen Atherton 983bd3346a BackupContainerBlobStore no longer uses a hardcoded bucket name. BlobStoreEndpoint creating from a URL string now supports having additional parameters in the URL which it does not consume but rather returns to the caller, and BackupContainerBlobStore uses this to accept a "bucket" parameter. 2018-11-13 03:00:59 -08:00
Evan Tschannen 4b5d0b4e2c Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/AsyncFileBlobStore.actor.cpp
#	fdbclient/AsyncFileBlobStore.actor.h
#	fdbclient/BlobStore.actor.cpp
#	fdbclient/BlobStore.h
#	fdbclient/HTTP.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbrpc/LoadBalance.actor.h
#	fdbrpc/batcher.actor.h
#	fdbrpc/fdbrpc.vcxproj
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/SimulatedCluster.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
A.J. Beamon 776b289bfe Move AsyncFileBlobStore and related files to fdbclient. 2018-10-26 13:49:42 -07:00
Robert Escriva 268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Stephen Atherton 22f8a4efa9 Normalized all unit test names to begin with "/" if they should be included in random unit testing. 2018-10-05 22:09:58 -07:00
Evan Tschannen 3922e477a5 Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/ManagementAPI.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/LogSystemDiskQueueAdapter.actor.cpp
#	fdbserver/SimulatedCluster.actor.cpp
#	fdbserver/TLogServer.actor.cpp
2018-10-03 16:57:18 -07:00
Evan Tschannen 3f86905ea7 fix: restore did not take into account that the end version of a log file does not exist in that file. This resulted in restores done at the same version a snapshot completes to not apply the mutations at that final version. 2018-09-21 11:48:28 -07:00
Bhaskar Muppana 920fd3fe97 Merge branch 'release-6.0' 2018-09-06 14:24:02 -07:00
Stephen Atherton cfce27b0f4 Timestamp to Version lookup using Timekeeper data was not lock aware. 2018-09-05 16:16:22 -07:00
Alex Miller 535b5701e5 Rewrite all `Void _ = wait(...)` -> `wait(...)`.
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorcompiled source through clang.
2018-08-14 15:50:26 -07:00
Evan Tschannen 1c29275672 call all methods which could disable a trace event before it is initialized. In practice this means calling .error first, then .suppressFor, then all your details. 2018-08-01 14:30:57 -07:00
A.J. Beamon e5488419cc Attempt to normalize trace events:
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.

Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.

This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
Stephen Atherton 9c901983f0 Clarity improvement, resetting backup description variable because it's no longer valid due to some of its contents being std::move'd. 2018-03-09 12:03:10 -08:00
Stephen Atherton 3a7288924a Bug fixes. During expiration, the backup container's log range metadata could be updated incorrectly if force was required and not specified or if a backup had no log begin metadata and an expire was done which covered 1 or more log files. In the latter case a backup could be left in a state where the container metadata suggests the backup has more log coverage than it actually does. 2018-03-09 11:29:23 -08:00
Stephen Atherton cb68885328 If backup expiration determines that force is required but the force parameter is not set, it will no longer throw an error unless the backup contains data from prior to the expire_before_version. 2018-03-08 11:27:15 -08:00
Alec Grieser 0bae9880f1 remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
Stephen Atherton d8879dc3f3 HTTP::doRequest() now reads responses in parallel with sending requests, so if the server responds before receiving all of the request the client can stop sending the remainder of the request. For PUT requests which upload files, this prevents sending potentially several megabytes of unnecessary bytes if the server responds with an error (such as 429) before the request is completely sent. Updated the backup container unit test to use more parallelism in order to test this new behavior. 2018-02-07 10:38:31 -08:00
Stephen Atherton 2f291d8955 Bug fix in blob backup container deletion. The list/delete loop could end before deleting all of the files, but the index entry would still be deleted. Also preemptively made the same code change in listBucket() - Although it is technically correct as written it is a dangerous style because it is not obvious that the addition of a wait() call in the second 'when' block would create a bug. Consolidated deleteContainer() and deleteBucket() as they differ by only 1 line. 2018-01-29 00:32:41 -08:00
Stephen Atherton 83409fb067 Bug fix, versionFolderString() was not reducing the precision of the number in the output string. Not technically a 'bug' as the scheme will still work but produces an overly deep and sparse folder structure. 2018-01-24 10:29:37 -08:00
Stephen Atherton 40d38880fe Changed version-based folder naming scheme to something simpler, a fixed width 0-padded 19 digit number (the longest a Version can be) with /'s inserted to limit the size of each folder level. Comparisons using these folder names ignore the /'s so any future change to the splitting scheme would still be compatible with the current listing/reading logic. 2018-01-23 15:02:15 -08:00
Stephen Atherton 7db7a51440 Changes to backup folder structure in BackupContainerBlobStore at the top level inside the backup bucket. All data files now live under data/<backup_name> and there is an 'index' at backups/<backup_name> which indicates what backups exist. Backup names can now contain '/' characters. 2018-01-23 11:46:16 -08:00
Stephen Atherton 7f0b7311b9 Corrected function name to timeKeeperVersionFromDatetime(). 'Fdbbackup expire' now allows an expire_before version of 0 if explicitly passed by version or by timestamp. 2018-01-23 00:19:51 -08:00
Stephen Atherton 51a1bd9327 Timekeeper lookup improvements, moved both function declarations to BackupContainer.h. VersionFromEpochs() now uses versions/sec to adjust the lookup result to improve accuracy. Conversions in both directions look for the latest record less than the target conversion value, but failing that they will now fall back on any available data point and adjust from there using versions/sec. 2018-01-22 23:57:01 -08:00
Stephen Atherton f086ba9d9d Improved version to timestamp lookup - if there are no older versioned records in the database then the next available record, if any, will be used to calculate a result. 2018-01-22 22:47:57 -08:00
Stephen Atherton b6dd06d945 Bug fix in version to timestamp conversion. 2018-01-18 02:54:12 -08:00
Stephen Atherton 307e04c0ad Updated backup container unit test to match new safer behavior of expireData(). Rewrote BackupContainerLocalDirectory::deleteContainer() to actually delete the whole directory but only if it appears to be a backup with either log or snapshot data. 2018-01-18 00:36:28 -08:00
Stephen Atherton cdd1e784dc Added yields to writing backup snapshot manifests to avoid slow tasks. 2018-01-17 13:28:56 -08:00
Stephen Atherton f6f0816bc1 Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1 2018-01-17 12:12:12 -08:00
Stephen Atherton 8fece71662 Bug fix in backup metadata handling if logEnd becomes less than logBegin, which can happen if an expire is done without logEnd being updated. 2018-01-17 12:12:04 -08:00
Stephen Atherton d7f8fe218a Bug fix, resolving versions to timestamps for use in backup descriptions did not work on a locked database. Added some trace events to AtomicRestore to show progress. 2018-01-17 12:03:19 -08:00
A.J. Beamon 4bfbdbf454 Extract getLocalTime to platform.cpp 2018-01-17 11:35:34 -08:00
Stephen Atherton a6fc30209e Compile fix on windows, localtime_r is not available there. 2018-01-17 09:55:25 -08:00
Stephen Atherton 93b34a945f Major usability and performance improvements to backup management. Backup descriptions now calculate and display timestamps using TimeKeeper data (if given a cluster) and restorability of snapshots. Expire now requires a --force option to leave a backup unrestorable or unrestorable after a given point in time, specified by version or timestamp. BackupContainerFilesystem now maintains metadata on key version boundaries in order to avoid large list operations for describe and expire operations. Blob parallel recursive list operations can now take a path (aka prefix) filter function. New describe and expire options are available in fdbbackup. 2018-01-17 04:09:43 -08:00
Stephen Atherton 96cb06cbc7 Bug fixes. Fdbbackup delete was broken. Blobstore backup container deletion would do too much listing before deletions began due to list operations queueing up ahead of and starving the delete operations. Created new knob and blob endpoint limit for concurrent list operations to fix this. Increased blob request timeout default because some requests were taking longer. Crash fixes in blobstore doRequest() which wasn't checking that response object is valid before using it in error conditions. Filesystem-like backup container class (covering blobstore and local dirs) now ignores unrecognized filenames for describe() and expire() operations. 2018-01-05 23:06:39 -08:00
Stephen Atherton 96c479dc71 Rare bug fix. It turns out that backup log files must be written with unique names, otherwise a re-written >1 block log file overwritten after a restore has begun could read some blocks from before the rewrite and some blocks after, but due to random content ordering this would be incorrect and produce a corrupt restore. This bug is very rare because restore would detect an error unless the rewritten log file has exactly the same size as the original file, but this is unlikely because the random content order affects block padding and therefore usable content bytes per block. 2018-01-02 23:38:01 -08:00
Stephen Atherton 371dee70e6 Improved backup folder structure to be shallower but spread files more uniformly and make each folder's entries lexically sort into version order regardless of numeric length. Improved backup container test to use a random version multiplier on the file versions created in order to test a wide range of versioned folder paths. 2018-01-02 23:22:35 -08:00
Stephen Atherton f2524ffd33 AsyncFileBlobStoreWrite was prohibiting the writing of 0-byte files. Improved HTTP verbose logging to stdout. Added writing a 0-byte file to BackupContainer unit test. Added backup log and snapshot sizes to backup description. 2017-12-21 21:15:26 -08:00
Evan Tschannen 95b502e1d7 fix: we did not restore to the target version in all cases 2017-12-21 14:11:44 -08:00
Stephen Atherton e3aee45a74 Backup tools and agent now accept blob account credentials via files containing JSON which are specified using command line arguments and/or an environment variable. Improved fdbbackup help, clarifying which options are for which operations. Fdbbackup operations which do not need to use a database no longer require a cluster file parameter. Added eat() commands to StringRef for incrementally tokenizing strings using separator strings. 2017-12-21 01:58:15 -08:00
Evan Tschannen c51de3bb88 fixed windows compile issues 2017-12-20 13:48:31 -08:00
Stephen Atherton c1958b335a Compile fix on windows, can't access protected parent class member from static function, apparently. 2017-12-20 12:13:25 -08:00
Stephen Atherton 47a9a7ab0e Finished backup container discovery / listing via base URL. 2017-12-12 17:44:03 -08:00
Stephen Atherton d3b4a81ed0 Blobstore connection details in unit tests now come from environment variables. 2017-12-06 14:38:45 -08:00
Balachandar Namasivayam 1f949240f5 Make fdbbackup s3 compatible.
s3 sends responses in XML, but FDB backup expects JSON responses. Added a new library, xml2json, to convert XML to JSON.
2017-12-05 17:13:15 -08:00
Stephen Atherton 6695c9e6a2 Bug fixes and improvements to error handling and trace events. The most serious bug was that restore would start at the wrong version, possibly skipping early log and range files. 2017-11-25 00:46:16 -08:00
Stephen Atherton 9354a8cbb4 Added new backup container method to list everything in a backup. 2017-11-19 04:28:22 -08:00
Stephen Atherton 07c19098fe Improved backup container unit test, added file reading / verification, more data, and a series of expirations and validating the expected result. Then fixed the bugs that this new testing discovered. 2017-11-16 16:19:56 -08:00
Stephen Atherton f105204aca Shifted version distribution over folders. 2017-11-15 23:13:04 -08:00
Stephen Atherton ab0017f023 TaskBucket’s TaskFunc interface now has an optional handleError() which is called on any task that throws an error from execute() or finish(). Restore and Backup tasks use this to ensure that any errors that occur are placed in the backup or restore config’s lastError property. Bug fixes in log and range file encodings. 2017-11-15 13:33:09 -08:00
Stephen Atherton 3dfaf13b67 IBackupContainer has been rewritten to be a logical interface for storing, reading, deleting, expiring, and querying backup data. The details of how the data is organized or stored are now hidden from users of the interface. Both the local and blobstore containers have been rewritten, the key changes being a multi level directory structure and no more use of temporary files or pseudo-symlinks in the blob store implementation. This refactor has a large impact radius as the previous backup container was just a thin wrapper that presented a single level list of files and offered no methods for managing or interpreting the file structure so all of that logic was spread around other places in the code base. This made moving to the new blob store schema very messy, and without this refactor further changes in the future would only be worse.
Several backup tasks have been cleaned up / simplified because they no longer need to manage the ‘raw’ structure of the backup.  The addition of IBackupFile and its finish() method simplified the log and range writer tasks.  Updated BlobStoreEndpoint to support now-required bucket creation and bucket listing prefix/delimiter options for finding common prefixes.  Added KeyBackedSet<T> type.  Moved JSONDoc to its own header.  Added platform::findFilesRecursively().

Still to do:  update command line tool to use new IBackupContainer interface, fix bugs in Restore startup.
2017-11-14 23:33:17 -08:00
FDB Dev Team a674cb4ef4 Initial repository commit 2017-05-25 13:48:44 -07:00
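
Illustration for commit 40d38880fe (version-based folder naming). The message describes a fixed-width, 0-padded 19 digit number with /'s inserted, compared with the /'s ignored. The following is a minimal Python sketch of that idea only, not the code from the repository. The function names and the digit grouping (4, 4, 4, 7) are hypothetical; the commit message does not say where the /'s are placed.

```python
VERSION_DIGITS = 19  # decimal length of the largest signed 64-bit value


def version_folder_string(version, group_sizes=(4, 4, 4, 7)):
    """Zero-pad `version` to 19 digits and insert '/' between digit groups.

    `group_sizes` is a hypothetical splitting scheme chosen for illustration.
    """
    if version < 0 or version > 2**63 - 1:
        raise ValueError("version must fit in a non-negative signed 64-bit integer")
    if sum(group_sizes) != VERSION_DIGITS:
        raise ValueError("group sizes must add up to 19 digits")
    digits = f"{version:0{VERSION_DIGITS}d}"
    parts = []
    pos = 0
    for size in group_sizes:
        parts.append(digits[pos:pos + size])
        pos += size
    return "/".join(parts)


def folder_sort_key(path):
    """Compare folder names with the '/' separators ignored."""
    return path.replace("/", "")


if __name__ == "__main__":
    for v in (1, 123456789, 2**63 - 1):
        print(version_folder_string(v))
```

Because every name has the same width once the separators are stripped, sorting by `folder_sort_key` gives numeric version order, and two names produced with different groupings still compare equal for the same version. That is the compatibility property the commit message mentions.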