<spanid="backups"></span><h1>Backup, Restore, and Replication for Disaster Recovery</h1>
<p>This document covers backup and restoration of a FoundationDB database. While FoundationDB itself is fault tolerant, the backup tool provides an additional level of protection by supporting recovery from disasters or unintentional modification of the database.</p>
<p>FoundationDB’s backup tool makes a consistent, point-in-time backup of a FoundationDB database without downtime. Like FoundationDB itself, the backup/restore software is distributed, with multiple backup agents cooperating to perform a backup or restore faster than a single machine can send or receive data and to continue the backup process seamlessly even when some backup agents fail.</p>
<p>The FoundationDB database usually cannot maintain a consistent snapshot long enough to read the entire database, so a full backup consists of an <em>inconsistent</em> copy of the data with a log of database changes that took place during the creation of that inconsistent copy. During a restore, the inconsistent copy and the log of changes are combined to reconstruct a consistent, point-in-time snapshot of the original database.</p>
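<p>The following toy model is only an illustration, not FoundationDB's implementation. It sketches why an inconsistent copy plus a log of changes is enough to rebuild a consistent state. All names (state_at, inconsistent_copy, restore) are hypothetical, keys are read at hand-picked versions, and only set and clear mutations are modeled.</p>

```python
from typing import Dict, List, Optional, Tuple

# (commit version, key, value); a value of None models a clear.
Mutation = Tuple[int, str, Optional[str]]


def apply_mutation(state: Dict[str, str], mutation: Mutation) -> None:
    """Apply one set or clear to an in-memory key-value state."""
    _, key, value = mutation
    if value is None:
        state.pop(key, None)
    else:
        state[key] = value


def state_at(history: List[Mutation], version: int) -> Dict[str, str]:
    """Return the consistent database state as of `version`."""
    state: Dict[str, str] = {}
    for mutation in sorted(history, key=lambda m: m[0]):
        if mutation[0] <= version:
            apply_mutation(state, mutation)
    return state


def inconsistent_copy(history: List[Mutation],
                      read_versions: Dict[str, int]) -> Dict[str, str]:
    """Read each key at its own version, as a long-running copy would."""
    copy: Dict[str, str] = {}
    for key, version in read_versions.items():
        snapshot = state_at(history, version)
        if key in snapshot:
            copy[key] = snapshot[key]
    return copy


def restore(copy: Dict[str, str], log: List[Mutation],
            target_version: int) -> Dict[str, str]:
    """Replay the logged changes, in version order, over the copy."""
    state = dict(copy)
    for mutation in sorted(log, key=lambda m: m[0]):
        if mutation[0] <= target_version:
            apply_mutation(state, mutation)
    return state


history = [(1, "a", "1"), (1, "b", "1"), (5, "a", "2"), (7, "b", "2"), (9, "a", "3")]
# The copy starts at version 4 and finishes at version 8.
copy = inconsistent_copy(history, {"a": 4, "b": 8})
log = [m for m in history if 4 < m[0] <= 8]
restored = restore(copy, log, target_version=8)
```

<p>In this sketch the copy pairs a value of "a" from version 4 with a value of "b" from version 8, a combination the database never held at any single version. Replaying the log over it yields the state as of version 8.</p>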
<p>A FoundationDB Backup job can run continuously, pushing multiple inconsistent snapshots and logs of changes over time to maintain the backup’s restorable point-in-time very close to now.</p>
</div>
<divclass="section"id="backup-vs-dr">
<h2>Backup vs DR</h2>
<p>FoundationDB can back up a database to local disks, a blob store (such as Amazon S3), or to another FoundationDB database.</p>
<p>Backing up one database to another is a special form of backup called DR backup, or just DR for short. DR stands for Disaster Recovery, as it can be used to keep two geographically separated databases in close synchronization to recover from a catastrophic disaster. Once a DR operation has reached ‘differential’ mode, the secondary database (the destination of the DR job) will always contain a <em>consistent</em> copy of the primary database (the source of the DR job), but it will be from some past point in time. If the primary database is lost and applications continue using the secondary database, the “ACI” in ACID is preserved but D (Durability) is lost for some amount of the most recent changes. When DR is operating normally, the secondary database will lag behind the primary database by as little as a few seconds’ worth of database commits.</p>
<p>While a cluster is being used as the destination for a DR operation it will be locked to prevent accidental use or modification.</p>
</div>
<divclass="section"id="limitations">
<h2>Limitations</h2>
<p>Backup data is not encrypted on disk, in a blob store account, or in transit to a destination blob store account or database.</p>
</div>
<divclass="section"id="tools">
<h2>Tools</h2>
<p>There are 5 command line tools for working with Backup and DR operations:</p>
<dd>This command line tool is used to control (but not execute) backup jobs and manage backup data. It can <codeclass="docutils literal"><spanclass="pre">start</span></code> or <codeclass="docutils literal"><spanclass="pre">abort</span></code> a backup, <codeclass="docutils literal"><spanclass="pre">discontinue</span></code> a continuous backup, get the <codeclass="docutils literal"><spanclass="pre">status</span></code> of an ongoing backup, or <codeclass="docutils literal"><spanclass="pre">wait</span></code> for a backup to complete. It can also <codeclass="docutils literal"><spanclass="pre">describe</span></code>, <codeclass="docutils literal"><spanclass="pre">delete</span></code>, <codeclass="docutils literal"><spanclass="pre">expire</span></code> data in a backup, or <codeclass="docutils literal"><spanclass="pre">list</span></code> the backups at a destination folder URL.</dd>
<dd>This command line tool is used to control (but not execute) restore jobs. It can <codeclass="docutils literal"><spanclass="pre">start</span></code> or <codeclass="docutils literal"><spanclass="pre">abort</span></code> a restore, get the <codeclass="docutils literal"><spanclass="pre">status</span></code> of current and recent restore tasks, or <codeclass="docutils literal"><spanclass="pre">wait</span></code> for a restore task to complete while printing ongoing progress details.</dd>
<dd>The backup agent is a daemon that actually executes the work of the backup and restore jobs. Any number of backup agents pointed at the same database will cooperate to perform backups and restores. The Backup URL specified for a backup or restore must be accessible by all <codeclass="docutils literal"><spanclass="pre">backup_agent</span></code> processes.</dd>
<dd>This command line tool is used to control (but not execute) DR jobs - backups from one database to another. It can <codeclass="docutils literal"><spanclass="pre">start</span></code> or <codeclass="docutils literal"><spanclass="pre">abort</span></code> a DR job, or <codeclass="docutils literal"><spanclass="pre">switch</span></code> the DR direction. It can also get the <codeclass="docutils literal"><spanclass="pre">status</span></code> of a running DR job.</dd>
<dd>The database backup agent is a daemon that actually executes the work of the DR jobs, writing snapshot and log data to the destination database. Any number of agents pointed at the same databases will cooperate to perform the backup.</dd>
</dl>
<p>By default, the FoundationDB packages are configured to start a single <codeclass="docutils literal"><spanclass="pre">backup_agent</span></code> process on each FoundationDB server. If you want to perform a backup to a network drive or blob store instance that is accessible to every server, you can immediately use the <codeclass="docutils literal"><spanclass="pre">fdbbackup</span><spanclass="pre">start</span></code> command from any machine with access to your cluster to start the backup.</p>
<p>If instead you want to perform a backup to the local disk of a particular machine or machines which are not network accessible to the FoundationDB servers, then you should disable the backup agents on the FoundationDB servers. This is accomplished by commenting out all of the <codeclass="docutils literal"><spanclass="pre">[backup_agent.<ID>]</span></code> sections in <aclass="reference internal"href="configuration.html#foundationdb-conf"><spanclass="std std-ref">foundationdb.conf</span></a>. Do not comment out the global <codeclass="docutils literal"><spanclass="pre">[backup_agent]</span></code> section. Next, start backup agents on the destination machine or machines. Now, when you start a backup, you can specify the destination directory (as a Backup URL) using a local path on the destination machines. The backup agents will fetch data from the database and store it locally on the destination machines.</p>
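<p>As a rough sketch of the edit described above, the following helper comments out every per-agent section while leaving the global section alone. The function name and the sample configuration text are hypothetical, and the sketch assumes that a leading # marks a comment line in foundationdb.conf; check both against your installation before relying on it.</p>

```python
import re

# Matches a section header such as "[backup_agent.1]" on a line of its own.
_SECTION = re.compile(r"^\s*\[(?P<name>[^\]]+)\]\s*$")


def disable_backup_agents(conf_text: str) -> str:
    """Comment out every [backup_agent.<ID>] section, keeping [backup_agent]."""
    out = []
    in_agent_section = False
    for line in conf_text.splitlines():
        match = _SECTION.match(line)
        if match:
            # Only per-agent sections carry the "backup_agent." prefix.
            in_agent_section = match.group("name").startswith("backup_agent.")
        if in_agent_section and line.strip() and not line.lstrip().startswith("#"):
            line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical configuration excerpt used only for illustration.
SAMPLE = """\
[backup_agent]
command = /usr/lib/foundationdb/backup_agent/backup_agent

[backup_agent.1]

[fdbserver.4500]
"""

print(disable_backup_agents(SAMPLE))
```

<p>Running the helper twice gives the same result as running it once, since an already commented header no longer matches the section pattern.</p>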
</div>
<divclass="section"id="backup-urls">
<h2>Backup URLs</h2>
<p>Backup and Restore locations are specified by Backup URLs. Currently there are two valid Backup URL formats.</p>
<p>Note that items in angle brackets (< and >) are just placeholders and must be replaced (including the brackets) with meaningful values. Items within square brackets ([ and ]) are optional.</p>
<p>For local directories, the Backup URL format is</p>
<p>An example would be <codeclass="docutils literal"><spanclass="pre">file:///home/backups</span></code> which would refer to the directory <codeclass="docutils literal"><spanclass="pre">/home/backups</span></code>.
Note that since paths must be absolute this will result in three slashes (/) in a row in the URL.</p>
<p>Note that for local directory URLs the actual backup files will not be written to <base_dir> directly but rather to a uniquely timestamped subdirectory. When starting a restore the path to the timestamped subdirectory must be specified.</p>
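<p>A minimal sketch of building a local directory Backup URL, assuming POSIX-style paths. The helper name local_backup_url is hypothetical; it simply prefixes an absolute path with the file:// scheme, which is where the three consecutive slashes come from.</p>

```python
import posixpath


def local_backup_url(base_dir: str) -> str:
    """Build a file:// Backup URL from an absolute directory path."""
    if not posixpath.isabs(base_dir):
        raise ValueError("Backup directories must be absolute paths")
    return "file://" + base_dir


print(local_backup_url("/home/backups"))
```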
<p>For blob store backup locations, the Backup URL format is</p>
<api_key> - API key to use for authentication
<secret> - API key's secret. Optional.
<hostname> - Remote hostname or IP address to connect to
<port> - Remote port to connect to. Optional. Default is 80.
<name> - Name of backup. It can contain '/' characters, to place backups into a folder-like structure.
<param>=<value> - Optional URL parameters. See below for details.
</pre></div>
</div>
<p>If <secret> is not specified, it will be looked up in <aclass="reference internal"href="#blob-credential-files"><spanclass="std std-ref">blob credential sources</span></a>.</p>
<p>An example blob store Backup URL would be <codeclass="docutils literal"><spanclass="pre">blobstore://myKey:mySecret@something.domain.com:80/dec_1_2017_0400</span></code>.</p>
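<p>The pieces of a blob store Backup URL can be assembled as in the following sketch. The helper name blobstore_backup_url is hypothetical, optional URL parameters are left out, and no escaping of special characters is attempted, so treat it as an illustration of the layout rather than a complete builder.</p>

```python
from typing import Optional


def blobstore_backup_url(api_key: str, hostname: str, name: str,
                         secret: Optional[str] = None,
                         port: Optional[int] = None) -> str:
    """Assemble a blobstore:// Backup URL; secret and port are optional."""
    credentials = api_key if secret is None else f"{api_key}:{secret}"
    host = hostname if port is None else f"{hostname}:{port}"
    return f"blobstore://{credentials}@{host}/{name}"


print(blobstore_backup_url("myKey", "something.domain.com", "dec_1_2017_0400",
                           secret="mySecret", port=80))
```

<p>Leaving out the secret produces a URL whose secret has to be supplied by a blob credential source instead.</p>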
<p>Blob store Backup URLs can have optional parameters at the end which set various limits on interactions with the blob store. All values must be positive decimal integers. The default values are not very restrictive. The most likely parameter a user would want to change is <codeclass="docutils literal"><spanclass="pre">max_send_bytes_per_second</span></code> (or <codeclass="docutils literal"><spanclass="pre">sbps</span></code> for short) which determines the upload speed to the blob service.</p>
<p>Here is a complete list of valid parameters:</p>
<blockquote>
<div><p><em>connect_tries</em> (or <em>ct</em>) - Number of times to try to connect for each request.</p>
<p><em>request_tries</em> (or <em>rt</em>) - Number of times to try each request until a parseable HTTP response other than 429 is received.</p>
<p><em>requests_per_second</em> (or <em>rps</em>) - Max number of requests to start per second.</p>
<p><em>concurrent_requests</em> (or <em>cr</em>) - Max number of requests in progress at once.</p>
<p><em>multipart_max_part_size</em> (or <em>maxps</em>) - Max part size for multipart uploads.</p>
<p><em>multipart_min_part_size</em> (or <em>minps</em>) - Min part size for multipart uploads.</p>
<p><em>concurrent_uploads</em> (or <em>cu</em>) - Max concurrent uploads (part or whole) that can be in progress at once.</p>
<p><em>concurrent_reads_per_file</em> (or <em>crps</em>) - Max concurrent reads in progress for any one file.</p>
<p><em>read_block_size</em> (or <em>rbs</em>) - Block size in bytes to be used for reads.</p>
<p><em>read_ahead_blocks</em> (or <em>rab</em>) - Number of blocks to read ahead of requested offset.</p>
<p><em>read_cache_blocks_per_file</em> (or <em>rcb</em>) - Size of the read cache for a file in blocks.</p>
<p><em>max_send_bytes_per_second</em> (or <em>sbps</em>) - Max send bytes per second for all requests combined.</p>
<p><em>max_recv_bytes_per_second</em> (or <em>rbps</em>) - Max receive bytes per second for all requests combined.</p>
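<p>The parameter names, their abbreviations, and the rule that every value must be a positive decimal integer can be checked before a URL is used. The sketch below is one way to do that; the function name normalize_blob_params is hypothetical and the check is done client-side for illustration only.</p>

```python
from typing import Dict

# Short parameter names mapped to their long forms, as listed above.
BLOB_PARAM_ALIASES = {
    "ct": "connect_tries",
    "rt": "request_tries",
    "rps": "requests_per_second",
    "cr": "concurrent_requests",
    "maxps": "multipart_max_part_size",
    "minps": "multipart_min_part_size",
    "cu": "concurrent_uploads",
    "crps": "concurrent_reads_per_file",
    "rbs": "read_block_size",
    "rab": "read_ahead_blocks",
    "rcb": "read_cache_blocks_per_file",
    "sbps": "max_send_bytes_per_second",
    "rbps": "max_recv_bytes_per_second",
}


def normalize_blob_params(params: Dict[str, str]) -> Dict[str, int]:
    """Expand short names and require positive decimal integer values."""
    valid = set(BLOB_PARAM_ALIASES.values())
    result: Dict[str, int] = {}
    for name, raw in params.items():
        full_name = BLOB_PARAM_ALIASES.get(name, name)
        if full_name not in valid:
            raise ValueError(f"unknown blob store parameter: {name}")
        if not (raw.isascii() and raw.isdigit()) or int(raw) <= 0:
            raise ValueError(f"{name} must be a positive decimal integer")
        result[full_name] = int(raw)
    return result


print(normalize_blob_params({"sbps": "1000000", "concurrent_requests": "4"}))
```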
<p>In order to help safeguard blob store credentials, the <SECRET> can optionally be omitted from blobstore:// URLs on the command line. Omitted secrets will be resolved at connect time using one or more Blob Credential files.</p>
<p>Blob Credential files can be specified on the command line (via --blob_credentials <FILE>) or via the environment variable FDB_BLOB_CREDENTIALS, which can be set to a colon-separated list of files. The command line takes priority over the environment variable; however, all files from both sources will be used.</p>
<p>At connect time, the specified files are read in order and the first matching account specification (user@host) will be used to obtain the secret key.</p>
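<p>The lookup order can be sketched as follows. Both function names are hypothetical, and the sketch assumes that each credential file is a JSON object with an "accounts" map keyed by user@host whose entries carry a "secret" field; verify that layout against the schema for your version. Files are passed in already parsed so that the ordering logic stays visible.</p>

```python
from typing import Iterable, List, Optional


def credential_file_order(cli_files: List[str],
                          env_value: Optional[str]) -> List[str]:
    """Command-line files first, then the colon-separated environment list."""
    env_files = [path for path in (env_value or "").split(":") if path]
    return list(cli_files) + env_files


def resolve_secret(account: str, parsed_files: Iterable[dict]) -> Optional[str]:
    """Return the secret from the first file that lists `account`.

    ASSUMPTION: each parsed file looks like
    {"accounts": {"user@host": {"secret": "..."}}}.
    """
    for document in parsed_files:
        entry = document.get("accounts", {}).get(account)
        if entry is not None and "secret" in entry:
            return entry["secret"]
    return None


first = {"accounts": {"myKey@something.domain.com": {"secret": "fromFirst"}}}
second = {"accounts": {"myKey@something.domain.com": {"secret": "fromSecond"}}}
print(resolve_secret("myKey@something.domain.com", [first, second]))
```

<p>In practice the environment value would come from os.environ.get("FDB_BLOB_CREDENTIALS") and each file would be loaded with json.load before being passed to resolve_secret.</p>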
<p>The Blob Credential File format is JSON with the following schema:</p>
<h2><codeclass="docutils literal"><spanclass="pre">fdbbackup</span></code> command line tool</h2>
<p>The <codeclass="docutils literal"><spanclass="pre">fdbbackup</span></code> command line tool is used to control backup jobs or to manage backup data.</p>
<dd>Path to the cluster file that should be used to connect to the FoundationDB cluster you want to use. If not specified, a <aclass="reference internal"href="administration.html#default-cluster-file"><spanclass="std std-ref">default cluster file</span></a> will be used.</dd>
<dd>The Backup URL which the subcommand should read, write, or modify. For <codeclass="docutils literal"><spanclass="pre">start</span></code> operations, the Backup URL must be accessible by the <codeclass="docutils literal"><spanclass="pre">backup_agent</span></code> processes.</dd>
<dd>A “tag” is a named slot in which a backup task executes. Backups on different named tags make progress and are controlled independently, though their executions are handled by the same set of backup agent processes. Any number of unique backup tags can be active at once. If the tag is not specified, the default tag name “default” is used.</dd>
<dd>Use FILE as a <aclass="reference internal"href="#blob-credential-files"><spanclass="std std-ref">Blob Credential File</span></a>. Can be used multiple times.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">start</span></code> subcommand is used to start a backup. If there is already a backup in progress, the command will fail and the current backup will be unaffected. Otherwise, a backup is started. If the wait option is used, the command will wait for the backup to complete; otherwise, it returns immediately.</p>
<dd>Perform the backup continuously rather than terminating once a restorable backup is achieved. Database mutations within the backup’s target key ranges will be continuously written to the backup as well as repeated inconsistent snapshots at the configured snapshot rate.</dd>
<dt><codeclass="docutils literal"><spanclass="pre">-s</span><spanclass="pre"><DURATION></span></code> or <codeclass="docutils literal"><spanclass="pre">--snapshot_interval</span><spanclass="pre"><DURATION></span></code></dt>
<dd>Specifies the duration, in seconds, of the inconsistent snapshots written to the backup in continuous mode. The default is 864000 which is 10 days.</dd>
<dd>Wait for the backup to complete with behavior identical to that of the <aclass="reference internal"href="#backup-wait"><spanclass="std std-ref">wait command</span></a>.</dd>
<dd><pclass="first">Specify a key range to be included in the backup. Can be used multiple times to specify multiple key ranges. The argument should be a single string containing either a BEGIN alone or both a BEGIN and END separated by a space. If only the BEGIN is specified, the END is assumed to be BEGIN + ‘\xff’. If no key ranges are specified, the default is all user keys (‘’ to ‘\xff’).</p>
<p>Each key range should be quoted in a manner appropriate for your command line environment. Here are some examples for Bash:</p>
<p>The <codeclass="docutils literal"><spanclass="pre">abort</span></code> subcommand is used to abort a backup that is currently in progress. If there is no backup in progress, the command will return an error. The destination backup is NOT deleted automatically, and it may or may not be restorable depending on when the abort is done.</p>
<p>The <codeclass="docutils literal"><spanclass="pre">discontinue</span></code> subcommand is only available for backups that were started with the continuous (<codeclass="docutils literal"><spanclass="pre">-z</span></code>) option. Its effect is to discontinue the continuous backup. Note that the subcommand does <em>not</em> abort the backup; it simply allows the backup to complete as a noncontinuous backup would.</p>
<dd>Wait for the backup to complete with behavior identical to that of the <aclass="reference internal"href="#backup-wait"><spanclass="std std-ref">wait command</span></a>.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">wait</span></code> subcommand is used to wait for a backup to complete, which is useful for scripting purposes. If there is a backup in progress, it waits for it to complete or be aborted and returns a status based on the result of the backup. If there is no backup in progress, it returns immediately based on the result of the previous backup. The exit code is zero (success) if the backup was completed successfully and nonzero if it was aborted.</p>
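<p>For scripting, the exit code can be turned into a boolean as in the sketch below. The wrapper name wait_for_backup is hypothetical, it assumes that fdbbackup is on the PATH, and it assumes that the wait subcommand accepts -t <TAG> in the same way as the other subcommands. The run argument exists only so that the command runner can be replaced in tests.</p>

```python
import subprocess
from typing import Callable, List, Optional


def _run(argv: List[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(argv).returncode


def wait_for_backup(tag: Optional[str] = None,
                    run: Callable[[List[str]], int] = _run) -> bool:
    """Block until the backup finishes; True means it completed successfully."""
    argv = ["fdbbackup", "wait"]
    if tag is not None:
        # ASSUMPTION: -t selects the tag, as it does for other subcommands.
        argv += ["-t", tag]
    # Exit code zero means success; nonzero means the backup was aborted.
    return run(argv) == 0
```

<p>A caller could use the result to decide whether to proceed to a follow-up step, such as describing or expiring the backup.</p>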
<p>The <codeclass="docutils literal"><spanclass="pre">status</span></code> subcommand is used to get information on the current status of a backup. It will show several backup metrics as well as recent errors, which are organized by whether or not they appear to be preventing backup progress.</p>
<divclass="highlight-default"><divclass="highlight"><pre><span></span>user@host$ fdbbackup status [-t <TAG>]
<pclass="last">If you cancel a delete operation while it is in progress, the specified backup is left in an unknown state and is likely no longer usable. Repeat the delete command to finish deleting the backup.</p>
<p>The <codeclass="docutils literal"><spanclass="pre">expire</span></code> subcommand will remove data from a backup prior to some point in time referred to as the ‘cutoff’.</p>
<dd>Sets the expiration cutoff to DATETIME. Requires a cluster file and will use version/timestamp metadata in the database to convert DATETIME to a database commit version. DATETIME must be in the form “YYYY-MM-DD.HH:MI:SS” in UTC.</dd>
<dd>Specifies that the backup must be restorable to DATETIME and later. Requires a cluster file and will use version/timestamp metadata in the database to convert DATETIME to a database commit version. DATETIME must be in the form “YYYY-MM-DD.HH:MI:SS” in UTC.</dd>
<dd>Specifies that the backup must be restorable as of VERSION and later.</dd>
</dl>
</div></blockquote>
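<p>The DATETIME arguments of the expire subcommand use the form YYYY-MM-DD.HH:MI:SS in UTC. The sketch below produces that form from a timezone-aware Python datetime. The helper name fdb_datetime is hypothetical, and MI is read here as minutes.</p>

```python
from datetime import datetime, timedelta, timezone


def fdb_datetime(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DD.HH:MI:SS in UTC."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d.%H:%M:%S")


# 06:00 at UTC+2 is 04:00 in UTC.
local = datetime(2017, 12, 1, 6, 0, 0, tzinfo=timezone(timedelta(hours=2)))
print(fdb_datetime(local))
```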
<dlclass="docutils">
<dt><codeclass="docutils literal"><spanclass="pre">-f</span></code> or <codeclass="docutils literal"><spanclass="pre">--force</span></code></dt>
<dd>If the designated cutoff would remove data such that the backup becomes either unrestorable or less restorable than the optional restorability requirement, then the --force option must be given; otherwise the result will be an error and no action will be taken.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">describe</span></code> subcommand will analyze the given backup and print a summary of the snapshot and mutation data versions it contains as well as the version range of restorability the backup can currently provide.</p>
<dd>If the originating cluster is still available and is passed on the command line, this option can be specified in order for all versions in the output to also be converted to timestamps for better human readability.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">list</span></code> subcommand will list the backups at a given ‘base’ or shortened Backup URL.</p>
<divclass="highlight-default"><divclass="highlight"><pre><span></span>user@host$ fdbbackup list -b <BASE_URL>
</pre></div>
</div>
<dlclass="docutils">
<dt><codeclass="docutils literal"><spanclass="pre">-b</span><spanclass="pre"><BASE_URL></span></code> or <codeclass="docutils literal"><spanclass="pre">--base_url</span><spanclass="pre"><BASE_URL></span></code></dt>
<dd>This is a shortened Backup URL which looks just like a Backup URL but without the backup name, so that the list command will discover and list all of the backups under that base URL.</dd>
<h2><codeclass="docutils literal"><spanclass="pre">fdbrestore</span></code> command line tool</h2>
<p>The <codeclass="docutils literal"><spanclass="pre">fdbrestore</span></code> command line tool is used to control restore tasks. Note that a restore operation will not clear the target key ranges, for safety reasons, so you must manually clear the ranges to be restored prior to starting the restore.</p>
<divclass="admonition warning">
<pclass="first admonition-title">Warning</p>
<pclass="last">It is your responsibility to ensure that no clients are accessing the database while it is being restored. During the restore process the database is in an inconsistent state, and writes that happen during the restore process might be partially or completely overwritten by restored data.</p>
<dd><pclass="first">Specify the tag for the restore task. Multiple restore tasks can be in progress at once so long as each task uses a different tag. The default tag is “default”.</p>
<divclass="last admonition warning">
<pclass="first admonition-title">Warning</p>
<pclass="last">If multiple restore tasks are in progress they should be restoring to different prefixes or the result is undefined.</p>
<dd>Path to the cluster file that should be used to connect to the FoundationDB cluster you want to use. If not specified, a <aclass="reference internal"href="administration.html#default-cluster-file"><spanclass="std std-ref">default cluster file</span></a> will be used.</dd>
<dd>Use FILE as a <aclass="reference internal"href="#blob-credential-files"><spanclass="std std-ref">Blob Credential File</span></a>. Can be used multiple times.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">start</span></code> command will start a new restore on the specified (or default) tag. The command will fail if a tag is already in use by an active restore.</p>
<dd>Required. Specifies the Backup URL for the source backup data to restore to the database. The source data must be accessible by the <codeclass="docutils literal"><spanclass="pre">backup_agent</span></code> processes for the cluster.</dd>
<dd>Wait for the restore to reach a final state (such as complete) before exiting. Prints a progress update every few seconds. Behavior is identical to that of the wait command.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">abort</span></code> command will stop an active restore on the specified (or default) tag. It will display the final state of the restore tag.</p>
<p>The <codeclass="docutils literal"><spanclass="pre">wait</span></code> command will wait for the restore on the specified (or default) tag to reach a final state (such as complete or aborted) and then exit. While waiting it will print a progress update every few seconds.</p>
<p>The <codeclass="docutils literal"><spanclass="pre">status</span></code> command will print a detailed status report on restore job progress. If a tag is specified, it will only show status for that specific tag, otherwise status for all tags will be shown.</p>
<h2><codeclass="docutils literal"><spanclass="pre">backup_agent</span></code> command line tool</h2>
<p><codeclass="docutils literal"><spanclass="pre">backup_agent</span></code> is started automatically on each server in the default configuration of FoundationDB, so you will not normally need to invoke it at the command line. One case in which you would need to do so would be to perform a backup to a destination which is not accessible via a shared filesystem.</p>
<dd><pclass="first">Specify the path to the <codeclass="docutils literal"><spanclass="pre">fdb.cluster</span></code> file that should be used to connect to the FoundationDB cluster you want to back up.</p>
<pclass="last">If not specified, a <aclass="reference internal"href="administration.html#default-cluster-file"><spanclass="std std-ref">default cluster file</span></a> will be used.</p>
<dd>Use FILE as a <aclass="reference internal"href="#blob-credential-files"><spanclass="std std-ref">Blob Credential File</span></a>. Can be used multiple times.</dd>
</dl>
</div>
<divclass="section"id="fdbdr-command-line-tool">
<spanid="fdbdr-intro"></span><h2><codeclass="docutils literal"><spanclass="pre">fdbdr</span></code> command line tool</h2>
<p>The <codeclass="docutils literal"><spanclass="pre">fdbdr</span></code> command line tool is used to manage DR tasks.</p>
<dd>Specify the path to the <codeclass="docutils literal"><spanclass="pre">fdb.cluster</span></code> file for the destination cluster of the DR operation.</dd>
<dd>Specify the path to the <codeclass="docutils literal"><spanclass="pre">fdb.cluster</span></code> file for the source cluster of the DR operation.</dd>
<p>The <codeclass="docutils literal"><spanclass="pre">start</span></code> subcommand is used to start a DR backup. If there is already a DR backup in progress, the command will fail and the current DR backup will be unaffected.</p>
<dd>Specify a key range to be included in the DR. Can be used multiple times to specify multiple key ranges. The argument should be a single string containing either a BEGIN alone or both a BEGIN and END separated by a space. If only the BEGIN is specified, the END is assumed to be BEGIN + ‘\xff’. If no key ranges are specified, the default is all user keys (‘’ to ‘\xff’).</dd>
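<p>The defaulting rules for a key range argument can be expressed as in this sketch. The function name parse_key_range is hypothetical, and the sketch works on raw bytes without interpreting any escape sequences that the command line tools may accept.</p>

```python
from typing import Optional, Tuple


def parse_key_range(arg: Optional[bytes]) -> Tuple[bytes, bytes]:
    """Return (begin, end) for a key range argument.

    No argument selects all user keys; a lone BEGIN gets END = BEGIN + 0xff.
    """
    if arg is None:
        return b"", b"\xff"
    begin, separator, end = arg.partition(b" ")
    if not separator:
        return begin, begin + b"\xff"
    return begin, end


print(parse_key_range(b"apple"))
print(parse_key_range(b"apple banana"))
```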
<p>The <codeclass="docutils literal"><spanclass="pre">switch</span></code> subcommand is used to swap the source and destination database clusters of an active DR in differential mode. This means the destination will be unlocked and start streaming data into the source database, which will subsequently be locked.</p>
<p>This command requires both databases to be available. While the switch command is working, both databases will be locked for a few seconds.</p>
<p>The <codeclass="docutils literal"><spanclass="pre">abort</span></code> subcommand is used to abort a DR that is currently in progress. If there is no DR in progress, the command will return an error. If the DR had already reached differential status, the abort command will leave the destination database at a consistent snapshot of the source database from sometime in the past.</p>
<blockquote>
<div><divclass="admonition warning">
<pclass="first admonition-title">Warning</p>
<pclass="last">The <codeclass="docutils literal"><spanclass="pre">abort</span></code> command will leave the destination database missing some amount of the most recent commits made to the source database.</p>
<p>The <codeclass="docutils literal"><spanclass="pre">status</span></code> subcommand is used to get information on the current status of a DR backup. It tells whether or not there is a DR in progress and whether or not there are active DR agents. It will also report any errors that have been encountered by the DR agents.</p>
<dd>Print the last (up to) <codeclass="docutils literal"><spanclass="pre"><LIMIT></span></code> errors that were logged into the database by backup agents. The default is 10.</dd>
<h2><codeclass="docutils literal"><spanclass="pre">dr_agent</span></code> command line tool</h2>
<p>Unlike <codeclass="docutils literal"><spanclass="pre">backup_agent</span></code>, <codeclass="docutils literal"><spanclass="pre">dr_agent</span></code> is not started automatically in a default FoundationDB installation. A <codeclass="docutils literal"><spanclass="pre">dr_agent</span></code> needs the cluster files for both the source database and the destination database, and can only perform a backup in one direction (from source to destination) at a time.</p>
<dd>Specify the path to the <codeclass="docutils literal"><spanclass="pre">fdb.cluster</span></code> file for the destination cluster of the DR operation.</dd>
<dd>Specify the path to the <codeclass="docutils literal"><spanclass="pre">fdb.cluster</span></code> file for the source cluster of the DR operation.</dd>