Meng Xu
5022566b35
Validate if key_not_found error ever happens
2020-06-08 16:59:00 -07:00
tclinken
aca995d2c5
Removed dead backup agent code
2020-05-01 14:46:59 -07:00
Meng Xu
3f510d0653
Merge pull request #3036 from jzhou77/backup-cmd
...
Several bug fixes for new backups
2020-04-29 16:34:59 -07:00
Jingyu Zhou
9da76c35ea
Fix a memory corruption error
...
The backup container URL should be Key instead of KeyRef.
2020-04-29 15:55:34 -07:00
Jingyu Zhou
7d59e53349
Consolidate makePadding()
2020-04-28 15:39:23 -07:00
Meng Xu
f5e8345496
FastRestoreAgent:Use atomicParallelRestore to kick off restore
...
Replace the handcrafted version with atomicParallelRestore actor
which is simulation tested
2020-04-27 22:15:00 -07:00
Jingyu Zhou
9bfc5bbea8
Check RestorableFileSet's key ranges in simulation
...
Ranges written in the manifest file should match the actual file content.
2020-04-21 12:55:40 -07:00
Jingyu Zhou
9fb3fb9d82
Add pause/resume for new backups
...
To pause/resume the backup workers, the fdbbackup command writes to the
backupPausedKey. Backup workers then notice that the value of the key has
changed and stop/resume pulling from the TLog.
2020-04-06 14:29:46 -07:00
Jingyu Zhou
feedab02a0
Merge pull request #2855 from xumengpanda/mengxu/fr-api-atomicrestore-PR
...
Add ApiCorrectnessAtomicRestore workload for the new performant restore
2020-03-25 18:05:26 -07:00
Meng Xu
1ba11dc74b
Apply clang format
2020-03-25 11:20:17 -07:00
Meng Xu
120272f025
Change unlockDB from RestoreMaster to Agent
2020-03-25 11:04:49 -07:00
Meng Xu
ca8966a28b
Move lockDB into submitRestore request from restore worker
...
AtomicRestore needs to lock DB before we start the restore worker.
So we cannot lock DB in restore worker with a different randomUID.
2020-03-24 23:39:35 -07:00
Meng Xu
241c2703c8
Fix atomicParallelRestore interface
2020-03-24 17:00:55 -07:00
Meng Xu
80d62f3cb8
Fix:Add atomicParallelRestore to header
2020-03-24 16:28:08 -07:00
Meng Xu
81f7181c9e
Refactor submitParallelRestore function into FileBackupAgent
2020-03-24 14:44:55 -07:00
Meng Xu
5584884c12
Refactor parallelRestoreFinish function into FileBackupAgent
2020-03-24 14:15:15 -07:00
Jingyu Zhou
44c1996950
Change all worker started to be set after all workers updated a key
...
Previously, all workers started was set once the saved log versions were higher
than the backup's start version. However, relying on the saved versions can be
wrong, as the worker is not guaranteed to write to the right container. For
instance, if the watch is triggered later, then mutation logs are written to
previous containers. So we need to ensure the right container is ready -- all
workers have acknowledged seeing the container.
2020-03-22 16:40:12 -07:00
Jingyu Zhou
38def426f4
Add a flag to submitBackup for partitioned log
...
This is to distinguish it from old workloads so that they can still work in simulation.
2020-03-20 20:15:08 -07:00
Jingyu Zhou
e9287407d6
Backup worker updates latest log versions in BackupConfig
...
If backup workers are enabled, the current epoch's worker for tag (-2,0) is
responsible for monitoring the backup progress of all workers and updating the
BackupConfig with the latest saved log version, which is the minimum version
across all tags.
This change has been incorporated into getLatestRestorableVersion() so that
it is transparent to clients.
2020-03-20 20:15:08 -07:00
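The rule in the commit above (the latest saved log version is the minimum over all tags) can be sketched as follows. This is an illustrative Python sketch, not the FoundationDB implementation; the function name and the per-tag mapping are hypothetical, and returning None for an empty mapping is an assumption.

```python
def latest_saved_log_version(saved_versions_by_tag):
    """Return the highest log version that every tag has already saved.

    saved_versions_by_tag is a hypothetical mapping from a tag identifier
    to the last log version saved by that tag's backup worker.
    """
    if not saved_versions_by_tag:
        # Assumption: with no progress reported, no version is known yet.
        return None
    # The slowest tag bounds the version the backup as a whole has saved.
    return min(saved_versions_by_tag.values())
```

Taking the minimum means a single lagging tag holds back the reported version, which keeps the value safe to treat as fully saved.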
Jingyu Zhou
35aafefb89
Consolidate StringRefReader classes
...
Also fix an unused-variable compiler error.
2020-03-20 20:13:38 -07:00
Jingyu Zhou
5a602f58e8
Start backup with a wait on all backup workers running
...
This wait is to make sure that backup workers are already saving mutations so
that no mutations are missed. The idea is that the CLI sets a "backupStartedKey"
in the database and waits for allWorkerStarted() key of the backup to be set.
Backup workers monitor the changes to the "backupStartedKey" and start logging
mutations. Additionally, the backup worker for Tag(-2,0) monitors whether all
other workers have started (by checking that their saved progress version is
larger than the backup's start version), and then sets the allWorkerStarted()
key for the backup.
2020-01-31 19:29:09 -08:00
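The start condition described in the commit above can be sketched in a few lines. This is a hypothetical Python illustration rather than the actual actor code; the function and parameter names are invented here, and treating an empty worker list as "not started" is an assumption. Commit 44c1996950 in this log later replaces this version-based check with an acknowledgment from every worker.

```python
def all_workers_started(saved_progress_versions, backup_start_version):
    """Return True once every worker has saved progress past the start version.

    saved_progress_versions is a hypothetical list holding one saved
    progress version per backup worker.
    """
    if not saved_progress_versions:
        # Assumption: no known workers means the backup cannot have started.
        return False
    # "Larger than" in the commit message is read as a strict comparison.
    return all(v > backup_start_version for v in saved_progress_versions)
```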
Jingyu Zhou
7f7ec99170
Serialize and deserialize new backup files
...
The BackupWorker writes files that can be read by FileConverter. Move
StringRefReader to the header file for reuse in FileConverter.
2020-01-22 19:38:46 -08:00
Jingyu Zhou
485d3d0feb
Use Version instead of int64_t
2020-01-22 19:38:45 -08:00
Alvin Moore
7628d04fb9
Merge branch 'release-6.2' of github.com:apple/foundationdb into release_6.2_merge
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2020-01-09 07:21:16 -08:00
Steve Atherton
4ff058e86b
Backup and DR layer status document generation now uses snapshot reads for all keys read to avoid unnecessary conflicts when read during a status update or cleanup transaction. Since many of the keys read use wrapper functions, all of the underlying functions in BackupAgentBase and its two implementations also required a snapshot mode argument. All snapshot arguments default to false to match the underlying FDB API get/getrange methods.
2019-12-19 00:29:35 -08:00
Meng Xu
d0147e5e5d
Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3
...
Resolved Conflicts:
documentation/sphinx/source/release-notes.rst
fdbserver/DataDistribution.actor.cpp
versions.target
2019-10-02 13:22:56 -07:00
Evan Tschannen
ef01ad2ed8
optimized log range clearing to clear everything for each possible hash (256 clears) if that would be more efficient than one clear per second that has elapsed
...
aborting a DR without the --cleanup flag will still attempt to clean up for 30 seconds before giving up
added a cleanup command to fdbbackup which can remove mutations from orphaned DRs which were stopped without the --cleanup flag
2019-09-27 18:32:27 -07:00
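The choice between the two clearing strategies in the commit above can be sketched as a simple count comparison. This Python sketch is an assumption about what "more efficient" means (fewer clear operations); the function name and return shape are hypothetical, and only the figure of 256 comes from the commit message.

```python
HASH_VALUES = 256  # from the commit message: one clear per possible hash


def choose_clear_strategy(elapsed_seconds):
    """Pick the strategy that issues fewer clear operations.

    Returns a (strategy_name, number_of_clears) tuple.
    """
    if elapsed_seconds > HASH_VALUES:
        # Clearing everything under each hash takes fewer operations.
        return ("per_hash", HASH_VALUES)
    # Otherwise one clear per elapsed second is no more expensive.
    return ("per_second", elapsed_seconds)
```

For example, an hour of elapsed time would take 3600 per-second clears, so the sketch falls back to the 256 per-hash clears instead.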
Meng Xu
d160810662
FastRestore:Resolve review comments
2019-09-04 16:48:43 -07:00
Meng Xu
3b54363780
FastRestore:Apply Clang-format
2019-08-01 18:09:12 -07:00
Meng Xu
45083edf74
Merge branch 'master' into mengxu/performant-restore-PR
...
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
Jingyu Zhou
4c0e824456
Include unactorcompiler.h at the end of *.actor.h
2019-07-10 14:51:52 -07:00
Meng Xu
477fd152c0
FastRestore:Refactor code
...
1) Use the runRYWTransaction for simple DB access
2) Replace some printf with TraceEvent
3) Remove printf not used in debugging
4) Avoid wait inside the condition in loop-choose-when for
the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
the file only has functionalities related to restore worker.
Passed correctness test
2019-06-04 11:22:47 -07:00
Meng Xu
529ce66b6c
Merge branch 'apple/master' into mengxu/performant-restore-PR
2019-04-18 18:02:45 -07:00
Andrew Noyes
6207d724f8
Fix all -Wunused-variable warnings
2019-04-15 18:13:00 -07:00
Meng Xu
70d7c289f4
Merge branch 'master' into mengxu/restore/parallel-v7
2019-03-30 22:13:10 -07:00
Stephen Atherton
d5e50e6963
Rewrote BackupAgentBase::parseTime() to use std::get_time() so it compiles on all supported platforms. Added unit test for parseTime() and formatTime().
2019-03-20 01:18:37 -07:00
Steve Atherton
8aab719c22
Merge branch 'master' into feature-backup-json
2019-03-12 18:23:16 -07:00
Stephen Atherton
bc0b2aa040
Merge branch 'release-6.0' of https://github.com/apple/foundationdb
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbbackup/backup.actor.cpp
# fdbclient/BlobStore.actor.cpp
2019-03-12 04:49:12 -07:00
Stephen Atherton
f0eae0295f
Merge branch 'master' of https://github.com/apple/foundationdb into feature-backup-json
2019-03-12 03:35:03 -07:00
Stephen Atherton
f2953db7d8
Added updating of the backup's "snapshot shards behind" count in the snapshot dispatcher so status can determine whether a snapshot is lagging the configured speed.
2019-03-11 01:25:51 -07:00
Stephen Atherton
023bbb566f
Renamed backup state enums for clarity, added backup state names. Changed Epochs to EpochSeconds in backup JSON along with some other renaming/moving of fields, and added information about snapshot dispatch. Changed timestamp format for input/output in all backup/restore contexts to be a fully qualified time with timezone offset. Added information about the last snapshot dispatch to backup config and status (not yet populated).
2019-03-10 16:00:01 -07:00
Stephen Atherton
06c11a316d
Normalized timestamp to text format across backup and restore tooling. Added epochs field to JSON objects describing versions and timestamps in backup status and describe output, renamed some fields for clarity.
2019-03-06 22:34:25 -08:00
Stephen Atherton
1399aee532
Added --json option to fdbbackup status.
2019-03-06 21:32:46 -08:00
Balachandar Namasivayam
f3391ea413
Merge pull request #1240 from satherton/feature-restore-by-timestamp
...
Restore by timestamp
2019-03-06 16:21:06 -08:00
Stephen Atherton
7778112f6a
Bug fix: restore was using the destination cluster to look up timestamps when printing the backup description instead of (optionally) the original cluster which generated the backup. Made missing cluster file errors clearer.
2019-03-06 02:45:55 -08:00
Alex Miller
c6a65389ae
Remove noexcept macro and replace with BOOST_NOEXCEPT.
...
BOOST_NOEXCEPT does what the noexcept macro was supposed to do, but in a
way that is correctly maintained over time.
2019-03-05 22:06:12 -08:00
Steve Atherton
21f55e1878
Merge pull request #1190 from bnamasivayam/restore-multiple-ranges
...
Add support for restoring multiple ranges.
2019-03-05 10:15:55 -08:00
Balachandar Namasivayam
a258df32f6
Skip switchover checks for force option.
2019-03-04 15:58:36 -08:00
Balachandar Namasivayam
7eba50b086
Add support for restoring multiple ranges.
2019-02-25 18:00:28 -08:00
mpilman
f79a9594c1
Several bugfixes to make fdb build on non-ide
2019-02-19 15:16:59 -08:00