Commit Graph

69 Commits

Author SHA1 Message Date
FDB Formatster df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Andrew Noyes b0f61fb74f Resolve the simple-looking conflicts 2021-01-15 19:35:10 +00:00
Andrew Noyes ff7d306b09 Merge branch 'release-6.3' into anoyes/merge-6.3-to-master
Include conflict markers for now. Will resolve.
2021-01-15 18:04:09 +00:00
Andrew Noyes 6d6b2843da Merge remote-tracking branch 'upstream/release-6.2' into anoyes/merge-release-6.2
I'm including the conflict markers for now and I'll resolve them in a
subsequent commit
2021-01-14 17:20:11 +00:00
Steve Atherton a860778b51 Add write buffering to BackupContainerLocalDirectory::BackupFile to greatly reduce the number of IAsyncFile::write() calls made when writing backup data. The buffer size is controlled by a knob. 2021-01-09 07:57:48 -08:00
sfc-gh-tclinkenbeard 6a27619544 Remove unnecessary copies in BackupContainer code 2020-10-24 16:48:03 -07:00
sfc-gh-tclinkenbeard 1d28285cc5 Made BackupContainer.h header guard consistent with other header guards 2020-10-24 16:48:03 -07:00
sfc-gh-tclinkenbeard 82b6daa16b First draft of Azure blob storage backup container 2020-10-24 16:47:51 -07:00
Meng Xu bbc7ce581e Resolve conflicts merging from 6.3 to master 2020-09-25 16:08:31 -07:00
Meng Xu 862336de8f Merge branch 'master' into mengxu/merge-to-master-PR 2020-09-24 17:06:00 -07:00
Young Liu fd7198d874 Extend backup container interface to support query restorable files set by key ranges 2020-08-29 19:58:07 -07:00
Jon Fu 00c77ba2b4 Added beginVersion cmd line option and addressed code review comments 2020-08-28 14:29:22 -04:00
Jon Fu 21635f8a28 update backup restore for local testing 2020-08-04 15:48:43 -04:00
Jingyu Zhou 7d59e53349 Consolidate makePadding() 2020-04-28 15:39:23 -07:00
Jingyu Zhou b315163033 Fix a memory corruption error 2020-04-28 15:39:23 -07:00
Jingyu Zhou 9bfc5bbea8 Check RestorableFileSet's key ranges in simulation
Ranges written in the manifest file should match with actual file content.
2020-04-21 12:55:40 -07:00
Jingyu Zhou 0938e45c6a Set continuous version to invalidVersion when snapshot version is the target version 2020-04-20 22:26:42 -07:00
Jingyu Zhou 930b175c4c Add range files' key ranges to RestorableFileSet
Also add continuous logs' begin and end version in RestorableFileSet.
2020-04-20 22:26:42 -07:00
Jingyu Zhou 3063611355 Write range files' begin & end keys to manifest file
This information can be very useful in knowing the content in these files,
especially for restores.
2020-04-20 22:26:42 -07:00
Jingyu Zhou 4d06e837dc Remove getPartitionedRestoreSet() API
Use getRestoreSet() instead for both old and new partitioned logs.
2020-04-08 20:12:09 -07:00
Jingyu Zhou fd9caa88a0 Remove isPartitionedBackup()
This is no longer needed, since describeBackup() figures this out.
2020-04-08 16:09:18 -07:00
Jingyu Zhou 0cf6013357 Refactor to remove describePartitionedBackup()
The backup container can figure out if partitioned logs are used by looking at
mutation logs, thus consolidating the API to a single describeBackup() as
before.
2020-04-08 15:59:37 -07:00
Jingyu Zhou 772ab70aee Add an option for fast restore to restore old backups
If "usePartitionedLogs" is set to false, then the workload uses old backups for
restore.
2020-03-26 13:04:00 -07:00
Jingyu Zhou 4c75c61f39 Fix duplicate file removal for subset version ranges
Partitioned logs can have strict subset version ranges, which was not properly
handled -- we used to assume overlapping only happens for the same begin
version.
2020-03-20 20:15:09 -07:00
Jingyu Zhou 20df67ee6a Filter partitioned logs with subset relationship
If a log file's progress is not saved, a new log file will be generated
with the same begin version. Then we can have a file that contains a subset
of contents in another log file. During restore, we should filter out files
that their contents are subset of other files.
2020-03-20 20:15:08 -07:00
Jingyu Zhou 1f95cba53e Add describePartitionedBackup() for parallel restore
For partitioned logs, computing continuous log end version from min logs begin
version. Old backup test keeps using describeBackup() to be correctness clean.

Rename partitioned log file so that the last number is block size.
2020-03-20 20:13:38 -07:00
Jingyu Zhou fda6c08640 Include a total number of tags in partition log file names
This is needed for BackupContainer to check partitioned mutation logs are
continuous, i.e., restorable to a version.
2020-03-20 20:13:38 -07:00
Jingyu Zhou 88ad28e576 Integrate parallel restore with partitioned logs
In parallel restore, use new getPartitionedRestoreSet() to get a set containing
partitioned mutation logs. The loader uses a new parser to extract mutations
from partitioned logs.

TODO: fix unable to restore errors.
2020-03-20 20:13:38 -07:00
Jingyu Zhou e15015ee6c Add mutation log version names
I.e., BACKUP_AGENT_MLOG_VERSION for 2001 and PARTITIONED_MLOG_VERSION for 4110.
2020-03-20 20:13:38 -07:00
Jingyu Zhou 1123157ae0 Ignore mutations large than the end version 2020-01-22 19:38:46 -08:00
Jingyu Zhou d8c74e7e1a Extend BackupContainer to support tagged log files
That is, the file name contains the log router tag ID as the last component,
e.g., "log,39638169,42718056,016f52a4d16ef36fd3335db9c68abfc1,1048576,1".
2020-01-22 19:38:46 -08:00
Jingyu Zhou f21d7ca44c Add tag ID to backup log file names 2020-01-22 19:38:46 -08:00
Meng Xu e345c9061f FastRestore:Refine debug messages 2019-11-04 11:47:38 -08:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu 477fd152c0 FastRestore:Refactor code
1) Use the runRYWTransaction for simple DB access
2) Replace some printf with TraceEvent
3) Remove printf not used in debugging
4) Avoid wait inside the condition in loop-choose-when for
   the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
   the file only has functionalities related to restore worker.

Passed correctness test
2019-06-04 11:22:47 -07:00
Meng Xu a08a6776f5 FastRestore: Refactor to smaller components
The current code uses one restore interface to handle the work
for all restore roles, i.e., master, loader and applier.
This makes it harder to review or maintain or scale.

This commit split the restore into multiple roles by mimicing FDB
transaction system:
1) It uses a RestoreWorker as the process to host restore roles;
   This commit assumes one restore role per RestoreWorker; but
   it should be easy to extend to support multiple roles per RestoreWorker;
2) It creates 3 restore roles:
   RestoreMaster: Coordinate the restore process and send commands to the other two roles;
   RestoreLoader: Parse backup files to mutations and send mutations to appliers;
   RestoreApplier: Sort received mutations and apply them to DB in order.

Compilable version. To be tested in correctness.
2019-05-10 14:20:06 -07:00
Meng Xu 4c3ccebe8a FastRestore: Cleanup code
Remove unused code and comments.
2019-04-12 13:49:55 -07:00
Meng Xu 70d7c289f4 Merge branch 'master' into mengxu/restore/parallel-v7 2019-03-30 22:13:10 -07:00
Stephen Atherton ca8bbad657 Added --json option to fdbbackup describe. Also added expired percentage indicator to snapshot details. 2019-03-06 14:14:06 -08:00
mpilman 0bb60e5a3b Use proper fwd decl in NativeAPI
Also NativeAPI.h -> NativeAPI.actor.h
2019-02-19 15:16:59 -08:00
Meng Xu 550f2e2682 Merge with master to use the latest backup container
In fdb 6.0.15, backup container is changed on how to organize the backup data.
The backup made by fdb >6.0.15 has to be restored with fdb > 6.0.15.
Merge with master so that the fast restore uses fdb > 6.0.15
2019-01-30 12:05:15 -08:00
Evan Tschannen 684a22a52b Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbbackup/backup.actor.cpp
#	fdbclient/BackupContainer.actor.cpp
#	fdbclient/HTTP.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/BackupCorrectness.actor.cpp
#	versions.target
2019-01-09 16:14:46 -08:00
Meng Xu 5f406d864c WiP:Has bug: assign roles and distribute files
The global variables for a process in real mode are separate across processes in real mode, BUT
they are NOT separate in simulation mode.
The workers in different processes can still access the global variables,
which makes one worker overwrite another work's global variable value.

To solve this problem, which happens in other FDB code as well,
we allocate the global variable in a heap and pass the pointers across functions,
as what DataDistribution.actor does.

The other approach is to create two code path: one for real mode and the other for simulator,
as the g_simulator does to create the process context for each simulated process
2019-01-04 22:23:02 -08:00
Meng Xu 338f7ebe16 WiP: not compilable, get restore files
Need to see how existing code pass conplex struct around
E.g., TaskBucket
2018-12-26 20:57:56 -08:00
Stephen Atherton 928876be81 Backup dumpFileList() can now take a version range. 2018-12-21 22:42:29 -08:00
Stephen Atherton f64d5321e9 Backup describe, expire, and delete will now clearly indicate when there is no backup at the given URL. 2018-12-20 18:05:23 -08:00
Stephen Atherton 354abebf64 Added progress reporting to backup expiration. Simplified backup delete progress. 2018-12-20 00:23:26 -08:00
Stephen Atherton 69d847dbbd Backup describe can now analyze a backup assuming log data starts at a given version. This is used by expire both as an optimization and because doing so enables expire to repair a bad metadata state that can result from a cancelled or failed expire operation from fdbbackup <= 6.0.17. ListLogFiles() and ListRangeFiles() no longer sort results as in most cases the sort is not required and the result set can be very large. Describe will now update minLogBegin metadata to the beginning of the log range known to be present when possible. Several serial log and range list pairings are now done in parallel. 2018-12-18 04:33:37 -08:00
Stephen Atherton 223b19f5ba Rewrote backup metadata scheme to fix several design flaws involving cancelled or concurrent operations. Most importantly, before an expire deletes any data it marks the end of the range being deleted as 'unreliable' so further reads of the backup will treat versions before that point as having data holes. 2018-12-16 00:18:13 -08:00