foundationdb

Commit Graph

Author	SHA1	Message	Date
Jingyu Zhou	273c086b0f	Fix MacOS compiling error: clang doesn't allow capture references, so use copy for lambda's capture list.	2020-03-20 20:15:09 -07:00
Jingyu Zhou	6fb7316185	Fix asset end version if request.targetVersion is -1	2020-03-20 20:15:09 -07:00
Jingyu Zhou	ca1a4ef9fd	Ignore mutation logs of size 0 in converter	2020-03-20 20:15:08 -07:00
Jingyu Zhou	d0a24dd20d	Decode out of order mutations in old mutation logs. In the old mutation logs, a version's mutations are serialized as a buffer. Then the buffer is split into smaller chunks, e.g., 10000 bytes each. When writing chunks to the final mutation log file, these chunks can be flushed out of order. For instance, the (version, chunk_part) can be in the order of (3, 0), (4, 0), (3, 1). As a result, the decoder must read forward to find all chunks of data for a version. Another complication is that the files are organized into blocks, where (3, 1) can be in a subsequent block. This change checks the value size for each version; if the size is smaller than the expected size, the decoder will look for the missing chunks in the next block.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	d82432da3c	Fix wrong end version for restore loader. The restore cannot exceed the target version of the restore request. Otherwise, the version restored is larger than the requested version.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	c59b0844a9	Add total number of tags to WorkerBackupStatus. This allows the backup worker to check the number of tags.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	d8731a1796	Refactor to use std::find_if for more concise code	2020-03-20 20:15:08 -07:00
Jingyu Zhou	12ed8ad536	Fix backup worker start version when logset start version is lower. The start version of tlog set can be smaller than the last epoch's end version. In this case, set backup worker's start version to the last epoch's end version to avoid overlapping of version ranges among backup workers.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	38def426f4	Add a flag to submitBackup for partitioned log. This is to distinguish it from old workloads so that they can work in simulation.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	e9287407d6	Backup worker updates latest log versions in BackupConfig. If backup worker is enabled, the current epoch's worker of tag (-2,0) will be responsible for monitoring the backup progress of all workers and updating the BackupConfig with the latest saved log version, which is the minimum version of all tags. This change has been incorporated into getLatestRestorableVersion() so that it is transparent to clients.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	80d3fa1222	Add delay for master to recruit backup workers. This delay is to ensure old epoch's backup workers can save their progress in the database. Otherwise, the new master could attempt to recruit backup workers for the old epoch on version ranges that have already been popped. As a result, the logs will lose data.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	fe6b4a4398	Some correctness fixes	2020-03-20 20:15:08 -07:00
Jingyu Zhou	5ce9fc0e4c	Partitioned logs should be filtered after sorting by tag IDs. The default sorting by begin and end version doesn't work with duplicate removal, as tags are also compared.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	5afc23a0e1	Give a chance for backup worker to finish writing files. If a backup worker is cancelled, wait until it finishes writing files so that we don't need to create these files in the next epoch.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	20df67ee6a	Filter partitioned logs with subset relationship. If a log file's progress is not saved, a new log file will be generated with the same begin version. Then we can have a file that contains a subset of contents in another log file. During restore, we should filter out files whose contents are a subset of other files.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	696ce6aa82	Fix compiling error of reverse iterators. The MacOS and Windows compilers don't like the use of "!=" operator of std::map::reverse_iterator.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	b792d76d62	Fix version gap in old epoch's backup. When pull finished and message queue is empty, we should use end version as the popVersion for backup files. Otherwise, there might be a version gap between last message and end version.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	31f7108eab	Handle partial recovery in BackupProgress. A partial recovery can result in an empty epoch that copies previous epoch's version range. In this case, getOldEpochTagsVersionsInfo() will not return previous epoch's information. To correctly compute the start version for a backup worker, we need to check previous epoch's saved version. If it is larger than this epoch's begin version, use previously saved version as the start version.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	e3eb3beaaf	Consider previously pulled version for pulling version. Saving files only happens if we are not pulling, i.e., not in NOOP mode.	2020-03-20 20:15:08 -07:00
Jingyu Zhou	1b159a3785	Fix: backup worker ignores deleted container	2020-03-20 20:14:36 -07:00
Jingyu Zhou	00350dd3d8	Fix pulledVersion of backup worker. Not sure why, but the cursor's version can be smaller than before.	2020-03-20 20:14:35 -07:00
Jingyu Zhou	672ad7a8ea	Fix: backup worker savedVersion init to begin version. Choosing invalidVersion is wrong, as the worker starts at beginVersion.	2020-03-20 20:14:35 -07:00
Jingyu Zhou	c300a5c1b7	Fix contract changes: backup worker generates continuous versions. Previously, we allowed holes in version ranges in partitioned mutation logs. This has been changed so that restore can easily figure out if database is restorable. A specific problem is that if the backup worker didn't find any mutations for an old epoch, the worker can just exit without generating a log file, thus leaving holes in version ranges. Another contract change is that if a backup key is set, then we must store all mutations for that key, especially for the worker for the old epoch. As a result, the worker must first check backup key, before pulling mutations and uploading logs. Otherwise, we may lose mutations. Finally, when a backup key is removed, the saving of mutations should be up to the current version so that backup worker doesn't exit too early. I.e., avoid the case where saved mutation versions are less than the snapshot version taken.	2020-03-20 20:14:35 -07:00
Jingyu Zhou	86edc1c9c8	Fix backup worker does NOOP pop before getting backup key. The NOOP pop causes some mutation ranges to be dropped by backup workers. As a result, the backup is incomplete. Specifically, the wait of BACKUP_NOOP_POP_DELAY blocks the monitoring of backup key actor.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	05b87cf288	Partitioned logs need to compute continuous begin version. Because different tags may start at different versions, tag 0 can start at a higher version. In this case, another tag's high version should be used as the start version for continuous logs.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	1f95cba53e	Add describePartitionedBackup() for parallel restore. For partitioned logs, compute the continuous log end version from the minimum begin version of the logs. Old backup test keeps using describeBackup() to be correctness clean. Rename partitioned log file so that the last number is block size.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	2eac17b553	StagingKey can add out-of-order mutations. For partitioned logs, mutations of the same version may be sent to applier out-of-order. If one loader advances to the next version, an applier may receive later version mutations from different loaders. So, dropping of early mutations is wrong.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	938a6f358d	Describe backup uses partitioned logs to find continuous end version. For partitioned logs, the continuous end version has to be computed range by range, where each range must contain continuous versions for all tags.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	659843ff51	Check partitioned log files are continuous for RestoreSet. The idea of checking is to use Tag 0 to find out ranges and their number of tags. Then for each tag 1 and above, check versions are continuous.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	ab0b59b0c3	Add subsequence number to restore loader & applier. The subsequence number is needed so that mutations of the same commit version number, but from different partitioned logs can be correctly reassembled in order. For old backup files, the sub number is always 0. For partitioned mutation logs, the actual sub number is used. For range files, the sub number is always 0.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	fda6c08640	Include a total number of tags in partition log file names. This is needed for BackupContainer to check partitioned mutation logs are continuous, i.e., restorable to a version.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	64859467e4	Return partitioned logs for RestorableFileSet	2020-03-20 20:13:38 -07:00
Jingyu Zhou	940bea102a	Add a knob to switch mutation logs for parallel restore. Knob FASTRESTORE_USE_PARTITIONED_LOGS defaults to true, which enables partitioned mutation logs. Otherwise, old mutation logs are used.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	6b9b93314e	Check block padding is \0xff for new mutation logs	2020-03-20 20:13:38 -07:00
Jingyu Zhou	35aafefb89	Consolidate StringRefReader classes. Fix a compiler error of unused variable too.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	88ad28e576	Integrate parallel restore with partitioned logs. In parallel restore, use new getPartitionedRestoreSet() to get a set containing partitioned mutation logs. The loader uses a new parser to extract mutations from partitioned logs. TODO: fix unable to restore errors.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	e15015ee6c	Add mutation log version names, i.e., BACKUP_AGENT_MLOG_VERSION for 2001 and PARTITIONED_MLOG_VERSION for 4110.	2020-03-20 20:13:38 -07:00
Jingyu Zhou	ec352c03c9	Add partitioned logs to BackupContainer	2020-03-20 20:13:38 -07:00
Meng Xu	d3071409c5	FastRestore: Add comment for integrating with new backup format	2020-03-20 20:13:38 -07:00
Jingyu Zhou	3801e50288	Backup worker: enable 50% of time in simulation. Make this randomization a separate one.	2020-03-20 20:13:38 -07:00
Meng Xu	980037f3a8	Merge pull request #2835 from bnamasivayam/revert-report-conflicting-keys: Revert report conflicting keys	2020-03-20 10:33:26 -07:00
Evan Tschannen	e7e559cbae	Merge pull request #2706 from etschannen/feature-test-harness: Added TestHarness and TraceLogHelper for assisting with automated simulation testing	2020-03-20 10:29:22 -07:00
A.J. Beamon	cf9a18a64d	Merge pull request #2838 from dongxinEric/misc/diable-ruby-tester: Disable Ruby tests for the moment.	2020-03-20 10:26:26 -07:00
Evan Tschannen	a38a7fc8b4	updated copyright date	2020-03-20 10:15:33 -07:00
Xin Dong	f293666028	Added back unnecessary changes.	2020-03-20 10:12:37 -07:00
Xin Dong	851ee20c1a	Disable Ruby tests for the moment.	2020-03-20 10:05:14 -07:00
Jingyu Zhou	34415f82b3	Merge pull request #2832 from xumengpanda/mengxu/backup-code-review-PR: Buggify upload delay when backup worker upload data to blob	2020-03-19 21:42:28 -07:00
Balachandar Namasivayam	804fe1b22e	Revert "Merge pull request #2257 from zjuLcg/report-conflicting-key" This reverts commit `648dc4a933`, reversing changes made to `487d131b38`.	2020-03-19 21:34:28 -07:00
Balachandar Namasivayam	efd0c6cec0	Revert "Merge pull request #2833 from xumengpanda/mengxu/remove-test-PR" This reverts commit `8d655d7e40`, reversing changes made to `cd5be43cd9`.	2020-03-19 21:33:47 -07:00
Meng Xu	dfea2c2e55	BackupWorker: Remove assert in pop	2020-03-19 20:14:52 -07:00
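
Commit d0a24dd20d describes a decoder that must collect a version's chunks even when they were flushed out of order or spill into a later block. The following is a minimal Python sketch of that idea only, not the FoundationDB decoder: the function name reassemble_chunks, the (version, part, data) tuple layout, and the expected_sizes mapping are all assumptions made for illustration. It groups chunks by version, joins the parts in part order, and reports any version whose joined value is still shorter than its expected size, which is the case where more chunks would have to be read from the next block.

```python
from collections import defaultdict


def reassemble_chunks(chunks, expected_sizes):
    """Join out-of-order chunks per version (hypothetical helper).

    chunks: iterable of (version, part, data) tuples, in file order.
    expected_sizes: dict mapping version -> total value size in bytes.
    Returns (complete, pending): complete maps version -> joined bytes,
    pending is the set of versions still missing data.
    """
    parts_by_version = defaultdict(dict)
    for version, part, data in chunks:
        parts_by_version[version][part] = data

    complete, pending = {}, set()
    for version, parts in parts_by_version.items():
        # Concatenate the parts in part-number order, not arrival order.
        value = b"".join(parts[i] for i in sorted(parts))
        if len(value) < expected_sizes[version]:
            # Too short: remaining chunks may sit in a subsequent block.
            pending.add(version)
        else:
            complete[version] = value
    return complete, pending


# Chunks arrive as (3, 0), (4, 0), (3, 1); the last one is in a later block.
sizes = {3: 6, 4: 3}
block1 = [(3, 0, b"abc"), (4, 0, b"xyz")]
block2 = [(3, 1, b"def")]
print(reassemble_chunks(block1, sizes))
print(reassemble_chunks(block1 + block2, sizes))
```

After the first block alone, version 3 is reported as pending; once the second block is included, both versions are complete.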

9158 Commits

All Branches