Meng Xu
42df1e7792
Merge pull request #2879 from jzhou77/backup-progress
...
Update mutation bytes written for new backups
2020-03-30 21:42:45 -07:00
Meng Xu
a85652375c
Merge pull request #2872 from jzhou77/backup-fix
...
Switch off old mutation logging on proxies for new backups
2020-03-30 21:42:10 -07:00
Meng Xu
60f6edc3b5
Merge pull request #2860 from zjuLcg/report-conflicting-key-roll-forward
...
Report conflicting key roll forward
2020-03-30 17:33:56 -07:00
Jingyu Zhou
411b4c28ac
Update mutation bytes written for new backups
...
This will make the log bytes written available to backup status and describe
backup calls.
2020-03-29 21:23:34 -07:00
Jingyu Zhou
65e3b9192e
Add an assert for probably dead code
2020-03-28 21:19:47 -07:00
Jingyu Zhou
280bc94738
Do not recruit backup workers with wrong tags
...
In a rare scenario, the master can recruit backup workers with more tags than
the number of log router tags for an epoch. This can be caused by an
unsuccessful recovery that uses more tags than the next epoch. When
recruiting for the next epoch, if no progress has been made yet, the recruiting
logic looks back at the previous epoch. If the previous epoch has saved past
this epoch's begin version, the current epoch's progress is updated with that
information, which can result in more tags being inserted into this epoch's
recruitment.
2020-03-28 21:19:41 -07:00
Meng Xu
13f343ec96
Resolve minor review comment
2020-03-28 16:03:01 -07:00
Meng Xu
8a30526336
FastRestore:Remove commented assertion
2020-03-28 13:11:32 -07:00
Meng Xu
404a3e2619
FastRestore:Loader:Remove sanity check for the order of sending log and range mutations
2020-03-27 23:36:13 -07:00
Meng Xu
21a5c67f9a
FastRestore:Remove assertion on mutation sending order
2020-03-27 17:09:31 -07:00
Meng Xu
75fc9af5c8
Apply clang format
2020-03-27 16:55:52 -07:00
Meng Xu
0222e8096c
FastRestore:Send log mutations and range mutations in parallel
...
With the subversion extension, appliers can order log and range mutations
based on LogMessageVersion instead of sending order.
2020-03-27 16:54:19 -07:00
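
A minimal sketch of the ordering idea in the commit above, not the FastRestore code itself. It assumes a simplified stand-in for LogMessageVersion with only a commit version and a sub number; the function name `order_mutations` and the (key, payload) tuple layout are hypothetical. It illustrates why arrival order stops mattering once every mutation carries a (version, sub) key.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class LogMessageVersion:
    """Simplified stand-in: compares by commit version, then by sub number."""
    version: int
    sub: int


def order_mutations(log_mutations, range_mutations):
    """Merge two batches of (LogMessageVersion, payload) pairs.

    The batches may arrive in any order; the result depends only on the
    (version, sub) key attached to each mutation.
    """
    combined = list(log_mutations) + list(range_mutations)
    return sorted(combined, key=lambda item: item[0])
```

For example, a log batch holding (5, 1) and (5, 0) and a range batch holding (3, 0) produce the same merged sequence whichever batch is passed first.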
Meng Xu
f7233bade7
Rename ParallelRestoreCorrectnessAtomicOpTinyData.txt by removing TinyData
2020-03-27 13:08:59 -07:00
Meng Xu
97f8e46388
Sanity check subversion for log mutations
2020-03-27 13:07:08 -07:00
Meng Xu
32b0ba1822
Merge branch 'master' into mengxu/parallel-range-log-file-loading-PR
2020-03-27 12:13:47 -07:00
Meng Xu
113d0fb48b
Remove incorrect assertion
2020-03-27 12:13:30 -07:00
chaoguang
64148469e8
clang-format the pr
2020-03-26 15:52:30 -07:00
Jingyu Zhou
9a9af7d8a8
Add more trace event details on partitioned log
2020-03-26 13:57:31 -07:00
Meng Xu
6299ad3913
FastRestore:Load range and log files in parallel for new backup format
2020-03-26 13:17:44 -07:00
Jingyu Zhou
6be913a430
Add partitioned logs option to AtomicRestore workload
2020-03-26 13:04:00 -07:00
Jingyu Zhou
aca458cd96
Set 50% chance to restore old backup files for fast restore
2020-03-26 13:04:00 -07:00
Jingyu Zhou
99f4ef6e0c
Fix restore loader to handle mutation sub number
...
For the old backup format, give mutations a sub sequence number starting from 0
for each commit version.
2020-03-26 13:04:00 -07:00
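
A small sketch of the numbering scheme described in the commit above, under the assumption that old-format mutations arrive as (commit_version, payload) pairs in their original order. The function name `assign_sub_sequence` and the tuple layout are hypothetical and are not taken from the restore loader.

```python
def assign_sub_sequence(mutations):
    """Attach a sub sequence number to each (commit_version, payload) pair.

    Numbering restarts at 0 for every commit version and follows the
    original order of mutations within that version.
    """
    next_sub = {}  # commit version -> next sub number to hand out
    numbered = []
    for version, payload in mutations:
        sub = next_sub.get(version, 0)
        numbered.append((version, sub, payload))
        next_sub[version] = sub + 1
    return numbered
```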
Jingyu Zhou
40b17e1e9b
Remove a no longer used knob
2020-03-26 13:04:00 -07:00
Jingyu Zhou
772ab70aee
Add an option for fast restore to restore old backups
...
If "usePartitionedLogs" is set to false, then the workload uses old backups for
restore.
2020-03-26 13:04:00 -07:00
Meng Xu
1052b23ee1
Merge pull request #2370 from atn34/test-watch-outliving-transaction
...
Test watch outliving transaction
2020-03-26 12:40:38 -07:00
Andrew Noyes
cdb6bbfc85
Test watch outliving transaction
2020-03-26 10:09:03 -07:00
Jingyu Zhou
feedab02a0
Merge pull request #2855 from xumengpanda/mengxu/fr-api-atomicrestore-PR
...
Add ApiCorrectnessAtomicRestore workload for the new performant restore
2020-03-25 18:05:26 -07:00
Evan Tschannen
bb5799bd20
Merge pull request #2642 from xumengpanda/mengxu/new-backup-format-PR
...
FastRestore:Integrate with new backup format
2020-03-25 15:47:55 -07:00
Jingyu Zhou
0f57bf9685
Remove a SevError event
...
The same mutation can be present in overlapping mutation logs, so we cannot
assert its absence. This can happen for multiple reasons. One possibility
is that new TLogs copy mutations from old generation TLogs; another is that
a backup worker is recruited without knowing the previously saved progress.
2020-03-25 15:23:21 -07:00
Meng Xu
1ba11dc74b
Apply clang format
2020-03-25 11:20:17 -07:00
Meng Xu
120272f025
Change unlockDB from RestoreMaster to Agent
2020-03-25 11:04:49 -07:00
Jingyu Zhou
472f7bdd32
Rename a trace event to avoid confusion
...
Change from BackupRange to BackupVersionRange.
2020-03-25 11:03:05 -07:00
Evan Tschannen
e0fbd9ecbe
Merge pull request #2847 from atn34/atn34/assert-no-return
...
Assert recoverAndEndEpoch does not become ready
2020-03-25 10:23:38 -07:00
Jingyu Zhou
e2f317a0da
Fix a crash failure
2020-03-25 09:18:49 -07:00
chaoguang
62627dd2ee
Fix a randomness bug and naming issue in TraceEvent
2020-03-25 00:55:40 -07:00
Jingyu Zhou
00fb4c1a35
Fix an off by one error
...
Backup worker's saved version should start from its startVersion - 1, i.e.,
the startVersion is not saved yet. Otherwise, if the version range is just
the startVersion itself and there is no data, then the range [startVersion,
startVersion + 1) will be missing. This causes non-continuous partitioned logs.
2020-03-24 23:40:36 -07:00
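
The off-by-one above can be illustrated with a toy model. This is a sketch under stated assumptions, not the backup worker's logic: progress is modeled as a single "saved" version, each write covers the half-open range [saved + 1, end), and the function name `saved_ranges` is hypothetical.

```python
def saved_ranges(initial_saved, end_versions):
    """Return the half-open [begin, end) ranges written as progress advances.

    initial_saved is the last version treated as already saved. Each entry
    of end_versions is an exclusive end; a range is written only when it
    would cover at least one version that has not been saved yet.
    """
    saved = initial_saved
    ranges = []
    for end in end_versions:
        if end - 1 > saved:
            ranges.append((saved + 1, end))
            saved = end - 1
    return ranges
```

In this model, with a start version of 100 and a single range ending at 101, starting from `initial_saved = 99` writes [100, 101), while starting from `initial_saved = 100` writes nothing and leaves that range missing.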
Meng Xu
ca8966a28b
Move lockDB into submitRestore request from restore worker
...
AtomicRestore needs to lock DB before we start the restore worker.
So we cannot lock DB in restore worker with a different randomUID.
2020-03-24 23:39:35 -07:00
Meng Xu
6a8d6ddb8e
Introduce ParallelRestoreApiCorrectnessAtomicRestore.txt test
...
This covers ApiCorrectnessTest as workload for parallel restore.
2020-03-24 22:30:51 -07:00
Jingyu Zhou
669916467e
Add missing transaction reset call
2020-03-24 20:14:37 -07:00
Jingyu Zhou
5e729a5bcf
Merge branch 'master' of https://github.com/apple/foundationdb into backup-worker-bak
2020-03-24 19:54:36 -07:00
Jingyu Zhou
edcbeb8992
Address review comments
...
Move transaction object outside of the loop and rename trace events.
2020-03-24 18:22:20 -07:00
Andrew Noyes
289487559d
Revert "Revert "Merge pull request #2257 from zjuLcg/report-conflicting-key""
...
This reverts commit 804fe1b22e.
2020-03-24 18:11:15 -07:00
Meng Xu
b173929316
Add atomicParallelRestore to AtomicRestore workload
2020-03-24 15:58:49 -07:00
Meng Xu
81f7181c9e
Refactor submitParallelRestore function into FileBackupAgent
2020-03-24 14:44:55 -07:00
Meng Xu
5584884c12
Refactor parallelRestoreFinish function into FileBackupAgent
2020-03-24 14:15:15 -07:00
Jingyu Zhou
a3058e7d96
Fix incorrectly marking a backup job as stopped
...
This causes missing version ranges for mutation logs.
2020-03-23 22:05:58 -07:00
Jingyu Zhou
1155304cd5
Remove a spurious assertion
...
It's possible that there is a gap between backup's contiguousLogEnd and snapshot
version.
2020-03-23 21:39:40 -07:00
Jingyu Zhou
82a1790776
Fix backup worker crash due to aborted backup job
...
If a backup job is aborted, the "startedBackupWorkers" key can be cleared, thus
triggering the assertion failure.
2020-03-23 21:11:25 -07:00
Jingyu Zhou
243d078596
Fix off by one error
...
Epoch end version is saved version + 1, so need +1 for minBackupVersion.
2020-03-23 20:44:31 -07:00
Jingyu Zhou
f1d7fbafb4
Stop actors for displaced backup workers
...
If the worker is displaced, it should not update backup containers.
2020-03-23 18:48:06 -07:00