134 Commits

Author SHA1 Message Date
Young Liu 79ce16650d merge master branch 2020-08-11 19:22:10 -07:00
Evan Tschannen a49cb41de7 Merge branch 'release-6.3'
# Conflicts:
#	CMakeLists.txt
#	cmake/ConfigureCompiler.cmake
#	fdbserver/Knobs.cpp
#	fdbserver/StorageCache.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	flow/ThreadHelper.actor.h
#	flow/serialize.h
#	tests/CMakeLists.txt
2020-07-29 00:31:55 -07:00
Andrew Noyes d2cf700bd4 Fix compiler warnings 2020-07-28 18:30:26 +00:00
Young Liu 229ab0d5f1 Fix some conflicts and remove debugging trace events 2020-07-22 23:35:46 -07:00
Young Liu 525f10e30c Merge master branch 2020-07-22 16:08:49 -07:00
Young Liu 302cf5c45f Remove debug trace events 2020-07-22 12:20:22 -07:00
Young Liu 2703cedac5 Fixed known bugs 2020-07-17 22:24:52 -07:00
Young Liu 21c1998cca Fix MaxTLogQueueSize Bug 2020-07-16 15:56:04 -07:00
Young Liu 5b06d69d25 Pass watches test 2020-07-15 00:37:41 -07:00
Evan Tschannen 7affda3c8b Merge branch 'release-6.3'
# Conflicts:
#	CMakeLists.txt
2020-07-14 14:57:24 -07:00
Jingyu Zhou 773e533a09 Make Arena's impl private 2020-07-13 21:39:36 -07:00
Markus Pilman 69864c9f96 Make Spans not allocate heap memory 2020-07-09 11:49:33 -06:00
Markus Pilman 0fbe7101c3 Revert "Revert "Request tracing""
This reverts commit 327cc31e35.
2020-07-07 10:06:13 -06:00
Meng Xu f3302833ce
Merge pull request #3435 from apple/release-6.3
Merge Release 6.3 to master
2020-06-30 10:08:28 -07:00
Jingyu Zhou 8263c25336 Really ignore the error from uploading actor 2020-06-27 22:36:53 -07:00
Jingyu Zhou d883426c6a Fix spammy GotBackupProgress events
Only print these types of events during master recovery and don't log them for
backup workers.
2020-06-27 21:30:38 -07:00
Jingyu Zhou b8c77ead43 Fix spurious SevError from backup workers
While displaced backup workers wait for uploading to finish, they can get a
connection_failed error, which causes a spurious SevError of BackupFailed. Fix
by ignoring any errors from the uploading actor.
2020-06-27 21:24:22 -07:00
Jingyu Zhou 327cc31e35
Revert "Request tracing" 2020-06-16 12:32:42 -07:00
Markus Pilman 09c136e434 Framework for transaction tracing 2020-06-08 16:09:37 -07:00
Meng Xu 36ad1a95f4 Resolve conflicts when merge release-6.3 into master 2020-05-25 12:11:59 -07:00
Meng Xu 1c35ad884f Merge branch 'master' into mengxu/release-6.3-conflict-PR
Has conflict with master;
Next commit will fix the conflicts.
2020-05-25 12:01:49 -07:00
A.J. Beamon d128252e90 Merge release-6.3 into master 2020-05-22 09:25:32 -07:00
Jingyu Zhou 9e23166cf8 Fix a super rare missing mutation bug in backup workers
When a backup worker stops pulling for an old epoch, we cannot clear mutations.
This is because these mutations are needed for saving.
2020-05-21 12:19:57 -07:00
Jingyu Zhou cdeabc4de6 Fix memory accounting error due to growing Arena
After an Arena object is counted, it can grow larger later, so we can't subtract
the arena's current size when releasing memory. Instead, we use the arena size
recorded when inserting mutations.
2020-05-20 13:26:57 -07:00
Jingyu Zhou 9fbbec1033 Fix duplicated counting of memory usage
For each message from LogPeekCursor, check that it uses a different arena from
the previous one. Otherwise, the arena's memory could be counted twice.
2020-05-16 18:50:09 -07:00
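
The check described in commit 9fbbec1033 can be sketched as below. This is a minimal illustration, not the actual backup worker code: `Arena`, `Message`, and `countArenaBytes` are hypothetical stand-ins, and the sketch assumes that messages sharing an arena arrive consecutively.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memory arena; only its size matters here.
struct Arena {
    std::size_t bytes;
};

// Hypothetical stand-in for a message that keeps its backing arena alive.
struct Message {
    std::shared_ptr<Arena> arena;
};

// Sums arena memory, counting an arena only when it differs from the
// arena of the previous message.
std::size_t countArenaBytes(const std::vector<Message>& messages) {
    std::size_t total = 0;
    const Arena* previous = nullptr;
    for (const Message& m : messages) {
        if (m.arena.get() != previous) {
            total += m.arena->bytes;
            previous = m.arena.get();
        }
    }
    return total;
}
```

Comparing only against the previous arena keeps the check constant-time per message, at the cost of double counting if a shared arena reappears after a different one.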
Jingyu Zhou a2e5050492 Fix duplicated mutation
This seems to be related to how the actor compiler generates code. The message
can be inserted twice with the original code ordering.
2020-05-16 10:52:11 -07:00
Jingyu Zhou caca31d05a Filter out mutations before the true-up version
When a mutation log's begin version is trued up, we must filter out mutations
below that version.
2020-05-15 20:06:47 -07:00
Jingyu Zhou 01eff0fc03 Fix memory bytes accounting
Avoid duplicated counting of arena memory, since messages from the peek cursor
can share an arena.
2020-05-14 19:59:54 -07:00
Jingyu Zhou 17915e13b0 Limit memory usage of backup workers 2020-05-14 13:24:56 -07:00
Jingyu Zhou 1a35efe43c Add an assertion: mutation version >= log's begin version
This is to check that no version's data are split into two files.
2020-05-14 12:06:13 -07:00
Alex Miller ccaac162e2 Resolve performance concerns of nearly-no-op debugMutation being frequently called
This introduces unhygienic macro variants that inline an `ENABLED &&`
before the TraceEvent.  This way, they get entirely compiled out unless
enabled.

Then rewrite all debugMutation uses via sed.
2020-05-13 18:44:15 -07:00
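
The `ENABLED &&` trick from commit ccaac162e2 relies on short-circuit evaluation: when the flag is a constant false, the right-hand side is never evaluated and the compiler can drop it. The sketch below only illustrates that idea; `MUTATION_TRACKING_ENABLED`, `DEBUG_MUTATION`, `describeMutation`, and `emitTrace` are hypothetical names, not the macros or functions used in the codebase.

```cpp
#include <string>

// Hypothetical compile-time switch; the real flag name may differ.
#define MUTATION_TRACKING_ENABLED 0

// Counts how often the (potentially expensive) argument is built.
static int describeCalls = 0;

// Stand-in for costly formatting of a mutation.
std::string describeMutation(int version) {
    ++describeCalls;
    return "mutation@" + std::to_string(version);
}

// Stand-in for emitting a trace event.
bool emitTrace(const std::string& detail) {
    return !detail.empty();
}

// The flag is inlined in front of the call, so with the flag at 0 the
// argument expression is never evaluated.
#define DEBUG_MUTATION(detail) (MUTATION_TRACKING_ENABLED && emitTrace(detail))

bool processMutation(int version) {
    return DEBUG_MUTATION(describeMutation(version));
}
```

A plain function taking the same argument would still evaluate `describeMutation(version)` before the call, which is why a macro is used here.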
Alex Miller 27da91ab9e Merge remote-tracking branch 'upstream/master' into mutation-debugging 2020-05-13 12:51:44 -07:00
A.J. Beamon 36454bb3b8 Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbclient/MasterProxyInterface.h
#	fdbclient/NativeAPI.actor.cpp
2020-05-04 10:23:25 -07:00
Evan Tschannen bd699f435c fixed compiler errors 2020-05-01 11:01:09 -07:00
A.J. Beamon 66228343f1 Merge branch 'master' into transaction-tagging 2020-04-30 08:12:03 -07:00
Jingyu Zhou 7d59e53349 Consolidate makePadding() 2020-04-28 15:39:23 -07:00
A.J. Beamon 41c517a5dd Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
2020-04-27 13:05:24 -07:00
A.J. Beamon 239876351b Add some initial auto-throttling. Move the definition of the priority enum to a more global place and use it for all transaction priorities (except in ClientLogEvents, because of serialization incompatibilities). 2020-04-24 11:31:16 -07:00
Jingyu Zhou 0ae0a81edf Ensure mutation logs save complete version's data
I.e., do not allow the same version's mutations to be saved in different files.
Otherwise, a file may contain only part of a version's data, causing
continuity analysis of mutation logs to fail. This could also cause restore
failures, if the target version's mutations are stored in two files.

In the above description, all mutation logs refer to the same tag's logs.
2020-04-20 20:41:30 -07:00
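
One way to picture the rule in commit 0ae0a81edf is a splitter that starts a new file only at a version boundary. This is a sketch under assumptions, not the backup worker's logic: `Version`, `splitAtVersionBoundaries`, and `softLimit` are hypothetical, and each mutation is reduced to its version number.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Version = std::int64_t;

// Groups version-sorted mutations into files. A file may exceed softLimit
// so that all mutations of one version stay in the same file.
std::vector<std::vector<Version>> splitAtVersionBoundaries(
    const std::vector<Version>& versions, std::size_t softLimit) {
    std::vector<std::vector<Version>> files;
    std::vector<Version> current;
    for (Version v : versions) {
        const bool atBoundary = !current.empty() && v != current.back();
        if (atBoundary && current.size() >= softLimit) {
            files.push_back(current);
            current.clear();
        }
        current.push_back(v);
    }
    if (!current.empty()) {
        files.push_back(current);
    }
    return files;
}
```

With versions `{1, 1, 2, 2, 2, 3}` and a soft limit of 2, the second file holds all three mutations of version 2 instead of being cut after two entries.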
Jingyu Zhou 61f0f44ab3 Fix comments on startVersion in BackupWorker 2020-04-20 17:07:50 -07:00
Jingyu Zhou 5f43e18906 Backup worker pops max of savedVersion or NOOP's popVersion 2020-04-20 11:43:09 -07:00
Jingyu Zhou 70221a25d7 True-up a backup's begin version
For the first mutation log of a backup, we need to true-up its begin version to
the exact version of the first mutation. This is needed to ensure the strict
less than relationship between two mutation logs, if one's version range is
within the other.

A problematic scenario is as follows:
Epoch 1: a mutation log A [200, 900] is saved, but its progress is NOT saved.
Epoch 2: master recruits a worker for [1, 1000], 1000 is epoch 1's end version.
         New worker saves a mutation log B [100, 1000]
A's range is strictly within B's range, but A's size is larger than B's.

This happens because B's start version is set to the backup's begin version,
which is not the actual version of the first mutation. After B's begin version
is trued up to 300, we won't have this issue.
2020-04-20 11:06:46 -07:00
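
The adjustment described in commit 70221a25d7 can be sketched as moving a log's begin version forward to the first mutation it actually contains. The names `VersionRange` and `trueUpBegin` are hypothetical, and the sketch assumes the mutation versions are sorted in ascending order.

```cpp
#include <cstdint>
#include <vector>

using Version = std::int64_t;

// Inclusive version range covered by a mutation log.
struct VersionRange {
    Version begin;
    Version end;
};

// Returns the range with begin moved to the version of the first mutation
// inside it. The range is returned unchanged if it contains no mutation.
VersionRange trueUpBegin(VersionRange range,
                         const std::vector<Version>& sortedMutationVersions) {
    for (Version v : sortedMutationVersions) {
        if (v >= range.begin && v <= range.end) {
            range.begin = v;
            break;
        }
    }
    return range;
}
```

Using the numbers from the commit message, a log recorded as [100, 1000] whose first mutation is at version 300 becomes [300, 1000].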
Jingyu Zhou 8245f12091 Backup worker doesn't save progress in NOOP mode
This fixes the consistency check failure, where saving progress commits new
transactions. Pop is performed by the NOOP loop in monitorBackupKeyOrPullData.
2020-04-20 11:06:46 -07:00
Jingyu Zhou cdc911a6ae Fix inadvertent savedVersion update 2020-04-20 11:06:46 -07:00
Jingyu Zhou 76d90ac6d7 Limit the version range for old epochs
When the Master recruits a backup worker for previous epochs, the Master may
set the begin version to a very low number, because the backup progress for
that epoch is not saved. This can cause problems for the log file, since these
low versions have been popped.

The fix here is to advance savedVersion to the minimum of backup's starting
version if it is higher than the begin version set by the Master. This is safe
because these versions are not popped. If they are popped, their progress should
already be recorded and Master would use a higher version than the backup's
starting version.
2020-04-20 11:06:46 -07:00
Jingyu Zhou 4e128328f7 Stop backup workers before clearing DB in parallel restore workload
This is because the clearing of DB can be picked up by backup workers and be
applied during restore, causing restore failures.
2020-04-11 10:26:08 -07:00
Jingyu Zhou 60407bdee3 Use LiteralStringRef for backup paused key 2020-04-07 16:02:25 -07:00
Jingyu Zhou 9fb3fb9d82 Add pause/resume for new backups
To pause/resume the backup workers, the fdbbackup command will write to the
backupPausedKey. Backup workers then notice that the value of the key has
changed and stop/resume pulling from the TLog.
2020-04-06 14:29:46 -07:00
Jingyu Zhou 411b4c28ac Update mutation bytes written for new backups
This will make the log bytes written available to backup status and describe
backup calls.
2020-03-29 21:23:34 -07:00
Alex Miller 40d10aa990 Fix debugMutation uses that were concurrently added in new backup code 2020-03-27 04:01:18 -07:00