A.J. Beamon
25c4880ebe
Merge branch 'release-6.3' into merge-release-6.3-into-master (temporarily discard all changes to BackupContainer.actor.cpp)
...
# Conflicts:
# fdbclient/BackupContainer.actor.cpp
# fdbserver/Knobs.h
2021-03-15 16:41:04 -07:00
FDB Formatster
df90cc89de
apply clang-format to *.c, *.cpp, *.h, *.hpp files
2021-03-10 10:18:07 -08:00
Vishesh Yadav
3fbe7e69fa
Add TraceEvent to warn about tLogs which have not joined after timeout
...
During recovery, wait for TLOG_SLOW_REJOIN_WARN_TIMEOUT_SECS and
log the tLogs that have not joined yet.
2021-03-09 18:57:46 -08:00
FDB Formatster
8a8c488ede
apply clang-format to *.c, *.cpp, *.h, *.hpp files
2021-03-05 18:13:38 -06:00
Andrew Noyes
79cec09255
Apply clang-tidy's performance-inefficient-vector-operation fix
...
I ran this command in my build directory after compiling with
OPEN_FOR_IDE. It took a few small tweaks to get it to compile, which is
outside the scope of this commit.
$ python run-clang-tidy.py -j $(nproc) -checks='-*,performance-inefficient-vector-operation' -fix
2021-03-04 03:58:25 +00:00
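For context on the commit above: clang-tidy's performance-inefficient-vector-operation check targets loops that append to a std::vector without reserving capacity first. A minimal sketch of the kind of rewrite it produces follows; the `squares` function is hypothetical and is not taken from the FoundationDB tree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example, not FoundationDB code.
// Without the reserve() call, `out` may reallocate several times as it grows.
// The clang-tidy fix inserts the reserve() ahead of the loop.
std::vector<int> squares(const std::vector<int>& in) {
    std::vector<int> out;
    out.reserve(in.size()); // single allocation up front
    for (std::size_t i = 0; i < in.size(); ++i) {
        out.push_back(in[i] * in[i]);
    }
    assert(out.size() == in.size());
    return out;
}
```

The behavior is unchanged; only the number of reallocations during the loop goes down.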
Evan Tschannen
346a4e3ecd
Merge branch 'release-6.3'
...
# Conflicts:
# fdbcli/fdbcli.actor.cpp
# fdbrpc/LoadBalance.actor.h
# fdbrpc/MultiInterface.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/masterserver.actor.cpp
2021-03-01 18:52:06 -08:00
Meng Xu
0c28e1d640
Add comment to peekLogRouter in TagPartitionedLogSystem
2021-02-22 16:20:50 -08:00
Meng Xu
ef0bf2728e
Merge branch 'release-6.3' into mengxu/ha-code
...
Resolve Conflicts:
fdbserver/LogRouter.actor.cpp: Only conflicts at comments
2021-02-19 21:47:09 -08:00
Meng Xu
33eb1de00e
Add some comments to the log system
...
and resolve review comment by deleting my questions.
2021-02-19 21:44:13 -08:00
Meng Xu
471a3489fb
Resolve review comments and add trace fields to MasterRecoveryState
2021-02-17 14:44:14 -08:00
Meng Xu
9122be4d81
Add comments to HA code and loadBalance code
2021-02-10 13:51:36 -08:00
sfc-gh-tclinkenbeard
dd3669886e
Improve IPeekCursor and ILogSystem method signatures
2020-12-08 09:09:31 -08:00
Andrew Noyes
877997632d
Merge branch 'release-6.3' into anoyes/merge-release-6.3-master
...
Include conflict markers for review purposes
2020-12-04 01:38:07 +00:00
Andrew Noyes
dc2bac5670
Resolve conflicts
2020-11-24 19:09:42 +00:00
Andrew Noyes
1f541f02be
Merge branch 'anoyes/merge-6.2-to-6.3' into anoyes/release-6.3-merge
...
Merge, leaving conflict markers for now
2020-11-24 16:55:34 +00:00
Meng Xu
046a6e8427
Add Alex's comment on tLog
2020-11-12 13:29:11 -08:00
sfc-gh-tclinkenbeard
4669f837fa
Add uses of makeReference
2020-11-07 22:10:18 -08:00
sfc-gh-tclinkenbeard
0ff1809d25
Remove emplace_back(new ...) pattern
...
This pattern can cause a memory leak due to an exception between
resource allocation and management
2020-11-07 22:09:53 -08:00
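The leak described in the commit above can be sketched with standard-library types. This is an illustration under assumptions: `LogSet` and `makeLogSets` are hypothetical names, and std::unique_ptr with std::make_unique stands in for FoundationDB's own reference-counted pointer and makeReference.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in type, not the FoundationDB class.
struct LogSet {
    int id;
    explicit LogSet(int i) : id(i) {}
};

// Risky form (shown as a comment only):
//   sets.emplace_back(new LogSet(id));
// If the vector has to grow and that allocation throws, the raw pointer
// has not yet been handed to a smart pointer, so the LogSet leaks.

// Safer form: the object is owned by a smart pointer before the vector is touched.
std::vector<std::unique_ptr<LogSet>> makeLogSets(int count) {
    assert(count >= 0);
    std::vector<std::unique_ptr<LogSet>> sets;
    sets.reserve(static_cast<std::size_t>(count));
    for (int id = 0; id < count; ++id) {
        sets.push_back(std::make_unique<LogSet>(id));
    }
    return sets;
}
```

With the safer form, an exception thrown while the vector grows destroys the temporary smart pointer, which frees the object.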
Meng Xu
4788544a6f
Revise comments based on review suggestions
...
Ack. Jingyu and Xin for their suggestions.
2020-11-06 08:51:13 -08:00
Meng Xu
063700e4d6
Add comments and questions to HA and tLog code reading
...
The comments' correctness needs to be confirmed by reviewers.
2020-10-30 12:14:57 -07:00
Lukas Joswiak
53b7721d6c
Add additional trace information
2020-09-04 15:36:47 -07:00
Lukas Joswiak
2a58e775d2
Add original changes
2020-09-04 15:36:47 -07:00
Evan Tschannen
12edadd059
Merge branch 'release-6.3'
...
# Conflicts:
# CMakeLists.txt
# fdbclient/Knobs.cpp
# fdbclient/MasterProxyInterface.h
# fdbrpc/simulator.h
# fdbserver/MasterProxyServer.actor.cpp
# tests/fast/CycleAndLock.txt
# tests/fast/TxnStateStoreCycleTest.txt
# tests/fast/VersionStamp.txt
# tests/slow/ParallelRestoreOldBackupApiCorrectnessAtomicRestore.txt
# tests/slow/ParallelRestoreOldBackupCorrectnessCycle.txt
# versions.target
2020-08-31 19:33:34 -07:00
Evan Tschannen
29eec30183
Merge branch 'release-6.2' into release-6.3
...
# Conflicts:
# CMakeLists.txt
# build/Dockerfile
# build/Dockerfile.devel
# documentation/sphinx/source/downloads.rst
# fdbserver/Knobs.cpp
# fdbserver/LogSystem.h
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WaitFailure.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/fdbserver.vcxproj.filters
# packaging/msi/FDBInstaller.wxs
2020-08-31 01:10:29 -07:00
Evan Tschannen
507c67c930
Added additional information to trace events
2020-08-26 11:42:23 -07:00
Evan Tschannen
28cb5f242c
another fix
2020-08-26 11:01:40 -07:00
Evan Tschannen
e81ccd2dc9
another compiler fix
2020-08-26 10:59:06 -07:00
Evan Tschannen
e531046b53
fix compiler errors
2020-08-26 10:56:21 -07:00
Evan Tschannen
fd1a4304fa
fix: made ConnectionResetInfo reference counted
2020-08-26 10:53:17 -07:00
Evan Tschannen
8ede143941
Track tlog push latencies and reset connections if they are above 500ms
2020-08-18 08:43:14 -07:00
Meng Xu
ef8c1060a2
Merge branch 'master' into mengxu/tmp-merge-6.3
2020-07-13 10:15:56 -07:00
Jingyu Zhou
5cc5d9cf1e
Log peer address whose failure can cause master recovery
...
When a master recovery is caused by a failed tlog, proxy, resolver, or log
router, a trace event now reports which address the master thinks is dead.
2020-07-10 15:57:03 -07:00
negoyal
cf13e00a8f
Merge remote-tracking branch 'origin/release-6.3' into fdb_cache_wo_allocator
2020-06-01 17:38:31 -07:00
negoyal
dd033736ed
Merge branch 'master' into fdb_cache_subfeature2
2020-05-04 17:29:43 -07:00
Evan Tschannen
fd0ee72293
Merge branch 'master' into feature-small-endpoint
2020-04-29 18:43:10 -07:00
Jingyu Zhou
0823091423
Fix backup worker removal races with setting
...
The master waits until all backup workers have been recruited and then sets
them in a batch. However, a backup worker could remove itself before the master
sets it. As a result, the worker is not removed, the oldest backup epoch can't
advance, and the TLog can't be popped.
2020-04-20 11:06:46 -07:00
Evan Tschannen
a442565e13
more work towards shrinking locality
2020-04-18 21:29:38 -07:00
Evan Tschannen
99a58f8ee5
fix compiler errors
2020-04-17 17:47:50 -07:00
Evan Tschannen
07dc2988e7
filter the locality inside of tlog interface to reduce its size
2020-04-17 17:37:14 -07:00
negoyal
a0c8946f31
Merge branch 'master' into fdb_cache_subfeature2
2020-04-02 12:27:04 -07:00
Jingyu Zhou
280bc94738
Do not recruit backup workers with wrong tags
...
In a rare scenario, the master can recruit backup workers with more tags than
the number of log router tags for an epoch. This can be caused by an
unsuccessful recovery, which uses more tags than the next epoch. When
recruiting for the next epoch, if no progress has been made yet, the recruiting
logic will look back at the previous epoch. If the previous epoch has saved past
this epoch's begin version, the current epoch's progress is updated with that
information, which can result in more tags being inserted into this epoch's
recruitment.
2020-03-28 21:19:41 -07:00
negoyal
4772bedc1b
Remove a trace message that seemed to have problems.
2020-03-26 17:11:26 -07:00
negoyal
acaf91ac47
Merge branch 'master' into fdb_cache_subfeature2
2020-03-26 13:33:08 -07:00
negoyal
8abac91033
Fixed a bug in cache server while peeking at a version lower than popped version and added some logging.
2020-03-26 12:39:07 -07:00
Jingyu Zhou
90b40e1d75
Merge branch 'mengxu/new-backup-format-PR-delta' of github.com:xumengpanda/foundationdb into backup-worker-bak
...
Resolve Conflicts:
fdbclient/BackupAgent.actor.h
fdbserver/BackupWorker.actor.cpp
fdbserver/RestoreMaster.actor.cpp
fdbserver/masterserver.actor.cpp
2020-03-23 13:35:33 -07:00
Meng Xu
3f31ebf659
New backup: Revise event name and explain code
2020-03-23 10:55:44 -07:00
Jingyu Zhou
0eacf1cdab
trackTlogRecovery listens on backup worker change events
...
Old TLogs can only be removed when backup workers no longer need them (i.e., the
oldest backup epoch == current epoch). As a result, the core state changes need
to include backup worker changes, which update the oldest backup epoch.
2020-03-20 20:17:32 -07:00
Jingyu Zhou
818072f3cb
Set oldest backup epoch if not recruiting backup workers
...
Since tlog is not kept until backup worker has pulled mutations from it, the
old tlogs can only be displaced after oldest backup epoch equals current epoch.
So if master is not recruiting backup workers, it should set the oldest backup
epoch as the current epoch.
2020-03-20 20:16:43 -07:00
Jingyu Zhou
5b36dcaad5
Fix oldest backup epoch for backup workers
...
The oldest backup epoch is piggybacked in LogSystemConfig from master to
cluster controller and then to all workers. Previously, this epoch was set
to the current master epoch, which was wrong.
2020-03-20 20:15:09 -07:00
Jingyu Zhou
12ed8ad536
Fix backup worker start version when logset start version is lower
...
The start version of tlog set can be smaller than the last epoch's end version.
In this case, set backup worker's start version as last epoch's end version to
avoid overlapping of version ranges among backup workers.
2020-03-20 20:15:08 -07:00
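The rule in the commit above amounts to taking the larger of two versions. A minimal sketch, assuming versions are 64-bit integers; `Version` and `backupStartVersion` are hypothetical names and not the actual FoundationDB code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Version = std::int64_t; // assumption: versions modeled as 64-bit integers

// Hypothetical helper: choose a backup worker's start version.
// If the tlog set starts before the last epoch ended, start at the last
// epoch's end version so that version ranges of backup workers do not overlap.
Version backupStartVersion(Version logSetStart, Version lastEpochEnd) {
    Version start = std::max(logSetStart, lastEpochEnd);
    assert(start >= lastEpochEnd); // never reaches back into the previous epoch
    return start;
}
```

For example, a log set starting at version 100 with a previous epoch ending at 250 yields a start version of 250, while a log set starting at 300 keeps 300.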