Ata E Husain Bohra
936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" ( #6191 )
...
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""
Major changes include:
1. Re-revert the Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.
2. Update Status.actor to use the ClusterController interface to track
recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
prefix; for now it stays "Master", but the knob should allow a
smooth transition to the "Cluster" prefix, which seems more appropriate.
2022-01-06 12:15:51 -08:00
Aaron Molitor
30b05b469c
Revert "Refactor: ClusterController driving cluster-recovery state machine"
...
This reverts commit dfe9d184ff.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra
dfe9d184ff
Refactor: ClusterController driving cluster-recovery state machine
...
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine,
which is responsible for recruiting all other processes as well as
restoring the cluster state.
This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.
Advantages of the scheme could be:
1. Simplified design: the ClusterController recruits the "sequencer"
process like any other worker process, whereas in the current scheme
the "sequencer" process gets special treatment. In the new scheme the
sequencer is responsible only for maintaining/providing the
"committed version" (as expected).
2. The ClusterController is already responsible for recruiting worker
processes. In the current scheme the sequencer, although it orchestrates
the recovery state machine, still needs to reach out to the
ClusterController to recruit worker processes etc.
NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; in addition,
necessary updates were made for both functionality and performance
reasons.
Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Lukas Joswiak
3988b11fd6
Cleanup
2021-11-09 12:29:48 -08:00
Lukas Joswiak
3e2c65bb11
Allow tlog to join another cluster but retain its data
2021-11-09 12:29:48 -08:00
Lukas Joswiak
30867750b5
Add protection against storage and tlog data deletion when joining a new cluster
2021-11-09 12:29:47 -08:00
Xiaoge Su
abf73047ca
Enforce std:: specifier rather than using namespace
2021-09-16 19:40:28 -07:00
Xiaoge Su
c32c3b6ec4
fixup! Reformat the code per github's requirement
2021-09-12 14:17:19 -07:00
Xiaoge Su
40648dbb31
fixup! Update code per comment
...
Also fix the issue that TagPartitionedLogSystem.actor.cpp should include
TagPartitionedLogSystem.actor.h
2021-09-12 14:17:19 -07:00
Xiaoge Su
ecca4edeb4
Create TagPartitionedLogSystem.actor.h
...
TagPartitionedLogSystem.actor.h contains the struct of TagPartitionedLogSystem.
2021-09-12 14:17:19 -07:00
yao-xiao-github
8609b45354
Add histograms to CommitProxyServer. ( #5299 )
2021-08-05 09:17:37 -07:00
Andrew Noyes
353efe7db2
Merge pull request #5264 from sfc-gh-tclinkenbeard/fix-more-clang-warnings
...
Enable more warnings for `clang`
2021-07-29 15:43:54 -07:00
sfc-gh-tclinkenbeard
94a65865d9
Merge remote-tracking branch 'origin/master' into fix-clang-warnings
2021-07-28 12:29:27 -07:00
sfc-gh-tclinkenbeard
c74047c665
Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings
2021-07-28 11:51:02 -07:00
Steve Atherton
507c1f11e3
Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use.
2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard
64dc1dc185
Fix -Wreorder-ctor warnings in NativeAPI.actor.cpp and several other files
2021-07-24 00:23:06 -07:00
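For readers unfamiliar with the `-Wreorder-ctor` warning named in the commit above: members are initialized in declaration order, regardless of the order written in the constructor's initializer list, and clang warns when the two orders disagree. The sketch below is illustrative only; `Range` and its fields are hypothetical names and are not taken from NativeAPI.actor.cpp.

```cpp
#include <cassert>

// Hypothetical type for illustration; not from the FoundationDB sources.
struct Range {
    int begin;
    int length;
    int end; // declared last, so it is initialized last

    // The initializer list follows declaration order (begin, length, end),
    // so `end` may safely read `begin` and `length`. Listing `end` first
    // would not change the real initialization order, but it would trigger
    // -Wreorder-ctor and mislead the reader.
    Range(int b, int len) : begin(b), length(len), end(begin + length) {}
};

// Small helper so the behavior can be checked without a main().
inline int rangeEnd(int b, int len) {
    return Range(b, len).end;
}
```

Calling `rangeEnd(2, 3)` yields 5 because `begin` and `length` are already initialized when `end` is computed.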
sfc-gh-tclinkenbeard
e62e6503ac
Fix most delete-non-virtual-dtor clang warnings
2021-07-21 23:32:44 -07:00
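As background for the delete-non-virtual-dtor warning named in the commit above: deleting a derived object through a base-class pointer is undefined behavior unless the base destructor is virtual. A minimal sketch of the usual fix follows; `Base`, `Derived`, and the counter are hypothetical and not taken from the FoundationDB sources, and the commit may well have used other remedies (such as marking classes `final`).

```cpp
#include <cassert>

// Hypothetical counter used only to observe that the derived destructor ran.
static int derivedDestroyed = 0;

struct Base {
    // Virtual destructor: `delete` through a Base* dispatches to the most
    // derived destructor. Without `virtual`, clang emits
    // -Wdelete-non-virtual-dtor and the behavior is undefined.
    virtual ~Base() = default;
    virtual int value() const { return 0; }
};

struct Derived final : Base {
    ~Derived() override { ++derivedDestroyed; }
    int value() const override { return 1; }
};

// Destroys a Derived through a Base pointer and reports how many times the
// derived destructor ran.
inline int destroyThroughBase() {
    derivedDestroyed = 0;
    Base* p = new Derived();
    delete p;
    return derivedDestroyed;
}
```

With the virtual destructor in place, `destroyThroughBase()` returns 1.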
Lukas Joswiak
5338251946
Fix invalid read
2021-07-13 10:44:37 -07:00
Jingyu Zhou
c700feaa6e
Address Dan's comments
2021-06-28 11:16:35 -07:00
sfc-gh-tclinkenbeard
371a38e6e5
Merge remote-tracking branch 'origin/master' into remove-extra-copies
2021-06-07 10:26:06 -07:00
sfc-gh-tclinkenbeard
594e8944ae
Move RestoreWorkerInterface into fdbserver
2021-05-30 11:51:47 -07:00
sfc-gh-tclinkenbeard
f28ac955c3
Remove unnecessary temporary objects while growing objects of type std::vector<std::pair<A, B>>
2021-05-10 16:32:50 -07:00
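The commit above concerns temporaries created while appending to a `std::vector<std::pair<A, B>>`. A minimal sketch of the general idea, assuming the change amounts to preferring `emplace_back` over `push_back(std::make_pair(...))` (the actual diff is not shown here, and `buildPairs` is a hypothetical name):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper contrasting the two ways of growing the vector.
inline std::vector<std::pair<std::string, int>> buildPairs() {
    std::vector<std::pair<std::string, int>> out;

    // Builds a temporary pair first, then moves it into the vector.
    out.push_back(std::make_pair(std::string("a"), 1));

    // Forwards the two arguments to the pair's constructor so the element
    // is constructed in place, with no temporary pair.
    out.emplace_back("b", 2);

    return out;
}
```

Both calls leave the vector in the same state; the difference is only in how many intermediate objects are created along the way.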
A.J. Beamon
25c4880ebe
Merge branch 'release-6.3' into merge-release-6.3-into-master (temporarily discard all changes to BackupContainer.actor.cpp)
...
# Conflicts:
# fdbclient/BackupContainer.actor.cpp
# fdbserver/Knobs.h
2021-03-15 16:41:04 -07:00
FDB Formatster
df90cc89de
apply clang-format to *.c, *.cpp, *.h, *.hpp files
2021-03-10 10:18:07 -08:00
Vishesh Yadav
3fbe7e69fa
Add TraceEvent to warn about tLogs that have not joined after a timeout
...
During recovery, wait for TLOG_SLOW_REJOIN_WARN_TIMEOUT_SECS and
log the tLogs that have not joined yet.
2021-03-09 18:57:46 -08:00
FDB Formatster
8a8c488ede
apply clang-format to *.c, *.cpp, *.h, *.hpp files
2021-03-05 18:13:38 -06:00
Andrew Noyes
79cec09255
Apply clang-tidy's performance-inefficient-vector-operation fix
...
I ran this command in my build directory after compiling with
OPEN_FOR_IDE. It took a few small tweaks to get it to compile, which is
outside the scope of this commit.
$ python run-clang-tidy.py -j $(nproc) -checks='-*,performance-inefficient-vector-operation' -fix
2021-03-04 03:58:25 +00:00
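For context on the clang-tidy check named in the commit above: `performance-inefficient-vector-operation` flags loops that call `push_back` or `emplace_back` on a vector whose final size is known up front, and its fix-it adds a `reserve` call before the loop. A minimal sketch of the resulting pattern (`squares` is a hypothetical function, not code from the commit):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example of the shape the check rewrites.
inline std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    // The line the fix-it inserts: one allocation up front instead of
    // repeated reallocations as the vector grows inside the loop.
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}
```

The result is identical with or without `reserve`; only the number of reallocations changes.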
Evan Tschannen
346a4e3ecd
Merge branch 'release-6.3'
...
# Conflicts:
# fdbcli/fdbcli.actor.cpp
# fdbrpc/LoadBalance.actor.h
# fdbrpc/MultiInterface.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/masterserver.actor.cpp
2021-03-01 18:52:06 -08:00
Meng Xu
0c28e1d640
Add comment to peekLogRouter in TagPartitionedLogSystem
2021-02-22 16:20:50 -08:00
Meng Xu
ef0bf2728e
Merge branch 'release-6.3' into mengxu/ha-code
...
Resolve Conflicts:
fdbserver/LogRouter.actor.cpp: Only conflicts at comments
2021-02-19 21:47:09 -08:00
Meng Xu
33eb1de00e
Add some comments to the log system
...
and resolve review comments by deleting my questions.
2021-02-19 21:44:13 -08:00
Meng Xu
471a3489fb
Resolve review comments and add trace fields to MasterRecoveryState
2021-02-17 14:44:14 -08:00
Meng Xu
9122be4d81
Add comments to HA code and loadBalance code
2021-02-10 13:51:36 -08:00
sfc-gh-tclinkenbeard
dd3669886e
Improve IPeekCursor and ILogSystem method signatures
2020-12-08 09:09:31 -08:00
Andrew Noyes
877997632d
Merge branch 'release-6.3' into anoyes/merge-release-6.3-master
...
Include conflict markers for review purposes
2020-12-04 01:38:07 +00:00
Andrew Noyes
dc2bac5670
Resolve conflicts
2020-11-24 19:09:42 +00:00
Andrew Noyes
1f541f02be
Merge branch 'anoyes/merge-6.2-to-6.3' into anoyes/release-6.3-merge
...
Merge, leaving conflict markers for now
2020-11-24 16:55:34 +00:00
Meng Xu
046a6e8427
Add Alex's comment on tLog
2020-11-12 13:29:11 -08:00
sfc-gh-tclinkenbeard
4669f837fa
Add uses of makeReference
2020-11-07 22:10:18 -08:00
sfc-gh-tclinkenbeard
0ff1809d25
Remove emplace_back(new ...) pattern
...
This pattern can cause a memory leak due to an exception between
resource allocation and management
2020-11-07 22:09:53 -08:00
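The leak described in the commit above can be sketched as follows. This is an illustration under stated assumptions: `std::unique_ptr` and `std::make_unique` stand in for the codebase's own reference type and the `makeReference` helper mentioned in the neighboring commit, and `Widget`, `addRisky`, and `addSafe` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical payload type for illustration.
struct Widget {
    int id;
    explicit Widget(int id) : id(id) {}
};

// Risky: `new` runs first and yields a raw pointer. If the vector then has
// to grow and that allocation throws, no smart pointer owns the Widget yet,
// so it leaks.
inline void addRisky(std::vector<std::unique_ptr<Widget>>& v, int id) {
    v.emplace_back(new Widget(id));
}

// Safer: ownership is established before the vector is touched, so an
// exception during growth destroys the Widget instead of leaking it.
inline void addSafe(std::vector<std::unique_ptr<Widget>>& v, int id) {
    v.push_back(std::make_unique<Widget>(id));
}
```

In the non-throwing case both functions behave the same; they differ only in what happens when growing the vector fails.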
Meng Xu
4788544a6f
Revise comments based on review suggestions
...
Ack. Jingyu and Xin for their suggestions.
2020-11-06 08:51:13 -08:00
Meng Xu
063700e4d6
Add comments and questions to HA and tLog code reading
...
The comments' correctness needs to be confirmed by reviewers.
2020-10-30 12:14:57 -07:00
Lukas Joswiak
53b7721d6c
Add additional trace information
2020-09-04 15:36:47 -07:00
Lukas Joswiak
2a58e775d2
Add original changes
2020-09-04 15:36:47 -07:00
Evan Tschannen
12edadd059
Merge branch 'release-6.3'
...
# Conflicts:
# CMakeLists.txt
# fdbclient/Knobs.cpp
# fdbclient/MasterProxyInterface.h
# fdbrpc/simulator.h
# fdbserver/MasterProxyServer.actor.cpp
# tests/fast/CycleAndLock.txt
# tests/fast/TxnStateStoreCycleTest.txt
# tests/fast/VersionStamp.txt
# tests/slow/ParallelRestoreOldBackupApiCorrectnessAtomicRestore.txt
# tests/slow/ParallelRestoreOldBackupCorrectnessCycle.txt
# versions.target
2020-08-31 19:33:34 -07:00
Evan Tschannen
29eec30183
Merge branch 'release-6.2' into release-6.3
...
# Conflicts:
# CMakeLists.txt
# build/Dockerfile
# build/Dockerfile.devel
# documentation/sphinx/source/downloads.rst
# fdbserver/Knobs.cpp
# fdbserver/LogSystem.h
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WaitFailure.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/fdbserver.vcxproj.filters
# packaging/msi/FDBInstaller.wxs
2020-08-31 01:10:29 -07:00
Evan Tschannen
507c67c930
Added additional information to trace events
2020-08-26 11:42:23 -07:00
Evan Tschannen
28cb5f242c
another fix
2020-08-26 11:01:40 -07:00
Evan Tschannen
e81ccd2dc9
another compiler fix
2020-08-26 10:59:06 -07:00
Evan Tschannen
e531046b53
fix compiler errors
2020-08-26 10:56:21 -07:00