Commit Graph

311 Commits

Author SHA1 Message Date
Jon Fu 915e2f6c1c Merge branch 'main' of github.com:apple/foundationdb into jfu-grv-cache 2022-01-20 16:17:20 -05:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to use the ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
   prefix; for now it is kept as "Master", but the knob should allow a
   smooth transition to the "Cluster" prefix, which seems more appropriate.
2022-01-06 12:15:51 -08:00
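The knob-defined trace-event prefix from item 3 above can be pictured with a minimal sketch. The `RecoveryKnobs` struct and the `recoveryEventName` helper are hypothetical names chosen for illustration, not FoundationDB's actual API; only the "Master" and "Cluster" prefixes come from the commit message.

```cpp
#include <string>

// Hypothetical stand-in for a server knob that holds the trace-event prefix.
struct RecoveryKnobs {
    // Kept as "Master" for now, as the commit message states.
    std::string recoveryEventPrefix = "Master";
};

// Hypothetical helper: build a trace-event name from the configured prefix
// and a fixed suffix, so the prefix can change without touching call sites.
std::string recoveryEventName(const RecoveryKnobs& knobs, const std::string& suffix) {
    return knobs.recoveryEventPrefix + suffix;
}
```

Under these assumptions, switching the knob value from "Master" to "Cluster" would rename every event built through the helper in one place.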
Aaron Molitor 30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine,
   which is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design, where the ClusterController recruits the "sequencer"
   process like any other worker process, compared to the current scheme
   where the "sequencer" process gets special treatment. In the newer
   scheme the sequencer is responsible for maintaining/providing the
   "committed version" (as expected).
2. The ClusterController is responsible for recruiting worker processes;
   in the current scheme the sequencer, though it orchestrates the
   recovery state machine, needs to reach out to the ClusterController
   to recruit worker processes, etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; in addition,
necessary updates were made for both functional and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
A.J. Beamon f29f487823
Unify flags (#25)
* Unify flags implementation and change help text in backup.actor.cpp
* Keep LOG_GROUP unchanged
* Convert hyphens to underscores in internal options and the user's input, EXCEPT leading hyphens
* Use a deep copy of the user's input flag to do the match
* Convert _ to - in the Option arrays of backup.actor.cpp
* Convert _ to - in these files:
        TLSConfig.actor.h, fdbcli.actor.cpp, fdbserver.actor.cpp, FileConverter.h, FileConverter.cpp
* Switch to another way of unifying flags: use SO_O_ICASE_HYPHEN_AND_UNDERSCORE to determine whether the conversion is done in function IsEqual
* Rename the config command from SO_O_ICASE_HYPHEN_AND_UNDERSCORE to SO_O_HYPHEN_TO_UNDERSCORE
* Update the comment for SO_O_HYPHEN_TO_UNDERSCORE
* Fix leftover underscores in SOption arrays
* Convert _ to - in several files for commands
* Make FDBService and fdbmonitor backward compatible
* Fix pointer bugs
* Check underscore and hyphen at the same time for --knob_, --locality_ and --test_,
  and fix bugs in fdbmonitor and FDBService
* Simplify the argument-retrieval functions in fdbmonitor and FDBService,
  and fix some documentation in masterserver.actor.cpp
* Convert _ to - for knobs in the setKnob functions
* Convert - to _ in the setKnob functions, since keys in the knob-related maps only contain _
* Rename variables in fdbmonitor and FDBService for clarity

Co-authored-by: Chang Liu <chang.liu@snowflake.com>
2021-12-14 08:44:39 -08:00
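The hyphen/underscore unification described in the commit above can be illustrated with a small sketch. It assumes the simplest reading of the message: hyphens after the leading run are converted to underscores on a copy of the input, and two spellings match when their converted forms agree. `normalizeFlag` and `flagsMatch` are hypothetical helpers written for this illustration; they are not the IsEqual function or the SO_O_HYPHEN_TO_UNDERSCORE option mentioned in the commit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not FoundationDB's code): convert every '-' after the
// leading run of hyphens to '_', so "--knob-foo" and "--knob_foo" agree.
std::string normalizeFlag(const std::string& flag) {
    std::string out = flag; // work on a copy; the caller's input is untouched
    std::size_t i = 0;
    while (i < out.size() && out[i] == '-') {
        ++i; // skip the leading hyphens, which must be preserved
    }
    for (; i < out.size(); ++i) {
        if (out[i] == '-') {
            out[i] = '_';
        }
    }
    return out;
}

// Hypothetical comparison: two spellings match if their normalized forms agree.
bool flagsMatch(const std::string& a, const std::string& b) {
    return normalizeFlag(a) == normalizeFlag(b);
}
```

With this sketch, `--locality-zoneid` and `--locality_zoneid` would be treated as the same flag, while a short option such as `-C` is left unchanged.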
A.J. Beamon 72c5fb183d Fix: avoid updating the master registration while the cstate is written but we are not accepting commits. 2021-11-30 15:44:04 -08:00
Jon Fu 3f24128da4 Merge branch 'master' of github.com:apple/foundationdb into jfu-grv-cache 2021-11-19 14:46:55 -05:00
Lukas Joswiak 74cf64fe0f Sync cluster ID through ServerDBInfo 2021-11-09 12:29:48 -08:00
Lukas Joswiak 30867750b5 Add protection against storage and tlog data deletion when joining a new cluster 2021-11-09 12:29:47 -08:00
A.J. Beamon e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Markus Pilman 424b35de63 verify FLAG_USE_PROVISIONAL_PROXIES on the server 2021-10-09 16:40:24 -06:00
Jon Fu 4c942cc4e3 simplify sim_validation verification to only involve maximum bound 2021-10-06 14:28:04 -04:00
Jon Fu 9a18cc8f41 Change debug time bound update logic to be located in masterserver 2021-10-05 17:31:31 -04:00
Chang Liu 5da864d91c Fix roll trace event issue
Description

Testing
2021-09-24 09:53:32 -07:00
Chang Liu 8427e40cbe Fix roll trace event issue
Description

Testing
2021-09-24 09:53:32 -07:00
Chang Liu 48990058a3 Fix roll trace event issue
Description

Testing
2021-09-24 09:53:32 -07:00
Xiaoge Su abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
FDB Formatster 2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
sfc-gh-tclinkenbeard ceb83f7f5e Make ccInterface a const reference in workerServer 2021-08-14 23:41:39 -07:00
sfc-gh-tclinkenbeard c74047c665 Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings 2021-07-28 11:51:02 -07:00
Steve Atherton 507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard 3442ebd3b7 Fix more -Wreorder-ctor warnings across many files 2021-07-24 11:20:51 -07:00
sfc-gh-tclinkenbeard b9a22a61ef Fix many -Wreorder-ctor warnings 2021-07-23 17:33:18 -07:00
sfc-gh-tclinkenbeard 6f81155784 Merge remote-tracking branch 'origin/master' into const-serverdbinfo 2021-07-20 10:18:40 -07:00
Steve Atherton f596a81073 Rename ::TRUE and ::FALSE in BooleanParams to ::True and ::False so as to not conflict with the TRUE and FALSE macros provided by the Windows and MacOS SDKs. 2021-07-17 00:11:40 -07:00
sfc-gh-tclinkenbeard 6c1d913ab8 Prevent masterServer from modifying db 2021-07-11 21:11:21 -07:00
sfc-gh-tclinkenbeard 79ff07a071 Added *BOOLEAN_PARAM macros to enforce documentation of boolean parameters 2021-07-02 15:04:42 -07:00
Neethu Haneesha Bingi 73752f441b exclude locality:clang-format, ranged loops, documentation, tracking addStoragesever for exclusion. 2021-06-23 18:03:27 -07:00
Zhe Wang ab1404b201
Merge pull request #5015 from kakaiu/zhewang-add-traceevent-in-recovery
Add trace event to recovery_transaction step in recovery
2021-06-21 13:00:46 -05:00
Zhe Wang b685691629 add trace event to recovery_transaction step in recovery 2021-06-19 16:04:20 -05:00
sfc-gh-tclinkenbeard dee9fec300 Rename coordination files to fix upgrades 2021-06-18 14:16:49 -07:00
RenxuanW 4f6b983bfb Address comments. 2021-05-27 12:58:47 -07:00
RenxuanW caeceb932e Improve logging on the current view of the database configuration that the cluster controller is using. 2021-05-24 09:37:57 -07:00
Lukas Joswiak e7d7b39f12
Merge pull request #4744 from sfc-gh-tclinkenbeard/add-rangeresult-type-alias
Create RangeResult type alias
2021-05-03 16:29:33 -07:00
sfc-gh-tclinkenbeard 0a9289a580 Merge remote-tracking branch 'origin/master' into add-master-logging 2021-05-03 14:49:08 -07:00
sfc-gh-tclinkenbeard 5c2d7b6080 Create RangeResult type alias 2021-05-03 13:14:16 -07:00
Evan Tschannen 65fcf4014e Fix: simulation could still stall writes for 10 seconds even when speedUpSimulation was on
Fix: disable connection failures in simulation when there are too many generations outstanding
2021-04-28 12:41:48 -07:00
A.J. Beamon feede1d2f6 Fix line length of test macro + comments to be within the 120 character limit 2021-04-13 10:48:52 -07:00
sfc-gh-tclinkenbeard 18120d6b1a Add MasterMetrics periodic logging 2021-04-06 22:14:02 -07:00
Evan Tschannen 0ea513b503 It is not safe to call expectedLogSets() with a potentially newer configuration than the one from the recovery 2021-03-23 13:21:48 -07:00
Jingyu Zhou b0fa735a27 Fix a race between configuration change and recovery
The problem is described in Issue #4376, where a configuration change can
occur before the database is fully recovered, thus triggering the assertion
failure.

Because of the configuration change, the master needs to do the recovery. So
the fix is to trigger the recovery when this happens.
2021-03-20 17:35:49 -07:00
Vishesh Yadav d7252da951 clang-format: Fix the TEST() macros which require comments in line 2021-03-10 16:50:53 -08:00
FDB Formatster df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Andrew Noyes 79cec09255 Apply clang-tidy's performance-inefficient-vector-operation fix
I ran this command in my build directory after compiling with
OPEN_FOR_IDE. It took a few small tweaks to get it to compile, which is
outside the scope of this commit.

    $ python run-clang-tidy.py -j $(nproc) -checks='-*,performance-inefficient-vector-operation' -fix
2021-03-04 03:58:25 +00:00
Evan Tschannen 346a4e3ecd Merge branch 'release-6.3'
# Conflicts:
#	fdbcli/fdbcli.actor.cpp
#	fdbrpc/LoadBalance.actor.h
#	fdbrpc/MultiInterface.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/masterserver.actor.cpp
2021-03-01 18:52:06 -08:00
Meng Xu 471a3489fb Resolve review comments and add trace fields to MasterRecoveryState 2021-02-17 14:44:14 -08:00
sfc-gh-tclinkenbeard 4669f837fa Add uses of makeReference 2020-11-07 22:10:18 -08:00
Jon Fu 51db9a7e0a add static method to access backup pause key instead of constructing it manually 2020-11-06 14:03:29 -05:00
Jon Fu b90ad11483 add some more trace and comments 2020-11-05 14:23:08 -05:00
Jon Fu bda72d9a3d first draft at changing snapshot backup behaviour 2020-11-02 17:12:30 -05:00