Commit Graph

386 Commits

Author SHA1 Message Date
Renxuan Wang f7eb66441d Try eliminating warnings in macOS and Windows CI builds.
macOS warnings are format warnings, e.g., `format specifies type 'long' but the argument has type 'Version' (aka 'long long')`.
Windows warnings are `ACTOR does not contain a wait() statement`.
2022-02-25 19:06:57 -08:00
Xiaoxi Wang 6dc5921575
createdTime based storage wiggler (#6219)
* add storagemetadata

* add StorageWiggler;

* fix serverMetadataKey bug

* add metadata tracker in storage tracker

* finish StorageWiggler

* update next storage ID

* change pid to server id

* write metadata when seed SS

* add status json fields

* remove pid based ppw iteration

* fix time expression

* fix tss metadata nonexistence; fix transaction retry when retrieving metadata

* fix checkMetadata bug when store type is wrong

* fix remove storage status json

* format code

* refactor updateNextWigglingStoragePID

* separate storage metadata tracker and store type tracker

* rename pid

* wiggler stats

* fix completion between waitServerListChange and storageRecruiter

* solve review comments

* rename system key

* fix database lock timeout by adding lock_aware

* format code

* status json

* resolve code format/naming comments

* delete expireNow; change PerpetualStorageWiggleID's value to KeyBackedObjectMap<UID, StorageWiggleValue>

* fix omit start round

* format code

* status json reset

* solve status json format

* improve status json latency; replace binarywriter/reader with objectwriter/reader; refactor storagewigglerstats transactions

* status timestamp
2022-02-04 15:04:30 -08:00
Ata E Husain Bohra 87ee4cf958 Add new FDB EncryptKeyProxy role
Major changes include:

1. Add a new FDB role, EncryptKeyProxy. The role is
   responsible for exposing APIs to fetch encryption keys by interacting
   with an external Encryption KeyManager interface.
2. The process is an FDB singleton process following similar recruitment
   rules as other singleton processes in the system.
3. Code to recruit the worker process; given the encryption keys are
   needed during recovery (to decode TLog records), for now the process
   is co-located in the same datacenter as the ClusterController.
4. Skeleton process actor code; more functionality will be added in
   subsequent PRs.

NOTE: The code is protected under a SERVER_KNOB with the default
      value as 'false' for now.
2022-01-25 17:38:27 -08:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine"" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to use the ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
   prefix; for now it remains "Master", but the knob should allow a
   smooth transition to the "Cluster" prefix, which seems more appropriate.
2022-01-06 12:15:51 -08:00
sfc-gh-tclinkenbeard ec64890ac1 Remove some usages of PRId64 by using fmt library 2021-11-30 23:35:36 -08:00
Steve Atherton 035e0d6e52
Merge branch 'master' into bit-flipping-workload 2021-11-16 14:42:22 -08:00
Markus Pilman 7df059570a Make sure unit tests are run often enough 2021-11-08 15:43:32 -07:00
negoyal 1e7338b6c3 Merge branch 'master' into bit-flipping-workload 2021-10-28 14:24:49 -07:00
Josh Slocum 0ff8ddc2b6 Merge branch 'master' into blob_full_clean 2021-10-25 13:38:48 -05:00
A.J. Beamon e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Suraj Gupta d46951ccb7 Collapse if's into one. 2021-10-14 18:47:05 -04:00
Suraj Gupta 3ccc24136c Gate BM/BW in Status and timeout after waiting. 2021-10-14 18:39:55 -04:00
Josh Slocum f3c44c568f fixing merge conflicts 2021-10-13 16:26:44 -05:00
Josh Slocum 5f0ec0612a Merge branch 'feature-range-feed' into blob_full 2021-10-13 15:44:35 -05:00
negoyal f913dfed97 Merge branch 'master' into bit-flipping-workload 2021-10-11 16:34:57 -07:00
Suraj Gupta 4d54669ccd Recruit the blob workers via blob manager.
In this PR, the blob manager now recruits blob workers
(via communication with the cluster controller). Blob workers
are onboarded as blob worker processes enter the cluster.
2021-10-04 11:07:08 -04:00
Suraj Gupta 5fa6c687d6 Add blob manager as a singleton. 2021-09-23 10:45:37 -04:00
Xiaoge Su abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
negoyal 3b34423248 Merge branch 'master' into bit-flipping-workload 2021-08-31 12:14:51 -07:00
sfc-gh-tclinkenbeard 42a45ebfc4 Temporarily comment out configuration database code breaking simulation tests 2021-08-09 10:04:35 -07:00
negoyal 9e7197faba Bunch of changes based on review comments and discussions. 2021-07-30 01:32:43 -07:00
sfc-gh-tclinkenbeard 79ff07a071 Added *BOOLEAN_PARAM macros to enforce documentation of boolean parameters 2021-07-02 15:04:42 -07:00
Neethu Haneesha Bingi 73752f441b exclude locality: clang-format, ranged loops, documentation, tracking addStorageserver for exclusion. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi 62355571d0 exclude servers based on locality match 2021-06-23 18:03:27 -07:00
sfc-gh-tclinkenbeard 41c790b299 Merge remote-tracking branch 'origin/master' into config-db 2021-06-10 22:31:23 -07:00
sfc-gh-tclinkenbeard 371a38e6e5 Merge remote-tracking branch 'origin/master' into remove-extra-copies 2021-06-07 10:26:06 -07:00
Trevor Clinkenbeard 866f536983
Merge pull request #4888 from sfc-gh-tclinkenbeard/remove-fdbserver-includes
Remove fdbserver includes from fdbclient
2021-06-07 10:22:13 -07:00
sfc-gh-tclinkenbeard f10dd70c37 Remove configuration_database from status when disabled 2021-06-06 08:51:18 -07:00
sfc-gh-tclinkenbeard a775f92fca Merge remote-tracking branch 'origin/master' into config-db 2021-06-01 15:39:34 -07:00
sfc-gh-tclinkenbeard 1a40c60674 Move RestoreWorkerInterface into fdbserver 2021-05-30 15:02:33 -07:00
sfc-gh-tclinkenbeard 594e8944ae Move RestoreWorkerInterface into fdbserver 2021-05-30 11:51:47 -07:00
Josh Slocum 4257ac2b4d More TSS Changes/Fixes 2021-05-25 20:37:48 +00:00
Josh Slocum ce82c9653e Testing Storage Server implementation 2021-05-25 20:28:50 +00:00
Sreenath Bodagala 2fa80e7912 Address review comments 2021-05-19 22:04:43 +00:00
Sreenath Bodagala 3066e856c9 Expose "bounce impact" and Storage Server "version catch-up rate" metrics
Changes:

storageserver.actor.cpp: Use counters to capture (a) how fast a storage
server is catching up in versions and (b) the version fetch frequency.

Status.actor.cpp: Report the captured counter metrics as part of storage
metrics.
2021-05-19 16:08:32 +00:00
sfc-gh-tclinkenbeard fcc6efd3b1 Add .cluster.configuration status json field 2021-05-18 10:47:16 -07:00
Sreenath Bodagala d8cad8efca Report bounce impact info as part of cluster JSON object. 2021-05-13 16:36:57 +00:00
Sreenath Bodagala 160293bd54 Report bounce impact in fdbcli status
Changes:

Schemas.cpp: Extend the JSON schema to report whether the cluster is
bounceable and, if not, report the reason why it is not bounceable.

Status.actor.cpp: Extend recoveryStateStatusFetcher() to populate the
bounce-related field(s).

mr-status-json-schemas.rst.inc: Update the schema to reflect the change
made in Schemas.cpp.

release-notes-700.rst: Add a note about the new status fields in "Status"
section.
2021-05-13 14:28:06 +00:00
sfc-gh-tclinkenbeard f28ac955c3 Remove unnecessary temporary objects while growing objects of type std::vector<std::pair<A, B>> 2021-05-10 16:32:50 -07:00
Sreenath Bodagala 336a9bff66 Provide "time since last full recovery" in fdbcli status
Changes:

Schemas.cpp: Extend the JSON schema to include a new field that reports
the number of seconds since the last full recovery.

Status.actor.cpp: Extend recoveryStateStatusFetcher() to populate the
new field that has been added to Schemas.cpp.

mr-status-json-schemas.rst.inc: Update the schema to reflect the change
made in Schemas.cpp.
2021-05-05 19:43:44 +00:00
sfc-gh-tclinkenbeard 5c2d7b6080 Create RangeResult type alias 2021-05-03 13:14:16 -07:00
sbodagala f7e28c50d4
Merge pull request #4735 from sbodagala/master
Expose CommitBatchingWindowSize metric to fdbcli status
2021-04-30 15:52:29 -04:00
Sreenath Bodagala f151df3203 Expose CommitBatchingWindowSize metric to fdbcli status
Changes:

Schemas.cpp:
- Extend JSON schema to include aggregated information about
CommitBatchingWindowSize samples.

Status.actor.cpp:
- Extend getStorageServersAndMetrics() to gather metrics about
CommitBatchingWindowSize.
- Extend CommitProxy AddRole() to populate the status-JSON object
with the metrics about CommitBatchingWindowSize.
2021-04-29 22:11:09 +00:00
Dan Lambright 715c98572c bit more documentation 2021-04-21 10:48:35 -04:00
Dan Lambright a95e845200 document changes 2021-04-06 14:56:58 -04:00
Dan Lambright 7faca702d2 Fix bug in writes to json objects. 2021-04-06 13:05:09 -04:00
Dan Lambright cabf192f57 Respond to review comments 3/23 2021-04-06 13:05:09 -04:00
Dan Lambright 48a475366c Log latency metrics for batch GRV requests 2021-04-06 13:05:09 -04:00
Zhe Wu c3aff4340f Address merge conflicts 2021-03-26 11:36:02 -07:00
Evan Tschannen bff7b9dcae
Merge pull request #4454 from vishesh/task/tlog-rf-old-tlog
status: Ignore LogSets with no tLogs when computing FT
2021-03-22 11:18:00 -07:00