173 Commits

Author SHA1 Message Date
Xiaoxi Wang 6dc5921575
createdTime based storage wiggler (#6219)
* add storagemetadata

* add StorageWiggler;

* fix serverMetadataKey bug

* add metadata tracker in storage tracker

* finish StorageWiggler

* update next storage ID

* change pid to server id

* write metadata when seed SS

* add status json fields

* remove pid based ppw iteration

* fix time expression

* fix tss metadata nonexistence; fix transaction retry when retrieving metadata

* fix checkMetadata bug when store type is wrong

* fix remove storage status json

* format code

* refactor updateNextWigglingStoragePID

* separate storage metadata tracker and store type tracker

* rename pid

* wiggler stats

* fix completion between waitServerListChange and storageRecruiter

* solve review comments

* rename system key

* fix database lock timeout by adding lock_aware

* format code

* status json

* resolve code format/naming comments

* delete expireNow; change PerpetualStorageWiggleID's value to KeyBackedObjectMap<UID, StorageWiggleValue>

* fix omit start round

* format code

* status json reset

* solve status json format

* improve status json latency; replace binarywriter/reader with objectwriter/reader; refactor storagewigglerstats transactions

* status timestamp
2022-02-04 15:04:30 -08:00
Ata E Husain Bohra 87ee4cf958 Add new FDB EncryptKeyProxy role
Major changes include:

1. Add a new FDB role, EncryptKeyProxy. The role is responsible
   for exposing APIs to fetch encryption keys by interacting with
   an external Encryption KeyManager interface.
2. The process is an FDB singleton process following similar recruitment
   rules as other singleton processes in the system.
3. Code to recruit the worker process; given the encryption keys are
   needed during recovery (decode TLog records), for now the process
   is co-located in the same datacenter as ClusterController.
4. Skeleton process actor code; more functionality will be added in
   subsequent PRs.

NOTE: The code is protected under a SERVER_KNOB with the default
      value as 'false' for now.
2022-01-25 17:38:27 -08:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine"" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to use the ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define "cluster recovery trace event"
   prefix; for now keeping it as "Master", however, it should allow
   smooth transition to "Cluster" prefix as it seems more appropriate.
2022-01-06 12:15:51 -08:00
Steve Atherton c53f5aa110 Renamed redwood to redwood-1-experimental and file extension to .redwood-v1. 2021-11-16 02:15:22 -08:00
Vaidas Gasiunas 40da5a80f9 Merge remote-tracking branch 'apple/master' into multi-version-client-2 2021-10-26 19:29:10 +02:00
Josh Slocum 5f0ec0612a Merge branch 'feature-range-feed' into blob_full 2021-10-13 15:44:35 -05:00
Vaidas Gasiunas 61fb967484 MVC2.0: Add a clientlib metadata attribute for checksum algorithm 2021-10-07 18:06:22 +02:00
Vaidas Gasiunas 5360abb238 MVC2.0: Operation to list uploaded libraries with various filters;
Introducing constants for attribute names and platform values
2021-10-06 18:01:46 +02:00
Vaidas Gasiunas cda0a5f931 Operation to upload client library binary in to system keyspace 2021-10-06 18:01:46 +02:00
Neethu Haneesha Bingi 3e79299898 Locality filter support to perpetual storage wiggler feature. 2021-09-30 10:00:33 -07:00
Suraj Gupta a4bcd3919d Add exclusive process class for Blob Worker.
Also introduces a specific machine in the simulated cluster
to test blob worker (similar to what's done for storage cache).
2021-09-23 16:54:44 -04:00
Suraj Gupta 5fa6c687d6 Add blob manager as a singleton. 2021-09-23 10:45:37 -04:00
Josh Slocum 9992a7b33f Added StorageMigrationType and cli commands 2021-09-14 09:55:41 -05:00
Neethu Haneesha Bingi 66f2518405 exclude to work with any locality data match. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi cbe714acd0 Status json schema update, includelocalities back for consistency check, review comments. 2021-06-23 18:03:27 -07:00
Xiaoxi Wang 454f9e9c89 update json schemas 2021-06-17 20:20:39 +00:00
Daniel Smith aa37c7dcec Add proxies back to schema 2021-06-08 14:12:33 -04:00
Josh Slocum ce82c9653e Testing Storage Server implementation 2021-05-25 20:28:50 +00:00
Markus Pilman e1254d38a0
Merge pull request #4838 from sfc-gh-xwang/ppwiggle
perpetual storage wiggling command line support
2021-05-24 14:48:00 -06:00
Xiaoxi Wang 93c809764f fix Schema check error 2021-05-19 23:52:16 +00:00
Sreenath Bodagala 2fa80e7912 Address review comments 2021-05-19 22:04:43 +00:00
Sreenath Bodagala 622f43474a Expose "bounce impact" and Storage Server "version catch-up rate" metrics
Changes:

Schemas.cpp: Extend the JSON schema to report the new metrics that have
been added.

mr-status-json-schemas.rst.inc: Update the schema to reflect the changes
made to the JSON schema.

release-notes-700.rst: Add a note about the new metrics in "Status"
section.
2021-05-19 19:54:49 +00:00
Sreenath Bodagala d8cad8efca Report bounce impact info as part of cluster JSON object. 2021-05-13 16:36:57 +00:00
Sreenath Bodagala 160293bd54 Report bounce impact in fdbcli status
Changes:

Schemas.cpp: Extend the JSON schema to report whether the cluster is
bounceable and if not, report the reason for why it is not bounceable.

Status.actor.cpp: Extend recoveryStateStatusFetcher() to populate the
bounce related field(s).

mr-status-json-schemas.rst.inc: Update the schema to reflect the change
made in Schemas.cpp.

release-notes-700.rst: Add a note about the new status fields in "Status"
section.
2021-05-13 14:28:06 +00:00
Sreenath Bodagala 336a9bff66 Provide "time since last full recovery" in fdbcli status
Changes:

Schemas.cpp: Extend the JSON schema to include a new field that reports
the number of seconds since last full recovery.

Status.actor.cpp: Extend recoveryStateStatusFetcher() to populate the
new field that has been added to Schemas.cpp.

mr-status-json-schemas.rst.inc: Update the schema to reflect the change
made in Schemas.cpp.
2021-05-05 19:43:44 +00:00
sbodagala f7e28c50d4
Merge pull request #4735 from sbodagala/master
Expose CommitBatchingWindowSize metric to fdbcli status
2021-04-30 15:52:29 -04:00
Sreenath Bodagala f151df3203 Expose CommitBatchingWindowSize metric to fdbcli status
Changes:

Schemas.cpp:
- Extend JSON schema to include aggregated information about
CommitBatchingWindowSize samples.

Status.actor.cpp:
- Extend getStorageServersAndMetrics() to gather metrics about
CommitBatchingWindowSize.
- Extend CommitProxy AddRole() to populate the status-JSON object
with the metrics about CommitBatchingWindowSize.
2021-04-29 22:11:09 +00:00
Markus Pilman 340f012e1a
Merge pull request #4695 from sfc-gh-etschannen/fix-rewrite-bme
rewrote tlog recruitment logic so that it is deterministic
2021-04-27 10:19:25 -06:00
Evan Tschannen f1559a2203 use the stateless process class instead of master or resolution in simulation because it is the recommended process class, and the others are not deterministic when recruited in a constrained process situation 2021-04-26 09:49:26 -07:00
Dan Lambright cabf192f57 Respond to review comments 3/23 2021-04-06 13:05:09 -04:00
Dan Lambright 48a475366c Log latency metrics for batch GRV requests 2021-04-06 13:05:09 -04:00
Hao Fu fb9632297e Add txnRejectedForQueuedTooLong in ProxyStats 2021-02-12 13:04:58 -08:00
A.J. Beamon aaf0a9aa7b Merge branch 'release-6.3' into merge-release-6.3-into-master
# Conflicts:
#	build/docker-compose.yaml
#	cmake/ConfigureCompiler.cmake
#	fdbclient/FileBackupAgent.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/IAsyncFile.h
#	fdbrpc/IRateControl.h
#	fdbrpc/simulator.h
#	fdbserver/KeyValueStoreSQLite.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbservice/ServiceBase.cpp
2021-02-08 12:58:34 -08:00
A.J. Beamon 67e783acf8 Merge branch 'release-6.2' into merge-release-6.2-into-release-6.3
# Conflicts:
#	cmake/CompileBoost.cmake
#	cmake/FDBComponents.cmake
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/simulator.h
#	fdbserver/KeyValueStoreSQLite.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/storageserver.actor.cpp
#	flow/Knobs.h
#	flow/network.h
2021-02-08 09:20:28 -08:00
Evan Tschannen b2ffdf47f0 added low priority reads to status 2021-02-03 13:24:34 -08:00
Richard Chen c77d9e4abe merge conflicts 2020-12-02 21:53:19 +00:00
sfc-gh-tclinkenbeard 45c9a0abc7 Revert "Revert "Add limiting health metrics""
This reverts commit 209ebcc595.
2020-11-13 17:24:57 -08:00
Trevor Clinkenbeard 209ebcc595
Revert "Add limiting health metrics" 2020-11-13 17:08:46 -08:00
sfc-gh-tclinkenbeard 6c4493166f Add limiting storage queue and durability lag to health metrics 2020-11-12 20:14:41 -08:00
Richard Chen 5b546c4854 change protocol version to hex encoded string in status json. Move constant from flow transport header to cpp 2020-10-26 19:35:38 +00:00
Richard Chen 055add9682 conflicts 2020-10-23 06:33:00 +00:00
Xin Dong 944f30484a
Merge pull request #3759 from dongxinEric/misc/3739/expose-time-since-last-recovery
This resolves issue #3739 by exposing time since last full recovery.
2020-10-19 09:03:31 -07:00
Richard Chen 2f5b0bef08 switch to test newer incompatible version. Fix PR comments. Modify schema 2020-10-12 18:29:16 +00:00
Xin Dong 480fc82779 Resolve review comments 2020-09-25 16:58:54 -07:00
Oleg Samarin b6b87cd8d4 fileconfigure fails with Assertion false failed 2020-09-25 20:20:00 +03:00
Xin Dong a96d6f85c5 Removed redundant field number_of_old_generations_of_tlogs from status json 2020-09-24 09:44:51 -07:00
Xin Dong 50f681cd32
Apply suggestions from code review
Co-authored-by: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-09-23 10:54:49 -07:00
Meng Xu cf69f455a9
Merge pull request #3785 from apple/release-6.3
Merge Release 6.3 to master
2020-09-17 14:43:56 -07:00
Xin Dong 4df0f60729 Instead of using fully_recovered, use accepting_commits as a signal that the DB has become available. Also add the number of old generations into status 2020-09-17 09:55:25 -07:00
Young Liu 35bef73a1c Rename proxy to commit proxy 2020-09-10 17:44:15 -07:00