Commit Graph

239 Commits

Author SHA1 Message Date
Xiaoxi Wang aba9d85560 merge main 2022-04-06 09:57:52 -07:00
Steve Atherton b179813989 Updated status schema and fixed spacing. 2022-04-01 17:21:35 -07:00
He Liu dd15489605 rename ssd-rocksdb-experimental as ssd-rocksdb-v1. 2022-03-29 10:53:38 -07:00
Xiaoxi Wang d93b57dd88 conflict solving 2022-03-24 20:45:51 -07:00
Xiaoxi Wang 1b631a9263 solve conflict with main 2022-03-24 16:29:11 -07:00
Josh Slocum f27475e2f4 Merge branch 'main' into blob_integration 2022-03-22 11:41:58 -05:00
sfc-gh-tclinkenbeard a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Josh Slocum e71b3533f9 Merge branch 'main' into blob_integration 2022-03-09 08:59:56 -06:00
A.J. Beamon 72a34945ce Add the ability to disable tenants. Server processes verify the ID of tenants being read or written. 2022-03-06 21:54:21 -08:00
A.J. Beamon 5fa9d3e1b7 Add a tenant parameter to read and commit requests. Store a map of all tenants on commit proxy and storage servers. Add an option to require tenant mode. 2022-03-06 21:54:21 -08:00
Xiaoxi Wang e73c0a31e6 add wiggle_server_ids and wiggle_server_addresses in status json 2022-03-02 10:03:23 -08:00
Xiaoxi Wang f48ba989c3 add status 2022-03-01 15:27:27 -08:00
Josh Slocum 38a75a8b89 Merge branch 'main' into blob_integration 2022-02-17 17:47:38 -06:00
Vaidas Gasiunas 092b5cee4b MVC2.0: Rollback added code 2022-02-14 13:50:42 -08:00
Xiaoxi Wang 6dc5921575
createdTime based storage wiggler (#6219)
* add storagemetadata

* add StorageWiggler;

* fix serverMetadataKey bug

* add metadata tracker in storage tracker

* finish StorageWiggler

* update next storage ID

* change pid to server id

* write metadata when seed SS

* add status json fields

* remove pid based ppw iteration

* fix time expression

* fix tss metadata nonexistence; fix transaction retry when retrieving metadata

* fix checkMetadata bug when store type is wrong

* fix remove storage status json

* format code

* refactor updateNextWigglingStoragePID

* separate storage metadata tracker and store type tracker

* rename pid

* wiggler stats

* fix completion between waitServerListChange and storageRecruiter

* solve review comments

* rename system key

* fix database lock timeout by adding lock_aware

* format code

* status json

* resolve code format/naming comments

* delete expireNow; change PerpetualStorageWiggleID's value to KeyBackedObjectMap<UID, StorageWiggleValue>

* fix omit start round

* format code

* status json reset

* solve status json format

* improve status json latency; replace binarywriter/reader to objectwriter/reader; refactor storagewigglerstats transactions

* status timestamp
2022-02-04 15:04:30 -08:00
Ata E Husain Bohra 87ee4cf958 Add new FDB EncryptKeyProxy role
Major changes include:

1. Add a new FDB role, EncryptKeyProxy. The role is
   responsible for exposing APIs to fetch encryption keys by
   interacting with the external Encryption KeyManager interface.
2. The process is an FDB singleton process following similar recruitment
   rules as other singleton processes in the system.
3. Code to recruit the worker process; given the encryption keys are
   needed during recovery (decode TLog records), for now the process
   is co-located in the same datacenter as ClusterController.
4. Skeleton process actor code; more functionality will be added in
   subsequent PRs.

NOTE: The code is protected under a SERVER_KNOB with the default
      value as 'false' for now.
2022-01-25 17:38:27 -08:00
Ata E Husain Bohra 936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9.
1.b. This reverts commit d174bb2e06.
1.c. This reverts commit 30b05b469c.

2. Update Status.actor to track ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define "cluster recovery trace event"
   prefix; for now keeping it as "Master", however, it should allow
   smooth transition to "Cluster" prefix as it seems more appropriate.
2022-01-06 12:15:51 -08:00
Suraj Gupta 640cee2072 Start BM without config change if config is enabled. 2021-12-10 14:00:34 -06:00
Suraj Gupta fc3376fe8f Move client knob to database config for blob granules. 2021-12-10 14:00:34 -06:00
Steve Atherton c53f5aa110 Renamed redwood to redwood-1-experimental and file extension to .redwood-v1. 2021-11-16 02:15:22 -08:00
Vaidas Gasiunas 40da5a80f9 Merge remote-tracking branch 'apple/master' into multi-version-client-2 2021-10-26 19:29:10 +02:00
Josh Slocum 5f0ec0612a Merge branch 'feature-range-feed' into blob_full 2021-10-13 15:44:35 -05:00
Vaidas Gasiunas 61fb967484 MVC2.0: Add a clientlib metadata attribute for checksum algorithm 2021-10-07 18:06:22 +02:00
Vaidas Gasiunas 5360abb238 MVC2.0: Operation to list uploaded libraries with various filters;
Introducing constants for attribute names and platform values
2021-10-06 18:01:46 +02:00
Vaidas Gasiunas cda0a5f931 Operation to upload client library binary into the system keyspace 2021-10-06 18:01:46 +02:00
Neethu Haneesha Bingi 3e79299898 Locality filter support to perpetual storage wiggler feature. 2021-09-30 10:00:33 -07:00
Suraj Gupta a4bcd3919d Add exclusive process class for Blob Worker.
Also introduces a specific machine in the simulated cluster
to test blob worker (similar to what's done for storage cache).
2021-09-23 16:54:44 -04:00
Suraj Gupta 5fa6c687d6 Add blob manager as a singleton. 2021-09-23 10:45:37 -04:00
Josh Slocum 9992a7b33f Added StorageMigrationType and cli commands 2021-09-14 09:55:41 -05:00
Neethu Haneesha Bingi 66f2518405 exclude to work with any locality data match. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi cbe714acd0 Status json schema update, includelocalities back for consistency check, review comments. 2021-06-23 18:03:27 -07:00
Xiaoxi Wang 454f9e9c89 update json schemas 2021-06-17 20:20:39 +00:00
Daniel Smith aa37c7dcec Add proxies back to schema 2021-06-08 14:12:33 -04:00
Josh Slocum ce82c9653e Testing Storage Server implementation 2021-05-25 20:28:50 +00:00
Markus Pilman e1254d38a0
Merge pull request #4838 from sfc-gh-xwang/ppwiggle
perpetual storage wiggling command line support
2021-05-24 14:48:00 -06:00
Xiaoxi Wang 93c809764f fix Schema check error 2021-05-19 23:52:16 +00:00
Sreenath Bodagala 2fa80e7912 Address review comments 2021-05-19 22:04:43 +00:00
Sreenath Bodagala 622f43474a Expose "bounce impact" and Storage Server "version catch-up rate" metrics
Changes:

Schemas.cpp: Extend the JSON schema to report the new metrics that have
been added.

mr-status-json-schemas.rst.inc: Update the schema to reflect the changes
made to the JSON schema.

release-notes-700.rst: Add a note about the new metrics in "Status"
section.
2021-05-19 19:54:49 +00:00
Sreenath Bodagala d8cad8efca Report bounce impact info as part of cluster JSON object. 2021-05-13 16:36:57 +00:00
Sreenath Bodagala 160293bd54 Report bounce impact in fdbcli status
Changes:

Schemas.cpp: Extend the JSON schema to report whether the cluster is
bounceable and if not, report the reason for why it is not bounceable.

Status.actor.cpp: Extend recoveryStateStatusFetcher() to populate the
bounce related field(s).

mr-status-json-schemas.rst.inc: Update the schema to reflect the change
made in Schemas.cpp.

release-notes-700.rst: Add a note about the new status fields in "Status"
section.
2021-05-13 14:28:06 +00:00
Sreenath Bodagala 336a9bff66 Provide "time since last full recovery" in fdbcli status
Changes:

Schemas.cpp: Extend the JSON schema to include a new field that reports
the number of seconds since last full recovery.

Status.actor.cpp: Extend recoveryStateStatusFetcher() to populate the
new field that has been added to Schemas.cpp.

mr-status-json-schemas.rst.inc: Update the schema to reflect the change
made in Schemas.cpp.
2021-05-05 19:43:44 +00:00
sbodagala f7e28c50d4
Merge pull request #4735 from sbodagala/master
Expose CommitBatchingWindowSize metric to fdbcli status
2021-04-30 15:52:29 -04:00
Sreenath Bodagala f151df3203 Expose CommitBatchingWindowSize metric to fdbcli status
Changes:

Schemas.cpp:
- Extend JSON schema to include aggregated information about
CommitBatchingWindowSize samples.

Status.actor.cpp:
- Extend getStorageServersAndMetrics() to gather metrics about
CommitBatchingWindowSize.
- Extend CommitProxy AddRole() to populate the status-JSON object
with the metrics about CommitBatchingWindowSize.
2021-04-29 22:11:09 +00:00
Markus Pilman 340f012e1a
Merge pull request #4695 from sfc-gh-etschannen/fix-rewrite-bme
rewrote tlog recruitment logic so that it is deterministic
2021-04-27 10:19:25 -06:00
Evan Tschannen f1559a2203 use the stateless process class instead of master or resolution in simulation because it is the recommended process class, and the others are not deterministic when recruited in a constrained process situation 2021-04-26 09:49:26 -07:00
Dan Lambright cabf192f57 Respond to review comments 3/23 2021-04-06 13:05:09 -04:00
Dan Lambright 48a475366c Log latency metrics for batch GRV requests 2021-04-06 13:05:09 -04:00
Hao Fu fb9632297e Add txnRejectedForQueuedTooLong in ProxyStats 2021-02-12 13:04:58 -08:00
A.J. Beamon aaf0a9aa7b Merge branch 'release-6.3' into merge-release-6.3-into-master
# Conflicts:
#	build/docker-compose.yaml
#	cmake/ConfigureCompiler.cmake
#	fdbclient/FileBackupAgent.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/IAsyncFile.h
#	fdbrpc/IRateControl.h
#	fdbrpc/simulator.h
#	fdbserver/KeyValueStoreSQLite.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbservice/ServiceBase.cpp
2021-02-08 12:58:34 -08:00
A.J. Beamon 67e783acf8 Merge branch 'release-6.2' into merge-release-6.2-into-release-6.3
# Conflicts:
#	cmake/CompileBoost.cmake
#	cmake/FDBComponents.cmake
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/simulator.h
#	fdbserver/KeyValueStoreSQLite.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/storageserver.actor.cpp
#	flow/Knobs.h
#	flow/network.h
2021-02-08 09:20:28 -08:00
Evan Tschannen b2ffdf47f0 added low priority reads to status 2021-02-03 13:24:34 -08:00
Richard Chen c77d9e4abe merge conflicts 2020-12-02 21:53:19 +00:00
sfc-gh-tclinkenbeard 45c9a0abc7 Revert "Revert "Add limiting health metrics""
This reverts commit 209ebcc595.
2020-11-13 17:24:57 -08:00
Trevor Clinkenbeard 209ebcc595
Revert "Add limiting health metrics" 2020-11-13 17:08:46 -08:00
sfc-gh-tclinkenbeard 6c4493166f Add limiting storage queue and durability lag to health metrics 2020-11-12 20:14:41 -08:00
Richard Chen 5b546c4854 change protocol version to hex encoded string in status json. Move constant from flow transport header to cpp 2020-10-26 19:35:38 +00:00
Richard Chen 055add9682 conflicts 2020-10-23 06:33:00 +00:00
Xin Dong 944f30484a
Merge pull request #3759 from dongxinEric/misc/3739/expose-time-since-last-recovery
This resolves issue #3739 by exposing time since last full recovery.
2020-10-19 09:03:31 -07:00
Richard Chen 2f5b0bef08 switch to test newer incompatible version. Fix PR comments. Modify schema 2020-10-12 18:29:16 +00:00
Xin Dong 480fc82779 Resolve review comments 2020-09-25 16:58:54 -07:00
Oleg Samarin b6b87cd8d4 fileconfigure fails with Assertion false failed 2020-09-25 20:20:00 +03:00
Xin Dong a96d6f85c5 Removed redundant field number_of_old_generations_of_tlogs from status json 2020-09-24 09:44:51 -07:00
Xin Dong 50f681cd32
Apply suggestions from code review
Co-authored-by: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-09-23 10:54:49 -07:00
Meng Xu cf69f455a9
Merge pull request #3785 from apple/release-6.3
Merge Release 6.3 to master
2020-09-17 14:43:56 -07:00
Xin Dong 4df0f60729 Instead of using fully_recovered, use accepting_commits as a signal that the DB has become available. Also add the number of old generations into status 2020-09-17 09:55:25 -07:00
Young Liu 35bef73a1c Rename proxy to commit proxy 2020-09-10 17:44:15 -07:00
Young Liu 1155d015c9 fetch current log generation as well 2020-09-09 11:54:58 -07:00
XiaoxiWang 2935d3d4f6 change workload; solve some comments 2020-09-08 21:47:49 +00:00
Xin Dong 4363dd0f25 This resolves issue #3739 by exposing time since last full recovery. 2020-09-08 14:26:01 -07:00
Young Liu 23e1ff694c Report missing old tlogs in recovery between accepting commits and storage recovered 2020-09-08 13:35:42 -07:00
XiaoxiWang ecf2c0109c more concise status json 2020-09-04 18:40:45 +00:00
XiaoxiWang fb758bf937 update Schemas.cpp 2020-09-04 16:34:05 +00:00
Young Liu 9564171463 Merge branch 'master' into grv-proxy 2020-08-24 22:45:01 -07:00
XiaoxiWang 4e627691a9 add throttle objects into Schemas.cpp 2020-08-24 23:37:58 +00:00
Young Liu 79ce16650d merge master branch 2020-08-11 19:22:10 -07:00
Young Liu d6a23a4d6b Resolve comments to make GRV proxy a separate process class 2020-08-06 00:01:57 -07:00
Chaoguang Lin 3ba940a63d Change json string to snake_case 2020-07-28 11:42:03 -07:00
Chaoguang Lin 52178f9eae Add json schema for the management api error message in special key space 2020-07-27 17:37:19 -07:00
A.J. Beamon b09dddc07e Merge branch 'release-6.2' into merge-release-6.2-into-release-6.3
# Conflicts:
#	cmake/ConfigureCompiler.cmake
#	documentation/sphinx/source/downloads.rst
#	fdbrpc/FlowTransport.actor.cpp
#	fdbrpc/fdbrpc.vcxproj
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/LogSystemPeekCursor.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/Status.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	flow/flow.vcxproj
2020-07-10 15:06:34 -07:00
Evan Tschannen 8befb0829d
Merge pull request #3481 from ajbeamon/fix-dc-timeout-message
Add missing messages to schema and rename one to match later versions
2020-07-10 10:30:21 -07:00
A.J. Beamon b51beead53 The backport of a change in later versions didn't include some updates to the schema and a change to the name of one of the messages. 2020-07-09 16:58:13 -07:00
A.J. Beamon 693595f4e5 Fix make build, fix GRV schema 2020-07-09 16:50:08 -07:00
A.J. Beamon 04d1217941 Track statistics about server-side request latency on each process, to include min, max, mean, and various percentiles. 2020-07-09 16:39:15 -07:00
Andrew Noyes 409ccf3be2 Use snake_case to match status json convention 2020-06-30 17:05:29 +00:00
Andrew Noyes e1dfa410c1 Remove "Storages" field from data_distribution_stats
It seems like a cool idea, but it feels rushed to me. Reverting for now.
2020-06-30 15:24:16 +00:00
Andrew Noyes 403274bba8 Add schemas, and check dataDistributionStatsSchema 2020-06-30 15:24:16 +00:00
Daniel Smith acbfe2e4c9
Revert "Revert "Initial RocksDB"" 2020-06-15 12:45:36 -04:00
Jingyu Zhou 9cd1614c82
Revert "Initial RocksDB" 2020-06-11 15:29:46 -07:00
A.J. Beamon e10704fd76 Cherry-pick region related status changes from 6.3 2020-06-09 14:56:21 -07:00
Daniel Smith 5d361fe532 Copy/paste rebase onto 6.3 2020-05-22 15:02:51 +00:00
A.J. Beamon d636194d0d Remove deprecated fields in status: worst_version_lag_storage_server and limiting_version_lag_storage_server 2020-05-19 13:12:10 -07:00
A.J. Beamon d0c66d7282 Fix typo 2020-05-12 18:38:20 -07:00
A.J. Beamon e0526e0095 Add busiest read tags to storage server status 2020-05-12 15:49:40 -07:00
A.J. Beamon 36da61dd9c Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/vexillographer/fdb.options
2020-04-07 21:12:14 -07:00
A.J. Beamon 2309e9f156 Consistently use timeout instead of timedout in status messages. 2020-04-07 08:43:23 -07:00
Xin Dong 587419f984 Fix the missing schema field which caused a lot of noise in nightly 2020-04-06 10:18:58 -07:00
A.J. Beamon 2336f073ad Checkpointing a bunch of work on throttles. Rudimentary implementation of auto-throttling. Support for manual throttling via fdbcli. Throttles are stored in the system keyspace. 2020-04-03 15:24:14 -07:00
Xin Dong e755583c07 Address review comments. 2020-04-01 15:13:04 -07:00
Xin Dong a7e8bfad82 Fix the test failure, which was introduced by a typo 2020-03-30 15:24:08 -07:00
Xin Dong 012d41548e Address review comments 2020-03-30 13:55:59 -07:00