Commit Graph

695 Commits

Author SHA1 Message Date
Zhe Wu 90bdfbdc4c Update info trigger new DB info update immediately 2023-03-21 21:40:12 -07:00
Hui Liu c43f8b3fdc
Refactor - introduce BlobRestoreController for APIs to manage restore state (#9616) 2023-03-08 07:50:30 -08:00
Hui Liu b2d497a3b2 Report restore phase start timestamp 2023-03-03 18:09:51 -08:00
Hui Liu 6b6959d35f Split blob manifest into segments when writing 2023-02-09 11:26:19 -08:00
Yi Wu 0849f60ef1
Clean up cluster controller's wait on recoveredDiskFiles (#9105)
The `recoveredDiskFiles` promise is fulfilled once all the local TLog and storage files have been initialized in a process. It was added so that a process would wait on it before joining the cluster, which kept a slowly recovering TLog from joining the cluster and slowing down cluster recovery.

With #7510, a process can join the cluster in a stateless role but is still prevented from joining in a stateful role until its TLog and storage files have recovered. As such, the `recoveredDiskFiles` wait is no longer needed. This PR cleans up the logic.
2023-01-09 16:26:32 -08:00
Nim Wijetunga d4cbe20d5f
Cluster Controller uses DB Config (#8992)
CC uses db config for encryption
2023-01-09 12:17:36 -05:00
Nim Wijetunga 10ccaa1ee5
remove client info encryption state (#9096) 2023-01-06 17:14:06 -05:00
Zhe Wu 6aaf5af75d Disconnection to satellite TLog should trigger recovery in gray failure detection 2023-01-05 22:23:02 -08:00
Hui Liu 46d92bbf3f
Merge pull request #8984 from sfc-gh-huliu/restoretest
Add correctness test for blob restore
2023-01-04 14:43:49 -08:00
Hui Liu e3bf79cf71 Add correctness test for blob restore 2023-01-04 11:10:34 -08:00
sfc-gh-tclinkenbeard 1efe06da20 Move SingletonRecruitThrottler to SingletonRoles.h 2023-01-03 14:08:09 -08:00
sfc-gh-tclinkenbeard 68547a2dbd Remove dead code from ClusterController.actor.cpp 2023-01-03 14:08:09 -08:00
sfc-gh-tclinkenbeard 9e9415eff0 Move singleton role logic into its own file 2023-01-03 14:08:09 -08:00
FoundationDB CI 86d6106dc1
format source code after switch to clang 15 2022-12-08 17:26:45 +00:00
Hui Liu 2e62822183 Show blob restore in fdbcli status command 2022-11-17 14:22:59 -08:00
Hui Liu 5834517570 Add fdbcli blobrestore to start the full restore 2022-11-11 08:32:23 -08:00
Ata E Husain Bohra a7d123643d
Extend Tlog persistentStorage to persist encryption state (#8344)
* Extend Tlog persistentStorage to persist encryption state

Description

 diff-3: Address review comment.
 diff-2: Extend ClusterController endpoints to allow querying the
         cluster's encryptionAtRest status
         Update Tlog recovery to ensure the on-disk encryption
         status matches the encryptionAtRest mode persisted in
         the cluster's cstate
 diff-1: Store encryptionAtRestMode state in Coordinators

Major changes proposed are:
1. Extend TLog persistentStorage to persist encryption state.
2. The persisted encryption state is derived from the corresponding
db-config and relevant SERVER_KNOBS. In the near future, the knobs
shall be removed.
3. On TLog startup, the persisted encryption state is compared
against the cluster configuration; on a mismatch, the TLog is killed
and not allowed to rejoin the cluster.
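
The startup check described in item 3 can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `EncryptionMode`, `TLogStartupAction`, and `checkPersistedEncryptionMode` are hypothetical names, and the set of modes is assumed.

```cpp
#include <cassert>

// Hypothetical stand-in for the encryption-at-rest mode (assumed values).
enum class EncryptionMode { Disabled, Enabled };

// Hypothetical outcome of the startup check.
enum class TLogStartupAction { Join, KillAndRefuseRejoin };

// Compare the mode persisted on the TLog's disk against the mode in the
// cluster configuration. Any mismatch means the TLog must not rejoin.
TLogStartupAction checkPersistedEncryptionMode(EncryptionMode persisted,
                                               EncryptionMode clusterConfig) {
    if (persisted == clusterConfig) {
        return TLogStartupAction::Join;
    }
    return TLogStartupAction::KillAndRefuseRejoin;
}
```

In this sketch the check is a pure function of the two modes, so it can be exercised without a running cluster; the real change also has to read the persisted state from TLog storage and the configured mode from the cluster.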

Testing

devRunCorrectness - 100K
2022-11-03 11:16:50 -07:00
Lukas Joswiak 91146a03f0 Write cluster ID to `ClientDBInfo`
This enables clients to receive the cluster ID.
2022-10-27 13:56:13 -07:00
Lukas Joswiak 9d3c3b1efe Remove cluster ID logic from individual roles
The logic to determine the validity of a process joining a cluster now
belongs on the worker and the cluster controller. It is no longer
restricted to tlogs and storages, but instead applies to all processes
(even stateless ones).
2022-10-27 13:56:13 -07:00
Lukas Joswiak bba05b7c9b Move cluster ID from txnStateStore to the database
The cluster ID is now stored in the database instead of in the
txnStateStore. The cluster controller will read it on boot and send it
to all processes to persist.
2022-10-27 13:56:13 -07:00
Lukas Joswiak f43011e4b7 Notify processes joining the wrong cluster
And have these processes enter a "zombie" state where they cancel all
their actors and then wait forever, refusing to do any additional work
until they are manually handled by the operator.
2022-10-27 13:56:13 -07:00
Lukas Joswiak 72a97afcd6 Avoid recruiting workers with different cluster ID 2022-10-27 13:56:13 -07:00
Trevor Clinkenbeard 25f3a99b3d
Merge pull request #8568 from sfc-gh-tclinkenbeard/make-tracecounters-method
Encapsulate `CounterCollection`
2022-10-25 14:27:56 -07:00
sfc-gh-tclinkenbeard 74212eeacf Encapsulate CounterCollection 2022-10-25 10:17:15 -07:00
Hui Liu f2289ced27 Add StorageServerInterface for BlobMigrator 2022-10-24 13:12:07 -07:00
Jingyu Zhou a8391caf23 Revert "Data loss protection v2" 2022-10-20 18:09:58 -05:00
Lukas Joswiak 72bc89cf39 Remove cluster ID logic from individual roles
The logic to determine the validity of a process joining a cluster now
belongs on the worker and the cluster controller. It is no longer
restricted to tlogs and storages, but instead applies to all processes
(even stateless ones).
2022-10-18 21:37:42 -07:00
Lukas Joswiak 7342672c11 Move cluster ID from txnStateStore to the database
The cluster ID is now stored in the database instead of in the
txnStateStore. The cluster controller will read it on boot and send it
to all processes to persist.
2022-10-18 21:37:42 -07:00
Lukas Joswiak 743609f217 Notify processes joining the wrong cluster
And have these processes enter a "zombie" state where they cancel all
their actors and then wait forever, refusing to do any additional work
until they are manually handled by the operator.
2022-10-18 21:37:40 -07:00
Lukas Joswiak 2394a9f4b9 Avoid recruiting workers with different cluster ID 2022-10-18 21:37:16 -07:00
Hui Liu 169c341f79
Merge pull request #8386 from sfc-gh-huliu/blobmigrator
Add blob migrator to assist data copy from blob to storage server
2022-10-13 14:46:04 -07:00
Hui Liu 049df622f1 add a blob migrator 2022-10-13 13:21:45 -07:00
He Liu 97acc94a7f Fixed out-of-scope variable issue. 2022-10-12 16:13:57 -07:00
He Liu 2caa4290c6 Fixed clang variable scope issue. 2022-10-12 15:43:33 -07:00
He Liu b52edd8658 Merge branch 'main' of https://github.com/apple/foundationdb into validate-data-consistency 2022-10-10 11:00:05 -07:00
Markus Pilman ea1325a552
Merge pull request #8319 from sfc-gh-tclinkenbeard/add-rare-code-probe-annotation
Add `rare` code probe decoration
2022-10-07 09:39:00 -06:00
Zhe Wu 43cd419078 Comments cleanup 2022-10-06 14:09:00 -07:00
Zhe Wu c4c71c7806 Add tests 2022-10-06 14:09:00 -07:00
Zhe Wu f41d88cd45 Ignore CC_DEGRADED_PEER_DEGREE_TO_EXCLUDE for disconnected peers. Core logic 2022-10-06 14:09:00 -07:00
Jon Fu 2fe1d19e95 Merge branch 'main' of github.com:apple/foundationdb into metacluster-status 2022-10-05 11:48:04 -07:00
Markus Pilman 550488b020 Merge remote-tracking branch 'origin/main' into bugfixes/open-for-ide
# Conflicts:
#	bindings/c/CMakeLists.txt
#	fdbclient/include/fdbclient/GetEncryptCipherKeys.actor.h
#	fdbserver/BackupWorker.actor.cpp
#	fdbserver/BlobWorker.actor.cpp
#	fdbserver/CommitProxyServer.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/StorageCache.actor.cpp
#	fdbserver/include/fdbserver/GetEncryptCipherKeys.actor.h
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/PhysicalShardMove.actor.cpp
#	flow/CMakeLists.txt
2022-10-04 18:27:48 -06:00
Markus Pilman 97dfc6823f fixed build with OPEN_FOR_IDE 2022-10-04 17:01:02 -06:00
Jon Fu 13f022160c add trace detail 2022-09-30 15:02:10 -07:00
Jon Fu 1b333734a2 set updaterDelay on every loop 2022-09-29 16:53:37 -07:00
Jon Fu 461e42bfe1 restructure updater code and add capacity check in metacluster management workload 2022-09-29 16:24:02 -07:00
Zhe Wu fcbd65c2ea Add unit test 2022-09-29 13:55:16 -07:00
Jon Fu 257dbff430 recreate transaction every loop in background updater 2022-09-29 11:34:36 -07:00
Jon Fu 6357ad1750 pass info through cc data to populate in status 2022-09-28 16:18:44 -07:00
Jon Fu 0fa462fca9 initial commit to trace metacluster metrics 2022-09-26 15:56:45 -07:00
sfc-gh-tclinkenbeard 985958c260 Add rare code probe decoration 2022-09-25 15:28:32 -07:00