Kill datacenter in simulation as part of repair process (#8914)

* kill datacenter in simulation as part of repair process

* fix typo
Jon Fu 2022-12-05 15:09:51 -08:00 committed by GitHub
parent a8b7a33a64
commit 3c08901d2c
2 changed files with 3 additions and 1 deletions

@@ -182,7 +182,7 @@ Now the main steps in recovery have finished. The CC keeps waiting for all tLogs
## Phase 6: ACCEPTING_COMMITS
-The transaction system starts to accept new transactions. This doesn't mean that this committed data will be available for reading by clients, because storage servers are not guaranteed to be alive in the recovery process. In case storage servers have not been alive, write-only transactions can be committed and will be buffered in tLogs. If storage servers are unavailable for long enough, pushing tLogs' memory usage above a configurable threshold, rakekeepr will throttle all transactions.
+The transaction system starts to accept new transactions. This doesn't mean that this committed data will be available for reading by clients, because storage servers are not guaranteed to be alive in the recovery process. In case storage servers have not been alive, write-only transactions can be committed and will be buffered in tLogs. If storage servers are unavailable for long enough, pushing tLogs' memory usage above a configurable threshold, ratekeeper will throttle all transactions.
## Phase 7: ALL_LOGS_RECRUITED

@@ -673,6 +673,8 @@ ACTOR Future<Void> repairDeadDatacenter(Database cx,
.detail("RemoteDead", remoteDead)
.detail("PrimaryDead", primaryDead);
g_simulator->usableRegions = 1;
+g_simulator->killDataCenter(
+primaryDead ? g_simulator->primaryDcId : g_simulator->remoteDcId, ISimulator::KillInstantly, true);
wait(success(ManagementAPI::changeConfig(
cx.getReference(),
(primaryDead ? g_simulator->disablePrimary : g_simulator->disableRemote) + " repopulate_anti_quorum=1",