Otherwise, the restored database is not consistent.
To reproduce, at commit 45c459cfc of PR #11064:
-f ./tests/restarting/from_7.3.0/DrUpgradeRestart-1.toml -s 4210130489 -b off
-f ./tests/restarting/from_7.3.0/DrUpgradeRestart-2.toml --restarting -s 4210130490 -b on
* throttle hot shards
* expire throttled shards over time
* add backoff
* Parallelize messaging from RK to CP
* Obtain shards from a single SS
* handle expired transactions
* bump transaction_throttled_hot_shard
* Change SevError to SevWarn for CannotMonitorHotShardForSS
* Add log per request
* Remove SS entries from RateKeeper once it is down
Before this change, certain data structures in RateKeeper would
not delete data associated with a removed or cancelled SS. This
caused significant unnecessary CPU usage and degraded GRV proxy
performance. This change fixes that.
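A minimal sketch of both ideas in this change set, using hypothetical names (the real RateKeeper data structures differ): throttled hot shards carry an expiration time and are dropped once it passes, and all per-storage-server state is erased as soon as the SS is removed, so rate updates stop scanning dead servers.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-storage-server state tracked by a rate keeper.
struct StorageQueueInfo {
	int64_t storageQueueBytes = 0;
	double smoothedInputRate = 0.0;
};

struct RatekeeperState {
	// Keyed by storage server ID. If entries for removed servers are never
	// erased, every rate update keeps scanning dead servers, wasting CPU and
	// slowing the GRV rate computation.
	std::map<std::string, StorageQueueInfo> storageQueues;

	// Hot shards currently throttled, mapped to the time their throttle expires.
	std::map<std::string, double> throttledShards;

	void onStorageServerRemoved(const std::string& ssId) {
		storageQueues.erase(ssId); // drop all state for the departed SS
	}

	void expireThrottledShards(double now) {
		// Throttles decay over time instead of persisting forever.
		for (auto it = throttledShards.begin(); it != throttledShards.end();) {
			if (it->second <= now)
				it = throttledShards.erase(it);
			else
				++it;
		}
	}
};
```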
* do not count recently created change feeds for throttling
* fix: blocked assignments were not decremented when force purging
* fix: created needs to be updated when the changefeed is reset
* added asserts to detect if ratekeeper is throttled on blob workers
* Throttle the cluster if the blob manager cannot assign ranges
* fixed a number of different bugs which caused ratekeeper to throttle to zero because of blob worker lag
* fix: do not mark an assignment as block if it is cancelled
* remove asserts to merge bug fixes
* fix formatting
* restored old control flow to storage updater
* storage updater did not throw errors
* disable buggify to see if it fixes CI
* throttle the cluster when blob workers fall behind
* do not throttle on blob workers if they are not enabled
* remove an unnecessary actor
* fixed a compile error
* fetch blob worker metrics at the same interval as the rate is updated, avoid fetching the complete blob worker list too frequently
* fixed another compilation bug
* added a 5 second delay before bw throttling to prevent false positives caused by the 100e6 version jump during recovery. Lowered the throttling thresholds to react much quicker to bw lag (see the sketch after this list).
* fixed a number of problems
* changed the minBlobVersionRequest to look at storage server versions since this will be a lot more efficient
* fix: do not let desired go backwards
* fix: track the version of notAtLatest changefeeds for throttling
* ratekeeper now throttles blob workers by estimating the transactions per second throughput of the blob workers
* added metrics for blob worker change feeds
* added a knob to disable bw throttling
* fixed the transaction options in blob manager
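A minimal sketch of the lag-based throttling idea referenced above, with hypothetical knob names and values (the actual Ratekeeper logic and thresholds differ): throttling is skipped when blob workers are disabled, or during a short grace window after recovery to avoid false positives from the large recovery version jump, and the allowed rate ramps down as the gap between the storage servers' version and the blob workers' minimum version grows.

```cpp
#include <cstdint>

// Hypothetical knobs; the real names and values are assumptions.
constexpr bool BW_THROTTLING_ENABLED = true;      // knob to disable bw throttling
constexpr double BW_RECOVERY_GRACE_SECONDS = 5.0; // delay before throttling after recovery
constexpr int64_t BW_LAG_SOFT_LIMIT = 30'000'000; // versions of lag where throttling begins
constexpr int64_t BW_LAG_HARD_LIMIT = 60'000'000; // versions of lag where the rate reaches zero

// Returns a multiplier in [0, 1] applied to the cluster's transaction rate.
double blobWorkerThrottleFactor(int64_t storageServerVersion,
                                int64_t minBlobWorkerVersion,
                                double secondsSinceRecovery,
                                bool blobWorkersEnabled) {
	if (!blobWorkersEnabled || !BW_THROTTLING_ENABLED)
		return 1.0; // do not throttle on blob workers if they are not enabled
	if (secondsSinceRecovery < BW_RECOVERY_GRACE_SECONDS)
		return 1.0; // ignore the version jump caused by recovery
	const int64_t lag = storageServerVersion - minBlobWorkerVersion;
	if (lag <= BW_LAG_SOFT_LIMIT)
		return 1.0;
	if (lag >= BW_LAG_HARD_LIMIT)
		return 0.0;
	// Linearly ramp the allowed rate down between the soft and hard limits.
	return 1.0 - double(lag - BW_LAG_SOFT_LIMIT) / double(BW_LAG_HARD_LIMIT - BW_LAG_SOFT_LIMIT);
}
```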
Issue #7258
The ratekeeper records the busiest write tag for *all* storage
servers, which causes the trace event to be throttled. Distributing
the busiest write tag to the corresponding storage servers should
reduce this throttling issue.
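A minimal sketch of the intended direction, with hypothetical types (the actual request/reply structures and trace calls differ): Ratekeeper returns each storage server's busiest write tag in its reply, and each storage server logs its own event, so a single process no longer emits one event per SS and trace-event throttling no longer suppresses the information.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical tag info carried in Ratekeeper's reply to a storage server.
struct BusiestWriteTag {
	std::string tag;
	double rate = 0.0; // observed write rate attributed to the tag
};

struct RatekeeperReply {
	std::optional<BusiestWriteTag> busiestWriteTag; // filled per storage server
};

// On the storage server: log the busiest write tag locally instead of having
// Ratekeeper trace it for every SS in the cluster.
void onRatekeeperReply(const std::string& ssId, const RatekeeperReply& reply) {
	if (reply.busiestWriteTag) {
		std::printf("BusiestWriteTag ss=%s tag=%s rate=%.1f\n",
		            ssId.c_str(),
		            reply.busiestWriteTag->tag.c_str(),
		            reply.busiestWriteTag->rate);
	}
}
```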
* proof of concept
* use code-probe instead of test
* code probe working on gcc
* code probe implemented
* renamed TestProbe to CodeProbe
* fixed refactoring typo
* support filtered output
* print probes at end of simulation (see the sketch after this list)
* fix missed probes print
* fix deduplication
* Fix refactoring issues
* revert bad refactor
* make sure file paths are relative
* fix more wrong refactor changes
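A minimal, generic sketch of the code-probe idea, not the project's actual CodeProbe implementation: each probe site carries a comment, records whether its condition has ever been observed true, and all registered probes are printed once at the end of a run, deduplicated by location. Unlike a real implementation, which registers probes at static-initialization time so that even never-executed sites show up as missed, this sketch only registers a probe the first time its enclosing code runs.

```cpp
#include <cstdio>
#include <set>
#include <string>
#include <vector>

// A generic code probe: records whether a particular code path was reached.
struct CodeProbe {
	const char* file;
	int line;
	const char* comment;
	bool hit = false;

	CodeProbe(const char* f, int l, const char* c) : file(f), line(l), comment(c) {
		registry().push_back(this);
	}

	static std::vector<CodeProbe*>& registry() {
		static std::vector<CodeProbe*> probes;
		return probes;
	}

	// Print every registered probe once (deduplicated by file:line), hit or missed.
	// A real implementation would also trim file paths to be build-relative.
	static void printAll() {
		std::set<std::string> seen;
		for (CodeProbe* p : registry()) {
			std::string key = std::string(p->file) + ":" + std::to_string(p->line);
			if (!seen.insert(key).second)
				continue;
			std::printf("CodeProbe %s %s:%d \"%s\"\n",
			            p->hit ? "hit" : "missed", p->file, p->line, p->comment);
		}
	}
};

// Marks the probe as hit whenever the condition is observed true at this site.
#define CODE_PROBE(condition, comment)                                                            \
	do {                                                                                           \
		static CodeProbe _probe(__FILE__, __LINE__, comment);                                      \
		if (condition)                                                                             \
			_probe.hit = true;                                                                     \
	} while (0)
```

Calling CodeProbe::printAll() at the end of a run then reports probes whose condition was never observed as missed, which is the behavior the "print probes at end of simulation" and "fix deduplication" items above refer to.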