297 Commits

Author SHA1 Message Date
Jingyu Zhou 266a3f018b Disable hot shard throttling for restore
Otherwise, the restored database is not consistent.

To reproduce, at commit 45c459cfc of PR #11064:

-f ./tests/restarting/from_7.3.0/DrUpgradeRestart-1.toml -s 4210130489 -b off
-f ./tests/restarting/from_7.3.0/DrUpgradeRestart-2.toml --restarting -s 4210130490 -b on
2023-11-17 13:56:58 -08:00
Dan Lambright af53e9a532
Log ignored zones and reasons in RkUpdate (#11067) 2023-11-17 16:07:48 -05:00
Sreenath Bodagala d8f0a21ecc Merge remote-tracking branch 'apple-upstream/main' 2023-11-07 21:10:43 +00:00
Sreenath Bodagala 58c0e79874 - Prevent failover when storage servers are behind. 2023-11-03 21:45:48 +00:00
Dan Lambright 015167c17e
Throttle commits against hot shards (#10970)
* throttle hot shards

* expire throttled shards over time

* add backoff

* Parallelize messaging from RK to CP

* Obtain shards from a single SS

* handle expired transactions

* bump transaction_throttled_hot_shard

* Change SevError to SevWarn for CannotMonitorHotShardForSS

* Add log per request
2023-10-31 12:01:34 -04:00
sfc-gh-tclinkenbeard 2228cd3320 Monitor multiple write tags in StorageQueueInfo::refreshCommitCost 2023-08-02 15:52:57 -07:00
Hao Fu a5f4d53c45
Remove SS entries from RateKeeper once it is down (#10627)
* Remove SS entries from RateKeeper once it is down

Before this change, certain data structures in RateKeeper did not
delete data associated with a deleted/cancelled SS. This caused
significant unnecessary CPU usage and degraded GRV proxy
performance. This change fixes it.
2023-07-24 13:47:23 -07:00
sfc-gh-tclinkenbeard 9c6b365267 Simplify limiting rate calculation for GlobalTagThrottler 2023-07-16 22:33:13 -07:00
sfc-gh-tclinkenbeard d33d0ece55 GlobalTagThrottler should decay throughput from missing storage servers 2023-06-29 20:53:09 -07:00
sfc-gh-tclinkenbeard ae6167e576 Update StorageQueueInfo::getTagThrottlingRatio implementation 2023-05-26 16:10:37 -07:00
sfc-gh-tclinkenbeard 9639192a88 Add GLOBAL_TAG_THROTTLING_REPORT_ONLY knob 2023-04-21 11:13:42 -07:00
sfc-gh-tclinkenbeard 568518b6a3 Update semantics of MIN_TAG_WRITE_PAGES_RATE to reflect name 2023-04-17 11:58:50 -07:00
Josh Slocum aef5130da2
adding system priority option to getDatabaseConfiguration, and several debugging improvements (#9864) 2023-04-06 15:08:40 -05:00
Josh Slocum 3748693a28
fixing txn flags in new bw rk function (#9813) 2023-03-27 15:56:15 -05:00
Josh Slocum 33c0b35ee6
No RK throttling on blob workers if no blob ranges (#9425) 2023-02-21 15:23:40 -06:00
sfc-gh-tclinkenbeard c0fcf59a8c Fix bug in GlobalTagThrottler::getLimitingTps.
Also add comments to GlobalTagThrottler unit tests
2022-10-27 21:18:39 -07:00
sfc-gh-tclinkenbeard 6a8c6e83e4 Rename StorageQueueInfo::getTagThrottlingRatio 2022-10-27 21:18:39 -07:00
sfc-gh-tclinkenbeard 950ac1c867 Improve encapsulation for TLogQueueInfo and StorageQueueInfo 2022-10-27 21:18:35 -07:00
A.J. Beamon 4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
sfc-gh-tclinkenbeard 82adc1e856 Make g_simulator a pointer 2022-09-15 09:00:33 -07:00
Josh Slocum 9721de70b6 Adding knob and increasing delay for simulation ratekeeper throttling assert 2022-08-31 09:08:27 -05:00
sfc-gh-tclinkenbeard 9df990e375 Remove global_tag_throttler status section 2022-08-29 23:17:20 -07:00
A.J. Beamon 2907d2d4dd
Merge pull request #8004 from sfc-gh-ajbeamon/fix-ub
Fix some undefined behavior in RK and a unit test
2022-08-29 09:16:11 -07:00
Evan Tschannen 8314e80371
Fixed a few bugs which caused ratekeeper to unnecessarily throttle a cluster (#8006)
* do not count recently created change feeds for throttling

* fix: blocked assignments were not decremented when force purging

* fix: created needs to be updated when the changefeed is reset

* added asserts to detect if ratekeeper is throttled on blob workers
2022-08-26 15:38:31 -07:00
A.J. Beamon 0e782412a8 Fix some undefined behavior: 1) a unit test was not initializing members of the WorkloadContext it was using, and 2) very large ratekeeper limits for batch priority were overflowing the types used to log them 2022-08-26 14:17:01 -07:00
Evan Tschannen 493771b6a8
Throttle the cluster if the blob manager cannot assign ranges (#7900)
* Throttle the cluster if the blob manager cannot assign ranges

* fixed a number of different bugs which caused ratekeeper to throttle to zero because of blob worker lag

* fix: do not mark an assignment as blocked if it is cancelled

* remove asserts to merge bug fixes

* fix formatting

* restored old control flow to storage updater

* storage updater did not throw errors

* disable buggify to see if it fixes CI
2022-08-23 13:33:46 -05:00
Evan Tschannen a9d3c9f9b3
Added throttling when a blob worker falls behind (#7751)
* throttle the cluster when blob workers fall behind

* do not throttle on blob workers if they are not enabled

* remove an unnecessary actor

* fixed a compile error

* fetch blob worker metrics at the same interval as the rate is updated, avoid fetching the complete blob worker list too frequently

* fixed another compilation bug

* added a 5 second delay before bw throttling to prevent false positives caused by the 100e6 version jump during recovery. Lower the throttling thresholds to react much quicker to bw lag.

* fixed a number of problems

* changed the minBlobVersionRequest to look at storage server versions since this will be a lot more efficient

* fix: do not let desired go backwards

* fix: track the version of notAtLatest changefeeds for throttling

* ratekeeper now throttles blob workers by estimating the transactions-per-second throughput of the blob workers

* added metrics for blob worker change feeds

* added a knob to disable bw throttling

* fixed the transaction options in blob manager
2022-08-12 13:15:56 -07:00
Trevor Clinkenbeard 583021c2d9
Merge pull request #7772 from sfc-gh-tclinkenbeard/global-tag-throttling6
Add status section for global tag throttler
2022-08-11 17:38:31 -03:00
sfc-gh-tclinkenbeard 66373f1e74 Addressed review comments 2022-08-10 21:44:12 -03:00
Jingyu Zhou eba77d78f4 Add knobs for min/max Ratekeeper limit
The defaults have no effect.
2022-08-08 15:27:21 -07:00
sfc-gh-tclinkenbeard 1bd47a07b2 Add ENFORCE_TAG_THROTTLING_ON_PROXIES knob 2022-08-05 00:40:10 -07:00
Jingyu Zhou 84d483605b
Merge pull request #7431 from xis19/main
Let the storage server report the busiest write tag
2022-08-04 10:23:31 -07:00
sfc-gh-tclinkenbeard 2699439282 Add global_tag_throttler section to status 2022-08-02 16:53:03 -07:00
Xiaoge Su fd3c3f0774 fixup! Reformat source 2022-08-01 18:56:50 -07:00
Xiaoge Su 195890dd7b Add ratekeeper ID for storage server busiest write tag report 2022-08-01 18:56:50 -07:00
Xiaoge Su aa69f5f36e fixup! Update per code review 2022-08-01 18:56:50 -07:00
Xiaoge Su 90b887f394 fixup! Update per comments 2022-08-01 18:56:50 -07:00
Xiaoge Su ec40c6bfec fixup! Add a wrapper of ResourceWeakRef for better support of self pointer 2022-08-01 18:56:50 -07:00
Xiaoge Su cf04afe925 fixup! Non-owning reference to an object
See documents in flow/OwningResource.h
2022-08-01 18:56:50 -07:00
Xiaoge Su 542b5e61cf Let the storage server report the busiest write tag
Issue #7258

The ratekeeper is recording the busiest write tag for *all* storage
servers, which throttles the traceevent. Distributing the busiest write
tag reporting to the corresponding storage servers should reduce this
throttling issue.
2022-08-01 18:56:50 -07:00
sfc-gh-tclinkenbeard 20ac60fb11 Set throttling ratio in GlobalTagThrottler::tryUpdateAutoThrottling 2022-07-19 17:04:04 -07:00
sfc-gh-tclinkenbeard b49c36f0b0 Add StorageQueueInfo::getWriteQueueSizeLimitRatio method 2022-07-19 16:28:27 -07:00
Markus Pilman 1de37afd52
Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
sfc-gh-tclinkenbeard 086e4bff06 Merge remote-tracking branch 'origin/main' into global-tag-throttling3 2022-06-28 10:18:13 -07:00
Xiaoxi Wang a5054b2beb move getServerListAndProcessClasses to NativeAPI 2022-06-23 15:28:45 -07:00
sfc-gh-tclinkenbeard 44e367830a Remove unnecessary indirection in Ratekeeper::monitorThrottlingChanges implementation 2022-06-14 11:24:26 -07:00
sfc-gh-tclinkenbeard 5a1de67757 Add GLOBAL_TAG_THROTTLING knob 2022-05-07 15:58:04 -07:00
Bharadwaj V.R 726cb3a18f merge commits from main 2022-03-28 22:49:03 -07:00
Bharadwaj V.R 961e4ae7fd ratekeeper and ser-des fixes 2022-03-24 17:25:07 -07:00
sfc-gh-tclinkenbeard 30651bf2c6 Fix order of TagInfo constructor arguments 2022-03-22 17:06:33 -07:00