foundationdb

Commit Graph

Author	SHA1	Message	Date
Evan Tschannen	a9d3c9f9b3	Added throttling when a blob worker falls behind (#7751) * throttle the cluster when blob workers fall behind * do not throttle on blob workers if they are not enabled * remove an unnecessary actor * fixed a compile error * fetch blob worker metrics at the same interval as the rate is updated, avoid fetching the complete blob worker list too frequently * fixed another compilation bug * added a 5 second delay before bw throttling to prevent false positives caused by the 100e6 version jump during recovery. Lower the throttling thresholds to react much quicker to bw lag. * fixed a number of problems * changed the minBlobVersionRequest to look at storage server versions since this will be a lot more efficient * fix: do not let desired go backwards * fix: track the version of notAtLatest changefeeds for throttling * ratekeeper now throttled blob workers by estimating the transaction per second throughput of the blob workers * added metrics for blob worker change feeds * added a knob to disable bw throttling * fixed the transaction options in blob manager	2022-08-12 13:15:56 -07:00
He Liu	bc5bfaffda	Shard based move (#6981) * Shard based move. * Clean up. * Clear results on retry in getInitialDataDistribution. * Remove assertion on SHARD_ENCODE_LOCATION_METADATA for compatibility. * Resolved comments. Co-authored-by: He Liu <heliu@apple.com>	2022-07-07 20:49:16 -07:00
Bharadwaj V.R	71705bf930	Increase timeout for QuietDatabase when buggify is on	2022-06-27 23:03:00 -07:00
Bharadwaj V.R	990c789a5c	Increase quiet-database timeout when buggify is on; data-movements in simulation take longer than the timeout allows, and waiting for quiet-database does succeed when given some more time (#7290)	2022-06-06 13:13:11 -07:00
A.J. Beamon	917b271a37	Merge pull request #6996 from sfc-gh-mpilman/features/fail-quietdatabase-before-timeout Make QuietDatabase more human friendly	2022-05-10 08:36:25 -07:00
Markus Pilman	e0cbe74d94	Only fail DD early in simulation	2022-04-28 11:32:35 -06:00
Markus Pilman	eb22ac1c1f	Address review comments	2022-04-28 10:09:06 -06:00
Markus Pilman	f959e84b85	fix comparison	2022-04-28 09:46:28 -06:00
Markus Pilman	74abca44d8	Make QuietDatabase more human friendly QuietDatabase will now fail by itself after 1000 seconds instead of relying on the general simulation timeout. Additionally it will print a more human friendly error.	2022-04-28 09:15:20 -06:00
Renxuan Wang	c69a07a858	Check in the new Hostname logic. (#6926) * Revert #6655. 20220407-031010-renxuan-c101052c21da8346 compressed=True data_size=31004844 duration=4310801 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=1:04:15 sanity=False started=100047 stopped=20220407-041425 submitted=20220407-031010 timeout=5400 username=renxuan * Revert #6271. 20220407-051532-renxuan-470f0fe6aac1c217 compressed=True data_size=30982370 duration=3491067 ended=100002 fail_fast=10 max_runs=100000 pass=100002 priority=100 remaining=0 runtime=0:59:57 sanity=False started=100141 stopped=20220407-061529 submitted=20220407-051532 timeout=5400 username=renxuan * Revert #6266. Remove resolving-related functionalities in connection string. Connection string will be used for storing purpose only, and non-mutable. 20220407-175119-renxuan-55d30ee1a4b42c2f compressed=True data_size=30970443 duration=5437659 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:59:31 sanity=False started=100154 stopped=20220407-185050 submitted=20220407-175119 timeout=5400 username=renxuan * Add hostname to coordinator interfaces. * Turn on the new hostname logic. * Add the corresponding change in config txns. The most notable change is before calling basicLoadBalance(), we need to call tryInitializeRequestStream() to initialize request streams first. Passed correctness tests. * Return error when hostnames cannot be resolved in coordinators command. * Minor fixes.	2022-04-27 21:54:13 -07:00
Trevor Clinkenbeard	ba8fbca038	Merge pull request #6752 from sfc-gh-tclinkenbeard/improve-snapshot-fault-tolerance Improve fault tolerance of snapshots	2022-04-08 12:46:50 -07:00
Lukas Joswiak	73a7c32982	Add fdbcli command to read/write version epoch (#6480) * Initialize cluster version at wall-clock time Previously, new clusters would begin at version 0. After this change, clusters will initialize at a version matching wall-clock time. Instead of using the Unix epoch (or Windows epoch), FDB clusters will use a new epoch, defaulting to January 1, 2010, 01:00:00+00:00. In the future, this base epoch will be modifiable through fdbcli, allowing administrators to advance the cluster version. Basing the version off of time allows different FDB clusters to share data without running into version issues. * Send version epoch to master * Cleanup * Update fdbserver/storageserver.actor.cpp Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com> * Jump directly to expected version if possible * Fix initial version issue on storage servers * Add random recovery offset to start version in simulation * Type fixes * Disable reference time by default Enable on a cluster using the fdbcli command `versionepoch add 0`. * Use correct recoveryTransactionVersion when recovering * Allow version epoch to be adjusted forwards (to decrease the version) * Set version epoch in simulation * Add quiet database check to ensure small version offset * Fix initial version issue on storage servers * Disable reference time by default Enable on a cluster using the fdbcli command `versionepoch add 0`. * Add fdbcli command to read/write version epoch * Cause recovery when version epoch is set * Handle optional version epoch key * Add ability to clear the version epoch This causes version advancement to revert to the old methodology whereas versions attempt to advance by about a million versions per second, instead of trying to match the clock. * Update transaction access * Modify version epoch to use microseconds instead of seconds * Modify fdbcli version target API Move commands from `versionepoch` to `targetversion` top level command. * Add fdbcli tests for * Temporarily disable targetversion cli tests * Fix version epoch fetch issue * Fix Arena issue * Reduce max version jump in simulation to 1,000,000 * Rework fdbcli API It now requires two commands to fully switch a cluster to using the version epoch. First, enable the version epoch with `versionepoch enable` or `versionepoch set <versionepoch>`. At this point, versions will be given out at a faster or slower rate in an attempt to reach the expected version. Then, run `versionepoch commit` to perform a one time jump to the expected version. This is essentially irreversible. * Temporarily disable old targetversion tests * Cleanup * Move version epoch buggify to sequencer This will cause some issues with the QuietDatabase check for the version offset - namely, it won't do anything, since the version epoch is not being written to the txnStateStore in simulation. This will get fixed in the future. Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>	2022-04-08 12:33:19 -07:00
sfc-gh-tclinkenbeard	e3acbd1388	Fix bug in getStorageWorkers	2022-04-08 11:21:29 -07:00
sfc-gh-tclinkenbeard	91930b8040	Remove getMinReplicasRemaining PromiseStream. Instead, in order to enforce the maximum fault tolerance for snapshots, update getStorageWorkers to return the number of unavailable storage servers (instead of throwing an error when unavailable storage servers exist).	2022-04-07 23:23:23 -07:00
Josh Slocum	f27475e2f4	Merge branch 'main' into blob_integration	2022-03-22 11:41:58 -05:00
sfc-gh-tclinkenbeard	a71099471b	Update copyright header dates	2022-03-21 13:36:23 -07:00
Josh Slocum	37e7c80f26	Merge branch 'main' into blob_integration	2022-03-17 18:45:42 -05:00
A.J. Beamon	2a21126028	Don't apply read prefixes on the client. Cache tenant data locally.	2022-03-15 09:23:30 -07:00
Josh Slocum	e71b3533f9	Merge branch 'main' into blob_integration	2022-03-09 08:59:56 -06:00
A.J. Beamon	250a88e682	Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement.	2022-02-24 12:25:52 -08:00
Renxuan Wang	622d89b552	Rebase on main. Since we changed ClusterConnectionString's status flag from boolean to enum in #6422, we need to update this PR correspondingly.	2022-02-22 16:29:59 -08:00
Renxuan Wang	481587a8c6	Turn on hostname logic.	2022-02-22 16:29:59 -08:00
Suraj Gupta	99606482ea	initial thoughts	2021-10-26 16:16:00 -04:00
Josh Slocum	0ff8ddc2b6	Merge branch 'master' into blob_full_clean	2021-10-25 13:38:48 -05:00
Trevor Clinkenbeard	c69364d5aa	Verify that cluster is fully recovered in quietDatabase check (#5807) * Verify that cluster is fully recovered in quietDatabase check * Add trace event to waitForQuietDatabase	2021-10-21 09:01:52 -07:00
Josh Slocum	5f0ec0612a	Merge branch 'feature-range-feed' into blob_full	2021-10-13 15:44:35 -05:00
Suraj Gupta	282f9d35cd	Cleanup comments and debugging code.	2021-10-04 11:07:08 -04:00
Suraj Gupta	4d54669ccd	Recruit the blob workers via blob manager. In this PR, the blob manager now recruits blob workers (via communication with the cluster controller). Blob workers are onboarded as blob worker processes enter the cluster.	2021-10-04 11:07:08 -04:00
Xiaoxi Wang	1730d75f73	change configure test add store type check add test file	2021-09-21 18:11:04 -07:00
Chaoguang Lin	65956ae6b7	Refactor configure command; refactor changeConfig to template code to reuse existing tests	2021-09-21 10:06:04 -07:00
Xiaoge Su	abf73047ca	Enforce std:: specifier rather than using namespace	2021-09-16 19:40:28 -07:00
Xiaoxi Wang	10c82b422f	merge master branch	2021-07-28 14:19:46 -07:00
Xiaoxi Wang	12d4f5c261	disable streaming peek for localities < 0	2021-07-28 14:11:25 -07:00
Steve Atherton	507c1f11e3	Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use.	2021-07-26 19:55:10 -07:00
Xiaoxi Wang	bfebd4e812	Merge branch 'master' of https://github.com/apple/foundationdb into tlog_dev	2021-07-22 16:15:07 -07:00
Xiaoxi Wang	cd32478b52	memory error(Simple config)	2021-07-22 15:45:59 -07:00
Xiaoxi Wang	1057835e8b	merge with master	2021-07-20 17:09:34 -07:00
Xiaoxi Wang	5046ee3b07	add stream peek to logRouter	2021-07-20 17:42:00 +00:00
sfc-gh-tclinkenbeard	6f81155784	Merge remote-tracking branch 'origin/master' into const-serverdbinfo	2021-07-20 10:18:40 -07:00
Steve Atherton	f596a81073	Rename ::TRUE and ::FALSE in BooleanParams to ::True and ::False so as to not conflict with the TRUE and FALSE macros provided by the Windows and MacOS SDKs.	2021-07-17 00:11:40 -07:00
sfc-gh-tclinkenbeard	8a212862f0	Prevent dataDistributor from modifying ServerDBInfo object	2021-07-11 22:04:54 -07:00
sfc-gh-tclinkenbeard	8cc40e3a2b	Expand use of BOOLEAN_PARAM	2021-07-02 21:41:50 -07:00
Josh Slocum	d1d2ca9285	Don't inject TSS faults if speedUpSimulation is set	2021-06-18 12:41:48 -05:00
Markus Pilman	05aea49d16	Merge pull request #4986 from sfc-gh-mpilman/bugfixes/double-ss Bugfixes/double ss	2021-06-16 14:43:32 -06:00
Markus Pilman	56eaf1bc83	added comments	2021-06-15 16:49:27 -06:00
Markus Pilman	b2271f2176	additional tracing for quietDatabase	2021-06-15 16:00:28 -06:00
Trevor Clinkenbeard	866f536983	Merge pull request #4888 from sfc-gh-tclinkenbeard/remove-fdbserver-includes Remove fdbserver includes from fdbclient	2021-06-07 10:22:13 -07:00
Xiaoxi Wang	838d847d4e	Merge pull request #4860 from sfc-gh-xwang/ppwtest implement perpetual storage wiggling feature	2021-06-04 16:18:39 -07:00
Xiaoxi Wang	e0981d6732	add code coverage mark	2021-06-03 19:58:28 +00:00
Xiaoxi Wang	351325b3af	comment modification; wait perpetual wiggling close	2021-06-03 05:13:20 +00:00

184 Commits