foundationdb

Commit Graph

Author	SHA1	Message	Date
Meng Xu	962024d8b8	Explain knob SHARD_MAX_BYTES_PER_KSEC Explain why it may cost 100MB data movement. No code change.	2020-01-31 17:04:11 -08:00
Meng Xu	7f37a90c48	FastRestore:Introduce FASTRESTORE_VB_PARALLELISM for controlling the number of concurrently running version batches.	2020-01-28 10:39:57 -08:00
Meng Xu	0f4dfeda5b	FastRestore:Test random number of appliers and loaders	2020-01-27 20:19:58 -08:00
Meng Xu	3cfe1f031d	FastRestore:Use set instead of vector for keysplitter and disable testing for random number of appliers and loaders	2020-01-27 20:14:33 -08:00
Meng Xu	141609e80a	FastRestore:Improve code style and fix typos	2020-01-27 18:13:14 -08:00
Evan Tschannen	231d7830a0	more accurate calculation on the amount of time that proxy should wait before getting a version from the master	2020-01-26 19:47:12 -08:00
Meng Xu	b04e98771e	FastRestore:Replace FastRestoreOpConfig with Knobs And randomize value for the rest of knobs	2020-01-24 14:24:34 -08:00
Evan Tschannen	e167e63eaf	Add delays between proxy batches which roughly correspond to the amount of work the proxy needs to do. This will help avoid getting a version from the master and then waiting a long time before committing it.	2020-01-23 18:31:51 -08:00
Jingyu Zhou	1311fec45a	Add an option to get minKnownCommittedVersion from Proxies The backup worker needs to use this version for popping when running in a NOOP mode. This option is added to GetReadVersionRequest and proxies will send back minKnownCommittedVersion if the option is set. Also add a couple of knobs for backup workers.	2020-01-22 19:42:13 -08:00
Jingyu Zhou	0e5f5b50f0	Remove unused backup worker knobs	2020-01-22 19:38:46 -08:00
Jingyu Zhou	a4d6ebe79e	Recruit backup worker in newEpoch	2020-01-22 19:37:48 -08:00
Jingyu Zhou	de8d953865	Add backup role, class, and worker skeleton	2020-01-22 19:35:30 -08:00
Xin Dong	80683c09bb	Remove unused var to make compiler happy.	2020-01-21 11:19:52 -08:00
Xin Dong	b0a1af1288	Added the actual read hot detection algorithm and logging mechanism. - When a shard has a read bandwidth larger than a threshold value (configurable via knob), and its read-bandwidth/byte-size ratio is also larger than a threshold (configurable via knob), the corresponding shard tracker will run the algorithm - The algorithm will divide the shard into 10MB (configurable via knob) chunks and try to find the chunk(s) that have a large aforementioned ratio - Then those ranges will be logged into TraceEvents. This will later do more like actually cache them.	2020-01-21 11:19:52 -08:00
Xin Dong	33456e7276	Done the plumbing work at the SSI side.	2020-01-21 11:15:52 -08:00
Xin Dong	7708034cc9	Added the function used to determine sub-ranges within a read hot shard that have a readSize-to-byteSize ratio higher than the knob value. Also added unit tests for that function.	2020-01-21 11:15:52 -08:00
Xin Dong	1d6cd1007b	Instead of using absolute value as the max bytesReadPerKSec threshold, use a pre-defined read traffic to byte size ratio to decide that value dynamically based on the actual size of the shard.	2020-01-21 11:15:52 -08:00
Evan Tschannen	54d77d20b2	Merge branch 'release-6.2'	2020-01-19 15:22:49 -08:00
Evan Tschannen	8197f0562f	merge priority did not need to be raised, because we no longer merge shards until they are untrackable max_commit_updates was too large, and could cause proxies to run out of memory	2020-01-17 14:24:58 -08:00
Evan Tschannen	3f9d9d8b84	Merge branch 'release-6.2' # Conflicts: # CMakeLists.txt # cmake/FlowCommands.cmake # documentation/sphinx/source/release-notes.rst # fdbclient/StorageServerInterface.h # fdbserver/DataDistributionTracker.actor.cpp # fdbserver/MasterProxyServer.actor.cpp # fdbserver/fdbserver.actor.cpp # flow/Knobs.h # flow/Platform.cpp # versions.target	2020-01-16 18:37:47 -08:00
Evan Tschannen	4b90487b90	occasionally throw wrong_shard_server when waitMetrics times out so that the waitMetrics request can get the correct number of shards if two shards have been merged or split and the same storage server owns all the chunks	2020-01-15 13:22:18 -08:00
A.J. Beamon	9668c4471f	Clamp infinite limit in ratekeeper	2020-01-14 15:45:24 -08:00
Evan Tschannen	17e97f24e4	Merge pull request #2526 from etschannen/feature-dd-improvements Data distribution improvements	2020-01-10 17:53:22 -08:00
Evan Tschannen	fde53cbeef	HasBeenTrueFor was ready immediately after a previous shard merge	2020-01-10 16:28:56 -08:00
Evan Tschannen	9b80498180	Added a trace event to warn if a shard is merged before enough time has elapsed from becoming low bandwidth	2020-01-10 14:58:38 -08:00
Evan Tschannen	4aab9b7bc8	fix: clients would waste time attempting to read from a remote region when it was in the process of catching up	2020-01-10 12:23:59 -08:00
Evan Tschannen	02a8e8d1e9	batch priority must be heavily throttled before stopping data distribution rebalancing	2020-01-09 17:05:22 -08:00
Evan Tschannen	9842272ced	raised the priority of shard merges, because the tracker cannot track an unmerged shard	2020-01-09 17:04:17 -08:00
Evan Tschannen	e4fa4ad0c9	Data distribution will not merge a shard unless it has been low bandwidth for 5 minutes	2020-01-09 17:02:49 -08:00
Evan Tschannen	ab7071932f	Data distribution no longer attempts to pick teams which share members of the source unless the team matches exactly	2020-01-09 16:59:37 -08:00
Meng Xu	39a4f2372f	Change FASTRESTORE_SAMPLING_PERCENT to 0 to 100	2019-12-04 21:26:27 -08:00
Meng Xu	c6b36dbffb	FastRestore:Sampling:Resolve review comments	2019-12-04 17:35:11 -08:00
Meng Xu	dd91d26dfa	FastRestore:Sampling:Add FASTRESTORE_SAMPLING_RATE knob	2019-12-04 11:46:29 -08:00
Evan Tschannen	07331ab5fd	Merge pull request #2362 from etschannen/master Merge 6.2 into master	2019-12-02 15:04:27 -08:00
Evan Tschannen	ebcb2f79ed	Merge branch 'master' of github.com:apple/foundationdb	2019-11-22 15:34:49 -08:00
Xin Dong	14dd5626d7	Resolve review comments	2019-11-22 10:11:45 -08:00
Xin Dong	b6e1839d84	Code clean up	2019-11-21 13:39:19 -08:00
Xin Dong	b282e180d5	Added a knob to disable read sampling	2019-11-20 14:03:20 -08:00
Evan Tschannen	8d3ef89540	Merge branch 'release-6.2' # Conflicts: # CMakeLists.txt # documentation/sphinx/source/release-notes.rst # fdbclient/MutationList.h # fdbserver/MasterProxyServer.actor.cpp # versions.target	2019-11-14 15:49:56 -08:00
negoyal	a4a0bf18f9	Merging with Master.	2019-11-12 13:01:29 -08:00
Evan Tschannen	396dccbc98	when peeking from satellites we do not need to limit the amount of peeking on log router tags, because that is the only thing that can be peeked from a satellite log	2019-11-08 18:34:05 -08:00
Evan Tschannen	afc9713005	Merge branch 'release-6.2' # Conflicts: # CMakeLists.txt # documentation/sphinx/source/release-notes.rst # fdbclient/FDBTypes.h # fdbserver/LogSystem.h # fdbserver/LogSystemPeekCursor.actor.cpp # fdbserver/OldTLogServer_6_0.actor.cpp # fdbserver/TLogServer.actor.cpp # versions.target	2019-11-06 13:45:37 -08:00
Evan Tschannen	daac8a2c22	Knobified a few variables	2019-11-04 20:21:38 -08:00
Evan Tschannen	71dfaa3f95	Merge pull request #2275 from dongxinEric/bugfix/2273/fix-read-key-sampling Resolves #2273: Use a large value for read sampling size threshold. Also at sampling …	2019-10-31 10:21:58 -07:00
Xin Dong	199a34b827	Defined a minimum read cost (a penalty) for empty read or read size smaller than it. Fixed several review comments.	2019-10-30 10:04:19 -07:00
Evan Tschannen	3325980c03	Merge branch 'release-6.2' # Conflicts: # CMakeLists.txt # documentation/sphinx/source/release-notes.rst # fdbserver/DataDistribution.actor.cpp # fdbserver/OldTLogServer_6_0.actor.cpp # fdbserver/TLogServer.actor.cpp # fdbserver/WorkerInterface.actor.h # fdbserver/worker.actor.cpp # versions.target	2019-10-24 17:38:15 -07:00
Xin Dong	a290e2cb2b	Use 8 MiB for real	2019-10-24 11:02:17 -07:00
Xin Dong	fe54a4bde1	- Changed SHARD_MAX_BYTES_READ_PRE_KEYSEC to be equivalent to 8MiB/s, which, when multiplied by the sample expire interval (120 seconds), yields 960MiB. A shard having a read rate larger than that will be marked as read-hot. The number 960MiB was chosen to be roughly twice the size of the max allowed shard size to avoid wrongly marking a shard as read-hot when doing a table scan on it. - Also tuned down the empty key sampling percentage to be 5%.	2019-10-23 12:00:19 -07:00
Meng Xu	e676348710	Merge pull request #1955 from fzhjon/mark-ss-failed Add fdbcli and API command to mark storage servers as permanently failed	2019-10-22 23:36:30 -07:00
Xin Dong	af72d15566	Update fdbserver/Knobs.cpp From AJ: to match typical aligned format used on other variables. Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>	2019-10-22 13:53:28 -07:00
Xin Dong	e6f5748791	Use a large value for read sampling size threshold. Also at sampling site, don't round up small values to avoid sampling every key.	2019-10-22 13:47:58 -07:00
A.J. Beamon	29a0014b41	Fix "bandwith" typo	2019-10-22 09:51:59 -07:00
Evan Tschannen	12c517ab16	limit the number of committed version updates in progress simultaneously to prevent running out of memory	2019-10-21 16:01:45 -07:00
Xin Dong	fca9aab17a	Merge pull request #2046 from dongxinEric/feature/hot-read-key-detection Added metrics for read hot key detection	2019-10-21 14:31:48 -07:00
Jon Fu	d2b6626d5c	Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed	2019-10-21 13:47:06 -07:00
Evan Tschannen	688940b685	merge 6.2 into master	2019-10-21 11:43:46 -07:00
Evan Tschannen	8b09cd16b2	Merge branch 'release-6.2' of github.com:apple/foundationdb into feature-share-mutations	2019-10-16 14:50:37 -07:00
Evan Tschannen	5667331729	added a buggify + minor code cleanup	2019-10-11 18:31:43 -07:00
Evan Tschannen	86bcb84b45	Raised the data distribution priority of splitting shards above restoring fault tolerance to avoid hot write shards	2019-10-11 17:50:43 -07:00
Xin Dong	62ffdd54a3	Updated some comments to reflect the correct knob value and also used a more appropriate value for read bandwidth. Set the default value for read bandwidth in some cases.	2019-10-09 16:42:42 -07:00
Xin Dong	cd4757b06c	Address review comments	2019-10-09 16:42:42 -07:00
Xin Dong	6b0f771cc0	Fixed a typo in knobs. Addressed some review comments. Added code for actual metric collecting.	2019-10-09 16:42:42 -07:00
Xin Dong	12293d5497	Added metrics for read hot key detection	2019-10-09 16:42:42 -07:00
Jon Fu	d96a7b2c69	Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed	2019-10-03 09:47:45 -07:00
Evan Tschannen	628b4e0220	added a warning if multiple log ranges exist for the same range	2019-10-02 17:06:19 -07:00
Meng Xu	d0147e5e5d	Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3 Resolved Conflicts: documentation/sphinx/source/release-notes.rst fdbserver/DataDistribution.actor.cpp versions.target	2019-10-02 13:22:56 -07:00
Jon Fu	68f88dea4b	remove buggify setting of new knob	2019-09-27 13:12:41 -07:00
Jon Fu	450a09e117	Code Review Changes	2019-09-24 15:48:50 -07:00
Meng Xu	d2fd1f4931	DD:MisconfiguredLocality:Fix review comments	2019-09-17 13:04:21 -07:00
Meng Xu	37d2318eed	DD:Handle worker with incorrect locality When a worker has incorrect locality, the worker will be excluded from storage recruitment. When the worker has its locality corrected by system operators, the worker will be reincluded for storage recruitment.	2019-09-14 12:12:56 -07:00
Meng Xu	78b8e48cef	DD:ValidLocality:Resolve review comment	2019-09-13 15:35:16 -07:00
Meng Xu	3ad7e3adb3	DD:DD_VALIDATE_LOCALITY:Guard the checking of locality validity	2019-09-13 13:19:35 -07:00
Evan Tschannen	8fbd90e2f6	Merge pull request #1985 from xumengpanda/mengxu/storage-engine-switch-PR-v2 Graceful storage engine migration	2019-09-09 13:51:53 -07:00
Meng Xu	c2355f721e	Merge branch 'master' into mengxu/performant-restore-PR	2019-09-04 17:11:42 -07:00
Meng Xu	8f9ba3bc09	StorageEngineSwitch:Remove unused code	2019-09-03 17:18:56 -07:00
Meng Xu	bd80a67d46	Merge branch 'master' into mengxu/storage-engine-switch-PR-v2	2019-09-03 14:11:33 -07:00
Evan Tschannen	00424a5108	changed the rate at which the coordinators register with the cluster controller and the clients register with the coordinator so that the connected client number in status will be much more accurate	2019-08-21 15:02:09 -07:00
Evan Tschannen	41b908752e	increased move keys parallelism to be less of a decrease just in case lowering this could affect normal data distribution raised target durability lag versions to give more time for batch limiting to come into play before this limit is hit changed max_bad_options to better reflect the name	2019-08-21 14:55:21 -07:00
Evan Tschannen	37e2fc86de	Increase the target durability lag versions to be larger than the soft max, so that storage servers will respond with a penalty to clients before ratekeeper controls on the lag	2019-08-19 14:03:42 -07:00
Evan Tschannen	9318b494ad	reduce the DD move keys parallelism to avoid a hot read shard when transitioning from triple replication to double replication	2019-08-19 14:02:18 -07:00
Meng Xu	b448f92d61	StorageEngineSwitch:Remove unnecessary code and format code Unnecessary code includes debug code and the unnecessary calling of the removeWrongStoreType actor; Format the changes with clang-format as well.	2019-08-16 16:53:38 -07:00
Meng Xu	85ba904e2c	StorageEngineSwitch:Stop removeWrongStoreType actor if no SS has wrong storeType	2019-08-16 16:11:28 -07:00
Meng Xu	a588710376	StorageEngineSwitch:Graceful switch When fdbcli change storeType for storage engines, we switch the store type of storage servers one by one gracefully. This avoids recruiting multiple storage servers on the same process, which can cause OOM error.	2019-08-12 17:37:52 -07:00
Meng Xu	7ff46e6772	Merge branch 'master' into mengxu/performant-restore-PR	2019-08-07 20:31:56 -07:00
Evan Tschannen	9382a58390	fix: after a forced recovery it is possible to not have logs from all generations, so only wait at most a second for getting a popped txs version	2019-08-06 16:32:28 -07:00
Meng Xu	7ccaeddf05	Merge branch 'master' into mengxu/performant-restore-PR	2019-08-01 13:23:17 -07:00
Evan Tschannen	7d7aa27c2d	Merge pull request #1814 from dongxinEric/feature/1508/finer-grained-dd-controls Added finer grained controls to DataDistribution in fdbcli.	2019-07-31 17:36:20 -07:00
Evan Tschannen	a0b29ff82f	updated knobs to allow more batch priority traffic	2019-07-31 17:19:41 -07:00
Evan Tschannen	4308ff86f7	increased the MAX_TEAMS_PER_SERVER	2019-07-31 16:08:18 -07:00
Xin Dong	b653ddb30d	Final clean ups after rebasing master	2019-07-30 22:35:34 -07:00
Xin Dong	cda70700cc	Address review comments. 50K correctness with no failures.	2019-07-30 22:24:30 -07:00
Evan Tschannen	6dbaddd0a7	Added a knob to always use CAUSAL_READ_RISKY for GRV	2019-07-30 18:21:46 -07:00
Evan Tschannen	5dd9043fd3	addressed review comments	2019-07-30 17:04:41 -07:00
A.J. Beamon	41605735f5	Merge pull request #1916 from ajbeamon/merge-onto-new-servers Add knob to control whether merges request new servers or not.	2019-07-30 15:04:37 -07:00
A.J. Beamon	bc536757df	Add knob to control whether merges request new servers or not. Set the default to request new servers in \xff but not in main key space.	2019-07-29 15:47:34 -07:00
Evan Tschannen	d8b14fe372	we cannot buggify replace content bytes because it takes too long to recover when the txnStateStore is too large	2019-07-28 19:34:17 -07:00
Evan Tschannen	5c98dcce6d	revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators	2019-07-27 16:46:22 -07:00
Evan Tschannen	b509a441e7	Merge branch 'master' into feature-skip-confirm # Conflicts: # bindings/flow/tester/Tester.actor.cpp # bindings/go/src/_stacktester/stacktester.go # bindings/java/src/test/com/apple/foundationdb/test/AsyncStackTester.java # bindings/java/src/test/com/apple/foundationdb/test/StackTester.java # bindings/python/tests/tester.py # bindings/ruby/tests/tester.rb # documentation/sphinx/source/api-c.rst # documentation/sphinx/source/api-python.rst # documentation/sphinx/source/api-ruby.rst # documentation/sphinx/source/data-modeling.rst # documentation/sphinx/source/developer-guide.rst # fdbclient/vexillographer/fdb.options # fdbserver/MasterProxyServer.actor.cpp	2019-07-27 15:08:13 -07:00
Evan Tschannen	ee94e8a062	removed a trace event which was causing valgrind errors	2019-07-27 13:51:59 -07:00
Evan Tschannen	90e3b50213	Merge branch 'master' into feature-coordinator-connection # Conflicts: # fdbclient/DatabaseContext.h # fdbclient/NativeAPI.actor.cpp # fdbclient/NativeAPI.actor.h # fdbserver/workloads/KillRegion.actor.cpp	2019-07-26 15:05:02 -07:00
Evan Tschannen	ee92f0574f	fix: lastRequestTime was not updated fix: COORDINATOR_REGISTER_INTERVAL was not set fixed review comments	2019-07-26 13:23:56 -07:00
Meng Xu	45083edf74	Merge branch 'master' into mengxu/performant-restore-PR Fix conflicts as well.	2019-07-25 10:46:11 -07:00
sramamoorthy	a65c9f92ed	get rid of all timeouts and other changes	2019-07-24 15:36:28 -07:00
sramamoorthy	7e04e3c8be	snap v2: knobs for max snap create timeout	2019-07-24 15:36:28 -07:00
Evan Tschannen	c70e762f0e	Merge pull request #1785 from xumengpanda/mengxu/server-team-remover-PR Remove redundant server teams	2019-07-19 17:44:16 -07:00
Meng Xu	b001a9ebe8	ServerTeamRemover runs after machineTeamRemover finishes If serverTeamRemover removes a team before machineTeamRemover brings the machine team number down to the desired number, DD may create a new team (due to teams removed by serverTeamRemover), which may be removed later by machineTeamRemover. This causes unnecessary extra data movement.	2019-07-19 16:48:52 -07:00
Evan Tschannen	846038b0e6	Merge pull request #1858 from bnamasivayam/rk-ssfetch-throttle Ratekeeper throttling aggressively when unable to fetch storage server list	2019-07-19 16:41:58 -07:00
Alex Miller	c3a8ae4752	Merge pull request #1791 from fzhjon/fetch-keys-requests-priority Introduce priority to fetchKeys requests from data distribution	2019-07-19 14:54:51 -07:00
Balachandar Namasivayam	ecb3de3b49	Fixed space issue.	2019-07-17 18:10:05 -07:00
Balachandar Namasivayam	406bcebdc4	Ratekeeper to throttle tpsLimit to 1 if it is not able to fetch storage server list for some configurable amount of time.	2019-07-17 18:08:17 -07:00
Meng Xu	20f067e794	Merge with master:Resolve conflict with PR#1797	2019-07-16 10:52:28 -07:00
Meng Xu	415622f465	MachineTeamRemover:Change to remove MT with most teams Change to remove machine team with most machine teams, using the same logic as the serverTeamRemover. The feature is guarded by TR_FLAG_REMOVE_MT_WITH_MOST_TEAMS knob.	2019-07-15 14:29:49 -07:00
Evan Tschannen	db5b4a6331	avoid going to unlimited immediately after going below the durabilityLagTargetVersion	2019-07-12 18:50:56 -07:00
Evan Tschannen	1a18c859c7	knobified the durability lag rate controls	2019-07-12 18:50:56 -07:00
Evan Tschannen	02de53160d	only skip confirm epoch live if CAUSAL_READ_RISKY is enabled time checked on the proxy should be less than the time waited by the master to account for clock speed differences setting REQUIRED_MIN_RECOVERY_DURATION and ENFORCED_MIN_RECOVERY_DURATION to 0 will go back to the old behavior	2019-07-12 17:58:16 -07:00
Evan Tschannen	a63969afb3	enforce a minimum recovery duration, which allows proxies to avoid checking if the epoch is alive as long as its last commit has been less than MINIMUM_RECOVERY_DURATION ago	2019-07-12 13:10:21 -07:00
Jon Fu	f12a3909f3	renamed workloads and made code style adjustments	2019-07-11 09:56:58 -07:00
Jon Fu	1e9d31597c	removed extra parameter from getRange, added knob to guard new changes, and adjusted style/formatting in several places	2019-07-11 09:56:58 -07:00
Evan Tschannen	7e919e361c	Merge pull request #1817 from etschannen/feature-proxy-forward Proxies will forward clients to the next generation	2019-07-10 13:53:12 -07:00
Evan Tschannen	49121172ea	Merge pull request #1795 from alexmiller-apple/peek-from-satellites Log Routers will prefer to peek from satellite logs.	2019-07-09 17:38:57 -07:00
Evan Tschannen	001abec29d	fixed a compiler error, buggified a new knob	2019-07-09 16:50:59 -07:00
Evan Tschannen	64aee73c4f	we only need to hold the ReplyPromise for messages that we are going to forward to new proxies	2019-07-09 16:47:56 -07:00
Alex Miller	44f11702a8	Log Routers will prefer to peek from satellite logs. Formerly, they would prefer to peek from the primary's logs. Testing of a failed region rejoining the cluster revealed that this becomes quite a strain on the primary logs when extremely large volumes of peek requests are coming from the Log Routers. It happens that we have satellites that contain the same mutations with Log Router tags, that have no other peeking load, so we can prefer to use the satellite to peek rather than the primary to distribute load across TLogs better. Unfortunately, this revealed a latent bug in how tagged mutations in the KnownCommittedVersion->RecoveryVersion gap were copied across generations when the number of log router tags were decreased. Satellite TLogs would be assigned log router tags using the team-building based logic in getPushLocations(), whereas TLogs would internally re-index tags according to tag.id%logRouterTags. This mismatch would mean that we could have: Log0 -2:0 ----- -2:0 Log 0 Log1 -2:1 \ >--- -2:1,-2:0 (-2:2 mod 2 becomes -2:0) Log 1 Log2 -2:2 / And now we have data that's tagged as -2:0 on a TLog that's not the preferred location for -2:0, and therefore a BestLocationOnly cursor would miss the mutations. This was never noticed before, as we never used a satellite as a preferred location to peek from. Merge cursors always peek from all locations, and thus a peek for -2:0 that needed data from the satellites would have gone to both TLogs and merged the results. We now take this mod-based re-indexing into account when assigning which TLogs need to recover which tags from the previous generation, to make sure that tag.id%logRouterTags always results in the assigned TLog being the preferred location. Unfortunately, previously existing will potentially have existing satellites with log router tags indexed incorrectly, so this transition needs to be gated on a `log_version` transition. Old LogSets will have an old LogVersion, and we won't prefer the satellite for peeking. Log Sets post-6.2 (opt-in) or post-6.3 (default) will be indexed correctly, and therefore we can safely offload peeking onto the satellites.	2019-07-08 22:25:01 -07:00
Meng Xu	3b9618fe11	ServerTeamRemover:Speedup removing teams in simulation Otherwise, simulation may time out when team remover needs to remove hundreds of teams.	2019-07-08 18:17:21 -07:00
Meng Xu	08a721b320	Merge branch 'master' into mengxu/server-team-remover-PR	2019-07-08 16:30:32 -07:00
Evan Tschannen	c348b3da51	After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies	2019-07-08 12:53:40 -07:00
Evan Tschannen	310a5fe9a3	fix: we cannot reject 100% of requests, because a storage server which is stuck needs to get a future version error to trigger an all alternatives failed message from load balance so that clients will re-grab storage server interfaces from the proxy	2019-07-05 17:28:22 -07:00
Evan Tschannen	e7c0ecf729	fix: we cannot reject 100% of requests, because a storage server which is stuck needs to get a future version error to trigger an all alternatives failed message from load balance so that clients will re-grab storage server interfaces from the proxy	2019-07-05 15:46:16 -07:00
Meng Xu	599fcb2e6d	Add serverTeamRemover to remove redundant server teams	2019-07-02 17:40:37 -07:00
Evan Tschannen	b9a6271375	local ratekeeper no longer globally limits	2019-06-28 16:54:22 -07:00
Evan Tschannen	18d5fbf1e0	Avoid jumping from rejecting 0% of requests directly to 20% of requests	2019-06-28 16:54:22 -07:00
Evan Tschannen	db413c37f7	restored the STORAGE_DURABILITY_LAG_SOFT_MAX knob and made the rk target slightly smaller than the soft limit, to avoid inaccuracies in ratekeeper control causing behavior changes on the storage servers	2019-06-28 16:54:22 -07:00
Evan Tschannen	92b32855ca	ratekeeper’s control algorithm would oscillate when limited by local ratekeeper	2019-06-28 16:54:22 -07:00
A.J. Beamon	35b6277a50	Fix knob copy paste error	2019-06-27 12:55:39 -07:00
Alex Miller	61901effed	Increase how long FDB will wait before starting DD to repair data loss. 10s is a bit short for starting data distribution, which is rather expensive. 60s is a bit more reasonable.	2019-06-19 13:40:21 -07:00
Evan Tschannen	20e3edeb0a	Merge branch 'release-6.1' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/storageserver.actor.cpp # versions.target	2019-06-14 12:42:59 -07:00
Evan Tschannen	924f92e5aa	Prevent the byte sample recovery from interfering with storage server recovery	2019-06-13 15:55:25 -07:00
Evan Tschannen	054d775343	increase the delay between idle commits to reduce the rate idle clusters fsync	2019-06-13 14:55:37 -07:00
Trevor Clinkenbeard	8144882d7b	Merge branch 'apple-master' into features/local-rk	2019-06-10 19:40:25 -07:00
Meng Xu	022b555b69	FastRestore:Fix bug in finish restore RestoreMaster may not receive all acks for the last command, i.e., finishRestore, because RestoreLoaders and RestoreAppliers exit immediately after sending the ack. If the ack is lost, it will not be resent. This commit also removes some unneeded code. This commit passes 50k random tests without errors.	2019-06-05 20:07:18 -07:00
Meng Xu	477fd152c0	FastRestore:Refactor code 1) Use the runRYWTransaction for simple DB access 2) Replace some printf with TraceEvent 3) Remove printf not used in debugging 4) Avoid wait inside the condition in loop-choose-when for the core routine of restore worker, loader and applier. 5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since the file only has functionalities related to restore worker. Passed correctness test	2019-06-04 11:22:47 -07:00
Evan Tschannen	29b96414e2	Merge branch 'release-6.1' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/NativeAPI.actor.cpp # fdbserver/Coordination.actor.cpp # flow/Arena.h # versions.target	2019-06-03 18:49:35 -07:00
Evan Tschannen	7c333dbc16	If a process receives a message in its clusterControllerInterface before becoming the cluster controller, if the process does not become the cluster controller in the next minute it should destroy the interface to prevent a memory leak.	2019-05-29 16:57:13 -07:00
sramamoorthy	31b6c86650	ignorePopDeadline to have high limit in simulator - ignorePopDeadline to have higher limit in simulator to accommodate for the buggify delays and make snapshot succeed. - introduce a new knob for auto resetting the disabling of tlog pop	2019-05-28 22:07:46 -07:00
A.J. Beamon	603721e125	Merge branch 'master' into thread-safe-random-number-generation # Conflicts: # fdbclient/ManagementAPI.actor.cpp # fdbrpc/AsyncFileCached.actor.h # fdbrpc/genericactors.actor.cpp # fdbrpc/sim2.actor.cpp # fdbserver/DiskQueue.actor.cpp # fdbserver/workloads/BulkSetup.actor.h # flow/ActorCollection.actor.cpp # flow/Net2.actor.cpp # flow/Trace.cpp # flow/flow.cpp	2019-05-23 08:35:47 -07:00
Evan Tschannen	8c3516951a	Merge branch 'release-6.1' # Conflicts: # documentation/sphinx/source/release-notes.rst # versions.target	2019-05-12 20:13:49 -07:00
Alex Miller	ea12a54946	Rename DISK_QUEUE_MAX_TRUNCATE_EXTENTS -> ..._BYTES So as to not make filesystem assumptions. This knob did technically appear in (only the) 6.1.5 release, but this feature was broken in 6.1.5, so thus impossible to use anyway.	2019-05-10 18:26:22 -10:00
A.J. Beamon	5f55f3f613	Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.	2019-05-10 14:01:52 -07:00
Evan Tschannen	22499666d0	Merge branch 'release-6.1' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbserver/LogRouter.actor.cpp # flow/Trace.cpp # versions.target	2019-05-08 18:19:35 -07:00
Alex Miller	0685e6c1c7	Avoid large truncates in the DiskQueue. And instead create a new file while incrementally truncating the old one down. This avoids queueing up a massive number of filesystem metadata operations in one call, thus flooding the disk with requests and stalling out all other filesystem operations. This sets the knobs so that a truncate of >10GB causes us to create a new file rather than trying to truncate the old one.	2019-05-08 12:33:31 -10:00
Alex Miller	4052f3826a	Add a knob to limit the number of commits indexed per key. Theoretically, we could spill 20MB of 22B mutations for one key, which would generate a very long value being stored in SQLite, and very inefficiently read back. This stops that from being a problem, at the cost of some extra write calls.	2019-05-03 15:27:10 -07:00
Alex Miller	f4e48c3851	Add a knob to limit amount of data read from sqlite for one PeekRequest. This prevents peeking from degrading over time if there are a very large number of SpilledData entries for one particular tag.	2019-05-02 17:26:45 -07:00
Evan Tschannen	2d5043c665	Merge branch 'release-6.1' # Conflicts: # documentation/sphinx/source/release-notes.rst # versions.target	2019-04-30 18:27:04 -07:00
Evan Tschannen	1a4c1759a4	Merge pull request #1429 from jzhou77/pprof Dump heap profiler when memory usage is high	2019-04-29 16:31:44 -07:00
Evan Tschannen	cacd82758e	Reduced data distribution speeds	2019-04-26 13:54:49 -07:00
Evan Tschannen	9ff8aca1da	Increased the SQLITE_CHUNK_SIZE to 100MB (left at 4MB for simulation)	2019-04-26 13:53:56 -07:00
A.J. Beamon	253d2400ef	Merge branch 'release-6.1' into speed-up-and-parameterize-spring-cleaning # Conflicts: # documentation/sphinx/source/release-notes.rst	2019-04-23 14:38:52 -07:00
A.J. Beamon	4ad0496b39	Increase the frequency that lazy deletes are run. Add more parameters for better control over the spring cleaning process.	2019-04-23 14:01:51 -07:00
Stephen Atherton	83db547306	Implemented the chunk size and db size hint fileControl options in our SQLite VFS implementation. KeyValueStoreSQLite now sets file chunk size based on a new knob, SQLITE_CHUNK_SIZE_PAGES.	2019-04-23 04:50:58 -07:00
Jingyu Zhou	6870e132b2	Merge branch 'master' into pprof	2019-04-19 14:06:44 -07:00
Andrew Noyes	d1e86779a6	Address review comments	2019-04-18 08:48:27 -07:00
mpilman	32393ec4c9	Prototype of local ratekeeper	2019-04-08 11:04:44 -07:00
Evan Tschannen	05869a8383	do not log a degraded reset message if the previous reset was more than a week ago	2019-04-07 23:00:58 -07:00
Jingyu Zhou	4b08042a88	Change memory profiling threshold to a flag	2019-04-05 16:33:51 -07:00
Jingyu Zhou	09b2c35d11	Dump heap profiler when memory usage is high Set the threshold of dump to 2GB.	2019-04-05 16:12:23 -07:00
Evan Tschannen	390ab9cfed	A process will mark itself as degraded if it continually disconnects from a different process which the failure monitor thinks is healthy	2019-04-04 14:11:12 -07:00
A.J. Beamon	71e2fdafb8	Changes to ratekeeper camel case	2019-03-27 08:24:25 -07:00
Evan Tschannen	6254a1a8e4	fix: restarting the provisional proxy causes all tlog peeks to restart, so if tlog peeks take longer than 1 second this could end in an infinite loop	2019-03-22 18:37:39 -07:00
A.J. Beamon	2d7b48dadc	Merge pull request #1311 from etschannen/feature-increase-grv-batch Increased the GRV client batch size	2019-03-19 08:23:05 -07:00
Evan Tschannen	2554fed965	reduce max transaction to start	2019-03-18 16:16:03 -07:00
Evan Tschannen	87e2a1a029	The proxy budget is implemented to let one request over its limit through, and then pay back what was over the limit in the next update	2019-03-18 16:09:57 -07:00
Alex Miller	29ab7370cd	Clear versionLocation when spilling, and pop DQ separately. Popping the disk queue now requires potentially recovering the location to which we can pop from the spilled data itself, and for each tag we must maintain the first location with relevant data. The previous queue we had to represent the ordering, queueOrder, was used by spilling, and popped when a TLog had been spilled. This means that as soon as a TLog has been fully spilled, we have no idea how it relates in order to other fully spilled TLogs. Instead, use queueOrder to keep track of all the TLog UIDs until they're removed, and use spillOrder to keep track of the order only for spilling.	2019-03-18 15:09:22 -07:00
Evan Tschannen	ec6c843124	increased the GRV client batch size, similarly increased the proxy limits related to the number of transactions started in a batch	2019-03-16 16:18:58 -07:00
Evan Tschannen	e068c478b5	merge master	2019-03-12 18:31:25 -07:00
Evan Tschannen	c6e94293bf	reset a process to not be degraded after 2 days	2019-03-10 22:39:21 -07:00
Evan Tschannen	53f16b5347	when a tlog queue commit takes longer than 5 seconds, its process is marked as degraded	2019-03-08 11:46:34 -05:00
Jingyu Zhou	3c86643822	Separate Ratekeeper from data distribution. Add a new role for ratekeeper. Remove StorageServerChanges from data distribution. Ratekeeper monitors storage servers, which borrows the idea from DataDistribution.	2019-03-07 13:16:20 -08:00
Alex Miller	94bf75cb00	Allow the disk queue to shrink if it has unneeded slack space.	2019-03-04 01:42:38 -08:00
Alex Miller	9ef283d4e7	Implement hard limiting of memory used to serve peek requests.	2019-03-04 01:42:38 -08:00
Alex Miller	e7d8520c63	Batch more when spilling data.	2019-03-04 01:42:38 -08:00
Trevor Clinkenbeard	39f612d132	Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics	2019-03-02 17:07:00 -08:00
A.J. Beamon	a051055caf	Initial implementation of adding separate limits for batch priority in ratekeeper	2019-02-27 10:31:56 -08:00
Trevor Clinkenbeard	abfe057805	Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics	2019-02-25 13:47:16 -08:00
Evan Tschannen	b8910ba7cd	Merge branch 'master' into feature-fix-force-recovery # Conflicts: # fdbclient/ManagementAPI.actor.h # fdbserver/DataDistribution.actor.cpp # fdbserver/storageserver.actor.cpp # fdbserver/workloads/KillRegion.actor.cpp	2019-02-22 14:38:13 -08:00
Meng Xu	9445ac0b0c	Status: Use new data distributor worker to publish status After we add a new data distributor role, we publish the data related to data distributor and rate keeper through the new role (and new worker). So the status needs to contact the data distributor, instead of master, to get the status information.	2019-02-21 18:05:50 -08:00
Meng Xu	7cca439e00	TeamRemover: Add status to show redundant team removing Distinguish the removal of unhealthy team and redundant team. Change status report to include redundant team removal report.	2019-02-21 14:16:46 -08:00
Trevor Clinkenbeard	fa96b8dd33	Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics	2019-02-20 16:56:16 -08:00
Meng Xu	d86ba0e811	TeamRemover: Change it to run periodically This simplifies the problem of when we should invoke the teamRemover	2019-02-20 16:08:34 -08:00
Evan Tschannen	27e3617548	fix: remove bad teams needed to use dd_stall_check delay, because in simulation the buggified delay time could make us remove bad teams before they submit their ranges to the queue	2019-02-20 14:18:36 -08:00
Evan Tschannen	d4737fac0f	knobify force recovery recovery check delay	2019-02-19 16:05:20 -08:00
Evan Tschannen	065a45e05f	Merge branch 'master' into feature-fix-force-recovery # Conflicts: # fdbclient/ManagementAPI.actor.cpp # fdbserver/ClusterController.actor.cpp # fdbserver/workloads/KillRegion.actor.cpp	2019-02-18 17:09:06 -08:00
Evan Tschannen	d492395f84	fix: simulation could buggify a delay such that data distribution incorrectly thinks the queue is not processing unhealthy relocations	2019-02-18 14:57:07 -08:00
Meng Xu	6d09ac483c	Merge with master	2019-02-15 17:03:40 -08:00
Jingyu Zhou	fc3a784963	Fix another build team bug The buildTeam() can create teams with undesired storage servers, which are considered unhealthy. As a result, the data movement can become stuck. Fix this by adding an ACTOR monitorHealthyTeams that builds teams every second whenever there are no healthy teams. Clean up storageServerTracker() interface.	2019-02-14 16:37:16 -08:00
Jingyu Zhou	816f8b1ae1	Per review comments Add a knob for starting distributor delay. Move distributor failed variable to a local loop.	2019-02-14 16:37:16 -08:00
Jingyu Zhou	e0a7162cf8	Add a failure timeout knob for data distributor. Set default time to 1.0s.	2019-02-14 16:37:16 -08:00
Meng Xu	5481851e82	TeamCollection: Add knobs for team remover Added three knobs to control team remover: bool TR_FLAG_DISABLE_TEAM_REMOVER: Disable the teamRemover actor; double TR_REMOVE_MACHINE_TEAM_DELAY: Wait for the specified time before trying to remove the next machine team; double TR_WAIT_FOR_ALL_MACHINES_HEALTHY_DELAY: Wait before checking if all machines are healthy	2019-02-13 15:11:56 -08:00
Meng Xu	214a72fba3	TeamCollection: Resolve review comments 1) Reduce the frequency of checking if we need to call teamRemover 2) Improve code efficiency in finding the machine team to remove 3) Remove unused code 4) Add sanity check	2019-02-12 10:59:57 -08:00
Meng Xu	3b8ae0fe95	TeamCollection: Add into 6.1 release note	2019-02-08 13:50:27 -08:00
Meng Xu	7cfe6de27e	TeamCollection: Server team number must match machine team number DESIRED_TEAMS_PER_MACHINE must be equal to DESIRED_TEAMS_PER_SERVER. Otherwise, we may have too few machine teams to create enough server teams. Note that the BUGGIFY macro value is based on a random number generator. When you have two BUGGIFY, one may be true and the other false. Also fix a bug in getting the number of healthy machine teams.	2019-02-07 13:53:55 -08:00

435 Commits