foundationdb

Commit Graph

Author	SHA1	Message	Date
sfc-gh-tclinkenbeard	6235d087a6	Prevent shardTracker or trackShardBytes from accidentally unsafely accessing DataDistributionTracker	2020-11-16 12:46:21 -08:00
sfc-gh-tclinkenbeard	ca8ea3b6ff	Fix memory issues caused by cancelling data distribution tracker	2020-11-15 23:52:36 -08:00
Meng Xu	222da17558	Merge branch 'release-6.2' into mengxu/ha-code-read	2020-11-12 13:39:27 -08:00
Meng Xu	c2dd7d1d38	Remove unresolved questions	2020-11-11 22:39:11 -08:00
Andrew Noyes	f467524e06	Don't dereference self on broken_promise	2020-11-11 00:24:23 +00:00
Meng Xu	063700e4d6	Add comments and questions to HA and tLog code reading The comments' correctness need to be confirmed by reviewers.	2020-10-30 12:14:57 -07:00
sfc-gh-tclinkenbeard	91a8367acb	Avoid slow task in ~DataDistributionTracker	2020-10-01 11:44:55 -07:00
Evan Tschannen	7adc916e18	Merge pull request #2806 from ajbeamon/improve-team-request-performance Improve performance of get team requests.	2020-03-16 11:56:45 -07:00
Evan Tschannen	12f2b32770	added additional logging in data distribution	2020-03-13 15:19:33 -07:00
A.J. Beamon	555db50cd1	Avoid calling into SABTF so frequently. Use a cheaper call that only checks that shards exist.	2020-03-12 11:22:03 -07:00
Evan Tschannen	fd5705a451	fixed capitalization	2020-01-15 09:35:57 -08:00
Evan Tschannen	c93ca04ea6	Do not merge more than 100 shards together to avoid creating untrackable shards	2020-01-15 09:33:27 -08:00
Evan Tschannen	b331c5dafe	wantsToMerge was created before the shardEvaluator has a chance to update it based on shardSize changes	2020-01-10 17:23:56 -08:00
Evan Tschannen	fde53cbeef	HasBeenTrueFor was ready immediately after a previous shard merge	2020-01-10 16:28:56 -08:00
Evan Tschannen	9b80498180	Added a trace event to warn if a shard is merged before enough time has elapses from becoming low bandwidth	2020-01-10 14:58:38 -08:00
Evan Tschannen	e4fa4ad0c9	Data distribution will not merge a shard unless it has been low bandwidth for 5 minutes	2020-01-09 17:02:49 -08:00
Evan Tschannen	8f0348d5e0	fix: merges which cross over systemKeys.begin did not properly decrement the systemSizeEstimate	2019-10-31 16:38:33 -07:00
Evan Tschannen	86bcb84b45	Raised the data distribution priority of splitting shards above restoring fault tolerance to avoid hot write shards	2019-10-11 17:50:43 -07:00
A.J. Beamon	909855bcec	Fix: the keys argument to changeSizes was passed as a reference, but when used after the first wait(), it may no longer be valid.	2019-10-09 14:07:48 -07:00
Evan Tschannen	045175bd0e	added tracking for the size of the system keyspace	2019-09-27 22:39:19 -07:00
Evan Tschannen	3bb62e008c	lowered the priority of some delays in data distribution so that the process will prefer other work	2019-09-27 18:33:13 -07:00
Meng Xu	b7478f5dd3	DD:Add comments to help understand code Add comments to explain the functionalities of some code.	2019-07-22 11:23:16 -07:00
Alex Miller	7a500cd37f	A giant translation of TaskFooPriority -> TaskPriority::Foo This is so that APIs that take priorities don't take ints, which are common and easy to accidentally pass the wrong thing.	2019-06-25 02:47:35 -07:00
A.J. Beamon	5f55f3f613	Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.	2019-05-10 14:01:52 -07:00
mpilman	d01cbf3455	Addressed code review comments	2019-04-05 13:12:20 -07:00
mpilman	1c16f87a4e	Remove trace-calls to printable (in non-workloads)	2019-04-05 13:12:19 -07:00
anoyes	981426bac9	More ide fixes	2019-03-05 18:03:57 -08:00
Jingyu Zhou	c38b2a8c38	Change masterId to distributorId in tracker. This reflects the change of moving data distribution out of master server.	2019-02-14 16:37:16 -08:00
Evan Tschannen	4e54690005	Merge branch 'release-6.0' # Conflicts: # fdbserver/DataDistribution.actor.cpp # fdbserver/MoveKeys.actor.cpp	2018-11-12 20:26:58 -08:00
Evan Tschannen	cd188a351e	fix: if a destination team became unhealthy and then healthy again, it would lower the priority of a move even though the source servers we are moving from are still unhealthy fix: badTeams were not accounted for when checking priorities	2018-11-11 12:33:31 -08:00
Evan Tschannen	4b5d0b4e2c	Merge branch 'release-6.0' # Conflicts: # documentation/sphinx/source/release-notes.rst # fdbclient/AsyncFileBlobStore.actor.cpp # fdbclient/AsyncFileBlobStore.actor.h # fdbclient/BlobStore.actor.cpp # fdbclient/BlobStore.h # fdbclient/HTTP.actor.cpp # fdbclient/ManagementAPI.actor.cpp # fdbclient/NativeAPI.actor.cpp # fdbrpc/LoadBalance.actor.h # fdbrpc/batcher.actor.h # fdbrpc/fdbrpc.vcxproj # fdbrpc/sim2.actor.cpp # fdbserver/DataDistribution.actor.cpp # fdbserver/DataDistributionTracker.actor.cpp # fdbserver/SimulatedCluster.actor.cpp # fdbserver/TLogServer.actor.cpp # fdbserver/masterserver.actor.cpp	2018-11-10 13:04:24 -08:00
Evan Tschannen	e68c07ae35	fix: trackShardBytes was called with the incorrect range, resulting in incorrect shard sizes reduced the size of shard tracker actors by removing unnecessary state variable. Because we have a large number of these actors these extra state variables add up to a lot of memory	2018-11-02 13:03:01 -07:00
Robert Escriva	268093a96d	Adjust all includes to be relative to the root. Remove the use of relative paths. A header at foo/bar.h could be included by files under foo/ with "bar.h", but would be included everywhere else as "foo/bar.h". Adjust so that every include references such a header with the latter form. Signed-off-by: Robert Escriva <rescriva@dropbox.com>	2018-10-19 17:35:33 +00:00
A.J. Beamon	2a97139d5d	This is the first step in eliminating the usage of database names in our code. The C API remains the same, but underneath that all usage of database names is eliminated.	2018-08-16 10:24:12 -07:00
Alex Miller	fb31a6999f	Rewrite all files to have #include actorcompiler.h as the last include.	2018-08-14 15:50:26 -07:00
Alex Miller	535b5701e5	Rewrite all `Void _ = wait(...)` -> `wait(...)`. This takes advantage of the new actorcompiler functionality to avoid having duplicate definitions of `Void _` when trying to feed the un-actorompiled source through clang.	2018-08-14 15:50:26 -07:00
Evan Tschannen	6f02ea843a	prevented a slow task when too many shards were sent to the data distribution queue after switching to a fearless deployment	2018-08-09 12:37:46 -07:00
Evan Tschannen	1c29275672	call all methods which could disable a trace event before it is initialized. In practice this means calling .error first, then .suppressFor, then all your details.	2018-08-01 14:30:57 -07:00
Evan Tschannen	392c73affb	fixed a few slow tasks	2018-07-12 14:06:59 -07:00
A.J. Beamon	e5488419cc	Attempt to normalize trace events: * Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check. * Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase. * Use seconds instead of milliseconds in details. Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed. This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.	2018-06-08 11:11:08 -07:00
Evan Tschannen	fa7eaea7cf	fix: shards affected by team failure did not properly handle separate teams for the remote and primary data centers	2018-03-08 10:50:05 -08:00
Evan Tschannen	37a6a81634	Merge commit '7f6fc3e039c911cd84b8540f7f799fc38a1c1822' into feature-remote-logs # Conflicts: # fdbserver/workloads/RestartRecovery.actor.cpp	2018-02-23 12:33:28 -08:00
Alec Grieser	0bae9880f1	remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py	2018-02-21 10:25:11 -08:00
Evan Tschannen	ebd94bb654	removed a separately configurable storage team size for the remote data center, because it did not make sense fix: the master did not monitor for the failure of remote logs stop merge attempts when a data center is failed fixed a variety of other problems with data distribution when a data center is failed	2018-02-02 11:46:04 -08:00
Evan Tschannen	c3918d892a	do not use bandwidth splitting on the keyServer shard, lots of sets and clears to this shard generally means you do not want to create additional data distribution work	2017-11-30 18:28:16 -08:00
Evan Tschannen	aa0c2ae317	only increase the max shard size if the shard begins in the keyServer keyspace, do not increase the minimum shard size	2017-10-27 14:22:26 -07:00
Evan Tschannen	3a4078bdda	the keyservers shards are always a fixed large size	2017-10-27 11:52:11 -07:00
Yichi Chiang	53e1ae9f60	shard system keyspace	2017-07-26 13:47:31 -07:00
FDB Dev Team	a674cb4ef4	Initial repository commit	2017-05-25 13:48:44 -07:00

49 Commits