Commit Graph

3612 Commits

Author SHA1 Message Date
Evan Tschannen a1f8b546e6 fix: ensure connections to blob store are evenly distributed across network addresses
added a per address limit to the number of open connections
lowered a variety of knobs to prevent us from using too much memory
2017-09-29 14:59:24 -07:00
Evan Tschannen bff8bcc8a9 added help messages for setting a memory limit 2017-09-29 12:38:42 -07:00
Evan Tschannen e2b65e86ed added configurable memory limits for backup and dr executables
added a default memory limit of 8GB for fdbcli
2017-09-29 10:35:40 -07:00
Bhaskar Muppana 91975244fe Fixing OSX build. 2017-09-28 19:35:44 -07:00
Bhaskar Muppana 942c04e992 Merge pull request #162 from bmuppana/master
Fixing TimeKeeperCorrectness to deal with network delays.
2017-09-28 17:04:39 -07:00
Bhaskar Muppana 3d2bafc3a6 Fixing TimeKeeperCorrectness to deal with network delays. 2017-09-28 16:52:28 -07:00
Evan Tschannen ef41b07bb3 renamed past_version to transaction_too_old
implemented read_lock_aware option
2017-09-28 16:35:08 -07:00
Yichi Chiang d4f75630de Support log group field in status json 2017-09-28 16:31:29 -07:00
Evan Tschannen 7b60e26660 Merge pull request #160 from cie/use-error-descriptions
Add the ability to access name and description in Error. Update error…
2017-09-28 16:00:39 -07:00
A.J. Beamon 4f97bd44a5 If we fail to get the interface name due to a platform error, don't kill the process. Instead, just leave the network counters alone. Change the GetInterfaceAddrs trace event to SevWarnAlways. 2017-09-28 13:32:39 -07:00
Evan Tschannen 5f4b997400 emergency teams are bad for performance, because we will route client read requests to servers that do not have the data, therefore getting many wrong shard server errors. emergency teams only protect us from data loss in very rare scenarios, we may want to add them in again in the future, but make sure load balance knows which storage servers used to be destinations so they can only route to them as a last resort. 2017-09-28 13:20:01 -07:00
Evan Tschannen 73fca75239 added the ability to disable timeKeeper; disabled timeKeeper before consistency check in simulation 2017-09-28 13:13:24 -07:00
A.J. Beamon 67d0eb5d66 Change a few more error descriptions; update sphinx error code documentation 2017-09-28 13:03:17 -07:00
A.J. Beamon d30c730f75 Add the ability to access name and description in Error. Update error descriptions. 2017-09-28 12:35:03 -07:00
Alec Grieser bd6dabacdb added versionstamp type to python tuple layer and updated bindingtester to test it 2017-09-28 12:03:40 -07:00
Bhaskar Muppana 0f8ff26029 Merge pull request #158 from bmuppana/master
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:42 -07:00
Bhaskar Muppana 6a0b1d6808 Fixing PR comments
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:01 -07:00
A.J. Beamon 91281ec754 Don't use SetByteArrayRegion for get range results. 2017-09-27 13:41:06 -07:00
Alec Grieser 80f559d148 changed name from thread_completion_hook to network_thread_completion_hook 2017-09-27 11:30:39 -07:00
Evan Tschannen 4b21da1cd6 fix: lastVersionWithData was not updated when fetchKeys injects mutations 2017-09-27 10:44:34 -07:00
Alec Grieser 18edc56559 removed unused local variable 2017-09-27 09:32:31 -07:00
Stephen Atherton c3e4364f08 Added checksums to pager. 2017-09-26 23:13:22 -07:00
Stephen Atherton cfb0cc4c3b Merge branch 'release-5.0' 2017-09-26 22:13:23 -07:00
Stephen Atherton 333fb65a91 FDBMonitor now supports “flag_X=<true/false>” in all sections, with the usual inheritance, in order to enable (true) or disable (false) the passing of a parameterless command line named X as “—X”. 2017-09-26 22:13:01 -07:00
Alec Grieser d7e1b267be changed name from shutdown hook to thread completion hook ; added hook parameter 2017-09-26 17:00:04 -07:00
Alec Grieser a5f1c3b15b Merge remote-tracking branch 'origin/master' into 33300740-with-shutdown-hooks 2017-09-26 11:28:40 -07:00
Alvin Moore 298b54104e Merge branch 'release-5.0' 2017-09-26 11:16:14 -07:00
Alvin Moore 02525d7b14 Added TESTs to ensure that all of the different kills are performed during simulation 2017-09-26 11:15:39 -07:00
Yichi Chiang 4ce60c4276 Merge pull request #159 from cie/add-locality-to-backup
Add locality to backup agent and DR agent
2017-09-26 10:20:32 -07:00
Yichi Chiang 5e9c6d6b64 Add locality to backup agent and DR agent 2017-09-26 10:19:26 -07:00
Evan Tschannen acb7e66d01 fix: failed logs do not count even if they have returned a result 2017-09-25 18:14:40 -07:00
Evan Tschannen 2bf042a559 fix: file_corrupt was not checking for fault injection
latency threshold was too long
2017-09-25 17:22:41 -07:00
A.J. Beamon e5e7f8a081 When using setKey() on Standalone<KeySelectorRef> in RYW, make sure that the key is part of the key selector's arena. 2017-09-25 15:52:45 -07:00
Bhaskar Muppana 0bf5bdb23a <rdar://problem/34557380> Need a way to map real time to version 2017-09-25 12:51:37 -07:00
Yichi Chiang 6758c649fc Catch and update processClass change from DBSource 2017-09-25 10:36:03 -07:00
Stephen Atherton ed226e2e1c VersionedBtree now allows writes while a commit is in progress. Also, lots of bug fixes, like the one where writes during commits weren’t prevented but also didn’t work. 2017-09-22 17:18:28 -07:00
Evan Tschannen cce4eeb52d fix: the master was sending the cluster controller uninitialized configurations 2017-09-22 16:59:24 -07:00
Evan Tschannen 180438d41e fix: use the number of present logServers rather than the total size of the vector 2017-09-22 16:19:16 -07:00
Evan Tschannen 7081136f74 added a fix 2017-09-22 15:08:14 -07:00
Evan Tschannen 738ae21c3a fix: an optimization in buggified locking can cause recovery to break because it would not restart if a locked process was killed when the remaining logs cannot obtain a quorum 2017-09-22 15:07:57 -07:00
Evan Tschannen fba78ce4ef refactored monitor leader again to be even safer.
fixed a problem where we would write the header to clusters files twice
added extra logging in monitor leader
2017-09-22 15:06:11 -07:00
Stephen Atherton 248dab79b6 Created “redwood” storage engine option and many changes to support that including IKeyValueStore::init() and custom DiskQueue file extensions. 2017-09-21 23:51:55 -07:00
Alex Miller 585c9bf68f Quick fix to reduce CPU usage of ensureEpochLive.
It is suspected that policy recomputations are driving proxy CPU usage up, and
thus latency and throughput down.  To quickly confirm this theory, we're
forcing ensureEpochLive to wait until it has RF responses, which means we'll
probably only validate the policy once per call.
2017-09-21 18:22:24 -07:00
Evan Tschannen 4809bd8f62 fix: We cannot inject faults after renaming the file, because we could end up with two asyncFileNonDurable open for the same file 2017-09-21 18:11:18 -07:00
A.J. Beamon 995587b12b Merge branch 'release-5.0' 2017-09-21 13:32:12 -07:00
Stephen Atherton d880569d52 Checkpointing progress on KeyValueStoreMVBTree. All methods are implemented to a usable point, and everything compiles, but Worker does not yet try to use it. 2017-09-21 04:43:49 -07:00
Stephen Atherton e16ce63db0 Bug fix, find was being done with a key ref that wasn’t living as long as the actor. 2017-09-21 00:58:56 -07:00
Stephen Atherton 1ca9814879 Bug (arguable, perhaps) fix in AsyncFileCached. Order was not being enforced between writes and truncates such that calling and waiting on a truncate to X and then writing to X + 1 could end up writing first and then truncating the written page off of the file. 2017-09-20 17:58:56 -07:00
Stephen Atherton bcc8a2cc82 Bug fix, page reference was not living as long as the file write call so the memory could be freed before the write happens if there is any delay. 2017-09-20 17:51:26 -07:00
Stephen Atherton eee8116fa4 Cleaned up how the root page is written and page write debug output. 2017-09-20 17:50:02 -07:00