Commit Graph

587 Commits

Author SHA1 Message Date
A.J. Beamon e5488419cc Attempt to normalize trace events:
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.

Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.

This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
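
The first two naming rules in commit e5488419cc can be sketched as a small check. This is only an illustration written from the commit message, not the check added to the codebase. The function names, the ASCII-only reading of "uppercase character", and the message wording are all assumptions.

```python
import re
import sys

# Detail names: start with an uppercase character and contain no underscores.
# Assumption: "uppercase" is read as ASCII A-Z.
DETAIL_NAME = re.compile(r"[A-Z][^_]*")

# Type names: same rule, but one underscore is allowed (Context_Type),
# and the first character after the underscore must also be uppercase.
TYPE_NAME = re.compile(r"[A-Z][^_]*(_[A-Z][^_]*)?")


def is_valid_detail_name(name):
    return DETAIL_NAME.fullmatch(name) is not None


def is_valid_type_name(name):
    return TYPE_NAME.fullmatch(name) is not None


def check_event(event_type, detail_names, out=sys.stderr):
    """Report naming-rule violations to `out` and return them as a list.

    Hypothetical helper: the real check runs when events are logged in simulation.
    """
    problems = []
    if not is_valid_type_name(event_type):
        problems.append(f"Invalid trace event type name: {event_type}")
    for name in detail_names:
        if not is_valid_detail_name(name):
            problems.append(f"Invalid detail name in {event_type}: {name}")
    for problem in problems:
        print(problem, file=out)
    return problems
```

For example, `check_event("StderrSeverity", ["newSeverity"])` reports the lowercase detail name, which is the kind of issue fixed in c9543791fd.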
A.J. Beamon f1d389448c
Merge pull request #453 from apple/release-5.2
Merge release-5.2 into master
2018-06-08 10:41:44 -07:00
A.J. Beamon 6461478695
Merge pull request #452 from apple/release-5.1
Merge release-5.1 into release-5.2
2018-06-08 10:41:13 -07:00
Evan Tschannen 953c27e570
Merge pull request #431 from ajbeamon/tlog-rename-variables
Rename several variables in TLogServer.actor.cpp to follow our normal camel case conventions.
2018-06-08 10:30:22 -07:00
A.J. Beamon c9543791fd Fix case of newSeverity detail in StderrSeverity trace event 2018-06-08 10:24:12 -07:00
Evan Tschannen ce6a2f0563
Merge pull request #425 from bnamasivayam/leader-election-optimize
Optimize client and server connection times to cluster controller, es…
2018-06-01 18:35:27 -07:00
Balachandar Namasivayam 59bfa74197 Address review comments. Refactor getLeader function to mask the first 7 bits of changeID and return the masked LeaderInfo. 2018-06-01 18:23:24 -07:00
Balachandar Namasivayam 529d0497f1 Proxy going OOM when applying high volumes of writes to a proxy, particularly in a sudden fashion before ratekeeper can control the workload.
Address this issue by proactively monitoring the memory used by commit batches and dropping requests if a certain memory limit is exceeded.
2018-06-01 15:21:40 -07:00
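
The approach described in 529d0497f1 (track the memory held by commit batches and drop requests past a limit) can be sketched as a simple admission guard. This is a minimal sketch of the general technique. The class name, method names, and byte accounting are hypothetical and are not taken from the proxy code.

```python
class CommitBatchMemoryGuard:
    """Tracks bytes held by in-flight commit batches (hypothetical helper)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.in_use_bytes = 0

    def try_admit(self, request_bytes):
        """Admit the request if it fits under the limit; otherwise drop it."""
        if self.in_use_bytes + request_bytes > self.limit_bytes:
            return False  # caller would reject the request
        self.in_use_bytes += request_bytes
        return True

    def release(self, request_bytes):
        """Return memory to the budget once a batch has been processed."""
        self.in_use_bytes = max(0, self.in_use_bytes - request_bytes)
```

The point of checking at admission time is that the limit takes effect immediately, without waiting for a slower feedback loop such as ratekeeper to react.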
A.J. Beamon 1f0b519a73 Rename several variables in TLogServer.actor.cpp to follow our normal camel case conventions. I didn't rename every variable here, because some appear to be data structures (like a map) following the pattern keydesc_valuedesc, and I wasn't sure that the straightforward keydescValuedesc rename made sense. I did rename a couple of instances of these where it seemed reasonable, though. 2018-06-01 10:18:07 -07:00
Balachandar Namasivayam 9f55ccd4a5 Remove extraneous comments. 2018-05-31 15:32:47 -07:00
Balachandar Namasivayam 070366ca70 Optimize client and server connection times to cluster controller, especially in multi DC configurations.
A majority (quorum) answer from coordinators was previously required to connect to the cluster controller.
Now a cluster controller is optimistically selected to connect even if there is no quorum.
2018-05-30 16:48:04 -07:00
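
The behavior change in 070366ca70 can be illustrated by comparing the two selection policies. This is a sketch under stated assumptions: the commit message does not say how the optimistic choice is made, so picking the most frequently nominated candidate is an assumption, and the function names and reply format are hypothetical.

```python
from collections import Counter


def pick_with_quorum(replies, num_coordinators):
    """Old behavior: return a nominee only if a majority of coordinators agree.

    `replies` holds one entry per coordinator: a nominee name, or None
    if that coordinator has not answered yet.
    """
    counts = Counter(r for r in replies if r is not None)
    if not counts:
        return None
    nominee, votes = counts.most_common(1)[0]
    return nominee if votes > num_coordinators // 2 else None


def pick_optimistically(replies):
    """New behavior (assumed policy): return the most-nominated candidate
    seen so far, even without a quorum."""
    counts = Counter(r for r in replies if r is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

With five coordinators and replies `["cc-a", "cc-a", None, "cc-b", None]`, the quorum policy returns `None` while the optimistic policy returns `"cc-a"`. That lets a client start connecting before slow or remote coordinators have answered.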
A.J. Beamon d9c702a9e3 Merge release-5.1 into release-5.2 2018-05-30 09:09:55 -07:00
A.J. Beamon 026458baf3 Merge release-5.2 into master 2018-05-23 15:32:56 -07:00
A.J. Beamon e538fb4065 Add error description to error output when networking could not be initialized. 2018-05-23 15:05:28 -07:00
Alec Grieser 40babc40e1
remove one unnecessary line ; fix else formatting 2018-05-15 17:20:44 -07:00
Alec Grieser 6d132717f2
add versionstamp compatibility test to VersionStampWorkload
surfaces error found in #387
2018-05-15 17:09:24 -07:00
Dennis Schafroth a9f54e1865 Compile on macOS 10.13.4: Use ASSERT_ABORT in destructors. Import fstream 2018-05-15 12:55:02 -07:00
Evan Tschannen 91338fc984 Merge branch 'master' into feature-remote-logs 2018-05-10 15:33:45 -07:00
Evan Tschannen 8f984cb2c9 Merge branch 'release-5.2'
# Conflicts:
#	fdbrpc/TLSConnection.h
2018-05-10 09:13:22 -07:00
Evan Tschannen d3450ce5b0
Merge pull request #343 from bnamasivayam/tls-plugin
Tls plugin
2018-05-09 16:35:53 -07:00
Evan Tschannen f6e55d0b74
Merge pull request #348 from etschannen/release-5.2
DR upgrade tests now test the durability of the data.
2018-05-09 15:40:03 -07:00
Evan Tschannen 8930c2e3db DR upgrade tests now test the durability of the data. 2018-05-09 15:11:05 -07:00
Balachandar Namasivayam 7591931a09 Revert "Make tls_verify_peers as a comma separated string of constraints."
This reverts commit 2033847e4b.
2018-05-09 14:40:36 -07:00
Balachandar Namasivayam 2033847e4b Make tls_verify_peers as a comma separated string of constraints. 2018-05-09 14:37:39 -07:00
Alec Grieser f3093642b3
Merge pull request #242 from alecgrieser/32437306-better-versionstamped-value
Unify SET_VERSIONSTAMPED_KEY and SET_VERSIONSTAMPED_VALUE API
2018-05-09 09:04:07 -07:00
Balachandar Namasivayam e8b7f4b190 Add password support for tls. 2018-05-08 20:46:31 -07:00
Balachandar Namasivayam 49af5d685b Restore previous behavior of not specifying peer_verify option means disable checking. 2018-05-08 18:54:44 -07:00
Balachandar Namasivayam d3b5cfb93c Support latest TLS plugin.
Add support for https in backup.
2018-05-08 16:28:13 -07:00
Evan Tschannen 9f0d244efe Merge branch 'master' into feature-remote-logs 2018-05-08 13:28:23 -07:00
Evan Tschannen 7acdc314e4 Merge branch 'release-5.2'
# Conflicts:
#	fdbrpc/TLSConnection.actor.cpp
2018-05-08 13:22:53 -07:00
Evan Tschannen 1f6c6a886b Merge branch 'release-5.1' into release-5.2 2018-05-08 13:08:11 -07:00
A.J. Beamon ca720e1540
Merge pull request #297 from apple/release-5.2
Merge 5.2 to Master
2018-05-08 12:04:20 -07:00
Alec Grieser 47c9e4f923
update bindings and bindingtester that uses versionstamps to use new protocol
issue #148
2018-05-08 08:57:09 -07:00
Alec Grieser 464e2cdbf0
change SetVersionstampedKey and SetVersionstampedValue behavior based on API version to make them consistent 2018-05-08 08:57:09 -07:00
Alec Grieser 14cca75429
server components of version of alternative versionstamp op that writes to an arbitrary place in the value 2018-05-08 08:57:08 -07:00
Evan Tschannen e8f6ad88f0 fix: tripled the smallStorageTarget to prevent simulations which do a lot of work from timing out 2018-05-07 17:26:44 -07:00
Alec Grieser 752deb07a1
fix fdbmonitor help message output ; fix spelling error Ratekeeper.actor.cpp 2018-05-07 16:19:50 -07:00
Evan Tschannen 4677789b38 fix: low latency tests need 4 machines per datacenter to support triple replication after 1 machine has failed 2018-05-07 11:28:25 -07:00
Evan Tschannen 529bd34cf9 fix: when a tlog is stopped by another recruitment it no longer has the opportunity for committingQueue to be set 2018-05-06 20:37:44 -07:00
Evan Tschannen 81c7bddaf8 fix: must check for log router errors while waiting on satellite replies because the recruitmentID will not be updated if it threw an error 2018-05-06 18:15:12 -07:00
Evan Tschannen 8cb8198250 fix: the e-brake should be buggified with ratekeeper storage limits to prevent simulation from running full blast into the e-brake resulting in simulation taking forever to complete (joshua timeouts) 2018-05-06 12:33:25 -07:00
Evan Tschannen cc6511a39e fix: we do not know that the minimum popped version on the log router is a known committed version until it has advanced. 2018-05-06 09:32:41 -07:00
Evan Tschannen b1935f1738 fix: do not allow a storage server to be removed within 5 million versions of it being added, because if a storage server is added and removed within the known committed version and recovery version, the storage server will need see either the add or remove when it peeks 2018-05-05 18:16:28 -07:00
Evan Tschannen 8371afb565 fix: log routers need to know if the log system is stopped to determine how they should peek the last log generation 2018-05-05 17:56:00 -07:00
Evan Tschannen 7ed64c821e fix: recruiting a cluster controller takes longer after restarting tests because we wait until files have recovered from disk before starting 2018-05-05 17:20:48 -07:00
Evan Tschannen e8ea02e054 fix: storage servers need to fail if they can no longer peek data 2018-05-05 17:19:59 -07:00
A.J. Beamon 432a295bc2 Add read bytes and read keys info to status. Collect this information directly from StorageMetrics rather than through ratekeeper. 2018-05-04 12:01:40 -07:00
Evan Tschannen 440e2ae609 fix: data distribution logic was incorrect for finding a complete source team in a failed DC 2018-05-01 23:08:31 -07:00
Evan Tschannen 87ad03ce53 locality aware load balancing was disabled on the storage servers because emergency teams might cause a server to be assigned a shard when it does not actually have the data. This problem has been fixed, so we can re-enable locality aware load balancing. 2018-05-01 22:45:22 -07:00
Evan Tschannen b4bd03e67e fix: we cannot set queueCommitEnd until we have popped the log system to prevent the popped version from going backwards 2018-05-01 22:20:25 -07:00