A.J. Beamon
e5488419cc
Attempt to normalize trace events:
...
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.
Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.
This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
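The two naming rules in the commit above can be illustrated with a small checker. This is only a sketch in Python, not the check that was added to the simulator; the function names (`is_valid_detail_name`, `is_valid_type_name`, `check_event`) and the message wording are hypothetical and chosen for illustration.

```python
import sys


def is_valid_detail_name(name: str) -> bool:
    """Detail names: first character uppercase, no underscores."""
    return bool(name) and name[0].isupper() and "_" not in name


def is_valid_type_name(name: str) -> bool:
    """Type names: same rules, but one underscore is allowed
    (Context_Type), and the character after it must be uppercase."""
    parts = name.split("_")
    if len(parts) > 2:
        return False  # more than one underscore
    # Every part must be non-empty and start with an uppercase character.
    return all(part and part[0].isupper() for part in parts)


def check_event(event_type: str, detail_names: list) -> list:
    """Hypothetical helper: report rule violations to stderr and return them."""
    problems = []
    if not is_valid_type_name(event_type):
        problems.append(f"Invalid trace event type name: {event_type}")
    for detail in detail_names:
        if not is_valid_detail_name(detail):
            problems.append(f"Invalid trace event detail name: {detail}")
    for problem in problems:
        print(problem, file=sys.stderr)
    return problems


if __name__ == "__main__":
    # The lowercase detail name fixed in commit c9543791fd would be flagged.
    check_event("StderrSeverity", ["newSeverity"])
```

Like the check described in the commit, this sketch does not try to verify camel case, only the first-character and underscore rules.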
A.J. Beamon
f1d389448c
Merge pull request #453 from apple/release-5.2
...
Merge release-5.2 into master
2018-06-08 10:41:44 -07:00
A.J. Beamon
6461478695
Merge pull request #452 from apple/release-5.1
...
Merge release-5.1 into release-5.2
2018-06-08 10:41:13 -07:00
Evan Tschannen
953c27e570
Merge pull request #431 from ajbeamon/tlog-rename-variables
...
Rename several variables in TLogServer.actor.cpp to follow our normal camel case conventions.
2018-06-08 10:30:22 -07:00
A.J. Beamon
c9543791fd
Fix case of newSeverity detail in StderrSeverity trace event
2018-06-08 10:24:12 -07:00
Evan Tschannen
ce6a2f0563
Merge pull request #425 from bnamasivayam/leader-election-optimize
...
Optimize client and server connection times to cluster controller, es…
2018-06-01 18:35:27 -07:00
Balachandar Namasivayam
59bfa74197
Address review comments. Refactor getLeader function to mask the first 7 bits of changeID and return the masked LeaderInfo.
2018-06-01 18:23:24 -07:00
Balachandar Namasivayam
529d0497f1
The proxy can go OOM when high volumes of writes are applied to it, particularly in a sudden burst before ratekeeper can control the workload.
...
Address this issue by proactively monitoring the memory used by commit batches and dropping requests if a certain memory limit is exceeded.
2018-06-01 15:21:40 -07:00
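The approach described in the commit above (track the memory held by commit batches and drop requests once a limit is exceeded) might look roughly like the following. This is a minimal sketch under stated assumptions, not the proxy's actual code: the class name `CommitBatchMemoryLimiter`, its methods, and the byte counts are all hypothetical.

```python
class CommitBatchMemoryLimiter:
    """Hypothetical admission control: count bytes held by in-flight
    commit batches and refuse new requests past a fixed limit."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def try_admit(self, request_bytes: int) -> bool:
        """Return True and reserve memory if the request fits,
        otherwise return False so the caller can drop the request."""
        if self.used_bytes + request_bytes > self.limit_bytes:
            return False
        self.used_bytes += request_bytes
        return True

    def release(self, request_bytes: int) -> None:
        """Give back memory once the batch holding the request completes."""
        self.used_bytes = max(0, self.used_bytes - request_bytes)


if __name__ == "__main__":
    limiter = CommitBatchMemoryLimiter(limit_bytes=1000)
    print(limiter.try_admit(600))  # fits within the limit
    print(limiter.try_admit(600))  # would exceed the limit, so it is dropped
    limiter.release(600)
    print(limiter.try_admit(600))  # fits again after the release
```

The point of checking before queuing is that the limit is enforced locally and immediately, without waiting for ratekeeper to react to the load.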
A.J. Beamon
1f0b519a73
Rename several variables in TLogServer.actor.cpp to follow our normal camel case conventions. I didn't rename every variable here, because some appear to be data structures (like a map) following the pattern keydesc_valuedesc, and I wasn't sure that the straightforward keydescValuedesc rename made sense. I did rename a couple of instances of these where it seemed reasonable, though.
2018-06-01 10:18:07 -07:00
Balachandar Namasivayam
9f55ccd4a5
Remove extraneous comments.
2018-05-31 15:32:47 -07:00
Balachandar Namasivayam
070366ca70
Optimize client and server connection times to cluster controller, especially in multi DC configurations.
...
Previously, a majority (quorum) answer from the coordinators was required to connect to the cluster controller.
Now a cluster controller is optimistically selected for connection even if there is no quorum.
2018-05-30 16:48:04 -07:00
A.J. Beamon
d9c702a9e3
Merge release-5.1 into release-5.2
2018-05-30 09:09:55 -07:00
A.J. Beamon
026458baf3
Merge release-5.2 into master
2018-05-23 15:32:56 -07:00
A.J. Beamon
e538fb4065
Add error description to error output when networking could not be initialized.
2018-05-23 15:05:28 -07:00
Alec Grieser
40babc40e1
remove one unnecessary line ; fix else formatting
2018-05-15 17:20:44 -07:00
Alec Grieser
6d132717f2
add versionstamp compatibility test to VersionStampWorkload
...
surfaces error found in #387
2018-05-15 17:09:24 -07:00
Dennis Schafroth
a9f54e1865
Compile on macOS 10.13.4: Use ASSERT_ABORT in destructors. Import fstream
2018-05-15 12:55:02 -07:00
Evan Tschannen
91338fc984
Merge branch 'master' into feature-remote-logs
2018-05-10 15:33:45 -07:00
Evan Tschannen
8f984cb2c9
Merge branch 'release-5.2'
...
# Conflicts:
# fdbrpc/TLSConnection.h
2018-05-10 09:13:22 -07:00
Evan Tschannen
d3450ce5b0
Merge pull request #343 from bnamasivayam/tls-plugin
...
Tls plugin
2018-05-09 16:35:53 -07:00
Evan Tschannen
f6e55d0b74
Merge pull request #348 from etschannen/release-5.2
...
DR upgrade tests now test the durability of the data.
2018-05-09 15:40:03 -07:00
Evan Tschannen
8930c2e3db
DR upgrade tests now test the durability of the data.
2018-05-09 15:11:05 -07:00
Balachandar Namasivayam
7591931a09
Revert "Make tls_verify_peers as a comma separated string of constraints."
...
This reverts commit 2033847e4b.
2018-05-09 14:40:36 -07:00
Balachandar Namasivayam
2033847e4b
Make tls_verify_peers as a comma separated string of constraints.
2018-05-09 14:37:39 -07:00
Alec Grieser
f3093642b3
Merge pull request #242 from alecgrieser/32437306-better-versionstamped-value
...
Unify SET_VERSIONSTAMPED_KEY and SET_VERSIONSTAMPED_VALUE API
2018-05-09 09:04:07 -07:00
Balachandar Namasivayam
e8b7f4b190
Add password support for tls.
2018-05-08 20:46:31 -07:00
Balachandar Namasivayam
49af5d685b
Restore the previous behavior in which not specifying the peer_verify option disables checking.
2018-05-08 18:54:44 -07:00
Balachandar Namasivayam
d3b5cfb93c
Support latest TLS plugin.
...
Add support for https in backup.
2018-05-08 16:28:13 -07:00
Evan Tschannen
9f0d244efe
Merge branch 'master' into feature-remote-logs
2018-05-08 13:28:23 -07:00
Evan Tschannen
7acdc314e4
Merge branch 'release-5.2'
...
# Conflicts:
# fdbrpc/TLSConnection.actor.cpp
2018-05-08 13:22:53 -07:00
Evan Tschannen
1f6c6a886b
Merge branch 'release-5.1' into release-5.2
2018-05-08 13:08:11 -07:00
A.J. Beamon
ca720e1540
Merge pull request #297 from apple/release-5.2
...
Merge 5.2 to Master
2018-05-08 12:04:20 -07:00
Alec Grieser
47c9e4f923
update bindings and bindingtester that uses versionstamps to use new protocol
...
issue #148
2018-05-08 08:57:09 -07:00
Alec Grieser
464e2cdbf0
change SetVersionstampedKey and SetVersionstampedValue behavior based on API version to make them consistent
2018-05-08 08:57:09 -07:00
Alec Grieser
14cca75429
server components of version of alternative versionstamp op that writes to an arbitrary place in the value
2018-05-08 08:57:08 -07:00
Evan Tschannen
e8f6ad88f0
fix: tripled the smallStorageTarget to prevent simulations which do a lot of work from timing out
2018-05-07 17:26:44 -07:00
Alec Grieser
752deb07a1
fix fdbmonitor help message output ; fix spelling error Ratekeeper.actor.cpp
2018-05-07 16:19:50 -07:00
Evan Tschannen
4677789b38
fix: low latency tests need 4 machines per datacenter to support triple replication after 1 machine has failed
2018-05-07 11:28:25 -07:00
Evan Tschannen
529bd34cf9
fix: when a tlog is stopped by another recruitment it no longer has the opportunity for committingQueue to be set
2018-05-06 20:37:44 -07:00
Evan Tschannen
81c7bddaf8
fix: must check for log router errors while waiting on satellite replies because the recruitmentID will not be updated if it threw an error
2018-05-06 18:15:12 -07:00
Evan Tschannen
8cb8198250
fix: the e-brake should be buggified with ratekeeper storage limits to prevent simulation from running full blast into the e-brake resulting in simulation taking forever to complete (joshua timeouts)
2018-05-06 12:33:25 -07:00
Evan Tschannen
cc6511a39e
fix: we do not know that the minimum popped version on the log router is a known committed version until it has advanced.
2018-05-06 09:32:41 -07:00
Evan Tschannen
b1935f1738
fix: do not allow a storage server to be removed within 5 million versions of it being added, because if a storage server is added and removed within the known committed version and recovery version, the storage server will need to see either the add or remove when it peeks
2018-05-05 18:16:28 -07:00
Evan Tschannen
8371afb565
fix: log routers need to know if the log system is stopped to determine how they should peek the last log generation
2018-05-05 17:56:00 -07:00
Evan Tschannen
7ed64c821e
fix: recruiting a cluster controller takes longer after restarting tests because we wait until files have recovered from disk before starting
2018-05-05 17:20:48 -07:00
Evan Tschannen
e8ea02e054
fix: storage servers need to fail if they can no longer peek data
2018-05-05 17:19:59 -07:00
A.J. Beamon
432a295bc2
Add read bytes and read keys info to status. Collect this information directly from StorageMetrics rather than through ratekeeper.
2018-05-04 12:01:40 -07:00
Evan Tschannen
440e2ae609
fix: data distribution logic was incorrect for finding a complete source team in a failed DC
2018-05-01 23:08:31 -07:00
Evan Tschannen
87ad03ce53
locality aware load balancing was disabled on the storage servers because emergency teams might cause a server to be assigned a shard when it does not actually have the data. This problem has been fixed, so we can re-enable locality aware load balancing.
2018-05-01 22:45:22 -07:00
Evan Tschannen
b4bd03e67e
fix: we cannot set queueCommitEnd until we have popped the log system to prevent the popped version from going backwards
2018-05-01 22:20:25 -07:00