Alex Miller
2d26e98d07
Add a cross-platform getLastWrite() to get a file's mtime.
2018-07-20 19:00:32 -07:00
A.J. Beamon
a7a1124c11
Fix incompatible connection accounting that was incorrectly decrementing the incompatible count in some cases.
2018-07-17 11:36:05 -07:00
A.J. Beamon
8879954254
Merge pull request #609 from etschannen/release-6.0
...
Improved simulation strength by only removing datacenters that have been killed
2018-07-16 15:59:28 -07:00
Evan Tschannen
e0caa28758
code cleanup
2018-07-16 15:56:43 -07:00
AlvinMooreSr
aafb3c5c00
Merge pull request #593 from AlvinMooreSr/release-6.0-tls-funct
...
Replaced separate TLS Log function with FDB TraceEvent logger
2018-07-16 12:01:02 -07:00
Evan Tschannen
f72a9f60c0
only disable fearless if a datacenter has actually been killed
...
fix: we must prevent recovery into the dead datacenter while reducing usable_regions
2018-07-16 10:06:57 -07:00
Alvin Moore
a034acf3bd
Replaced separate TLS Log function with FDB TraceEvent logger
2018-07-11 18:41:46 -07:00
Alec Grieser
d5a23642a1
Merge pull request #587 from etschannen/feature-remote-logs
...
close unneeded connections
2018-07-10 13:27:15 -07:00
Evan Tschannen
a35d5e30d9
Added a SevError trace event in case peer references become negative
2018-07-10 13:26:28 -07:00
Evan Tschannen
c25be5699a
close unneeded connections
2018-07-10 13:10:29 -07:00
Alec Grieser
be9c34c6f8
Merge remote-tracking branch 'upstream/release-5.2' into merge-release-5.2
2018-07-10 10:04:48 -07:00
Alec Grieser
ad37b1693d
Merge pull request #585 from etschannen/feature-remote-logs
...
A variety of cleanup and test strengthening commits
2018-07-10 09:58:44 -07:00
AlvinMooreSr
b3916a9b71
Merge pull request #409 from joelarmstrong/tlsconnection-clang-ub-warning
...
Fix compilation with clang from Apple LLVM 9.1.0
2018-07-10 09:32:24 -07:00
Evan Tschannen
82cc30be62
added testing for two_satellite_fast and two_satellite_safe
2018-07-09 22:01:46 -07:00
Stephen Atherton
fddb3e87e2
Differentiate between a timeout in attempting to connect vs a timeout on an active connection by converting timeouts during connection attempts to connection_failed errors.
2018-07-09 19:40:01 -07:00
Stephen Atherton
3ce7c78d36
If an HTTP request fails due to a connection failure or a timeout, do not convert the error to the more generic http_request_failed.
2018-07-09 18:58:33 -07:00
Evan Tschannen
e503dc975c
fix: destroy peers that are inactive
...
do not open new connections to send replies
2018-07-09 13:37:06 -07:00
Evan Tschannen
5a2cb3037b
merge 5.2 into 6.0
2018-07-08 20:14:06 -07:00
Evan Tschannen
0e97ce79b4
fix: destroy peers that are inactive
...
do not open new connections to send replies
2018-07-08 10:26:41 -07:00
Stephen Atherton
a2f16e217e
Memory waste fix: when a Peer disconnects, an extra packet buffer block is allocated to hold unsent reliable bytes even if there aren't any.
2018-07-06 19:44:30 -07:00
Evan Tschannen
6d7172ef7e
fix: canKillProcesses did not take into account the remoteTLogPolicy when checking notEnoughLeft
2018-07-05 21:36:09 -07:00
Evan Tschannen
6f4ca2eba2
fix: get all processes did not include rebooting processes
2018-07-05 21:13:56 -07:00
Evan Tschannen
cd4fb9285a
waitForExlusion requires both regions to be healthy, which is only possible if we do not kill all logs in a region
2018-07-05 14:04:42 -07:00
Evan Tschannen
7315e5da55
fix: isExcluded and isCleared were exactly wrong
...
fix: isCleared should mean the process is dead
2018-07-05 02:22:22 -07:00
Evan Tschannen
e17dfea3b6
fix: desiredTLogCount was used instead of getDesiredLogs(), which caused problems with recruitment when desiredTLogCount was -1.
...
canKillProcess logic was wrong.
We still need to configure usable_regions because if datacenterVersionDifference is too large we cannot complete data movement.
2018-07-04 16:22:32 -04:00
Alvin Moore
c3f88dbfe1
Merge branch 'master' of github.com:apple/foundationdb into tls-static
2018-07-01 23:13:57 -07:00
Alvin Moore
132e2d9267
Defined TLS build flags for projects
...
Updated TLS documentation
2018-07-01 22:49:39 -07:00
Evan Tschannen
899f880ce0
fix: log router class did not have the proper fitness for becoming the cluster controller
2018-06-28 23:20:01 -07:00
Alvin Moore
45849d1f95
Added support for no-op legacy TLS options
2018-06-27 09:25:05 -07:00
Alvin Moore
65d8b38ae9
Changed generic plugin code to work as expected plugin code except for TLS use case
...
Defined TLS plugin name constant
Changed TLS plugin name to get_tls_plugin
Fixed link script
Removed compilation flags from info make target
2018-06-26 16:01:25 -07:00
Alvin Moore
ef8de426d3
Changed the TLS_DISABLED macro
...
Disable TLS within Windows until working
2018-06-26 12:08:32 -07:00
Evan Tschannen
0123627d67
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-06-22 10:43:07 -07:00
Evan Tschannen
5fc8199abc
Swapped OkayFit and UnsetFit, because generally if machine classes are set on one machine they are set everywhere, and it helps with wait_for_good_recruitment logic
...
wait_for_good_recruitment now requires that you have the desired count of each role
remote recruitment is given a much longer wait_for_good_recruitment time interval, which does not start until enough remote machines have registered
2018-06-22 10:15:24 -07:00
Evan Tschannen
1dce97f28c
Merge branch 'release-5.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbserver/SimulatedCluster.actor.cpp
# packaging/msi/FDBInstaller.wxs
# versions.target
2018-06-21 17:05:11 -07:00
Balachandar Namasivayam
d7dba11366
Throw tls_error instead of internal_error when not able to create a TLS connection.
2018-06-21 15:33:00 -07:00
Stephen Atherton
e9e1e194f0
Added operation-specific rate controls to blob store interface.
2018-06-20 20:34:34 -07:00
Richard Low
fff6a47c43
Validate certificates by default
2018-06-20 14:04:03 -07:00
Alvin Moore
f8ce1de601
Added support for compiling TLS into binaries
2018-06-20 09:21:23 -07:00
Evan Tschannen
0913368651
added usable_regions to specify if we will replicate into a remote region
...
remote replication defaults to the primary replication
removed remote_logs, because they should be specified as an override in the regions object
2018-06-17 19:31:15 -07:00
Alex Miller
6c2cb25c53
Rename BestOtherFit -> OkayFit.
...
The previous order of fitness was
BestFit > GoodFit > BestOtherFit > ...
which is baffling. It's now:
BestFit > GoodFit > OkayFit > ...
which won't break anyone's expectations.
2018-06-12 16:50:25 -07:00
Evan Tschannen
372ed67497
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
2018-06-11 11:34:10 -07:00
Evan Tschannen
48fbc407fd
fix: we cannot kill all of the remote tlogs, because we still need their data to copy to the next generation in the same data center
2018-06-08 15:28:44 -07:00
A.J. Beamon
99c9958db7
Some more trace event normalization
2018-06-08 13:57:00 -07:00
A.J. Beamon
e5488419cc
Attempt to normalize trace events:
...
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.
Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.
This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
Balachandar Namasivayam
529d0497f1
Proxy going OOM when applying high volumes of writes to a proxy, particularly in a sudden fashion before ratekeeper can control the workload.
...
Address this issue by proactively monitoring the memory used by commit batches and dropping requests if a certain memory limit is exceeded.
2018-06-01 15:21:40 -07:00
A.J. Beamon
d9c702a9e3
Merge release-5.1 into release-5.2
2018-05-30 09:09:55 -07:00
Joel Armstrong
7c35ea6ba1
Fix use of bool in va_start causing undefined behavior
...
The version of clang included in Apple LLVM 9.1.0 complains about
passing the bool parameter `is_error` to va_start, which causes make
to fail:
fdbrpc/TLSConnection.actor.cpp:370:16: error: passing an object that undergoes
default argument promotion to 'va_start' has undefined behavior
[-Werror,-Wvarargs]
va_start( ap, is_error );
^
This just switches is_error back to the type it gets promoted to (int).
2018-05-24 16:37:11 -07:00
A.J. Beamon
026458baf3
Merge release-5.2 into master
2018-05-23 15:32:56 -07:00
Richard Low
84ed35b01f
Only log TLS verify failures if all verification fails; log failures at SevInfo
2018-05-21 10:58:59 -07:00
Richard Low
086700aeb1
Plumb through TLS key password to CLI and from environment
2018-05-21 10:56:10 -07:00