Commit Graph

1494 Commits

Author SHA1 Message Date
A.J. Beamon 58fbd0e3a1
Merge pull request #2980 from alexmiller-apple/tls-background-eio-thread
Stop background eio threads on Net2::stop()
2020-04-22 08:17:59 -07:00
Alex Miller c6df20a179
Use nullptr instead of NULL 2020-04-21 20:39:45 -07:00
negoyal 2fa7d485f5 Merge branch 'master' into cache_storageq_results 2020-04-21 17:28:17 -07:00
tclinken 885a2e020e Generate rvalue reference overloads with actor compiler 2020-04-21 15:18:59 -07:00
Vishesh Yadav 92c86a7799 Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning 2020-04-21 03:36:19 -07:00
Alex Miller a51746b307 Match 6.2.15's behavior in how invalid/unreadable/non-existent certs are handled.
Which is to proceed past Net2 creation, and allow certificate refresh to
try and eventually load valid certs.  Additionally, fix certificate
refreshing dying if the certificate is not readable when first called.

In testing, I also found and fixed an issue where if a cert went from
unreadable to readable, we wouldn't reload the TLS context, due to not
considering it as a file change.
2020-04-20 21:38:04 -07:00
Alex Miller 20fe068863 Merge branch 'tls-background-eio-thread' into tls-permission-errors 2020-04-20 20:51:05 -07:00
Xin Dong 49c6bb90ef
Merge pull request #2982 from alexmiller-apple/tls-log-settings
Log Net2TLSConfig with paths and settings when using TLS.
2020-04-20 15:46:26 -07:00
Alex Miller 75a4f3b7c9 Remove comment about ignoring runOnMainThread errors.
If we got an exception, it wouldn't be of type `Error` anyway, so
it seems like things would crash regardless.
2020-04-20 13:19:42 -07:00
Alex Miller da8e47ea25 Merge remote-tracking branch 'upstream/release-6.2' into tls-background-eio-thread 2020-04-20 13:15:05 -07:00
Alex Miller 5c399bf725 Move the callbacks into ::run() right before it exits.
stopped=true doesn't cause the run loop to immediately exit.
2020-04-20 13:14:19 -07:00
Markus Pilman 040e62341a Merge branch 'master' of github.com:apple/foundationdb into features/flatbuffers-debugtx 2020-04-20 12:03:24 -07:00
Jingyu Zhou 7507f2da81
Merge pull request #2984 from satherton/future-move-t-constructor
Added Future<T>(T &&value) constructor to avoid a copy...
2020-04-20 11:47:32 -07:00
Steve Atherton ba1b0a1d96 Use std::move() instead of forward. 2020-04-20 11:01:01 -07:00
A.J. Beamon c28a843251
Merge pull request #2977 from alexmiller-apple/tls-no-atexit
Fix clients crashing in TLS code on exit.
2020-04-20 08:40:16 -07:00
Steve Atherton 022b77e288 Actor compiler will std::move() return expressions that exactly match a state variable. 2020-04-20 04:19:33 -07:00
Alex Miller 2ce539ef6d Respect flow<->fdbrpc module boundaries.
Which fixes a compilation error due to a circular dependency between
flow.a and fdbrpc.a.  However, this is now done at the cost of newNet2
users having to remember to add Net2FileSystem::stop() as a callback.
2020-04-20 02:53:07 -07:00
Steve Atherton 7b23c6f640 Future constructor to avoid a copy when Future<T> is initialized from an rvalue reference to T. 2020-04-20 01:52:28 -07:00
Steve Atherton 21277fdb4f Merge branch 'master' of github.com:apple/foundationdb into feature-redwood 2020-04-20 01:47:21 -07:00
Alex Miller cbb6ffb431 Only log OpenSSL error strings for OpenSSL errors.
Normal "connection refused" messages would show up with a long verbose
string that doesn't really provide any useful information otherwise.
2020-04-18 20:39:02 -07:00
Alex Miller 11eebc4a48 Log Net2TLSConfig with paths and settings when using TLS.
There were similar TraceEvents in the FDBLibTLS/LibreSSL TLS
implementation that were accidentally dropped in the TLS rewrite.

This makes it so that one does not have to use magic to figure out if a
process was configured with TLS correctly when some of the settings come
from environment variables.
2020-04-18 20:21:10 -07:00
Alex Miller 1398e9a82e Stop background eio threads on Net2::stop().
This will stop eio threads for both the client (`fdb_stop_network()`)
and the server.  This change is being done more for the former, but I
don't see any harm in doing the latter as well.
2020-04-18 19:40:55 -07:00
Alex Miller 94b4f78ea9 Fix clients crashing in TLS code on exit.
If client code initiates an FDB operation to a TLS cluster, and then
immediately exits the main thread, then OpenSSL's atexit handler would
potentially run while the network thread is attempting to do TLS
operations, and thus crash.

This commit removes the OpenSSL atexit handler, and instead relies on a
client intentionally ending the network thread to do TLS cleanup.  If
the client code exits without stopping the network thread, then we'll
never free OpenSSL data structures, which is the safer thing to do.
2020-04-18 15:48:02 -07:00
Andrew Noyes cb6389d42d Prevent main thread from destroying flatbuffers globals
We recently witnessed (using tsan) the main thread exiting without first
joining the network thread, and this caused data races and
heap-use-after-free's

Now the lifetime of these globals will be tied to the network thread
itself (and I guess every thread, but the one that actually uses memory
will be owned by the network thread.)
2020-04-17 23:34:28 +00:00
Markus Pilman 72028de14d intermediate commit 2020-04-17 15:30:45 -07:00
Evan Tschannen ba3e2af473 Merge commit '5288033bcfe40c3ade97c8bf2d04cf31b3f16cb1' into feature-tree-broadcast 2020-04-17 15:17:37 -07:00
A.J. Beamon dfec896438 Enforce a throttle limit. Don't count transaction tags on RK if the proxy has updated us in a while. 2020-04-17 11:48:02 -07:00
A.J. Beamon 78d48a0dad Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/Knobs.h
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/fdbclient.vcxproj
#	fdbserver/MasterProxyServer.actor.cpp
2020-04-17 09:23:18 -07:00
Markus Pilman e22b88d857 Revert "Add FBTrace calls for most debugTransaction trace messages."
This reverts commit be68bd6ad2.
2020-04-16 13:14:27 -07:00
Markus Pilman 9f114d96c1 Merge branch 'master' of github.com:apple/foundationdb into features/flatbuffers-debugtx 2020-04-16 12:32:19 -07:00
Vishesh Yadav 8c8f23bff2 Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning 2020-04-16 00:45:35 -07:00
Xin Dong cb47def22a
Merge pull request #2946 from alexmiller-apple/isystem-flow-lpthread
Remove `-isystem flow/-lpthread` from INCLUDES/CXXFLAGS
2020-04-15 11:01:45 -07:00
A.J. Beamon d8690d31cd Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Net2.actor.cpp
2020-04-15 08:31:30 -07:00
Evan Tschannen 37e2b0d353
Update flow/network.cpp 2020-04-14 17:12:07 -07:00
negoyal b85dc16c6d Merge branch 'master' into fdb_cache_subfeature2 2020-04-14 17:07:41 -07:00
Evan Tschannen a3598a7616
Merge pull request #2738 from ajbeamon/fix-assertion-failure-on-io-error
Fix assertion failure in SQLite thread pools on io_error
2020-04-14 16:48:22 -07:00
A.J. Beamon b1172417f5 Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/Net2.actor.cpp
2020-04-14 14:22:12 -07:00
A.J. Beamon e104a2e3a6 Merge commit 'cf01233f28a2c42908656a39f458a4475c1d44a3' into run-loop-busy-profiler
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/NativeAPI.actor.h
#	fdbserver/fdbserver.actor.cpp
#	flow/Net2.actor.cpp
2020-04-14 14:02:24 -07:00
Evan Tschannen cf01233f28
Merge pull request #2549 from ajbeamon/slow-task-and-priority-tracking-improvements
Slow task and priority tracker improvements
2020-04-14 13:26:32 -07:00
Alex Miller 2e53c8c5e2
Merge pull request #2915 from tclinken/move-optimizations
Avoid unnecessary copies in PromiseStream
2020-04-13 21:15:16 -07:00
A.J. Beamon c851ee4031
Merge pull request #2897 from tclinken/fix-trace-batch-loggroup-and-role
Annotate trace batch events before dumping
2020-04-13 11:22:51 -07:00
Alex Miller 7dc348d077 Remove `-isystem flow/-lpthread` from INCLUDES/CXXFLAGS
This cmake line generated a bogus and nonsensical include path, so as
the entire line isn't necessary, just remove it.
2020-04-13 02:22:25 -07:00
Evan Tschannen 07cc0a8d74 code cleanup 2020-04-10 17:02:11 -07:00
tclinken 8ef5a04896 Guard all of annotateEvent with mutex 2020-04-10 13:03:15 -07:00
A.J. Beamon 29b2c2f3aa Add hash to StringRef. Use unordered maps for storing tags. Create some helpful typedefs. 2020-04-10 12:54:59 -07:00
A.J. Beamon ebeca10bce Change the serialization of tags sent in some messages. Add communication of the sampling rate from cluster to clients. 2020-04-09 16:55:56 -07:00
negoyal be68bd6ad2 Add FBTrace calls for most debugTransaction trace messages. 2020-04-09 16:45:24 -07:00
tclinken 01285f3374 Delay annotation of trace batch events created before trace file is opened 2020-04-09 14:09:00 -07:00
Steve Atherton 8de2bbb2dc Merge branch 'master' of github.com:apple/foundationdb into feature-redwood 2020-04-09 09:39:49 -07:00
Vishesh Yadav 13447f439f fdbrpc: Add a constant to onFailedFor()
Since we mark an address as failed when a connection fails, this
patch adds a constant to compensate for the time needed to reconnect and
make sure the endpoint is actually down. This constant is equal to
FAILURE_MIN_DELAY, which was used by the centralized
FailureMonitoringClient that was removed earlier.
2020-04-08 19:34:40 -07:00
Vishesh Yadav 975e6b1d9a Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning
Removed merge conflict with old build system.
2020-04-08 19:25:13 -07:00
A.J. Beamon af4e0088ba
Merge pull request #2896 from tclinken/atomically-update-dependent-knobs
Atomically update dependent knobs
2020-04-08 15:00:49 -07:00
tclinken 3a01d24970 Pass const ref to a_callback_fire 2020-04-08 14:50:41 -07:00
tclinken 52860043c9 Merge remote-tracking branch 'origin' into atomically-update-dependent-knobs 2020-04-08 12:26:21 -07:00
Steve Atherton 872877b221 Added StringRef::copyTo(), a prettier way to memcpy a StringRef somewhere and with a more useful return value. 2020-04-08 03:23:46 -07:00
A.J. Beamon 36da61dd9c Merge branch 'master' into transaction-tagging
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/vexillographer/fdb.options
2020-04-07 21:12:14 -07:00
Markus Pilman d4542dbb5a Delete old build system 2020-04-07 11:03:45 -07:00
Steve Atherton a354d7a037 Merge branch 'master' of github.com:apple/foundationdb into feature-redwood 2020-04-06 20:29:12 -07:00
negoyal e24017fae3 Minor definition fixes. 2020-04-06 17:53:26 -07:00
tclinken e682f6741c Avoid unnecessary copies in PromiseStream 2020-04-06 17:15:03 -07:00
negoyal 0fe5e55f77 Merge branch 'features/flatbuffers-debugtx' of github.com:negoyal/foundationdb into features/flatbuffers-debugtx 2020-04-06 16:01:31 -07:00
negoyal 740c5490b3 Added flat buffer classes for debugTransaction trace message types. 2020-04-06 16:00:40 -07:00
Markus Pilman 8b5780c36c don't include source and binary dir
This forces users to use include paths from the sources root.

So `#include "Arena.h"` won't work anymore, only
`#include "flow/Arena.h"` will.
2020-04-06 10:13:49 -07:00
Markus Pilman 9beb191932 Merge branch 'master' of github.com:apple/foundationdb into features/flatbuffers-debugtx 2020-04-06 09:38:58 -07:00
Vishesh Yadav fdc1048f75 Add knob to turn off marking unstable connections 2020-04-03 15:53:00 -07:00
Vishesh Yadav 1d35f2ff5a Mark a connection as failed for X seconds if closes too often 2020-04-03 15:53:00 -07:00
A.J. Beamon 2336f073ad Checkpointing a bunch of work on throttles. Rudimentary implementation of auto-throttling. Support for manual throttling via fdbcli. Throttles are stored in the system keyspace. 2020-04-03 15:24:14 -07:00
Xin Dong 760dc68b7f
Merge pull request #2869 from dongxinEric/feature/1689/allow-custome-trace-log-file-identifier
Allow the user to provide a custom trace log file identifier that wi…
2020-04-03 14:08:17 -07:00
Alvin Moore 78f0cddb14
Merge pull request #2684 from mpilman/features/boost70
Upgrade to boost 1.72
2020-04-03 09:30:59 -07:00
Markus Pilman 6a899dd0d8 added knob to turn on/off fbtraces 2020-04-03 09:24:23 -07:00
tclinken 10fee8fafc Annotate trace batch events before dumping 2020-04-02 19:34:02 -07:00
negoyal a0c8946f31 Merge branch 'master' into fdb_cache_subfeature2 2020-04-02 12:27:04 -07:00
Markus Pilman bbd2fe62cc Merge branch 'master' of github.com:apple/foundationdb into features/boost70 2020-04-02 09:21:01 -07:00
Meng Xu 6bce67ca75 FastRestore:Apply clang-format 2020-04-01 21:27:54 -07:00
Meng Xu 1eea388278 Add unit test for splitMutation test file
Comment out debug trace as well.
2020-04-01 21:27:18 -07:00
tclinken 884e92bb49 Atomically update dependent knobs 2020-04-01 15:18:49 -07:00
Markus Pilman 3e3ac9f67d Added interface 2020-04-01 14:41:16 -07:00
A.J. Beamon 903128a36f Fix boost error trace event fields to match other trace events by the same name. 2020-04-01 08:18:50 -07:00
Steve Atherton c1b31b46ae Missing include, doesn't cause a compile error but IDE complains when viewing the file. 2020-04-01 02:41:16 -07:00
Meng Xu 7847d70e3a StringRef:toString:Handle empty StringRef 2020-03-31 22:14:05 -07:00
Xin Dong 6820167d77
Merge branch 'master' into feature/1689/allow-custome-trace-log-file-identifier 2020-03-31 16:50:46 -07:00
Xin Dong 2805111a32 When provided with a custom identifier, use that string instead of the port/PID as the last part of the baseName. 2020-03-31 11:02:02 -07:00
Meng Xu 60f6edc3b5
Merge pull request #2860 from zjuLcg/report-conflicting-key-roll-forward
Report conflicting key roll forward
2020-03-30 17:33:56 -07:00
Markus Pilman 28cd38913d Merge branch 'master' of github.com:apple/foundationdb into features/boost70 2020-03-27 13:44:10 -07:00
Alex Miller cd65f301dd Merge remote-tracking branch 'upstream/master' into mutation-debugging 2020-03-27 03:47:24 -07:00
Alex Miller 72e5891058 Clean up and rework the debugMutation API.
As a relatively unknown debugging tool for simulation tests, one could
have simulation print when a particular key is handled in various stages
of the commit process.  This functionality was enabled by changing a 0
to a 1 in an #if, and changing a constant to the key in question.

As a proxy and storage server handle mutations, they call debugMutation
or debugKeyRange, which then checks the mutation against the key
in question, and logs if they match.  A mixture of printfs and
TraceEvents would then be emitted, and for this to actually be usable,
one also needs to comment out some particularly spammy debugKeyRange()
calls.

This PR reworks the API of debugMutation/debugKeyRange, pulls it out
into its own file, and trims what is logged by default into something
useful and understandable:
* debugMutation() now returns a TraceEvent, that one can add more details to before it is logged.
* Data distribution and storage server cleanup operations are no longer logged by default
2020-03-27 03:30:28 -07:00
negoyal acaf91ac47 Merge branch 'master' into fdb_cache_subfeature2 2020-03-26 13:33:08 -07:00
Xin Dong 03e2102a21 Fix macOS build failure. 2020-03-26 11:41:36 -07:00
Xin Dong a0177a9335 Allow the user to provide a custom trace log file identifier that will be used as the prefix of all trace log files created at the client side. 2020-03-26 11:25:05 -07:00
tclinken baf0fe956c Take trace mutex in setLogGroup 2020-03-26 09:55:03 -07:00
tclinken 7d5ed53215 Allow trace log group to be set after database is created 2020-03-25 13:40:43 -07:00
A.J. Beamon e0424a52f8 Merge branch 'master' into transaction-tagging 2020-03-25 08:23:11 -07:00
Andrew Noyes 9123cd35ed Version report_conflicting_keys field 2020-03-24 18:11:15 -07:00
Andrew Noyes c0bae64105 Use sigaction, _exit if anything fails 2020-03-20 12:50:31 -07:00
A.J. Beamon 26b7e02d4c Some initial work to support tagging transactions and passing them around. 2020-03-20 11:23:11 -07:00
Balachandar Namasivayam 58a9bfa78b
Merge pull request #2820 from dongxinEric/fix/1977/add-back-trace-event-flush-failure-report
Fix/1977/add back trace event flush failure report
2020-03-18 16:11:44 -07:00
Andrew Noyes bed5d4733a Fix syntax 2020-03-18 11:00:02 -07:00
Andrew Noyes 0d4f49f02f Run default signal handler after custom signal handler 2020-03-18 10:54:47 -07:00
Meng Xu d8a9034085
Merge pull request #2741 from ajbeamon/roughness-cleanup
Clean up and add comments to Roughness calculation
2020-03-17 15:45:35 -07:00
Evan Tschannen e08f0201f1 merge release 6.2 into master 2020-03-17 12:51:47 -07:00
Xin Dong 31a9f0a26c Fix the segfault 2020-03-17 11:03:46 -07:00
Evan Tschannen 787a5caaca
Merge pull request #2805 from ajbeamon/localized-allocation-trace-depth
Don't disallow allocation tracking when a trace event is open
2020-03-16 16:21:09 -07:00
Steve Atherton c7a9d184f0
Merge pull request #2790 from tclinken/ignore-create-directory-errors
Ignore createDirectory error if directory already exists
2020-03-16 16:13:47 -07:00
Evan Tschannen c197520fa7
Merge pull request #2810 from alexmiller-apple/fdbcli-tlsinfo
Add a `tlsinfo` command to fdbcli that prints the certificate chain.
2020-03-16 15:47:32 -07:00
A.J. Beamon f1523bd472 Setting the network thread more than once is a no-op 2020-03-16 15:37:06 -07:00
A.J. Beamon 96187618a0 Fix condition to check whether allocation tracing is enabled 2020-03-16 15:12:50 -07:00
Evan Tschannen ed4d02a3e4
Merge pull request #2812 from etschannen/feature-proxy-mem-limit
Limit the amount of requests the proxy can queue up in memory
2020-03-16 14:56:56 -07:00
A.J. Beamon 7769218303 Move an increment after an ASSERT. 2020-03-16 14:11:07 -07:00
A.J. Beamon d8cfabe73b Extend the allocation tracing disabling flag to cover more parts of trace logging as a precaution. Make it possible to disable via knob. 2020-03-16 13:59:31 -07:00
Xin Dong 89861c661e Fix the random crash. Use a thread safe 'ThreadReturnPromise' instead of the ThreadFuture. 2020-03-16 13:36:55 -07:00
A.J. Beamon ee3cde0b0d
Merge pull request #2815 from etschannen/feature-timeout-tlog-create
Treat a tlog which takes a long time to create its disk queue as failed
2020-03-16 12:49:33 -07:00
Alex Miller 72326fe8af Fix the build. 2020-03-16 12:46:13 -07:00
Alex Miller db5863145a Merge remote-tracking branch 'upstream/release-6.2' into fdbcli-tlsinfo 2020-03-16 12:33:50 -07:00
Evan Tschannen a068d4063f renamed ProxyGetConsistentReadVersion 2020-03-16 12:11:32 -07:00
Evan Tschannen 77dde00da7
Merge pull request #2818 from ajbeamon/increase-metrics-priority
Increase priority of the logging of various metrics trace events
2020-03-16 11:57:37 -07:00
Evan Tschannen ea98c7a40a added additional timeout on initPersistentState 2020-03-16 11:38:14 -07:00
A.J. Beamon 5f4373c200
Merge pull request #2811 from alexmiller-apple/tls-failures-status
Add TLS Policy Failure count to ProcessMetrics and status json
2020-03-16 11:11:30 -07:00
A.J. Beamon 031b579ede Increase priority of the logging of various metrics trace events. 2020-03-13 16:20:23 -07:00
Alex Miller a5568b2fc6 Rewrite tlsinfo into --debug-tls, and print out configuration. 2020-03-13 15:46:03 -07:00
Evan Tschannen 243c268d9d Limit the amount of requests the proxy can queue up in memory 2020-03-13 10:17:49 -07:00
Alex Miller 04498cbc0e Make policy failures be reported as per 1s and not over 5s. 2020-03-13 02:49:06 -07:00
Alex Miller 75e2fffe5a Add a ProcessMetrics.TLSPolicyFailures metric
This reports the number of policy failures over the past 5s interval.
It also is step 1 towards getting this information into status json.
2020-03-13 02:24:37 -07:00
Alex Miller 0c558efcfe Add a `tlsinfo` command to fdbcli that prints the certificate chain.
This requires the certificate chain to load successfully, otherwise
fdbcli will error out at an earlier point due to Net2 not being able to
configure TLS.
2020-03-13 00:11:53 -07:00
Xin Dong 5967ef5eab Added back the changes that report trace log flush failures and fix the random crash 2020-03-12 14:34:19 -07:00
A.J. Beamon 2466749648 Don't disallow allocation tracking when a trace event is open because we now have state trace events. Instead, only block allocation tracking while we are in the middle of allocation tracking already to prevent recursion. 2020-03-12 11:17:49 -07:00
A.J. Beamon 8cdf918316 Add logging when file identifiers don't match 2020-03-12 11:06:53 -07:00
Andrew Noyes 770ef6e726 Add test 2020-03-10 10:42:57 -07:00
Andrew Noyes 027029cc9b Remove offending overload? 2020-03-10 10:18:14 -07:00
Evan Tschannen 303df197cf Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	bindings/c/test/mako/mako.c
#	documentation/sphinx/source/release-notes.rst
#	fdbbackup/backup.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/LogRouter.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/flow.vcxproj
#	flow/flow.vcxproj.filters
#	versions.target
2020-03-06 18:22:46 -08:00
tclinken 2017daf7d4 Ignore createDirectory error if directory already exists 2020-03-06 16:48:23 -08:00
Evan Tschannen dbfc0cbcc0
Merge pull request #2781 from alexmiller-apple/certificate-refresh
Refresh certificates used for handshaking when they change on disk
2020-03-06 11:12:04 -08:00
Alex Miller f9969a853c Merge remote-tracking branch 'origin/certificate-refresh' into certificate-refresh 2020-03-06 11:10:05 -08:00
Alex Miller 188d9b8239 Don't swallow actor cancellation in certificate refreshing. 2020-03-06 11:09:17 -08:00
Alex Miller 9b760fae2d Rewrite all Errors into tls_errors if they happen as part of initializing TLS. 2020-03-06 11:06:19 -08:00
Alex Miller 1f56bf8933
Fix the build with success()
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-03-06 10:15:04 -08:00
Alex Miller ac52b6b474 Rework a bit of error and exception handling.
I went back and dug through all of the "what functions can throw what
types", and made sane decisions about them.  boost errors are
aggressively translated into FDB ones, which might result in multiple
lines of logging about errors, but this is in infrequently run code, so
it should be fine.
2020-03-06 02:33:16 -08:00
Evan Tschannen 39050308ff lower accept batch size just to be conservative with the change 2020-03-05 18:17:49 -08:00
Evan Tschannen 1128666840 added additional logging on the log router 2020-03-05 18:17:06 -08:00
Alex Miller ccef3f7d05 Attempt to fix TLS_DISABLED compiles. 2020-03-05 17:32:10 -08:00
Alex Miller 2d95a1e64d Implement certificate refreshing 2020-03-05 17:25:33 -08:00
Alex Miller 595dd77ed1 Merge remote-tracking branch 'upstream/release-6.2' into certificate-refresh 2020-03-04 20:25:42 -08:00
Alex Miller 9b5ef3416e Refactor TLSParams into TLSConfig + LoadedTLSConfig
The idea being that we keep around a TLSConfig that holds the configuration
that the user has provided, and then when we want to initialize an SSL
context, we ask the TLSConfig to load all certificates and return us a
LoadedTLSConfig that is a concrete set of certificate bytes in memory.

initTLS now just takes the in-memory bytes and applies them to the ssl
context.

This is a large refactor to lead up into certificate refreshing, where we
will periodically check for changes to the certificates, and then
re-load them and apply them to a new SSL context.
2020-03-04 20:14:47 -08:00
Xin Dong 39610d15f8 Revert this change since it somehow introduced a random crash detected on circus 2020-03-04 16:14:38 -08:00
Evan Tschannen 2a877bce9a
Merge pull request #2777 from etschannen/feature-accept-batch
Accept connections in batches of 20 to improve performance
2020-03-04 16:14:24 -08:00
Evan Tschannen c73cae0feb
Merge pull request #2760 from ajbeamon/client-version-fixes
Improvements to client version reporting
2020-03-04 15:52:49 -08:00
A.J. Beamon b3c3f8aa5f
Update flow/genericactors.actor.h
Pass by reference
2020-03-04 15:35:51 -08:00
Evan Tschannen 7cbabca124 remove printing to stderr from initTLS because that could cause problems on clients 2020-03-04 15:06:22 -08:00
Evan Tschannen 35a1ac6482 prepare net2 for new versions of boost 2020-03-04 14:26:01 -08:00
Evan Tschannen da579faf62 add missing task priority 2020-03-04 14:25:30 -08:00
Evan Tschannen 820957025f accept connections in batches of 20 to improve performance 2020-03-04 14:24:57 -08:00
Andrew Noyes 24bbf5a8f0 Avoid invalid read on invalid Void msg 2020-03-02 12:11:43 -08:00
Andrew Noyes cdbe3117d7 Fix typo 2020-03-02 12:11:43 -08:00
Andrew Noyes 7119b46eb2 Add unit test 2020-03-02 12:11:43 -08:00
Evan Tschannen c11c24b79d removed the fdbrpc version of platform.h 2020-02-28 14:56:10 -08:00
Andrew Noyes e6d36a0aa5 Fix Makefile build 2020-02-28 13:16:58 -08:00
Andrew Noyes f29d6c3f67 Move implementation of ArenaBlock members to Arena.cpp 2020-02-28 12:33:57 -08:00
Evan Tschannen 6054c05963 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/fdbserver.actor.cpp
#	versions.target
2020-02-28 12:11:05 -08:00
A.J. Beamon d1e1fea42d Our binaries that act like clients (fdbcli, backup and DR binaries) were reporting an unknown client version. Clients did not react if the list of supported versions changed. 2020-02-28 09:35:21 -08:00
Xin Dong 13e72f7b3b
Merge pull request #2605 from dongxinEric/fix/1977/report-inability-to-flush-trace-log
Report inability to flush trace logs.
2020-02-27 12:36:55 -08:00
Xin Dong 16575ae94d Address review comments 2020-02-27 11:54:15 -08:00
Xin Dong 4ac7b36e44 Added back the mutex holder that was removed accidentally 2020-02-27 10:19:17 -08:00
Evan Tschannen 707fc1ddea only capture the policy to match prior code 2020-02-26 19:04:49 -08:00
Evan Tschannen c3299b8ebe if tls cannot be initialized, throw an error from createDatabase 2020-02-26 18:53:06 -08:00
Evan Tschannen bf5a95e6df Merge commit 'dc39bdfbbf94a7f470386f439df08c044d08d90c' into feature-tls-environment-vars
# Conflicts:
#	flow/Net2.actor.cpp
2020-02-26 18:02:56 -08:00
Evan Tschannen f035bed870 defer initializing TLS to avoid throwing errors from a constructor and so that errors can be logged to the trace file 2020-02-26 17:50:07 -08:00
A.J. Beamon 4bbac9d996 Change a special case return to -1. Update comments to clarify and correct some things. 2020-02-26 16:39:13 -08:00
Evan Tschannen f85af10a18 fixed a few problems with tls setup 2020-02-26 16:06:45 -08:00
Evan Tschannen d1598e7c99 set_verify_peers throws an error instead of returning a value 2020-02-26 16:06:16 -08:00
Evan Tschannen 2586bade68 re-added support for configuration TLS options with environment variables 2020-02-26 15:33:48 -08:00
A.J. Beamon 0f5c999d4b Better containment of boost errors related to TLS. 2020-02-26 12:26:43 -08:00
Steve Atherton 087c6fa33d
Merge branch 'master' into feature-redwood 2020-02-26 12:25:04 -08:00
negoyal cd949eca71 Merge branch 'master' into fdb_cache_subfeature2 2020-02-26 11:22:08 -08:00
Xin Dong 74c929d98d Fix windows build, again 2020-02-26 10:01:08 -08:00
Evan Tschannen 924d335aa7 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	flow/Knobs.cpp
#	flow/Knobs.h
2020-02-25 18:25:19 -08:00
Xin Dong 7b51ab6b63 Rebased with master 2020-02-25 15:43:33 -08:00
Xin Dong f20619c9fb Resolve review comments. Changed how issues got cleared 2020-02-25 15:39:51 -08:00
Xin Dong 3f24ae93f2 Remove the unused variable 2020-02-25 15:39:38 -08:00
Xin Dong 090c89e90a Addressed review comments. Fix the bug where issues on a worker may be wrongly cleared by subsequent GetDBinfo request. 2020-02-25 15:39:38 -08:00
Xin Dong aaa63331b6 Fix windows build 2020-02-25 15:39:09 -08:00
Xin Dong 288e95c7e1 Reallocate the issues set after each get. Changed an issues name to be accurate 2020-02-25 15:39:09 -08:00
Xin Dong 1c346fcfb0 Added the new issues into Status Schema. Remove the issue reporting in lastError since:
- If the issue string contains the error number, status schema needs to be super verbose to include all possible issue strings
- If the issue string does not contain the error number, the generic issue string can be pretty useless.

Thus now specific issues are being reported before calling lastError
2020-02-25 15:38:14 -08:00
Xin Dong 39c92c9cce Update flow/FileTraceLogWriter.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-02-25 15:38:14 -08:00
Xin Dong f4f860bfa8 Changed issue reporting to be thread safe. Also changed the liveness ping to be thread safe. 2020-02-25 15:38:14 -08:00
Xin Dong a6580dc15f Added the ability to ping a trace log writer thread and the monitoring in worker.actor.cpp. The current solution is simply a loose check. We can change this to be an accurate check by using 'pthread_kill(writer_thread, 0)' 2020-02-25 15:37:53 -08:00
Xin Dong 0b0414fb94 Addressed review comments. Change the issue reporting from 'ITraceLogWriter' to be a more generic way. 2020-02-25 15:37:53 -08:00
Xin Dong 034dfe5e42 Now the inability to flush trace logs will be reported to both 'stderr' and also the status json object.
- Since the first flush failure, if the accumulated consecutive failure count exceeds the value defined in knobs, it will trigger the current worker process to report this issue via the 'GetServerDBInfo' interface of the cluster controller
    - A successful flush will reset the accumulated counter.
    Notice that the current solution does not take the time into consideration. The assumption is that flush failures tend to only happen in a clustered manner. The intermittent, but short, periods of flush failures are not considered as a problem since the memory pressure built by them should be negligible.
2020-02-25 15:37:32 -08:00
A.J. Beamon 0f7656e52e Document roughness. Remove an unexplained factor of 2 and handle window edges better. Subtract 1 from roughness to correspond better to variance. 2020-02-25 08:45:51 -08:00
A.J. Beamon 1c6aef76b5 When one of the sqlite reader or writer thread pools fails, fail the other with the same error. 2020-02-24 12:39:04 -08:00
Alvin Moore 9585cd10f1 Removed duplicate CMake link request 2020-02-24 00:19:43 -08:00
Alvin Moore 0f64505d0b Merge branch 'release-6.2' of github.com:apple/foundationdb
Needed to pull in changes to build docker
2020-02-23 23:27:53 -08:00
Steve Atherton 712aa27896 Merge branch 'release-6.2' of github.com:apple/foundationdb into feature-redwood 2020-02-23 00:30:27 -08:00
Evan Tschannen 65fbe0d0bc revert AcceptSocket priority change because of bad performance results 2020-02-21 19:22:14 -08:00
Evan Tschannen 96258b9809 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/QuietDatabase.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/StorageMetrics.actor.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KVStoreTest.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/genericactors.actor.cpp
#	flow/serialize.h
2020-02-21 19:09:16 -08:00
Steve Atherton f1ec780b31 Merge branch 'release-6.2' of github.com:apple/foundationdb into feature-redwood 2020-02-21 17:43:11 -08:00
A.J. Beamon 4c696d5bf2 Merge branch 'release-6.2' into dd-better-rebalance-logging
# Conflicts:
#	fdbserver/DataDistributionQueue.actor.cpp
2020-02-21 17:41:00 -08:00
A.J. Beamon dfa5f76c01 Remove unused parameter. Don't put check for g_network presence in ASSERT_WE_THINK. 2020-02-21 16:28:03 -08:00
A.J. Beamon 2431d4d788 Always compute the time for a trace event when it is being logged rather than when it is being created. Usually these are the same, but if they aren't, doing the opposite can lead to out of order trace events. 2020-02-21 13:57:04 -08:00
A.J. Beamon 6810a03283 Add more logging to valley filler and mountain chopper 2020-02-21 10:55:14 -08:00
Alvin Moore 90b4050eca Added required include for stringstream 2020-02-21 09:59:11 -08:00
Alvin Moore d02d84a577 Added required include for std::set, which is for some reason only missing within the Windows build 2020-02-21 09:36:24 -08:00
Alvin Moore 9042cab7bc Changed ordering of link libraries 2020-02-21 08:56:52 -08:00
Evan Tschannen dc3826e2fd fix: tls throttling would re-insert the failure into the map 2020-02-20 18:17:39 -08:00
Evan Tschannen f04e311a1e Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
# Conflicts:
#	fdbserver/SimulatedCluster.actor.cpp
#	flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Alex Miller 927cff3317 Report errors on TLS misconfigurations ... or at least try to. 2020-02-20 16:57:29 -08:00
Evan Tschannen d7c841a28a
Merge pull request #2589 from etschannen/feature-proxy-delay
Improve version pipelining on the proxy
2020-02-20 15:23:30 -08:00
Evan Tschannen 8129f74a10
Merge pull request #2698 from etschannen/feature-recruit-delay
The CC waits until no new workers register before starting a bad recruitment
2020-02-20 14:42:37 -08:00
Evan Tschannen 7d54acf4ca removed an unnecessary yield 2020-02-20 14:41:49 -08:00
A.J. Beamon 5586e6f6d8
Merge pull request #2697 from etschannen/feature-correctness-fixes
A variety of correctness fixes
2020-02-20 13:32:18 -08:00
Evan Tschannen 08c318d28a re-added the connect lock in the fdbcli so that the timeout is not spent before a connection has been initiated (because of the handshake lock) 2020-02-20 10:43:34 -08:00
Evan Tschannen 69b5a1fbe3 more priority improvements 2020-02-20 10:11:43 -08:00
Evan Tschannen fd8a58b035 re-added support for the TLS_DISABLED flag 2020-02-19 18:37:47 -08:00
Evan Tschannen 761da5a059 code cleanup 2020-02-19 17:59:45 -08:00
Evan Tschannen fbd45963d8 The cluster controller waits until no new workers register for 1.0 before starting a bad recruitment 2020-02-19 16:48:30 -08:00
Evan Tschannen 9b3254d5f4 A corrupted processId file should be deleted in simulation, as that is the manual operation that would fix the problem in the real world 2020-02-19 15:21:42 -08:00
Alex Miller fe78524bbc
Merge pull request #2678 from sears/networktest_perf
Add some tuning knobs to networktestclient; also, measure latency directly
2020-02-19 14:38:09 -08:00
Russell Sears 956a3efa80 Pull request comments 2020-02-19 10:55:05 -08:00
Alex Miller 88d36af9c7 Fix --tls_password and add better error logging
This refactors all tls settings into a TLSParams object so that we can
set the password before loading any certificates.

It turns out that the FDBLibTLS code did really nice things with error
logging, but I just didn't understand openssl enough before to realize
what pieces I should be copying.
2020-02-19 00:57:05 -08:00
Meng Xu 31a6ec34b7 Merge branch 'master' into mengxu/fast-restore-agent-PR 2020-02-18 16:17:59 -08:00
Alex Miller 9d88356468
Merge pull request #2686 from mpilman/features/avoid-unnecessary-template-instanciations
Removed dead code
2020-02-17 14:46:39 -08:00
Alex Miller 9144c3e8ca
Merge pull request #2087 from atn34/issue-1226
Allow member actors access to private variables
2020-02-17 14:39:31 -08:00
mpilman aac94a766b Removed dead code 2020-02-15 21:56:48 -08:00
A.J. Beamon 649fc6ba94
Merge pull request #2329 from davisp/trace-clock-source-network-option
Add network option for the trace clock source
2020-02-15 10:43:00 -08:00
Paul J. Davis 32e285a761 Add network option for the trace clock source
This option allows clients to select the clock source for trace events
similar to the `--traceclock` command line parameter for `fdbserver`.
Using the `realtime` clock source makes loading event data into
OpenTracing systems like Jaeger more useful.
2020-02-15 11:30:43 -06:00
Markus Pilman ccf590e193 Merge branch 'master' of github.com:apple/foundationdb into features/boost70 2020-02-14 22:05:51 -08:00
mpilman 579444419a remove call to `get_io_service` 2020-02-14 21:22:14 -08:00
mpilman 3a1e878a9b Upgrade to boost 1.72 2020-02-14 18:10:13 -08:00
Evan Tschannen 693e469003 Changed the handshake lock to a BoundedFlowLock, which will enforce that old handshakes complete before starting to initiate new handshakes 2020-02-14 16:49:52 -08:00
Evan Tschannen 321dded7dd rely on preverified to verify the certificate 2020-02-14 16:45:04 -08:00
Alex Miller 94e7f790d8
Merge pull request #2667 from atn34/atn34/remove-flatbuffers-knob
Remove USE_OBJECT_SERIALIZER knob
2020-02-14 15:44:38 -08:00
Alex Miller 723a70b357 Call X509_verify_cert once and implement time checking by hand 2020-02-13 21:31:36 -08:00
Alex Miller d716c50000 Find OpenSSL or LibreSSL in CMake 2020-02-13 21:31:36 -08:00
Alex Miller 8298fb3cb5 Remove spammy traceevent from testing 2020-02-13 21:31:36 -08:00
Russell Sears 7724c644e5 Add some tuning knobs to networktestclient; also, measure latency directly. 2020-02-13 13:11:54 -08:00
Steve Atherton 0c7c815396 Merge branch 'release-6.2' of github.com:apple/foundationdb into feature-redwood
# Conflicts:
#	tests/CMakeLists.txt
2020-02-12 16:12:57 -08:00
Andrew Noyes 1248d2b8b4 Remove USE_OBJECT_SERIALIZER knob 2020-02-12 10:41:52 -08:00
Steve Atherton 93e3e36d52 Changed RedwoodRecordRef::compare() to include value and updated VersionedBTree to adapt to this change. This fixes an (uncommitted) bug where DeltaTree inserts of a record matching a deleted record except for value would simply unhide the deleted record. For DeltaTree delete/insert sequences to work correctly compare() must only return 0 when the records are fully equivalent. 2020-02-12 01:18:35 -08:00
Meng Xu e76b6d824a FastRestore:Assign priority to actors to prioritize vb work
When we pipeline multiple version batches, we should prevent a later
version batch from blocking the earlier version batch by consuming
CPU resources.

To achieve the above, we should assign higher priority to actors
in later phases in a version batch.

Because restore master will not invoke an actor at a later phase unless
the actors at the earlier phases have finished, this priority assignment
will not cause deadlock.
2020-02-10 20:29:23 -08:00
Evan Tschannen dcbce3593e fixed TLS in simulation 2020-02-10 14:00:21 -08:00
mpilman 5a9d420cb7 Merge remote-tracking branch 'upstream/release-6.2' into release-merges/20200210 2020-02-10 10:02:05 -08:00
A.J. Beamon ff44bd2b33
Merge pull request #2639 from atn34/atn34/include-port-in-address-default
Enable include_port_in_address by default for api version 700
2020-02-10 09:50:59 -08:00
Markus Pilman e71fe44ee3
Merge branch 'master' into features/icc 2020-02-08 21:33:02 -08:00
A.J. Beamon abb75f7eb7 Add logging to indicate the time spent at each priority that exceeds some minimum busyness threshold 2020-02-07 14:34:24 -08:00
A.J. Beamon 6010d835fb Reorganize the interaction between slow task checking and check_yield 2020-02-07 10:35:09 -08:00
Alex Miller 2a2bf945ef Also remove FDBLibTLS from CMake 2020-02-06 21:55:13 -08:00
Alex Miller e390dbd36c Add a non-FDBLibTLS verify peers framework to new TLS impl 2020-02-06 21:06:52 -08:00
Evan Tschannen 38d8d0d675 fixed simulation 2020-02-06 19:29:31 -08:00
Evan Tschannen 69de430057 separate handshaking from connection to improve pipelining 2020-02-06 16:45:54 -08:00
A.J. Beamon df2b0452b4 Step 3 of fixing storage server range reads: change return type of readRange from VectorRef<KeyValueRef> to RangeResultRef. 2020-02-06 13:19:24 -08:00
negoyal 85cc35e81e Merge branch 'master' into HEAD 2020-02-05 14:59:55 -08:00
Evan Tschannen 53d0867a17 limit the number of connections a process can attempt to establish in parallel 2020-02-04 18:15:10 -08:00