Commit Graph

2226 Commits

Author SHA1 Message Date
Balachandar Namasivayam cfefab18fb Merge branch 'master' into add-new-atomic-ops 2017-10-25 18:03:34 -07:00
Balachandar Namasivayam 9dd588dcce Addressed review comments.
Changed naming for NewMin and NewAnd to MinV2 and AndV2
2017-10-25 14:48:05 -07:00
Evan Tschannen d852a53ae4 Merge pull request #181 from cie/throttle-spammy-logs
Throttle spammy logs
2017-10-25 13:45:55 -07:00
Stephen Atherton 3afc85881e Merge branch 'master' into backup-container-refactor
# Conflicts:
#	fdbrpc/BlobStore.actor.cpp
2017-10-20 21:38:28 -07:00
Stephen Atherton 42955012e9 Merge branch 'release-5.0'
# Conflicts:
#	fdbrpc/BlobStore.actor.cpp
#	flow/error_definitions.h
2017-10-20 21:16:55 -07:00
Stephen Atherton efe857fef6 Fixed inconsistent styles of recently changed error messages. 2017-10-20 12:56:00 -07:00
Evan Tschannen e2c1e87df6 made a large number of fixes to make fearless DR correctness clean. 2017-10-19 15:36:32 -07:00
Alec Grieser dd6d8f3b0e Merge branch 'master' into add-new-atomic-ops 2017-10-18 16:36:44 -07:00
Alex Miller d3df8469dd Upgrade protocol version as ClientWorkerInterface was changed. 2017-10-18 14:56:31 -07:00
A.J. Beamon a5c2373dbb Spaces->tabs 2017-10-18 09:04:35 -07:00
A.J. Beamon 050c1bcba6 Update transaction_too_old error text. 2017-10-18 08:49:17 -07:00
Stephen Atherton ebd0234514 Rewrote most error handling in BlobStoreEndpoint to fix several shortcomings in error handling and logging. The request loop now logs but rate limits all errors, and the exceptions thrown are more appropriate. HTTP 503 is now treated as retryable. Callers of BlobStoreEndpoint::doRequest() now specify which codes they consider to be successful so that more error handling can take place in the main request loop. 2017-10-18 02:52:09 -07:00
Alec Grieser fc2ff719d0 fixed some capitalization issues that slid in through the case-insenstive filesystems 2017-10-17 09:07:05 -07:00
Alex Miller 7b9bc1d715 Merge pull request #170 from cie/alexmiller/flowprofile
Add support for profiling a running fdb cluster to fdbcli, fix security issues, and add an improved backtrace.
2017-10-16 16:51:53 -07:00
Alex Miller cf646d4a99 Address review comments.
* Fixed fdbcli to be more idiomatic.
* Removed is_binary_serializable in favor of std::is_pod<>
* Removed custom enable_if<> in favor of std::enable_if<>
* Removed HEY REVIEWER comments
* Removed print from prof.py
* Added FLOW_PROFILER_ENABLED=yes to circus components that wished to enable the flow profiler.
2017-10-16 16:46:52 -07:00
Alex Miller 16e5b50685 Replace backtrace with absl::GetStackTrace on non-MacOS platforms.
backtrace() gives a list of return addresses, which means that addr2line will
print out the line after the caller. GetStackTrace returns the list of caller
addresses, so the addr2line results should be accurate.  The flow profiler was
also changed to use the new backtracing code, so flow profiles will now be
accurate as well.  Unfortunately, the abseil code doesn't work on MacOS, so we
still fall back to backtrace() in this case.

For the stack unwinder to work, we must disable -fomit-frame-pointer.  This can
result in a small performance penalty, as it effectively reduces the number of
general purpose registers available by one.  (I'm also curious if this has
anything to do with the overly frequent "<value optimized out>" messages from
gdb.)  If this shows up as a problem, we can make release builds still have
-fomit-frame-pointer, and fall back to backtrace when it's enabled then as
well.
2017-10-16 16:05:02 -07:00
Alex Miller 1a91aab1d7 Import //base/debugging:stacktrace from abseil.
This code is all Apache 2 licensed, and all headers were maintained when
concatinated, so we should be completely fine from a legal standpoint.

I've scriptified the steps that I took so that if we need to update this code
in the future, it hopefully shouldn't be too much of a hassle.
2017-10-16 16:05:02 -07:00
Alex Miller f997cb9038 Add a string knob to hold the Log directory, and write profiles to it.
This is the combination of two small changes.

1. Add support for a string knob type.
2. Change profiles to be written to the log directory instead of the working
   directory.

We have three options of where to write files: the working directory, the data
directory, and the log directory.

The working directory may be set to a non-writable location, and likely
contains the fdb binaries.  Allowing these files to be overwritten would likely
not be a wise idea.

The data directory hosts our sqlite b-trees.  It would also be very unfortunate
if these were ever overwritten by an unfortunate profile name.

The log directory contains logs.  Out of the three, these matter the least if
they disappear or become corrupted.

Thus, we write to the log directory.
2017-10-16 16:05:02 -07:00
Alex Miller 91a26a170c Add toggleable profiling support to fdbserver+fdbcli.
This adds the fdbcli commands:
* profile list -- Lists all workers in a way that doesn't fill `kill`'s list.
* profile flow run -- Allows starting flow profiling on a set of hosts for a specified interval.

And threads through all the support for enabling and disabling profiling as an RPC.
2017-10-16 16:05:02 -07:00
Stephen Atherton e934604f67 Added DNS resolution. Interface is INetworkConnections::resolveTCPEndpoint() to resolve, or for convenience INetworkConnections::connect(host, service) will resolve host and service (port number or service name like http) and connect to one of the addresses at random.
BlobStoreEndpoint now only accepts hostnames and an optional service, so this update is not compatible with the previous URL formats having many IP addresses.
2017-10-15 21:51:11 -07:00
Balachandar Namasivayam 3aaa11977e Addressed Review Comments 2017-10-12 14:56:00 -07:00
Stephen Atherton ad0ed79d36 Merge pull request #172 from bmuppana/backup-refactor
Backup refactoring
2017-10-12 11:38:49 -07:00
Stephen Atherton 11517f7bfc Merge branch 'master' into continuous-backup
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
2017-10-12 11:03:23 -07:00
A.J. Beamon b20ae356b1 Alloc instrumentation backtraces use format_backtrace; Magnesium detects backtraces from binaries besides fdbserver. 2017-10-12 08:39:13 -07:00
Alex Miller 9648f96200 Also fix unforwarded Metric in IndexedSet.
This is simply an exceedingly minor performance fix rather than a correctness issue.
2017-10-11 17:40:48 -07:00
Alex Miller c24b941485 Fix erroneous std::move in indexed set, and clean up addMetric users.
This is a follow-on to c4eb73d0.  Thanks to Bala for pointing out the unchanged
std::move usage, and there appeared to not be many existing users of addMetric
anyway.
2017-10-11 17:36:51 -07:00
Balachandar Namasivayam eeebf10030 Modified existing behavior of MIN and AND atomic ops. The new behavior results in a 'SET' if the atomic op is performed on a non -existing key.
Added new atomic ops ByteMin and ByteMax that does lexicographic comparison of byte strings.
2017-10-10 13:02:22 -07:00
Evan Tschannen 15962cf079 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbrpc/Locality.cpp
#	fdbrpc/Locality.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/ClusterRecruitmentInterface.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/TagPartitionedLogSystem.actor.cpp
#	fdbserver/WorkerInterface.h
#	fdbserver/fdbserver.vcxproj.filters
#	fdbserver/masterserver.actor.cpp
#	fdbserver/worker.actor.cpp
#	flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Alex Miller 0ac868ad5d "Simplify" IndexedSet's insert and addMetric API.
The existing code tried to work around the complexities of optionally using
rvalue references' move capabilities if they exist.  As seen in the previous
MapPair, there's a combinatorial explosion of prototypes to declare as the
parameter length increases.  Because of this, addMetric ended up with a strange
API, and there was a wrapper to make a copy for insert.

Instead, we can apply the idiom of using universal/forwarding references and
std::forward to allow the compiler to instantiate the combinations that are
needed.  There's a TagData struct with no copy constructor that validates that
move constructors can be properly called still.

I measured a 12-byte difference between before and after this change, so no
template bloat was introduced.
2017-10-03 20:15:12 -07:00
Balachandar Namasivayam 0e153cdd35 Throttle Spammy logs. Three knobs are added.
Trace Events are sampled and cached with an expiration set. Every TraceEvent above SevDebug is checked against this cache to see if it exceeded a set threshold. If yes, then throttle the TraceEvent.
If a TraceEvent is throttled, a warning msg is logged.
2017-10-02 18:43:11 -07:00
Evan Tschannen 6ea9903c82 Merge branch 'release-5.0'
# Conflicts:
#	fdbbackup/backup.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	versions.target
2017-10-01 18:46:44 -07:00
Evan Tschannen e2b65e86ed added configurable memory limits for backup and dr executables
added a default memory limit of 8GB for fdbcli
2017-09-29 10:35:40 -07:00
Evan Tschannen ef41b07bb3 renamed past_version to transaction_too_old
implemented read_lock_aware option
2017-09-28 16:35:08 -07:00
Yichi Chiang d4f75630de Support log group field in status json 2017-09-28 16:31:29 -07:00
Evan Tschannen 7b60e26660 Merge pull request #160 from cie/use-error-descriptions
Add the ability to access name and description in Error. Update error…
2017-09-28 16:00:39 -07:00
A.J. Beamon 4f97bd44a5 If we fail to get the interface name due to a platform error, don't kill the process. Instead, just leave the network counters alone. Change the GetInterfaceAddrs trace event to SevWarnAlways. 2017-09-28 13:32:39 -07:00
A.J. Beamon 67d0eb5d66 Change a few more error descriptions; update sphinx error code documentation 2017-09-28 13:03:17 -07:00
A.J. Beamon d30c730f75 Add the ability to access name and description in Error. Update error descriptions. 2017-09-28 12:35:03 -07:00
Evan Tschannen 7081136f74 added a fix 2017-09-22 15:08:14 -07:00
Evan Tschannen 4809bd8f62 fix: We cannot inject faults after renaming the file, because we could end up with two asyncFileNonDurable open for the same file 2017-09-21 18:11:18 -07:00
Evan Tschannen 489332533c all timeouts longer than two minutes have been can be lowered to 60.0 with buggification
added a workload that tries for a 50 second maximum latency in the presence of one failure with both buggification and connection failures
2017-09-18 11:04:51 -07:00
Evan Tschannen 76e7988663 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.h
#	flow/Net2.actor.cpp
2017-09-11 15:15:56 -07:00
Bhaskar Muppana fe208d6adf Merge branch 'master' of github.com:apple/foundationdb into backup 2017-09-06 10:01:55 -07:00
Alec Grieser fe9abbfac9 revert 'Remove unused code' for function referenced in fdbrpc 2017-09-01 09:54:03 -07:00
Ben Collins 3aaa131b4f Merge branch 'master' of github.com:apple/foundationdb 2017-09-01 09:03:45 -07:00
Bhaskar Muppana 439193d17b Moving keyBackupContainer to BackupConfig.backupContainer() 2017-08-30 12:48:28 -07:00
Stephen Atherton 5d49f2c710 Merge branch 'master' into feature-redwood
# Conflicts:
#	fdbserver/fdbserver.vcxproj
2017-08-28 17:45:50 -07:00
A.J. Beamon 9a0a3b6329 Merge commit '66528becb82d826e81fa644bb378212584ab580e' 2017-08-28 16:47:59 -07:00
Stephen Atherton 86d025f943 Bug fix: Metric base enabled state was not being initialized. Metrics are configured to be disabled upon construction, however if during construction it appears that a metric was initially enabled then a crash would result if the MetricsCollection global was not created. 2017-08-27 22:22:32 -07:00
A.J. Beamon 9ce8d3ae4f Merge branch 'release-5.0' 2017-08-09 10:37:43 -07:00
A.J. Beamon 5a8cd34224 Disable profiling if we aren't using the cached dl_iterate_phdr values. 2017-08-08 09:03:04 -07:00
Evan Tschannen c22708b6d6 added tag localities
fix: remote logs need to stop the master when they are stopped
2017-08-03 16:16:36 -07:00
A.J. Beamon 03915ce4ea Merge branch 'release-5.0' 2017-08-03 15:49:54 -07:00
A.J. Beamon f66e22c89d fix: machine metrics could sometimes default to 0, which cause underflows when compared with prior results. 2017-08-03 15:49:30 -07:00
Evan Tschannen 9fb709cf73 Merge branch 'release-5.0'
# Conflicts:
#	versions.target
2017-07-27 16:51:57 -07:00
Evan Tschannen 19fa31ffff fix: uncancellable was not forwarding errors to the result promise 2017-07-27 13:29:08 -07:00
Ben Collins 6f0062330b Merge branch 'master' of github.com:apple/foundationdb 2017-07-26 11:02:13 -07:00
Evan Tschannen 64e9560599 Merge pull request #128 from cie/maintain-incompatible-connections
Maintain incompatible connections
2017-07-17 16:28:22 -07:00
A.J. Beamon 2113d47db6 Update protocol version for incompatible connection change 2017-07-17 16:16:05 -07:00
Alec Grieser 3700624fd7 Merge branch 'release-5.0' 2017-07-17 08:54:10 -07:00
Alec Grieser eee492a05b fix build issue from Notified.h not being shuffled in vcxproj files 2017-07-14 16:46:08 -07:00
Alec Grieser c860f09d8a Merge branch 'release-5.0' 2017-07-14 16:01:15 -07:00
Alec Grieser 660729839c moved Notified.h from flow -> fdbclient ; flow bindings package does better job when excluding testers 2017-07-14 15:49:30 -07:00
Alec Grieser b133862db6 added FLOW and FDB_FLOW targets to make packages of flow headers and libs 2017-07-13 10:21:36 -07:00
Ben Collins 72de765083 Remove unused code 2017-07-13 08:18:00 -07:00
Evan Tschannen 979ebcef6c changed to using a vector of logSets instead of a duplicate set of logs for remote servers
finished porting changes to the tlog
everything but peeking is finished in the TagPartitionedLogSystem
2017-07-09 14:46:16 -07:00
Evan Tschannen 0906250e78 merged everything from feature-remote-logs besides the tlog and tagpartitionedlogsystem
re-included tags in messages to the tlog
previously never committed the LogRouter
2017-06-29 15:50:19 -07:00
Alec Grieser 343d115b37 update Platform to handle newer version of clang that does have __rdtsc 2017-06-28 14:32:01 -07:00
Alvin Moore 8f7c76ddd3 Merge branch 'release-4.6' into release-5.0
# Conflicts:
#	fdbserver/Knobs.h
Updated Windows GUID
Updated and corrected format of Protocol Version
2017-06-26 15:54:57 -07:00
Stephen Atherton 430bb6224e Merge branch 'release-4.6' into release-5.0
# Conflicts:
#	fdbrpc/AsyncFileKAIO.actor.h
#	fdbrpc/Net2FileSystem.cpp
#	fdbrpc/sim2.actor.cpp
2017-06-16 02:14:19 -07:00
Evan Tschannen 4bdcd8fc12 Merge branch 'release-4.6' into release-5.0
# Conflicts:
#	bindings/bindingtester/run_binding_tester.sh
#	fdbrpc/AsyncFileKAIO.actor.h
2017-06-14 16:43:53 -07:00
Stephen Atherton b65ad3563c Merge branch 'master' into feature-redwood
# Conflicts:
#	fdbserver/fdbserver.vcxproj
#	fdbserver/fdbserver.vcxproj.filters
2017-06-09 14:56:41 -07:00
Stephen Atherton fa4fdb1f1d Merge branch 'fix-io-timeout-handling' into release-5.0
# Conflicts:
#	fdbserver/optimisttest.actor.cpp
2017-05-31 17:03:15 -07:00
Stephen Atherton 98604d33a0 Merge branch 'fix-io-timeout-handling'
# Conflicts:
#	fdbrpc/AsyncFileKAIO.actor.h
#	fdbrpc/sim2.actor.cpp
#	fdbserver/KeyValueStoreSQLite.actor.cpp
#	fdbserver/optimisttest.actor.cpp
#	fdbserver/worker.actor.cpp
#	fdbserver/workloads/MachineAttrition.actor.cpp
#	tests/fast/SidebandWithStatus.txt
#	tests/rare/LargeApiCorrectnessStatus.txt
#	tests/slow/DDBalanceAndRemoveStatus.txt
2017-05-26 18:43:08 -07:00
A.J. Beamon fc468f682b Merge branch 'release-5.0' into bindings-tuple-improvements
# Conflicts:
#	bindings/java/src-completable/main/com/apple/apple/foundationdbdb/tuple/Tuple.java
2017-05-26 12:33:33 -07:00
FDB Dev Team a674cb4ef4 Initial repository commit 2017-05-25 13:48:44 -07:00