Several backup tasks have been cleaned up and simplified because they no longer need to manage the 'raw' structure of the backup. The addition of IBackupFile and its finish() method simplified the log and range writer tasks. Other changes:
* Updated BlobStoreEndpoint to support now-required bucket creation, and prefix/delimiter options on bucket listing for finding common prefixes.
* Added the KeyBackedSet<T> type.
* Moved JSONDoc to its own header.
* Added platform::findFilesRecursively().
Still to do: update command line tool to use new IBackupContainer interface, fix bugs in Restore startup.
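For context, here is a minimal sketch of the IBackupFile shape implied above. Only finish() is named in this change; append() and getFileName() are assumptions added for illustration.

```cpp
#include <string>
#include "flow/flow.h" // Future<>, Void

// Hypothetical sketch of the interface, not the actual declaration.
class IBackupFile {
public:
	virtual ~IBackupFile() {}

	// Append raw bytes to the file being written (assumed helper).
	virtual Future<Void> append(const void* data, int len) = 0;

	// Finalize the file. Writer tasks call this instead of managing the
	// 'raw' structure of the backup themselves.
	virtual Future<Void> finish() = 0;

	// Name of the file within the backup container (assumed helper).
	virtual std::string getFileName() const = 0;
};
```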
Added an optimization to use a separate set for throttled events. Since this set is expected to be small, checking every event against it first is cheaper than consulting the full sampling cache.
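A sketch of the fast path this describes (illustrative only; the names are not from the actual code):

```cpp
#include <set>
#include <string>

// Event types currently being throttled. This set is expected to stay
// small, so checking it on every event is cheap.
std::set<std::string> throttledEvents;

bool isCurrentlyThrottled(const std::string& eventType) {
	return throttledEvents.count(eventType) > 0;
}
```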
std::is_pod<> being less restrictive than is_binary_serializable<> meant that structs that were both POD and had a serialize() method defined would be binary serialized instead of using the defined serialize(). This meant any padding the struct contained would also be serialized, which caused waves of valgrind failures from uninitialized memory.
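A minimal illustration of the trap; the archive syntax in serialize() is hypothetical:

```cpp
#include <cstdint>
#include <type_traits>

struct Example {
	uint8_t a; // 1 byte, followed by 3 bytes of compiler-inserted padding
	uint32_t b;

	// The struct defines member serialization...
	template <class Ar>
	void serialize(Ar& ar) {
		ar >> a >> b; // hypothetical archive operator
	}
};

// ...but it is still a POD type, so a dispatch keyed on std::is_pod<>
// alone picks the raw memcpy-style path, writing the padding bytes and
// ignoring serialize() entirely.
static_assert(std::is_pod<Example>::value, "POD despite having serialize()");
```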
Also included in this change are additional uses of valgrind client requests, so that attempts to send uninitialized memory are reported at the sending site rather than during checksum calculation when the packet is sent.
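The general shape of such a client request, as a sketch rather than FDB's exact call site:

```cpp
#include <valgrind/memcheck.h>

void sendPacket(const void* data, int len) {
	// Ask valgrind to verify the buffer is fully initialized *here*, so
	// the error report points at the sending site instead of surfacing
	// later during checksum calculation over the outgoing packet.
	VALGRIND_CHECK_MEM_IS_DEFINED(data, len);

	// ... actual send ...
}
```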
* Fixed fdbcli to be more idiomatic.
* Removed is_binary_serializable in favor of std::is_pod<>.
* Removed custom enable_if<> in favor of std::enable_if<>.
* Removed HEY REVIEWER comments.
* Removed print from prof.py.
* Added FLOW_PROFILER_ENABLED=yes to circus components that wished to enable the flow profiler.
backtrace() gives a list of return addresses, which means that addr2line will print out the line after the call site. GetStackTrace returns the list of caller addresses, so the addr2line results should be accurate. The flow profiler was also changed to use the new backtracing code, so flow profiles will now be accurate as well. Unfortunately, the abseil code doesn't work on macOS, so we still fall back to backtrace() in that case.
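In practice the difference looks roughly like this; the platform check used for the fallback is an assumption:

```cpp
#include <execinfo.h> // backtrace()
#ifndef __APPLE__
#include "absl/debugging/stacktrace.h" // absl::GetStackTrace()
#endif

// Capture up to maxDepth frames into addrs; returns the number captured.
int captureStack(void** addrs, int maxDepth) {
#ifndef __APPLE__
	// Returns caller addresses, so addr2line maps each frame to the
	// actual call site.
	return absl::GetStackTrace(addrs, maxDepth, /*skip_count=*/1);
#else
	// Returns return addresses, so addr2line reports the line after each
	// call site; used where the abseil unwinder doesn't work.
	return backtrace(addrs, maxDepth);
#endif
}
```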
For the stack unwinder to work, we must disable -fomit-frame-pointer. This can result in a small performance penalty, as it effectively reduces the number of available general purpose registers by one. (I'm also curious whether this has anything to do with the overly frequent "<value optimized out>" messages from gdb.) If this shows up as a problem, we can keep -fomit-frame-pointer in release builds and fall back to backtrace() whenever it is enabled.
This code is all Apache 2 licensed, and all license headers were preserved when the files were concatenated, so we should be completely fine from a legal standpoint.
I've scriptified the steps that I took so that if we need to update this code
in the future, it hopefully shouldn't be too much of a hassle.
This is the combination of two small changes.
1. Add support for a string knob type.
2. Change profiles to be written to the log directory instead of the working
directory.
We have three options of where to write files: the working directory, the data
directory, and the log directory.
The working directory may be set to a non-writable location, and likely
contains the fdb binaries. Allowing these files to be overwritten would likely
not be a wise idea.
The data directory hosts our sqlite b-trees. It would also be very unfortunate if these were ever overwritten because of a poorly chosen profile name.
The log directory contains logs. Out of the three, these matter the least if
they disappear or become corrupted.
Thus, we write to the log directory.
This adds the fdbcli commands:
* profile list -- Lists all workers in a way that doesn't fill `kill`'s list.
* profile flow run -- Allows starting flow profiling on a set of hosts for a specified interval.
It also threads through all of the support for enabling and disabling profiling via an RPC.
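A rough sketch of the shape such an RPC request might take in flow; the type and field names here are hypothetical, not the actual interface:

```cpp
#include "fdbrpc/fdbrpc.h" // ReplyPromise<>

// Hypothetical profiling-control request; illustrative only.
struct ProfileControlRequest {
	bool enable; // start or stop profiling
	double duration; // seconds to profile before stopping automatically
	ReplyPromise<Void> reply; // fulfilled once the worker has acted

	template <class Ar>
	void serialize(Ar& ar) {
		ar & enable & duration & reply; // hypothetical archive syntax
	}
};
```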
BlobStoreEndpoint now accepts only a hostname and an optional service, so this change is not compatible with the previous URL format, which allowed a list of IP addresses.
This is a follow-on to c4eb73d0. Thanks to Bala for pointing out the unchanged std::move usage; there also appeared to be few existing users of addMetric anyway.
The existing code tried to work around the complexities of optionally using rvalue references' move capabilities when they exist. As seen in the previous MapPair, there's a combinatorial explosion of prototypes to declare as the number of parameters increases. Because of this, addMetric ended up with a strange API, and a wrapper existed to make a copy for insert.
Instead, we can apply the idiom of universal/forwarding references with std::forward, letting the compiler instantiate only the combinations that are needed. A TagData struct with no copy constructor validates that move constructors can still be called properly.
I measured a 12-byte difference in the compiled binary before and after this change, so no template bloat was introduced.
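The idiom in isolation, as a self-contained illustration rather than the actual addMetric signature:

```cpp
#include <iostream>
#include <string>
#include <utility>
#include <vector>

struct Metric {
	std::string name;
	double value;
};

std::vector<Metric> metrics;

// A forwarding reference (N&&) plus std::forward preserves each
// argument's value category: rvalues are moved, lvalues are copied. One
// template replaces the combinatorial set of (const T&, T&&) overloads.
template <class N, class V>
void addMetric(N&& name, V&& value) {
	metrics.push_back(Metric{ std::forward<N>(name), static_cast<double>(value) });
}

int main() {
	std::string n = "disk_reads";
	addMetric(n, 10); // n is an lvalue: copied
	addMetric(std::string("cpu_busy"), 0.75); // temporary: moved
	std::cout << metrics.size() << " metrics\n";
}
```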
TraceEvents are sampled into a cache whose entries carry an expiration. Every TraceEvent above SevDebug is checked against this cache to see whether it has exceeded a configured threshold; if so, the TraceEvent is throttled and a warning message is logged.
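A minimal sketch of that mechanism; the names, threshold, and expiration below are illustrative, not the actual implementation (the real values would be knobs):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

const int64_t THROTTLE_THRESHOLD = 1000; // illustrative events-per-window
const double EXPIRATION_SECONDS = 60.0;  // illustrative entry lifetime

struct CacheEntry {
	int64_t count = 0;    // sampled occurrences of this event type
	double expires = 0.0; // when this entry leaves the cache
};

std::unordered_map<std::string, CacheEntry> sampleCache;

// Returns true if this event type should be suppressed. 'now' is assumed
// to be a monotonic clock reading in seconds.
bool shouldThrottle(const std::string& eventType, double now) {
	CacheEntry& e = sampleCache[eventType];
	if (now > e.expires) {
		e.count = 0; // expired: start a fresh window
		e.expires = now + EXPIRATION_SECONDS;
	}
	if (++e.count == THROTTLE_THRESHOLD + 1) {
		// First event over the threshold: this is where the warning
		// message about throttling would be logged.
	}
	return e.count > THROTTLE_THRESHOLD;
}
```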