Commit Graph

1018 Commits

Author SHA1 Message Date
Stephen Atherton e28641886d TraceEvent improvements. Minor bug fix, restore log writing tasks didn't have the log file endVersion but it's only for logging purposes. 2017-12-19 15:27:04 -08:00
Stephen Atherton a276985baf Bug fix, if there are range files in a restore which begin at exactly the restore version they will be repeatedly dispatched forever. 2017-12-18 17:48:18 -08:00
Stephen Atherton 005a4a0706 Restore status bug fix, during restore the apply lag would appear as a large negative number until the first restore batch is completed. Test improvement, snapshot dispatch now chooses a random number of tasks to dispatch per commit. 2017-12-18 15:56:57 -08:00
Stephen Atherton 937fa75bec Bug fix, if target snapshot end version is at or before the begin version then no progress would be made. 2017-12-18 00:13:25 -08:00
Stephen Atherton d32a770648 Bug fix, backup never went to differential mode once it was restorable which caused waitBackup to only return once the backup was discontinued. 2017-12-17 23:22:18 -08:00
Stephen Atherton 2b92815e8c Bug fix. The snapshot dispatch add task retry loop was incorrectly deciding that the second and subsequent transactions of an execution were already committed and therefore skipping them, resulting in missing ranges in the snapshot. 2017-12-17 21:01:31 -08:00
Stephen Atherton afd2603576 Refactored backup task flow and config to support ongoing snapshots and allow stopping the backup cleanly between snapshots. The previously separate tasks for initial and differential mode log dispatching have been merged into BackupLogsDispatchTask. 2017-12-17 14:29:57 -08:00
Evan Tschannen 1dc9eceb6d optimize GetKeyLocationRequests on the proxy so they only require a single map lookup, instead of doing 3 + (3* [number of ranges]) lookups 2017-12-15 20:13:44 -08:00
Stephen Atherton 18305ab326 Bug fixes. Added snapshotBatchSize to backupConfig to enable detecting if a transaction for adding a group of tasks to a batch had already completed. Changed KeyRangeMap usage so that each range value to be dispatched has a unique integer value, enabling more efficient range coalescing and avoiding some iterator invalidation bugs. 2017-12-15 01:39:50 -08:00
Yichi Chiang 50c154fed4 Add fdbbackup interface 2017-12-14 13:54:01 -08:00
Stephen Atherton 33f9f1a95c Added SnapshotDispatch task for writing snapshots in random order over a specified period of time and adapting speed to a growing or shrinking database. TaskBucket now supports scheduling tasks. TaskFuture now correctly recognizes multiple tasks in its callback space. TaskBucket extendTimeout() now supports specifying the new timeout version. Submitting a backup now requires a snapshot duration. 2017-12-14 01:44:38 -08:00
Stephen Atherton 47a9a7ab0e Finished backup container discovery / listing via base URL. 2017-12-12 17:44:03 -08:00
Evan Tschannen 73a0a07eac clients ask for key location information directly from the proxy, instead of reading it from the database 2017-12-09 16:10:22 -08:00
Stephen Atherton 872edd7540 Merge branch 'release-5.0'
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
2017-12-06 16:27:04 -08:00
Stephen Atherton 532de63a05 Changed log and range backup task error events to SevWarn from SevError. 2017-12-06 16:21:15 -08:00
Stephen Atherton d3b4a81ed0 Blobstore connection details in unit tests now come from environment variables. 2017-12-06 14:38:45 -08:00
Stephen Atherton 4068ed3554 Merge branch 'backup-container-refactor' of github.com:apple/foundationdb into backup-container-refactor 2017-12-06 14:12:26 -08:00
Stephen Atherton ce6c49e173 Corrected a bunch of retry loops to not reset the backoff timer. 2017-12-06 14:11:40 -08:00
Balachandar Namasivayam 1f949240f5 Make fdbbackup s3 compatible.
s3 sends response in XML.  FDB backup expects json response. Added a new library xml2json to convert xml to json.
2017-12-05 17:13:15 -08:00
Evan Tschannen 44f0f943e8 fix: an abort is not successful until a second dummy transaction is committed to ensure that apply mutations has stopped 2017-12-04 17:21:43 -08:00
Stephen Atherton 86ae6c09c7 Bug fixes, take(1) is incorrect usage of FlowLock. 2017-12-04 10:20:50 -08:00
Stephen Atherton 42c6f7db34 TaskBucket bug fix, caused by accidental removal of task function lookup. Added extendMutex to Task for use around transaction loops that call extendTimeout() to reduce conflicts. 2017-12-03 20:52:09 -08:00
Stephen Atherton 3a6708707f Removed unnecessary duplicate variable. 2017-12-02 07:03:34 -08:00
Stephen Atherton 20a8aae241 Old bug fix, transaction reset() not being called in a retry loop. 2017-12-02 07:02:26 -08:00
Stephen Atherton eadf93826d Bug fixes with transaction options and exception handling that were causing internal errors. 2017-12-01 15:16:44 -08:00
Evan Tschannen 482ac38ca6 added knobs so that the client failure monitoring update rate and the server failure monitoring update rate are separate knobs 2017-12-01 13:04:32 -08:00
Evan Tschannen 0c986f25ed Merge pull request #215 from cie/alexmiller/drtimefix
Fix a race between dumpData and version upgrades.
2017-11-30 18:17:19 -08:00
Alex Miller e583beb8f6 Fix a race between dumpData and version upgrades.
This fixes the occasional VersionStampBackupToDB failures, that were caused by
the version upgrade comparison happening before dumpData invocations were
stopped.  Committing the first transaction stops dumpData, and thus we can then
do the primary vs secondary version check correctly.
2017-11-30 17:37:00 -08:00
Stephen Atherton aeebe711ce TaskBucket’s saveAndExtend() is now accomplished through extendTimeout() with an option to save parameters. SaveAndExtendIncrementally() has been removed as it is no longer needed because TaskBucket’s normal execution loop calls extendTimeout() periodically as long as the TaskFunc’s execute() actor has not finished or thrown. If a TaskFunc wants to save changes to task parameters to checkpoint progress so that task restarts can benefit from it, it can call extendTimeout() explicitly with the updateParams flag set to true. 2017-11-30 17:18:57 -08:00
Stephen Atherton 1e643239f9 Improvement in blob connection reuse, oldest connections in pool are now used first. 2017-11-30 12:57:29 -08:00
Stephen Atherton 39edda1804 Bug fix, and some code cleanup along the way. If a range backup task dies in finish() the re-run of the task will start at begin == end, which wasn’t being handled correctly. 2017-11-27 15:57:19 -08:00
Evan Tschannen 062d7ad400 fix: client might not notice a cluster controller which has changed ids because of process class or exclusion changes 2017-11-27 15:08:03 -08:00
Stephen Atherton d9c2f6d705 Bug fix. The terminator argument of readCommitted() previously did nothing, and end_of_stream() was always sent to the output stream. The parameter was fixed to enable changing this behavior but the original behavior was not being correctly preserved in at least one case. 2017-11-26 22:52:47 -08:00
Stephen Atherton 9ce9fd8692 Added comments to describe IBackupFile contract. 2017-11-26 22:02:14 -08:00
Stephen Atherton 1d3af8f4f0 Bug fix. 2017-11-25 21:13:56 -08:00
Stephen Atherton 1b1c8e985a Merge branch 'master' into backup-container-refactor
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
2017-11-25 19:54:51 -08:00
Stephen Atherton 6695c9e6a2 Bug fixes and improvements to error handling and trace events. The most serious bug was that restore would start at the wrong version, possibly skipping early log and range files. 2017-11-25 00:46:16 -08:00
Stephen Atherton 3449bc4cdc Bug fix, range end was wrong for final range file of backup range task. 2017-11-19 04:44:33 -08:00
Stephen Atherton a31216f3f7 Added toString() to Backup/Restore TaskFunc interface so tasks can provide a method to describe important task parameters for the default handleError() methods to use. 2017-11-19 04:39:18 -08:00
Stephen Atherton 32903ffa77 Trace event improvements and severity changes. 2017-11-19 04:34:28 -08:00
Stephen Atherton 9354a8cbb4 Added new backup container method to list everything in a backup. 2017-11-19 04:28:22 -08:00
Evan Tschannen f9efdf1fc1 fix: typeString was not static, so it added a lot of memory to MutationRef 2017-11-17 23:36:09 -08:00
Bhaskar Muppana 1bf84cd51a Merge pull request #210 from bmuppana/backup-logs
Adding TraceEvents for BackupRangeTask.
2017-11-16 19:12:04 -08:00
Bhaskar Muppana 5e596ea670 Adding TraceEvents for BackupRangeTask. 2017-11-16 19:11:31 -08:00
Stephen Atherton 07c19098fe Improved backup container unit test, added file reading / verification, more data, and a series of expirations and validating the expected result. Then fixed the bugs that this new testing discovered. 2017-11-16 16:19:56 -08:00
Stephen Atherton f105204aca Shifted version distribution over folders. 2017-11-15 23:13:04 -08:00
Stephen Atherton cc47d0e161 Bug fix in restore dispatch, begin file was not being incremented. Removed try/catch because the inherited handleError() is better. 2017-11-15 22:38:31 -08:00
Alex Miller e900333dbf Fix a subtle valgrind error.
If there was an error in waiting for the read version, we would attempt to
serialize and eventually commit a CommitTransactionRef that had an
uninitialized read_snapshot.
2017-11-15 19:21:20 -08:00
Evan Tschannen ad456a939a Merge pull request #206 from cie/change-excluded-cluster-controller
Change excluded cluster controller
2017-11-15 17:28:33 -08:00
Stephen Atherton ab0017f023 TaskBucket’s TaskFunc interface now has an optional handleError() which is called on any task that throws an error from execute() or finish(). Restore and Backup tasks use this to ensure that any errors that occur are placed in the backup or restore config’s lastError property. Bug fixes in log and range file encodings. 2017-11-15 13:33:09 -08:00
Stephen Atherton a77162b53d Merge branch 'master' into backup-container-refactor
# Conflicts:
#	fdbclient/BackupAgent.h
#	fdbclient/FileBackupAgent.actor.cpp
#	fdbclient/KeyBackedTypes.h
2017-11-15 08:14:47 -08:00
Stephen Atherton 3dfaf13b67 IBackupContainer has been rewritten to be a logical interface for storing, reading, deleting, expiring, and querying backup data. The details of how the data is organized or stored are now hidden from users of the interface. Both the local and blobstore containers have been rewritten, the key changes being a multi level directory structure and no more use of temporary files or pseudo-symlinks in the blob store implementation. This refactor has a large impact radius as the previous backup container was just a thin wrapper that presented a single level list of files and offered no methods for managing or interpreting the file structure so all of that logic was spread around other places in the code base. This made moving to the new blob store schema very messy, and without this refactor further changes in the future would only be worse.
Several backup tasks have been cleaned up / simplified because they no longer need to manage the ‘raw’ structure of the backup.  The addition of IBackupFile and its finish() method simplified the log and range writer tasks.  Updated BlobStoreEndpoint to support now-required bucket creation and bucket listing prefix/delimiter options for finding common prefixes.  Added KeyBackedSet<T> type.  Moved JSONDoc to its own header.  Added platform::findFilesRecursively().

Still to do:  update command line tool to use new IBackupContainer interface, fix bugs in Restore startup.
2017-11-14 23:33:17 -08:00
Yichi Chiang df922bc973 Change excluded cluster controller 2017-11-14 13:57:37 -08:00
Balachandar Namasivayam 986b73f458 Fixed an issue where an ACTOR outlives an object passed to it and then crashes while accessing it. 2017-11-14 13:51:23 -08:00
A.J. Beamon cd085764f1 Do not automatically change a cluster file that does not match what you expect. 2017-11-10 14:12:45 -08:00
A.J. Beamon d174e05bac Merge pull request #180 from cie/bindings-versionstamps-in-tuples
<rdar://problem/25560444> [Feature] Versionstamped keys and tuple/directory incompatibility
2017-11-06 16:39:17 -08:00
A.J. Beamon b1fe3d1b55 Delete spurious spaces 2017-11-06 12:59:00 -08:00
Evan Tschannen 33eef4c76b Merge branch 'master' into getrange-perf-improvements 2017-11-04 14:25:45 -07:00
Balachandar Namasivayam 33cc644ba8 Copy CommitTransactionRequest during tryCommit function call as it is anyway copied inside the function. 2017-11-02 17:00:44 -07:00
A.J. Beamon 3cad2676cc Collapse some of the getRange actors into a single actor. Avoid unnecessary comparisons. 2017-11-02 13:39:06 -07:00
A.J. Beamon 2d5a3a07e4 Avoid copies and comparisons in RYW get range 2017-11-02 10:51:30 -07:00
John Brownlee d46e240de2 Merge branch 'release-5.0'
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
#	versions.target
2017-11-02 10:42:30 -07:00
A.J. Beamon 07c80484eb Merge branch 'master' into getrange-perf-improvements 2017-11-02 08:42:26 -07:00
Balachandar Namasivayam 3efaaec479 onMasterProxiesChanged was being triggered when any member of ClientDBInfo changed. Change the behavior to be triggered only when proxies field in ClientDBInfo is changed. 2017-11-01 18:29:56 -07:00
Yichi Chiang eeaea60f94 Merge branch 'master' of github.com:apple/foundationdb 2017-11-01 16:20:38 -07:00
Yichi Chiang bb32add5f0 Fix reusing watch transaction in watchDisabled() 2017-11-01 16:19:18 -07:00
Balachandar Namasivayam 1d95e1bfd6 Log actor cancellation error to Client Transaction sampling records. 2017-11-01 11:51:31 -07:00
A.J. Beamon 7cf17df821 Merge branch 'master' into log-group-for-unsupported-clients
# Conflicts:
#	flow/Net2.actor.cpp
#	tests/fast/SidebandWithStatus.txt
#	tests/rare/LargeApiCorrectnessStatus.txt
#	tests/slow/DDBalanceAndRemoveStatus.txt
2017-11-01 11:31:02 -07:00
Stephen Atherton c2fd27f294 Previous commit was accidentally early and didn’t compile. 2017-11-01 01:22:08 -07:00
Stephen Atherton acf747c41f Bug fix in backup file termination, errors during sync() were being logged but then not thrown. This can cause a backup to appear to have a log or keyrange file in it but the temp file pointed to by the log/keyrange symbolic link was not successfully uploaded. 2017-11-01 01:16:16 -07:00
Bhaskar Muppana 92a6e97604 Fixing object copy issue with lambdas in KeyBackedTypes.h 2017-10-31 20:17:06 -07:00
Evan Tschannen 196aac4cec fix: tag needs to be assigned directly before using it 2017-10-30 16:23:26 -07:00
Evan Tschannen 98b4270703 fix: disableKey was read before options were set 2017-10-30 13:11:54 -07:00
Evan Tschannen fb89ae9f85 added the ability to enable and disable all backup and DR agents from fdbbackup and fdbdr. 2017-10-30 12:35:00 -07:00
Evan Tschannen 54d82c0d92 Merge pull request #194 from cie/alexmiller/valgrind
Fix valgrind errors
2017-10-27 17:25:12 -07:00
Alex Miller 3b61b76876 Fix a massive amount of valgrind errors and make them easier to debug in the future.
std::is_pod<> being less restrictive than is_binary_serializable<> meant that
structs that both were POD and had a serialize method defined would be binary
serialized instead of using the defined serialize().  This means that it would
also serialize any padding that the struct contained, which would cause mass
waves of valgrind failures from uninitialized memory.

Included in this change is additional uses of valgrind client requests so that
attempts to send uninitialized memory are reported at the sending site, versus
as part of checksum calculation in sending the packet.
2017-10-27 16:54:44 -07:00
Evan Tschannen 7c2185d5f7 fix: last restorable subspace was not initialized 2017-10-27 14:06:15 -07:00
A.J. Beamon e52f46064a Don't use the results arena for conflict ranges. Rearrange the code to avoid the copy unless necessary. 2017-10-26 09:35:34 -07:00
A.J. Beamon 0d68db0dac Merge branch 'master' into getrange-perf-improvements 2017-10-26 09:25:04 -07:00
Alec Grieser 5cc4328602 Merge remote-tracking branch 'origin/master' into bindings-versionstamps-in-tuples 2017-10-26 08:58:09 -07:00
Balachandar Namasivayam cfefab18fb Merge branch 'master' into add-new-atomic-ops 2017-10-25 18:03:34 -07:00
Balachandar Namasivayam 9dd588dcce Addressed review comments.
Changed naming for NewMin and NewAnd to MinV2 and AndV2
2017-10-25 14:48:05 -07:00
Balachandar Namasivayam 2f6d55a52f Add correctness tests for all atomic ops 2017-10-25 13:36:49 -07:00
Alec Grieser 1855f876db Merge remote-tracking branch 'origin/master' into bindings-versionstamps-in-tuples 2017-10-24 18:08:47 -07:00
Alec Grieser 50e41c3968 Merge branch 'master' into bindings-versionstamps-in-tuples 2017-10-24 18:05:10 -07:00
Bhaskar Muppana 3848f7808c Fixing Backup wrong key range issue. 2017-10-24 15:26:55 -07:00
Evan Tschannen df74e2a373 re-added support for non-copying tlog recovery 2017-10-24 15:09:31 -07:00
Bhaskar Muppana 7799aef549 Fixing memory corruption in KeyBackedTypes.h 2017-10-23 18:03:53 -07:00
Balachandar Namasivayam 8c3bdc5b3b Make atomic ops differentiate between unset and empty values. 2017-10-23 16:48:13 -07:00
Stephen Atherton 3afc85881e Merge branch 'master' into backup-container-refactor
# Conflicts:
#	fdbrpc/BlobStore.actor.cpp
2017-10-20 21:38:28 -07:00
Stephen Atherton 42955012e9 Merge branch 'release-5.0'
# Conflicts:
#	fdbrpc/BlobStore.actor.cpp
#	flow/error_definitions.h
2017-10-20 21:16:55 -07:00
A.J. Beamon 0167c1e7e8 Remove unintentionally committed line 2017-10-20 09:47:10 -07:00
A.J. Beamon 39a43aeb95 Eliminate another copy 2017-10-20 09:46:35 -07:00
A.J. Beamon a2e059be11 Eliminate some more copies 2017-10-20 09:27:17 -07:00
A.J. Beamon 5fcd58b637 Whitespace fix 2017-10-20 09:24:49 -07:00
A.J. Beamon 55bea14b9e Convert extraConflictRanges from KeyRange to std::pair<Key, Key> to avoid copies. 2017-10-20 09:17:47 -07:00
A.J. Beamon 7bab9a0276 Eliminate some copies 2017-10-20 09:11:26 -07:00
A.J. Beamon 986a99a39a Change the reply priority for NativeAPI requests to the cluster to TaskDefaultPromiseEndpoint. 2017-10-20 09:03:48 -07:00
Evan Tschannen e2c1e87df6 made a large number of fixes to make fearless DR correctness clean. 2017-10-19 15:36:32 -07:00
Stephen Atherton 5718d32553 Bug fix, file backup on Windows has been broken for a long time because it was constructing a path using / and then checking that it matches a normalized absolute path which would have the / converted to \. 2017-10-19 13:33:12 -07:00
Bhaskar Muppana f6822a4f6b Merge pull request #186 from bmuppana/backup-joshua-fix
Backup joshua fixes
2017-10-19 08:17:38 -07:00
Bhaskar Muppana 360b777b78 Fail with correct error code in case of abort or discontinue of
non-existing backups.
2017-10-18 23:17:48 -07:00
Stephen Atherton 09e97e1e7e TaskBucket now logs a trace event for any task execution failures. Previously only external timeouts were logged, but now timeouts or any other error from inside the task is logged as well. 2017-10-18 17:26:18 -07:00
Alec Grieser dd6d8f3b0e Merge branch 'master' into add-new-atomic-ops 2017-10-18 16:36:44 -07:00
Bhaskar Muppana 314511f4d7 Fixing spaces in BackupCorrectness TraceEvents. 2017-10-18 14:27:52 -07:00
Alec Grieser c12c928141 Merge branch 'master' into bindings-versionstamps-in-tuples 2017-10-18 14:13:01 -07:00
A.J. Beamon 795f178b11 Merge pull request #178 from cie/bindings-java-repackage
<rdar://problem/33271641> "cie" should be removed from Java binding package path
2017-10-18 13:34:07 -07:00
Stephen Atherton ef84e52127 Improved error handling and memory usage in AsyncFileBlobStoreWrite. Writes will now fail if any upload has already failed, rather than buffering unboundedly until sync() is called to complete the file. There is also a configurable limit on how many uploads can be pending before writes will stall waiting for one to finish. 2017-10-18 05:51:30 -07:00
Alec Grieser 479c9fab1b added experimental warning to external callbacks docs 2017-10-17 13:18:58 -07:00
Alec Grieser 191aca0a3c removed reference to cie in vexillographer.csproj 2017-10-17 09:30:53 -07:00
Alex Miller 7b9bc1d715 Merge pull request #170 from cie/alexmiller/flowprofile
Add support for profiling a running fdb cluster to fdbcli, fix security issues, and add an improved backtrace.
2017-10-16 16:51:53 -07:00
Alex Miller cf646d4a99 Address review comments.
* Fixed fdbcli to be more idiomatic.
* Removed is_binary_serializable in favor of std::is_pod<>
* Removed custom enable_if<> in favor of std::enable_if<>
* Removed HEY REVIEWER comments
* Removed print from prof.py
* Added FLOW_PROFILER_ENABLED=yes to circus components that wished to enable the flow profiler.
2017-10-16 16:46:52 -07:00
Alex Miller 91a26a170c Add toggleable profiling support to fdbserver+fdbcli.
This adds the fdbcli commands:
* profile list -- Lists all workers in a way that doesn't fill `kill`'s list.
* profile flow run -- Allows starting flow profiling on a set of hosts for a specified interval.

And threads through all the support for enabling and disabling profiling as an RPC.
2017-10-16 16:05:02 -07:00
Balachandar Namasivayam 312f614133 Add the new ops and AND to NON_ASSOCIATIVE_MASK.
In the storage server, read the entire value if the op is ByteMin or ByteMax.
2017-10-16 11:06:31 -07:00
Alec Grieser d40eb1ef9a changed java package from com.apple.cie.foundationdb to com.apple.foundationdb 2017-10-16 08:31:44 -07:00
Stephen Atherton e934604f67 Added DNS resolution. Interface is INetworkConnections::resolveTCPEndpoint() to resolve, or for convenience INetworkConnections::connect(host, service) will resolve host and service (port number or service name like http) and connect to one of the addresses at random.
BlobStoreEndpoint now only accepts hostnames and an optional service, so this update is not compatible with the previous URL formats having many IP addresses.
2017-10-15 21:51:11 -07:00
Stephen Atherton 68eccb681e Merge pull request #173 from bmuppana/master
Backup log messages.
2017-10-13 18:31:53 -07:00
Evan Tschannen 215bcb8d3e Merge pull request #157 from cie/choose-leader-on-stateless-processes
Catch and update processClass change from DBSource
2017-10-13 14:03:29 -07:00
Yichi Chiang 12edd27281 Introduce prevChangeID to CandidacyRequest and LeaderHeartbeatRequest 2017-10-12 17:11:58 -07:00
Bhaskar Muppana d1e9d28239 Backup log messages. 2017-10-12 16:12:42 -07:00
Stephen Atherton 659e39103e Missed file from merge of master into backup-refactor 2017-10-12 11:25:29 -07:00
Stephen Atherton e71a9c5cb9 Missed file from merge of master into continuous-backup. 2017-10-12 11:04:11 -07:00
Stephen Atherton 11517f7bfc Merge branch 'master' into continuous-backup
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
2017-10-12 11:03:23 -07:00
Balachandar Namasivayam fd4e62d4c9 Add documentation for the new atomic ops byte_min and byte_max as well as changing description for min and max atomic op. 2017-10-11 18:43:19 -07:00
Alec Grieser f95553aca2 updated javadocs 2017-10-10 16:56:32 -07:00
Balachandar Namasivayam eeebf10030 Modified existing behavior of MIN and AND atomic ops. The new behavior results in a 'SET' if the atomic op is performed on a non-existing key.
Added new atomic ops ByteMin and ByteMax that do lexicographic comparison of byte strings.
2017-10-10 13:02:22 -07:00
Evan Tschannen 15962cf079 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbrpc/Locality.cpp
#	fdbrpc/Locality.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/ClusterRecruitmentInterface.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/TagPartitionedLogSystem.actor.cpp
#	fdbserver/WorkerInterface.h
#	fdbserver/fdbserver.vcxproj.filters
#	fdbserver/masterserver.actor.cpp
#	fdbserver/worker.actor.cpp
#	flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Alex Miller a21c8a820b Move cpuProfilerRequest from WorkerInterface to ClientWorkerInterface.
A way to access this stream is required if we wish to be able to toggle
profiling from fdbcli.  There's two ways to do this:

1. Use `monitorLeader()` to get a `ClusterControllerFullInterface`, and use
`getWorkers` from there to get a list of `WorkerInterface`s, from which we can
access cpuProfilerRequest.
2. Move cpuProfilerRequest to ClientWorkerInterface and use the existing code
in the client that can fetch a list of all `ClientWorkerInterface`s.

The split between WorkerInterface and ClientWorkerInterface appears to be
what a client might have a need to call versus what is fdbserver-internal (and
thus no client should even want to call). Thus, it seems to make more sense to
acknowledge that profiling is useful to be able to toggle from a client, and go
with option (2).
2017-10-05 14:08:28 -07:00
Alex Miller e55cc447d2 Address code review comments.
* Fixed memory corruption with SystemData key constants
* Removed duplication in ClusterController
* Reworked fdbcli actions to better represent explicit vs default assignments
2017-10-04 13:36:18 -07:00
Alex Miller 80fa597422 Allow client profiling to be configured from fdbcli.
This adds the following commands:
* profile client status
* profile client on 0.001 100MB
* profile client off
2017-10-04 13:36:18 -07:00
A.J. Beamon d886b95628 Merge pull request #131 from cie/33300740-with-shutdown-hooks
<rdar://problem/33300740> Java: support callbacks from external multi-version client threads
2017-10-04 09:17:25 -07:00
Evan Tschannen 7818a7972b fix: read_lock_aware had the same code as used_during_commit_protection_disable 2017-10-03 09:37:45 -07:00
Evan Tschannen 6ea9903c82 Merge branch 'release-5.0'
# Conflicts:
#	fdbbackup/backup.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	versions.target
2017-10-01 18:46:44 -07:00
Stephen Atherton ad9de674ac Knob change, blob requests should be allowed more time. 2017-10-01 16:45:47 -07:00
Stephen Atherton 13a79482d8 Added comments for clarity. 2017-10-01 16:03:12 -07:00
Stephen Atherton a95107417f Improved behavior of slow writes during backup. KeyRange and Log backup tasks now use TaskBucket::saveAndExtend() to keep the task alive until flushing the file finishes or fails with an error (blob uploads fail after a limited number of retries). This prevents blob uploads from being retried too often if the destination is slow since a task abort and retry would start the backoff counters back at zero. Also removed a debugging behavior that was accidentally checked in. 2017-10-01 16:01:24 -07:00
Alex Miller 11668bb359 Fixing code review comments. 2017-09-29 15:58:36 -07:00
Alex Miller f9b7ce9a2f Add write conflict ranges to metadata modifications on backup data dumps. 2017-09-29 15:58:36 -07:00
Alex Miller 87a1581871 Ensure VersionStamps are strictly increasing with DR ACI switchovers.
This should be the final change in making sure that versionstamps are never
higher than the read version of a database that they're read from.
2017-09-29 15:58:36 -07:00
Alex Miller 8f4c45418b Make atomicSwitchover preserve an ever-increasing commit version. 2017-09-29 15:58:36 -07:00
Evan Tschannen a1f8b546e6 fix: ensure connections to blob store are evenly distributed across network addresses
added a per address limit to the number of open connections
lowered a variety of knobs to prevent us from using too much memory
2017-09-29 14:59:24 -07:00
Evan Tschannen ef41b07bb3 renamed past_version to transaction_too_old
implemented read_lock_aware option
2017-09-28 16:35:08 -07:00
Yichi Chiang d4f75630de Support log group field in status json 2017-09-28 16:31:29 -07:00
Evan Tschannen 7b60e26660 Merge pull request #160 from cie/use-error-descriptions
Add the ability to access name and description in Error. Update error…
2017-09-28 16:00:39 -07:00
Evan Tschannen 73fca75239 added the ability to disable timeKeeper; disabled timeKeeper before consistency check in simulation 2017-09-28 13:13:24 -07:00
A.J. Beamon d30c730f75 Add the ability to access name and description in Error. Update error descriptions. 2017-09-28 12:35:03 -07:00
Bhaskar Muppana 0f8ff26029 Merge pull request #158 from bmuppana/master
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:42 -07:00
Bhaskar Muppana 6a0b1d6808 Fixing PR comments
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:01 -07:00
Alec Grieser 80f559d148 changed name from thread_completion_hook to network_thread_completion_hook 2017-09-27 11:30:39 -07:00
Alec Grieser d7e1b267be changed name from shutdown hook to thread completion hook ; added hook parameter 2017-09-26 17:00:04 -07:00
Alec Grieser a5f1c3b15b Merge remote-tracking branch 'origin/master' into 33300740-with-shutdown-hooks 2017-09-26 11:28:40 -07:00
A.J. Beamon e5e7f8a081 When using setKey() on Standalone<KeySelectorRef> in RYW, make sure that the key is part of the key selector's arena. 2017-09-25 15:52:45 -07:00
Bhaskar Muppana 0bf5bdb23a <rdar://problem/34557380> Need a way to map real time to version 2017-09-25 12:51:37 -07:00
Evan Tschannen fba78ce4ef refactored monitor leader again to be even safer.
fixed a problem where we would write the header to cluster files twice
added extra logging in monitor leader
2017-09-22 15:06:11 -07:00
Stephen Atherton 248dab79b6 Created “redwood” storage engine option and many changes to support that including IKeyValueStore::init() and custom DiskQueue file extensions. 2017-09-21 23:51:55 -07:00
Stephen Atherton d880569d52 Checkpointing progress on KeyValueStoreMVBTree. All methods are implemented to a usable point, and everything compiles, but Worker does not yet try to use it. 2017-09-21 04:43:49 -07:00
Evan Tschannen a9e3ae40d6 refactored monitorLeader to avoid the risk of one generation of coordinators interfering with the next 2017-09-20 17:42:12 -07:00
Evan Tschannen 53a4a3280a fix: we cannot add to the trLog when cancelled 2017-09-20 14:47:57 -07:00
Balachandar Namasivayam 24aa616a7a Merge pull request #154 from cie/additional-client-profiling
Additional client profiling
2017-09-19 18:15:02 -07:00
Evan Tschannen d67e017bcc reduced reply_byte_limit to 80k 2017-09-15 11:01:56 -07:00
Evan Tschannen 76e7988663 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.h
#	flow/Net2.actor.cpp
2017-09-11 15:15:56 -07:00
Bhaskar Muppana 10402e0c32 Removing add_task() code duplication in FileBackupAgent.actor.cpp 2017-09-11 11:14:30 -07:00
Bhaskar Muppana c36a30837d Moving keyErrors in BackupConfig. 2017-09-08 16:09:18 -07:00
Evan Tschannen ea26bc1c43 passed first tests which kill entire datacenters
added configuration options for the remote data center and satellite data centers
updated cluster controller recruitment logic
refactors how master writes core state
updated log recovery, and log system peeking
2017-09-07 15:32:08 -07:00
Bhaskar Muppana 02cc8b43c4 More backup cleanup. 2017-09-07 09:04:10 -07:00
Bhaskar Muppana c7df951f7c Using BackupConfig from backup.actor.cpp to reduce intermediate
functions.
2017-09-07 08:36:36 -07:00
Bhaskar Muppana fe208d6adf Merge branch 'master' of github.com:apple/foundationdb into backup 2017-09-06 10:01:55 -07:00
Bhaskar Muppana 9f8056754a Moving KeyBackedTag and KeyBackedConfig into BackupAgent.h to use them from backup.actor.cpp 2017-09-06 09:46:27 -07:00
Stephen Atherton b5f79d27f7 Large values are now split and stored, but not yet read correctly. 2017-09-05 16:59:31 -07:00
Bhaskar Muppana d917f9449f Fixing Steve's review comments. 2017-09-05 14:06:55 -07:00
Bhaskar Muppana 92d05f6fc3 backupContainer is a std::string not a Key. 2017-09-05 13:22:44 -07:00
Bhaskar Muppana 83810edabc Backup/Restore tag can be std::string instead of Key. 2017-09-05 11:38:40 -07:00
Bhaskar Muppana 456ced2c65 Minor backup code cleanup 2017-09-05 09:42:14 -07:00
Evan Tschannen 6e26ae2bb3 added a new multi_dc configuration 2017-09-01 15:45:27 -07:00
Bhaskar Muppana e1a7e11347 Minor backup code cleanup 2017-09-01 14:39:38 -07:00
Bhaskar Muppana d834ab9d4d Moving from task->params to Params 2017-09-01 13:50:38 -07:00
Evan Tschannen dc1f7ca6b7 testers now use client locality load balancing 2017-09-01 12:53:01 -07:00
Bhaskar Muppana c564aaae68 Moving keyConfigBackupRanges into BackupConfig::backupRanges(). 2017-09-01 11:52:08 -07:00
Bhaskar Muppana 871bac0f96 Cleanup submitCleanup() 2017-08-30 18:05:50 -07:00
Bhaskar Muppana b38f131a46 Move keyStateStop to BackupConfig::stopVersion() 2017-08-30 16:22:28 -07:00
Bhaskar Muppana e73b72cdb9 Moving keyConfigStopWhenDoneKey to BackupConfig::stopWhenDone() 2017-08-30 15:31:55 -07:00
Bhaskar Muppana 1655547048 Removing keyConfigLogUid in preference to KeyBackedConfig::getUidAsKey(). 2017-08-30 15:07:36 -07:00
Bhaskar Muppana c1b6f3fdf2 Moving keyBackupTag to BackupConfig.tag() 2017-08-30 14:34:44 -07:00
Bhaskar Muppana 439193d17b Moving keyBackupContainer to BackupConfig.backupContainer() 2017-08-30 12:48:28 -07:00
Bhaskar Muppana c766bcb797 Moving keyStateStatus to BackupConfig::stateEnum. 2017-08-30 10:38:06 -07:00
Bhaskar Muppana 819566c166 keyFolderId is not used in File Backup anymore. We are instead using tag->uid based task validation. 2017-08-29 09:26:32 -07:00
Bhaskar Muppana df15dce000 Make BackupConfig subclass of KeyBackedConfig and remove old way of Task
key validation.
2017-08-28 18:20:55 -07:00
Bhaskar Muppana 2ece658e60 Don't reuse backup logUid. 2017-08-28 16:50:39 -07:00
Bhaskar Muppana 32a690bce8 Generalize RestoreConfig class. 2017-08-28 16:48:26 -07:00
Bhaskar Muppana 8ac750672b Make RestoreTag and RestoreTags classes generic to be used with Backup. 2017-08-28 11:28:19 -07:00
A.J. Beamon f8be643662 Merge branch 'release-5.0' 2017-08-09 15:30:43 -07:00
A.J. Beamon 29a38d1a51 Add warning that the slow task profiling network option is not recommended for use in production. 2017-08-09 14:39:05 -07:00
Yichi Chiang aac82074af Avoid calling setCachedLocation twice 2017-08-08 10:03:04 -07:00
Balachandar Namasivayam e767860010 Addressed review comments.
Changed current protocol version to match master
Added operation details for operations that failed.
2017-08-07 18:45:42 -07:00
Evan Tschannen c22708b6d6 added tag localities
fix: remote logs need to stop the master when they are stopped
2017-08-03 16:16:36 -07:00
Balachandar Namasivayam 3e90fdfae7 Added extra client transactional profiling info
1) Key has been added to GET
2) KeyRange has been added to GETRANGE
3) ReadConflict, WriteConflict, Mutation info has been added to COMMIT
2017-08-01 18:33:39 -07:00
Yichi Chiang 6a8a5c41b0 Add a switch to turn off data distribution in CLI 2017-07-28 18:14:55 -07:00
Alec Grieser 59aae5e994 added catch all for client shutdown hooks 2017-07-26 15:39:03 -07:00
Yichi Chiang 53e1ae9f60 shard system keyspace 2017-07-26 13:47:31 -07:00
Alec Grieser 83bf2ee312 added add_shutdown_hook to fdb_c api and used it to detach java threads where appropriate 2017-07-25 15:57:26 -07:00
Stephen Atherton 4aaee86c2a Moved MetricLogger actor to fdbclient so applications other than fdbserver can use it. 2017-07-24 13:13:06 -07:00
A.J. Beamon b19611010a Add ability to disable options in specific bindings, use it to disable callbacks on external threads in java 2017-07-19 12:58:21 -07:00
Alec Grieser 3700624fd7 Merge branch 'release-5.0' 2017-07-17 08:54:10 -07:00
Alec Grieser eee492a05b fix build issue from Notified.h not being shuffled in vcxproj files 2017-07-14 16:46:08 -07:00
A.J. Beamon f73b0b6961 fix: Move failureMonitorClient state to a reference counted object. This avoids a race condition in the fdbcli as it's shutting down that can cause it to crash. 2017-07-14 16:28:04 -07:00
Alec Grieser c860f09d8a Merge branch 'release-5.0' 2017-07-14 16:01:15 -07:00
Alec Grieser 660729839c moved Notified.h from flow -> fdbclient ; flow bindings package does better job when excluding testers 2017-07-14 15:49:30 -07:00
Evan Tschannen 57ba9d36af fixed a large number of bugs 2017-07-13 12:29:21 -07:00
Alvin Moore 31d562ff7b Merge branch 'release-5.0'
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbserver/DatabaseConfiguration.cpp
#	versions.target
2017-06-27 11:16:08 -07:00
Evan Tschannen 9fd5955e92 Merge branch 'master' into removing-old-dc-code 2017-06-26 16:27:10 -07:00
Evan Tschannen 035efd79cf fix: if a database gets locked after an unknown result, the dummyTransaction will be stuck until the database is unlocked 2017-06-26 12:12:47 -07:00
Evan Tschannen 15cb498aa7 removed fast_recovery_double and fast_recovery_triple from the fdbcli 2017-06-23 16:18:23 -07:00
Stephen Atherton 03d2b1787a Merge branch 'release-4.6' into release-5.0
# Conflicts:
#	fdbrpc/sim2.actor.cpp
2017-06-22 16:56:25 -07:00
Alvin Moore 9553458b78 Updated simulation to support managing exclusion and inclusion address
Added method for identifying acceptable availability process classes
Extended cluster availability function to ensure coordinators can be auto configured
Fixed availability function to allow protected processes to be considered as dead if not available
Added debug trace events for providing machine state when considering availability
Added trace event for protected coordinators
2017-06-19 16:48:15 -07:00
Evan Tschannen 766dc23e26 fix: do not use TLS in protectedAddresses 2017-06-02 13:52:21 -07:00
Evan Tschannen 1626e16377 Merge branch 'release-4.6' into release-5.0 2017-05-31 16:23:37 -07:00
A.J. Beamon 93509133ad Attempt to eliminate a race in DLSingleAssignmentVar between cancelling a future and checking its ref count. Reduce the amount of iterations in the test because it’s timing out on the pie workers. 2017-05-26 10:47:37 -07:00
FDB Dev Team a674cb4ef4 Initial repository commit 2017-05-25 13:48:44 -07:00