Commit Graph

1018 Commits

Author SHA1 Message Date
Bhaskar Muppana f6822a4f6b Merge pull request #186 from bmuppana/backup-joshua-fix
Backup joshua fixes
2017-10-19 08:17:38 -07:00
Bhaskar Muppana 360b777b78 Fail with correct error code in case of abort or discontinue of
non-existing backups.
2017-10-18 23:17:48 -07:00
Stephen Atherton 09e97e1e7e TaskBucket now logs a trace event for any task execution failures. Previously only external timeouts were logged, but now timeouts or any other error from inside the task is logged as well. 2017-10-18 17:26:18 -07:00
Alec Grieser dd6d8f3b0e Merge branch 'master' into add-new-atomic-ops 2017-10-18 16:36:44 -07:00
Bhaskar Muppana 314511f4d7 Fixing spaces in BackupCorrectness TraceEvents. 2017-10-18 14:27:52 -07:00
Alec Grieser c12c928141 Merge branch 'master' into bindings-versionstamps-in-tuples 2017-10-18 14:13:01 -07:00
A.J. Beamon 795f178b11 Merge pull request #178 from cie/bindings-java-repackage
<rdar://problem/33271641> "cie" should be removed from Java binding package path
2017-10-18 13:34:07 -07:00
Stephen Atherton ef84e52127 Improved error handling and memory usage in AsyncFileBlobStoreWrite. Writes will now fail if any upload has already failed, rather than buffering unboundedly until sync() is called to complete the file. There is also a configurable limit on how many uploads can be pending before writes will stall waiting for one to finish. 2017-10-18 05:51:30 -07:00
Alec Grieser 479c9fab1b added experimental warning to external callbacks docs 2017-10-17 13:18:58 -07:00
Alec Grieser 191aca0a3c removed reference to cie in vexillographer.csproj 2017-10-17 09:30:53 -07:00
Alex Miller 7b9bc1d715 Merge pull request #170 from cie/alexmiller/flowprofile
Add support for profiling a running fdb cluster to fdbcli, fix security issues, and add an improved backtrace.
2017-10-16 16:51:53 -07:00
Alex Miller cf646d4a99 Address review comments.
* Fixed fdbcli to be more idiomatic.
* Removed is_binary_serializable in favor of std::is_pod<>
* Removed custom enable_if<> in favor of std::enable_if<>
* Removed HEY REVIEWER comments
* Removed print from prof.py
* Added FLOW_PROFILER_ENABLED=yes to circus components that wished to enable the flow profiler.
2017-10-16 16:46:52 -07:00
Alex Miller 91a26a170c Add toggleable profiling support to fdbserver+fdbcli.
This adds the fdbcli commands:
* profile list -- Lists all workers in a way that doesn't fill `kill`'s list.
* profile flow run -- Allows starting flow profiling on a set of hosts for a specified interval.

And threads through all the support for enabling and disabling profiling as an RPC.
2017-10-16 16:05:02 -07:00
Balachandar Namasivayam 312f614133 Add the new ops and AND to NON_ASSOCIATIVE_MASK.
In the storage server, read the entire value if the op is ByteMin or ByteMax.
2017-10-16 11:06:31 -07:00
Alec Grieser d40eb1ef9a changed java package from com.apple.cie.foundationdb to com.apple.foundationdb 2017-10-16 08:31:44 -07:00
Stephen Atherton e934604f67 Added DNS resolution. Interface is INetworkConnections::resolveTCPEndpoint() to resolve, or for convenience INetworkConnections::connect(host, service) will resolve host and service (port number or service name like http) and connect to one of the addresses at random.
BlobStoreEndpoint now only accepts hostnames and an optional service, so this update is not compatible with the previous URL formats having many IP addresses.
2017-10-15 21:51:11 -07:00
Stephen Atherton 68eccb681e Merge pull request #173 from bmuppana/master
Backup log messages.
2017-10-13 18:31:53 -07:00
Evan Tschannen 215bcb8d3e Merge pull request #157 from cie/choose-leader-on-stateless-processes
Catch and update processClass change from DBSource
2017-10-13 14:03:29 -07:00
Yichi Chiang 12edd27281 Introduce prevChangeID to CandidacyRequest and LeaderHeartbeatRequest 2017-10-12 17:11:58 -07:00
Bhaskar Muppana d1e9d28239 Backup log messages. 2017-10-12 16:12:42 -07:00
Stephen Atherton 659e39103e Missed file from merge of master into backup-refactor 2017-10-12 11:25:29 -07:00
Stephen Atherton e71a9c5cb9 Missed file from merge of master into continuous-backup. 2017-10-12 11:04:11 -07:00
Stephen Atherton 11517f7bfc Merge branch 'master' into continuous-backup
# Conflicts:
#	fdbclient/FileBackupAgent.actor.cpp
2017-10-12 11:03:23 -07:00
Balachandar Namasivayam fd4e62d4c9 Add documentation for the new atomic ops byte_min and byte_max as well as changing the description for the min and max atomic ops. 2017-10-11 18:43:19 -07:00
Alec Grieser f95553aca2 updated javadocs 2017-10-10 16:56:32 -07:00
Balachandar Namasivayam eeebf10030 Modified existing behavior of MIN and AND atomic ops. The new behavior results in a 'SET' if the atomic op is performed on a non-existing key.
Added new atomic ops ByteMin and ByteMax that do lexicographic comparison of byte strings.
2017-10-10 13:02:22 -07:00
Evan Tschannen 15962cf079 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbrpc/Locality.cpp
#	fdbrpc/Locality.h
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/ClusterRecruitmentInterface.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/TagPartitionedLogSystem.actor.cpp
#	fdbserver/WorkerInterface.h
#	fdbserver/fdbserver.vcxproj.filters
#	fdbserver/masterserver.actor.cpp
#	fdbserver/worker.actor.cpp
#	flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Alex Miller a21c8a820b Move cpuProfilerRequest from WorkerInterface to ClientWorkerInterface.
A way to access this stream is required if we wish to be able to toggle
profiling from fdbcli.  There are two ways to do this:

1. Use `monitorLeader()` to get a `ClusterControllerFullInterface`, and use
`getWorkers` from there to get a list of `WorkerInterface`s, from which we can
access cpuProfilerRequest.
2. Move cpuProfilerRequest to ClientWorkerInterface and use the existing code
in the client that can fetch a list of all `ClientWorkerInterface`s.

The split between WorkerInterface and ClientWorkerInterface appears to be
what a client might have a need to call versus what is fdbserver-internal (and
thus no client should even want to call). Thus, it seems to make more sense to
acknowledge that profiling is useful to be able to toggle from a client, and go
with option (2).
2017-10-05 14:08:28 -07:00
Alex Miller e55cc447d2 Address code review comments.
* Fixed memory corruption with SystemData key constants
* Removed duplication in ClusterController
* Reworked fdbcli actions to better represent explicit vs default assignments
2017-10-04 13:36:18 -07:00
Alex Miller 80fa597422 Allow client profiling to be configured from fdbcli.
This adds the following commands:
* profile client status
* profile client on 0.001 100MB
* profile client off
2017-10-04 13:36:18 -07:00
A.J. Beamon d886b95628 Merge pull request #131 from cie/33300740-with-shutdown-hooks
<rdar://problem/33300740> Java: support callbacks from external multi-version client threads
2017-10-04 09:17:25 -07:00
Evan Tschannen 7818a7972b fix: read_lock_aware had the same code as used_during_commit_protection_disable 2017-10-03 09:37:45 -07:00
Evan Tschannen 6ea9903c82 Merge branch 'release-5.0'
# Conflicts:
#	fdbbackup/backup.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	versions.target
2017-10-01 18:46:44 -07:00
Stephen Atherton ad9de674ac Knob change, blob requests should be allowed more time. 2017-10-01 16:45:47 -07:00
Stephen Atherton 13a79482d8 Added comments for clarity. 2017-10-01 16:03:12 -07:00
Stephen Atherton a95107417f Improved behavior of slow writes during backup. KeyRange and Log backup tasks now use TaskBucket::saveAndExtend() to keep the task alive until flushing the file finishes or fails with an error (blob uploads fail after a limited number of retries). This prevents blob uploads from being retried too often if the destination is slow since a task abort and retry would start the backoff counters back at zero. Also removed a debugging behavior that was accidentally checked in. 2017-10-01 16:01:24 -07:00
Alex Miller 11668bb359 Fixing code review comments. 2017-09-29 15:58:36 -07:00
Alex Miller f9b7ce9a2f Add write conflict ranges to metadata modifications on backup data dumps. 2017-09-29 15:58:36 -07:00
Alex Miller 87a1581871 Ensure VersionStamps are strictly increasing with DR ACI switchovers.
This should be the final change in making sure that versionstamps are never
higher than the read version of a database that they're read from.
2017-09-29 15:58:36 -07:00
Alex Miller 8f4c45418b Make atomicSwitchover preserve an ever-increasing commit version. 2017-09-29 15:58:36 -07:00
Evan Tschannen a1f8b546e6 fix: ensure connections to blob store are evenly distributed across network addresses
added a per address limit to the number of open connections
lowered a variety of knobs to prevent us from using too much memory
2017-09-29 14:59:24 -07:00
Evan Tschannen ef41b07bb3 renamed past_version to transaction_too_old
implemented read_lock_aware option
2017-09-28 16:35:08 -07:00
Yichi Chiang d4f75630de Support log group field in status json 2017-09-28 16:31:29 -07:00
Evan Tschannen 7b60e26660 Merge pull request #160 from cie/use-error-descriptions
Add the ability to access name and description in Error. Update error…
2017-09-28 16:00:39 -07:00
Evan Tschannen 73fca75239 added the ability to disable timeKeeper; disabled timeKeeper before consistency check in simulation 2017-09-28 13:13:24 -07:00
A.J. Beamon d30c730f75 Add the ability to access name and description in Error. Update error descriptions. 2017-09-28 12:35:03 -07:00
Bhaskar Muppana 0f8ff26029 Merge pull request #158 from bmuppana/master
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:42 -07:00
Bhaskar Muppana 6a0b1d6808 Fixing PR comments
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:01 -07:00
Alec Grieser 80f559d148 changed name from thread_completion_hook to network_thread_completion_hook 2017-09-27 11:30:39 -07:00
Alec Grieser d7e1b267be changed name from shutdown hook to thread completion hook ; added hook parameter 2017-09-26 17:00:04 -07:00
Alec Grieser a5f1c3b15b Merge remote-tracking branch 'origin/master' into 33300740-with-shutdown-hooks 2017-09-26 11:28:40 -07:00
A.J. Beamon e5e7f8a081 When using setKey() on Standalone<KeySelectorRef> in RYW, make sure that the key is part of the key selector's arena. 2017-09-25 15:52:45 -07:00
Bhaskar Muppana 0bf5bdb23a <rdar://problem/34557380> Need a way to map real time to version 2017-09-25 12:51:37 -07:00
Evan Tschannen fba78ce4ef refactored monitor leader again to be even safer.
fixed a problem where we would write the header to cluster files twice
added extra logging in monitor leader
2017-09-22 15:06:11 -07:00
Stephen Atherton 248dab79b6 Created “redwood” storage engine option and many changes to support that including IKeyValueStore::init() and custom DiskQueue file extensions. 2017-09-21 23:51:55 -07:00
Stephen Atherton d880569d52 Checkpointing progress on KeyValueStoreMVBTree. All methods are implemented to a usable point, and everything compiles, but Worker does not yet try to use it. 2017-09-21 04:43:49 -07:00
Evan Tschannen a9e3ae40d6 refactored monitorLeader to avoid the risk of one generation of coordinators interfering with the next 2017-09-20 17:42:12 -07:00
Evan Tschannen 53a4a3280a fix: we cannot add to the trLog when cancelled 2017-09-20 14:47:57 -07:00
Balachandar Namasivayam 24aa616a7a Merge pull request #154 from cie/additional-client-profiling
Additional client profiling
2017-09-19 18:15:02 -07:00
Evan Tschannen d67e017bcc reduced reply_byte_limit to 80k 2017-09-15 11:01:56 -07:00
Evan Tschannen 76e7988663 Merge branch 'master' into feature-remote-logs
# Conflicts:
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.h
#	flow/Net2.actor.cpp
2017-09-11 15:15:56 -07:00
Bhaskar Muppana 10402e0c32 Removing add_task() code duplication in FileBackupAgent.actor.cpp 2017-09-11 11:14:30 -07:00
Bhaskar Muppana c36a30837d Moving keyErrors in BackupConfig. 2017-09-08 16:09:18 -07:00
Evan Tschannen ea26bc1c43 passed first tests which kill entire datacenters
added configuration options for the remote data center and satellite data centers
updated cluster controller recruitment logic
refactors how master writes core state
updated log recovery, and log system peeking
2017-09-07 15:32:08 -07:00
Bhaskar Muppana 02cc8b43c4 More backup cleanup. 2017-09-07 09:04:10 -07:00
Bhaskar Muppana c7df951f7c Using BackupConfig from backup.actor.cpp to reduce intermediate
functions.
2017-09-07 08:36:36 -07:00
Bhaskar Muppana fe208d6adf Merge branch 'master' of github.com:apple/foundationdb into backup 2017-09-06 10:01:55 -07:00
Bhaskar Muppana 9f8056754a Moving KeyBackedTag and KeyBackedConfig into BackupAgent.h to use them from backup.actor.cpp 2017-09-06 09:46:27 -07:00
Stephen Atherton b5f79d27f7 Large values are now split and stored, but not yet read correctly. 2017-09-05 16:59:31 -07:00
Bhaskar Muppana d917f9449f Fixing Steve's review comments. 2017-09-05 14:06:55 -07:00
Bhaskar Muppana 92d05f6fc3 backupContainer is a std::string not a Key. 2017-09-05 13:22:44 -07:00
Bhaskar Muppana 83810edabc Backup/Restore tag can be std::string instead of Key. 2017-09-05 11:38:40 -07:00
Bhaskar Muppana 456ced2c65 Minor backup code cleanup 2017-09-05 09:42:14 -07:00
Evan Tschannen 6e26ae2bb3 added a new multi_dc configuration 2017-09-01 15:45:27 -07:00
Bhaskar Muppana e1a7e11347 Minor backup code cleanup 2017-09-01 14:39:38 -07:00
Bhaskar Muppana d834ab9d4d Moving from task->params to Params 2017-09-01 13:50:38 -07:00
Evan Tschannen dc1f7ca6b7 testers now use client locality load balancing 2017-09-01 12:53:01 -07:00
Bhaskar Muppana c564aaae68 Moving keyConfigBackupRanges into BackupConfig::backupRanges(). 2017-09-01 11:52:08 -07:00
Bhaskar Muppana 871bac0f96 Cleanup submitCleanup() 2017-08-30 18:05:50 -07:00
Bhaskar Muppana b38f131a46 Move keyStateStop to BackupConfig::stopVersion() 2017-08-30 16:22:28 -07:00
Bhaskar Muppana e73b72cdb9 Moving keyConfigStopWhenDoneKey to BackupConfig::stopWhenDone() 2017-08-30 15:31:55 -07:00
Bhaskar Muppana 1655547048 Removing keyConfigLogUid in preference to KeyBackedConfig::getUidAsKey(). 2017-08-30 15:07:36 -07:00
Bhaskar Muppana c1b6f3fdf2 Moving keyBackupTag to BackupConfig.tag() 2017-08-30 14:34:44 -07:00
Bhaskar Muppana 439193d17b Moving keyBackupContainer to BackupConfig.backupContainer() 2017-08-30 12:48:28 -07:00
Bhaskar Muppana c766bcb797 Moving keyStateStatus to BackupConfig::stateEnum. 2017-08-30 10:38:06 -07:00
Bhaskar Muppana 819566c166 keyFolderId is not used in File Backup anymore. We are instead using tag->uid based task validation. 2017-08-29 09:26:32 -07:00
Bhaskar Muppana df15dce000 Make BackupConfig subclass of KeyBackedConfig and remove old way of Task
key validation.
2017-08-28 18:20:55 -07:00
Bhaskar Muppana 2ece658e60 Don't reuse backup logUid. 2017-08-28 16:50:39 -07:00
Bhaskar Muppana 32a690bce8 Generalize RestoreConfig class. 2017-08-28 16:48:26 -07:00
Bhaskar Muppana 8ac750672b Make RestoreTag and RestoreTags classes generic to be used with Backup. 2017-08-28 11:28:19 -07:00
A.J. Beamon f8be643662 Merge branch 'release-5.0' 2017-08-09 15:30:43 -07:00
A.J. Beamon 29a38d1a51 Add warning that the slow task profiling network option is not recommended for use in production. 2017-08-09 14:39:05 -07:00
Yichi Chiang aac82074af Avoid calling setCachedLocation twice 2017-08-08 10:03:04 -07:00
Balachandar Namasivayam e767860010 Addressed review comments.
Changed current protocol version to match master
Added operation details for operations that failed.
2017-08-07 18:45:42 -07:00
Evan Tschannen c22708b6d6 added tag localities
fix: remote logs need to stop the master when they are stopped
2017-08-03 16:16:36 -07:00
Balachandar Namasivayam 3e90fdfae7 Added extra client transactional profiling info
1) Key has been added to GET
2) KeyRange has been added to GETRANGE
3) ReadConflict, WriteConflict, Mutation info has been added to COMMIT
2017-08-01 18:33:39 -07:00
Yichi Chiang 6a8a5c41b0 Add a switch to turn off data distribution in CLI 2017-07-28 18:14:55 -07:00
Alec Grieser 59aae5e994 added catch all for client shutdown hooks 2017-07-26 15:39:03 -07:00
Yichi Chiang 53e1ae9f60 shard system keyspace 2017-07-26 13:47:31 -07:00
Alec Grieser 83bf2ee312 added add_shutdown_hook to fdb_c api and used it to detach java threads where appropriate 2017-07-25 15:57:26 -07:00
Stephen Atherton 4aaee86c2a Moved MetricLogger actor to fdbclient so applications other than fdbserver can use it. 2017-07-24 13:13:06 -07:00
A.J. Beamon b19611010a Add ability to disable options in specific bindings, use it to disable callbacks on external threads in java 2017-07-19 12:58:21 -07:00
Alec Grieser 3700624fd7 Merge branch 'release-5.0' 2017-07-17 08:54:10 -07:00
Alec Grieser eee492a05b fix build issue from Notified.h not being shuffled in vcxproj files 2017-07-14 16:46:08 -07:00
A.J. Beamon f73b0b6961 fix: Move failureMonitorClient state to a reference counted object. This avoids a race condition in the fdbcli as it's shutting down that can cause it to crash. 2017-07-14 16:28:04 -07:00
Alec Grieser c860f09d8a Merge branch 'release-5.0' 2017-07-14 16:01:15 -07:00
Alec Grieser 660729839c moved Notified.h from flow -> fdbclient ; flow bindings package does better job when excluding testers 2017-07-14 15:49:30 -07:00
Evan Tschannen 57ba9d36af fixed a large number of bugs 2017-07-13 12:29:21 -07:00
Alvin Moore 31d562ff7b Merge branch 'release-5.0'
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbserver/DatabaseConfiguration.cpp
#	versions.target
2017-06-27 11:16:08 -07:00
Evan Tschannen 9fd5955e92 Merge branch 'master' into removing-old-dc-code 2017-06-26 16:27:10 -07:00
Evan Tschannen 035efd79cf fix: if a database gets locked after an unknown result, the dummyTransaction will be stuck until the database is unlocked 2017-06-26 12:12:47 -07:00
Evan Tschannen 15cb498aa7 removed fast_recovery_double and fast_recovery_triple from the fdbcli 2017-06-23 16:18:23 -07:00
Stephen Atherton 03d2b1787a Merge branch 'release-4.6' into release-5.0
# Conflicts:
#	fdbrpc/sim2.actor.cpp
2017-06-22 16:56:25 -07:00
Alvin Moore 9553458b78 Updated simulation to support managing exclusion and inclusion address
Added method for identifying acceptable availability process classes
Extended cluster availability function to ensure coordinators can be auto configured
Fixed availability function to allow protected processes to be considered as dead if not available
Added debug trace events for providing machine state when considering availability
Added trace event for protected coordinators
2017-06-19 16:48:15 -07:00
Evan Tschannen 766dc23e26 fix: do not use TLS in protectedAddresses 2017-06-02 13:52:21 -07:00
Evan Tschannen 1626e16377 Merge branch 'release-4.6' into release-5.0 2017-05-31 16:23:37 -07:00
A.J. Beamon 93509133ad Attempt to eliminate a race in DLSingleAssignmentVar between cancelling a future and checking its ref count. Reduce the number of iterations in the test because it’s timing out on the pie workers. 2017-05-26 10:47:37 -07:00
FDB Dev Team a674cb4ef4 Initial repository commit 2017-05-25 13:48:44 -07:00