mpilman
3cb2391b58
use proper fwd declarations in ManagementAPI
...
Also ManagementAPI.h -> ManagementAPI.actor.h
2019-02-19 15:16:59 -08:00
Evan Tschannen
ed9e20ce17
forgot to fix merge conflicts
2019-02-18 17:09:55 -08:00
Evan Tschannen
065a45e05f
Merge branch 'master' into feature-fix-force-recovery
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/workloads/KillRegion.actor.cpp
2019-02-18 17:09:06 -08:00
Evan Tschannen
8f2af8bed1
fix: forced recoveries now require a target dcid which will become the new primary location. During the forced recovery, the configuration will be changed to make that location primary, and usable_regions will be set to 1. If the target dcid is already the primary location, the forced recovery will do nothing. This makes forced recoveries idempotent, so it is safe for the client to re-send forced recovery commands to the cluster controller.
...
fix: the cluster controller attempts to do a commit to determine if the cluster is alive, since its own internal recoveryState might not be up-to-date.
fix: forceMasterFailure on the cluster controller did not always cause the current master to be re-recruited
2019-02-18 14:54:28 -08:00
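The idempotence described in commit 8f2af8bed1 can be illustrated with a minimal sketch. It is a toy model, not the cluster controller code: the function `force_recovery` and the dictionary keys `primary_dcid` and `usable_regions` are hypothetical names chosen for illustration.

```python
def force_recovery(config, target_dcid):
    """Toy model of a forced recovery (hypothetical, not FoundationDB code).

    Returns (new_config, changed). If target_dcid is already the primary
    location, nothing happens, so re-sending the command is harmless.
    """
    if config["primary_dcid"] == target_dcid:
        return config, False
    # Make the target the new primary and stop using a second region.
    new_config = dict(config, primary_dcid=target_dcid, usable_regions=1)
    return new_config, True


config = {"primary_dcid": "dc1", "usable_regions": 2}
config, changed_first = force_recovery(config, "dc2")   # performs the switch
config, changed_second = force_recovery(config, "dc2")  # re-sent command: no-op
```

The second call leaves the configuration untouched, which is the property that makes retries from the client safe.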
Evan Tschannen
83060c6e56
Merge pull request #1062 from jzhou77/PR
...
Add a new DataDistributor role.
2019-02-15 13:51:27 -08:00
Jingyu Zhou
6a655143e8
A follow-on fix for config key usage
...
And some trace event cleanups.
2019-02-14 16:37:16 -08:00
Jingyu Zhou
be5c962bb7
Add a new configuration version key \xff/conf/version
...
This fixes a bug found by the upgrade test, where the configuration monitor of
the data distributor was monitoring excludedServersVersionKey, which doesn't
change in the ChangeConfig workload. As a result, the data distributor was not
aware of configuration changes.
Add this new key and make sure it is updated on every configuration change
so that the monitor can detect configuration changes.
2019-02-14 16:37:16 -08:00
Alex Miller
21ce26a681
Merge release-6.0 into master
2019-02-14 14:24:35 -08:00
A.J. Beamon
b435d51061
Merge branch 'master' into track-server-request-latencies
2019-02-14 08:07:32 -08:00
Andrew Noyes
067a445e06
Replace unused _ variables with wait(success(...))
2019-02-12 17:30:30 -08:00
Markus Pilman
c4fa69c998
Remove duplicate comment
2019-02-06 09:38:08 -08:00
mpilman
b11d2ce1ec
Fix bug in `include` command
...
This bug caused `include` to also include all
machines whose address has the given machine's
address as a prefix. It is a weird bug that has
to do with ASCII ordering. The command now
executes two clears instead: one for the IP
address and one for all IP:PORT pairs.
2019-02-06 09:38:08 -08:00
Meng Xu
550f2e2682
Merge with master to use the latest backup container
...
In fdb 6.0.15, the backup container changed how it organizes the backup data.
A backup made by fdb >6.0.15 has to be restored with fdb >6.0.15.
Merge with master so that the fast restore uses fdb >6.0.15
2019-01-30 12:05:15 -08:00
Evan Tschannen
6d7d40bcb3
Update fdbclient/ManagementAPI.actor.cpp
2019-01-29 11:37:08 -08:00
A.J. Beamon
a49aea29d1
Update fdbclient/ManagementAPI.actor.cpp
...
Co-Authored-By: etschannen <36455792+etschannen@users.noreply.github.com>
2019-01-29 11:34:50 -08:00
Trevor Clinkenbeard
da1a8a9919
Prioritize processes with class coordinator when "coordinators auto" is run
2019-01-25 11:33:10 -08:00
A.J. Beamon
2198d24ce1
Merge commit '3b2700d25334c53d13496ca16682642aac951beb' into track-server-request-latencies
...
# Conflicts:
# fdbclient/MasterProxyInterface.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/ServerDBInfo.h
# fdbserver/Status.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/storageserver.actor.cpp
2019-01-24 11:43:26 -08:00
A.J. Beamon
8e05e95045
Added the ability to configure the latency band settings by setting a special key in \xff keyspace.
2019-01-18 16:18:34 -08:00
Evan Tschannen
699f8dd617
fix: coordinators auto could put two coordinators in the same zone
...
simulation now tests two machines in the same zone
2019-01-18 15:42:48 -08:00
Meng Xu
80b2f75187
debug why restore did not restore the complete data
2018-12-03 19:29:17 -08:00
Evan Tschannen
d2d68aa171
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/ManagementAPI.actor.cpp
# versions.target
2018-12-03 18:26:52 -08:00
Meng Xu
1b085a9817
sequential restore: pass 1 test case
...
-r simulation --logsize 1024MiB -f foundationdb/tests/fast/ParallelRestoreCorrectness.txt -b off -s 95208406
2018-12-03 10:57:30 -08:00
Evan Tschannen
6e94d7d71a
changing the database configuration now checks that enough processes exist to support the new configuration
2018-11-30 18:52:24 -08:00
Evan Tschannen
4e54690005
Merge branch 'release-6.0'
...
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
# fdbserver/MoveKeys.actor.cpp
2018-11-12 20:26:58 -08:00
Evan Tschannen
ccdc83036d
simplified region configuration protections
2018-11-12 17:45:20 -08:00
Evan Tschannen
6353a6724b
strengthened the protections related to changing regions
2018-11-12 17:40:40 -08:00
Evan Tschannen
4b5d0b4e2c
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/AsyncFileBlobStore.actor.cpp
# fdbclient/AsyncFileBlobStore.actor.h
# fdbclient/BlobStore.actor.cpp
# fdbclient/BlobStore.h
# fdbclient/HTTP.actor.cpp
# fdbclient/ManagementAPI.actor.cpp
# fdbclient/NativeAPI.actor.cpp
# fdbrpc/LoadBalance.actor.h
# fdbrpc/batcher.actor.h
# fdbrpc/fdbrpc.vcxproj
# fdbrpc/sim2.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistributionTracker.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
Evan Tschannen
a654183f63
Merge pull request #791 from ajbeamon/remove-cluster-from-iclientapi
...
Remove cluster from IClientApi (phase 2 of removing DB names)
2018-11-10 10:16:18 -08:00
Evan Tschannen
c02690471d
added protection against configuration changes which cannot be immediately reverted
...
the configure database workload tests region configurations
2018-11-04 19:53:55 -08:00
Robert Escriva
268093a96d
Adjust all includes to be relative to the root.
...
Remove the use of relative paths. A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h". Adjust so that every include references such a header with the
latter form.
Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Stephen Atherton
22f8a4efa9
Normalized all unit test names to begin with "/" if they should be included in random unit testing.
2018-10-05 22:09:58 -07:00
Stephen Atherton
3ea9193fa7
Renamed redwood to redwood-experimental. UnitTest names can now be hidden using # as the first character so that random correctness tests will not run them. Excluded redwood tests from correctness testing. Reverted default storage engine to ssd.
2018-10-05 14:43:54 -07:00
Stephen Atherton
7c1dc305cb
Merge commit 'a72c8f5cb2e79a673abc0ed3d27ef1c51028fb13' into feature-redwood
2018-10-05 10:15:10 -07:00
Evan Tschannen
3922e477a5
Merge branch 'release-6.0'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbclient/ManagementAPI.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/LogSystemDiskQueueAdapter.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
2018-10-03 16:57:18 -07:00
Evan Tschannen
dd9c6856e1
fix: forced recovery is not safe to send multiple times
2018-10-02 17:29:39 -07:00
A.J. Beamon
c831051474
This removes the idea of clusters from IClientApi.
2018-09-21 15:58:14 -07:00
Stephen Atherton
2fc86c5ff3
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbrpc/AsyncFileCached.actor.h
# fdbserver/IKeyValueStore.h
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/workloads/StatusWorkload.actor.cpp
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-09-20 03:39:55 -07:00
Evan Tschannen
200e65fe61
added a workload which tests killing an entire region, and recovering from the failure with data loss.
...
fix: we cannot pop the txs tag from remote logs until they have a full copy of the txnStateStore
fix: we have to modify all of history, we cannot stop after finding a local remote
2018-09-17 18:32:39 -07:00
Alec Grieser
10a8e67266
Merge remote-tracking branch 'upstream/release-6.0' into merge-release-6.0
2018-09-11 21:49:59 -07:00
Evan Tschannen
d3c8d7ab4e
fix: status would generate invalid json
2018-09-07 18:26:05 -07:00
Evan Tschannen
90301f497f
Merge branch 'release-6.0'
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/FlowTransport.actor.cpp
# fdbrpc/TLSConnection.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/Status.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/StatusWorkload.actor.cpp
# versions.target
2018-09-05 16:06:33 -07:00
Evan Tschannen
40f5dbe423
fixed issues from review, added a safeguard to prevent configuring a cluster to an invalid configuration
2018-09-04 22:16:35 -07:00
Evan Tschannen
d8ea3dbf9a
Added the ability to configure a cluster from a JSON file
2018-08-16 17:34:59 -07:00
Alex Miller
fb31a6999f
Rewrite all files to have #include actorcompiler.h as the last include.
2018-08-14 15:50:26 -07:00
Alex Miller
535b5701e5
Rewrite all `Void _ = wait(...)` -> `wait(...)`.
...
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorcompiled source through clang.
2018-08-14 15:50:26 -07:00
Stephen Atherton
9d85a05372
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-07-05 12:52:06 -07:00
Stephen Atherton
2925b9b984
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
2018-07-03 23:03:56 -07:00
Evan Tschannen
866ccfe344
added the ability to allow the master to finish recovery before all storage servers in both regions have their mutations. This allows you to recover from scenarios where you lose all your tlogs in one dc.
2018-07-04 01:59:04 -04:00
Evan Tschannen
7a12d3e130
added the (untested) ability to force a recovery to the remote datacenter, even if that results in data loss. If the DR lag is more than 1 week there could be potential data corruption if any primary storage servers are still alive.
2018-07-01 09:39:04 -04:00
Stephen Atherton
b95a2bd6c1
Merge commit 'b17c8359ec22892ed4daeaa569f2f5e105477251' into feature-redwood
...
# Conflicts:
# flow/Trace.cpp
2018-06-30 23:18:29 -07:00
Evan Tschannen
a66eda8baa
added the three_datacenter_fallback redundancy mode, which allows you to drop a down datacenter when configured in three_datacenter mode
2018-06-27 23:24:33 -07:00
Evan Tschannen
8a8914f046
re-added the ability to configure the number of log routers. Many log routers are needed to get a sufficient number of sockets involved in copying data across the WAN
2018-06-22 00:04:00 -07:00
Stephen Atherton
e5c48d453a
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
2018-06-18 22:45:27 -07:00
Evan Tschannen
1ccfb3a0f4
fix: log_anti_quorum was always 0 in simulation
...
removed durableStorageQuorum, because it is no longer a useful configuration parameter
2018-06-18 10:24:57 -07:00
Evan Tschannen
e8c462882b
re-added remote_logs as a parameter, because it could be useful to have a different number of logs between when recruited as primary and remote
2018-06-18 10:22:34 -07:00
Evan Tschannen
0913368651
added usable_regions to specify if we will replicate into a remote region
...
remote replication defaults to the primary replication
removed remote_logs, because they should be specified as an override in the regions object
2018-06-17 19:31:15 -07:00
Stephen Atherton
2878f30f29
Merge branch 'master' of github.com:apple/foundationdb into feature-redwood
...
# Conflicts:
# fdbserver/IKeyValueStore.h
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/storageserver.actor.cpp
2018-06-13 15:56:06 -07:00
Evan Tschannen
372ed67497
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
2018-06-11 11:34:10 -07:00
Evan Tschannen
6e48d93d39
backed out the healthy team check because it was unnecessary
2018-06-10 12:43:32 -07:00
A.J. Beamon
e5488419cc
Attempt to normalize trace events:
...
* Detail names now all start with an uppercase character and contain no underscores. Ideally these should be head-first camel case, though that was harder to check.
* Type names have the same rules, except they allow one underscore (to support a usage pattern Context_Type). The first character after the underscore is also uppercase.
* Use seconds instead of milliseconds in details.
Added a check when events are logged in simulation that logs a message to stderr if the first two rules above aren't followed.
This probably doesn't address every instance of the above problems, but all of the events I was able to hit in simulation pass the check.
2018-06-08 11:11:08 -07:00
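The first two naming rules from commit e5488419cc can be written down as a small checker. This is a sketch of the rules as stated in the commit message, not the check that was added to simulation; the function names are hypothetical.

```python
def valid_detail_name(name):
    """Detail names start with an uppercase character and contain no underscores."""
    return bool(name) and name[0].isupper() and "_" not in name


def valid_type_name(name):
    """Type names follow the same rules but may contain one underscore
    (the Context_Type pattern); the character after it must be uppercase."""
    parts = name.split("_")
    if len(parts) > 2:
        return False  # more than one underscore
    return all(part and part[0].isupper() for part in parts)


checks = {
    "detail DurationSeconds": valid_detail_name("DurationSeconds"),
    "detail duration_ms": valid_detail_name("duration_ms"),
    "type Context_Type": valid_type_name("Context_Type"),
    "type Context_type": valid_type_name("Context_type"),
    "type A_B_C": valid_type_name("A_B_C"),
}
```

The third rule (seconds instead of milliseconds) concerns the values rather than the names, so it is not covered by a name check like this one.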
Evan Tschannen
be06938d9d
fix: dropping the remote replication will cause all remote storage servers to die. Make sure we are not restoring redundancy before doing this to prevent data loss in simulation.
2018-06-04 18:46:09 -07:00
Evan Tschannen
91338fc984
Merge branch 'master' into feature-remote-logs
2018-05-10 15:33:45 -07:00
Evan Tschannen
8f984cb2c9
Merge branch 'release-5.2'
...
# Conflicts:
# fdbrpc/TLSConnection.h
2018-05-10 09:13:22 -07:00
Alec Grieser
464e2cdbf0
change SetVersionstampedKey and SetVersionstampedValue behavior based on API version to make them consistent
2018-05-08 08:57:09 -07:00
Evan Tschannen
10d25927cd
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbserver/DataDistribution.actor.cpp
2018-04-30 22:15:39 -07:00
Stephen Atherton
af61d3596d
Merge branch 'public-master' into feature-redwood
...
# Conflicts:
# fdbserver/DatabaseConfiguration.cpp
# fdbserver/OldTLogServer.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/fdbserver.vcxproj.filters
2018-04-24 17:22:21 -07:00
Evan Tschannen
19762b847d
Merge branch 'release-5.2'
...
# Conflicts:
# fdbserver/DatabaseConfiguration.cpp
# fdbserver/SimulatedCluster.actor.cpp
2018-04-10 17:02:43 -07:00
Evan Tschannen
528caeb196
continue supporting multi_dc as a hidden option
2018-04-10 15:59:15 -07:00
Evan Tschannen
7af892f50b
first version of non-copying recovery working with fearless configurations
2018-04-08 21:24:05 -07:00
Stephen Atherton
2752a28611
Merge branch 'release-5.2' of github.com:apple/foundationdb into feature-redwood
2018-04-06 16:29:37 -07:00
Evan Tschannen
b36e08f08f
first version of non-copying recovery. Upgrades are broken, and it has not been tested using fearless configurations yet
2018-03-29 15:12:38 -07:00
Evan Tschannen
82ed956c65
renamed the multi_dc configuration to three_datacenter. The old three_datacenter configuration was not a useful configuration.
2018-03-26 18:31:26 -07:00
A.J. Beamon
f2c804e14f
Reverting changes from merge of master into release-5.2 (b25810711c). Note that we never intend to release master into release-5.2, but if we did we would need to revert this commit.
2018-03-06 10:15:04 -08:00
Evan Tschannen
1194e3a361
added region-based configuration to support a large variety of fearless setups. Currently only 1 primary 1 remote setups are allowed.
2018-03-05 19:27:46 -08:00
Evan Tschannen
470f5c01f3
changed remoteDcId to a vector of ids, to support future configurations where there are multiple remote databases
2018-02-26 17:09:09 -08:00
Evan Tschannen
37a6a81634
Merge commit '7f6fc3e039c911cd84b8540f7f799fc38a1c1822' into feature-remote-logs
...
# Conflicts:
# fdbserver/workloads/RestartRecovery.actor.cpp
2018-02-23 12:33:28 -08:00
Alec Grieser
0bae9880f1
remove trailing whitespace from our copyright headers; fixed formatting of python setup.py
2018-02-21 10:25:11 -08:00
Evan Tschannen
31b89a638f
added satellite_none and remote_none options to unconfigure from a fearless setup
...
fix: log_router configuration was broken
2018-02-17 13:51:17 -08:00
Evan Tschannen
cb25564d38
simulated cluster supports fearless configurations
...
removed unused simulation variables
run the simulation with only 1 coordinator most of the time, since we protect the coordinator from being killed, and protecting too many things is bad for simulation
2018-02-15 18:32:39 -08:00
Stephen Atherton
0a35f167e4
Merge branch 'master' into feature-redwood
...
# Conflicts:
# fdbserver/DiskQueue.actor.cpp
# fdbserver/IDiskQueue.h
# fdbserver/Knobs.cpp
# fdbserver/Knobs.h
# fdbserver/fdbserver.vcxproj
# fdbserver/fdbserver.vcxproj.filters
# fdbserver/worker.actor.cpp
2018-02-12 01:30:02 -08:00
Evan Tschannen
ebd94bb654
removed a separately configurable storage team size for the remote data center, because it did not make sense
...
fix: the master did not monitor for the failure of remote logs
stop merge attempts when a data center is failed
fixed a variety of other problems with data distribution when a data center is failed
2018-02-02 11:46:04 -08:00
Evan Tschannen
15962cf079
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbrpc/Locality.cpp
# fdbrpc/Locality.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/ClusterRecruitmentInterface.h
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WorkerInterface.h
# fdbserver/fdbserver.vcxproj.filters
# fdbserver/masterserver.actor.cpp
# fdbserver/worker.actor.cpp
# flow/error_definitions.h
2017-10-05 17:09:44 -07:00
Evan Tschannen
ef41b07bb3
renamed past_version to transaction_too_old
...
implemented read_lock_aware option
2017-09-28 16:35:08 -07:00
Evan Tschannen
73fca75239
added the ability to disable timeKeeper; disabled timeKeeper before consistency check in simulation
2017-09-28 13:13:24 -07:00
Stephen Atherton
248dab79b6
Created “redwood” storage engine option and many changes to support that including IKeyValueStore::init() and custom DiskQueue file extensions.
2017-09-21 23:51:55 -07:00
Stephen Atherton
d880569d52
Checkpointing progress on KeyValueStoreMVBTree. All methods are implemented to a usable point, and everything compiles, but Worker does not yet try to use it.
2017-09-21 04:43:49 -07:00
Evan Tschannen
76e7988663
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# fdbserver/ClusterController.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/OldTLogServer.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/WorkerInterface.h
# flow/Net2.actor.cpp
2017-09-11 15:15:56 -07:00
Evan Tschannen
ea26bc1c43
passed first tests which kill entire datacenters
...
added configuration options for the remote data center and satellite data centers
updated cluster controller recruitment logic
refactors how master writes core state
updated log recovery, and log system peeking
2017-09-07 15:32:08 -07:00
Evan Tschannen
6e26ae2bb3
added a new multi_dc configuration
2017-09-01 15:45:27 -07:00
Yichi Chiang
6a8a5c41b0
Add a switch to turn off data distribution in CLI
2017-07-28 18:14:55 -07:00
Evan Tschannen
57ba9d36af
fixed a large number of bugs
2017-07-13 12:29:21 -07:00
Alvin Moore
31d562ff7b
Merge branch 'release-5.0'
...
# Conflicts:
# fdbclient/ManagementAPI.actor.cpp
# fdbserver/DatabaseConfiguration.cpp
# versions.target
2017-06-27 11:16:08 -07:00
Evan Tschannen
9fd5955e92
Merge branch 'master' into removing-old-dc-code
2017-06-26 16:27:10 -07:00
Evan Tschannen
15cb498aa7
removed fast_recovery_double and fast_recovery_triple from the fdbcli
2017-06-23 16:18:23 -07:00
Alvin Moore
9553458b78
Updated simulation to support managing exclusion and inclusion address
...
Added method for identifying acceptable availability process classes
Extended cluster availability function to ensure coordinators can be auto configured
Fixed availability function to allow protected processes to be considered as dead if not available
Added debug trace events for providing machine state when considering availability
Added trace event for protected coordinators
2017-06-19 16:48:15 -07:00
Evan Tschannen
766dc23e26
fix: do not use TLS in protectedAddresses
2017-06-02 13:52:21 -07:00
FDB Dev Team
a674cb4ef4
Initial repository commit
2017-05-25 13:48:44 -07:00