Commit Graph

197 Commits

Author SHA1 Message Date
Meng Xu 7f559bc712 Cleanup code and apply clang-format
Self code review
2020-03-16 15:08:32 -07:00
Meng Xu 1513df22f3 AutoQuorumChange:Exclude unreliable node from coordinator in simulation 2020-03-16 14:39:25 -07:00
Meng Xu 15c48b9e19 Add event for getDesired coordinators 2020-03-16 09:40:35 -07:00
Meng Xu bd345f85db ConsistencyCheck:Fix failue due to address inconsistency between process and worker
With TLS, a worker (or process) can have a TLS address and non-TLS address.
When a process is created in simulation, the primary address is TLS by default.
The non-TLS one is the TLS address port plus one.

In a connection between two workers, if their primary addresses do not enable
or disable TLS together, one worker will swap its primary address and secondary address
so that the TLS config of the two endpoints can match.

The swap can make the primary address no longer the TLS one that was created
when the process is created. And the swap only happens for worker instead of
process struct in simulation.

This swap can cause worker->address != process->address.
In checkForExtraDataStores actor, we use worker->address to check if a process
is killable and use the process->address to kill the process. The inconsistency
can cause simulation to kill a protected process that is not killable and leads
to simulation failure.
2020-03-10 21:07:16 -07:00
Evan Tschannen 96258b9809 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/QuietDatabase.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/StorageMetrics.actor.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KVStoreTest.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/genericactors.actor.cpp
#	flow/serialize.h
2020-02-21 19:09:16 -08:00
Evan Tschannen f04e311a1e Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
# Conflicts:
#	fdbserver/SimulatedCluster.actor.cpp
#	flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Evan Tschannen e06c3e2eb7 fix: checkForExcludedServer needs to check both the tls and non-tls address 2020-02-19 15:10:54 -08:00
Evan Tschannen 663d176fdb fix: coordinators auto could added 0.0.0.0:0 as a coordinator 2020-02-14 16:50:55 -08:00
Evan Tschannen 96eec756b3 more simulation fixes 2020-02-12 15:12:43 -08:00
Evan Tschannen 38a5511b96 additional simulation fixes 2020-02-11 15:52:06 -08:00
Andrew Noyes 7b5de42d43 Address review comments 2020-02-11 10:40:09 -08:00
Andrew Noyes 90c1b2df88 Don't include header 2020-02-05 09:57:18 -08:00
mpilman 100402aadf Don't call operator explicitely 2020-02-04 11:03:43 -08:00
mpilman 52ca752dd3 Merge remote-tracking branch 'origin/features/icc' into features/icc 2020-02-04 10:29:49 -08:00
mpilman d09e07f1f5 Merge remote-tracking branch 'upstream/master' into features/icc 2020-02-04 10:26:18 -08:00
Andrew Noyes a043646d1d Don't read in init database transaction 2020-02-03 15:32:01 -08:00
Andrew Noyes 61a727d701 Allow new databases to be configured as locked 2020-02-03 15:32:01 -08:00
mengranwo 227edd4248 change memory storage engine name from memory-radixtree to memory-radixtree-beta 2020-01-15 13:49:45 -08:00
mengranwo f597aa7e18 WIP : deployable/stable version since Nov 3. Start rebase to master branch 2020-01-15 13:49:45 -08:00
Jon Fu 9db95bd976 initial commit to allow re-inclusion of servers marked as failed 2019-10-23 11:05:48 -07:00
Jon Fu d2b6626d5c Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-21 13:47:06 -07:00
Evan Tschannen 688940b685 merge 6.2 into master 2019-10-21 11:43:46 -07:00
Evan Tschannen 35e816e9ad added the ability to configure satellite_logs by satellite location, this will overwrite the region configure if both are present 2019-10-14 18:30:15 -07:00
Jon Fu d96a7b2c69 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-03 09:47:45 -07:00
Meng Xu d0147e5e5d Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3
Resolved Conflicts:
	documentation/sphinx/source/release-notes.rst
	fdbserver/DataDistribution.actor.cpp
	versions.target
2019-10-02 13:22:56 -07:00
A.J. Beamon 1d63ba6980 Use immediate priority for coordinator changes 2019-10-01 08:36:37 -07:00
Evan Tschannen 324d0bd3b0 Merge branch 'release-6.2' of github.com:apple/foundationdb into feature-cleanup-mutations 2019-09-27 19:15:14 -07:00
Evan Tschannen 3cc5d484a5 the include and exclude commands do not need to set the moveKeysLockOwnerKey, which will kill the data distribution algorithm 2019-09-27 18:33:56 -07:00
Jon Fu 450a09e117 Code Review Changes 2019-09-24 15:48:50 -07:00
Jon Fu 471e283128 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-09-18 11:49:07 -07:00
sramamoorthy a4d38f1158 Fix #2057 snapshot cli to print UID in failure too 2019-09-17 05:18:28 -07:00
Jingyu Zhou d9fb199486 Remove debug messages from checkDatabaseLock() 2019-09-12 22:11:25 -07:00
Evan Tschannen 8fbd90e2f6
Merge pull request #1985 from xumengpanda/mengxu/storage-engine-switch-PR-v2
Graceful storage engine migration
2019-09-09 13:51:53 -07:00
Meng Xu c2355f721e Merge branch 'master' into mengxu/performant-restore-PR 2019-09-04 17:11:42 -07:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu bd80a67d46 Merge branch 'master' into mengxu/storage-engine-switch-PR-v2 2019-09-03 14:11:33 -07:00
Evan Tschannen dc1d055b27
Merge pull request #2042 from senthil-ram/snap_cli_fix
fix fdbcli --exec 'snapshot create.sh' failure
2019-08-30 13:40:38 -07:00
sramamoorthy b3277f2982 Fix #2009 posix compliant args for snapshot binary 2019-08-30 12:54:09 -07:00
A.J. Beamon 3f9e392668
Merge pull request #2014 from etschannen/feature-fdbcli-sleep
Added a sleep command to fdbcli
2019-08-30 11:22:13 -07:00
Evan Tschannen f3bc7e0abd do not duplicate data distribution disabled fields in status
fixed a few bugs related to the existing data distribution disabled fields in status
2019-08-29 18:41:34 -07:00
Jon Fu 807b02551e updated help message and changed existing workload to use mark as failed feature 2019-08-27 14:39:43 -07:00
Jon Fu c908c6c1db added command to fdbcli and changes to SystemData and ManagementAPI 2019-08-27 14:39:43 -07:00
Evan Tschannen 0b0c9fe0ff data distribution status was combined into regular status 2019-08-21 14:44:15 -07:00
A.J. Beamon 7953545331 Fix an unknown_error when the file passed to fileconfigure doesn't contain a valid object (e.g. if you omit the enclosing {} of your object).
Fix an internal error when configuring regions with some storage servers that don't have a datacenter set.
2019-08-19 11:28:15 -07:00
Evan Tschannen 2a436d5f6f fix: do not block fdbcli from starting if DataDistributionStatus is not available 2019-08-16 18:15:02 -07:00
Meng Xu e6284684f0 StorageEngineSwitch:Always remove wrong storeType SS
In the old logic of switching storage engines, it marks a storage server
with wrong store type as undesired even though this can lead to no healthy team.

In the first version of the new storage engine switch, we mimic the same logic
of the old version.
2019-08-13 14:59:46 -07:00
Meng Xu a588710376 StorageEngineSwitch:Graceful switch
When fdbcli change storeType for storage engines,
we switch the store type of storage servers one by one gracefully.
This avoids recruiting multiple storage servers on the same process,
which can cause OOM error.
2019-08-12 17:37:52 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu 7ccaeddf05 Merge branch 'master' into mengxu/performant-restore-PR 2019-08-01 13:23:17 -07:00
Xin Dong 5d20364423 Address review comments 2019-07-30 22:24:30 -07:00