Commit Graph

176 Commits

Author SHA1 Message Date
Jon Fu 9db95bd976 initial commit to allow re-inclusion of servers marked as failed 2019-10-23 11:05:48 -07:00
Jon Fu d2b6626d5c Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-21 13:47:06 -07:00
Evan Tschannen 688940b685 merge 6.2 into master 2019-10-21 11:43:46 -07:00
Evan Tschannen 35e816e9ad added the ability to configure satellite_logs by satellite location, this will overwrite the region configure if both are present 2019-10-14 18:30:15 -07:00
Jon Fu d96a7b2c69 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-03 09:47:45 -07:00
Meng Xu d0147e5e5d Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3
Resolved Conflicts:
	documentation/sphinx/source/release-notes.rst
	fdbserver/DataDistribution.actor.cpp
	versions.target
2019-10-02 13:22:56 -07:00
A.J. Beamon 1d63ba6980 Use immediate priority for coordinator changes 2019-10-01 08:36:37 -07:00
Evan Tschannen 324d0bd3b0 Merge branch 'release-6.2' of github.com:apple/foundationdb into feature-cleanup-mutations 2019-09-27 19:15:14 -07:00
Evan Tschannen 3cc5d484a5 the include and exclude commands do not need to set the moveKeysLockOwnerKey, which will kill the data distribution algorithm 2019-09-27 18:33:56 -07:00
Jon Fu 450a09e117 Code Review Changes 2019-09-24 15:48:50 -07:00
Jon Fu 471e283128 Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-09-18 11:49:07 -07:00
sramamoorthy a4d38f1158 Fix #2057 snapshot cli to print UID in failure too 2019-09-17 05:18:28 -07:00
Jingyu Zhou d9fb199486 Remove debug messages from checkDatabaseLock() 2019-09-12 22:11:25 -07:00
Evan Tschannen 8fbd90e2f6
Merge pull request #1985 from xumengpanda/mengxu/storage-engine-switch-PR-v2
Graceful storage engine migration
2019-09-09 13:51:53 -07:00
Meng Xu c2355f721e Merge branch 'master' into mengxu/performant-restore-PR 2019-09-04 17:11:42 -07:00
Meng Xu d160810662 FastRestore:Resolve review comments 2019-09-04 16:48:43 -07:00
Meng Xu bd80a67d46 Merge branch 'master' into mengxu/storage-engine-switch-PR-v2 2019-09-03 14:11:33 -07:00
Evan Tschannen dc1d055b27
Merge pull request #2042 from senthil-ram/snap_cli_fix
fix fdbcli --exec 'snapshot create.sh' failure
2019-08-30 13:40:38 -07:00
sramamoorthy b3277f2982 Fix #2009 posix compliant args for snapshot binary 2019-08-30 12:54:09 -07:00
A.J. Beamon 3f9e392668
Merge pull request #2014 from etschannen/feature-fdbcli-sleep
Added a sleep command to fdbcli
2019-08-30 11:22:13 -07:00
Evan Tschannen f3bc7e0abd do not duplicate data distribution disabled fields in status
fixed a few bugs related to the existing data distribution disabled fields in status
2019-08-29 18:41:34 -07:00
Jon Fu 807b02551e updated help message and changed existing workload to use mark as failed feature 2019-08-27 14:39:43 -07:00
Jon Fu c908c6c1db added command to fdbcli and changes to SystemData and ManagementAPI 2019-08-27 14:39:43 -07:00
Evan Tschannen 0b0c9fe0ff data distribution status was combined into regular status 2019-08-21 14:44:15 -07:00
A.J. Beamon 7953545331 Fix an unknown_error when the file passed to fileconfigure doesn't contain a valid object (e.g. if you omit the enclosing {} of your object).
Fix an internal error when configuring regions with some storage servers that don't have a datacenter set.
2019-08-19 11:28:15 -07:00
Evan Tschannen 2a436d5f6f fix: do not block fdbcli from starting if DataDistributionStatus is not available 2019-08-16 18:15:02 -07:00
Meng Xu e6284684f0 StorageEngineSwitch:Always remove wrong storeType SS
In the old logic of switching storage engines, it marks a storage server
with wrong store type as undesired even though this can lead to no healthy team.

In the first version of the new storage engine switch, we mimic the same logic
of the old version.
2019-08-13 14:59:46 -07:00
Meng Xu a588710376 StorageEngineSwitch:Graceful switch
When fdbcli changes storeType for storage engines,
we switch the store type of storage servers one by one gracefully.
This avoids recruiting multiple storage servers on the same process,
which can cause OOM error.
2019-08-12 17:37:52 -07:00
Meng Xu 3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu 7ccaeddf05 Merge branch 'master' into mengxu/performant-restore-PR 2019-08-01 13:23:17 -07:00
Xin Dong 5d20364423 Address review comments 2019-07-30 22:24:30 -07:00
Xin Dong 1922c39377 Resolve review comments. 100K run shows one suspicious ASSERT_WE_THINK failure which I think could be a race. 2019-07-30 22:24:30 -07:00
Xin Dong ae11efcb0a Made following changes:
- Make sure the disabled data distribution won't be accidentally enabled by the 'maintenance' command
- Make sure the status json reflects the status of DD accordingly
- Make sure the CLI can play with the new DD states correctly, i.e. print out warnings when necessary
2019-07-30 22:20:45 -07:00
sramamoorthy 49eaa31984 Add a trace event for snap create failure 2019-07-29 20:28:22 -07:00
sramamoorthy 5a56f6b456 minor snap create client improvement and bug fixes 2019-07-29 20:28:22 -07:00
Meng Xu 1706aaf199 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflict in TlogServer.actor.cpp by accepting master changes
2019-07-26 11:46:27 -07:00
sramamoorthy 9afd162e2f remove snap v1 related code 2019-07-25 17:29:31 -07:00
Meng Xu 45083edf74 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy 209448807d snap v2: fdbclient related changes 2019-07-24 15:36:28 -07:00
Evan Tschannen 83f4b8ebb1
Merge pull request #1866 from senthil-ram/setDDBugFix
setDDMode should set moveKeysLockWriteKey
2019-07-24 15:35:18 -07:00
sramamoorthy 0962641540 setDDMode should set moveKeysLockWriteKey
After takeMoveKeysLock notes down the owner and the
moveKeysLockWriteKey value, it monitors the above two in
pollMoveKeysLock and checks if anything is changed, but
setDDMode was not setting the moveKeysLockWriteKey and
so a sequence like disable, enable and disable would not
really disable DD.
2019-07-19 11:13:29 -07:00
Vishesh Yadav d9a8657096 fdbcli: Add `no_wait` option in `exclude` command to avoid blocking
RESOLVES #1840
2019-07-18 13:07:31 -07:00
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
Meng Xu 477fd152c0 FastRestore:Refactor code
1) Use the runRYWTransaction for simple DB access
2) Replace some printf with TraceEvent
3) Remove printf not used in debugging
4) Avoid wait inside the condition in loop-choose-when for
   the core routine of restore worker, loader and applier.
5) Rename Restore.actor.cpp to RestoreWorker.actor.cpp since
   the file only has functionalities related to restore worker.

Passed correctness test
2019-06-04 11:22:47 -07:00
sramamoorthy 4bcb590f12 g_random -> deterministicRandom() 2019-05-28 22:07:46 -07:00
sramamoorthy ec7834e2f7 code reorganization and address comments 2019-05-28 22:07:46 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and restore implementation 2019-05-28 22:07:46 -07:00
A.J. Beamon 603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00