666 Commits

Author SHA1 Message Date
Meng Xu 39a4f2372f Change FASTRESTORE_SAMPLING_PERCENT to 0 to 100 2019-12-04 21:26:27 -08:00
Meng Xu c6b36dbffb FastRestore:Sampling:Resolve review comments 2019-12-04 17:35:11 -08:00
Meng Xu a15320cca7 Merge branch 'master' into mengxu/fast-restore-sampling-PR 2019-12-03 21:42:01 -08:00
Meng Xu 3310f67e9e Merge branch 'mengxu/fast-restore-fix-valgrind-PR' into mengxu/fast-restore-sampling-PR 2019-12-03 16:24:40 -08:00
A.J. Beamon 4d28793c76 Merge pull request #2365 from xumengpanda/mengxu/no-wait-actor-fix-PR
Fix compilation warning that actor that does not contain wait statement
2019-12-03 08:30:04 -08:00
Evan Tschannen 07331ab5fd Merge pull request #2362 from etschannen/master
Merge 6.2 into master
2019-12-02 15:04:27 -08:00
Meng Xu f153cadab9 ComplilationWarning:Fix actor that does not contain wait statement 2019-12-02 11:38:29 -08:00
Meng Xu 594bd9544e Merge branch 'master' into mengxu/fast-restore-fix-valgrind-PR 2019-12-02 11:27:38 -08:00
Meng Xu 530b689299 Move state variable to the start of function 2019-11-26 11:17:59 -08:00
Jon Fu da5f48f344 Merge branch 'master' of https://github.com/apple/foundationdb into merge-attrition-workloads 2019-11-25 12:55:55 -08:00
Jon Fu ff19e11b40 added more parameters 2019-11-20 15:11:18 -08:00
Jon Fu c8a4ad0412 added live duration before kill and changed naming of variables from machine to worker 2019-11-20 10:46:00 -08:00
Meng Xu 3f5491318d FastRestore:Fix bug that cause nondeterminism
1) Use map iterator instead of pointer to maintain stability when map is inserted or deleted
2) dummySampleWorkload: clear rangeToApplier data in each sampling phase. otherwise, we can
have an increasing number of keys assigned to the applier.
2019-11-15 11:30:09 -08:00
Evan Tschannen 8d3ef89540 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/MutationList.h
#	fdbserver/MasterProxyServer.actor.cpp
#	versions.target
2019-11-14 15:49:56 -08:00
Evan Tschannen b1b5f88cb1 Merge pull request #2344 from bnamasivayam/release-6.2
Fix bug where DD or RK could be halted and re-recruited in a loop for…
2019-11-12 21:47:28 -08:00
Balachandar Namasivayam c26bb52979 Enable Consistency Checks for DD and RK. 2019-11-12 20:11:08 -08:00
Balachandar Namasivayam f5282f2c7e Fix bug where DD or RK could be halted and re-recruited in a loop for certain valid process class configurations. Specifically, recruitment of DD or RK takes into account that master process is preferred over proxy, resolver or cc.
But check for better DD only looks for better machine class ignoring that the new recruit could share a proxy or resolver or CC. Also try to balance the distribution of the DD and RK role if there are enough processes to do so.
2019-11-12 14:22:36 -08:00
A.J. Beamon ef801a6432 Rename LargePacket warnings to distinguish between sent and received packets. Also remove Net2_ prefix from packet size trace events. 2019-11-12 09:23:46 -08:00
A.J. Beamon cf2ec3418c Merge pull request #2317 from xumengpanda/mengxu/fastrestore-extend-atomicOpTest-PR
AtomicOps Test: Add more detailed debug information when test fails with opType = AddValue
2019-11-11 15:03:10 -08:00
Jon Fu bdbd887fa5 revise some trace lines 2019-11-08 15:09:09 -08:00
Jon Fu 2147401a21 added function lambda and ability to specify zone kill 2019-11-08 15:05:18 -08:00
Meng Xu 04e66fa0ec AtomicOp:Trace when txn reads exceeds limit and add upper bound sum 2019-11-08 14:35:37 -08:00
Jon Fu 489a98c62b use vector of targets and removed randomization from specified kill types (dc, datahall, etc.) 2019-11-08 13:56:39 -08:00
Jon Fu 3de7ae5b0c Added size assertion in test workload 2019-11-08 09:39:25 -08:00
Meng Xu 5fbe399baf AtomicOp: Resolve review comments; no functional change.
1) Trace Txn commit_unknown_results in workload;
2) Add SevError trace events when txn reads hit limits since we
do not handle this situation in dumping the debug info.
2019-11-06 12:13:27 -08:00
Meng Xu 0ccded1929 AtomicOps:Resolve review comments 2019-11-05 19:27:49 -08:00
Jon Fu da1a70e19a fix check for killable processes 2019-11-05 13:57:32 -08:00
Jon Fu 72ad8c5f99 only randomize killdc option under simulation 2019-11-05 13:31:47 -08:00
Jon Fu f7b3686fc7 fixed bug in maintaining kill set size 2019-11-05 11:27:10 -08:00
Jon Fu ba2c5dd2a6 first draft of adding option to kill processes 2019-11-04 15:46:45 -08:00
Meng Xu 96989e0fb6 AtomicOps test:Add sanity check for log and ops keys
Provide more information about which opsKey is missing when
log and ops results are inconsistent for Add operation.
2019-11-04 14:28:40 -08:00
Jon Fu d9d1cdc470 removed delay before kill 2019-11-04 11:19:39 -08:00
Andrew Noyes de8921b660 Move RestoreWorkerInterface to fdbclient 2019-10-25 10:42:22 -07:00
Andrew Noyes d4de608bb6 Fix OPEN_FOR_IDE build 2019-10-25 10:42:22 -07:00
Jingyu Zhou a30e6ec147 Merge pull request #2277 from xumengpanda/mengxu/fastrestore-atomicOpTest-increaseLoadAndBugFix-PR
Performant restore [7/XX]: Add tests for transactionBatchSizeThreshold when apply mutations
2019-10-24 21:21:14 -07:00
Jon Fu 5d7c84b803 moved shuffle outside of the conditional blocks 2019-10-24 09:45:04 -07:00
Meng Xu b1881a7c1c FastRestore:Apply clang-format 2019-10-23 20:49:14 -07:00
Meng Xu 1ae02dd1df FastRestore:AtomicOp test:Add sanity check for setup step 2019-10-23 17:28:21 -07:00
Jon Fu ab262e5e4d use StringRef over std::string for workload params 2019-10-23 14:55:28 -07:00
Jon Fu 103cc37a35 added datahall kill and option to target a specific datahall/dc/machine id 2019-10-23 14:19:17 -07:00
Meng Xu ba7e499efe FastRestore:AtomicOpTest:Limit 1 actor per client 2019-10-23 14:04:14 -07:00
Jon Fu d97ff75638 added mode to specifically kill all workers with same machineId 2019-10-23 11:30:16 -07:00
Jon Fu 47dc0ee25c removed coordinator check and added pre-processing of workers rather than checking each cycle 2019-10-23 11:19:27 -07:00
Jon Fu 6583c499f8 Merge branch 'master' of https://github.com/apple/foundationdb into modify-attrition 2019-10-23 09:42:14 -07:00
Meng Xu e676348710 Merge pull request #1955 from fzhjon/mark-ss-failed
Add fdbcli and API command to mark storage servers as permanently failed
2019-10-22 23:36:30 -07:00
Jon Fu e39d0dde9b Merge branch 'master' of https://github.com/apple/foundationdb into modify-attrition 2019-10-22 11:51:08 -07:00
Meng Xu 4af69fd94f Merge branch 'master' into mengxu/fastrestore-multifiles-has-sameversion-mutations-PR-testPR 2019-10-21 14:35:04 -07:00
Jon Fu d2b6626d5c Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed 2019-10-21 13:47:06 -07:00
Evan Tschannen 688940b685 merge 6.2 into master 2019-10-21 11:43:46 -07:00
Meng Xu 6d0c9e9198 FastRestore:AtomicOpTestCase:Add the test case
Also add trace events for AtomicOps.actor.cpp
2019-10-18 16:58:45 -07:00