Commit Graph

163 Commits

Author SHA1 Message Date
Jon Fu 5a877d6b14 added safety check on client to prevent removing all servers from a team 2019-08-27 14:39:43 -07:00
Evan Tschannen ba54508c47 code cleanup 2019-08-06 16:30:30 -07:00
Evan Tschannen 4c9a392f05 the master checks the popped version of the txsTag before recovering the txnStateStore, to avoid restoring data that is later found to be popped 2019-08-05 17:01:48 -07:00
Andrew Noyes 1bad0fd44e Make requestTime private 2019-07-31 17:59:35 -07:00
Evan Tschannen 6dbaddd0a7 Added a knob to always use CAUSAL_READ_RISKY for GRV 2019-07-30 18:21:46 -07:00
sramamoorthy 63941e0d96 disable DD with a in-memory flag and use in snapv2 2019-07-30 17:04:51 -07:00
Evan Tschannen 5c98dcce6d revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators 2019-07-27 16:46:22 -07:00
Evan Tschannen b509a441e7 Merge branch 'master' into feature-skip-confirm
# Conflicts:
#	bindings/flow/tester/Tester.actor.cpp
#	bindings/go/src/_stacktester/stacktester.go
#	bindings/java/src/test/com/apple/foundationdb/test/AsyncStackTester.java
#	bindings/java/src/test/com/apple/foundationdb/test/StackTester.java
#	bindings/python/tests/tester.py
#	bindings/ruby/tests/tester.rb
#	documentation/sphinx/source/api-c.rst
#	documentation/sphinx/source/api-python.rst
#	documentation/sphinx/source/api-ruby.rst
#	documentation/sphinx/source/data-modeling.rst
#	documentation/sphinx/source/developer-guide.rst
#	fdbclient/vexillographer/fdb.options
#	fdbserver/MasterProxyServer.actor.cpp
2019-07-27 15:08:13 -07:00
sramamoorthy 9afd162e2f remove snap v1 related code 2019-07-25 17:29:31 -07:00
sramamoorthy a65c9f92ed get rid of all timeouts and other changes 2019-07-24 15:36:28 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy 869f77aef1 Few cosmetic edits and fixes 2019-07-24 15:36:28 -07:00
sramamoorthy 62c14dae72 disable dd during snap and enable in restore 2019-07-24 15:36:28 -07:00
sramamoorthy 95d6807740 tryGetReply instead of getReply for ddSnapReq 2019-07-24 15:36:28 -07:00
sramamoorthy d0793f5ca2 snap v2: master proxy related changes 2019-07-24 15:36:28 -07:00
Jingyu Zhou 63e37aebaf Reorder include files. 2019-07-19 11:24:26 -07:00
Jingyu Zhou d8fb1ea2d3 Log large transactions at proxy
This can help debugging where large transactions are coming from.
2019-07-19 11:10:48 -07:00
Evan Tschannen d4c332e2cf update lastCommitTime when doing GRV’s without causal read risky 2019-07-14 14:46:00 -07:00
Evan Tschannen 02de53160d only skip confirm epoch live if CAUSAL_READ_RISKY is enabled
time checked on the proxy should be less than the time waited by the master to account for clock speed differences
setting REQUIRED_MIN_RECOVERY_DURATION and ENFORCED_MIN_RECOVERY_DURATION to 0 will go back to the old behavior
2019-07-12 17:58:16 -07:00
Evan Tschannen a63969afb3 enforce a minimum recovery duration, which allows proxies to avoid checking if the epoch is alive as long as its last commit has been less than MINIMUM_RECOVERY_DURATION ago 2019-07-12 13:10:21 -07:00
Evan Tschannen d8948c8be1 Merge branch 'master' into feature-fast-txs-recovery
# Conflicts:
#	fdbserver/TagPartitionedLogSystem.actor.cpp
2019-07-10 13:59:52 -07:00
Evan Tschannen 001abec29d fixed a compiler error, buggified a new knob 2019-07-09 16:50:59 -07:00
Evan Tschannen 64aee73c4f we only need to hold the ReplyPromise for messages that we are going to forward to new proxies 2019-07-09 16:47:56 -07:00
Evan Tschannen c348b3da51 After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies 2019-07-08 12:53:40 -07:00
Jingyu Zhou 50e7593c5b Merge pull request #1796 from ajbeamon/remove-trace-event-underscores
Remove trace event underscores
2019-07-05 21:45:55 -07:00
Evan Tschannen 15e894c724 Merge in master 2019-07-05 15:49:24 -07:00
A.J. Beamon 9f4b6fd770 Remove additional underscores 2019-07-05 08:12:25 -07:00
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
Evan Tschannen e0be631414 shard the txs tag so that more transaction logs are involved in its recovery 2019-06-19 18:15:09 -07:00
sramamoorthy 2a68b28590 rebase related changes 2019-05-28 22:07:46 -07:00
sramamoorthy b17ad85497 exec op not supported when log_anti_quorum > 0 2019-05-28 22:07:46 -07:00
sramamoorthy d3a179b6f9 Multiple bug fixes
- wait for snapTLogFailKeys in a loop, otherwise in some race
  condition it can cause a false assert
- in single region, there does not seem to be a guarantee of
  tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
  address is not sufficient in some cases, hence looking by both
  primary and secondary address
- test make files to reflect the location of the new test cases
2019-05-28 22:07:46 -07:00
sramamoorthy bb474dc323 if recovery < fully_recovered then fail the exec
Will do more cleanup, pushing it for a test run in CI
2019-05-28 22:07:46 -07:00
sramamoorthy 925499954b New status cluster_not_fully_recovered 2019-05-28 22:07:46 -07:00
sramamoorthy 6f42337c09 TransactionNotPermitted instead of conflict error
When the cluster has not recovered completely, return op not
permitted instead of conflict error
2019-05-28 22:07:46 -07:00
sramamoorthy 4083af0b01 Avoid using trackLatest for TLog pop test cases 2019-05-28 22:07:46 -07:00
sramamoorthy 936ffc2dde rebase related changes 2019-05-28 22:07:46 -07:00
sramamoorthy ec7834e2f7 code re-organization and address comments 2019-05-28 22:07:46 -07:00
sramamoorthy c76cc84ded execute coordinators code reorganized 2019-05-28 22:07:46 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 00ccee8a6c workaround for log giving remote log and others
logSystemConfig.allLocalLogs() sometimes returns remote TLog interface
and a workaround is implemented here. Other minor cleanup.
2019-05-28 22:07:46 -07:00
sramamoorthy 89b7a052f5 Bug fixes for snapping coordinators 2019-05-28 22:07:46 -07:00
sramamoorthy 17ecba8313 trace cleanup and other indentation changes 2019-05-28 22:07:46 -07:00
sramamoorthy 898bed66c1 Allow only whitelisted binary path for exec op 2019-05-28 22:07:46 -07:00
sramamoorthy d282016f93 Exec op to tag only local storage nodes 2019-05-28 22:07:46 -07:00
sramamoorthy 6431513ad0 Fail exec req until the cluster is fully_recovered 2019-05-28 22:07:46 -07:00
sramamoorthy 4016f16c76 Fix few compilation and bugs in rebase 2019-05-28 22:07:46 -07:00
sramamoorthy 4bc4c615da exec op to all tlog, restore change in test &other
- exec operation to go to all the TLogs
- minor bug fix in tlog
- restore implementation for the simulator
- restore snap UID to be stored in restartInfo.ini
- test cases added
- indentation and trace file fixes
2019-05-28 22:07:46 -07:00
sramamoorthy 72dd067173 Trace message changes and fix few FIXMEs 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and restore implementation 2019-05-28 22:07:46 -07:00