Evan Tschannen
3325980c03
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# fdbserver/DataDistribution.actor.cpp
# fdbserver/OldTLogServer_6_0.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/WorkerInterface.actor.h
# fdbserver/worker.actor.cpp
# versions.target
2019-10-24 17:38:15 -07:00
A.J. Beamon
a1bed51d34
Ignore batch priority GRVs for latency band tracking
2019-10-23 10:29:58 -07:00
Evan Tschannen
12c517ab16
limit the number of committed version updates in progress simultaneously to prevent running out of memory
2019-10-21 16:01:45 -07:00
Jon Fu
d2b6626d5c
Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed
2019-10-21 13:47:06 -07:00
Evan Tschannen
688940b685
merge 6.2 into master
2019-10-21 11:43:46 -07:00
Evan Tschannen
ac28e96bbf
added a yield on the proxy to remove a slow task when processing large transactions
2019-10-16 14:31:59 -07:00
Jon Fu
450a09e117
Code Review Changes
2019-09-24 15:48:50 -07:00
Jon Fu
40ad6f0931
created new reply types to work with flatbuffer serialization requirements
2019-09-18 13:40:18 -07:00
Jon Fu
471e283128
Merge branch 'master' of https://github.com/apple/foundationdb into mark-ss-failed
2019-09-18 11:49:07 -07:00
Jingyu Zhou
02230e4ef6
Refactor: moving addition of backup mutations out of commitBatch
...
commitBatch() is too complicated and should be simplified.
2019-09-12 18:37:08 -07:00
Evan Tschannen
dcbb19816b
Merge pull request #1999 from ajbeamon/fix-proxy-grv-budgeting
...
The master proxy was too slow to erase a GRV budget deficit if no GRV requests were coming in.
2019-08-30 13:32:54 -07:00
sramamoorthy
5d87443323
improved error msgs for snapshot cmd
2019-08-27 16:43:52 -07:00
Jon Fu
3666c0c776
added more trace lines and added timeout to safety check in test workload
2019-08-27 14:39:44 -07:00
Jon Fu
04d514c483
added a wait to check for master proxies changed and put in a few more trace events
2019-08-27 14:39:44 -07:00
Jon Fu
e515691d7d
do not continuously loop if maybe_request_delivered
2019-08-27 14:39:44 -07:00
Jon Fu
00c2025d4b
fixed removeKeys impl, adjusted test workload, and introduced extra safety checks to NativeAPI and proxy
2019-08-27 14:39:44 -07:00
Jon Fu
5a877d6b14
added safety check on client to prevent removing all servers from a team
2019-08-27 14:39:43 -07:00
A.J. Beamon
0b1fc91a9c
Revert "Don't grow the budget deficit once it's exceeded some number of seconds of transactions. Decay the deficit if the rate changes and it exceeds the new limit."
...
This reverts commit 90cb73d472
.
2019-08-22 10:05:29 -07:00
A.J. Beamon
90cb73d472
Don't grow the budget deficit once it's exceeded some number of seconds of transactions. Decay the deficit if the rate changes and it exceeds the new limit.
2019-08-19 14:56:59 -07:00
A.J. Beamon
b2af17fb08
Simplify logic by removing an unneeded condition.
2019-08-15 08:23:13 -07:00
A.J. Beamon
717ede25b3
Fix: the master proxy was too slow to erase a GRV budget deficit if no GRV requests were coming in.
2019-08-14 15:01:09 -07:00
Evan Tschannen
ba54508c47
code cleanup
2019-08-06 16:30:30 -07:00
Evan Tschannen
4c9a392f05
the master checks the popped version of the txsTag before recovering the txnStateStore, to avoid restoring data that is later found to be popped
2019-08-05 17:01:48 -07:00
Andrew Noyes
1bad0fd44e
Make requestTime private
2019-07-31 17:59:35 -07:00
Evan Tschannen
6dbaddd0a7
Added a knob to always use CAUSAL_READ_RISKY for GRV
2019-07-30 18:21:46 -07:00
sramamoorthy
63941e0d96
disable DD with a in-memory flag and use in snapv2
2019-07-30 17:04:51 -07:00
Evan Tschannen
5c98dcce6d
revert the proxy forwarding path, because it is no longer necessary as clients keep a persistent connection open with coordinators
2019-07-27 16:46:22 -07:00
Evan Tschannen
b509a441e7
Merge branch 'master' into feature-skip-confirm
...
# Conflicts:
# bindings/flow/tester/Tester.actor.cpp
# bindings/go/src/_stacktester/stacktester.go
# bindings/java/src/test/com/apple/foundationdb/test/AsyncStackTester.java
# bindings/java/src/test/com/apple/foundationdb/test/StackTester.java
# bindings/python/tests/tester.py
# bindings/ruby/tests/tester.rb
# documentation/sphinx/source/api-c.rst
# documentation/sphinx/source/api-python.rst
# documentation/sphinx/source/api-ruby.rst
# documentation/sphinx/source/data-modeling.rst
# documentation/sphinx/source/developer-guide.rst
# fdbclient/vexillographer/fdb.options
# fdbserver/MasterProxyServer.actor.cpp
2019-07-27 15:08:13 -07:00
sramamoorthy
9afd162e2f
remove snap v1 related code
2019-07-25 17:29:31 -07:00
sramamoorthy
a65c9f92ed
get rid of all timeouts and other changes
2019-07-24 15:36:28 -07:00
sramamoorthy
a2f2ad96ff
code review comments and merge to master changes
2019-07-24 15:36:28 -07:00
sramamoorthy
869f77aef1
Few cosmetic edits and fixes
2019-07-24 15:36:28 -07:00
sramamoorthy
62c14dae72
disable dd during snap and enable in restore
2019-07-24 15:36:28 -07:00
sramamoorthy
95d6807740
tryGetReply instead of getReply for ddSnapReq
2019-07-24 15:36:28 -07:00
sramamoorthy
d0793f5ca2
snap v2: master proxy related changes
2019-07-24 15:36:28 -07:00
Jingyu Zhou
63e37aebaf
Reorder include files.
2019-07-19 11:24:26 -07:00
Jingyu Zhou
d8fb1ea2d3
Log large transactions at proxy
...
This can help debugging where large transactions are coming from.
2019-07-19 11:10:48 -07:00
Evan Tschannen
d4c332e2cf
update lastCommitTime when doing GRV’s without causal read risky
2019-07-14 14:46:00 -07:00
Evan Tschannen
02de53160d
only skip confirm epoch live if CAUSAL_READ_RISKY is enabled
...
time checked on the proxy should be less than the time waited by the master to account for clock speed differences
setting REQUIRED_MIN_RECOVERY_DURATION and ENFORCED_MIN_RECOVERY_DURATION to 0 will go back to the old behavior
2019-07-12 17:58:16 -07:00
Evan Tschannen
a63969afb3
enforce a minimum recovery duration, which allows proxies to avoid checking if the epoch is alive as long as its last commit has been less than MINIMUM_RECOVERY_DURATION ago
2019-07-12 13:10:21 -07:00
Evan Tschannen
d8948c8be1
Merge branch 'master' into feature-fast-txs-recovery
...
# Conflicts:
# fdbserver/TagPartitionedLogSystem.actor.cpp
2019-07-10 13:59:52 -07:00
Evan Tschannen
001abec29d
fixed a compiler error, buggified a new knob
2019-07-09 16:50:59 -07:00
Evan Tschannen
64aee73c4f
we only need to hold the ReplyPromise for messages that we are going to forward to new proxies
2019-07-09 16:47:56 -07:00
Evan Tschannen
c348b3da51
After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies
2019-07-08 12:53:40 -07:00
Jingyu Zhou
50e7593c5b
Merge pull request #1796 from ajbeamon/remove-trace-event-underscores
...
Remove trace event underscores
2019-07-05 21:45:55 -07:00
Evan Tschannen
15e894c724
Merge in master
2019-07-05 15:49:24 -07:00
A.J. Beamon
9f4b6fd770
Remove additional underscores
2019-07-05 08:12:25 -07:00
Alex Miller
7a500cd37f
A giant translation of TaskFooPriority -> TaskPriority::Foo
...
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
Evan Tschannen
e0be631414
shard the txs tag so that more transaction logs are involved in its recovery
2019-06-19 18:15:09 -07:00
sramamoorthy
2a68b28590
rebase related changes
2019-05-28 22:07:46 -07:00
sramamoorthy
b17ad85497
exec op not supported when log_anti_quorum > 0
2019-05-28 22:07:46 -07:00
sramamoorthy
d3a179b6f9
Multiple bug fixes
...
- wait for snapTLogFailKeys in a loop, otherwise in some race
condition it can cause a false assert
- in single region, there does not seem to be a guarantee of
tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
address is not sufficient in some cases, hence looking by both
primary and secondary address
- test make files to reflect the location of the new test cases
2019-05-28 22:07:46 -07:00
sramamoorthy
bb474dc323
if recovery < fully_recovered then fail the exec
...
Will do more cleanup, pushing it for a test run in CI
2019-05-28 22:07:46 -07:00
sramamoorthy
925499954b
New status cluster_not_fully_recovered
2019-05-28 22:07:46 -07:00
sramamoorthy
6f42337c09
TransactionNotPermitted instead of conflict error
...
When the cluster has not recovered completely, return op not
permitted instead of conflict error
2019-05-28 22:07:46 -07:00
sramamoorthy
4083af0b01
Avoid using trackLatest for TLog pop test cases
2019-05-28 22:07:46 -07:00
sramamoorthy
936ffc2dde
rebase related changes
2019-05-28 22:07:46 -07:00
sramamoorthy
ec7834e2f7
code re-orgnaization and address comments
2019-05-28 22:07:46 -07:00
sramamoorthy
c76cc84ded
execute coordinators code reorganized
2019-05-28 22:07:46 -07:00
sramamoorthy
61e93a9304
Address review comments and minor fixes
2019-05-28 22:07:46 -07:00
sramamoorthy
00ccee8a6c
workaround for log giving remote log and others
...
logSystemConfig.allLocalLogs() sometimes returns remote TLog interface
and a workaround is implemented here. Other minor cleanup.
2019-05-28 22:07:46 -07:00
sramamoorthy
89b7a052f5
Bug fixes for snapping coordinators
2019-05-28 22:07:46 -07:00
sramamoorthy
17ecba8313
trace cleanup and other indentation changes
2019-05-28 22:07:46 -07:00
sramamoorthy
898bed66c1
Allow only whitelisted binary path for exec op
2019-05-28 22:07:46 -07:00
sramamoorthy
d282016f93
Exec op to tag only local storage nodes
2019-05-28 22:07:46 -07:00
sramamoorthy
6431513ad0
Fail exec req until the cluster is fully_recovered
2019-05-28 22:07:46 -07:00
sramamoorthy
4016f16c76
Fix few compilation and bugs in rebase
2019-05-28 22:07:46 -07:00
sramamoorthy
4bc4c615da
exec op to all tlog, restore change in test &other
...
- exec operation to go to all the TLogs
- minor bug fix in tlog
- restore implementation for the simulator
- restore snap UID to be stored in restartInfo.ini
- test cases added
- indentation and trace file fixes
2019-05-28 22:07:46 -07:00
sramamoorthy
72dd067173
Trace message changes and fix few FIXMEs
2019-05-28 22:07:46 -07:00
sramamoorthy
69edefe68b
Snapshot based backup and resotre implementation
2019-05-28 22:07:46 -07:00
A.J. Beamon
5f55f3f613
Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used.
2019-05-10 14:01:52 -07:00
Jingyu Zhou
e193cac5ef
Merge remote-tracking branch 'apple/master' into tlog
...
Resolve Conflicts: fdbserver/MasterProxyServer.actor.cpp
2019-05-01 17:18:00 -07:00
Evan Tschannen
2d5043c665
Merge branch 'release-6.1'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# versions.target
2019-04-30 18:27:04 -07:00
Evan Tschannen
6c77864731
separate GetStorageServerRejoinInfoRequest from GetKeyServerLocationsRequest, to avoid yielding for the rejoin requests
2019-04-25 17:07:35 -07:00
Jingyu Zhou
439d5a3843
Use emplace_back instead of push_back in Proxy
2019-04-22 14:03:48 -07:00
Jingyu Zhou
d2b215b926
Refactor tag population of ServerCacheInfo
2019-04-22 11:55:04 -07:00
Jingyu Zhou
966ec30fcc
Add pseudoLocalities for special tag consumers
2019-04-21 10:41:07 -07:00
Jingyu Zhou
b4e7e7a85b
Refactor StorageCache updates
2019-04-21 10:41:07 -07:00
Evan Tschannen
6220a5ce0f
Merge pull request #1370 from jzhou77/fix-unreferenced
...
Remove unused functions
2019-04-09 11:49:45 -07:00
mpilman
1c16f87a4e
Remove trace-calls to printable (in non-workloads)
2019-04-05 13:12:19 -07:00
Jingyu Zhou
47b4b82628
Merge branch 'master' into fix-unreferenced
2019-04-01 14:07:19 -07:00
Jingyu Zhou
f7f8ddd894
Fix warnings on unused variables
...
Found by -Wunused-variable flag.
2019-04-01 14:00:20 -07:00
Evan Tschannen
b6008558d3
renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
...
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen
87e2a1a029
The proxy budget is implemented to let one request over its limit through, and then pay back what was over the limit in the next update
2019-03-18 16:09:57 -07:00
Evan Tschannen
ec6c843124
increased the GRV client batch size, similarly increased the proxy limits related to the number of transactions started in a batch
2019-03-16 16:18:58 -07:00
Evan Tschannen
a7e45cff91
Merge pull request #1176 from jzhou77/ratekeeper
...
Make Ratekeeper a separate role
2019-03-12 15:58:59 -07:00
Jingyu Zhou
dc129207a9
Minor fix after rebase.
2019-03-07 13:16:20 -08:00
Jingyu Zhou
3c86643822
Separate Ratekeeper from data distribution.
...
Add a new role for ratekeeper.
Remove StorageServerChanges from data distribution.
Ratekeeper monitors storage servers, which borrows the idea from
DataDistribution.
2019-03-07 13:16:20 -08:00
Evan Tschannen
988add9fb5
cache the metadataVersion for commits, so that doing setVersion with the commit version of a different transaction will
2019-03-04 16:48:34 -08:00
Evan Tschannen
075fdef31a
Merge branch 'master' into feature-metadata-version
...
# Conflicts:
# fdbclient/DatabaseContext.h
2019-03-03 22:58:45 -08:00
Trevor Clinkenbeard
39f612d132
Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics
2019-03-02 17:07:00 -08:00
Evan Tschannen
841eb10af0
Replace g_network->now() with now()
...
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:15:56 -08:00
Evan Tschannen
a37a65a7ff
Replace g_network->now() with now()
...
Co-Authored-By: tclinken <trevorclinkenbeard@gmail.com>
2019-03-02 08:15:05 -08:00
Evan Tschannen
3da85f3acd
implemented the \xff/metadataVersion key, which can be used by layers to help them cheaply cache metadata and know when their cache is invalid
2019-02-28 17:45:00 -08:00
A.J. Beamon
3e6a6a6569
Update status schema for correctness. Send the count of batch transactions started back to ratekeeper so that it can be logged with other ratekeeper metrics.
2019-02-28 12:00:58 -08:00
A.J. Beamon
d48a2d7b54
Fix problem caused by merge
2019-02-27 12:21:07 -08:00
A.J. Beamon
a051055caf
Initial implementation of adding separate limits for batch priority in ratekeeper
2019-02-27 10:31:56 -08:00
Trevor Clinkenbeard
abfe057805
Merge branch 'master' of https://github.com/apple/foundationdb into add-health-metrics
2019-02-25 13:47:16 -08:00
Trevor Clinkenbeard
07f800eeee
Got rid of detailed field in GetRateInfoReply message
2019-02-23 17:52:11 -08:00
Trevor Clinkenbeard
f3a73963b4
Got rid of detailedLeaseDuration in GetRateInfoReply message
2019-02-23 16:42:11 -08:00