Commit Graph

488 Commits

Author SHA1 Message Date
Alex Miller c8e94e601a
Merge pull request #1729 from etschannen/feature-fast-txs-recovery
Improve the recovery speed of the txnStateStore
2019-07-15 13:27:41 -07:00
Evan Tschannen bbef631872 fix: do not access optionInfo unless the option already exists in the map 2019-07-10 18:48:54 -07:00
Evan Tschannen d8948c8be1 Merge branch 'master' into feature-fast-txs-recovery
# Conflicts:
#	fdbserver/TagPartitionedLogSystem.actor.cpp
2019-07-10 13:59:52 -07:00
Evan Tschannen 7ad0d1a12b Merge branch 'master' into feature-proxy-forward
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
2019-07-09 17:26:15 -07:00
Vishesh Yadav eabc610daa
Merge pull request #1813 from alexmiller-apple/log-version-4
Add a TLogVersion::V4
2019-07-09 08:42:20 -07:00
Alex Miller d2ef84a8f9 Add a TLogVersion::V4
And refactor some code to make adding more TLogVersions easier.
2019-07-08 22:22:45 -07:00
Evan Tschannen c348b3da51 After a proxy dies, it will remain alive for an additional 10 seconds to forward clients to the new proxies 2019-07-08 12:53:40 -07:00
Alex Miller 14e5dd74fe Add a checkOnly parameter to Cycle workload.
So that it can be used in the real world for consistency checking of
backup and DR.
2019-07-05 19:09:09 -07:00
Evan Tschannen 15e894c724 Merge in master 2019-07-05 15:49:24 -07:00
dyoungworth 817fce080b Fix minor bug in External Workload 2019-07-02 15:57:26 -07:00
Evan Tschannen 841e61ac25 fixed a broken promise in localRatekeeper 2019-07-01 16:56:35 -07:00
Evan Tschannen db413c37f7 restored the STORAGE_DURABILITY_LAG_SOFT_MAX knob and made the rk target slightly smaller than the soft limit, to avoid inaccuracies in ratekeeper control causing behavior changes on the storage servers 2019-06-28 16:54:22 -07:00
Evan Tschannen ec16688db1 fixed the local ratekeeper workload to match the logic on the storage server 2019-06-28 16:54:22 -07:00
Evan Tschannen 92b32855ca ratekeeper’s control algorithm would oscillate when limited by local ratekeeper 2019-06-28 16:54:22 -07:00
Evan Tschannen 235697f688 fix: txsTags are not popped at the recovery version 2019-06-27 23:18:26 -07:00
mpilman 7bfda1faaa Fixed three more Windows issues
This is now compiling on my Windows machine
2019-06-27 11:39:36 -07:00
Alex Miller 83fae6cc15 Fix ExternalWorkload not being a part of the old build/test system. 2019-06-25 21:42:35 -07:00
Evan Tschannen 0fe6edc254
Merge pull request #1678 from mpilman/features/external-workload
Features/external workload
2019-06-25 13:53:19 -07:00
sramamoorthy 212136d024 SnapTest to handle retries for exec txns 2019-06-24 10:22:42 -07:00
Evan Tschannen 37c1df2491
Merge pull request #1705 from bnamasivayam/suspend-process
Extend RebootRequest API to include time to suspend the process befor…
2019-06-20 17:36:25 -07:00
mpilman ab7562160c Made JavaWorkload an external workload 2019-06-19 13:03:41 -07:00
mpilman 2eff2b7e21 First simple test is working (but very buggy) 2019-06-19 13:03:41 -07:00
mpilman c8957d93f8 Implementation code complete 2019-06-19 13:03:41 -07:00
mpilman 8576665a90 Revert "Revert "Make protocol version a type""
This reverts commit 455bf3b3ec.
2019-06-18 14:49:04 -07:00
Alex Miller 455bf3b3ec Revert "Make protocol version a type" 2019-06-18 10:59:17 -07:00
mpilman da53a92bec Make protocol version a type
This fixes #1214

The basic idea is that ProtocolVersion is now its own type. This
alone is an improvement as it makes many things more typesafe. For
each version, we can now add breaking features (for example Fearless).
After that, there's no need to test against actual (confusing) version
numbers. Instead a developer can simply test
`protocolVersion->hasFearless()` and this will return true iff the
protocolVersion is newer than the newest version that didn't support
fearless.
2019-06-16 09:59:15 -07:00
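The idea in the commit message above (a dedicated ProtocolVersion type with per-feature predicates) can be illustrated with a minimal sketch. This is not the code from the commit: the class layout, the constant `kLastVersionWithoutFearless`, its value, and the helper `describePeer` are all hypothetical, chosen only to show how a caller can test for a capability instead of comparing raw version numbers. The commit message calls the predicate through a pointer-like handle (`protocolVersion->hasFearless()`); the sketch uses a plain value for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch of a strongly typed protocol version.
// The numeric values are invented for illustration and are NOT
// FoundationDB's real protocol version numbers.
class ProtocolVersion {
public:
    constexpr explicit ProtocolVersion(std::uint64_t v) : version_(v) {}

    constexpr std::uint64_t version() const { return version_; }

    // True iff this version is newer than the newest version that
    // did not support the feature.
    constexpr bool hasFearless() const {
        return version_ > kLastVersionWithoutFearless;
    }

private:
    // Assumed value: the newest version without the feature.
    static constexpr std::uint64_t kLastVersionWithoutFearless = 0x0100;
    std::uint64_t version_;
};

// Hypothetical caller: branches on the capability, not on a raw number.
inline std::string describePeer(const ProtocolVersion& pv) {
    return pv.hasFearless() ? "fearless" : "legacy";
}

// Compile-time checks of the boundary behavior described in the message.
static_assert(!ProtocolVersion(0x0100).hasFearless(), "boundary version lacks the feature");
static_assert(ProtocolVersion(0x0101).hasFearless(), "newer version has the feature");
```

Because the constructor is `explicit`, a bare integer cannot be passed where a ProtocolVersion is expected, which is the type-safety gain the message refers to.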
Balachandar Namasivayam 5eb833759e Extend RebootRequest API to include time to suspend the process before reboot. This is intended to be used for testing purposes to simulate failures. 2019-06-14 11:35:38 -07:00
Evan Tschannen 55f7e7d372 fix: The delay inside the disabledMap was causing the storage server updateStorage actor to run on the client process 2019-06-13 14:28:30 -07:00
Evan Tschannen dccb9bc26d fixed a number of correctness problems 2019-06-12 19:40:50 -07:00
Trevor Clinkenbeard 8144882d7b Merge branch 'apple-master' into features/local-rk 2019-06-10 19:40:25 -07:00
Trevor Clinkenbeard 46b77819aa Fixed LocalRatekeeper test 2019-06-10 18:25:58 -07:00
chaoguang 66811b7bd2 update to latest version 2019-06-03 16:49:19 -07:00
chaoguang 3055376b45 remove static keyword to make variables not in binary 2019-06-03 16:40:34 -07:00
Meng Xu dc59f63d0e TraceEvent:First letter must be capitalized 2019-06-03 13:27:18 -07:00
chaoguang ac2c0f38b7 remove inheritance from KVWorkload 2019-06-02 23:16:39 -07:00
chaoguang d07c46e3f3 fix issues by comments 2019-05-31 00:44:07 -07:00
chaoguang 66d25cef21 fix issues by comments 2019-05-31 00:27:30 -07:00
sramamoorthy 4bcb590f12 g_random -> deterministicRandom() 2019-05-28 22:07:46 -07:00
sramamoorthy b17ad85497 exec op not supported when log_anti_quorum > 0 2019-05-28 22:07:46 -07:00
sramamoorthy 3aa848b8af minor bug in whitelist binary path testing 2019-05-28 22:07:46 -07:00
sramamoorthy 40358e1dd6 limit of getRange in snapTest reduced
With CLIENT_KNOBS->TOO_MANY in snapTest, by the time getRange
gathers all the results, the storage server's oldest version has
gone past the req->version and hence the transaction fails with
transaction_too_old
2019-05-28 22:07:46 -07:00
sramamoorthy ceac68c990 restore - remove empty snapdir, snap loop retry fix
- remove partially snapped directories to avoid no cluster file assert
- snap create to retry max 3 times for not_fully_recovered and keep
  retrying for the other failures
2019-05-28 22:07:46 -07:00
sramamoorthy d3a179b6f9 Multiple bug fixes
- wait for snapTLogFailKeys in a loop, otherwise in some race
  condition it can cause a false assert
- in single region, there does not seem to be a guarantee of
  tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
  address is not sufficient in some cases, hence looking by both
  primary and secondary address
- test make files to reflect the location of the new test cases
2019-05-28 22:07:46 -07:00
sramamoorthy bb474dc323 if recovery < fully_recovered then fail the exec
Will do more cleanup, pushing it for a test run in CI
2019-05-28 22:07:46 -07:00
sramamoorthy 925499954b New status cluster_not_fully_recovered 2019-05-28 22:07:46 -07:00
sramamoorthy 591ff96b93 increase retry and use eat instead of parsing 2019-05-28 22:07:46 -07:00
sramamoorthy 4083af0b01 Avoid using trackLatest for TLog pop test cases 2019-05-28 22:07:46 -07:00
sramamoorthy ec7834e2f7 code reorganization and address comments
sramamoorthy 858604b51d minor cleanups to SnapTest 2019-05-28 22:07:46 -07:00
sramamoorthy 00ccee8a6c workaround for log giving remote log and others
logSystemConfig.allLocalLogs() sometimes returns remote TLog interface
and a workaround is implemented here. Other minor cleanup.
2019-05-28 22:07:46 -07:00