Alex Miller
c40c1bb5fe
Add a new workload: BackupToDBAbort, which does an ACI switchover.
...
This is to allow easier testing of non-durable switchovers without having to
wiggle into BackupToDBCorrectness's view of the world.
2017-09-29 15:58:36 -07:00
Alex Miller
9e9a96ae76
Make VersionStamp workload able to run with DR-style workloads.
...
* It is now tolerant of locked database errors, and handles them correctly.
* There is an option to specify which database to verify against.
2017-09-29 15:58:36 -07:00
Alex Miller
34630b6130
Make VersionStamp workload able to handle commit_unknown_result.
...
Previously, if a transaction failed with commit_unknown_result, and was
actually committed, it would look like data that magically appeared in the
database and verification would fail.
Now, we explicitly re-read and check to see if the commit happened, so that we
may maintain an accurate understanding of what the database state should be.
2017-09-29 15:58:36 -07:00
Alex Miller
23945b9fea
VersionStamp can co-exist with other workloads that write data to the database.
...
VersionStamp previously would range-read the entire database during validation.
This has the unfortunate effect of making it fail during validation if run with
any other workload that writes keys to the database.
Now, all keys written and read are done with a configurable prefix, so that it
may co-exist with a variety of other workloads.
2017-09-29 15:58:36 -07:00
Alex Miller
370a6afb80
Make VersionStamp have an option to be tolerant of data being lost.
2017-09-29 15:58:36 -07:00
Alex Miller
8f4c45418b
Make atomicSwitchover preserve an ever-increasing commit version.
2017-09-29 15:58:36 -07:00
Alex Miller
69523ce151
Hackish version of a test, but it does fail.
2017-09-29 15:58:36 -07:00
Alex Miller
65713b226f
Fix whitespace and line endings.
2017-09-29 15:58:36 -07:00
Bhaskar Muppana
91975244fe
Fixing OSX build.
2017-09-28 19:35:44 -07:00
Bhaskar Muppana
942c04e992
Merge pull request #162 from bmuppana/master
...
Fixing TimeKeeperCorrectness to deal with network delays.
2017-09-28 17:04:39 -07:00
Bhaskar Muppana
3d2bafc3a6
Fixing TimeKeeperCorrectness to deal with network delays.
2017-09-28 16:52:28 -07:00
Evan Tschannen
ef41b07bb3
renamed past_version to transaction_too_old
...
implemented read_lock_aware option
2017-09-28 16:35:08 -07:00
Evan Tschannen
7b60e26660
Merge pull request #160 from cie/use-error-descriptions
...
Add the ability to access name and description in Error. Update error…
2017-09-28 16:00:39 -07:00
A.J. Beamon
4f97bd44a5
If we fail to get the interface name due to a platform error, don't kill the process. Instead, just leave the network counters alone. Change the GetInterfaceAddrs trace event to SevWarnAlways.
2017-09-28 13:32:39 -07:00
Evan Tschannen
5f4b997400
Emergency teams are bad for performance because we will route client read requests to servers that do not have the data, and therefore get many wrong shard server errors. Emergency teams only protect us from data loss in very rare scenarios. We may want to add them again in the future, but we must make sure load balancing knows which storage servers used to be destinations so that it only routes to them as a last resort.
2017-09-28 13:20:01 -07:00
Evan Tschannen
73fca75239
added the ability to disable timeKeeper; disabled timeKeeper before consistency check in simulation
2017-09-28 13:13:24 -07:00
A.J. Beamon
67d0eb5d66
Change a few more error descriptions; update sphinx error code documentation
2017-09-28 13:03:17 -07:00
A.J. Beamon
d30c730f75
Add the ability to access name and description in Error. Update error descriptions.
2017-09-28 12:35:03 -07:00
Bhaskar Muppana
0f8ff26029
Merge pull request #158 from bmuppana/master
...
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:42 -07:00
Bhaskar Muppana
6a0b1d6808
Fixing PR comments
...
<rdar://problem/34557380> Need a way to map real time to version
2017-09-27 17:56:01 -07:00
A.J. Beamon
91281ec754
Don't use SetByteArrayRegion for get range results.
2017-09-27 13:41:06 -07:00
Evan Tschannen
4b21da1cd6
fix: lastVersionWithData was not updated when fetchKeys injects mutations
2017-09-27 10:44:34 -07:00
Stephen Atherton
cfb0cc4c3b
Merge branch 'release-5.0'
2017-09-26 22:13:23 -07:00
Stephen Atherton
333fb65a91
FDBMonitor now supports "flag_X=<true/false>" in all sections, with the usual inheritance, in order to enable (true) or disable (false) the passing of a parameterless command-line option named X as "--X".
2017-09-26 22:13:01 -07:00
Alvin Moore
298b54104e
Merge branch 'release-5.0'
2017-09-26 11:16:14 -07:00
Alvin Moore
02525d7b14
Added TESTs to ensure that all of the different kills are performed during simulation
2017-09-26 11:15:39 -07:00
Yichi Chiang
4ce60c4276
Merge pull request #159 from cie/add-locality-to-backup
...
Add locality to backup agent and DR agent
2017-09-26 10:20:32 -07:00
Yichi Chiang
5e9c6d6b64
Add locality to backup agent and DR agent
2017-09-26 10:19:26 -07:00
Evan Tschannen
acb7e66d01
fix: failed logs do not count even if they have returned a result
2017-09-25 18:14:40 -07:00
Evan Tschannen
2bf042a559
fix: file_corrupt was not checking for fault injection
...
latency threshold was too long
2017-09-25 17:22:41 -07:00
A.J. Beamon
e5e7f8a081
When using setKey() on Standalone<KeySelectorRef> in RYW, make sure that the key is part of the key selector's arena.
2017-09-25 15:52:45 -07:00
Bhaskar Muppana
0bf5bdb23a
<rdar://problem/34557380> Need a way to map real time to version
2017-09-25 12:51:37 -07:00
Evan Tschannen
cce4eeb52d
fix: the master was sending the cluster controller uninitialized configurations
2017-09-22 16:59:24 -07:00
Evan Tschannen
180438d41e
fix: use the number of present logServers rather than the total size of the vector
2017-09-22 16:19:16 -07:00
Evan Tschannen
7081136f74
added a fix
2017-09-22 15:08:14 -07:00
Evan Tschannen
738ae21c3a
fix: an optimization in buggified locking can cause recovery to break because it would not restart if a locked process was killed when the remaining logs cannot obtain a quorum
2017-09-22 15:07:57 -07:00
Evan Tschannen
fba78ce4ef
refactored monitor leader again to be even safer.
...
fixed a problem where we would write the header to clusters files twice
added extra logging in monitor leader
2017-09-22 15:06:11 -07:00
Alex Miller
585c9bf68f
Quick fix to reduce CPU usage of ensureEpochLive.
...
It is suspected that policy recomputations are driving proxy CPU usage up, and
thus latency and throughput down. To quickly confirm this theory, we're
forcing ensureEpochLive to wait until it has RF responses, which means we'll
probably only validate the policy once per call.
2017-09-21 18:22:24 -07:00
Evan Tschannen
4809bd8f62
fix: We cannot inject faults after renaming the file, because we could end up with two asyncFileNonDurable open for the same file
2017-09-21 18:11:18 -07:00
A.J. Beamon
995587b12b
Merge branch 'release-5.0'
2017-09-21 13:32:12 -07:00
Evan Tschannen
a9e3ae40d6
refactored monitorLeader to avoid the risk of one generation of coordinators interfering with the next
2017-09-20 17:42:12 -07:00
Evan Tschannen
53a4a3280a
fix: we cannot add to the trLog when cancelled
2017-09-20 14:47:57 -07:00
Evan Tschannen
c3f77ebbd2
Merge branch 'master' of github.com:apple/foundationdb
2017-09-20 11:48:35 -07:00
Evan Tschannen
fbd67ea547
fix: excluded servers are worst fit for master rather than never assign (so that we can recover if every process has been excluded)
...
fix: better master exists did not use exclusions because the configuration was reset
2017-09-20 11:48:26 -07:00
Ben Collins
21688afeb3
Merge pull request #155 from cie/feature-jni-no-memcpy
...
Fix possible leaks, move to SetByteArrayRegion()
2017-09-20 11:01:29 -07:00
A.J. Beamon
da9b56e1ef
More use of SetByteArrayRegion and various memory management fixes.
2017-09-20 10:31:25 -07:00
Balachandar Namasivayam
24aa616a7a
Merge pull request #154 from cie/additional-client-profiling
...
Additional client profiling
2017-09-19 18:15:02 -07:00
Evan Tschannen
cb43563b2d
fix: toMap properly lists the redundancy mode of the cluster
2017-09-19 16:35:42 -07:00
Ben Collins
8c13f60625
Update tuple.md
2017-09-19 22:41:55 +00:00
Evan Tschannen
f75dfc3153
do not register with the master until recovery of the queue is complete, to avoid having the master wait a long time for a peek response
2017-09-18 17:39:12 -07:00