Evan Tschannen
30710f7493
syncLogId was not necessary
2018-01-06 14:52:39 -08:00
Evan Tschannen
3ec45d38a0
Merge branch 'master' into feature-remote-logs
...
# Conflicts:
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-01-06 13:54:45 -08:00
Evan Tschannen
11ade6b6c5
Merge commit '6a756891a4cd9fd1980cecd0890b4a4f17840c5f'
2018-01-06 13:51:06 -08:00
Evan Tschannen
14e956c4fc
fix: getRange did not check for endpoint failures
...
refactored checking endpoint failures
2018-01-06 13:49:47 -08:00
Evan Tschannen
10c3fc165e
fix: after recovering from disk, only allow peeking data that was fully recovered
2018-01-06 13:49:13 -08:00
Stephen Atherton
96cb06cbc7
Bug fixes. Fdbbackup delete was broken. Blobstore backup container deletion would do too much listing before deletions began due to list operations queueing up ahead of and starving the delete operations. Created new knob and blob endpoint limit for concurrent list operations to fix this. Increased blob request timeout default because some requests were taking longer. Crash fixes in blobstore doRequest() which wasn't checking that response object is valid before using it in error conditions. Filesystem-like backup container class (covering blobstore and local dirs) now ignores unrecognized filenames for describe() and expire() operations.
2018-01-05 23:06:39 -08:00
Stephen Atherton
b86f68ceb8
Added new test that combines atomic backup/restore. Added randomization to delays in AtomicRestore workload.
2018-01-05 14:43:21 -08:00
Stephen Atherton
236799c77f
Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1
2018-01-05 14:38:06 -08:00
A.J. Beamon
1b8cae5f8e
Merge branch 'release-5.1'
2018-01-05 14:24:15 -08:00
Stephen Atherton
cbecc44ee2
Merge pull request #231 from cie/fdbmonitor-deconfigure-on-delete
...
Fdbmonitor deconfigure on delete
2018-01-05 14:23:34 -08:00
Alex Miller
f021934792
Fix yet another VersionStamp DR bug.
...
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.
To simplify the problem, if we have two concurrent transaction loops:
    retry {
        if (rand() > .5) tr->set('x', rand());
        if (rand() > .5) tr->set('y', rand());
    }
and
    retry {
        x = tr->get('x');
        y = tr->get('y');
        if (x > y) {
            tr->set('y', x);
        }
        tr->commit();
    }
It is not guaranteed that x <= y in the database after the second transaction
commits. This is because it could read an older snapshot of x and y, in which
x was not greater than y, and thus not invoke set. This means that `tr` is now a
read-only transaction, which no-ops out of committing as an "optimization". If
we add any write conflict range to `tr`, it will then be conflict checked and
committed, which would guarantee that x <= y when it commits.
Replace the first transaction with dumpData, and the second with version
upgrade transaction, and you have the bug that we're fixing, why, and how.
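
The failure mode described above can be reproduced in a toy, single-process
model of optimistic concurrency control. This is only a sketch under stated
assumptions: ToyDatabase, ToyTransaction, addWriteConflictRange() and
invariantHoldsAfterUpgrade() are hypothetical names invented for illustration
and are not FoundationDB APIs, and the concurrent writer is simulated by
committing a second transaction between the reads and the commit of `tr`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical toy model, not FoundationDB code.
struct ToyDatabase {
    long version = 0;                              // last committed version
    std::map<std::string, long> values;            // current value of each key
    std::map<std::string, long> lastWriteVersion;  // version of each key's last write
};

struct ToyTransaction {
    ToyDatabase& db;
    long readVersion;                      // version of the snapshot we read from
    std::map<std::string, long> snapshot;  // values as of readVersion
    std::set<std::string> readKeys;        // read conflict set
    std::map<std::string, long> writes;    // buffered writes
    bool forceConflictCheck = false;       // models a write conflict range with no mutation

    explicit ToyTransaction(ToyDatabase& d)
        : db(d), readVersion(d.version), snapshot(d.values) {}

    long get(const std::string& key) {
        readKeys.insert(key);
        auto it = snapshot.find(key);
        return it == snapshot.end() ? 0 : it->second;
    }
    void set(const std::string& key, long value) { writes[key] = value; }
    void addWriteConflictRange() { forceConflictCheck = true; }

    // Returns true on success, false if the transaction must be retried.
    bool commit() {
        // The "optimization": a read-only transaction is never conflict checked.
        if (writes.empty() && !forceConflictCheck) return true;
        for (const auto& key : readKeys) {
            auto it = db.lastWriteVersion.find(key);
            if (it != db.lastWriteVersion.end() && it->second > readVersion) return false;
        }
        ++db.version;
        for (const auto& kv : writes) {
            db.values[kv.first] = kv.second;
            db.lastWriteVersion[kv.first] = db.version;
        }
        return true;
    }
};

// Runs the second retry loop while a concurrent writer raises x once, after
// `tr` has read its snapshot. Returns true if y >= x holds afterwards.
bool invariantHoldsAfterUpgrade(bool addConflictRange) {
    ToyDatabase db;
    {
        ToyTransaction init(db);
        init.set("x", 1);
        init.set("y", 2);
        init.commit();
    }
    bool interfered = false;
    while (true) {
        ToyTransaction tr(db);
        long x = tr.get("x");
        long y = tr.get("y");
        if (x > y) tr.set("y", x);
        if (addConflictRange) tr.addWriteConflictRange();
        if (!interfered) {  // simulated concurrent commit of x = 5
            interfered = true;
            ToyTransaction other(db);
            other.set("x", 5);
            other.commit();
        }
        if (tr.commit()) break;
    }
    return db.values["y"] >= db.values["x"];
}
```

In this model, invariantHoldsAfterUpgrade(false) leaves x = 5 and y = 2 because
the read-only commit is a no-op, while invariantHoldsAfterUpgrade(true) forces
a conflict, retries against the newer snapshot, and writes y = 5.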
2018-01-05 14:23:11 -08:00
A.J. Beamon
27e0cdc0f4
Merge branch 'release-5.1'
2018-01-05 14:18:36 -08:00
Evan Tschannen
63751fb0e2
fix: remote logs are not in the log system until the recovery is complete so they cannot be used to determine if this is the correct log system to recover from
2018-01-05 14:15:25 -08:00
Stephen Atherton
cbeff0f789
Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1
2018-01-05 14:13:27 -08:00
Stephen Atherton
2763713cbc
Bug fix, backup snapshot dispatch was calculating that all shards must be done immediately.
2018-01-05 14:12:00 -08:00
Alec Grieser
70cc02d572
Merge pull request #230 from cie/vexillographer-binding-specific-disables
...
Vexillographer binding specific disables
2018-01-05 13:26:12 -08:00
A.J. Beamon
30067d2f53
Whitespace fixes and removal of change to java's AbstractTester
2018-01-05 13:21:54 -08:00
A.J. Beamon
9f2e6bfbd1
Merge branch 'release-5.1' into vexillographer-binding-specific-disables
...
# Conflicts:
# fdbclient/vexillographer/fdb.options
2018-01-05 13:16:41 -08:00
Stephen Atherton
097cd01968
Merge pull request #213 from cie/fdbmonitor-deconfigure-on-delete
...
fdbmonitor now deconfigures processes when the config file is deleted.
2018-01-05 13:10:19 -08:00
Evan Tschannen
d10d42a015
Merge pull request #229 from cie/status-generalize-incorrect-cluster-file
...
Status generalize incorrect cluster file
2018-01-05 12:42:50 -08:00
Evan Tschannen
2a869a4178
fixed merge errors
2018-01-05 12:17:17 -08:00
Evan Tschannen
5ac4f73978
Merge branch 'release-5.1' into feature-remote-logs
...
# Conflicts:
# fdbclient/NativeAPI.actor.cpp
# fdbrpc/Locality.h
# fdbrpc/simulator.h
# fdbserver/ApplyMetadataMutation.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/LogSystemPeekCursor.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/SimulatedCluster.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WorkerInterface.h
# fdbserver/masterserver.actor.cpp
# flow/Net2.actor.cpp
# tests/fast/SidebandWithStatus.txt
# tests/rare/LargeApiCorrectnessStatus.txt
# tests/slow/DDBalanceAndRemoveStatus.txt
2018-01-05 11:33:42 -08:00
A.J. Beamon
5015119115
Generalize the message that gets displayed in status if a cluster file's contents are incorrect.
2018-01-05 10:29:47 -08:00
Stephen Atherton
0f20068e82
Renamed all TaskBucket backup tasks to more appropriate names. Created the ability to make task aliases and used this to direct old task names to a task definition which will abort backups created before version 5.1.
2018-01-04 22:53:31 -08:00
Alex Miller
b264a98aea
Fix yet another VersionStamp DR bug.
...
In this episode, we discover that having a transaction retry loop in which the
transaction conditionally has write conflict ranges is potentially troublesome.
To simplify the problem, if we have two concurrent transaction loops:
    retry {
        if (rand() > .5) tr->set('x', rand());
        if (rand() > .5) tr->set('y', rand());
    }
and
    retry {
        x = tr->get('x');
        y = tr->get('y');
        if (x > y) {
            tr->set('y', x);
        }
        tr->commit();
    }
It is not guaranteed that x <= y in the database after the second transaction
commits. This is because it could read an older snapshot of x and y, in which
x was not greater than y, and thus not invoke set. This means that `tr` is now a
read-only transaction, which no-ops out of committing as an "optimization". If
we add any write conflict range to `tr`, it will then be conflict checked and
committed, which would guarantee that x <= y when it commits.
Replace the first transaction with dumpData, and the second with version
upgrade transaction, and you have the bug that we're fixing, why, and how.
2018-01-04 17:29:43 -08:00
Evan Tschannen
e11f461cbd
fix: better master exists needs to check master fitness before tlogs or proxies because that is the order of recruitment
2018-01-04 15:19:46 -08:00
A.J. Beamon
653a46f12f
Update error string for cluster_version_changed error
2018-01-04 15:06:09 -08:00
A.J. Beamon
7d6c93122f
Merge branch 'release-5.1'
2018-01-04 11:41:29 -08:00
Evan Tschannen
f8f1c48d83
sometimes test pausing backups
2018-01-04 11:40:08 -08:00
Stephen Atherton
529716972d
Merge branch 'release-5.1' of github.com:apple/foundationdb into release-5.1
2018-01-04 11:38:41 -08:00
Evan Tschannen
f2c4beed9f
fix: tlogFitness did not consider it better to have one tlog of a better fitness
...
fix: checkStable was not used in all places in better master exists
fix: we need to call checkOutstanding on worker registration in all cases
fix: in case persistentData is keyValueStoreMemory, we need to make sure it is fully recovered before writing to it
2018-01-04 11:33:02 -08:00
A.J. Beamon
8891804b1b
Add some comments to watch setting in fdbmonitor.
2018-01-04 11:31:01 -08:00
A.J. Beamon
c0d37864cf
Remove the watch on parent of missing directory if we detect that the directory has just shown up.
2018-01-04 11:30:41 -08:00
A.J. Beamon
d52280b628
Merge branch 'release-5.1' into fdbmonitor-deconfigure-on-delete
...
# Conflicts:
# fdbmonitor/fdbmonitor.cpp
2018-01-04 11:23:35 -08:00
A.J. Beamon
6f452ae9fa
Remove unused variable
2018-01-04 10:30:02 -08:00
Stephen Atherton
d43e80cf48
Bug fix, atomicRestore would fail to get a commit version after a commit_unknown_result where the transaction actually was committed. This would cause the restore to target version -1 so it would use one of the first available restorable versions in the backup instead of the version at which the database was locked.
2018-01-03 16:24:02 -08:00
Stephen Atherton
cec9f4d7a4
Bug fix in DNS resolution. When the result is an error the result promise was being set twice.
2018-01-03 13:05:38 -08:00
Stephen Atherton
fd3f3aa647
Increased system key size limit to fix some rare backup use cases.
2018-01-03 12:05:12 -08:00
Stephen Atherton
96c479dc71
Rare bug fix. Backup log files must be written with unique names. Otherwise, if a log file of more than one block is rewritten after a restore has begun, the restore could read some blocks from before the rewrite and some from after; because content ordering is random, the result would be incorrect and produce a corrupt restore. This bug is very rare because restore would detect an error unless the rewritten log file has exactly the same size as the original file, which is unlikely because the random content order affects block padding and therefore the usable content bytes per block.
2018-01-02 23:38:01 -08:00
Stephen Atherton
371dee70e6
Improved backup folder structure to be shallower but spread files more uniformly and make each folder's entries lexically sort into version order regardless of numeric length. Improved backup container test to use a random version multiplier on the file versions created in order to test a wide range of versioned folder paths.
2018-01-02 23:22:35 -08:00
Stephen Atherton
78430425e8
Blob bucket listings will now use parallel recursive requests on CommonPrefixes, up to a max depth, if a delimiter is provided.
2018-01-02 23:17:52 -08:00
Stephen Atherton
07fde9dfb4
Bug fix, error code 429 was not being treated as retryable in the recent refactor.
2018-01-02 23:15:25 -08:00
Evan Tschannen
6d5dd9bd27
fix: we cannot pipeline disk queue commits until after the first commit is successful
2018-01-02 13:30:27 -08:00
Evan Tschannen
e648ddc3fe
Merge branch 'release-5.1'
2018-01-02 11:30:08 -08:00
Evan Tschannen
b5ed01c053
updated version to 5.2
2018-01-02 11:28:30 -08:00
Evan Tschannen
3e4d968308
Merge commit 'f3a3799b1d7af148949cb50c4fc349494976a6f8' into release-5.1
2018-01-02 11:25:46 -08:00
Stephen Atherton
f324afc13f
Bug fix in blob store listing when it requires multiple serial requests. Added more trace events to FileBackup and BlobStoreEndpoint with suppression and added suppression to existing trace events.
2017-12-22 17:08:25 -08:00
Stephen Atherton
f2524ffd33
AsyncFileBlobStoreWrite was prohibiting the writing of 0-byte files. Improved HTTP verbose logging to stdout. Added writing a 0-byte file to BackupContainer unit test. Added backup log and snapshot sizes to backup description.
2017-12-21 21:15:26 -08:00
Stephen Atherton
aa8b4c52d5
Removed backup URL from trace events.
2017-12-21 18:22:14 -08:00
Stephen Atherton
93e426ccd2
Merge branch 'master' of github.com:apple/foundationdb
2017-12-21 17:21:20 -08:00