576 Commits

Author SHA1 Message Date
A.J. Beamon 04d1217941 Track statistics about server-side request latency on each process, to include min, max, mean, and various percentiles. 2020-07-09 16:39:15 -07:00
sfc-gh-tclinkenbeard 99bf993815 Replace BOOST_NOEXCEPT with noexcept 2020-06-09 22:39:19 -07:00
sfc-gh-ngoyal 693d9e8b89 Merge branch 'master' into fdb_cache_wo_allocator 2020-06-09 15:09:58 -07:00
negoyal cf13e00a8f Merge remote-tracking branch 'origin/release-6.3' into fdb_cache_wo_allocator 2020-06-01 17:38:31 -07:00
Meng Xu 1c35ad884f Merge branch 'master' into mengxu/release-6.3-conflict-PR
Has conflict with master;
Next commit will fix the conflicts.
2020-05-25 12:01:49 -07:00
Evan Tschannen ee6ff80064 another compile fix 2020-05-22 17:26:22 -07:00
Evan Tschannen ced65cd30b finished explicitly versioning everything stored in the database 2020-05-22 17:14:21 -07:00
A.J. Beamon 7a09d016a6 Merge branch 'release-6.3' into merge-release6.3-into-master 2020-05-19 12:52:44 -07:00
Markus Pilman c2bc75516f Merge branch 'release-6.3' of github.com:apple/foundationdb into features/trace-roles 2020-05-14 10:34:53 -07:00
Alvin Moore a160f9199f Merge pull request #3171 from apple/release-6.3
Merge Release 6.3 into Master
2020-05-14 10:00:47 -07:00
Alex Miller bf6d056095 Changing the last suggestions from review. 2020-05-13 18:48:43 -07:00
Alex Miller ccaac162e2 Resolve performance concerns of nearly-no-op debugMutation being frequently called
This introduces unhygienic macro variants that inline a `ENABLED &&`
before the TraceEvent.  This way, they get entirely compiled out unless
enabled.

Then rewrite all debugMutation uses via sed.
2020-05-13 18:44:15 -07:00
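
A minimal sketch of the macro pattern that commit ccaac162e2 describes, under stated assumptions: `MUTATION_TRACKING_ENABLED`, `DEBUG_MUTATION_SKETCH`, and `FakeTraceEvent` are hypothetical stand-ins, not FoundationDB's real definitions. The macro expands to `ENABLED && <trace expression>`, so with the flag at 0 the right-hand side is never evaluated and the optimizer can drop it.

```cpp
#include <string>

// Hypothetical compile-time switch; the real flag name may differ.
#define MUTATION_TRACKING_ENABLED 0

// Counts constructed trace objects so the short-circuit is observable.
inline int& traceEventsConstructed() {
    static int count = 0;
    return count;
}

// Stand-in for a TraceEvent-like class (an assumption, not the FDB type).
struct FakeTraceEvent {
    explicit FakeTraceEvent(const std::string&) { ++traceEventsConstructed(); }
    FakeTraceEvent& detail(const std::string&, long long) { return *this; }
    // Allows the event to appear as the right-hand operand of `&&`.
    explicit operator bool() const { return true; }
};

// "Unhygienic": the macro expands to an expression fragment that starts
// with the enable flag, rather than to a self-contained statement.
#define DEBUG_MUTATION_SKETCH(name, version) \
    MUTATION_TRACKING_ENABLED && FakeTraceEvent(name).detail("Version", version)

// Returns 1 if a trace event was emitted, 0 if the call was skipped.
inline int traceOneMutation(long long version) {
    bool logged = DEBUG_MUTATION_SKETCH("MutationTracking", version);
    return logged ? 1 : 0;
}
```

With the flag set to 0, `traceOneMutation` returns 0 and no `FakeTraceEvent` is ever constructed. Setting it to 1 re-enables the trace without touching any call site.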
Alex Miller 27da91ab9e Merge remote-tracking branch 'upstream/master' into mutation-debugging 2020-05-13 12:51:44 -07:00
Alex Miller f148412a32 Make UPDATE_STORAGE_BYTE_LIMIT the reference spill variety.
Which is unrelated, but a change I was supposed to do a while ago and
forgot.
2020-05-12 16:59:20 -07:00
Markus Pilman 5f9b127e56 Emit traces regularly about role assignment
We are currently emitting Role transition traces when a role starts and
when it ends. While this is useful for debugging, it doesn't work well
with tools that inject data and might potentially miss some trace lines.

We do decorate each trace line with the roles assigned to that
particular process, however, this is not sufficient for tools that can
make use of the UID -> Role mapping
2020-05-08 16:27:57 -07:00
negoyal dd033736ed Merge branch 'master' into fdb_cache_subfeature2 2020-05-04 17:29:43 -07:00
Evan Tschannen 7cebe743f9 A number of bug fixes of rare correctness errors 2020-04-29 13:50:13 -07:00
Evan Tschannen c87aa33941 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	bindings/go/src/fdb/generated.go
#	documentation/sphinx/source/api-common.rst.inc
#	documentation/sphinx/source/api-ruby.rst
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/FailureMonitorClient.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/vexillographer/fdb.options
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/OldTLogServer_6_0.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	versions.target
2020-04-23 13:47:53 -07:00
Evan Tschannen 0a1b2a572f more compile fixes 2020-04-22 14:41:17 -07:00
Evan Tschannen 68906bf3c3 fix compile errors 2020-04-22 14:36:41 -07:00
Evan Tschannen d0cc2a1ee4 added logging for parallel peeks on TLogs 2020-04-22 14:24:45 -07:00
Alex Miller 122762cce1 Add debugMessagesAndTags, and track mutations in more places.
Like:
* Leaving the proxy
* Entering the TLog
* Leaving the TLog
* Being read on a cursor

All of this brought to you by TagsAndMessage!

This also slides in a minor optimization as to how mutations are serialized per target log.
2020-03-27 03:31:04 -07:00
negoyal acaf91ac47 Merge branch 'master' into fdb_cache_subfeature2 2020-03-26 13:33:08 -07:00
negoyal 8abac91033 Fixed a bug in cache server while peeking at a version lower than popped version and added some logging. 2020-03-26 12:39:07 -07:00
Evan Tschannen e08f0201f1 merge release 6.2 into master 2020-03-17 12:51:47 -07:00
Evan Tschannen ea98c7a40a added additional timeout on initPersistentState 2020-03-16 11:38:14 -07:00
Evan Tschannen d6d347f665 treat a tlog which takes a long time to create its disk queue as failed 2020-03-13 10:31:59 -07:00
Evan Tschannen 96258b9809 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/QuietDatabase.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/StorageMetrics.actor.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KVStoreTest.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/genericactors.actor.cpp
#	flow/serialize.h
2020-02-21 19:09:16 -08:00
Evan Tschannen 8129f74a10 Merge pull request #2698 from etschannen/feature-recruit-delay
The CC waits until no new workers register before starting a bad recruitment
2020-02-20 14:42:37 -08:00
A.J. Beamon fcbdcda490 Merge pull request #2650 from ajbeamon/fix-reverse-range-read-byte-limit-bug
Fix reverse range read performance bug
2020-02-20 12:47:17 -08:00
Evan Tschannen fbd45963d8 The cluster controller waits until no new workers register for 1.0 before starting a bad recruitment 2020-02-19 16:48:30 -08:00
A.J. Beamon 1d9140d874 Removed TLogVersion logging.
Added logging of SharedTLog ID for each TLog.
Switched ID logged for TLogRejoining event to the TLog instead of the SharedTLog.
Made some parameters to startRole passed by reference.
2020-02-14 12:33:43 -08:00
A.J. Beamon 56053c565b Improve TLog "Role" event by adding the worker ID, the TLog version, and under what circumstances the TLog is being started (Restored, Recruited, or Recovered).
The SharedTLog role was being started and stopped twice, so remove one instance of it.
2020-02-12 15:11:38 -08:00
Markus Pilman e71fe44ee3 Merge branch 'master' into features/icc 2020-02-08 21:33:02 -08:00
A.J. Beamon df2b0452b4 Step 3 of fixing storage server range reads: change return type of readRange from VectorRef<KeyValueRef> to RangeResultRef. 2020-02-06 13:19:24 -08:00
mpilman d09e07f1f5 Merge remote-tracking branch 'upstream/master' into features/icc 2020-02-04 10:26:18 -08:00
Jingyu Zhou 7544ff88d9 Comment out frequent TLogPop trace event 2020-01-31 19:29:09 -08:00
Evan Tschannen 6c0b934dda Merge pull request #2242 from alexmiller-apple/fix-10min-stall-again
Fix the 10min multi-region recovery stall again
2020-01-23 17:53:02 -08:00
Jingyu Zhou 8b67a89eed More review comments fixed. 2020-01-22 19:42:13 -08:00
Jingyu Zhou 9d7a1a77d0 Small fixes. 2020-01-22 19:38:45 -08:00
Jingyu Zhou 85c4a4e422 Address review comments for PR #1625 2020-01-22 19:38:45 -08:00
Jingyu Zhou 73824faf65 Track pseudo tags popping for individual IDs
For each log router ID, we track the popped version of each pseudo tag so that
the popping only applied to the minimum of these versions.

Also add more tracing for popping and epochs.
2020-01-22 19:38:45 -08:00
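
The bookkeeping described in commit 73824faf65 can be sketched roughly as follows. This is an illustration only: `PseudoTagPopTracker` and `recordPop` are hypothetical names, pseudo tags are reduced to plain integer IDs, and one tracker instance is assumed per log router ID. Each pseudo tag remembers how far it has popped, and the version that is actually safe to pop is the minimum across all of them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Version = int64_t;

// Hypothetical per-log-router tracker of pseudo tag pops.
class PseudoTagPopTracker {
public:
    explicit PseudoTagPopTracker(const std::vector<int>& pseudoTagIds) {
        for (int id : pseudoTagIds) poppedByPseudoTag_[id] = 0;
    }

    // Records that `pseudoTagId` no longer needs data below `upTo`, and
    // returns the version that may really be popped: the minimum over all
    // pseudo tags, so no consumer loses data it still needs.
    Version recordPop(int pseudoTagId, Version upTo) {
        assert(poppedByPseudoTag_.count(pseudoTagId) == 1); // tag must be known
        Version& popped = poppedByPseudoTag_[pseudoTagId];
        popped = std::max(popped, upTo); // pops never move backwards
        Version safe = popped;
        for (const auto& entry : poppedByPseudoTag_)
            safe = std::min(safe, entry.second);
        return safe;
    }

private:
    std::map<int, Version> poppedByPseudoTag_;
};
```

With two pseudo tags, popping one of them to version 50 leaves the safe pop version at 0 until the other one also advances.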
Jingyu Zhou 11964733b7 WIP: should be divided into smaller commits. 2020-01-22 19:38:45 -08:00
Jingyu Zhou 03a17a30ef Refactor: check displacement in LogSystemConfig 2020-01-22 19:38:45 -08:00
Jingyu Zhou 442738b6db Small code refactoring 2020-01-22 19:35:30 -08:00
Jingyu Zhou 8221d33eb1 Use emplace_back instead of push_back for TLogServer 2020-01-22 19:35:30 -08:00
Alex Miller f0fe62a298 TLogs should not respond with data earlier than the begin version
The parallelPeekMore code would prefer the begin version it was sent by
the previous parallel peek over the request's begin version.  This means
that a merge cursor trying to advance past message versions would still
get old data that it would have to filter out.

A simple application of std::max fixes this.
2020-01-21 19:09:07 -08:00
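
The `std::max` fix mentioned in commit f0fe62a298 amounts to something like the sketch below. The helper name `effectivePeekBegin` and its parameters are assumptions made for illustration; the real change lives inside the TLog peek code.

```cpp
#include <algorithm>
#include <cstdint>

using Version = int64_t;

// Hypothetical helper: picks the version a parallel peek starts reading from.
// `requestBegin` is the begin version carried by the request itself;
// `previousEnd` is where the previous peek in the same sequence ended.
inline Version effectivePeekBegin(Version requestBegin, Version previousEnd) {
    // Never start earlier than the request asked for, even if the previous
    // peek in the sequence ended at an older version.
    return std::max(requestBegin, previousEnd);
}
```

A cursor that has already advanced to version 100 therefore gets data from 100 onward, even when the previous peek in the sequence ended at 40.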
Alex Miller 7798456201 Make TLogs have consistent parallel peek behavior.
TLogServer and LogRouter had some leftover code from me trying to be
more "correct" about parallel peek semantics, but those changes weren't
reflected in the OldTLog* files.  I've reverted the changes, as
realistically, they are more likely to waste CPU than improve TLog behavior.
2020-01-21 18:23:16 -08:00
Alex Miller ffc3506fff Continuing a parallel peek after a timeout would hang. 2020-01-21 17:12:18 -08:00
Alex Miller 9c47bbe460 Remove trackerData time bump
We're in an error handling case, so this shouldn't be considered
making forward progress.
2020-01-21 17:08:42 -08:00
Alex Miller 1cb311fcb8 Add an ASSERT_WE_THINK that peek cursors don't get timed_out()
This should prevent us from regressing and having multi-region
recoveries hang for 10min again.
2020-01-21 17:07:37 -08:00
Evan Tschannen 3f9d9d8b84 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	cmake/FlowCommands.cmake
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/StorageServerInterface.h
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/Knobs.h
#	flow/Platform.cpp
#	versions.target
2020-01-16 18:37:47 -08:00
Evan Tschannen 827cea74b5 fix: tlogs must send a recruitment reply even when actor cancelled or the recruitment endpoint will be marked as permanently failed 2020-01-16 17:37:17 -08:00
Alex Miller f58507c830 Rename poppedLocationForVersion -> versionForPoppedLocation 2019-12-19 10:24:31 -08:00
Alex Miller b5d82a74c3 Update fdbserver/TLogServer.actor.cpp
Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-12-19 10:20:52 -08:00
Alex Miller d8cbd495af Fix another pop + spill/dq-pop interleaving issue
This fixes an issue introduced in the previous patch, where pop would
immediately set `poppedLocationNeedsUpdate`, but setting the popped
version was now delayed.  This means that we could:

1. Run the spill loop and persist all popped versions
2. Receive a pop, and set the poppedLocationNeedsUpdate flag
3. Run the dq-pop loop, and clear the poppedLocationNeedsUpdate flag

and now when we update the persistentPopped version again, we won't have
the flag set for dq-pop to know that it needs to scan the spilled data
again for the minLocation.

We could more carefully update the flag, but instead, I've just
converted it into a version that's kept in sync purely in the dq-pop
loop, to remove shared state between pop and the dq-pop loop.
2019-12-17 23:15:48 -08:00
Alex Miller b36062a509 DiskQueue should only pop based off of persisted popped tag versions
This commit is to fix a bug where popping a tag between
updatePersistentData and popDiskQueue can cause the TLog to recover to
an incorrect understanding of what data it has available.

The following series of events need to happen to trigger this bug:

    Tag 1:1 is popped to version 10
    updatePersistentData is run...
      updatePersistentPopped runs and persistentData stores 1:1 as popped to 10
      A mutation is spilled for 1:1 at version 11 at location 1000
      A mutation is spilled for 1:1 at version 21 at location 5000
    updatePersistentData finishes and commits the btree changes
    Tag 1:1 is popped to version 20
    popDiskQueue runs
      The btree is read for spilled mutations with version >=20
      The minimum location required for the disk queue is found to be location 5000
      The disk queue is popped to location 5000

    The TLog crashes

    The worker restarts, and reloads the TLog files from disk
    restorePersistentPopped restores tag 1:1 as having been popped to version 10
    Parallel peeks are received for tag 1:1 starting at version 0
      The first peek is less than the popped version, so we respond with no data, and an end version of 10
      The second peek starts at version 10, which is greater than the popped version
      The btree is read for spilled mutations, and we find that there is a mutation at version 11 at location 1000
      Location 1000 is read in the DiskQueue

The resulting page read at Location 1000 was popped pre-crash, and thus
might either (a) be corrupt or (b) have an incorrect sequence number.

The fix to this is to force popDiskQueue/updatePoppedLocation to use the
popped version that was persisted to disk, and not the most recently
popped version for the given tag.

This bug doesn't manifest in simulation, because we don't have any code
that peeks at a lower version than what has been popped.
2019-12-17 23:02:37 -08:00
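
The event sequence in commit b36062a509 can be replayed with a small model. Everything here is a simplification for illustration: `SpilledMutation` and `minRequiredLocation` are hypothetical names, and the btree of spilled mutations is reduced to a vector.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// One spilled mutation: its version and its location in the disk queue.
struct SpilledMutation {
    int64_t version;
    int64_t location;
};

// Smallest disk queue location still needed when everything below
// `poppedVersion` may be discarded. Returns INT64_MAX if nothing is needed.
inline int64_t minRequiredLocation(const std::vector<SpilledMutation>& spilled,
                                   int64_t poppedVersion) {
    int64_t minLocation = std::numeric_limits<int64_t>::max();
    for (const auto& m : spilled) {
        if (m.version >= poppedVersion)
            minLocation = std::min(minLocation, m.location);
    }
    return minLocation;
}
```

For the mutations in the message (version 11 at location 1000, version 21 at location 5000), the in-memory popped version 20 yields location 5000, while the persisted popped version 10 yields location 1000. Popping the disk queue to 5000 discards a page that the restored state (popped to 10) still expects to read, which is why the fix uses the persisted value.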
Evan Tschannen ebcb2f79ed Merge branch 'master' of github.com:apple/foundationdb 2019-11-22 15:34:49 -08:00
Evan Tschannen 8d3ef89540 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/MutationList.h
#	fdbserver/MasterProxyServer.actor.cpp
#	versions.target
2019-11-14 15:49:56 -08:00
negoyal a4a0bf18f9 Merging with Master. 2019-11-12 13:01:29 -08:00
Evan Tschannen 396dccbc98 when peeking from satellites we do not need to limit the amount of peeking on log router tags, because that is the only thing that can be peeked from a satellite log 2019-11-08 18:34:05 -08:00
Evan Tschannen afc9713005 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/FDBTypes.h
#	fdbserver/LogSystem.h
#	fdbserver/LogSystemPeekCursor.actor.cpp
#	fdbserver/OldTLogServer_6_0.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	versions.target
2019-11-06 13:45:37 -08:00
Evan Tschannen a8ca47beff optimized memory allocations by using VectorRef<Tag> instead of std::vector<Tag> 2019-11-05 18:07:30 -08:00
Evan Tschannen 4de60fc437 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/TLogServer.actor.cpp
2019-11-01 15:48:04 -07:00
Evan Tschannen 85c315f684 Fix: parallelPeekMore was not enabled when peeking from log routers 2019-11-01 14:02:44 -07:00
Evan Tschannen 3325980c03 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer_6_0.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.actor.h
#	fdbserver/worker.actor.cpp
#	versions.target
2019-10-24 17:38:15 -07:00
Evan Tschannen 2722c8b188 avoid starting a new startSpillingActor with every TLog recruitment 2019-10-23 11:15:54 -07:00
Evan Tschannen e01e8371a6 Merge pull request #2256 from alexmiller-apple/spill-log-on-switch-6.2
Spill SharedTLog when there's more than one
2019-10-23 10:51:28 -07:00
Alex Miller 0c325c5351 Always check which SharedTLog is active
In case it is set before we get to the onChange()
2019-10-23 01:59:36 -07:00
Alex Miller 1e5b8c74e3 Continuing a parallel peek after a timeout would hang.
This is to guard against the case where

1. Peeks with sequence numbers 0-39 are submitted
2. A 15min pause happens, in which timeout removes the peek tracker data
3. Peeks with sequence numbers 40-59 are submitted, with the same peekId

The second round of peeks wouldn't have the data left that it's allowed
to start running peek 40 immediately, and thus would hang for 10min
until it gets cleaned up.

Also, guard against overflowing the sequence number.
2019-10-22 19:24:05 -07:00
Alex Miller 1eb3a70b96 Spill SharedTLog when there's more than one.
When switching between spill_type or log_version, a new instance of a
SharedTLog is created in the transaction log processes.  If this is done
in a saturated database, then doubling the amount of memory to hold
mutations in memory can cause TLogs to be uncomfortably close to the 8GB
OOM limit.

Instead, we now thread which UID of a SharedTLog is active, and the
other TLogs spill out the majority of their mutations.

This is a backport of #2213 (fef89aa1) to release-6.2
2019-10-17 01:24:50 -07:00
sramamoorthy c9097cca18 deprecate isTLogInSameNode used by snapshot V1 2019-10-09 15:33:11 -07:00
Alex Miller 77c72de176 Comment variable and code style fix
Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-07 18:08:27 -07:00
Alex Miller 71af24dff3 Fix a bug that would cause active logs to spill aggressively
And add some useful logging about when things do or do not spill.
2019-10-07 18:08:27 -07:00
Alex Miller 1d8a7e5af7 Spill SharedTLog when there's more than one.
When switching between spill_type or log_version, a new instance of a
SharedTLog is created in the transaction log processes.  If this is done
in a saturated database, then doubling the amount of memory to hold
mutations in memory can cause TLogs to be uncomfortably close to the 8GB
OOM limit.

Instead, we now thread which UID of a SharedTLog is active, and the
other TLogs spill out the majority of their mutations.
2019-10-07 18:08:27 -07:00
Alex Miller 5016f3fedd Whitespace fixes
no idea what happened here

Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-04 13:37:59 -07:00
Alex Miller 6bcb72fa74 Fix stray Unversioned()
I forgot there were two
2019-10-03 19:45:13 -07:00
Alex Miller 28f6275f94 Use AssumeVersion instead of Unversioned
Which lets us revert the unversioned serialization of TLogSpillType
2019-10-03 15:59:09 -07:00
Alex Miller 9401a6941a Code review nits
const correctness and file renaming in comment.

Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-03 15:53:39 -07:00
Alex Miller 6742222084 Make TLogServer able to spill by value and by reference
...and test it in simulation, but not combined yet.

It turns out that because of txsTag, we basically had to support
spill-by-value anyway.  Thus, if we treat all tags like txsTag when
spilling and peeking, then we have an easy way to bring the two spilling
types back into one implementation.
2019-10-03 01:45:10 -07:00
Alex Miller d38a96ab73 Make LogData aware of the spill type it was created to perform.
The spilling type is now pulled out of the request, and then stored on
LogData for later access, and persisted in the tlog metadata per tlog
generation.

It turns out that serializing types as Unversioned is a bit wonky.
2019-10-03 01:45:10 -07:00
Evan Tschannen b495cc697b Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-09-13 09:25:08 -07:00
Alex Miller 53bcf41805 Fix the build. 2019-09-12 18:46:30 -07:00
Alex Miller befa0646b3 Merge remote-tracking branch 'upstream/release-6.2' into faster-remote-dc 2019-09-12 18:46:03 -07:00
Evan Tschannen 6a7f109788 added logging on the TLog for the tag with smallest popped version 2019-09-12 16:22:01 -07:00
Alex Miller 99843bd4ba Add parallel peek support to log routers 2019-09-12 14:26:37 -07:00
Evan Tschannen 94668c6f1f Merge pull request #2063 from jzhou77/clang
Refactor deserialization of on-wire buffer with TagsAndMessage
2019-09-09 16:34:56 -07:00
Jingyu Zhou 2d5ebebb7b Use TagsAndMessage for deserialization in TLogServer 2019-09-05 16:53:10 -07:00
Jingyu Zhou 2723922f5f Replace -1 as VERSION_HEADER constant for serialization 2019-09-05 12:45:39 -07:00
Meng Xu c2355f721e Merge branch 'master' into mengxu/performant-restore-PR 2019-09-04 17:11:42 -07:00
Jingyu Zhou cd3f1e33d4 Refactor deserialization of TagsAndMessages
Consolidate deserialization of TagsAndMessages in the structure itself and
change both TLog and ServerPeekCursor to use it.
2019-09-04 14:55:05 -07:00
Evan Tschannen 24aad14f06 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	versions.target
2019-08-30 17:23:58 -07:00
Evan Tschannen dc1d055b27 Merge pull request #2042 from senthil-ram/snap_cli_fix
fix fdbcli --exec 'snapshot create.sh' failure
2019-08-30 13:40:38 -07:00
sramamoorthy b3277f2982 Fix #2009 posix compliant args for snapshot binary 2019-08-30 12:54:09 -07:00
Andrew Noyes 6aa0ada7b1 Replace scalar root types with proper messages 2019-08-28 14:40:50 -07:00
Jingyu Zhou 4a63de16e9 Merge pull request #1945 from xumengpanda/mengxu/tLog-code-read-v2
Add comments to DiskQueue and tLog
2019-08-08 13:24:32 -07:00
Meng Xu c9c50ceff8 Comments:Add comments to DiskQueue
No functional change.
2019-08-01 15:20:01 -07:00
Meng Xu 7ccaeddf05 Merge branch 'master' into mengxu/performant-restore-PR 2019-08-01 13:23:17 -07:00
Evan Tschannen 3774ff55b0 There were still use cases where these checks are necessary 2019-07-31 17:45:21 -07:00
Evan Tschannen 854ee75664 we no longer need to special case for txs tag, because it will be initialized by createTagData 2019-07-31 17:13:15 -07:00
Evan Tschannen ff171e293e fix: always make sure to add txsTags to localTags for remote logs 2019-07-31 16:04:35 -07:00
Evan Tschannen 9f11f2ec53 Merge branch 'master' of github.com:apple/foundationdb 2019-07-30 16:55:56 -07:00
Evan Tschannen aaeeb605b2 Changes to degraded can cause master recoveries, which are not supposed to happen when speedUpSimulation is true 2019-07-30 16:33:40 -07:00
Evan Tschannen 6977e7d2e8 do not return recovered version as popped for txsTags because it could cause recovery to start over
optimized how buffered peek cursor discards popped data
2019-07-30 12:21:48 -07:00
Evan Tschannen 13203da199 fix: do not set the popped version of txsTag because it could be copied over at the recoveredAt version 2019-07-27 22:36:06 -07:00
Evan Tschannen 28df2c35bb Merge pull request #1855 from alexmiller-apple/sharded-txs-safe-upgrade
Make sharded txsTag upgradeable and downgradeable
2019-07-26 13:29:39 -07:00
Meng Xu 1706aaf199 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflict in TLogServer.actor.cpp by accepting master changes
2019-07-26 11:46:27 -07:00
sramamoorthy 9afd162e2f remove snap v1 related code 2019-07-25 17:29:31 -07:00
Meng Xu 45083edf74 Merge branch 'master' into mengxu/performant-restore-PR
Fix conflicts as well.
2019-07-25 10:46:11 -07:00
sramamoorthy a65c9f92ed get rid of all timeouts and other changes 2019-07-24 15:36:28 -07:00
sramamoorthy a2f2ad96ff code review comments and merge to master changes 2019-07-24 15:36:28 -07:00
sramamoorthy 31c010b393 few minor fixes 2019-07-24 15:36:28 -07:00
sramamoorthy c73bdfad9f do not pop txsTag 2019-07-24 15:36:28 -07:00
sramamoorthy a335ed2011 includeCancelled for tLogSnapCreate 2019-07-24 15:36:28 -07:00
sramamoorthy 61cd690add enable/disable pop req with UID mis-match to fail 2019-07-24 15:36:28 -07:00
sramamoorthy f4e257e464 snap v2: TLog related changes 2019-07-24 15:36:28 -07:00
Evan Tschannen 6d694cc2ce Merge pull request #1818 from alexmiller-apple/peek-cursor-timeout-bug
Fix parallel peek stalling for 10min when a TLog generation is destroyed
2019-07-19 16:39:31 -07:00
Alex Miller 9863ace96c Replace usages with initialization lists.
But C++ needs a bit of help to infer through the templates.
2019-07-18 22:27:36 -07:00
Alex Miller 55258709a0 Remove an ASSERT from testing and now inaccurate comment. 2019-07-17 01:30:01 -07:00
Alex Miller e9684a1f63 Fix issues configuring from sharded txs tag to not
Which is an intermingling of what should be two commits:

1. Rely on TLogVersion instead of txsTags==0

2. Copy and index sharded txsTags between KCV and RV as txsTag when
configuring log_version 4->3.
2019-07-17 01:25:09 -07:00
Alex Miller 812ce37bcd Remove buggify and unneeded safeguards.
The buggify was actually incorrect and broke an invariant, which I then
fixed on the other side, but this work was actually unneeded in total.

The real issue being fixed was returnIfBlock not sending an error, as
well as the other error cases.
2019-07-16 15:58:02 -07:00
Alex Miller 4cc60dc9b8 Merge remote-tracking branch 'upstream/master' into peek-cursor-timeout-bug 2019-07-15 17:05:39 -07:00
Alex Miller 2cbc05fc72 Address more issues that cause peek cursors to time out.
There were error cases that would cause a peek to terminate early or be
cancelled without sending anything to the next peek in line.  We would
thus end up with the first peek in a sequence waiting on its future, and
nothing that exists that would send to that future.
2019-07-15 16:03:37 -07:00
Alex Miller c8e94e601a Merge pull request #1729 from etschannen/feature-fast-txs-recovery
Improve the recovery speed of the txnStateStore
2019-07-15 13:27:41 -07:00
Vishesh Yadav 2606794df6 Merge pull request #1812 from alexmiller-apple/improve-only-spilled
Improve the behavior of parallelPeekMore+onlySpilled.
2019-07-10 17:15:19 -07:00
Evan Tschannen d8948c8be1 Merge branch 'master' into feature-fast-txs-recovery
# Conflicts:
#	fdbserver/TagPartitionedLogSystem.actor.cpp
2019-07-10 13:59:52 -07:00
Evan Tschannen 49121172ea Merge pull request #1795 from alexmiller-apple/peek-from-satellites
Log Routers will prefer to peek from satellite logs.
2019-07-09 17:38:57 -07:00
Alex Miller fd769ad878 Fix parallel peek stalling for 10min when a TLog generation is destroyed.
`peekTracker` was held on the Shared TLog (TLogData), whereas peeks are
received and replied to as part of a TLog instance (LogData).  When a
peek was received on a TLog, it was registered into peekTracker along
with the ReplyPromise.  If the TLog was then removed as part of a
no-longer-needed generation of TLogs, there is nothing left to reply to
the request, but by holding onto the ReplyPromise in peekTracker, we
leave the remote end with an expectation that we will reply.  Then,
10min later, peekTrackerCleanup runs and finally times out the peek
cursor, thus preventing FDB from being completely stuck.

Now, each TLog generation has its own `peekTracker`, and when a TLog is
destroyed, it times out all of the pending peek cursors that are still
expecting a response.  This will then trigger the client to re-issue
them to the next generation of TLogs, thus removing the 10min gap to do
so.
2019-07-09 17:27:36 -07:00
Alex Miller 44f11702a8 Log Routers will prefer to peek from satellite logs.
Formerly, they would prefer to peek from the primary's logs.  Testing of
a failed region rejoining the cluster revealed that this becomes quite a
strain on the primary logs when extremely large volumes of peek requests
are coming from the Log Routers.  It happens that we have satellites
that contain the same mutations with Log Router tags, that have no other
peeking load, so we can prefer to use the satellite to peek rather than
the primary to distribute load across TLogs better.

Unfortunately, this revealed a latent bug in how tagged mutations in the
KnownCommittedVersion->RecoveryVersion gap were copied across
generations when the number of log router tags were decreased.
Satellite TLogs would be assigned log router tags using the
team-building based logic in getPushLocations(), whereas TLogs would
internally re-index tags according to tag.id%logRouterTags.  This
mismatch would mean that we could have:

    Log0 -2:0 ----- -2:0  Log 0

    Log1 -2:1 \
               >--- -2:1,-2:0 (-2:2 mod 2 becomes -2:0)  Log 1
    Log2 -2:2 /

And now we have data that's tagged as -2:0 on a TLog that's not the
preferred location for -2:0, and therefore a BestLocationOnly cursor
would miss the mutations.

This was never noticed before, as we never
used a satellite as a preferred location to peek from.  Merge cursors
always peek from all locations, and thus a peek for -2:0 that needed
data from the satellites would have gone to both TLogs and merged the
results.

We now take this mod-based re-indexing into account when assigning which
TLogs need to recover which tags from the previous generation, to make
sure that tag.id%logRouterTags always results in the assigned TLog being
the preferred location.

Unfortunately, previously existing clusters will potentially have existing
satellites with log router tags indexed incorrectly, so this transition
needs to be gated on a `log_version` transition.  Old LogSets will have
an old LogVersion, and we won't prefer the satellite for peeking.  Log
Sets post-6.2 (opt-in) or post-6.3 (default) will be indexed correctly,
and therefore we can safely offload peeking onto the satellites.
2019-07-08 22:25:01 -07:00
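
The mod-based re-indexing from commit 44f11702a8 (`tag.id%logRouterTags`) can be illustrated with the sketch below. `LogRouterTag` and `reindexForGeneration` are hypothetical names with simplified integer fields; they are not the FoundationDB `Tag` type.

```cpp
#include <cassert>

// Simplified stand-in for a tag written as locality:id, for example -2:2.
struct LogRouterTag {
    int locality;
    int id;
};

// Re-indexes a log router tag for a generation that has `logRouterTags`
// log router tags, mirroring tag.id % logRouterTags from the message.
inline LogRouterTag reindexForGeneration(LogRouterTag tag, int logRouterTags) {
    assert(logRouterTags > 0); // modulo by zero is undefined
    return LogRouterTag{ tag.locality, tag.id % logRouterTags };
}
```

Shrinking from three log router tags to two maps -2:2 onto -2:0, which matches the diagram in the message: data originally tagged -2:2 ends up tagged -2:0 on whichever TLog recovered it, so the assignment of tags to TLogs has to use the same modulo rule.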
Alex Miller 6c8f50ca66 Improve the behavior of parallelPeekMore+onlySpilled.
When onlySpilled transitions from true (don't peek memory) to false (do
peek memory) as part of a parallel peek, we'll end up wasting the rest
of the replies because we'll honor their onlySpilled=true setting and
thus not have any additional data to return.

Instead, we thread the onlySpilled back through in the same way that the
ending version of the last peek overrides the requested starting
version of the next peek.  This simulates the same behavior that the
client has, where the value of onlySpilled that we reply with comes back
in the next request.

I haven't actually seen it be a problem, but this should help make sure
the onlySpilled transition when catching up doesn't ever cause any ill
effects if a process starts riding the line between onlySpilled settings.
2019-07-08 22:13:09 -07:00
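
A rough sketch of the threading that commit 6c8f50ca66 describes, using hypothetical `PeekRequestSketch` and `PeekReplySketch` structs rather than the real request and reply types: the next peek in a parallel sequence takes both its starting version and its onlySpilled flag from the previous reply.

```cpp
#include <cstdint>

using Version = int64_t;

// Hypothetical, heavily reduced request and reply shapes.
struct PeekRequestSketch {
    Version begin;
    bool onlySpilled;
};

struct PeekReplySketch {
    Version end;
    bool onlySpilled;
};

// Applies the previous reply to the next queued request, the way a client
// would if it had waited for that reply before sending the next peek.
inline PeekRequestSketch carryForward(PeekRequestSketch next,
                                      const PeekReplySketch& previous) {
    next.begin = previous.end;               // continue where the last peek ended
    next.onlySpilled = previous.onlySpilled; // honor the transition to memory
    return next;
}
```

A queued request that still says onlySpilled=true is therefore switched to false as soon as the previous reply reports that spilled data has been exhausted.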
Evan Tschannen 15e894c724 Merge in master 2019-07-05 15:49:24 -07:00
Evan Tschannen 235697f688 fix: txsTags are not popped at the recovery version 2019-06-27 23:18:26 -07:00
Alex Miller bf883d7055 Merge remote-tracking branch 'upstream/master' into flowlock-api 2019-06-25 14:26:50 -07:00
Alex Miller 7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
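
The motivation in commit 7a500cd37f can be shown with a scoped enumeration. The enumerator names and values below are made up for illustration, as is the assumption that a larger value means higher priority.

```cpp
#include <type_traits>

// Hypothetical priorities; real names and values are not taken from the source.
enum class TaskPriority { Low = 1000, Default = 5000, High = 9000 };

// An API that takes TaskPriority cannot be called with a bare int by accident.
inline bool runsBefore(TaskPriority a, TaskPriority b) {
    // Assumption for this sketch: larger numeric value runs first.
    return static_cast<int>(a) > static_cast<int>(b);
}

// The compiler rejects implicit int -> TaskPriority conversions.
static_assert(!std::is_convertible<int, TaskPriority>::value,
              "plain ints must not convert to TaskPriority");
```

A call such as `runsBefore(9000, 1000)` fails to compile, whereas the same call against an `int`-based API would have been accepted silently.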
Evan Tschannen 1c005d5878 Merge pull request #1584 from alexmiller-apple/spilled-only-peek
Save TLog resources by letting peek request only spilled data.
2019-06-20 18:22:31 -07:00
mpilman 844dd60202 FDB compiling with intel compiler 2019-06-20 09:29:01 -07:00
Evan Tschannen e0be631414 shard the txs tag so that more transaction logs are involved in its recovery 2019-06-19 18:15:09 -07:00
mpilman 68ce9a5e75 ProtocolVersion type - second try 2019-06-18 17:55:27 -07:00
Alex Miller 51fd42a4d2 Merge remote-tracking branch 'upstream/master' into spilled-only-peek 2019-06-18 17:33:52 -07:00
mpilman 8576665a90 Revert "Revert "Make protocol version a type""
This reverts commit 455bf3b3ec.
2019-06-18 14:49:04 -07:00
Alex Miller 455bf3b3ec Revert "Make protocol version a type" 2019-06-18 10:59:17 -07:00
mpilman da53a92bec Make protocol version a type
This fixes #1214

The basic idea is that ProtocolVersion is now its own type. This
alone is an improvement as it makes many things more typesafe. For
each version, we can now add breaking features (for example Fearless).
After that, there's no need to test against actual (confusing) version
numbers. Instead a developer can simply test
`protocolVersion->hasFearless()` and this will return true iff the
protocolVersion is newer than the newest version that didn't support
fearless.
2019-06-16 09:59:15 -07:00
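
The idea in commit da53a92bec can be sketched as a small wrapper type. `ProtocolVersionSketch` and the threshold value 100 are placeholders invented for this example; only the `hasFearless()` query comes from the message.

```cpp
#include <cstdint>

// Hypothetical wrapper: a protocol version as its own type, not a bare integer.
class ProtocolVersionSketch {
    // Placeholder for the newest version that did not support Fearless.
    static constexpr uint64_t kLastVersionWithoutFearless = 100;

public:
    explicit constexpr ProtocolVersionSketch(uint64_t raw) : raw_(raw) {}

    constexpr uint64_t raw() const { return raw_; }

    // True iff this version is newer than the newest version without Fearless.
    constexpr bool hasFearless() const {
        return raw_ > kLastVersionWithoutFearless;
    }

private:
    uint64_t raw_;
};
```

Call sites then ask `version.hasFearless()` and never compare against a raw version number, so the numeric threshold lives in exactly one place.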
sramamoorthy 1190f2f33d rebased related changes 2019-05-28 22:07:46 -07:00
sramamoorthy b43c100e57 TLog bug fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 3877f87481 comment change in tLogCommit 2019-05-28 22:07:46 -07:00
sramamoorthy 31b6c86650 ignorePopDeadline to have high limit in simulator
- ignorePopDeadline to have higher limit in simulator
to accommodate the buggify delays and make snapshot succeed.

- introduce a new knob for auto resetting the disabling of tlog pop
2019-05-28 22:07:46 -07:00
sramamoorthy b1b96946af logData->stop check right after execOpHold wait 2019-05-28 22:07:46 -07:00
sramamoorthy 5749e220bd use FlowLock for implementing critical section
Instead of using Promises and future to implement
critical section use FlowLock
2019-05-28 22:07:46 -07:00
sramamoorthy e6c0b87a4d remove unused variable 2019-05-28 22:07:46 -07:00
sramamoorthy f27a40f118 execProcessingHelper made synchronous
tLogCommit expects no blocking between duplicate check and
setting of the new version, that constraint was broken
when asynchronous execProcessingHelper was introduced.
As a fix, execProcessingHelper was made synchronous.
2019-05-28 22:07:46 -07:00
sramamoorthy d3a179b6f9 Multiple bug fixes
- wait for snapTLogFailKeys in a loop, otherwise in some race
  condition it can cause a false assert
- in single region, there does not seem to be a guarantee of
  tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
  address is not sufficient in some cases, hence looking by both
  primary and secondary address
- test make files to reflect the location of the new test cases
2019-05-28 22:07:46 -07:00
sramamoorthy dcd2d96751 make spawnProcess predictable in the simulator 2019-05-28 22:07:46 -07:00
sramamoorthy 4083af0b01 Avoid using trackLatest for TLog pop test cases 2019-05-28 22:07:46 -07:00
sramamoorthy ec7834e2f7 code re-organization and address comments 2019-05-28 22:07:46 -07:00
sramamoorthy b6e037ffbc Replace fork with boost::process::child 2019-05-28 22:07:46 -07:00
sramamoorthy e91c76834e tlog: move snap create part to independent funcs 2019-05-28 22:07:46 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 9e3104c2d4 Fix: races in async exec leading to bad backup 2019-05-28 22:07:46 -07:00
sramamoorthy cfdad0c5e6 tlog to snapshot exactly at exec version 2019-05-28 22:07:46 -07:00
sramamoorthy 539e65efad Skip parsing mutations if it is tagged for TxsTag
In the TLog, if a mutation is targeted for TxsTag, then skip
parsing it.
2019-05-28 22:07:46 -07:00
sramamoorthy 17ecba8313 trace cleanup and other indentation changes 2019-05-28 22:07:46 -07:00
sramamoorthy aa79480d69 changes to make fdbfork asynchronous 2019-05-28 22:07:46 -07:00
sramamoorthy 4016f16c76 Fix a few compilation errors and bugs in rebase 2019-05-28 22:07:46 -07:00
sramamoorthy 3d5998e9dd tlog: when pops are disabled, store them & replay
In TLogs, pops are disabled while taking snapshots. Earlier, TLogs
ignored any pop requests that arrived while pops were
disabled. With this change, instead of ignoring a pop, the TLog remembers
the list of pops in memory and replays them once popping is
enabled.
2019-05-28 22:07:46 -07:00
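The store-and-replay behaviour described in 3d5998e9dd can be illustrated with a small sketch. Every name here (`PopTracker`, `disablePops`, `enablePops`, `deferredCount`) is hypothetical and chosen for illustration; the real TLog code is actor-based and considerably more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Tag = int;
using Version = std::int64_t;

// Hypothetical model of a TLog's pop handling around a snapshot.
class PopTracker {
public:
    void disablePops() { popsEnabled_ = false; }

    // Re-enable popping and replay every pop remembered while disabled.
    void enablePops() {
        popsEnabled_ = true;
        for (const auto& p : deferred_) apply(p.first, p.second);
        deferred_.clear();
    }

    // While pops are disabled, remember the request instead of dropping it.
    void pop(Tag tag, Version upTo) {
        if (!popsEnabled_) {
            deferred_.emplace_back(tag, upTo);
            return;
        }
        apply(tag, upTo);
    }

    // Highest version popped so far for a tag (0 if never popped).
    Version poppedVersion(Tag tag) const {
        auto it = popped_.find(tag);
        return it == popped_.end() ? 0 : it->second;
    }

    std::size_t deferredCount() const { return deferred_.size(); }

private:
    void apply(Tag tag, Version upTo) {
        assert(upTo >= 0);
        Version& current = popped_[tag];
        current = std::max(current, upTo);  // a pop never moves backwards
    }

    bool popsEnabled_ = true;
    std::map<Tag, Version> popped_;
    std::vector<std::pair<Tag, Version>> deferred_;
};
```

In this model a pop received during a snapshot has no visible effect until `enablePops()` runs, at which point the tag's popped version catches up to the highest deferred request.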
sramamoorthy 4bc4c615da exec op to all tlog, restore change in test &other
- exec operation to go to all the TLogs
- minor bug fix in tlog
- restore implementation for the simulator
- restore snap UID to be stored in restartInfo.ini
- test cases added
- indentation and trace file fixes
2019-05-28 22:07:46 -07:00
sramamoorthy 72dd067173 Trace message changes and fix few FIXMEs 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and restore implementation 2019-05-28 22:07:46 -07:00
A.J. Beamon f417e60264 Merge branch 'merge-release-6.1-into-master' into thread-safe-random-number-generation
# Conflicts:
#	fdbserver/QuietDatabase.actor.cpp
2019-05-23 09:52:00 -07:00
A.J. Beamon d29c7e4c9b Merge branch 'release-6.1' into merge-release-6.1-into-master
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/QuietDatabase.actor.cpp
#	versions.target
2019-05-23 09:28:45 -07:00
Evan Tschannen 003cc6be18 fix: nothingPersistent could be incorrect when popped is equal to persistentDataVersion 2019-05-22 20:23:35 -10:00
Evan Tschannen ee04c583fa fix: do not pop the disk queue past the persistentDataVersion 2019-05-21 10:40:30 -07:00
Evan Tschannen 4059d68348 fix: the tlog would not pop data from the disk queue after a storage server was removed, because the tag still exists in memory on the logs
fix: we could incorrectly make data durable if eraseMessagesFromMemory was in progress while running updatePersistentData
the quiet database check now ensures that tlogs have no more than 30 seconds of versions unpopped from the disk queue
2019-05-20 23:58:45 -07:00
Meng Xu 9ea83e0f3c FastRestore:Remove dbprintf 2019-05-17 17:34:42 -07:00
Alex Miller 4eb4c03ce5 Save TLog resources by letting peek request only spilled data.
If a peek is entirely fulfilled from spilled data, then it's likely that
the next peek will be also.  It is thus wasteful for each of these peeks
to call peekMessagesFromMemory, which memcpy's excessively, and then
throw all that data away without using it.

Now, TLogs will give a hint back to peek cursors about whether the provided
reply was served entirely from the spilled data, which peek cursors then
feed back as the hint into their next request.

At some point, a cursor will send a request for only spilled data, get
an incomplete response, and then be told to send its next request as one
that peeks from memory as well, and then it will fully catch up.
2019-05-14 15:38:48 -10:00
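The hint round trip described in 4eb4c03ce5 can be modelled with a toy TLog and cursor. This is only a sketch under simplifying assumptions: `ToyTLog`, `PeekReply`, `drain`, and the `memoryReads` counter are invented names, messages are reduced to bare version numbers, and a reply counts as "entirely spilled" when spilled data alone fills the reply limit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Version = std::int64_t;

struct PeekReply {
    std::vector<Version> versions;  // data returned by this peek
    Version end;                    // where the next peek should begin
    bool onlySpilled;               // hint: reply came entirely from spilled data
};

// Hypothetical TLog: older versions are spilled to disk, newer ones are in memory.
class ToyTLog {
public:
    ToyTLog(std::vector<Version> spilled, std::vector<Version> memory, std::size_t limit)
      : spilled_(std::move(spilled)), memory_(std::move(memory)), limit_(limit) {
        assert(limit_ > 0);
    }

    PeekReply peek(Version begin, bool onlySpilled) {
        PeekReply reply{{}, begin, false};
        collect(spilled_, begin, reply.versions);
        if (reply.versions.size() == limit_) {
            reply.onlySpilled = true;  // the next peek can likely skip memory too
        } else if (!onlySpilled) {
            ++memoryReads;             // the expensive path the hint tries to avoid
            collect(memory_, begin, reply.versions);
        }
        // An incomplete onlySpilled reply leaves the hint false, so the cursor
        // asks for memory as well on its next request.
        if (!reply.versions.empty()) reply.end = reply.versions.back() + 1;
        return reply;
    }

    int memoryReads = 0;

private:
    void collect(const std::vector<Version>& from, Version begin,
                 std::vector<Version>& out) const {
        for (Version v : from) {
            if (out.size() == limit_) break;
            if (v >= begin) out.push_back(v);
        }
    }

    std::vector<Version> spilled_;
    std::vector<Version> memory_;
    std::size_t limit_;
};

// Cursor loop: feed each reply's hint back into the next request.
std::vector<Version> drain(ToyTLog& log) {
    std::vector<Version> out;
    Version begin = 0;
    bool onlySpilled = false;
    for (;;) {
        PeekReply r = log.peek(begin, onlySpilled);
        out.insert(out.end(), r.versions.begin(), r.versions.end());
        if (r.versions.empty() && !onlySpilled) break;  // a full peek found nothing
        begin = r.end;
        onlySpilled = r.onlySpilled;
    }
    return out;
}
```

With five spilled versions, two in-memory versions, and a limit of two per reply, the cursor in this model touches memory only after the spilled data is exhausted, rather than on every peek.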
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Evan Tschannen 22499666d0 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/LogRouter.actor.cpp
#	flow/Trace.cpp
#	versions.target
2019-05-08 18:19:35 -07:00
Evan Tschannen 93eb2a9395
Merge pull request #1527 from alexmiller-apple/tstlog-6.1
Spill-by-reference knob + TLog6.0 Spilled Peek deprioritization
2019-05-03 17:19:45 -07:00
Alex Miller c918b21137 Deprioritize spilled peeks in spill-by-value, and improve its logic.
This deprioritizes before calling peekMessagesFromMemory, which should
improve the memory usage of the TLog, and makes sure to keep txsTag
peeks at a high priority to help recoveries stay fast.
2019-05-03 15:27:11 -07:00
Alex Miller 4052f3826a Add a knob to limit the number of commits indexed per key.
Theoretically, we could spill 20MB of 22B mutations for one key, which
would generate a very long value being stored in SQLite, and very
inefficiently read back.  This stops that from being a problem, at the
cost of some extra write calls.
2019-05-03 15:27:10 -07:00
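The trade-off in 4052f3826a (shorter stored values at the cost of more writes) can be sketched as a simple chunking step. The names `SpilledRef` and `splitIntoIndexValues`, and the idea that each chunk maps to one stored value, are assumptions made for illustration and are not taken from the FoundationDB source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical index entry pointing at one spilled commit.
struct SpilledRef {
    std::int64_t version;
    std::uint32_t length;
};

// Split the references for one tag into groups of at most maxCommitsPerKey.
// Each group would become one stored value, so no single value grows without
// bound; the price is one extra write per additional group.
std::vector<std::vector<SpilledRef>> splitIntoIndexValues(
    const std::vector<SpilledRef>& refs, std::size_t maxCommitsPerKey) {
    assert(maxCommitsPerKey > 0);
    std::vector<std::vector<SpilledRef>> values;
    for (std::size_t i = 0; i < refs.size(); i += maxCommitsPerKey) {
        std::size_t end = std::min(refs.size(), i + maxCommitsPerKey);
        values.emplace_back(refs.begin() + static_cast<std::ptrdiff_t>(i),
                            refs.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return values;
}
```

For example, five references with a limit of two per key produce three stored values in this sketch instead of one long value.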
Evan Tschannen 12088119d2
Merge pull request #1517 from alexmiller-apple/tstlog-6.1
Add a knob to limit amount of data read from sqlite for one PeekRequest.
2019-05-03 11:01:11 -07:00
Alex Miller f4e48c3851 Add a knob to limit amount of data read from sqlite for one PeekRequest.
This prevents peeking from degrading over time if there are a very large
number of SpilledData entries for one particular tag.
2019-05-02 17:26:45 -07:00
Evan Tschannen 8590b710bf added additional logging on the logs and log routers 2019-05-02 17:24:39 -07:00
Jingyu Zhou 8b5449e608 Fix review comments for PR #1473 2019-04-29 16:45:42 -07:00
Jingyu Zhou 5462f560e7 Add pseudo locality for log routers and tlogs
This changes the logic of pop operations from log routers (LG):
- LG pops tagLocalityLogRouterMapped from TLogs;
- TLog converts tagLocalityLogRouterMapped back to tagLocalityLogRouter before
  popping.

Later, when we add more pseudo localities, the same pattern can be used.
2019-04-23 21:35:56 -07:00
Jingyu Zhou 0b1984978a Small code refactoring. 2019-04-21 10:41:07 -07:00
Jingyu Zhou ec1bc5cfca Add LogSystemType enum 2019-04-21 10:41:07 -07:00
Meng Xu 529ce66b6c Merge branch 'apple/master' into mengxu/performant-restore-PR 2019-04-18 18:02:45 -07:00
Meng Xu 4c3ccebe8a FastRestore: Cleanup code
Remove unused code and comments.
2019-04-12 13:49:55 -07:00
Evan Tschannen 6220a5ce0f
Merge pull request #1370 from jzhou77/fix-unreferenced
Remove unused functions
2019-04-09 11:49:45 -07:00
mpilman 1c16f87a4e Remove trace-calls to printable (in non-workloads) 2019-04-05 13:12:19 -07:00
Meng Xu c4a8a80d6f Merge branch 'apple/master' into mengxu/performant-restore-PR 2019-04-04 22:51:00 -07:00
Jingyu Zhou 47b4b82628
Merge branch 'master' into fix-unreferenced 2019-04-01 14:07:19 -07:00
Meng Xu 70d7c289f4 Merge branch 'master' into mengxu/restore/parallel-v7 2019-03-30 22:13:10 -07:00
Alex Miller e7ad39246c
Fix typo 2019-03-29 20:16:26 -07:00
Evan Tschannen a44ffd851e fix: the shared tlog could fail to update a stopped tlog’s queueCommitVersion to version if a second tlog registered before it could issue the first commit for the tlog 2019-03-29 20:11:30 -07:00
Evan Tschannen b6008558d3 renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Jingyu Zhou a55f06e082 Remove unused functions
Found with -Wunused-function flag.
2019-03-27 15:45:28 -07:00
Evan Tschannen c705a1af74 fix: make sure recoveryLocation is always a valid page 2019-03-20 19:33:09 -07:00
Evan Tschannen 1c6ad6d307 fix: change the location where stopped is checked, because a yield could cause stopped to be set after the existing check 2019-03-20 19:33:09 -07:00
Alex Miller b11ecb3210 Remove random bits of code that were either unneeded or leftover from debugging. 2019-03-18 15:47:20 -07:00