Commit Graph

283 Commits

Author SHA1 Message Date
Evan Tschannen 1c005d5878
Merge pull request #1584 from alexmiller-apple/spilled-only-peek
Save TLog resources by letting peek request only spilled data.
2019-06-20 18:22:31 -07:00
mpilman 68ce9a5e75 ProtocolVersion type - second try 2019-06-18 17:55:27 -07:00
Alex Miller 51fd42a4d2 Merge remote-tracking branch 'upstream/master' into spilled-only-peek 2019-06-18 17:33:52 -07:00
mpilman 8576665a90 Revert "Revert "Make protocol version a type""
This reverts commit 455bf3b3ec.
2019-06-18 14:49:04 -07:00
Alex Miller 455bf3b3ec Revert "Make protocol version a type" 2019-06-18 10:59:17 -07:00
mpilman da53a92bec Make protocol version a type
This fixes #1214

The basic idea is that ProtocolVersion is now its own type. This
alone is an improvement as it makes many things more typesafe. For
each version, we can now add breaking features (for example Fearless).
After that, there's no need to test against actual (confusing) version
numbers. Instead a developer can simply test
`protocolVersion->hasFearless()` and this will return true iff the
protocolVersion is newer than the newest version that didn't support
fearless.
2019-06-16 09:59:15 -07:00
sramamoorthy 1190f2f33d rebased related changes 2019-05-28 22:07:46 -07:00
sramamoorthy b43c100e57 TLog bug fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 3877f87481 comment change in tLogCommit 2019-05-28 22:07:46 -07:00
sramamoorthy 31b6c86650 ignorePopDeadline to have high limit in simulator
- ignorePopDeadline to have highier limit in simulator
to accommdate for the buggify delays and make snapshot succeed.

- introduce a new knob for auto resetting the disabling of tlog pop
2019-05-28 22:07:46 -07:00
sramamoorthy b1b96946af logData->stop check right after execOpHold wait 2019-05-28 22:07:46 -07:00
sramamoorthy 5749e220bd use FlowLock for implementing critical section
Instead of using Promises and future to implement
critcal section use FlowLock
2019-05-28 22:07:46 -07:00
sramamoorthy e6c0b87a4d remove unused variable 2019-05-28 22:07:46 -07:00
sramamoorthy f27a40f118 execProcessingHelper made synchronous
tLogCommit exects no blocking between duplicate check and
setting of the new version, that constraint was broken
when synchronous execProcessingHelper was introduced.
As a fix, execProcessingHelper was made asynchronous.
2019-05-28 22:07:46 -07:00
sramamoorthy d3a179b6f9 Multiple bug fixes
- wait for snapTLogFailKeys in a loop, otherwise in some race
  condition it can cause a false assert
- in single region, there does not seem to be a guarantee of
  tagLocalityListKey for a given DC ID, avoiding that assert for now
- to find the workers that are coordinators, looking up by primary
  address is not sufficient in some cases, hence looking by both
  primary and secondary address
- test make files to reflect the location of the new test cases
2019-05-28 22:07:46 -07:00
sramamoorthy dcd2d96751 make spawnProcess predictable in the simulator 2019-05-28 22:07:46 -07:00
sramamoorthy 4083af0b01 Avoid using trackLatest for TLog pop test cases 2019-05-28 22:07:46 -07:00
sramamoorthy ec7834e2f7 code re-orgnaization and address comments 2019-05-28 22:07:46 -07:00
sramamoorthy b6e037ffbc Replace fork with boost::process::child 2019-05-28 22:07:46 -07:00
sramamoorthy e91c76834e tlog: move snap create part to indepdendent funcs 2019-05-28 22:07:46 -07:00
sramamoorthy 61e93a9304 Address review comments and minor fixes 2019-05-28 22:07:46 -07:00
sramamoorthy 9e3104c2d4 Fix: races in async exec leading to bad backup 2019-05-28 22:07:46 -07:00
sramamoorthy cfdad0c5e6 tlog to snapshot exactly at exec version 2019-05-28 22:07:46 -07:00
sramamoorthy 539e65efad Skip parsing mutations if it is tagged for TxsTag
In Tlog, if a mutation is targetted for TxsTag then skip from
parsing them.
2019-05-28 22:07:46 -07:00
sramamoorthy 17ecba8313 trace cleanup and other indentation changes 2019-05-28 22:07:46 -07:00
sramamoorthy aa79480d69 changes to make fdbfork asynchronous 2019-05-28 22:07:46 -07:00
sramamoorthy 4016f16c76 Fix few compilation and bugs in rebase 2019-05-28 22:07:46 -07:00
sramamoorthy 3d5998e9dd tlog: when pops are disabled, store them & replay
In Tlogs, disable pop is done whlie taking snapshots. Earlier, tlogs
were ignoring the pops if it got pop requests when pops were
disabled. In this change, instead of ignoring the pop - it remembers
the list of pops in-memory and plays them once the popping is
enabled.
2019-05-28 22:07:46 -07:00
sramamoorthy 4bc4c615da exec op to all tlog, restore change in test &other
- exec operation to go to all the TLogs
- minor bug fix in tlog
- restore implementation for the simulator
- restore snap UID to be stored in restartInfo.ini
- test cases added
- indentation and trace file fixes
2019-05-28 22:07:46 -07:00
sramamoorthy 72dd067173 Trace message changes and fix few FIXMEs 2019-05-28 22:07:46 -07:00
sramamoorthy 69edefe68b Snapshot based backup and resotre implementation 2019-05-28 22:07:46 -07:00
A.J. Beamon f417e60264 Merge branch 'merge-release-6.1-into-master' into thread-safe-random-number-generation
# Conflicts:
#	fdbserver/QuietDatabase.actor.cpp
2019-05-23 09:52:00 -07:00
A.J. Beamon d29c7e4c9b Merge branch 'release-6.1' into merge-release-6.1-into-master
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/QuietDatabase.actor.cpp
#	versions.target
2019-05-23 09:28:45 -07:00
Evan Tschannen 003cc6be18 fix: nothingPersistent could be incorrect when popped is equal to persistentDataVersion 2019-05-22 20:23:35 -10:00
Evan Tschannen ee04c583fa fix: do not pop the disk queue past the persistentDataVersion 2019-05-21 10:40:30 -07:00
Evan Tschannen 4059d68348 fix: the tlog would not pop data from the disk queue after a storage server was removed, because the tag still exists in memory on the logs
fix: we could incorrectly make data durable if eraseMessagesFromMemory was in progress while running updatePersistentData
the quiet database check now ensure that tlogs have no more than 30 seconds of versions unpopped from the disk queue
2019-05-20 23:58:45 -07:00
Alex Miller 4eb4c03ce5 Save TLog resources by letting peek request only spilled data.
If a peek is entirely fulfilled from spilled data, then it's likely that
the next peek will be also.  It is thus wasteful for each of these peeks
to call peekMessagesFromMemory, which memcpy's excessively, and then
throw all that data away without using it.

Now, TLogs will give a hint back to peek cursors about if the provided
reply was served entirely from the spilled data, which peek curors then
feed back as the hint into their next request.

At some point, a cursor will send a request for only spilled data, get
an incomplete response, and then be told to send its next request as one
that peeks from memory as well, and then it will fully catch up.
2019-05-14 15:38:48 -10:00
A.J. Beamon 5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Evan Tschannen 22499666d0 Merge branch 'release-6.1'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/LogRouter.actor.cpp
#	flow/Trace.cpp
#	versions.target
2019-05-08 18:19:35 -07:00
Evan Tschannen 93eb2a9395
Merge pull request #1527 from alexmiller-apple/tstlog-6.1
Spill-by-reference knob + TLog6.0 Spilled Peek deprioritization
2019-05-03 17:19:45 -07:00
Alex Miller c918b21137 Deprioritize spilled peeks in spill-by-value, and improve its logic.
This deprioritizes before calling peekMessagesFromMemory, which should
improve the memory usage of the TLog, and makes sure to keep txsTag
peeks at a high priority to help recoveries stay fast.
2019-05-03 15:27:11 -07:00
Alex Miller 4052f3826a Add a knob to limit the number of commits indexed per key.
Theoretically, we could spill 20MB of 22B mutations for one key, which
would generate a very long value being stored in SQLite, and very
inefficiently read back.  This stops that from being a problem, at the
cost of some extra write calls.
2019-05-03 15:27:10 -07:00
Evan Tschannen 12088119d2
Merge pull request #1517 from alexmiller-apple/tstlog-6.1
Add a knob to limit amount of data read from sqlite for one PeekRequest.
2019-05-03 11:01:11 -07:00
Alex Miller f4e48c3851 Add a knob to limit amount of data read from sqlite for one PeekRequest.
This prevents peeking from degrading over time if there are a very large
number of SpilledData entries for one particular tag.
2019-05-02 17:26:45 -07:00
Evan Tschannen 8590b710bf added additional logging on the logs and log routers 2019-05-02 17:24:39 -07:00
Jingyu Zhou 8b5449e608 Fix review comments for PR #1473 2019-04-29 16:45:42 -07:00
Jingyu Zhou 5462f560e7 Add pseudo locality for log routers and tlogs
This changes the logic of pop operations from log routers (LG):
- LG pops tagLocalityLogRouterMapped from TLogs;
- TLog converts tagLocalityLogRouterMapped back to tagLocalityLogRouter before
  popping.

Later when we add more psuedo localities, the same pattern can be used.
2019-04-23 21:35:56 -07:00
Jingyu Zhou 0b1984978a Small code refactoring. 2019-04-21 10:41:07 -07:00
Jingyu Zhou ec1bc5cfca Add LogSystemType enum 2019-04-21 10:41:07 -07:00
Evan Tschannen 6220a5ce0f
Merge pull request #1370 from jzhou77/fix-unreferenced
Remove unused functions
2019-04-09 11:49:45 -07:00