7157 Commits

Author SHA1 Message Date
Meng Xu 0c8de91932 FastRestore:applyToDB:Add functions to DBApplyProgress for encapsulation 2019-10-14 16:24:36 -07:00
Meng Xu f89b5586df FastRestore:applyToDB:Record applyToDB progress in DBApplyProgress struct
This avoids repetitive code
2019-10-14 14:57:17 -07:00
Meng Xu 7b36fee38f FastRestore:applyToDB:Cosmetic change for review comments
No functional change.
2019-10-14 12:52:32 -07:00
Meng Xu 71509a5157 FastRestore:Applier:applyToDB:Clang format 2019-10-10 17:36:38 -07:00
Meng Xu 48e0620e5f FastRestore:Applier:applyToDB:Handle txn with errors 2019-10-10 17:24:07 -07:00
Meng Xu 84b5a5525f FastRestore:Add restoreApplierKeys 2019-10-10 17:18:34 -07:00
Jingyu Zhou 3bb0c12ce9
Merge pull request #2228 from xumengpanda/mengxu/oom-fix
StorageServerTracker:Fix OOM bug caused by server healthiness toggles infinitely
2019-10-10 08:58:00 -07:00
Meng Xu 1bd6151f54
Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-09 21:17:03 -07:00
Meng Xu 26e1d565f6 StorageServerTracker:Fix OOM bug caused by server healthiness toggles infinitely
When there is only one healthy team, the bug sets a server's status to unhealthy,
which drops healthyTeam to 0 and triggers StorageServerTracker to loop back;
the loop resets the server's status to healthy, and thus healthyTeam to non-zero.

This pattern causes an infinite loop.

The infinite loop prevents TraceEvent from flushing, which causes
TraceEvent to consume most of the memory until the process runs out of memory.

Kudos to JingYu Zhou (jingyu_zhou@apple.com), the main contributor in finding the bug!
2019-10-09 17:45:09 -07:00
sramamoorthy c9097cca18 deprecate isTLogInSameNode used by snapshot V1 2019-10-09 15:33:11 -07:00
Bhaskar Muppana 36b1247054
Merge pull request #2136 from ajbeamon/dont-require-future-get-database
Don't require fdb_future_get_database from new client binaries.
2019-10-09 09:58:56 -07:00
Jingyu Zhou 0136bf0c62
Merge pull request #2222 from ajbeamon/add-alloc-instrumentation-tool
Add script to parse output from enabling ALLOC_INSTRUMENTATION_STDOUT
2019-10-08 19:09:28 -07:00
Vishesh Yadav b9a8e60318
Merge pull request #2091 from ajbeamon/binding-tester-python3
Update binding tester to use Python 3
2019-10-08 15:54:07 -07:00
A.J. Beamon 3ba8fd95b5 Add script to parse output from enabling ALLOC_INSTRUMENTATION_STDOUT 2019-10-08 15:50:47 -07:00
Vishesh Yadav 62972b523c
Merge pull request #2133 from kaomakino/kaomakino/mako
Mako benchmark: Add variable throttling
2019-10-08 15:26:52 -07:00
Jingyu Zhou fef89aa17f
Merge pull request #2213 from alexmiller-apple/new-log-spill-default
Stop and spill TLogs when a new TLog is recruited in a different SharedTLog
2019-10-08 12:57:28 -07:00
Meng Xu 6ff368a52b
Merge pull request #2217 from jzhou77/pprof
Add memory profiling for FastAlloc when gperftool is used
2019-10-07 22:03:27 -07:00
Jingyu Zhou 396b10caca Add memory profiling for FastAlloc when gperftool is used
FastAlloc is the major memory use case in FDB, yet we can't profile its usage.
This commit replaces FastAlloc memory allocation with malloc so that we may
track its memory usage when gperftool is used.
2019-10-07 19:27:06 -07:00
Alex Miller 77c72de176 Comment variable and code style fix
Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-07 18:08:27 -07:00
Alex Miller a34a009bf0 Shuffle member initialization in constructor. 2019-10-07 18:08:27 -07:00
Alex Miller b3fd4f62a7 Fix whitespace. 2019-10-07 18:08:27 -07:00
Alex Miller 71af24dff3 Fix a bug that would cause active logs to spill aggressively
And add some useful logging about when things do or do not spill.
2019-10-07 18:08:27 -07:00
Alex Miller 1d8a7e5af7 Spill SharedTLog when there's more than one.
When switching between spill_type or log_version, a new instance of a
SharedTLog is created in the transaction log processes.  If this is done
in a saturated database, then doubling the amount of memory to hold
mutations in memory can cause TLogs to be uncomfortably close to the 8GB
OOM limit.

Instead, we now thread through which SharedTLog UID is active, and the
other TLogs spill out the majority of their mutations.
2019-10-07 18:08:27 -07:00
Evan Tschannen 1b946d588f
Merge pull request #2208 from alexmiller-apple/faster-txstag-recovery
Recover Txs Faster [0/?]: Combine spill-by-value and spill-by-reference into one file/SharedTLog
2019-10-07 11:15:56 -07:00
A.J. Beamon 22c3fa867c
Merge pull request #2074 from xumengpanda/mengxu/fix-correctness-bug
DD:fix:getTeam may fail to get a team when it can
2019-10-07 09:33:57 -07:00
Alex Miller 5016f3fedd
Whitespace fixes
no idea what happened here

Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-04 13:37:59 -07:00
Meng Xu 6dae95a4ca
Update fdbserver/DataDistribution.actor.cpp
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2019-10-04 10:02:11 -07:00
Alex Miller 6bcb72fa74
Fix stray Unversioned()
I forgot there were two
2019-10-03 19:45:13 -07:00
Balachandar Namasivayam 493d39be7a Revert "Removed unnecessary and ununed libraries from compilation command. Inclusion of this will produce an error with certain compilers such as Clang"
This reverts commit b10f3ad7a1.
2019-10-03 18:24:43 -07:00
Alex Miller 28f6275f94 Use AssumeVersion instead of Unversioned
Which lets us revert the unversioned serialization of TLogSpillType
2019-10-03 15:59:09 -07:00
Alex Miller 9401a6941a
Code review nits
const correctness and file renaming in comment.

Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
2019-10-03 15:53:39 -07:00
Vishesh Yadav 162b4efaea
Merge pull request #2180 from alexmiller-apple/cmake-staticify-libraries
Make FDBLibTLS and thirdparty static libraries.
2019-10-03 14:00:07 -07:00
Meng Xu 7cbe9f6ea8
Merge pull request #2151 from atn34/delay-order-test
Add test for delay ordering
2019-10-03 10:32:20 -07:00
A.J. Beamon ecad374fdb
Merge pull request #2095 from atn34/fix-actorcompiler-dep
Depend on actorcompiler.exe, rather than target
2019-10-03 09:56:49 -07:00
Alex Miller 9f9d2dff42 Point both spill types at the same SharedTLog for >=V5
This actually results in TLog generations of different spill types in
the same disk queue.  We also drop the LS_ parameter from the filename
to signify that there is no log spill configuration per file anymore.
2019-10-03 01:45:50 -07:00
Alex Miller 35a0fc948d Make DiskQueue V1 not ignore min recovery location.
I can't figure out why I made this branch on version, and it's breaking
having value and reference tlogs in the same SharedTLog
2019-10-03 01:45:10 -07:00
Alex Miller 6742222084 Make TLogServer able to spill by value and by reference
...and test it in simulation, but not combined yet.

It turns out that because of txsTag, we basically had to support
spill-by-value anyway.  Thus, if we treat all tags like txsTag when
spilling and peeking, then we have an easy way to bring the two spilling
types back into one implementation.
2019-10-03 01:45:10 -07:00
Alex Miller d38a96ab73 Make LogData aware of the spill type it was created to perform.
The spilling type is now pulled out of the request, and then stored on
LogData for later access, and persisted in the tlog metadata per tlog
generation.

It turns out that serializing types as Unversioned is a bit wonky.
2019-10-03 01:45:10 -07:00
Alex Miller 24c46337e1 Advance TLogVersions for 7.0 while adding a V5 that is TLogServer
Advancing the MIN_RECRUITABLE and DEFAULT is just following the standard
progression for 7.0.  It was convenient to do while adding the V5 so
that we can hook TLogServer back into being used.
2019-10-03 01:43:01 -07:00
Alex Miller 60fb04ca68 Fork TLogServer into TLogServer_6_2
This prepares us for incoming modifications to the TLog that can't
easily coexist with our current on-disk state.
2019-10-03 01:41:25 -07:00
Meng Xu 7599dd0093
Merge pull request #2205 from jzhou77/backup-fix
Fix format string with more portable code
2019-10-02 20:03:35 -07:00
A.J. Beamon cf654e2d8c
Merge pull request #2204 from xumengpanda/mengxu/merge-release620-to-master-v3
Merge branch release-6.2 into master
2019-10-02 19:19:26 -07:00
Jingyu Zhou ceb39c0279 Fix format string with more portable code 2019-10-02 15:25:14 -07:00
Meng Xu 5286523a43 StorageServerTracker:Distinguish wrongDC from invalidLocality 2019-10-02 14:50:53 -07:00
Meng Xu 9e7dfa358c Replace isCorrectDC with isCorrectLocality
This incorporates the change for defending DD from misconfigured
locality entries.

The check for misconfigured locality was in keyValueStoreTypeTracker,
but the storage engine switch PR moves the isCorrectDC check out of
the tracker and into storageServerTracker.
2019-10-02 14:05:43 -07:00
Meng Xu e26270b17f Fix compilation errors after merge 6.2 2019-10-02 13:28:24 -07:00
Meng Xu d0147e5e5d Merge branch 'release-6.2' into mengxu/merge-release620-to-master-v3
Resolved Conflicts:
	documentation/sphinx/source/release-notes.rst
	fdbserver/DataDistribution.actor.cpp
	versions.target
2019-10-02 13:22:56 -07:00
A.J. Beamon e387824ef4
Merge pull request #2203 from ajbeamon/post-release-cleanup-6.2.5
Post release cleanup 6.2.5
2019-10-02 11:18:34 -07:00
A.J. Beamon 4081d2a0a0 update installer WIX GUID following release 2019-10-02 11:17:35 -07:00
A.J. Beamon 9c46aa3f08 update versions target to 6.2.6 2019-10-02 11:17:35 -07:00