FDB Formatster
df90cc89de
apply clang-format to *.c, *.cpp, *.h, *.hpp files
2021-03-10 10:18:07 -08:00
sfc-gh-tclinkenbeard
5020e3faa1
Make ILogSystem::IPeekCursor const-correct
2020-12-08 09:09:31 -08:00
David Youngworth
d64cf8b9e3
Merge branch 6.3 into master
2020-11-17 11:22:45 -08:00
David Youngworth
d0391db862
Merge branch 'release-6.2' into release-6.3
2020-11-16 10:15:23 -08:00
Markus Pilman
bdd3dbfa7d
remove duplicates
2020-11-10 14:01:07 -07:00
sfc-gh-tclinkenbeard
4669f837fa
Add uses of makeReference
2020-11-07 22:10:18 -08:00
Vishesh Yadav
7b28de8a41
Add IDs to ConnectionReset TraceEvents
2020-11-04 14:06:49 -08:00
Vishesh Yadav
22b16302c3
Make ConnectionReset logs easier to query #3977
...
All TraceLogs that are related to ConnectionReset should be prefixed with
ConnectionReset. This should make it easy to query and aggregate by address and
role.
2020-11-02 15:10:51 -08:00
Evan Tschannen
12edadd059
Merge branch 'release-6.3'
...
# Conflicts:
# CMakeLists.txt
# fdbclient/Knobs.cpp
# fdbclient/MasterProxyInterface.h
# fdbrpc/simulator.h
# fdbserver/MasterProxyServer.actor.cpp
# tests/fast/CycleAndLock.txt
# tests/fast/TxnStateStoreCycleTest.txt
# tests/fast/VersionStamp.txt
# tests/slow/ParallelRestoreOldBackupApiCorrectnessAtomicRestore.txt
# tests/slow/ParallelRestoreOldBackupCorrectnessCycle.txt
# versions.target
2020-08-31 19:33:34 -07:00
Evan Tschannen
29eec30183
Merge branch 'release-6.2' into release-6.3
...
# Conflicts:
# CMakeLists.txt
# build/Dockerfile
# build/Dockerfile.devel
# documentation/sphinx/source/downloads.rst
# fdbserver/Knobs.cpp
# fdbserver/LogSystem.h
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/TagPartitionedLogSystem.actor.cpp
# fdbserver/WaitFailure.actor.cpp
# fdbserver/fdbserver.vcxproj
# fdbserver/fdbserver.vcxproj.filters
# packaging/msi/FDBInstaller.wxs
2020-08-31 01:10:29 -07:00
Evan Tschannen
507c67c930
Added additional information to trace events
2020-08-26 11:42:23 -07:00
Meng Xu
ef8c1060a2
Merge branch 'master' into mengxu/tmp-merge-6.3
2020-07-13 10:15:56 -07:00
A.J. Beamon
b09dddc07e
Merge branch 'release-6.2' into merge-release-6.2-into-release-6.3
...
# Conflicts:
# cmake/ConfigureCompiler.cmake
# documentation/sphinx/source/downloads.rst
# fdbrpc/FlowTransport.actor.cpp
# fdbrpc/fdbrpc.vcxproj
# fdbserver/DataDistributionQueue.actor.cpp
# fdbserver/Knobs.cpp
# fdbserver/Knobs.h
# fdbserver/LogSystemPeekCursor.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/Status.actor.cpp
# fdbserver/storageserver.actor.cpp
# flow/flow.vcxproj
2020-07-10 15:06:34 -07:00
Evan Tschannen
33c9b1374a
more compile fixes
2020-07-09 22:57:43 -07:00
Evan Tschannen
f6163d0a79
fix compile errors
2020-07-09 22:53:02 -07:00
Evan Tschannen
717242a0ee
reset WAN network connections every 5 minutes if responses take more than 500ms
2020-07-09 22:50:47 -07:00
sfc-gh-ngoyal
693d9e8b89
Merge branch 'master' into fdb_cache_wo_allocator
2020-06-09 15:09:58 -07:00
Alex Miller
ccaac162e2
Resolve performance concerns of nearly-no-op debugMutation being frequently called
...
This introduces unhygienic macro variants that inline an `ENABLED &&`
before the TraceEvent. This way, they get entirely compiled out unless
enabled.
Then rewrite all debugMutation uses via sed.
2020-05-13 18:44:15 -07:00
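The macro trick described in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MUTATION_TRACKING_ENABLED`, `traceMutation`, `DEBUG_MUTATION`, and `expensiveVersionLookup` are hypothetical names, not the identifiers used in FoundationDB. The point is that inlining the flag in front of the call lets `&&` short-circuit, so neither the call nor its argument expressions are evaluated when the flag is a compile-time `false`.

```cpp
#include <cassert>
#include <string>

// Hypothetical compile-time switch; the commit message only calls it ENABLED.
constexpr bool MUTATION_TRACKING_ENABLED = false;

// Counts how many times any part of the debug path actually runs.
static int debugCalls = 0;

// Hypothetical stand-in for the real tracing call. It returns bool so that
// it can appear on the right-hand side of `&&`.
bool traceMutation(const std::string& context, long version) {
    ++debugCalls;
    return !context.empty() && version >= 0;
}

// Hypothetical helper representing an argument that is costly to compute.
long expensiveVersionLookup() {
    ++debugCalls;
    return 42;
}

// Unhygienic macro: it refers to MUTATION_TRACKING_ENABLED from the
// surrounding scope and places it before the call. When the flag is false,
// `&&` short-circuits and the arguments are never evaluated.
#define DEBUG_MUTATION(context, version) \
    (MUTATION_TRACKING_ENABLED && traceMutation((context), (version)))

// Returns the number of debug-path evaluations that happened while disabled.
int runDisabledDemo() {
    bool logged = DEBUG_MUTATION("LeavingProxy", expensiveVersionLookup());
    assert(!logged);
    return debugCalls;
}
```

Calling `runDisabledDemo()` from a `main` function exercises the disabled path. A plain function wrapper would still evaluate its arguments before the call, which is presumably why the commit chose a macro.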
Alex Miller
122762cce1
Add debugMessagesAndTags, and track mutations in more places.
...
Like:
* Leaving the proxy
* Entering the TLog
* Leaving the TLog
* Being read on a cursor
All of this brought to you by TagsAndMessage!
This also slides in a minor optimization as to how mutations are serialized per target log.
2020-03-27 03:31:04 -07:00
negoyal
acaf91ac47
Merge branch 'master' into fdb_cache_subfeature2
2020-03-26 13:33:08 -07:00
negoyal
8abac91033
Fixed a bug in cache server while peeking at a version lower than popped version and added some logging.
2020-03-26 12:39:07 -07:00
Meng Xu
bd345f85db
ConsistencyCheck: Fix failure due to address inconsistency between process and worker
...
With TLS, a worker (or process) can have a TLS address and non-TLS address.
When a process is created in simulation, the primary address is TLS by default.
The non-TLS one is the TLS address port plus one.
In a connection between two workers, if their primary addresses do not enable
or disable TLS together, one worker will swap its primary address and secondary address
so that the TLS config of the two endpoints can match.
The swap can make the primary address no longer the TLS one that was assigned
when the process was created. And the swap only happens for the worker, not for
the process struct in simulation.
This swap can cause worker->address != process->address.
In the checkForExtraDataStores actor, we use worker->address to check if a process
is killable and use process->address to kill the process. The inconsistency
can cause simulation to kill a protected process that is not killable, which leads
to simulation failure.
2020-03-10 21:07:16 -07:00
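The address mismatch described in the commit above can be reproduced in a small model. This is a sketch under assumptions: `Address`, `AddressPair`, `matchPeerTls`, and `demoMismatch` are hypothetical names, and the `contains` check is just one possible way to compare identities consistently, not necessarily what the commit changed.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical simplified network address.
struct Address {
    std::string ip;
    int port;
    bool tls;
    bool operator==(const Address& o) const {
        return ip == o.ip && port == o.port && tls == o.tls;
    }
};

// Hypothetical pair of addresses held by both a process and a worker.
struct AddressPair {
    Address primary;
    std::optional<Address> secondary;
    // True if `a` is either the primary or the secondary address.
    bool contains(const Address& a) const {
        return primary == a || (secondary && *secondary == a);
    }
};

// Models the swap: if the peer's TLS setting differs from our primary
// address, promote the secondary address so both endpoints match.
void matchPeerTls(AddressPair& worker, bool peerUsesTls) {
    if (worker.primary.tls != peerUsesTls && worker.secondary) {
        std::swap(worker.primary, *worker.secondary);
    }
}

// Returns true when the worker's primary address has diverged from the
// process's primary address while still identifying the same process.
bool demoMismatch() {
    // TLS primary address; the non-TLS address is the TLS port plus one.
    AddressPair process{ Address{"10.0.0.1", 4500, true},
                         Address{"10.0.0.1", 4501, false} };
    AddressPair worker = process;  // the worker starts with the same addresses
    matchPeerTls(worker, false);   // only the worker swaps, not the process
    bool primariesDiffer = !(worker.primary == process.primary);
    bool sameProcess = process.contains(worker.primary);
    return primariesDiffer && sameProcess;
}
```

In this model, a check keyed on the worker's primary address and an action keyed on the process's primary address can disagree after the swap, which matches the failure mode the commit message describes.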
Evan Tschannen
303df197cf
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# bindings/c/test/mako/mako.c
# documentation/sphinx/source/release-notes.rst
# fdbbackup/backup.actor.cpp
# fdbclient/NativeAPI.actor.cpp
# fdbclient/NativeAPI.actor.h
# fdbserver/DataDistributionQueue.actor.cpp
# fdbserver/Knobs.cpp
# fdbserver/Knobs.h
# fdbserver/LogRouter.actor.cpp
# fdbserver/SkipList.cpp
# fdbserver/fdbserver.actor.cpp
# flow/CMakeLists.txt
# flow/Knobs.cpp
# flow/Knobs.h
# flow/flow.vcxproj
# flow/flow.vcxproj.filters
# versions.target
2020-03-06 18:22:46 -08:00
Evan Tschannen
1076abdee5
fixed crash when interf was not created
2020-03-05 19:09:08 -08:00
Evan Tschannen
1128666840
added additional logging on the log router
2020-03-05 18:17:06 -08:00
Evan Tschannen
96258b9809
Merge branch 'release-6.2'
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
# fdbcli/fdbcli.actor.cpp
# fdbclient/ManagementAPI.actor.cpp
# fdbrpc/FlowTransport.actor.cpp
# fdbserver/ClusterController.actor.cpp
# fdbserver/DataDistribution.actor.cpp
# fdbserver/DataDistribution.actor.h
# fdbserver/DataDistributionQueue.actor.cpp
# fdbserver/KeyValueStoreMemory.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/QuietDatabase.actor.cpp
# fdbserver/SkipList.cpp
# fdbserver/StorageMetrics.actor.h
# fdbserver/TLogServer.actor.cpp
# fdbserver/fdbserver.actor.cpp
# fdbserver/storageserver.actor.cpp
# fdbserver/workloads/KVStoreTest.actor.cpp
# flow/CMakeLists.txt
# flow/Knobs.cpp
# flow/Knobs.h
# flow/genericactors.actor.cpp
# flow/serialize.h
2020-02-21 19:09:16 -08:00
Evan Tschannen
cf4efca852
fix: buffered cursor should always make sure all of the sub-cursors are completely exhausted before calculating minVersion. It is not legal to advance a cursor version past an epochEnd (+100 million versions) without also returning the epochEnd mutation; otherwise the storage servers might not be able to roll back far enough, because the end of the previous epoch will be made durable
2020-02-19 15:24:32 -08:00
Alex Miller
7798456201
Make TLogs have consistent parallel peek behavior.
...
TLogServer and LogRouter had some leftover code from me trying to be
more "correct" about parallel peek semantics, but those changes weren't
reflected in the OldTLog* files. I've reverted the changes, as
realistically, they are more likely to waste CPU than improve TLog behavior.
2020-01-21 18:23:16 -08:00
Alex Miller
858e4e5900
Move the check to a better location.
...
This way, we avoid some ID randomness, and also avoid the potential for
resetting the randomID and sequence without clearing out the future
vector.
2020-01-21 17:08:42 -08:00
Alex Miller
1cb311fcb8
Add an ASSERT_WE_THINK that peek cursors don't get timed_out()
...
This should prevent us from regressing and having multi-region
recoveries hang for 10min again.
2020-01-21 17:07:37 -08:00
Alex Miller
0662f8dba0
When switching parallel->single->parallel, reset sequence and peekId
...
This fixes an issue where one could hang for 10min for the second
parallel peek to time out, if one happened to catch the edge of an
onlySpilled transition wrong.
2020-01-21 17:07:37 -08:00
Evan Tschannen
afc9713005
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# fdbclient/FDBTypes.h
# fdbserver/LogSystem.h
# fdbserver/LogSystemPeekCursor.actor.cpp
# fdbserver/OldTLogServer_6_0.actor.cpp
# fdbserver/TLogServer.actor.cpp
# versions.target
2019-11-06 13:45:37 -08:00
Evan Tschannen
dbc5a2393c
combineMessages still did not serialize tags correctly
2019-11-05 18:44:30 -08:00
Evan Tschannen
1c873591be
fixed a compiler error
2019-11-05 18:32:15 -08:00
Evan Tschannen
86560fe727
fix: tempTags was not used correctly
2019-11-05 18:22:25 -08:00
Evan Tschannen
a8ca47beff
optimized memory allocations by using VectorRef<Tag> instead of std::vector<Tag>
2019-11-05 18:07:30 -08:00
Evan Tschannen
daac8a2c22
Knobified a few variables
2019-11-04 20:21:38 -08:00
Evan Tschannen
457896b80d
remote logs use bufferedCursor when peeking from log routers to improve performance
...
bufferedCursor performance has been improved
2019-11-04 19:47:45 -08:00
Evan Tschannen
3325980c03
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# fdbserver/DataDistribution.actor.cpp
# fdbserver/OldTLogServer_6_0.actor.cpp
# fdbserver/TLogServer.actor.cpp
# fdbserver/WorkerInterface.actor.h
# fdbserver/worker.actor.cpp
# versions.target
2019-10-24 17:38:15 -07:00
Evan Tschannen
a7492aab0a
fix: poppedVersion can update during a yield, so all work must be done immediately after getMore returns
2019-10-23 23:06:02 -07:00
Alex Miller
1e5b8c74e3
Continuing a parallel peek after a timeout would hang.
...
This is to guard against the case where
1. Peeks with sequence numbers 0-39 are submitted
2. A 15min pause happens, in which timeout removes the peek tracker data
3. Peeks with sequence numbers 40-59 are submitted, with the same peekId
The second round of peeks would no longer have the data indicating that
peek 40 is allowed to start running immediately, and thus would hang for
10min until it gets cleaned up.
Also, guard against overflowing the sequence number.
2019-10-22 19:24:05 -07:00
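The scenario in the commit above can be modeled with a small tracker table. This is a minimal sketch under assumptions: `PeekTrackerTable`, `PeekResult`, `submit`, and `expire` are hypothetical names, and failing fast with a timed-out result is one plausible reading of the guard, not a copy of the actual TLog code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>

// Hypothetical outcomes of submitting one parallel peek request.
enum class PeekResult { Started, Waiting, TimedOut, SequenceOverflow };

// Hypothetical per-peekId state: the next sequence number allowed to run.
struct PeekTracker {
    int nextSequence = 0;
};

class PeekTrackerTable {
    std::map<std::uint64_t, PeekTracker> trackers;

public:
    PeekResult submit(std::uint64_t peekId, int sequence) {
        // Guard against overflow: sequence + 1 must remain representable.
        if (sequence < 0 || sequence == std::numeric_limits<int>::max()) {
            return PeekResult::SequenceOverflow;
        }
        auto it = trackers.find(peekId);
        if (it == trackers.end()) {
            // No tracker data but a non-zero sequence: the earlier state was
            // removed by a timeout. Fail fast instead of waiting on
            // predecessors that will never complete.
            if (sequence > 0) {
                return PeekResult::TimedOut;
            }
            it = trackers.emplace(peekId, PeekTracker{}).first;
        }
        if (sequence != it->second.nextSequence) {
            return PeekResult::Waiting;  // an earlier peek has not run yet
        }
        it->second.nextSequence = sequence + 1;
        return PeekResult::Started;
    }

    // Models the timeout that removes the peek tracker data.
    void expire(std::uint64_t peekId) { trackers.erase(peekId); }
};
```

In this model, submitting sequences 0 and 1, expiring the tracker, and then submitting sequence 2 with the same peekId returns `TimedOut` right away rather than leaving the caller waiting.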
Alex Miller
c008e7f8b3
When switching parallel->single->parallel, reset sequence and peekId
...
This fixes an issue where one could hang for 10min for the second
parallel peek to time out, if one happened to catch the edge of an
onlySpilled transition wrong.
2019-10-22 19:10:58 -07:00
Evan Tschannen
b495cc697b
Merge branch 'release-6.2'
...
# Conflicts:
# CMakeLists.txt
# documentation/sphinx/source/release-notes.rst
# versions.target
2019-09-13 09:25:08 -07:00
Alex Miller
324289039a
When reloading one cursor in a merge cursor, top off the other cursors as well.
2019-09-12 16:22:28 -07:00
Jingyu Zhou
2723922f5f
Replace -1 as VERSION_HEADER constant for serialization
2019-09-05 12:45:39 -07:00
Jingyu Zhou
f9357c5ad8
Fix side effect of ArenaReader
...
ServerPeekCursor::nextMessage() should only consume the message header, because
the reader() directly inherits the current position. The previous commit
changed the position to the beginning of the next message, which breaks storage
server code.
2019-09-05 11:07:07 -07:00
Jingyu Zhou
cd3f1e33d4
Refactor deserialization of TagsAndMessages
...
Consolidate deserialization of TagsAndMessages in the structure itself and
change both TLog and ServerPeekCursor to use it.
2019-09-04 14:55:05 -07:00
Evan Tschannen
b0480edd15
fix: messageVersion could be larger than poppedVersion, and we will discard messages that are needed
2019-08-06 16:31:05 -07:00
Evan Tschannen
7ac7eb82f2
fix: buffered cursor would start multiple bufferedGetMore actors
...
advance all of the cursors to the poppedVersion
2019-07-30 14:42:05 -07:00
Evan Tschannen
b5cb7919b6
fix: canDiscardPopped was not reset when necessary in all cases
2019-07-30 13:44:44 -07:00