Add tracing documentation

Lukas Joswiak 2021-06-28 16:19:55 -07:00
parent a1d37ceddb
commit 9a87a65f47
4 changed files with 98 additions and 6 deletions


@ -1,7 +1,97 @@
.. _request-tracing:
#########################
Request Tracing Framework
#########################
###############
Request Tracing
###############
.. include:: guide-common.rst.inc
The request tracing framework adds the ability to monitor transactions as they
move through FoundationDB. Tracing provides a detailed view into where
transactions spend time, with data exported in near real-time, enabling fast
performance debugging. The FoundationDB tracing framework is based on the
`OpenTracing <https://opentracing.io/>`_ specification.
*Disambiguation:* :ref:`Trace files <administration-managing-trace-files>` are
local log files containing debug and error output from a local ``fdbserver``
binary. Request tracing produces similarly named *traces* which record the
amount of time a transaction spent in a part of the system. This document uses
the term tracing (or trace) to refer to these request traces, not local debug
information, unless otherwise specified.
*Note*: Full request tracing capability requires at least ``TLogVersion::V6``.
==============
Recording data
==============
The request tracing framework produces no data by default. To enable collection
of traces, specify the collection type using the ``--tracer`` command line
option for ``fdbserver`` and the ``DISTRIBUTED_CLIENT_TRACER`` :ref:`network
option <network-options-using-environment-variables>` for clients. The client
and server must be set to the same tracer value for tracing to work correctly.
========================= ===============
**Option**                **Description**
------------------------- ---------------
none                      No tracing data is collected.
file, logfile, log_file   Write tracing data to FDB trace files in the directory specified with ``--logdir``.
network_lossy             Send tracing data as UDP packets. Data is sent to ``localhost:8889`` by default; the port can be changed by setting the ``TRACING_UDP_LISTENER_PORT`` knob. This option is useful if you have a log aggregation program to collect trace data.
========================= ===============
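
As a minimal sketch of keeping the client and server settings in sync, the
helpers below build both from a single tracer name. The helper names are
hypothetical. The environment variable name assumes the usual
``FDB_NETWORK_OPTION_<UPPERCASE OPTION NAME>`` convention for network options,
and only tracer names listed for both client and server are accepted.

```python
# Tracer names accepted by both the server flag and the client option.
SHARED_TRACERS = ("none", "log_file", "network_lossy")


def _check(tracer: str) -> str:
    """Reject tracer names that are not valid on both client and server."""
    if tracer not in SHARED_TRACERS:
        raise ValueError(f"unknown tracer type: {tracer!r}")
    return tracer


def server_tracer_args(tracer: str) -> list:
    """Command line arguments to append to an fdbserver invocation."""
    return ["--tracer", _check(tracer)]


def client_tracer_env(tracer: str) -> dict:
    """Environment variable that selects the client tracer.

    Assumption: network options are read from variables named
    FDB_NETWORK_OPTION_<UPPERCASE OPTION NAME>.
    """
    return {"FDB_NETWORK_OPTION_DISTRIBUTED_CLIENT_TRACER": _check(tracer)}


# Example use, before the client network is started:
#   import os
#   os.environ.update(client_tracer_env("network_lossy"))
```

Deriving both values from one name avoids the mismatch that the paragraph above
warns about.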
-----------
Data format
-----------
Spans are the building blocks of traces. A span represents an operation in the
life of a transaction and records the operation's name and its start and end
timestamps. A collection of spans makes up a trace, representing a single
transaction. The tracing framework outputs individual spans, which can be
reconstructed into traces through their parent relationships.
Trace data sent as UDP packets when using the ``network_lossy`` option is
serialized using `MessagePack <https://msgpack.org>`_. To reduce the amount of
data sent, spans are serialized as an array of length 8 (if the span has one or
more parents) or length 7 (if the span has no parents).
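
For those collecting the UDP output themselves, a receiver might look roughly
like the sketch below. It only gathers raw datagram payloads. Decoding them is
left to a MessagePack library, and how spans are framed within a datagram is
not covered here. The function names are hypothetical, and binding to
``127.0.0.1`` is an assumption about how ``localhost`` resolves.

```python
import socket


def open_listener(port: int = 8889, host: str = "127.0.0.1") -> socket.socket:
    """Bind a UDP socket on the port the network_lossy tracer sends to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect_datagrams(sock: socket.socket, count: int, bufsize: int = 65535) -> list:
    """Read `count` datagrams from an already-bound socket and return the payloads."""
    payloads = []
    for _ in range(count):
        data, _addr = sock.recvfrom(bufsize)
        payloads.append(data)
    return payloads


# Example use:
#   listener = open_listener()
#   for payload in collect_datagrams(listener, count=10):
#       print(len(payload), "bytes received")
#   listener.close()
```

Because UDP delivery is lossy, a collector built this way should tolerate
missing spans.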
The fields of a span are specified below. The index at which the field appears
in the serialized msgpack array is also specified, for those using the UDP
collection format.
================== ========= ======== ===============
**Field**          **Index** **Type** **Description**
------------------ --------- -------- ---------------
Source IP:port     0         string   The IP and port of the machine where the span originated.
Trace ID           1         uint64   The 64-bit identifier of the trace. All spans in a trace share the same trace ID.
Span ID            2         uint64   The 64-bit identifier of the span. All spans have a unique identifier.
Start timestamp    3         double   The timestamp when the operation represented by the span began.
End timestamp      4         double   The timestamp when the operation represented by the span ended.
Operation name     5         string   The name of the operation the span represents.
Tags               6         map      User defined tags, added manually to specify additional information.
Parent span IDs    7         vector   (Optional) A list of span IDs representing parents of this span.
================== ========= ======== ===============
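
The index layout above can be mapped onto a named structure once a span has
been decoded. The sketch below assumes the array has already been unpacked into
a Python list by a MessagePack library. The ``Span`` class and
``span_from_array`` function are hypothetical names used for illustration, and
the timestamp units are left as whatever the tracer emits.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Span:
    source: str               # index 0: "IP:port" of the originating machine
    trace_id: int             # index 1
    span_id: int              # index 2
    start: float              # index 3
    end: float                # index 4
    operation: str            # index 5
    tags: Dict[Any, Any]      # index 6
    parent_ids: List[int] = field(default_factory=list)  # index 7, optional

    @property
    def duration(self) -> float:
        """Elapsed time between the start and end timestamps."""
        return self.end - self.start


def span_from_array(fields: list) -> Span:
    """Build a Span from a decoded array of length 7 (no parents) or 8."""
    if len(fields) not in (7, 8):
        raise ValueError(f"expected 7 or 8 fields, got {len(fields)}")
    parents = list(fields[7]) if len(fields) == 8 else []
    return Span(fields[0], fields[1], fields[2], fields[3], fields[4],
                fields[5], dict(fields[6]), parents)
```

Treating the missing eighth element as an empty parent list lets downstream
code handle both array lengths uniformly.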
^^^^^^^^^^^^^^^^^^^^^
Multiple parent spans
^^^^^^^^^^^^^^^^^^^^^
Unlike traditional distributed tracing frameworks, FoundationDB spans can have
multiple parents. Because many FDB transactions are batched into a single
transaction, the batched transaction must treat all of its component
transactions as parents so that each request can continue to be traced.
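
One way to reconstruct the flow of a request from individual spans is to invert
the parent lists into a parent-to-children map and walk it from a starting
span. The sketch below is illustrative only: the function names are
hypothetical, and spans are reduced to ``(span_id, parent_ids)`` pairs. It
indexes by span ID rather than trace ID, since whether a batched span shares a
trace ID with all of its parents is not stated here.

```python
from collections import defaultdict, deque


def children_by_parent(spans) -> dict:
    """Invert (span_id, parent_ids) pairs into {parent_id: [child span IDs]}."""
    children = defaultdict(list)
    for span_id, parent_ids in spans:
        for parent_id in parent_ids:
            children[parent_id].append(span_id)
    return dict(children)


def descendants(children: dict, root_id: int) -> list:
    """Breadth-first walk from root_id.

    A span reachable through several parents is reported only once.
    """
    seen = set()
    order = []
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

With spans ``[(1, []), (2, []), (3, [1, 2]), (4, [3])]``, span 3 stands in for
a batched transaction: it is reachable from both span 1 and span 2, so each of
the original requests can be followed through the batch.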
---------------
Control options
---------------
In addition to the command line parameter described above, tracing can be
controlled at the database and transaction levels.
Tracing can be globally disabled by setting the
``distributed_transaction_trace_disable`` database option. It can be enabled by
setting the ``distributed_transaction_trace_enable`` database option. If
neither option is specified but a tracer option is set as described above,
tracing will be enabled.
Tracing can be enabled or disabled for individual transactions. The special key
space exposes an API to set a custom trace ID for a transaction, or to disable
tracing for the transaction. See the special key space :ref:`tracing module
documentation <special-key-space-tracing-module>` to learn more.


@ -244,6 +244,8 @@ use the global configuration functions.
#. ``\xff\xff/global_config/<key> := <value>`` Read/write. Reading keys in the range will return a tuple decoded string representation of the value for the given key. Writing a value will update all processes in the cluster with the new key-value pair. Values must be written using the :ref:`api-python-tuple-layer`.
.. _special-key-space-tracing-module:
Tracing module
--------------


@ -6,7 +6,7 @@ Visibility Documents
Curation of documents related to Visibility into FDB.
* :doc:`request-tracing` walks you through request-tracing framework.
* :doc:`request-tracing` provides fine-grained visibility into the flow of transactions through the system.
.. toctree::
:maxdepth: 2


@ -129,7 +129,7 @@ description is not currently required but encouraged.
paramType="Int" paramDescription="probability expressed as a percentage between 0 and 100"
description="Set the probability of an active CLIENT_BUGGIFY section being fired. A section will only fire if it was activated" />
<Option name="distributed_client_tracer" code="90"
paramType="String" paramDescription="Distributed tracer type. Choose from none, log_file, network_async, or network_lossy"
paramType="String" paramDescription="Distributed tracer type. Choose from none, log_file, or network_lossy"
description="Set a tracer to run on the client. Should be set to the same value as the tracer set on the server." />
<Option name="supported_client_versions" code="1000"
paramType="String" paramDescription="[release version],[source version],[protocol version];..."