From 9a87a65f47330af72d6009afa9e83fbfc456b0df Mon Sep 17 00:00:00 2001
From: Lukas Joswiak
Date: Mon, 28 Jun 2021 16:19:55 -0700
Subject: [PATCH] Add tracing documentation

---
 .../sphinx/source/request-tracing.rst        | 98 ++++++++++++++++++-
 documentation/sphinx/source/special-keys.rst |  2 +
 documentation/sphinx/source/visibility.rst   |  2 +-
 fdbclient/vexillographer/fdb.options         |  2 +-
 4 files changed, 98 insertions(+), 6 deletions(-)

diff --git a/documentation/sphinx/source/request-tracing.rst b/documentation/sphinx/source/request-tracing.rst
index cca170ad24..03d2d35c50 100644
--- a/documentation/sphinx/source/request-tracing.rst
+++ b/documentation/sphinx/source/request-tracing.rst
@@ -1,7 +1,97 @@
 .. _request-tracing:
 
-#########################
-Request Tracing Framework
-#########################
+###############
+Request Tracing
+###############
 
-.. include:: guide-common.rst.inc
+The request tracing framework adds the ability to monitor transactions as they
+move through FoundationDB. Tracing provides a detailed view into where
+transactions spend time with data exported in near real-time, enabling fast
+performance debugging. The FoundationDB tracing framework is based on the
+`OpenTracing `_ specification.
+
+*Disambiguation:* :ref:`Trace files ` are
+local log files containing debug and error output from a local ``fdbserver``
+binary. Request tracing produces similarly named *traces*, which record the
+amount of time a transaction spent in a part of the system. This document uses
+the term tracing (or trace) to refer to these request traces, not local debug
+information, unless otherwise specified.
+
+*Note*: Full request tracing capability requires at least ``TLogVersion::V6``.
+
+==============
+Recording data
+==============
+
+The request tracing framework produces no data by default. To enable collection
+of traces, specify the collection type using the ``--tracer`` command line
+option for ``fdbserver`` and the ``DISTRIBUTED_CLIENT_TRACER`` :ref:`network
+option ` for clients. Both client
+and server must use the same tracer value for tracing to work correctly.
+
+========================= ===============
+**Option**                **Description**
+------------------------- ---------------
+none                      No tracing data is collected.
+file, logfile, log_file   Write tracing data to FDB trace files, specified with ``--logdir``.
+network_lossy             Send tracing data as UDP packets. Data is sent to ``localhost:8889``, but the default port can be changed by setting the ``TRACING_UDP_LISTENER_PORT`` knob. This option is useful if you have a log aggregation program to collect trace data.
+========================= ===============
+
+-----------
+Data format
+-----------
+
+Spans are the building blocks of traces. A span represents an operation in the
+life of a transaction and records its start and end timestamps and its name.
+A collection of spans makes up a trace, representing a single transaction. The
+tracing framework outputs individual spans, which can be reconstructed into
+traces through their parent relationships.
+
+Trace data sent as UDP packets when using the ``network_lossy`` option is
+serialized using `MessagePack `_. To save on the amount of
+data sent, spans are serialized as an array of length 8 (if the span has one or
+more parents), or length 7 (if the span has no parents).
+
+The fields of a span are specified below. The index at which the field appears
+in the serialized msgpack array is also specified, for those using the UDP
+collection format.
+
+================== ========= ======== ===============
+**Field**          **Index** **Type** **Description**
+------------------ --------- -------- ---------------
+Source IP:port     0         string   The IP and port of the machine where the span originated.
+Trace ID           1         uint64   The 64-bit identifier of the trace. All spans in a trace share the same trace ID.
+Span ID            2         uint64   The 64-bit identifier of the span. All spans have a unique identifier.
+Start timestamp    3         double   The timestamp when the operation represented by the span began.
+End timestamp      4         double   The timestamp when the operation represented by the span ended.
+Operation name     5         string   The name of the operation the span represents.
+Tags               6         map      User-defined tags, added manually to specify additional information.
+Parent span IDs    7         vector   (Optional) A list of span IDs representing parents of this span.
+================== ========= ======== ===============
+
+^^^^^^^^^^^^^^^^^^^^^
+Multiple parent spans
+^^^^^^^^^^^^^^^^^^^^^
+
+Unlike spans in traditional distributed tracing frameworks, FoundationDB spans
+can have multiple parents. Because many FDB transactions are batched into a
+single transaction, the batched transaction must treat all its component
+transactions as parents to continue tracing each request.
+
+---------------
+Control options
+---------------
+
+In addition to the command line parameter described above, tracing can be
+controlled at the database and transaction level.
+
+Tracing can be globally disabled by setting the
+``distributed_transaction_trace_disable`` database option. It can be enabled by
+setting the ``distributed_transaction_trace_enable`` database option. If
+neither option is specified but a tracer option is set as described above,
+tracing will be enabled.
+
+Tracing can be enabled or disabled for individual transactions. The special key
+space exposes an API to set a custom trace ID for a transaction, or to disable
+tracing for the transaction. See the special key space :ref:`tracing module
+documentation ` to learn more.
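Reviewer note on the span format documented in ``request-tracing.rst`` above: the following is a minimal Python sketch, not FoundationDB code, of how a collector might name the fields of one span and recover parent/child links. It assumes the UDP payload has already been decoded by some MessagePack library into a plain Python list of length 7 or 8; the names ``Span``, ``span_from_array`` and ``index_children`` are hypothetical. Children are indexed by span ID across all spans rather than per trace, on the assumption that a batched span may name parents that belong to different transactions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Span:
    """One span; attribute order follows the documented array indexes 0-7."""
    source: str            # index 0: "IP:port" of the originating machine
    trace_id: int          # index 1
    span_id: int           # index 2
    start: float           # index 3
    end: float             # index 4
    operation: str         # index 5
    tags: Dict[str, Any]   # index 6
    parents: List[int] = field(default_factory=list)  # index 7, optional


def span_from_array(arr: List[Any]) -> Span:
    """Map an already-decoded span array (length 7 or 8) onto named fields."""
    if len(arr) not in (7, 8):
        raise ValueError(f"expected 7 or 8 fields, got {len(arr)}")
    # A span with no parents is serialized without the eighth element.
    parents = list(arr[7]) if len(arr) == 8 else []
    return Span(arr[0], arr[1], arr[2], arr[3], arr[4], arr[5],
                dict(arr[6]), parents)


def index_children(spans: List[Span]) -> Dict[int, List[int]]:
    """Return a map from each span ID to the IDs of its child spans."""
    children: Dict[int, List[int]] = {}
    for span in spans:
        children.setdefault(span.span_id, [])
        for parent in span.parents:
            children.setdefault(parent, []).append(span.span_id)
    return children
```

Given a list of parsed spans, ``index_children`` yields the edges needed to walk a trace from a root span (one with an empty ``parents`` list) down to its descendants, and ``span.end - span.start`` gives the time spent in that operation in whatever unit the timestamps use.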
diff --git a/documentation/sphinx/source/special-keys.rst b/documentation/sphinx/source/special-keys.rst
index 4d8abdf177..2a6b0018b1 100644
--- a/documentation/sphinx/source/special-keys.rst
+++ b/documentation/sphinx/source/special-keys.rst
@@ -244,6 +244,8 @@ use the global configuration functions.
 
 #. ``\xff\xff/global_config/ := `` Read/write. Reading keys in the range will return a tuple decoded string representation of the value for the given key. Writing a value will update all processes in the cluster with the new key-value pair. Values must be written using the :ref:`api-python-tuple-layer`.
 
+.. _special-key-space-tracing-module:
+
 Tracing module
 --------------
 
diff --git a/documentation/sphinx/source/visibility.rst b/documentation/sphinx/source/visibility.rst
index de16800ce0..200dea0447 100644
--- a/documentation/sphinx/source/visibility.rst
+++ b/documentation/sphinx/source/visibility.rst
@@ -6,7 +6,7 @@ Visibility Documents
 
 Curation of documents related to Visibility into FDB.
 
-* :doc:`request-tracing` walks you through request-tracing framework.
+* :doc:`request-tracing` provides fine-grained visibility into the flow of transactions through the system.
 
 .. toctree::
    :maxdepth: 2
diff --git a/fdbclient/vexillographer/fdb.options b/fdbclient/vexillographer/fdb.options
index 15ba1250ca..e0f908a69c 100644
--- a/fdbclient/vexillographer/fdb.options
+++ b/fdbclient/vexillographer/fdb.options
@@ -129,7 +129,7 @@ description is not currently required but encouraged.
 paramType="Int"
 paramDescription="probability expressed as a percentage between 0 and 100"
 description="Set the probability of an active CLIENT_BUGGIFY section being fired. A section will only fire if it was activated" />