Commit Graph

9 Commits

Author SHA1 Message Date
Dean Michael Berris da375a67f8 [XRay] Improve FDR trace handling and error messaging
Summary:
This change covers a number of things spanning LLVM and compiler-rt,
which are related in a non-trivial way.

In LLVM, we have a library that handles the FDR mode even log loading,
which uses C++'s runtime polymorphism feature to better faithfully
represent the events that are written down by the FDR mode runtime. We
do this by interpreting a trace that's serliased in a common format
agreed upon by both the trace loading library and the FDR mode runtime.
This library is under active development, which consists of features
allowing us to reconstitute a higher-level event log.

This event log is used by the conversion and visualisation tools we have
for interpreting XRay traces.

One of the tools we have is a diagnostic tool in llvm-xray called
`fdr-dump` which we've been using to debug our expectations of what the
FDR runtime should be writing and what the logical FDR event log
structures are. We use this fairly extensively to reason about why some
non-trivial traces we're generating with FDR mode runtimes fail to
convert or fail to parse correctly.

One of these failures we've found in manual debugging of some of the
traces we've seen involve an inconsistency between the buffer extents (a
record indicating how many bytes to follow are part of a logical
thread's event log) and the record of the bytes written into the log --
sometimes it turns out the data could be garbage, due to buffers being
recycled, but sometimes we're seeing the buffer extent indicating a log
is "shorter" than the actual records associated with the buffer. This
case happens particularly with function entry records with a call
argument.

This change for now updates the FDR mode runtime to write the bytes for
the function call and arg record before updating the buffer extents
atomically, allowing multiple threads to see a consistent view of the
data in the buffer using the atomic counter associated with a buffer.
What we're trying to prevent here is partial updates where we see the
intermediary updates to the buffer extents (function record size then
call argument record size) becoming observable from another thread, for
instance, one doing the serialization/flushing.

To do both diagnose this issue properly, we need to be able to honour
the extents being set in the `BufferExtents` records marking the
beginning of the logical buffers when reading an FDR trace. Since LLVM
doesn't use C++'s RTTI mechanism, we instead follow the advice in the
documentation for LLVM Style RTTI
(https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html). We then rely on
this RTTI feature to ensure that our file-based record producer (our
streaming "deserializer") can honour the extents of individual buffers
as we interpret traces.

This also sets us up to be able to eventually do smart
skipping/continuation of FDR logs, seeking instead to find BufferExtents
records in cases where we find potentially recoverable errors. In the
meantime, we make this change to operate in a strict mode when reading
logical buffers with extent records.

Reviewers: mboerger

Subscribers: hiraditya, llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D54201

llvm-svn: 346473
2018-11-09 06:26:48 +00:00
Dean Michael Berris 59439dd069 [XRay] Use TSC delta encoding for custom/typed events
Summary:
This change updates the version number for FDR logs to 5, and update the
trace processing to support changes in the custom event records.

In the runtime, since we're already writing down the record preamble to
handle CPU migrations and TSC wraparound, we can use the same TSC delta
encoding in the custom event and typed event records that we use in
function event records. We do the same change to typed events (which
were unsupported before this change in the trace processing) which now
show up in the trace.

Future changes should increase our testing coverage to make custom and
typed events as first class entities in the FDR mode log processing
tools.

This change is also a good example of how we end up supporting new
record types in the FDR mode implementation. This shows the places where
new record types are added and supported.

Depends on D54139.

Reviewers: mboerger

Subscribers: hiraditya, arphaman, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D54140

llvm-svn: 346293
2018-11-07 04:37:42 +00:00
Dean Michael Berris e8c650ab12 [XRay] Fix TSC and atomic custom/typed event accounting
Summary:
This is a follow-on change to D53858 which turns out to have had a TSC
accounting bug when writing out function exit records in FDR mode.

This change adds a number of tests to ensure that:

- We are handling the delta between the exit TSC and the last TSC we've
  seen.

- We are writing the custom event and typed event records as a single
  update to the buffer extents.

- We are able to catch boundary conditions when loading FDR logs.

We introduce a TSC matcher to the test helpers, which we use in the
testing/verification of the TSC accounting change.

Reviewers: mboerger

Subscribers: mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D53967

llvm-svn: 345905
2018-11-01 22:57:50 +00:00
Dean Michael Berris 6b67ff0300 [XRay] Add CPU ID in Custom Event FDR Records
Summary:
This change cuts across compiler-rt and llvm, to increment the FDR log
version number to 4, and include the CPU ID in the custom event records.

This is a step towards allowing us to change the `llvm::xray::Trace`
object to start representing both custom and typed events in the stream
of records. Follow-on changes will allow us to change the kinds of
records we're presenting in the stream of traces, to incorporate the
data in custom/typed events.

A follow-on change will handle the typed event case, where it may not
fit within the 15-byte buffer for metadata records.

This work is part of the larger effort to enable writing analysis and
processing tools using a common in-memory representation of the events
found in traces. The work will focus on porting existing tools in LLVM
to use the common representation and informing the design of a
library/framework for expressing trace event analysis as C++ programs.

Reviewers: mboerger, eizan

Subscribers: hiraditya, mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D53920

llvm-svn: 345798
2018-11-01 00:18:52 +00:00
Dean Michael Berris 01aeb3221d [XRay] Migrate FDR runtime to use refactored controller
Summary:
This change completes the refactoring of the FDR runtime to support the
following:

- Generational buffer management.

- Centralised and well-tested controller implementation.

In this change we've had to:

- Greatly simplify the code in xray_fdr_logging.cc to only implement the
  glue code for calling into the controller.

- Implement the custom and typed event logging functions in the
  FDRLogWriter.

- Imbue the `XRAY_NEVER_INSTRUMENT` attribute onto all functions in the
  controller implementation.

Reviewers: mboerger, eizan, jfb

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D53858

llvm-svn: 345568
2018-10-30 04:35:48 +00:00
Dean Michael Berris 3c01508409 [XRay][compiler-rt] FDR Mode Controller
Summary:
This change implements a controller for abstracting away the details of
what happens when tracing with FDR mode. This controller type allows us
to test in isolation the various cases where we're encountering function
entry, exit, and other kinds of events we are handling when FDR mode is
enabled.

This change introduces a number of testing facilities we've needed to
better support expressing the conditions we need for the unit tests. We
leave some TODOs for moving those utilities into the LLVM project,
sitting in the `Testing` library, to make matching conditions on XRay
`Trace` instances through googlemock more manageable and declarative.

We don't wire in the controller right away, to allow us to incrementally
update the implementation(s) as we increase testing coverage of the
controller type. There's a need to re-think the way we're managing
buffers in a multi-threaded environment, which is more invasive than
this implementation.

This step in the process allows us to encode our assumptions in the
implementation of the controller, and then evolve the buffer queue
implementation to support generational buffer management to ensure we
can continue to support the cases we're already supporting with the
controller.

Reviewers: mboerger, eizan

Subscribers: mgorny, llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D52588

llvm-svn: 344488
2018-10-15 02:57:06 +00:00
Dean Michael Berris 1f60207984 [XRay][compiler-rt] FDRLogWriter Abstraction
Summary:
This change introduces an `FDRLogWriter` type which is responsible for
serialising metadata and function records to character buffers. This is
the first step in a refactoring of the implementation of the FDR runtime
to allow for more granular testing of the individual components of the
implementation.

The main contribution of this change is a means of hiding the details of
how specific records are written to a buffer, and for managing the
extents of these buffers. We make use of C++ features (templates and
some metaprogramming) to reduce repetition in the act of writing out
specific kinds of records to the buffer.

In this process, we make a number of changes across both LLVM and
compiler-rt to allow us to use the `Trace` abstraction defined in the
LLVM project in the testing of the runtime implementation. This gives us
a closer end-to-end test which version-locks the runtime implementation
with the loading implementation in LLVM.

We also allow using gmock in compiler-rt unit tests, by adding the
requisite definitions in the `AddCompilerRT.cmake` module. We also add
the terminfo library detection along with inclusion of the appropriate
compiler flags for header include lookup.

Finally, we've gone ahead and updated the FDR logging implementation to
use the FDRLogWriter for the lowest-level record-writing details.

Following patches will isolate the state machine transitions which
manage the set-up and tear-down of the buffers we're using in multiple
threads.

Reviewers: mboerger, eizan

Subscribers: mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52220

llvm-svn: 342617
2018-09-20 05:22:37 +00:00
Evgeniy Stepanov 09e7f243f1 Revert "[XRay][compiler-rt] FDRLogWriter Abstraction" and 1 more.
Revert the following 2 commits to fix standalone compiler-rt build:
* r342523 [XRay] Detect terminfo library
* r342518 [XRay][compiler-rt] FDRLogWriter Abstraction

llvm-svn: 342596
2018-09-19 22:29:56 +00:00
Dean Michael Berris b64f71b029 [XRay][compiler-rt] FDRLogWriter Abstraction
Summary:
This change introduces an `FDRLogWriter` type which is responsible for
serialising metadata and function records to character buffers. This is
the first step in a refactoring of the implementation of the FDR runtime
to allow for more granular testing of the individual components of the
implementation.

The main contribution of this change is a means of hiding the details of
how specific records are written to a buffer, and for managing the
extents of these buffers. We make use of C++ features (templates and
some metaprogramming) to reduce repetition in the act of writing out
specific kinds of records to the buffer.

In this process, we make a number of changes across both LLVM and
compiler-rt to allow us to use the `Trace` abstraction defined in the
LLVM project in the testing of the runtime implementation. This gives us
a closer end-to-end test which version-locks the runtime implementation
with the loading implementation in LLVM.

We also allow using gmock in compiler-rt unit tests, by adding the
requisite definitions in the `AddCompilerRT.cmake` module.

Finally, we've gone ahead and updated the FDR logging implementation to
use the FDRLogWriter for the lowest-level record-writing details.

Following patches will isolate the state machine transitions which
manage the set-up and tear-down of the buffers we're using in multiple
threads.

Reviewers: mboerger, eizan

Subscribers: mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52220

llvm-svn: 342518
2018-09-18 23:59:32 +00:00