Commit Graph

7 Commits

Author SHA1 Message Date
Dean Michael Berris 9d6b7a5f2b [XRay][compiler-rt] Simplify Allocator Implementation
Summary:
This change simplifies the XRay Allocator implementation to self-manage
an mmap'ed memory segment instead of using the internal allocator
implementation in sanitizer_common.

We've found through benchmarks and profiling these benchmarks in D48879
that using the internal allocator in sanitizer_common introduces a
bottleneck on allocating memory through a central spinlock. This change
allows thread-local allocators to eliminate contention on the
centralized allocator.

To get the most benefit from this approach, we also use a managed
allocator for the chunk elements used by the segmented array
implementation. This gives us the chance to amortize the cost of
allocating memory when creating these internal segmented array data
structures.

We also took the opportunity to remove the preallocation argument from
the allocator API, simplifying the usage of the allocator throughout the
profiling implementation.

In this change we also tweak some of the flag values to reduce the
amount of maximum memory we use/need for each thread, when requesting
memory through mmap.

Depends on D48956.

Reviewers: kpw, eizan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49217

llvm-svn: 337342
2018-07-18 01:53:39 +00:00
Dean Michael Berris b251f62428 [XRay][compiler-rt] Fixup build breakage
Changes:

- Remove static assertion on size of a structure, fails on systems where
  pointers aren't 8 bytes.

- Use size_t instead of deducing type of arguments to
  `nearest_boundary`.

Follow-up to D48653.

llvm-svn: 336648
2018-07-10 08:58:12 +00:00
Dean Michael Berris 0dd4f9f22f [XRay][compiler-rt] xray::Array Freelist and Iterator Updates
Summary:
We found a bug while working on a benchmark for the profiling mode which
manifests as a segmentation fault in the profiling handler's
implementation. This change adds unit tests which replicate the
issues in isolation.

We've tracked this down as a bug in the implementation of the Freelist
in the `xray::Array` type. This happens when we trim the array by a
number of elements, where we've been incorrectly assigning pointers for
the links in the freelist of chunk nodes. We've taken the chance to add
more debug-only assertions to the code path and allow us to verify these
assumptions in debug builds.

In the process, we also took the opportunity to use iterators to
implement both `front()` and `back()` which exposes a bug in the
iterator decrement operation.  In particular, when we decrement past a
chunk size boundary, we end up moving too far back and reaching the
`SentinelChunk` prematurely.

This change unblocks us to allow for contributing the non-crashing
version of the benchmarks in the test-suite as well.

Reviewers: kpw

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48653

llvm-svn: 336644
2018-07-10 08:25:44 +00:00
Dean Michael Berris cfd7eec3d8 [XRay][profiler] Part 4: Profiler Mode Wiring
Summary:
This is part of the larger XRay Profiling Mode effort.

This patch implements the wiring required to enable us to actually
select the `xray-profiling` mode, and install the handlers to start
measuring the time and frequency of the function calls in call stacks.
The current way to get the profile information is by working with the
XRay API to `__xray_process_buffers(...)`.

In subsequent changes we'll implement profile saving to files, similar
to how the FDR and basic modes operate, as well as means for converting
this format into those that can be loaded/visualised as flame graphs. We
will also be extending the accounting tool in LLVM to support
stack-based function call accounting.

We also continue with the implementation to support building small
histograms of latencies for the `FunctionCallTrie::Node` type, to allow
us to actually approximate the distribution of latencies per function.

Depends on D45758 and D46998.

Reviewers: eizan, kpw, pelikan

Reviewed By: kpw

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D44620

llvm-svn: 334469
2018-06-12 03:29:39 +00:00
Dean Michael Berris d1fe506694 [XRay] Fixup: Remove unnecessary type alias
Follow-up to D45758.

llvm-svn: 333628
2018-05-31 05:25:47 +00:00
Dean Michael Berris 1eb8c206cd [XRay][profiler] Part 3: Profile Collector Service
Summary:
This is part of the larger XRay Profiling Mode effort.

This patch implements a centralised collector for `FunctionCallTrie`
instances, associated per thread. It maintains a global set of trie
instances which can be retrieved through the XRay API for processing
in-memory buffers (when registered). Future changes will include the
wiring to implement the actual profiling mode implementation.

This central service provides the following functionality:

*  Posting a `FunctionCallTrie` associated with a thread, to the central
   list of tries.

*  Serializing all the posted `FunctionCallTrie` instances into
   in-memory buffers.

*  Resetting the global state of the serialized buffers and tries.

Depends on D45757.

Reviewers: echristo, pelikan, kpw

Reviewed By: kpw

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D45758

llvm-svn: 333624
2018-05-31 04:33:52 +00:00
Dean Michael Berris 980d93d0e0 [XRay][profiler] Part 2: XRay Function Call Trie
Summary:
This is part of the larger XRay Profiling Mode effort.

This patch implements a central data structure for capturing statistics
about XRay instrumented function call stacks. The `FunctionCallTrie`
type does the following things:

*  It keeps track of a shadow function call stack of XRay instrumented
   functions as they are entered (function enter event) and as they are
   exited (function exit event).

*  When a function is entered, the shadow stack contains information
   about the entry TSC, and updates the trie (or prefix tree)
   representing the current function call stack. If we haven't
   encountered this function call before, this creates a unique node for
   the function in this position on the stack. We update the list of
   callees of the parent function as well to reflect this newly found
   path.

*  When a function is exited, we compute statistics (TSC deltas,
   function call count frequency) for the associated function(s) up the
   stack as we unwind to find the matching entry event.

This builds upon the XRay `Allocator` and `Array` types in Part 1 of
this series of patches.

Depends on D45756.

Reviewers: echristo, pelikan, kpw

Reviewed By: kpw

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D45757

llvm-svn: 332313
2018-05-15 00:42:36 +00:00