Commit Graph

19 Commits

Author SHA1 Message Date
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Dean Michael Berris 25d505953a [XRay] Use preallocated memory for XRay profiling
Summary:
This change builds upon D54989, which removes memory allocation from the
critical path of the profiling implementation. This also changes the API
for the profile collection service, to take ownership of the memory and
associated data structures per-thread.

The consolidation of the memory allocation allows us to do two things:

- Limits the amount of memory used by the profiling implementation,
  associating preallocated buffers instead of allocating memory
  on-demand.

- Consolidate the memory initialisation and cleanup by relying on the
  buffer queue's reference counting implementation.

We find a number of places which also display some problematic
behaviour, including:

- Off-by-factor bug in the allocator implementation.

- Unrolling semantics in cases of "memory exhausted" situations, when
  managing the state of the function call trie.

We also add a few test cases which verify our understanding of the
behaviour of the system, with important edge-cases (especially for
memory-exhausted cases) in the segmented array and profile collector
unit tests.

Depends on D54989.

Reviewers: mboerger

Subscribers: dschuff, mgorny, dmgreen, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D55249

llvm-svn: 348568
2018-12-07 06:23:06 +00:00
Dean Michael Berris 190c49bc8f Re-land "[XRay] Move-only Allocator, FunctionCallTrie, and Array"
This reverts commit r348455, with some additional changes:

- Work-around deficiency of gcc-4.8 by duplicating the implementation of
  `AppendEmplace` in `Append`, but instead of using brace-init for the
  copy construction, use a placement new explicitly calling the copy
  constructor.

llvm-svn: 348563
2018-12-07 03:19:13 +00:00
Dean Michael Berris 82f7b21f17 Revert "[XRay] Move-only Allocator, FunctionCallTrie, and Array"
This reverts commits r348438, r348445, and r348449 due to breakages with
gcc-4.8 builds.

llvm-svn: 348455
2018-12-06 03:28:57 +00:00
Dean Michael Berris cb447a2604 Re-land r348335 "[XRay] Move-only Allocator, FunctionCallTrie, and Array"
Continuation of D54989.

Additional changes:

  - Use `.AppendEmplace(...)` instead of `.Append(Type{...})` to appease
    GCC 4.8 with confusion on when an initializer_list is used as
    opposed to a temporary aggregate initialized object.

llvm-svn: 348438
2018-12-06 00:25:56 +00:00
Hans Wennborg 83ff22c297 Revert r348335 "[XRay] Move-only Allocator, FunctionCallTrie, and Array"
.. and also the follow-ups r348336 r348338.

It broke stand-alone compiler-rt builds with GCC 4.8:

In file included from /work/llvm/projects/compiler-rt/lib/xray/xray_function_call_trie.h:20:0,
                 from /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.h:21,
                 from /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:15:
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget&}; T = __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget]’:
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71:   required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget]’
/work/llvm/projects/compiler-rt/lib/xray/xray_function_call_trie.h:517:54:   required from here
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget’
     new (AlignedOffset) T{std::forward<Args>(args)...};
     ^
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::profileCollectorService::{anonymous}::ThreadTrie&}; T = __xray::profileCollectorService::{anonymous}::ThreadTrie]’:
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71:   required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::profileCollectorService::{anonymous}::ThreadTrie]’
/work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:98:34:   required from here
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::profileCollectorService::{anonymous}::ThreadTrie&>((* & args#0))}’ from
‘<brace-enclosed initializer list>’ to ‘__xray::profileCollectorService::{anonymous}::ThreadTrie’
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::profileCollectorService::{anonymous}::ProfileBuffer&}; T = __xray::profileCollectorService::{anonymous}::ProfileBuffer]’:
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71:   required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::profileCollectorService::{anonymous}::ProfileBuffer]
’
/work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:244:44:   required from here
/work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::profileCollectorService::{anonymous}::ProfileBuffer&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::profileCollectorService::{anonymous}::ProfileBuffer’

> Summary:
> This change makes the allocator and function call trie implementations
> move-aware and remove the FunctionCallTrie's reliance on a
> heap-allocated set of allocators.
>
> The change makes it possible to always have storage associated with
> Allocator instances, not necessarily having heap-allocated memory
> obtainable from these allocator instances. We also use thread-local
> uninitialised storage.
>
> We've also re-worked the segmented array implementation to have more
> precondition and post-condition checks when built in debug mode. This
> enables us to better implement some of the operations with surrounding
> documentation as well. The `trim` algorithm now has more documentation
> on the implementation, reducing the requirement to handle special
> conditions, and being more rigorous on the computations involved.
>
> In this change we also introduce an initialisation guard, through which
> we prevent an initialisation operation from racing with a cleanup
> operation.
>
> We also ensure that the ThreadTries array is not destroyed while copies
> into the elements are still being performed by other threads submitting
> profiles.
>
> Note that this change still has an issue with accessing thread-local
> storage from signal handlers that are instrumented with XRay. We also
> learn that with the testing of this patch, that there will be cases
> where calls to mmap(...) (through internal_mmap(...)) might be called in
> signal handlers, but are not async-signal-safe. Subsequent patches will
> address this, by re-using the `BufferQueue` type used in the FDR mode
> implementation for pre-allocated memory segments per active, tracing
> thread.
>
> We still want to land this change despite the known issues, with fixes
> forthcoming.
>
> Reviewers: mboerger, jfb
>
> Subscribers: jfb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D54989

llvm-svn: 348346
2018-12-05 10:19:55 +00:00
Dean Michael Berris d49fc9c6fa [XRay] Use deallocateBuffer instead of deallocate
Follow-up to D54989.

llvm-svn: 348336
2018-12-05 07:05:44 +00:00
Dean Michael Berris adc880467d [XRay] Move-only Allocator, FunctionCallTrie, and Array
Summary:
This change makes the allocator and function call trie implementations
move-aware and remove the FunctionCallTrie's reliance on a
heap-allocated set of allocators.

The change makes it possible to always have storage associated with
Allocator instances, not necessarily having heap-allocated memory
obtainable from these allocator instances. We also use thread-local
uninitialised storage.

We've also re-worked the segmented array implementation to have more
precondition and post-condition checks when built in debug mode. This
enables us to better implement some of the operations with surrounding
documentation as well. The `trim` algorithm now has more documentation
on the implementation, reducing the requirement to handle special
conditions, and being more rigorous on the computations involved.

In this change we also introduce an initialisation guard, through which
we prevent an initialisation operation from racing with a cleanup
operation.

We also ensure that the ThreadTries array is not destroyed while copies
into the elements are still being performed by other threads submitting
profiles.

Note that this change still has an issue with accessing thread-local
storage from signal handlers that are instrumented with XRay. We also
learn that with the testing of this patch, that there will be cases
where calls to mmap(...) (through internal_mmap(...)) might be called in
signal handlers, but are not async-signal-safe. Subsequent patches will
address this, by re-using the `BufferQueue` type used in the FDR mode
implementation for pre-allocated memory segments per active, tracing
thread.

We still want to land this change despite the known issues, with fixes
forthcoming.

Reviewers: mboerger, jfb

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D54989

llvm-svn: 348335
2018-12-05 06:44:34 +00:00
Petr Hosek e7dec7848b [XRay] Support for Fuchsia
This extends XRay to support Fuchsia.

Differential Revision: https://reviews.llvm.org/D52162

llvm-svn: 347443
2018-11-22 02:00:44 +00:00
Dean Michael Berris 388af45f18 [XRay] Add a test for allocator exhaustion
Use a more representative test of allocating small chunks for
oddly-sized (small) objects from an allocator that has a page's worth of
memory.

llvm-svn: 347286
2018-11-20 03:56:04 +00:00
Dean Michael Berris 0cb22386e0 [XRay][compiler-rt] Update use of internal_mmap
Summary:
The implementation of `internal_mmap(...)` deviates from the contract of
`mmap(...)` -- i.e. error returns are actually the equivalent of `errno`
results. We update how XRay uses `internal_mmap(...)` to better handle
these error conditions.

In the process, we change the default pointers we're using from `char*`
to `uint8_t*` to prevent potential usage of the pointers in the string
library functions that expect to operate on `char*`.

We also take the chance to "promote" sizes of individual `internal_mmap`
requests to at least page size bytes, consistent with the expectations
of calls to `mmap`.

Reviewers: cryptoad, mboerger

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52361

llvm-svn: 342745
2018-09-21 16:34:42 +00:00
Dean Michael Berris 1a23d3bbce [XRay] Simplify FDR buffer management
Summary:
This change makes XRay FDR mode use a single backing store for the
buffer queue, and have indexes into that backing store instead. We also
remove the reliance on the internal allocator implementation in the FDR
mode logging implementation.

In the process of making this change we found an inconsistency with the
way we're returning buffers to the queue, and how we're setting the
extents. We take the chance to simplify the way we're managing the
extents of each buffer. It turns out we do not need the indirection for
the extents, so we co-host the atomic 64-bit int with the buffer object.
It also seems that we've not been returning the buffers for the thread
running the flush functionality when writing out the files, so we can
run into a situation where we could be missing data.

We consolidate all the allocation routines now into xray_allocator.h,
where we used to have routines defined in xray_buffer_queue.cc.

Reviewers: mboerger, eizan

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52077

llvm-svn: 342356
2018-09-17 03:09:01 +00:00
Dean Michael Berris edf0f6a79b [XRay] XRAY_NEVER_INSTRUMENT more functions, consolidate allocators
Summary:
In this change we apply `XRAY_NEVER_INSTRUMENT` to more functions in the
profiling implementation to ensure that these never get instrumented if
the compiler used to build the library is capable of doing XRay
instrumentation.

We also consolidate all the allocators into a single header
(xray_allocator.h) which sidestep the use of the internal allocator
implementation in sanitizer_common.

This addresses more cases mentioned in llvm.org/PR38577.

Reviewers: mboerger, eizan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51776

llvm-svn: 341647
2018-09-07 10:16:14 +00:00
Dean Michael Berris 560c733815 [XRay][compiler-rt] Remove MAP_NORESERVE from XRay allocations
Summary:
This reverses an earlier decision to allow seg-faulting from the
XRay-allocated memory if it turns out that the system cannot provide
physical memory backing that cannot be swapped in/out on Linux.

This addresses http://llvm.org/PR38588.

Reviewers: eizan

Reviewed By: eizan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50831

llvm-svn: 339869
2018-08-16 12:19:03 +00:00
David Carlier cfc1d1d46e [Xray] Fix allocator build, MAP_NORESERVE flag is not always supported
MAP_NORESERVE is not supported or a no-op on BSD.

Reviewers: dberris

Reviewed By: dberris

Differential Revision: https://reviews.llvm.org/D49494

llvm-svn: 337440
2018-07-19 05:08:59 +00:00
Dean Michael Berris 9d6b7a5f2b [XRay][compiler-rt] Simplify Allocator Implementation
Summary:
This change simplifies the XRay Allocator implementation to self-manage
an mmap'ed memory segment instead of using the internal allocator
implementation in sanitizer_common.

We've found through benchmarks and profiling these benchmarks in D48879
that using the internal allocator in sanitizer_common introduces a
bottleneck on allocating memory through a central spinlock. This change
allows thread-local allocators to eliminate contention on the
centralized allocator.

To get the most benefit from this approach, we also use a managed
allocator for the chunk elements used by the segmented array
implementation. This gives us the chance to amortize the cost of
allocating memory when creating these internal segmented array data
structures.

We also took the opportunity to remove the preallocation argument from
the allocator API, simplifying the usage of the allocator throughout the
profiling implementation.

In this change we also tweak some of the flag values to reduce the
amount of maximum memory we use/need for each thread, when requesting
memory through mmap.

Depends on D48956.

Reviewers: kpw, eizan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49217

llvm-svn: 337342
2018-07-18 01:53:39 +00:00
Dean Michael Berris 0dd4f9f22f [XRay][compiler-rt] xray::Array Freelist and Iterator Updates
Summary:
We found a bug while working on a benchmark for the profiling mode which
manifests as a segmentation fault in the profiling handler's
implementation. This change adds unit tests which replicate the
issues in isolation.

We've tracked this down as a bug in the implementation of the Freelist
in the `xray::Array` type. This happens when we trim the array by a
number of elements, where we've been incorrectly assigning pointers for
the links in the freelist of chunk nodes. We've taken the chance to add
more debug-only assertions to the code path and allow us to verify these
assumptions in debug builds.

In the process, we also took the opportunity to use iterators to
implement both `front()` and `back()` which exposes a bug in the
iterator decrement operation.  In particular, when we decrement past a
chunk size boundary, we end up moving too far back and reaching the
`SentinelChunk` prematurely.

This change unblocks us to allow for contributing the non-crashing
version of the benchmarks in the test-suite as well.

Reviewers: kpw

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D48653

llvm-svn: 336644
2018-07-10 08:25:44 +00:00
Reid Kleckner d44a6eab40 Fix GCC 4.8.5 build by avoiding braced default member initializer
Use internal_memset instead.  Should revive all Linux Chromium ToT bots.

llvm-svn: 333678
2018-05-31 18:26:36 +00:00
Dean Michael Berris 26e81209ef [XRay][profiler] Part 1: XRay Allocator and Array Implementations
Summary:
This change is part of the larger XRay Profiling Mode effort.

Here we implement an arena allocator, for fixed sized buffers used in a
segmented array implementation. This change adds the segmented array
data structure, which relies on the allocator to provide and maintain
the storage for the segmented array.

Key features of the `Allocator` type:

*  It uses cache-aligned blocks, intended to host the actual data. These
   blocks are cache-line-size multiples of contiguous bytes.

*  The `Allocator` has a maximum memory budget, set at construction
   time. This allows us to cap the amount of data each specific
   `Allocator` instance is responsible for.

*  Upon destruction, the `Allocator` will clean up the storage it's
   used, handing it back to the internal allocator used in
   sanitizer_common.

Key features of the `Array` type:

*  Each segmented array is always backed by an `Allocator`, which is
   either user-provided or uses a global allocator.

*  When an `Array` grows, it grows by appending a segment that's
   fixed-sized. The size of each segment is computed by the number of
   elements of type `T` that can fit into cache line multiples.

*  An `Array` does not return memory to the `Allocator`, but it can keep
   track of the current number of "live" objects it stores.

*  When an `Array` is destroyed, it will not return memory to the
   `Allocator`. Users should clean up the `Allocator` independently of
   the `Array`.

*  The `Array` type keeps a freelist of the chunks it's used before, so
   that trimming and growing will re-use previously allocated chunks.

These basic data structures are used by the XRay Profiling Mode
implementation to implement efficient and cache-aware storage for data
that's typically read-and-write heavy for tracking latency information.
We're relying on the cache line characteristics of the architecture to
provide us good data isolation and cache friendliness, when we're
performing operations like searching for elements and/or updating data
hosted in these cache lines.

Reviewers: echristo, pelikan, kpw

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D45756

llvm-svn: 331141
2018-04-29 13:46:30 +00:00