llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Dean Michael Berris	25d505953a	[XRay] Use preallocated memory for XRay profiling Summary: This change builds upon D54989, which removes memory allocation from the critical path of the profiling implementation. This also changes the API for the profile collection service, to take ownership of the memory and associated data structures per-thread. The consolidation of the memory allocation allows us to do two things: - Limits the amount of memory used by the profiling implementation, associating preallocated buffers instead of allocating memory on-demand. - Consolidate the memory initialisation and cleanup by relying on the buffer queue's reference counting implementation. We find a number of places which also display some problematic behaviour, including: - Off-by-factor bug in the allocator implementation. - Unrolling semantics in cases of "memory exhausted" situations, when managing the state of the function call trie. We also add a few test cases which verify our understanding of the behaviour of the system, with important edge-cases (especially for memory-exhausted cases) in the segmented array and profile collector unit tests. Depends on D54989. Reviewers: mboerger Subscribers: dschuff, mgorny, dmgreen, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D55249 llvm-svn: 348568	2018-12-07 06:23:06 +00:00
Dean Michael Berris	190c49bc8f	Re-land "[XRay] Move-only Allocator, FunctionCallTrie, and Array" This reverts commit r348455, with some additional changes: - Work-around deficiency of gcc-4.8 by duplicating the implementation of `AppendEmplace` in `Append`, but instead of using brace-init for the copy construction, use a placement new explicitly calling the copy constructor. llvm-svn: 348563	2018-12-07 03:19:13 +00:00
Dean Michael Berris	82f7b21f17	Revert "[XRay] Move-only Allocator, FunctionCallTrie, and Array" This reverts commits r348438, r348445, and r348449 due to breakages with gcc-4.8 builds. llvm-svn: 348455	2018-12-06 03:28:57 +00:00
Dean Michael Berris	cb447a2604	Re-land r348335 "[XRay] Move-only Allocator, FunctionCallTrie, and Array" Continuation of D54989. Additional changes: - Use `.AppendEmplace(...)` instead of `.Append(Type{...})` to appease GCC 4.8 with confusion on when an initializer_list is used as opposed to a temporary aggregate initialized object. llvm-svn: 348438	2018-12-06 00:25:56 +00:00
Hans Wennborg	83ff22c297	Revert r348335 "[XRay] Move-only Allocator, FunctionCallTrie, and Array" .. and also the follow-ups r348336 r348338. It broke stand-alone compiler-rt builds with GCC 4.8: In file included from /work/llvm/projects/compiler-rt/lib/xray/xray_function_call_trie.h:20:0, from /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.h:21, from /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:15: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget&}; T = __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget]’: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71: required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget]’ /work/llvm/projects/compiler-rt/lib/xray/xray_function_call_trie.h:517:54: required from here /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget’ new (AlignedOffset) T{std::forward<Args>(args)...}; ^ /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::profileCollectorService::{anonymous}::ThreadTrie&}; T = __xray::profileCollectorService::{anonymous}::ThreadTrie]’: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71: required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::profileCollectorService::{anonymous}::ThreadTrie]’ /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:98:34: required from here /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::profileCollectorService::{anonymous}::ThreadTrie&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::profileCollectorService::{anonymous}::ThreadTrie’ /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::profileCollectorService::{anonymous}::ProfileBuffer&}; T = __xray::profileCollectorService::{anonymous}::ProfileBuffer]’: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71: required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::profileCollectorService::{anonymous}::ProfileBuffer] ’ /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:244:44: required from here /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::profileCollectorService::{anonymous}::ProfileBuffer&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::profileCollectorService::{anonymous}::ProfileBuffer’ > Summary: > This change makes the allocator and function call trie implementations > move-aware and remove the FunctionCallTrie's reliance on a > heap-allocated set of allocators. > > The change makes it possible to always have storage associated with > Allocator instances, not necessarily having heap-allocated memory > obtainable from these allocator instances. We also use thread-local > uninitialised storage. > > We've also re-worked the segmented array implementation to have more > precondition and post-condition checks when built in debug mode. This > enables us to better implement some of the operations with surrounding > documentation as well. The `trim` algorithm now has more documentation > on the implementation, reducing the requirement to handle special > conditions, and being more rigorous on the computations involved. > > In this change we also introduce an initialisation guard, through which > we prevent an initialisation operation from racing with a cleanup > operation. > > We also ensure that the ThreadTries array is not destroyed while copies > into the elements are still being performed by other threads submitting > profiles. > > Note that this change still has an issue with accessing thread-local > storage from signal handlers that are instrumented with XRay. We also > learn that with the testing of this patch, that there will be cases > where calls to mmap(...) (through internal_mmap(...)) might be called in > signal handlers, but are not async-signal-safe. Subsequent patches will > address this, by re-using the `BufferQueue` type used in the FDR mode > implementation for pre-allocated memory segments per active, tracing > thread. > > We still want to land this change despite the known issues, with fixes > forthcoming. > > Reviewers: mboerger, jfb > > Subscribers: jfb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D54989 llvm-svn: 348346	2018-12-05 10:19:55 +00:00
Dean Michael Berris	adc880467d	[XRay] Move-only Allocator, FunctionCallTrie, and Array Summary: This change makes the allocator and function call trie implementations move-aware and remove the FunctionCallTrie's reliance on a heap-allocated set of allocators. The change makes it possible to always have storage associated with Allocator instances, not necessarily having heap-allocated memory obtainable from these allocator instances. We also use thread-local uninitialised storage. We've also re-worked the segmented array implementation to have more precondition and post-condition checks when built in debug mode. This enables us to better implement some of the operations with surrounding documentation as well. The `trim` algorithm now has more documentation on the implementation, reducing the requirement to handle special conditions, and being more rigorous on the computations involved. In this change we also introduce an initialisation guard, through which we prevent an initialisation operation from racing with a cleanup operation. We also ensure that the ThreadTries array is not destroyed while copies into the elements are still being performed by other threads submitting profiles. Note that this change still has an issue with accessing thread-local storage from signal handlers that are instrumented with XRay. We also learn that with the testing of this patch, that there will be cases where calls to mmap(...) (through internal_mmap(...)) might be called in signal handlers, but are not async-signal-safe. Subsequent patches will address this, by re-using the `BufferQueue` type used in the FDR mode implementation for pre-allocated memory segments per active, tracing thread. We still want to land this change despite the known issues, with fixes forthcoming. Reviewers: mboerger, jfb Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D54989 llvm-svn: 348335	2018-12-05 06:44:34 +00:00
Dean Michael Berris	ebfbf89000	[XRay] Handle allocator exhaustion in segmented array Summary: This change allows us to handle allocator exhaustion properly in the segmented array implementation. Before this change, we relied on the caller of the `trim` function to provide a valid number of elements to trim. This change allows us to do the right thing in case the elements to trim is greater than the size of the container. Reviewers: mboerger, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53484 llvm-svn: 344880	2018-10-22 02:11:27 +00:00
Dean Michael Berris	edf0f6a79b	[XRay] XRAY_NEVER_INSTRUMENT more functions, consolidate allocators Summary: In this change we apply `XRAY_NEVER_INSTRUMENT` to more functions in the profiling implementation to ensure that these never get instrumented if the compiler used to build the library is capable of doing XRay instrumentation. We also consolidate all the allocators into a single header (xray_allocator.h) which sidestep the use of the internal allocator implementation in sanitizer_common. This addresses more cases mentioned in llvm.org/PR38577. Reviewers: mboerger, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51776 llvm-svn: 341647	2018-09-07 10:16:14 +00:00
Dean Michael Berris	21d4a1eec7	[XRay][compiler-rt] Avoid InternalAlloc(...) in Profiling Mode Summary: We avoid using dynamic memory allocated with the internal allocator in the profile collection service used by profiling mode. We use aligned storage for globals and in-struct storage of objects we dynamically initialize. We also remove the dependency on `Vector<...>` which also internally uses the dynamic allocator in sanitizer_common (InternalAlloc) in favour of the XRay allocator and segmented array implementation. This change addresses llvm.org/PR38577. Reviewers: eizan Reviewed By: eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50782 llvm-svn: 339978	2018-08-17 01:57:42 +00:00
David Carlier	12be7b7bf7	[Xray] fix c99 warning build about flexible array semantics Reviewers: dberris Reviewed By: dberris Differential Revision: https://reviews.llvm.org/D49590 llvm-svn: 337536	2018-07-20 09:22:22 +00:00
Dean Michael Berris	4719c52455	[XRay][compiler-rt] Segmented Array: Simplify and Optimise Summary: This is a follow-on to D49217 which simplifies and optimises the implementation of the segmented array. In this patch we co-locate the book-keeping for segments in the `__xray::Array<T>` with the data it's managing. We take the chance in this patch to actually rename `Chunk` to `Segment` to better align with the high-level description of the segmented array. With measurements using benchmarks landed in D48879, we've identified that calls to `pthread_getspecific` started dominating the cycles, which led us to revert the change made in D49217 to use C++ thread_local initialisation instead (it reduces the cost by a huge margin, since we save one PLT-based call to pthread functions in the hot path). In particular, this is in `__xray::getThreadLocalData()`. We also took the opportunity to remove the least-common-multiple based calculation and instead pack as much data into segments of the array. This greatly simplifies the API of the container which hides as much of the implementation details as possible. For instance, we calculate the number of elements we need for the each segment internally in the Array instead of making it part of the type. With the changes here, we're able to get a measurable improvement on the performance of profiling mode on top of what D48879 already provides. Depends on D48879. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49363 llvm-svn: 337343	2018-07-18 02:08:39 +00:00
Dean Michael Berris	9d6b7a5f2b	[XRay][compiler-rt] Simplify Allocator Implementation Summary: This change simplifies the XRay Allocator implementation to self-manage an mmap'ed memory segment instead of using the internal allocator implementation in sanitizer_common. We've found through benchmarks and profiling these benchmarks in D48879 that using the internal allocator in sanitizer_common introduces a bottleneck on allocating memory through a central spinlock. This change allows thread-local allocators to eliminate contention on the centralized allocator. To get the most benefit from this approach, we also use a managed allocator for the chunk elements used by the segmented array implementation. This gives us the chance to amortize the cost of allocating memory when creating these internal segmented array data structures. We also took the opportunity to remove the preallocation argument from the allocator API, simplifying the usage of the allocator throughout the profiling implementation. In this change we also tweak some of the flag values to reduce the amount of maximum memory we use/need for each thread, when requesting memory through mmap. Depends on D48956. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49217 llvm-svn: 337342	2018-07-18 01:53:39 +00:00
Dean Michael Berris	0dd4f9f22f	[XRay][compiler-rt] xray::Array Freelist and Iterator Updates Summary: We found a bug while working on a benchmark for the profiling mode which manifests as a segmentation fault in the profiling handler's implementation. This change adds unit tests which replicate the issues in isolation. We've tracked this down as a bug in the implementation of the Freelist in the `xray::Array` type. This happens when we trim the array by a number of elements, where we've been incorrectly assigning pointers for the links in the freelist of chunk nodes. We've taken the chance to add more debug-only assertions to the code path and allow us to verify these assumptions in debug builds. In the process, we also took the opportunity to use iterators to implement both `front()` and `back()` which exposes a bug in the iterator decrement operation. In particular, when we decrement past a chunk size boundary, we end up moving too far back and reaching the `SentinelChunk` prematurely. This change unblocks us to allow for contributing the non-crashing version of the benchmarks in the test-suite as well. Reviewers: kpw Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48653 llvm-svn: 336644	2018-07-10 08:25:44 +00:00
Dean Michael Berris	ca856e07de	[XRay] Fixup: Address some warnings breaking build Follow-up to D45758. llvm-svn: 333625	2018-05-31 04:55:11 +00:00
Dean Michael Berris	238aa1366e	[XRay][compiler-rt] Relocate a DCHECK to the correct location. Fixes a bad DCHECK where the condition being checked is still valid (for iterators pointing to sentinels). Follow-up to D45756. llvm-svn: 332212	2018-05-14 04:21:12 +00:00
Dean Michael Berris	26e81209ef	[XRay][profiler] Part 1: XRay Allocator and Array Implementations Summary: This change is part of the larger XRay Profiling Mode effort. Here we implement an arena allocator, for fixed sized buffers used in a segmented array implementation. This change adds the segmented array data structure, which relies on the allocator to provide and maintain the storage for the segmented array. Key features of the `Allocator` type: * It uses cache-aligned blocks, intended to host the actual data. These blocks are cache-line-size multiples of contiguous bytes. * The `Allocator` has a maximum memory budget, set at construction time. This allows us to cap the amount of data each specific `Allocator` instance is responsible for. * Upon destruction, the `Allocator` will clean up the storage it's used, handing it back to the internal allocator used in sanitizer_common. Key features of the `Array` type: * Each segmented array is always backed by an `Allocator`, which is either user-provided or uses a global allocator. * When an `Array` grows, it grows by appending a segment that's fixed-sized. The size of each segment is computed by the number of elements of type `T` that can fit into cache line multiples. * An `Array` does not return memory to the `Allocator`, but it can keep track of the current number of "live" objects it stores. * When an `Array` is destroyed, it will not return memory to the `Allocator`. Users should clean up the `Allocator` independently of the `Array`. * The `Array` type keeps a freelist of the chunks it's used before, so that trimming and growing will re-use previously allocated chunks. These basic data structures are used by the XRay Profiling Mode implementation to implement efficient and cache-aware storage for data that's typically read-and-write heavy for tracking latency information. We're relying on the cache line characteristics of the architecture to provide us good data isolation and cache friendliness, when we're performing operations like searching for elements and/or updating data hosted in these cache lines. Reviewers: echristo, pelikan, kpw Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D45756 llvm-svn: 331141	2018-04-29 13:46:30 +00:00

17 Commits