//===-- xray_buffer_queue.h ------------------------------------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file is a part of XRay, a dynamic runtime instrumentation system.
//
// Defines the interface for a buffer queue implementation.
//
//===----------------------------------------------------------------------===//
#ifndef XRAY_BUFFER_QUEUE_H
#define XRAY_BUFFER_QUEUE_H

#include <cstddef>
#include "sanitizer_common/sanitizer_atomic.h"
#include "sanitizer_common/sanitizer_mutex.h"

namespace __xray {

/// BufferQueue implements a circular queue of fixed sized buffers (much like a
/// freelist) but is concerned mostly with making it really quick to initialise,
/// finalise, and get/return buffers to the queue. This is one key component of
/// the "flight data recorder" (FDR) mode to support ongoing XRay function call
/// trace collection.
class BufferQueue {
public:
  // Change note: [XRay] Use optimistic logging model for FDR mode
  // (llvm-svn: 318733; Differential Revision: https://reviews.llvm.org/D39526;
  // 2017-11-21 15:16:57 +08:00; reviewers: dblaikie, pelikan, kpw;
  // subscribers: llvm-commits, hiraditya).
  //
  // Before this change, the FDR mode implementation relied on thread-exit
  // handling to return buffers back to the (global) buffer queue. This
  // introduces issues with the initialisation of the thread_local objects
  // which, even through the use of pthread_setspecific(...), may eventually
  // call into an allocation function. Similar to previous changes in this
  // line, we're finding that there is a huge potential for deadlocks when
  // initialising these thread-locals when the memory allocation
  // implementation is also xray-instrumented.
  //
  // In this change, we limit the call to pthread_setspecific(...) to provide
  // a non-null value to associate to the key created with
  // pthread_key_create(...). While this doesn't completely eliminate the
  // potential for the deadlock(s), it does allow us to still clean up at
  // thread exit when we need to. The change is that we don't need to do more
  // work when starting and ending a thread's lifetime. We also have a test
  // to make sure that we actually can safely recycle the buffers in case we
  // end up re-using the buffer(s) available from the queue on multiple
  // thread entry/exits.
  //
  // This change cuts across both LLVM and compiler-rt to allow us to update
  // both the XRay runtime implementation as well as the library support for
  // loading these new versions of the FDR mode logging. Version 2 of the FDR
  // logging implementation makes the following changes:
  //
  //   * Introduction of a new 'BufferExtents' metadata record that's outside
  //     of the buffer's contents but is written before the actual buffer.
  //     This data is associated to the Buffer handed out by the BufferQueue
  //     rather than a record that occupies bytes in the actual buffer.
  //   * Removal of the "end of buffer" records. This is in line with the
  //     changes described above, to allow for optimistic logging without
  //     explicit record writing at thread exit.
  //
  // The optimistic logging model operates under the following assumptions:
  //
  //   * Threads writing to the buffers will potentially race with the thread
  //     attempting to flush the log. To keep this situation from occurring,
  //     we make sure that once we've finalized the logging implementation,
  //     threads will see this finalization state on the next write, and
  //     either choose to not write records the thread would have written or
  //     write the record(s) in two phases: first write the record(s), then
  //     update the extents metadata.
  //   * We change the buffer queue implementation so that once it has handed
  //     out a buffer to a thread, we assume that buffer is marked "used" to
  //     be able to capture partial writes. None of this will be safe to
  //     handle if threads are racing to write the extents records and the
  //     reader thread is attempting to flush the log. The optimism comes
  //     from the finalization routine being required to complete before we
  //     attempt to flush the log.
  //
  // This is a fairly significant semantics change for the FDR
  // implementation. This is why we've decided to update the version number
  // for FDR mode logs. The tools, however, still need to be able to support
  // older versions of the log until we finally deprecate those earlier
  // versions.
  struct alignas(64) BufferExtents {
    __sanitizer::atomic_uint64_t Size;
  };

  struct Buffer {
    void *Buffer = nullptr;
    size_t Size = 0;
    BufferExtents *Extents;
  };

private:
  struct BufferRep {
    // The managed buffer.
    Buffer Buff;

    // This is true once the buffer has been handed out to a thread or has been
    // returned to the available queue; such a buffer is considered "used" and
    // is the only kind visited by apply().
    bool Used = false;
  };

  // Size of each individual Buffer.
  size_t BufferSize;
  // Change note: [XRay][compiler-rt] XRay Flight Data Recorder Mode
  // (llvm-svn: 293015; Differential Revision: https://reviews.llvm.org/D27038;
  // 2017-01-25 11:50:46 +08:00; reviewers: echristo, rSerge, majnemer;
  // subscribers: mehdi_amini, llvm-commits, mgorny).
  //
  // In this change we introduce the notion of a "flight data recorder" mode
  // for XRay logging, where XRay logs in-memory first, and writes out data
  // on-demand as required (as opposed to the naive implementation that keeps
  // logging while tracing is "on"). This depends on D26232 where we
  // implement the core data structure for holding the buffers that threads
  // will be using to write out records of operation.
  //
  // This implementation only currently works on x86_64 and depends heavily
  // on the TSC math to write out smaller records to the in-memory buffers.
  //
  // Also, this implementation defines two different kinds of records with
  // different sizes (compared to the current naive implementation): a
  // MetadataRecord (16 bytes) and a FunctionRecord (8 bytes). MetadataRecord
  // entries are meant to write out information like the thread ID for which
  // the metadata record is defined, whether the execution of a thread
  // moved to a different CPU, etc., while a FunctionRecord represents the
  // different kinds of function call entry/exit records we might encounter
  // in the course of a thread's execution along with a delta from the last
  // time the logging handler was called.
  //
  // While this implementation is not exactly what is described in the
  // original XRay whitepaper, this one gives us an initial implementation
  // that we can iterate and build upon.

  BufferRep *Buffers;
  size_t BufferCount;

  __sanitizer::SpinMutex Mutex;
  __sanitizer::atomic_uint8_t Finalizing;

  // Pointers to buffers managed/owned by the BufferQueue.
  void **OwnedBuffers;

  // Pointer to the next buffer to be handed out.
  BufferRep *Next;

  // Pointer to the entry in the array where the next released buffer will be
  // placed.
  BufferRep *First;

  // Count of buffers that have been handed out through 'getBuffer'.
  size_t LiveBuffers;

public:
  enum class ErrorCode : unsigned {
    Ok,
    NotEnoughMemory,
    QueueFinalizing,
    UnrecognizedBuffer,
    AlreadyFinalized,
  };

  static const char *getErrorString(ErrorCode E) {
    switch (E) {
    case ErrorCode::Ok:
      return "(none)";
    case ErrorCode::NotEnoughMemory:
      return "no available buffers in the queue";
    case ErrorCode::QueueFinalizing:
      return "queue already finalizing";
    case ErrorCode::UnrecognizedBuffer:
      return "buffer being returned not owned by buffer queue";
    case ErrorCode::AlreadyFinalized:
      return "queue already finalized";
    }
    return "unknown error";
  }

  /// Initialise a queue of size |N| with buffers of size |B|. We report success
  /// through |Success|.
  BufferQueue(size_t B, size_t N, bool &Success);

  /// Updates |Buf| to contain the pointer to an appropriate buffer. Returns an
  /// error when no buffers are available, i.e. when handing out another buffer
  /// would run over the upper bound for the total number of buffers.
  ///
  /// Requirements:
  ///   - BufferQueue is not finalising.
  ///
  /// Returns:
  ///   - ErrorCode::NotEnoughMemory when no buffers are available in the
  ///     queue.
  ///   - ErrorCode::Ok when we find a Buffer.
  ///   - ErrorCode::QueueFinalizing or ErrorCode::AlreadyFinalized on
  ///     a finalizing/finalized BufferQueue.
  ErrorCode getBuffer(Buffer &Buf);

  /// Updates |Buf| to point to nullptr, with size 0.
  ///
  /// Returns:
  ///   - ErrorCode::Ok when we successfully release the buffer.
  ///   - ErrorCode::UnrecognizedBuffer for when this BufferQueue does not own
  ///     the buffer being released.
  ErrorCode releaseBuffer(Buffer &Buf);

  bool finalizing() const {
    return __sanitizer::atomic_load(&Finalizing,
                                    __sanitizer::memory_order_acquire);
  }

  /// Returns the configured size of the buffers in the buffer queue.
  size_t ConfiguredBufferSize() const { return BufferSize; }

  /// Sets the state of the BufferQueue to finalizing, which ensures that:
  ///
  ///   - All subsequent attempts to retrieve a Buffer will fail.
  ///   - All releaseBuffer operations will not fail.
  ///
  /// After a call to finalize succeeds, all subsequent calls to finalize will
  /// fail with ErrorCode::QueueFinalizing.
  ErrorCode finalize();

  /// Applies the provided function F to each Buffer in the queue, only if the
  /// Buffer is marked 'used' (i.e. has been the result of getBuffer(...) and a
  /// releaseBuffer(...) operation).
  template <class F> void apply(F Fn) {
    __sanitizer::SpinMutexLock G(&Mutex);
    for (auto I = Buffers, E = Buffers + BufferCount; I != E; ++I) {
      const auto &T = *I;
      if (T.Used) Fn(T.Buff);
    }
  }

  // Cleans up allocated buffers.
  ~BufferQueue();
};

} // namespace __xray

#endif // XRAY_BUFFER_QUEUE_H
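
The get/release/finalize contract documented in this header can be illustrated with a small, self-contained model. The sketch below is not the XRay implementation: `ToyBufferQueue` and its members are hypothetical names, it uses `std::mutex` and `std::atomic` in place of the sanitizer primitives, and it keeps a simple free list instead of the circular `Next`/`First` bookkeeping. It only mirrors the documented return codes, assuming non-zero buffer size and count.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical, simplified stand-in for __xray::BufferQueue. It models the
// documented contract only; it is not the sanitizer-based implementation.
class ToyBufferQueue {
public:
  enum class ErrorCode { Ok, NotEnoughMemory, QueueFinalizing, UnrecognizedBuffer };

  struct Buffer {
    void *Data = nullptr;
    std::size_t Size = 0;
  };

  // Creates N buffers of B bytes each (assumption: both are non-zero).
  ToyBufferQueue(std::size_t B, std::size_t N)
      : BufferSize(B), Storage(N, std::vector<char>(B)), Free(N) {
    assert(B > 0 && N > 0);
    for (std::size_t I = 0; I < N; ++I)
      Free[I] = I; // every slot starts out available
  }

  // Hands out a buffer unless the queue is finalizing or exhausted.
  ErrorCode getBuffer(Buffer &Buf) {
    if (Finalizing.load(std::memory_order_acquire))
      return ErrorCode::QueueFinalizing;
    std::lock_guard<std::mutex> G(Mutex);
    if (Free.empty())
      return ErrorCode::NotEnoughMemory;
    std::size_t Slot = Free.back();
    Free.pop_back();
    Buf.Data = Storage[Slot].data();
    Buf.Size = BufferSize;
    return ErrorCode::Ok;
  }

  // Returns a buffer to the queue; still permitted while finalizing.
  ErrorCode releaseBuffer(Buffer &Buf) {
    std::lock_guard<std::mutex> G(Mutex);
    for (std::size_t I = 0; I < Storage.size(); ++I) {
      if (Storage[I].data() != Buf.Data)
        continue;
      Free.push_back(I);
      Buf = Buffer{}; // reset to nullptr / size 0, as documented
      return ErrorCode::Ok;
    }
    return ErrorCode::UnrecognizedBuffer;
  }

  // The first call succeeds; later calls report QueueFinalizing.
  ErrorCode finalize() {
    if (Finalizing.exchange(true, std::memory_order_acq_rel))
      return ErrorCode::QueueFinalizing;
    return ErrorCode::Ok;
  }

private:
  std::size_t BufferSize;
  std::vector<std::vector<char>> Storage; // backing memory, one entry per buffer
  std::vector<std::size_t> Free;          // indices of buffers not handed out
  std::mutex Mutex;
  std::atomic<bool> Finalizing{false};
};
```

In this model a caller checks the returned `ErrorCode` after every call, and a buffer that was handed out before `finalize()` can still be released afterwards, which matches the guarantees listed in the `finalize()` comment.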
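
The two-phase write that motivates `BufferExtents` (write the record bytes first, then update the extents metadata) might look roughly like the following. This is a sketch under stated assumptions rather than the FDR-mode code: `ToyExtents`, `ToyBuffer`, `writeRecord` and `committedBytes` are hypothetical names, `std::atomic` stands in for the sanitizer atomics, and each buffer is assumed to have exactly one writer thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for BufferQueue::BufferExtents.
struct alignas(64) ToyExtents {
  std::atomic<std::uint64_t> Size{0};
};

// Hypothetical stand-in for BufferQueue::Buffer.
struct ToyBuffer {
  char *Data = nullptr;
  std::size_t Size = 0;
  ToyExtents *Extents = nullptr;
};

// Phase 1 copies the record into the buffer; phase 2 publishes the new extent.
// Returns false, writing nothing, if finalization has been observed or the
// record does not fit. Assumes a single writer per buffer.
inline bool writeRecord(ToyBuffer &Buf, const std::atomic<bool> &Finalizing,
                        const void *Record, std::size_t Len) {
  assert(Buf.Extents != nullptr);
  if (Finalizing.load(std::memory_order_acquire))
    return false; // choose not to write once finalization is visible
  std::uint64_t Offset = Buf.Extents->Size.load(std::memory_order_relaxed);
  if (Len > Buf.Size - Offset)
    return false; // not enough room left in this buffer
  std::memcpy(Buf.Data + Offset, Record, Len);                      // phase 1
  Buf.Extents->Size.store(Offset + Len, std::memory_order_release); // phase 2
  return true;
}

// Reader side: only the bytes covered by the published extent are trusted.
inline std::size_t committedBytes(const ToyBuffer &Buf) {
  return static_cast<std::size_t>(
      Buf.Extents->Size.load(std::memory_order_acquire));
}
```

Because the extent is stored with release ordering after the copy, a reader that loads it with acquire ordering sees complete records up to that offset; a record whose bytes were copied but whose extent was never published is simply ignored, which is why no "end of buffer" record is needed in this model.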