Summary:
This is initial implementation of assembly profiler which only scans prologue/epilogue assembly instructions to create CFI instructions.
Reviewers: clayborg, jasonmolenda
Differential Revision: http://reviews.llvm.org/D7696
llvm-svn: 232619
Summary:
This change refactors UnwindPlan::Row to be able to store the fact that the CFA is value is set
by evaluating a dwarf expression (DW_CFA_def_cfa_expression). This is achieved by creating a new
class CFAValue and moving all CFA setting/getting code there. Note that code using the new
CFAValue::isDWARFExpression is not yet present and will be added in a follow-up patch. Therefore,
this patch should not change the functionality in any way.
Test Plan: Ran tests on Mac and Linux. No regressions detected.
Reviewers: jasonmolenda, clayborg
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D7755
llvm-svn: 230210
changing it was in r219544 - after living on that for a few
months, I wanted to take another crack at this.
The disassembly-format setting still exists and the old format
can be user specified with a setting like
${current-pc-arrow}${addr-file-or-load}{ <${function.name-without-args}${function.concrete-only-addr-offset-no-padding}>}:
This patch was discussed in http://reviews.llvm.org/D7578
<rdar://problem/19726421>
llvm-svn: 229186
Why? Debugger::FormatPrompt() would run through the format prompt every time and parse it and emit it piece by piece. It also did formatting differently depending on which key/value pair it was parsing.
The new code improves on this with the following features:
1 - Allow format strings to be parsed into a FormatEntity::Entry which can contain multiple child FormatEntity::Entry objects. This FormatEntity::Entry is a parsed version of what was previously always done in Debugger::FormatPrompt() so it is more efficient to emit formatted strings using the new parsed FormatEntity::Entry.
2 - Allows errors in format strings to be shown immediately when setting the settings (frame-format, thread-format, disassembly-format
3 - Allows auto completion by implementing a new OptionValueFormatEntity and switching frame-format, thread-format, and disassembly-format settings over to using it.
4 - The FormatEntity::Entry for each of the frame-format, thread-format, disassembly-format settings only replaces the old one if the format parses correctly
5 - Combines all consecutive string values together for efficient output. This means all "${ansi.*}" keys and all desensitized characters like "\n" "\t" "\0721" "\x23" will get combined with their previous strings
6 - ${*.script:} (like "${var.script:mymodule.my_var_function}") have all been switched over to use ${script.*:} "${script.var:mymodule.my_var_function}") to make the format easier to parse as I don't believe anyone was using these format string power user features.
7 - All key values pairs are defined in simple C arrays of entries so it is much easier to add new entries.
These changes pave the way for subsequent modifications where we can modify formats to do more (like control the width of value strings can do more and add more functionality more easily like string formatting to control the width, printf formats and more).
llvm-svn: 228207
output style can be customized. Change the built-in default to be
more similar to gdb's disassembly formatting.
The disassembly-format for a gdb-like output is
${addr-file-or-load} <${function.name-without-args}${function.concrete-only-addr-offset-no-padding}>:
The disassembly-format for the lldb style output is
{${function.initial-function}{${module.file.basename}`}{${function.name-without-args}}:\n}{${function.changed}\n{${module.file.basename}`}{${function.name-without-args}}:\n}{${current-pc-arrow} }{${addr-file-or-load}}:
The two backticks in the lldb style formatter triggers the sub-expression evaluation in
CommandInterpreter::PreprocessCommand() so you can't use that one as-is ... changing to
use ' characters instead of ` would work around that.
<rdar://problem/9885398>
llvm-svn: 219544
We decided to use assmbly profiler instead of eh_frame for frame 0 because for compiler generated code, eh_frame is usually synchronous(a.k.a. only valid at call site); and we have no way to tell if it's asynchronous or not.
But for x86 & x86_64 compiler generated code:
1. clang & GCC describes all prologue instructions in eh_frame;
2. mid-function stack pointer altering instructions can be easily detected.
So we can grab eh_frame, and use assembly profiler to augment it into asynchronous unwind table.
This change also benefits hand-written assembly; eh_frame for hand-written assembly is often asynchronous,so we have a much better chance to successfully unwind through them.
Change by Tong Shen.
llvm-svn: 216406
with prefer_file_cache == false. This is what we want to do when
the user is doing a disassemble command -- show the actual memory
contents in case the memory has been corrupted or something -- but
when we're profiling functions for stepping or unwinding
(ThreadPlanStepRange::GetInstructionsForAddress,
UnwindAssemblyInstEmulation::GetNonCallSiteUnwindP) we can read
__TEXT instructions directly out of the file, if it exists.
<rdar://problem/14397491>
llvm-svn: 190638
list have a shared pointer back to their DisassemblerLLVMC. This checkin force clears the InstructionList
in all the places we use the DisassemblerSP to stop the leaking for now. I'll go back and fix this
for real when I have time to do so.
<rdar://problem/14581918>
llvm-svn: 187473
<rdar://problem/13594769>
Main changes in this patch include:
- cleanup plug-in interface and use ConstStrings for plug-in names
- Modfiied the BSD Archive plug-in to be able to pick out the correct .o file when .a files contain multiple .o files with the same name by using the timestamp
- Modified SymbolFileDWARFDebugMap to properly verify the timestamp on .o files it loads to ensure we don't load updated .o files and cause problems when debugging
The plug-in interface changes:
Modified the lldb_private::PluginInterface class that all plug-ins inherit from:
Changed:
virtual const char * GetPluginName() = 0;
To:
virtual ConstString GetPluginName() = 0;
Removed:
virtual const char * GetShortPluginName() = 0;
- Fixed up all plug-in to adhere to the new interface and to return lldb_private::ConstString values for the plug-in names.
- Fixed all plug-ins to return simple names with no prefixes. Some plug-ins had prefixes and most ones didn't, so now they all don't have prefixed names, just simple names like "linux", "gdb-remote", etc.
llvm-svn: 181631
LLDB is crashing when logging is enabled from lldb-perf-clang. This has to do with the global destructor chain as the process and its threads are being torn down.
All logging channels now make one and only one instance that is kept in a global pointer which is never freed. This guarantees that logging can correctly continue as the process tears itself down.
llvm-svn: 178191
Calculate "can branch" using the MC API's rather than our hand-rolled regex'es.
As extra credit, allow setting the disassembly flavor for x86 based architectures to intel or att.
<rdar://problem/11319574>
<rdar://problem/9329275>
llvm-svn: 176392
- generate-vers.pl has to be called by cmake to generate the version number
- parallel builds not yet supported; dependency on clang must be explicitly specified
Tested on Linux.
- Building on Mac will require code-signing logic to be implemented.
- Building on Windows will require OS-detection logic and some selective directory inclusion
Thanks to Carlo Kok (who originally prepared these CMakefiles for Windows) and Ben Langmuir
who ported them to Linux!
llvm-svn: 175795
The results from Clang name lookups changed to
be ArrayRefs, so I had to change the way we
check for the presence of a result and the way
we iterate across results.
llvm-svn: 170927
the function's prologue instructions so we can re-instate that prologue
if we hit an early return mid-function. Add some additional heuristics
to differentiate between prologue and epilogue instruction sequences.
This fixes the specific problem of correctly unwinding through a function
which has an epilogue one instruction after the last prologue setup
instruction has completed.
<rdar://problem/12091139>
llvm-svn: 166465
to handle an addition class of early-return instructions we find in arm code:
tail-call optimziation returns where we restore the register state from the
function entry and jump directly (not branch & link) to another function --
when that other function returns, it will return to our caller.
Previously this mid-function epilogue sequence was not being correctly detected.
We would not re-instate the prologue setup instructions for the rest of the function
so unwinds would break from that point until the end of the function.
<rdar://problem/12502597>
llvm-svn: 166081
Make breakpoint setting by file and line much more efficient by only looking for inlined breakpoint locations if we are setting a breakpoint in anything but a source implementation file. Implementing this complex for a many reasons. Turns out that parsing compile units lazily had some issues with respect to how we need to do things with DWARF in .o files. So the fixes in the checkin for this makes these changes:
- Add a new setting called "target.inline-breakpoint-strategy" which can be set to "never", "always", or "headers". "never" will never try and set any inlined breakpoints (fastest). "always" always looks for inlined breakpoint locations (slowest, but most accurate). "headers", which is the default setting, will only look for inlined breakpoint locations if the breakpoint is set in what are consudered to be header files, which is realy defined as "not in an implementation source file".
- modify the breakpoint setting by file and line to check the current "target.inline-breakpoint-strategy" setting and act accordingly
- Modify compile units to be able to get their language and other info lazily. This allows us to create compile units from the debug map and not have to fill all of the details in, and then lazily discover this information as we go on debuggging. This is needed to avoid parsing all .o files when setting breakpoints in implementation only files (no inlines). Otherwise we would need to parse the .o file, the object file (mach-o in our case) and the symbol file (DWARF in the object file) just to see what the compile unit was.
- modify the "SymbolFileDWARFDebugMap" to subclass lldb_private::Module so that the virtual "GetObjectFile()" and "GetSymbolVendor()" functions can be intercepted when the .o file contenst are later lazilly needed. Prior to this fix, when we first instantiated the "SymbolFileDWARFDebugMap" class, we would also make modules, object files and symbol files for every .o file in the debug map because we needed to fix up the sections in the .o files with information that is in the executable debug map. Now we lazily do this in the DebugMapModule::GetObjectFile()
Cleaned up header includes a bit as well.
llvm-svn: 162860
return 0x0 as the read value instead of uninitialized
stack data so we get consistent behavior from the
emulator.
<rdar://problem/12058770>
llvm-svn: 161795
the state of the unwind instructions once the prologue has finished. If it hits an
early return epilogue in the middle of the function, re-instate the prologue after that
epilogue has completed so that we can still unwind for cases where the flow of control
goes past that early-return. <rdar://problem/11775059>
Move the UnwindPlan operator== definition into the .cpp file, expand the definition a bit.
Add some casts to a SBCommandInterpreter::HandleCompletion() log statement so it builds without
warning on 64- and 32-bit systems.
llvm-svn: 160337
a shared pointer to ease some memory management issues with a patch
I'm working on.
The main complication with using SPs for these objects is that most
methods that build up an UnwindPlan will construct a Row to a given
instruction point in a function, then add additional regsaves in
the next instruction point to that row and push it again. A little
care is needed to not mutate the previous instruction point's Row
once these are switched to being held behing shared pointers.
llvm-svn: 160214
frame pointer overwritten with the caller's fp value, return to
expressing the CFA in terms of the stack pointer.
<rdar://problem/11855862>
llvm-svn: 160150
a bit -- we're creating the UnwindPlan here, we can set the register set to
whatever is convenient for us, no need to handle different register sets.
A handful of small comment fixes I noticed while reading through the code.
llvm-svn: 159924
Fixed the DisassemblerLLVMC disassembler to parse more efficiently instead of parsing opcodes over and over. The InstructionLLVMC class now only reads the opcode in the InstructionLLVMC::Decode function. This can be done very efficiently for ARM and architectures that have fixed opcode sizes. For x64 it still calls the disassembler to get the byte size.
Moved the lldb_private::Instruction::Dump(...) function up into the lldb_private::Instruction class and it now uses the function that gets the mnemonic, operandes and comments so that all disassembly is using the same code.
Added StreamString::FillLastLineToColumn() to allow filling a line up to a column with a character (which is used by the lldb_private::Instruction::Dump(...) function).
Modified the Opcode::GetData() fucntion to "do the right thing" for thumb instructions.
llvm-svn: 156532
objects for the backlink to the lldb_private::Process. The issues we were
running into before was someone was holding onto a shared pointer to a
lldb_private::Thread for too long, and the lldb_private::Process parent object
would get destroyed and the lldb_private::Thread had a "Process &m_process"
member which would just treat whatever memory that used to be a Process as a
valid Process. This was mostly happening for lldb_private::StackFrame objects
that had a member like "Thread &m_thread". So this completes the internal
strong/weak changes.
Documented the ExecutionContext and ExecutionContextRef classes so that our
LLDB developers can understand when and where to use ExecutionContext and
ExecutionContextRef objects.
llvm-svn: 151009
shared pointers.
Changed the ExecutionContext over to use shared pointers for
the target, process, thread and frame since these objects can
easily go away at any time and any object that was holding onto
an ExecutionContext was running the risk of using a bad object.
Now that the shared pointers for target, process, thread and
frame are just a single pointer (they all use the instrusive
shared pointers) the execution context is much safer and still
the same size.
Made the shared pointers in the the ExecutionContext class protected
and made accessors for all of the various ways to get at the pointers,
references, and shared pointers.
llvm-svn: 140298
the arm emulate instruction unwinder so you can leave it
on by default and not be overwhelmed. Set verbose mode to
get the full story on how the unwindplans were created.
llvm-svn: 139897
the appropriate registers for arm and x86_64. The register names for the
arguments that are the size of a pointer or less are all named "arg1", "arg2",
etc. This allows you to read these registers by name:
(lldb) register read arg1 arg2 arg3
...
You can also now specify you want to see alternate register names when executing
the read register command:
(lldb) register read --alternate
(lldb) register read -A
llvm-svn: 131376
thread plan. In order to get the return value, you can call:
void
ThreadPlanCallFunction::RequestReturnValue (lldb::ValueSP &return_value_sp);
This registers a shared pointer to a return value that will get filled in if
everything goes well. After the thread plan is run the return value will be
extracted for you.
Added an ifdef to be able to switch between the LLVM MCJIT and the standand JIT.
We currently have the standard JIT selected because we have some work to do to
get the MCJIT fuctioning properly.
Added the ability to call functions with 6 argument in the x86_64 ABI.
Added the ability for GDBRemoteCommunicationClient to detect if the allocate
and deallocate memory packets are supported and to not call allocate memory
("_M") or deallocate ("_m") if we find they aren't supported.
Modified the ProcessGDBRemote::DoAllocateMemory(...) and ProcessGDBRemote::DoDeallocateMemory(...)
to be able to deal with the allocate and deallocate memory packets not being
supported. If they are not supported, ProcessGDBRemote will switch to calling
"mmap" and "munmap" to allocate and deallocate memory instead using our
trivial function call support.
Modified the "void ProcessGDBRemote::DidLaunchOrAttach()" to correctly ignore
the qHostInfo triple information if any was specified in the target. Currently
if the target only specifies an architecture when creating the target:
(lldb) target create --arch i386 a.out
Then the vendor, os and environemnt will be adopted by the target.
If the target was created with any triple that specifies more than the arch:
(lldb) target create --arch i386-unknown-unknown a.out
Then the target will maintain its triple and not adopt any new values. This
can be used to help force bare board debugging where the dynamic loader for
static files will get used and users can then use "target modules load ..."
to set addressses for any files that are desired.
Added back some convenience functions to the lldb_private::RegisterContext class
for writing registers with unsigned values. Also made all RegisterContext
constructors explicit to make sure we know when an integer is being converted
to a RegisterValue.
llvm-svn: 131370
respective ABI plugins as they were plug-ins that supplied ABI specfic info.
Also hookep up the UnwindAssemblyInstEmulation so that it can generate the
unwind plans for ARM.
Changed the way ABI plug-ins are handed out when you get an instance from
the plug-in manager. They used to return pointers that would be mananged
individually by each client that requested them, but now they are handed out
as shared pointers since there is no state in the ABI objects, they can be
shared.
llvm-svn: 131193
into some cleanup I have been wanting to do when reading/writing registers.
Previously all RegisterContext subclasses would need to implement:
virtual bool
ReadRegisterBytes (uint32_t reg, DataExtractor &data);
virtual bool
WriteRegisterBytes (uint32_t reg, DataExtractor &data, uint32_t data_offset = 0);
There is now a new class specifically designed to hold register values:
lldb_private::RegisterValue
The new register context calls that subclasses must implement are:
virtual bool
ReadRegister (const RegisterInfo *reg_info, RegisterValue ®_value) = 0;
virtual bool
WriteRegister (const RegisterInfo *reg_info, const RegisterValue ®_value) = 0;
The RegisterValue class must be big enough to handle any register value. The
class contains an enumeration for the value type, and then a union for the
data value. Any integer/float values are stored directly in an appropriate
host integer/float. Anything bigger is stored in a byte buffer that has a length
and byte order. The RegisterValue class also knows how to copy register value
bytes into in a buffer with a specified byte order which can be used to write
the register value down into memory, and this does the right thing when not
all bytes from the register values are needed (getting a uint8 from a uint32
register value..).
All RegiterContext and other sources have been switched over to using the new
regiter value class.
llvm-svn: 131096
Switch the EmulateInstruction to use the standard RegisterInfo structure
that is defined in the lldb private types intead of passing the reg kind and
reg num everywhere. EmulateInstruction subclasses also need to provide
RegisterInfo structs given a reg kind and reg num. This eliminates the need
for the GetRegisterName() virtual function and allows more complete information
to be passed around in the read/write register callbacks. Subclasses should
always provide RegiterInfo structs with the generic register info filled in as
well as at least one kind of register number in the RegisterInfo.kinds[] array.
llvm-svn: 130256
are defined as enumerations. Current bits include:
eEmulateInstructionOptionAutoAdvancePC
eEmulateInstructionOptionIgnoreConditions
Modified the EmulateInstruction class to have a few more pure virtuals that
can help clients understand how many instructions the emulator can handle:
virtual bool
SupportsEmulatingIntructionsOfType (InstructionType inst_type) = 0;
Where instruction types are defined as:
//------------------------------------------------------------------
/// Instruction types
//------------------------------------------------------------------
typedef enum InstructionType
{
eInstructionTypeAny, // Support for any instructions at all (at least one)
eInstructionTypePrologueEpilogue, // All prologue and epilogue instructons that push and pop register values and modify sp/fp
eInstructionTypePCModifying, // Any instruction that modifies the program counter/instruction pointer
eInstructionTypeAll // All instructions of any kind
} InstructionType;
This allows use to tell what an emulator can do and also allows us to request
these abilities when we are finding the plug-in interface.
Added the ability for an EmulateInstruction class to get the register names
for any registers that are part of the emulation. This helps with being able
to dump and log effectively.
The UnwindAssembly class now stores the architecture it was created with in
case it is needed later in the unwinding process.
Added a function that can tell us DWARF register names for ARM that goes
along with the source/Utility/ARM_DWARF_Registers.h file:
source/Utility/ARM_DWARF_Registers.c
Took some of plug-ins out of the lldb_private namespace.
llvm-svn: 130189