Commit Graph

270 Commits

Author SHA1 Message Date
Greg Clayton 05d2b7f741 Added some functions to our API related to classifying symbols as code, data,
const data, etc, and also for SBAddress objects to classify their type of
section they are in and also getting the module for a section offset address.

    lldb::SymbolType SBSymbol::GetType();
    
    lldb::SectionType SBAddress::GetSectionType ();
    lldb::SBModule SBAddress::GetModule ();

llvm-svn: 128602
2011-03-31 01:08:07 +00:00
Greg Clayton 32e0a7509c Many improvements to the Platform base class and subclasses. The base Platform
class now implements the Host functionality for a lot of things that make 
sense by default so that subclasses can check:

int
PlatformSubclass::Foo ()
{
    if (IsHost())
        return Platform::Foo (); // Let the platform base class do the host specific stuff
    
    // Platform subclass specific code...
    int result = ...
    return result;
}

Added new functions to the platform:

    virtual const char *Platform::GetUserName (uint32_t uid);
    virtual const char *Platform::GetGroupName (uint32_t gid);

The user and group names are cached locally so that remote platforms can avoid
sending packets multiple times to resolve this information.

Added the parent process ID to the ProcessInfo class. 

Added a new ProcessInfoMatch class which helps us to match processes up
and changed the Host layer over to using this new class. The new class allows
us to search for processs:
1 - by name (equal to, starts with, ends with, contains, and regex)
2 - by pid
3 - And further check for parent pid == value, uid == value, gid == value, 
    euid == value, egid == value, arch == value, parent == value.
    
This is all hookup up to the "platform process list" command which required
adding dumping routines to dump process information. If the Host class 
implements the process lookup routines, you can now lists processes on 
your local machine:

machine1.foo.com % lldb
(lldb) platform process list 
PID    PARENT USER       GROUP      EFF USER   EFF GROUP  TRIPLE                   NAME
====== ====== ========== ========== ========== ========== ======================== ============================
99538  1      username   usergroup  username   usergroup  x86_64-apple-darwin      FileMerge
94943  1      username   usergroup  username   usergroup  x86_64-apple-darwin      mdworker
94852  244    username   usergroup  username   usergroup  x86_64-apple-darwin      Safari
94727  244    username   usergroup  username   usergroup  x86_64-apple-darwin      Xcode
92742  92710  username   usergroup  username   usergroup  i386-apple-darwin        debugserver


This of course also works remotely with the lldb-platform:

machine1.foo.com % lldb-platform --listen 1234

machine2.foo.com % lldb
(lldb) platform create remote-macosx
  Platform: remote-macosx
 Connected: no
(lldb) platform connect connect://localhost:1444
  Platform: remote-macosx
    Triple: x86_64-apple-darwin
OS Version: 10.6.7 (10J869)
    Kernel: Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386
  Hostname: machine1.foo.com
 Connected: yes
(lldb) platform process list 
PID    PARENT USER       GROUP      EFF USER   EFF GROUP  TRIPLE                   NAME
====== ====== ========== ========== ========== ========== ======================== ============================
99556  244    username   usergroup  username   usergroup  x86_64-apple-darwin      trustevaluation
99548  65539  username   usergroup  username   usergroup  x86_64-apple-darwin      lldb
99538  1      username   usergroup  username   usergroup  x86_64-apple-darwin      FileMerge
94943  1      username   usergroup  username   usergroup  x86_64-apple-darwin      mdworker
94852  244    username   usergroup  username   usergroup  x86_64-apple-darwin      Safari

The lldb-platform implements everything with the Host:: layer, so this should
"just work" for linux. I will probably be adding more stuff to the Host layer
for launching processes and attaching to processes so that this support should
eventually just work as well.

Modified the target to be able to be created with an architecture that differs
from the main executable. This is needed for iOS debugging since we can have
an "armv6" binary which can run on an "armv7" machine, so we want to be able
to do:

% lldb
(lldb) platform create remote-ios
(lldb) file --arch armv7 a.out

Where "a.out" is an armv6 executable. The platform then can correctly decide
to open all "armv7" images for all dependent shared libraries.

Modified the disassembly to show the current PC value. Example output:

(lldb) disassemble --frame
a.out`main:
   0x1eb7:  pushl  %ebp
   0x1eb8:  movl   %esp, %ebp
   0x1eba:  pushl  %ebx
   0x1ebb:  subl   $20, %esp
   0x1ebe:  calll  0x1ec3                   ; main + 12 at test.c:18
   0x1ec3:  popl   %ebx
-> 0x1ec4:  calll  0x1f12                   ; getpid
   0x1ec9:  movl   %eax, 4(%esp)
   0x1ecd:  leal   199(%ebx), %eax
   0x1ed3:  movl   %eax, (%esp)
   0x1ed6:  calll  0x1f18                   ; printf
   0x1edb:  leal   213(%ebx), %eax
   0x1ee1:  movl   %eax, (%esp)
   0x1ee4:  calll  0x1f1e                   ; puts
   0x1ee9:  calll  0x1f0c                   ; getchar
   0x1eee:  movl   $20, (%esp)
   0x1ef5:  calll  0x1e6a                   ; sleep_loop at test.c:6
   0x1efa:  movl   $12, %eax
   0x1eff:  addl   $20, %esp
   0x1f02:  popl   %ebx
   0x1f03:  leave
   0x1f04:  ret
   
This can be handy when dealing with the new --line options that was recently
added:

(lldb) disassemble --line
a.out`main + 13 at test.c:19
   18  	{
-> 19  		printf("Process: %i\n\n", getpid());
   20  	    puts("Press any key to continue..."); getchar();
-> 0x1ec4:  calll  0x1f12                   ; getpid
   0x1ec9:  movl   %eax, 4(%esp)
   0x1ecd:  leal   199(%ebx), %eax
   0x1ed3:  movl   %eax, (%esp)
   0x1ed6:  calll  0x1f18                   ; printf

Modified the ModuleList to have a lookup based solely on a UUID. Since the
UUID is typically the MD5 checksum of a binary image, there is no need
to give the path and architecture when searching for a pre-existing
image in an image list.

Now that we support remote debugging a bit better, our lldb_private::Module
needs to be able to track what the original path for file was as the platform
knows it, as well as where the file is locally. The module has the two 
following functions to retrieve both paths:

const FileSpec &Module::GetFileSpec () const;
const FileSpec &Module::GetPlatformFileSpec () const;

llvm-svn: 128563
2011-03-30 18:16:51 +00:00
Greg Clayton 357132eb9a Added the ability to get the min and max instruction byte size for
an architecture into ArchSpec:

uint32_t
ArchSpec::GetMinimumOpcodeByteSize() const;

uint32_t
ArchSpec::GetMaximumOpcodeByteSize() const;

Added an AddressClass to the Instruction class in Disassembler.h.
This allows decoded instructions to know know if they are code,
code with alternate ISA (thumb), or even data which can be mixed
into code. The instruction does have an address, but it is a good
idea to cache this value so we don't have to look it up more than 
once.

Fixed an issue in Opcode::SetOpcodeBytes() where the length wasn't
getting set.

Changed:

	bool
	SymbolContextList::AppendIfUnique (const SymbolContext& sc);

To:
	bool
	SymbolContextList::AppendIfUnique (const SymbolContext& sc, 
									   bool merge_symbol_into_function);

This function was typically being used when looking up functions
and symbols. Now if you lookup a function, then find the symbol,
they can be merged into the same symbol context and not cause
multiple symbol contexts to appear in a symbol context list that
describes the same function.

Fixed the SymbolContext not equal operator which was causing mixed
mode disassembly to not work ("disassembler --mixed --name main").

Modified the disassembler classes to know about the fact we know,
for a given architecture, what the min and max opcode byte sizes
are. The InstructionList class was modified to return the max
opcode byte size for all of the instructions in its list.
These two fixes means when disassemble a list of instructions and dump 
them and show the opcode bytes, we can format the output more 
intelligently when showing opcode bytes. This affects any architectures
that have varying opcode byte sizes (x86_64 and i386). Knowing the max
opcode byte size also helps us to be able to disassemble N instructions
without having to re-read data if we didn't read enough bytes.

Added the ability to set the architecture for the disassemble command.
This means you can easily cross disassemble data for any supported 
architecture. I also added the ability to specify "thumb" as an 
architecture so that we can force disassembly into thumb mode when
needed. In GDB this was done using a hack of specifying an odd
address when disassembling. I don't want to repeat this hack in LLDB,
so the auto detection between ARM and thumb is failing, just specify
thumb when disassembling:

(lldb) disassemble --arch thumb --name main

You can also have data in say an x86_64 file executable and disassemble
data as any other supported architecture:
% lldb a.out
Current executable set to 'a.out' (x86_64).
(lldb) b main
(lldb) run
(lldb) disassemble --arch thumb --count 2 --start-address 0x0000000100001080 --bytes
0x100001080:  0xb580 push   {r7, lr}
0x100001082:  0xaf00 add    r7, sp, #0

Fixed Target::ReadMemory(...) to be able to deal with Address argument object
that isn't section offset. When an address object was supplied that was
out on the heap or stack, target read memory would fail. Disassembly uses
Target::ReadMemory(...), and the example above where we disassembler thumb
opcodes in an x86 binary was failing do to this bug.

llvm-svn: 128347
2011-03-26 19:14:58 +00:00
Greg Clayton 64195a2c8b Abtracted all mach-o and ELF out of ArchSpec. This patch is a modified form
of Stephen Wilson's idea (thanks for the input Stephen!). What I ended up
doing was:
- Got rid of ArchSpec::CPU (which was a generic CPU enumeration that mimics
  the contents of llvm::Triple::ArchType). We now rely upon the llvm::Triple 
  to give us the machine type from llvm::Triple::ArchType.
- There is a new ArchSpec::Core definition which further qualifies the CPU
  core we are dealing with into a single enumeration. If you need support for
  a new Core and want to debug it in LLDB, it must be added to this list. In
  the future we can allow for dynamic core registration, but for now it is
  hard coded.
- The ArchSpec can now be initialized with a llvm::Triple or with a C string
  that represents the triple (it can just be an arch still like "i386").
- The ArchSpec can still initialize itself with a architecture type -- mach-o
  with cpu type and subtype, or ELF with e_machine + e_flags -- and this will
  then get translated into the internal llvm::Triple::ArchSpec + ArchSpec::Core.
  The mach-o cpu type and subtype can be accessed using the getter functions:
  
  uint32_t
  ArchSpec::GetMachOCPUType () const;

  uint32_t
  ArchSpec::GetMachOCPUSubType () const;
  
  But these functions are just converting out internal llvm::Triple::ArchSpec 
  + ArchSpec::Core back into mach-o. Same goes for ELF.

All code has been updated to deal with the changes.

This should abstract us until later when the llvm::TargetSpec stuff gets
finalized and we can then adopt it.

llvm-svn: 126278
2011-02-23 00:35:02 +00:00
Greg Clayton 514487e806 Made lldb_private::ArchSpec contain much more than just an architecture. It
now, in addition to cpu type/subtype and architecture flavor, contains:
- byte order (big endian, little endian)
- address size in bytes
- llvm::Triple for true target triple support and for more powerful plug-in
  selection.

llvm-svn: 125602
2011-02-15 21:59:32 +00:00
Greg Clayton 6083026822 Applied a fix to qualify "UUID" with the lldb_private namespace to fix
build issues on MinGW.

llvm-svn: 124888
2011-02-04 18:53:10 +00:00
Greg Clayton 931180e644 Changed the SymbolFile::FindFunction() function calls to only return
lldb_private::Function objects. Previously the SymbolFileSymtab subclass
would return lldb_private::Symbol objects when it was asked to find functions.

The Module::FindFunctions (...) now take a boolean "bool include_symbols" so
that the module can track down functions and symbols, yet functions are found
by the SymbolFile plug-ins (through the SymbolVendor class), and symbols are
gotten through the ObjectFile plug-ins.

Fixed and issue where the DWARF parser might run into incomplete class member
function defintions which would make clang mad when we tried to make certain
member functions with invalid number of parameters (such as an operator=
operator that had no parameters). Now we just avoid and don't complete these
incomplete functions.

llvm-svn: 124359
2011-01-27 06:44:37 +00:00
Greg Clayton 6beaaa680a A few of the issue I have been trying to track down and fix have been due to
the way LLDB lazily gets complete definitions for types within the debug info.
When we run across a class/struct/union definition in the DWARF, we will only
parse the full definition if we need to. This works fine for top level types
that are assigned directly to variables and arguments, but when we have a 
variable with a class, lets say "A" for this example, that has a member:
"B *m_b". Initially we don't need to hunt down a definition for this class
unless we are ever asked to do something with it ("expr m_b->getDecl()" for
example). With my previous approach to lazy type completion, we would be able
to take a "A *a" and get a complete type for it, but we wouldn't be able to
then do an "a->m_b->getDecl()" unless we always expanded all types within a
class prior to handing out the type. Expanding everything is very costly and
it would be great if there were a better way.

A few months ago I worked with the llvm/clang folks to have the 
ExternalASTSource class be able to complete classes if there weren't completed
yet:

class ExternalASTSource {
....

    virtual void
    CompleteType (clang::TagDecl *Tag);
    
    virtual void 
    CompleteType (clang::ObjCInterfaceDecl *Class);
};

This was great, because we can now have the class that is producing the AST
(SymbolFileDWARF and SymbolFileDWARFDebugMap) sign up as external AST sources
and the object that creates the forward declaration types can now also
complete them anywhere within the clang type system.

This patch makes a few major changes:
- lldb_private::Module classes now own the AST context. Previously the TypeList
  objects did.
- The DWARF parsers now sign up as an external AST sources so they can complete
  types.
- All of the pure clang type system wrapper code we have in LLDB (ClangASTContext,
  ClangASTType, and more) can now be iterating through children of any type,
  and if a class/union/struct type (clang::RecordType or ObjC interface) 
  is found that is incomplete, we can ask the AST to get the definition. 
- The SymbolFileDWARFDebugMap class now will create and use a single AST that
  all child SymbolFileDWARF classes will share (much like what happens when
  we have a complete linked DWARF for an executable).
  
We will need to modify some of the ClangUserExpression code to take more 
advantage of this completion ability in the near future. Meanwhile we should
be better off now that we can be accessing any children of variables through
pointers and always be able to resolve the clang type if needed.

llvm-svn: 123613
2011-01-17 03:46:26 +00:00
Greg Clayton 2d4edfbc6a Modified all logging calls to hand out shared pointers to make sure we
don't crash if we disable logging when some code already has a copy of the
logger. Prior to this fix, logs were handed out as pointers and if they were
held onto while a log got disabled, then it could cause a crash. Now all logs
are handed out as shared pointers so this problem shouldn't happen anymore.
We are also using our new shared pointers that put the shared pointer count
and the object into the same allocation for a tad better performance.

llvm-svn: 118319
2010-11-06 01:53:30 +00:00
Greg Clayton cfd1aced7e Cleaned up the API logging a lot more to reduce redundant information and
keep the file size a bit smaller.

Exposed SBValue::GetExpressionPath() so SBValue users can get an expression
path for their values.

llvm-svn: 117851
2010-10-31 03:01:06 +00:00
Caroline Tice ceb6b1393d First pass at adding logging capabilities for the API functions. At the moment
it logs the function calls, their arguments and the return values.  This is not
complete or polished, but I am committing it now, at the request of someone who
really wants to use it, even though it's not really done.  It currently does not
attempt to log all the functions, just the most important ones.  I will be 
making further adjustments to the API logging code over the next few days/weeks.
(Suggestions for improvements are welcome).


Update the Python build scripts to re-build the swig C++ file whenever 
the python-extensions.swig file is modified.

Correct the help for 'log enable' command (give it the correct number & type of
arguments).

llvm-svn: 117349
2010-10-26 03:11:13 +00:00
Greg Clayton 274060b6f1 Fixed an issue where we were resolving paths when we should have been.
So the issue here was that we have lldb_private::FileSpec that by default was 
always resolving a path when using the:

FileSpec::FileSpec (const char *path);

and in the:

void FileSpec::SetFile(const char *pathname, bool resolve = true);

This isn't what we want in many many cases. One example is you have "/tmp" on
your file system which is really "/private/tmp". You compile code in that
directory and end up with debug info that mentions "/tmp/file.c". Then you 
type:

(lldb) breakpoint set --file file.c --line 5

If your current working directory is "/tmp", then "file.c" would be turned 
into "/private/tmp/file.c" which won't match anything in the debug info.
Also, it should have been just a FileSpec with no directory and a filename
of "file.c" which could (and should) potentially match any instances of "file.c"
in the debug info.

So I removed the constructor that just takes a path:

FileSpec::FileSpec (const char *path); // REMOVED

You must now use the other constructor that has a "bool resolve" parameter that you must always supply:

FileSpec::FileSpec (const char *path, bool resolve);

I also removed the default parameter to SetFile():

void FileSpec::SetFile(const char *pathname, bool resolve);

And fixed all of the code to use the right settings.

llvm-svn: 116944
2010-10-20 20:54:39 +00:00
Greg Clayton 8941142af8 Hooked up ability to look up data symbols so they show up in disassembly
if the address comes from a data section. 

Fixed an issue that could occur when looking up a symbol that has a zero
byte size where no match would be returned even if there was an exact symbol
match.

Cleaned up the section dump output and added the section type into the output.

llvm-svn: 116017
2010-10-08 00:21:05 +00:00
Greg Clayton bcf2cfbdc5 Remove the eSymbolTypeFunction, eSymbolTypeGlobal, and eSymbolTypeStatic.
They will now be represented as:
eSymbolTypeFunction: eSymbolTypeCode with IsDebug() == true
  eSymbolTypeGlobal: eSymbolTypeData with IsDebug() == true and IsExternal() == true
  eSymbolTypeStatic: eSymbolTypeData with IsDebug() == true and IsExternal() == false

This simplifies the logic when dealing with symbols and allows for symbols
to be coalesced into a single symbol most of the time.

Enabled the minimal symbol table for mach-o again after working out all the
kinks. We now get nice concise symbol tables and debugging with DWARF in the
.o files with a debug map in the binary works well again. There were issues
where the SymbolFileDWARFDebugMap symbol file parser was using symbol IDs and
symbol indexes interchangeably. Now that all those issues are resolved 
debugging is working nicely.

llvm-svn: 113678
2010-09-11 03:13:28 +00:00
Greg Clayton e83e731ec1 Remove the Flags member in lldb_private::Module in favor of bitfield boolean
member variables.

Modified lldb_private::Module to have an accessor that can be used to tell if
a module is a dynamic link editor (dyld) as there are functions in dyld on
darwin that mirror functions in libc (malloc, free, etc) that should not
be used when doing function lookups by name in expressions if there are more
than one match when looking up functions by name.

llvm-svn: 113313
2010-09-07 23:40:05 +00:00
Jim Ingham 680e177833 Don't re-look up the symbol in ResolveSymbolContextForAddress.
llvm-svn: 112679
2010-08-31 23:51:36 +00:00
Jim Ingham 5aee162f97 Change Target & Process so they can really be initialized with an invalid architecture.
Arrange that this then gets properly set on attach, or when a "file" is set.
Add a completer for "process attach -n".

Caveats: there isn't currently a way to handle multiple processes with the same name.  That
will have to wait on a way to pass annotations along with the completion strings.

llvm-svn: 110624
2010-08-09 23:31:02 +00:00
Greg Clayton 3504eee8a8 Added FindTypes to Module and ModuleList.
llvm-svn: 110093
2010-08-03 01:26:16 +00:00
Greg Clayton 0c5cd90d63 Added function name types to allow us to set breakpoints by name more
intelligently. The four name types we currently have are:

eFunctionNameTypeFull       = (1 << 1), // The function name.
                                        // For C this is the same as just the name of the function
                                        // For C++ this is the demangled version of the mangled name.
                                        // For ObjC this is the full function signature with the + or
                                        // - and the square brackets and the class and selector
eFunctionNameTypeBase       = (1 << 2), // The function name only, no namespaces or arguments and no class 
                                        // methods or selectors will be searched.
eFunctionNameTypeMethod     = (1 << 3), // Find function by method name (C++) with no namespace or arguments
eFunctionNameTypeSelector   = (1 << 4)  // Find function by selector name (ObjC) names


this allows much more flexibility when setting breakoints:

(lldb) breakpoint set --name main --basename
(lldb) breakpoint set --name main --fullname
(lldb) breakpoint set --name main --method
(lldb) breakpoint set --name main --selector

The default:

(lldb) breakpoint set --name main

will inspect the name "main" and look for any parens, or if the name starts
with "-[" or "+[" and if any are found then a full name search will happen.
Else a basename search will be the default.

Fixed some command option structures so not all options are required when they
shouldn't be.

Cleaned up the breakpoint output summary.

Made the "image lookup --address <addr>" output much more verbose so it shows
all the important symbol context results. Added a GetDescription method to 
many of the SymbolContext objects for the more verbose output.

llvm-svn: 107075
2010-06-28 21:30:43 +00:00
Chris Lattner 30fdc8d841 Initial checkin of lldb code from internal Apple repo.
llvm-svn: 105619
2010-06-08 16:52:24 +00:00