My apologies for the large patch. With the exception of ConstString.h
itself it was entirely produced by sed.
ConstString has exactly one const char * data member, so passing a
ConstString by reference is not any more efficient than copying it by
value. In both cases a single pointer is passed. But passing it by
value makes it harder to accidentally return the address of a local
object.
(This fixes rdar://problem/48640859 for the Apple folks)
Differential Revision: https://reviews.llvm.org/D59030
llvm-svn: 355553
Summary:
This file implements some general purpose data structures, and so it
belongs to the Utility module.
Reviewers: zturner, jingham, JDevlieghere, clayborg, espindola
Subscribers: emaste, mgorny, javed.absar, arichardson, MaskRay, lldb-commits
Differential Revision: https://reviews.llvm.org/D58970
llvm-svn: 355509
Summary:
These functions should always return the opposite of the
`Triple{Environment,OS,Vendor}WasSpecified` functions. Unspecified unknown is
the same as unspecified, which is why one set of functions should give us what
we want. It's possible to have specified unknown, which is why we can't just
rely on checking the enum values of vendor/os/environment. We must also ensure
that the names of these are empty and not "unknown".
Differential Revision: https://reviews.llvm.org/D58653
llvm-svn: 354933
Split the recognition into NetBSD executables & shared libraries
and core(5) files.
Introduce new owner type: "NetBSD-CORE", as core(5) files are not tagged
in the same way as regular NetBSD executables.
Stop using incorrectly ABI_TAG and ABI_SIZE. Introduce IDENT_TAG,
IDENT_DECSZ, IDENT_NAMESZ and PROCINFO.
The new values detect correctly the NetBSD images.
The patch has been originally written by Kamil Rytarowski. I've added
tests and applied minor code changes per review. The work has been
sponsored by the NetBSD Foundation.
Differential Revision: https://reviews.llvm.org/D42870
llvm-svn: 354466
COFF files are modelled in lldb as having one big container section
spanning the entire module image, with the actual sections being
subsections of that. In this model, the base address is simply the
address of the first byte of that section.
This also removes the hack where ObjectFilePECOFF was using the
m_file_offset field to communicate this information. Using file offset
for this purpose is completely wrong, as that is supposed to indicate
where is this ObjectFile located in the file on disk. This field is only
meaningful for fat binaries, and should normally be 0.
Both PDB plugins have been updated to use GetBaseAddress instead of
GetFileOffset.
llvm-svn: 354258
Summary:
This is a preparatory step to enable adding extra unwind strategies by
symbol file plugins. This has been discussed on the lldb-dev mailing
list: <http://lists.llvm.org/pipermail/lldb-dev/2019-February/014703.html>.
Reviewers: jasonmolenda, clayborg, espindola
Subscribers: lemo, emaste, lldb-commits, arichardson
Differential Revision: https://reviews.llvm.org/D58129
llvm-svn: 354033
Summary:
This is coming from the discussion in D55356 (the most interesting part
happened on the mailing list, so it isn't reflected on the review page).
In short the issue is that lldb assumes that all bytes of a module image
in memory will be backed by a "section". This isn't the case for PECOFF
files because the initial bytes of the module image will contain the
file header, which does not correspond to any normal section in the
file. In particular, this means it is not possible to implement
GetBaseAddress function for PECOFF files, because that's supposed point
to the first byte of that header.
If my (limited) understanding of how PECOFF files work is correct, then
the OS is expecded to load the entire module into one continuous chunk
of memory. The address of that chunk (+/- ASLR) is given by the "image
base" field in the COFF header, and it's size by "image size". All of
the COFF sections are then loaded into this range.
If that's true, then we can model this behavior in lldb by creating a
"container" section to represent the entire module image, and then place
other sections inside that. This would make be consistent with how MachO
and ELF files are modelled (except that those can have multiple
top-level containers as they can be loaded into multiple discontinuous
chunks of memory).
This change required a small number of fixups in the PDB plugins, which
assumed a certain order of sections within the object file (which
obivously changes now). I fix this by changing the lookup code to use
section IDs (which are unchanged) instead of indexes. This has the nice
benefit of removing spurious -1s in the plugins as the section IDs in
the pdbs match the 1-based section IDs in the COFF plugin.
Besides making the implementation of GetBaseAddress possible, this also
improves the lookup of addresses in the gaps between the object file
sections, which will now be correctly resolved as belonging to the
object file.
Reviewers: zturner, amccarth, stella.stamenova, clayborg, lemo
Reviewed By: clayborg, lemo
Subscribers: JDevlieghere, abidh, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D56537
llvm-svn: 353916
The `ap` suffix is a remnant of lldb's former use of auto pointers,
before they got deprecated. Although all their uses were replaced by
unique pointers, some variables still carried the suffix.
In r353795 I removed another auto_ptr remnant, namely redundant calls to
::get for unique_pointers. Jim justly noted that this is a good
opportunity to clean up the variable names as well.
I went over all the changes to ensure my find-and-replace didn't have
any undesired side-effects. I hope I didn't miss any, but if you end up
at this commit doing a git blame on a weirdly named variable, please
know that the change was unintentional.
llvm-svn: 353912
Apparently there are multiple places where MSVC complains about
instantiations with extended aligment. I think it's better to define
`_ENABLE_EXTENDED_ALIGNED_STORAGE` as suggested by the error message.
I don't have access to a Windows machine so this is all speculative.
llvm-svn: 353778
Unlike std::make_unique, which is only available since C++14,
std::make_shared is available since C++11. Not only is std::make_shared
a lot more readable compared to ::reset(new), it also performs a single
heap allocation for the object and control block.
Differential revision: https://reviews.llvm.org/D57990
llvm-svn: 353764
instead of returning the UUID through by-ref argument and a boolean
value indicating success, we can just return it directly. Since the UUID
class already has an invalid state, it can be used to denote the failure
without the additional bool.
llvm-svn: 353714
LLDB testsuite fails when built by GCC8 on:
LLDB :: SymbolFile/DWARF/find-basic-namespace.cpp
This is because this code in LLDB codebase has undefined behavior:
#include <algorithm>
#include <string.h>
// lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1731
static struct section_64 {
char sectname[16];
char segname[16];
} sect64 = { {'_','_','a','p','p','l','e','_','n','a','m','e','s','p','a','c'}, "__DWARF" };
int main() {
return std::min<size_t>(strlen(sect64.sectname), sizeof(sect64.sectname));
}
It has been discussed as a (false) bugreport to GCC:
wrong-code: LLDB testcase fails: SymbolFile/DWARF/find-basic-namespace.cpp
https://bugzilla.redhat.com/show_bug.cgi?id=1672436
Differential Revision: https://reviews.llvm.org/D57781
llvm-svn: 353280
The two records aren't used by anything yet, but this part can be
separated out easily, so I am comitting it separately to simplify
reviews of the followup patch.
llvm-svn: 352507
Summary:
This addresses the issues raised in D56844. It removes the accessors from the
breakpad record structures by making the fields public. Also, I refactor the
UUID parsing code to remove hard-coded constants.
Reviewers: lemo
Subscribers: clayborg, lldb-commits
Differential Revision: https://reviews.llvm.org/D57037
llvm-svn: 352021
This patch extends SymbolFileBreakpad::AddSymbols to include the symbols
from the FUNC records too. These symbols come from the debug info and
have a size associated with them, so they are given preference in case
there is a PUBLIC record for the same address.
To achieve this, I first pre-process the symbols into a temporary
DenseMap, and then insert the uniqued symbols into the module's symtab.
Reviewers: clayborg, lemo, zturner
Reviewed By: clayborg
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D56590
llvm-svn: 351781
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
This centralizes parsing of breakpad records, which was previously
spread out over ObjectFileBreakpad and SymbolFileBreakpad.
For each record type X there is a separate breakpad::XRecord class, and
an associated parse function. The classes just store the information in
the breakpad records in a more accessible form. It is up to the users to
determine what to do with that data.
This separation also made it possible to write some targeted tests for
the parsing code, which was previously unaccessible, so I write a couple
of those too.
Reviewers: clayborg, lemo, zturner
Reviewed By: clayborg
Subscribers: mgorny, fedor.sergeev, lldb-commits
Differential Revision: https://reviews.llvm.org/D56844
llvm-svn: 351541
The code was assuming that the elf file will have a PT_LOAD segment
starting from the first byte of the file. While this is true for files
generated by most linkers (it's a way of saving space), it is not a
requirement. And files not satisfying this constraint can still be
perfectly executable. yaml2obj is one of the tools which produces files
like this.
This patch relaxes the check in ObjectFileELF to take the address of the
first PT_LOAD segment as the base address of the object (instead of the
one with the offset 0). Since the PT_LOAD segments are supposed to be
sorted according to the VM address, this entry will also be the one with
the lowest VM address.
If we ever run into files which don't have the PT_LOAD segments sorted,
we can easily change this code to return the lowest VM address as the
base address (if that is the correct thing to do for these files).
llvm-svn: 350923
If a section name is exactly 8 bytes long (or has been truncated to 8
bytes), it will not contain the terminating nul character. This means
reading the name as a c string will pick up random data following the
name field (which happens to be the section vm size).
This fixes the name computation to avoid out-of-bounds access and adds a
test.
Reviewers: zturner, stella.stamenova
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D56124
llvm-svn: 350809
Summary:
The concept of a base address was already present in the implementation
(it's needed for computing section load addresses properly), but it was
never exposed through this function. This fixes that.
llvm-svn: 350804
Summary:
This is the result of the discussion in D55356, where it was suggested
as a solution to representing the addresses that logically belong to a
module in memory, but are not a part of any of its sections.
The ELF PT_LOAD segments are similar to the MachO "load commands",
except that the relationship between them and the object file sections
is a bit weaker. While in the MachO case, the sections belonging to a
specific segment are placed directly inside it in the object file
logical structur, in the ELF case, the sections and segments form two
separate hierarchies. This means that it is in theory possible to create
an elf file where only a part of a section would belong to some segment
(and another part to a different one). However, I am not aware of any
tool which would produce such a file (and most tools will have problems
ingesting them), so this means it is still possible to follow the MachO
model and make sections children of the PT_LOAD segments.
In case we run into (corrupt?) files with overlapping sections, I have
added code (and tests) which adjusts the sizes and/or drops the offending
sections in order to present a reasonable image to the upper layers of
LLDB. This is mostly done for completeness, as I don't anticipate
running into this situation in the real world. However, if we do run
into it, and the current behavior is not suitable for some reason, we
can implement this logic differently.
Reviewers: clayborg, jankratochvil, krytarowski, joerg, espindola
Subscribers: emaste, arichardson, lldb-commits
Differential Revision: https://reviews.llvm.org/D55998
llvm-svn: 350742
Summary:
This patch allows ObjectFileBreakpad to parse the contents of Breakpad
files into sections. This sounds slightly odd at first, but in essence
its not too different from how other object files handle things. For
example in elf files, the symtab section consists of a number of
"records", where each record represents a single symbol. The same is
true for breakpad's PUBLIC section, except in this case, the records will be
textual instead of binary.
To keep sections contiguous, I create a new section every time record
type changes. Normally, the breakpad processor will group all records of
the same type in one block, but the format allows them to be intermixed,
so in general, the "object file" may contain multiple sections with the
same record type.
Reviewers: clayborg, zturner, lemo, markmentovai, amccarth
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D55434
llvm-svn: 350511
Summary:
instead of returning the architecture through by-ref argument and a
boolean value indicating success, we can just return the ArchSpec
directly. Since the ArchSpec already has an invalid state, it can be
used to denote the failure without the additional bool.
Reviewers: clayborg, zturner, espindola
Subscribers: emaste, arichardson, JDevlieghere, lldb-commits
Differential Revision: https://reviews.llvm.org/D56129
llvm-svn: 350291
Prior to this there were 3 places that were duplicating the logic to detect if a section can/should be loaded and some were doing things a bit differently. Now it is all centralized in one place and it is done correctly.
llvm-svn: 349926
Summary:
The first section header does not define a real section. Instead it is
used for various elf extensions. This patch skips creation of a section
for index 0.
This has one furtunate side-effect, in that it allows us to use the section
header index as the Section ID (where 0 is also invalid). This way, we
can get rid of a lot of spurious +1s in the ObjectFileELF code.
Reviewers: clayborg, krytarowski, joerg, espindola
Subscribers: emaste, lldb-commits, arichardson
Differential Revision: https://reviews.llvm.org/D55757
llvm-svn: 349498
Summary:
This patch attempts to move as much code as possible out of the
CreateSections function to make room for future improvements there. Some
of this may be slightly over-engineered (VMAddressProvider), but I
wanted to keep the logic of this function very simple, because once I
start taking segment headers into acount (as discussed in D55356), the
function is going to grow significantly.
While in there, I also added tests for various bits of functionality.
This should be NFC, except that I changed the order of hac^H^Heuristicks
for determining section type slightly. Previously, name-based deduction
(.symtab -> symtab) would take precedence over type-based (SHT_SYMTAB ->
symtab) one. In fact we would assert if we ran into a .text section with
type SHT_SYMTAB. Though unlikely to matter in practice, this order
seemed wrong to me, so I have inverted it.
Reviewers: clayborg, krytarowski, espindola
Subscribers: emaste, arichardson, lldb-commits
Differential Revision: https://reviews.llvm.org/D55706
llvm-svn: 349268
This patch simplifies boolean expressions acorss LLDB. It was generated
using clang-tidy with the following command:
run-clang-tidy.py -checks='-*,readability-simplify-boolean-expr' -format -fix $PWD
Differential revision: https://reviews.llvm.org/D55584
llvm-svn: 349215
Move code into a separate function, and replace the if-else chain with
llvm::StringSwitch.
A slight behavioral change is that now I use the section flags
(SHF_TLS) instead of the section name to set the thread-specific
property. There is no explanation in the original commit introducing
this (r153537) as to why that was done this way, but the new behavior
should be more correct.
llvm-svn: 348936
Instead of GetProgramHeaderCount+GetProgramHeaderByIndex, expose an
ArrayRef of all program headers, to enable range-based iteration.
Instead of GetSegmentDataByIndex, expose GetSegmentData, taking a
program header (reference).
This makes the code simpler by enabling range-based loops and also
allowed to remove some null checks, as it became locally obvious that
some pointers can never be null.
llvm-svn: 348928
Summary:
This function was named such because in the case of MachO files, the
mach header is located at this address. However all (most?) usages of
this function were not interested in that fact, but the fact that this
address is used as the base address for expressing various relative
addresses in the object file.
For other object file formats, this name is not appropriate (and it's
probably the reason why this function was not implemented in these
classes). In the ELF case the ELF header will usually end up at this
address, but this is a result of the linker optimizing the file layout
and not a requirement of the spec. For COFF files, I believe the is no
header located at this address either.
Reviewers: clayborg, jasonmolenda, amccarth, lemo, stella.stamenova
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D55422
llvm-svn: 348849
Summary: Instead use a more reasonable value to start and rely on the fact that SmallString will resize if necessary.
Reviewers: labath, asmith
Reviewed By: labath
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D55457
llvm-svn: 348775
This re-commits r348592, which was reverted due to a failing test on
macos.
The issue was that I was passing a null pointer for the
"CreateMemoryInstance" callback when registering ObjectFileBreakpad,
which caused crashes when attemping to load modules from memory. The
correct thing to do is to pass a callback which always returns a null
pointer (as breakpad files are never loaded in inferior memory).
It turns out that there is only one test which exercises this code path,
and it's mac-only, so I've create a new test which should run everywhere
(except windows, as one cannot delete an executable which is being run).
Unfortunately, this test still fails on linux for other reasons, but at
least it gives us something to aim for.
The original commit message was:
This patch adds the scaffolding necessary for lldb to recognise symbol
files generated by breakpad. These (textual) files contain just enough
information to be able to produce a backtrace from a crash
dump. This information includes:
- UUID, architecture and name of the module
- line tables
- list of symbols
- unwind information
A minimal breakpad file could look like this:
MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 a.out
INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
FILE 0 /tmp/a.c
FUNC 1010 10 0 _start
1010 4 4 0
1014 5 5 0
1019 5 6 0
101e 2 7 0
PUBLIC 1010 0 _start
STACK CFI INIT 1010 10 .cfa: $rsp 8 + .ra: .cfa -8 + ^
STACK CFI 1011 $rbp: .cfa -16 + ^ .cfa: $rsp 16 +
STACK CFI 1014 .cfa: $rbp 16 +
Even though this data would normally be considered "symbol" information,
in the current lldb infrastructure it is assumed every SymbolFile object
is backed by an ObjectFile instance. So, in order to better interoperate
with the rest of the code (particularly symbol vendors).
In this patch I just parse the breakpad header, which is enough to
populate the UUID and architecture fields of the ObjectFile interface.
The rough plan for followup patches is to expose the individual parts of
the breakpad file as ObjectFile "sections", which can then be used by
other parts of the codebase (SymbolFileBreakpad ?) to vend the necessary
information.
Reviewers: clayborg, zturner, lemo, amccarth
Subscribers: mgorny, fedor.sergeev, markmentovai, lldb-commits
Differential Revision: https://reviews.llvm.org/D55214
llvm-svn: 348773
Summary:
This patch adds the scaffolding necessary for lldb to recognise symbol
files generated by breakpad. These (textual) files contain just enough
information to be able to produce a backtrace from a crash
dump. This information includes:
- UUID, architecture and name of the module
- line tables
- list of symbols
- unwind information
A minimal breakpad file could look like this:
MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 a.out
INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
FILE 0 /tmp/a.c
FUNC 1010 10 0 _start
1010 4 4 0
1014 5 5 0
1019 5 6 0
101e 2 7 0
PUBLIC 1010 0 _start
STACK CFI INIT 1010 10 .cfa: $rsp 8 + .ra: .cfa -8 + ^
STACK CFI 1011 $rbp: .cfa -16 + ^ .cfa: $rsp 16 +
STACK CFI 1014 .cfa: $rbp 16 +
Even though this data would normally be considered "symbol" information,
in the current lldb infrastructure it is assumed every SymbolFile object
is backed by an ObjectFile instance. So, in order to better interoperate
with the rest of the code (particularly symbol vendors).
In this patch I just parse the breakpad header, which is enough to
populate the UUID and architecture fields of the ObjectFile interface.
The rough plan for followup patches is to expose the individual parts of
the breakpad file as ObjectFile "sections", which can then be used by
other parts of the codebase (SymbolFileBreakpad ?) to vend the necessary
information.
Reviewers: clayborg, zturner, lemo, amccarth
Subscribers: mgorny, fedor.sergeev, markmentovai, lldb-commits
Differential Revision: https://reviews.llvm.org/D55214
llvm-svn: 348592
Summary:
This parses entries in pecoff import tables for imported DLLs and
is intended as the first step to allow LLDB to load a PE's shared
modules when creating a target on the LLDB console.
Reviewers: rnk, zturner, aleksandr.urakov, lldb-commits, labath, asmith
Reviewed By: labath, asmith
Subscribers: labath, lemo, clayborg, Hui, mgorny, mgrang, teemperor
Differential Revision: https://reviews.llvm.org/D53094
llvm-svn: 348527
Summary:
This patch contains several small fixes, which makes it possible to evaluate
expressions on Windows using information from PDB. The changes are:
- several sanitize checks;
- make IRExecutionUnit::MemoryManager::getSymbolAddress to not return a magic
value on a failure, because callers wait 0 in this case;
- entry point required to be a file address, not RVA, in the ObjectFilePECOFF;
- do not crash on a debuggee second chance exception - it may be an expression
evaluation crash. Also fix detection of "crushed" threads in tests;
- create parameter declarations for functions in AST to make it possible to call
debugee functions from expressions;
- relax name searching rules for variables, functions, namespaces and types. Now
it works just like in the DWARF plugin;
- fix endless recursion in SymbolFilePDB::ParseCompileUnitFunctionForPDBFunc.
Reviewers: zturner, asmith, stella.stamenova
Reviewed By: stella.stamenova, asmith
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D53759
llvm-svn: 348136
This reverts commit dec87759523b2f22fcff3325bc2cd543e4cda0e7.
This commit caused the tests on Windows to run forever rather than complete.
Reverting until the commit can be fixed to not stall.
llvm-svn: 348009
Summary:
This patch contains several small fixes, which makes it possible to evaluate
expressions on Windows using information from PDB. The changes are:
- several sanitize checks;
- make IRExecutionUnit::MemoryManager::getSymbolAddress to not return a magic
value on a failure, because callers wait 0 in this case;
- entry point required to be a file address, not RVA, in the ObjectFilePECOFF;
- do not crash on a debuggee second chance exception - it may be an expression
evaluation crash;
- create parameter declarations for functions in AST to make it possible to call
debugee functions from expressions;
- relax name searching rules for variables, functions, namespaces and types. Now
it works just like in the DWARF plugin;
- fix endless recursion in SymbolFilePDB::ParseCompileUnitFunctionForPDBFunc.
Reviewers: zturner, asmith, stella.stamenova
Reviewed By: stella.stamenova, asmith
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D53759
llvm-svn: 347962
Test cases were updated to not use the local compilation dir which
is different between development pc and build bots.
Original commit message:
[LLDB] - Support the single file split DWARF.
DWARF5 spec describes a single file split dwarf case
(when .dwo sections are in the .o files).
Problem is that LLDB does not work correctly in that case.
The issue is that, for example, both .debug_info and .debug_info.dwo
has the same type: eSectionTypeDWARFDebugInfo. And when code searches
section by type it might find the regular debug section
and not the .dwo one.
The patch fixes that. With it, LLDB is able to work with
output compiled with -gsplit-dwarf=single flag correctly.
Differential revision: https://reviews.llvm.org/D52403
llvm-svn: 346855