Currently on Windows (_MSC_VER) LLVMSymbolizer supports only Microsoft mangling.
This fix just explicitly uses itaniumDemangle when mangled name starts with _Z.
Differential Revision: https://reviews.llvm.org/D44192
llvm-svn: 326959
This patch enhances DWARFDebugFrame with the capability of parsing and
printing DWARF expressions in CFI instructions. It also makes FDEs and
CIEs accessible to lib users, so they can process them in client tools
that rely on LLVM. To make it self-contained with a test case, it
teaches llvm-readobj to be able to dump EH frames and checks they are
correct in a unit test. The llvm-readobj code is Maksim Panchenko's work
(maksfb).
Reviewers: JDevlieghere, espindola
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D43313
llvm-svn: 326932
Instead of only printing the CU-relative offset in non-verbose mode, it
makes more sense to only printed the resolved address. In verbose mode
we still print both.
Differential revision: https://reviews.llvm.org/D44148
rdar://33525475
llvm-svn: 326903
Summary: This helps to determine the line number for a PDB type with definition
Reviewers: zturner, llvm-commits, rnk
Reviewed By: zturner
Subscribers: rengolin, JDevlieghere
Differential Revision: https://reviews.llvm.org/D44119
llvm-svn: 326857
Summary:
The symbolizer was checking for .debug as a subdirectory of the
binary file itself, not of the directory containing the binary. This led to
a failure to find split debug info when it was contained in a .debug directory.
Reviewers: rnk, glider, zturner
Subscribers: llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D44025
llvm-svn: 326630
For now this is NFC, but this small refactor opens the door to
letting us embed a hash of the PDB in the build id field of the
PDB.
Differential Revision: https://reviews.llvm.org/D43913
llvm-svn: 326453
Qualifiers on a pointer or reference type may apply to either the
pointee or the pointer itself. Consider 'const char *' and 'char *
const'. In the first example, the pointee data may not be modified
without casts, and in the second example, the pointer may not be updated
to point to new data.
In the general case, qualifiers are applied to types with LF_MODIFIER
records, which support the usual const and volatile qualifiers as well
as the __unaligned extension qualifier.
However, LF_POINTER records, which are used for pointers, references,
and member pointers, have flags for qualifiers applying to the
*pointer*. In fact, this is the only way to represent the restrict
qualifier, which can only apply to pointers, and cannot qualify regular
data types.
This patch causes LLVM to correctly fold 'const' and 'volatile' pointer
qualifiers into the pointer record, as well as adding support for
'__restrict' qualifiers in the same place.
Based on a patch from Aaron Smith
Differential Revision: https://reviews.llvm.org/D43060
llvm-svn: 326260
Summary:
This patch implements the name lookup functionality of the .debug_names
accelerator table and hooks it up to "llvm-dwarfdump -find". To make the
interface of the two kinds of accelerator tables more consistent, I've
created an abstract "DWARFAcceleratorTable::Entry" class, which provides
a consistent interface to access the common functionality of the table
entries (such as getting the die offset, die tag, etc.). I've also
modified the apple table to vend entries conforming to this interface.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: vleschuk, clayborg, echristo, llvm-commits
Differential Revision: https://reviews.llvm.org/D43067
llvm-svn: 326003
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.
Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.
Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.
Differential Revision: https://reviews.llvm.org/D42765
llvm-svn: 325970
Verifying any DWARF file that is optimized and contains at least one tag
with a DW_AT_location with a location list offset as a
DW_AT_form_dataXXX results in dwarfdump spuriously claiming that the
location list is invalid.
Differential revision: https://reviews.llvm.org/D40199
llvm-svn: 325430
This was originally reported as a bug with the symptom being "cvdump
crashes when printing an LLD-linked PDB that has an S_FILESTATIC record
in it". After some additional investigation, I determined that this was
a symptom of a larger problem, and in fact the real problem was in the
way we emitted the global PDB string table. As evidence of this, you can
take any lld-generated PDB, run cvdump -stringtable on it, and it would
return no results.
My hypothesis was that cvdump could not *find* the string table to begin
with. Normally it would do this by looking in the "named stream map",
finding the string /names, and using its value as the stream index. If
this lookup fails, then cvdump would fail to load the string table.
To test this hypothesis, I looked at the name stream map generated by a
link.exe PDB, and I emitted exactly those bytes into an LLD-generated
PDB. Suddenly, cvdump could read our string table!
This code has always been hacky and we knew there was something we
didn't understand. After all, there were some comments to the effect of
"we have to emit strings in a specific order, otherwise things don't
work". The key to fixing this was finally understanding this.
The way it works is that it makes use of a generic serializable hash map
that maps integers to other integers. In this case, the "key" is the
offset into a buffer, and the value is the stream number. If you index
into the buffer at the offset specified by a given key, you find the
name. The underlying cause of all these problems is that we were using
the identity function for the hash. i.e. if a string's offset in the
buffer was 12, the hash value was 12. Instead, we need to hash the
string *at that offset*. There is an additional catch, in that we have
to compute the hash as a uint32 and then truncate it to uint16.
Making this work is a little bit annoying, because we use the same hash
table in other places as well, and normally just using the identity
function for the hash function is actually what's desired. I'm not
totally happy with the template goo I came up with, but it works in any
case.
The reason we never found this bug through our own testing is because we
were building a /parallel/ hash table (in the form of an
llvm::StringMap<>) and doing all of our lookups and "real" hash table
work against that. I deleted all of that code and now everything goes
through the real hash table. Then, to test it, I added a unit test which
adds 7 strings and queries the associated values. I test every possible
insertion order permutation of these 7 strings, to verify that it really
does work as expected.
Differential Revision: https://reviews.llvm.org/D43326
llvm-svn: 325386
Seeing some inlining missing in internal uses of symbolizer. I'll work
on a reproduction, tests, improvements & recommit as soon as possible.
(Chandler would like it to be known that this improvement did make
check-llvm 4x faster... - so there's certainly some fairly good
motivation to push on fixing/figuring this out & getting it back in)
This reverts commit r321345.
llvm-svn: 324981
Emitting the correct (root of compilation) file at index 0 will be
posted for review later; I wanted to get this minor change out of the
way first.
llvm-svn: 324669
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
llvm-svn: 324270
This change adds support to llvm-dwarfdump for dumping DWARF5
.debug_rnglists sections in regular ELF files.
It is not complete, in that several DW_RLE_* encodings are currently
not supported, but does dump the headert and the basic ranges for
DW_RLE_start_length and DW_RLE_start_end encodings.
Obvious next steps are to add verbose dumping that dumps the raw
encodings, rather than the interpreted contents, to add -verify support
of the section (e.g. to show that the correct number of offsets are
specified), add dumping of .debug_rnglists.dwo, and to add support for
other encodings.
Reviewed by: dblaikie, JDevlieghere
Differential Revision: https://reviews.llvm.org/D42481
llvm-svn: 324077
Based on a profile, a couple of hot spots were identified in the
main type merging loop. The code was simplified, a few loops
were re-arranged, and some outlined functions were inlined. This
speeds up type merging by a decent amount, shaving around 3-4 seconds
off of a 40 second link in my test case.
Differential Revision: https://reviews.llvm.org/D42559
llvm-svn: 323790
r323476 added support for DW_FORM_line_strp, and incorrectly made that
depend on having a DWARFUnit available. We shouldn't be tracking
.debug_line_str in DWARFUnit after all. After this patch, I can do an
NFC follow up and undo a bunch of the "plumbing" part of r323476.
Differential Revision: https://reviews.llvm.org/D42609
llvm-svn: 323691
The test was failing because of an incorrect sizeof check in the name
index parsing code. This code was meant to check that we have enough
input to parse the fixed-size part of the dwarf header, which it did by
comparing the input to sizeof(Header). Originally struct Header only
contained the fixed-size part, but during review, we've moved additional
members into it, which rendered the sizeof check invalid.
I resolve this by moving the fixed-size part to a separate struct and
updating the sizeof-expression to use that.
llvm-svn: 323648
The call to ScopedPrinter::printNumber with size_t argument was
ambiguous (I think) on 32-bit builds. Explicitly cast to a 64-bit int to
avoid this.
llvm-svn: 323642
Summary:
This modifies the dwarfdump output to align it with the new .debug_names
dump. It also renames two header fields to match similar fields in the
dwarf5 header.
A couple of tests needed to be updated to match new output. The changes
were fairly straight-forward, although not really automatable.
Reviewers: JDevlieghere, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42415
llvm-svn: 323641
Summary:
This commit renames DWARFAcceleratorTable to AppleAcceleratorTable to free up
the first name as an interface for the different accelerator tables.
Then I add a DWARFDebugNames class for the dwarf5 table.
Presently, the only common functionality of the two classes is the dump()
method, because this is the only method that was necessary to implement
dwarfdump -debug-names; and because the rest of the
AppleAcceleratorTable interface does not directly transfer to the dwarf5
tables (the main reason for that is that the present interface assumes
the tables are homogeneous, but the dwarf5 tables can have different
keys associated with each entry).
I expect to make the common interface richer as I add more functionality
to the new class (and invent a way to represent it in generic way).
In terms of sharing the implementation, I found the format of the two
tables sufficiently different to frustrate any attempts to have common
parsing or dumping code, so presently the implementations share just low
level code for formatting dwarf constants.
Reviewers: vleschuk, JDevlieghere, clayborg, aprantl, probinson, echristo, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42297
llvm-svn: 323638
This patch moves the DJB hash to support. This is consistent with other
hashing algorithms living there. The hash is used by the DWARF
accelerator tables. We're doing this now because the hashing function is
needed by dsymutil and we don't want to link against libBinaryFormat.
Differential revision: https://reviews.llvm.org/D42594
llvm-svn: 323616
Move standard forms from a switch statement to the table of forms;
fill in all the missing ones defined in DWARF v5. I'm guessing at
classifications in a couple of cases where v5 forms aren't actually
supported yet, but whoever adds support for the forms can fix the
classifications as needed.
llvm-svn: 323481
This form is like DW_FORM_strp, but points to .debug_line_str instead
of .debug_str as the string section. It's intended to be used from
the line-table header, and allows string-pooling of directory and
filenames across compilation units.
Differential Revision: https://reviews.llvm.org/D42553
llvm-svn: 323476
This frees up the first name to be used as an base class for the
apple table and the dwarf5 .debug_names accel table. The rename was
split off from D42297 (adding of debug_names support), which is still
under review.
llvm-svn: 323113
The compilation directory has always been #0, but as of DWARF v5 it is
explicitly listed in the line-table section instead of implicitly
being a reference to the compile_unit DIE's DW_AT_comp_dir attribute.
This means the dumper should number the dumped array starting with 0
or 1 depending on the DWARF version of the line table.
References in the generated DWARF are correct, it's just the dumper
that was wrong. Also some assembler-coded tests were similarly
confused about directory numbers.
llvm-svn: 322884
There's some abstraction overhead in the underlying
mechanisms that were being used, and it was leading to an
abundance of small but not-free copies being made. This
showed up on a profile. Eliminating this and going back to
a low-level byte-based implementation speeds up lld with
/DEBUG between 10 and 15%.
Differential Revision: https://reviews.llvm.org/D42148
llvm-svn: 322871
Summary:
- Fix a bug in PrettyBuiltinDumper that returns "void" as the name for
an unspecified builtin type. Since the unspecified param of a variadic
function is considered a builtin of unspecified type in PDBs, we set
"..." for its name.
- Provide a method to determine if a PDBSymbolFunc is variadic in
PrettyFunctionDumper since PDBSymbolFunc::getArgument() doesn't return the
last unspecified-type param.
- Add a pretty-func-dumper.test to test pretty dumping of variadic
functions.
Reviewers: zturner, llvm-commits
Reviewed By: zturner
Differential Revision: https://reviews.llvm.org/D41801
llvm-svn: 322608
We have some code to try to determine how many pieces an MSF
Free Page Map is split into, and this code had an off by one
error which would cause the calculation to be incorrect when
there were exactly 4096*k + 1 blocks in an MSF file.
Original investigation and patch outline by Colden Cullen.
Differential Revision: https://reviews.llvm.org/D41742
llvm-svn: 321880
inlined subroutines for a given address.
This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.
This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.
It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.
Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]
All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.
Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.
Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.
Differential Revision: https://reviews.llvm.org/D40987
llvm-svn: 321345
Reorganizes the DWARF consumer to derive the string offsets table
contribution's format from the contribution header instead of
(incorrectly) from the unit's format.
Reviewers: JDevliegehere, aprantl
Differential Revision: https://reviews.llvm.org/D41146
llvm-svn: 321295
Adds missing support for DW_FORM_data16.
Update of r320852/r320886, fixing the unittest again, this time use a
raw char string for the test data.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 321011
Adds missing support for DW_FORM_data16.
Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 320886
This adds the /DEBUG:GHASH option to LLD which will look for
the existence of .debug$H sections in linker inputs and use them
to accelerate type merging. The clang-cl side has already been
added, so this completes the work necessary to begin experimenting
with this feature.
Differential Revision: https://reviews.llvm.org/D40980
llvm-svn: 320719
Currently this is an LLVM extension to the COFF spec which is
experimental and intended to speed up linking. For now it is
behind a hidden cl::opt flag, but in the future we can move it
to a "real" cc1 flag and have the driver pass it through whenever
it is appropriate.
The patch to actually make use of this section in lld will come
in a followup.
Differential Revision: https://reviews.llvm.org/D40917
llvm-svn: 320649
This fixes a bug where the verifier was complaining about empty
accelerator tables. When the table is empty, its size is not a valid
offset as it points after the end of the section.
This patch also makes the extractor return llvm:Error instead of bool
for better error reporting in the verifier.
Differential revision: https://reviews.llvm.org/D41063
rdar://35932007
llvm-svn: 320399
Previously, when linking against libcmt from the MSVC runtime,
lld-link /verbose would show "Ignoring unknown symbol record
with kind 0x1006". It turns out this was because
TypeIndexDiscovery did not handle S_REGISTER records, so these
records were not getting properly remapped.
Patch by: Alexnadre Ganea
Differential Revision: https://reviews.llvm.org/D40919
llvm-svn: 320108
Currently nothing uses this, but this at least gets the core
algorithm in, and adds some test to demonstrate correctness.
Differential Revision: https://reviews.llvm.org/D40736
llvm-svn: 319854
This was storing the hash alongside the key so that the hash
doesn't need to be re-computed every time, but in doing so it
was allocating a structure to keep the key size small in the
DenseMap. This is a noble goal, but it also leads to a pointer
indirection on every probe, and this cost of this pointer
indirection ends up being higher than the cost of having a
slightly larger entry in the hash table. Removing this not only
simplifies the code, but yields a small but noticeable
performance improvement in the type merging algorithm.
llvm-svn: 319493
This class had some code that would automatically remap type
indices before hashing and serializing. The only caller of
this method was the TypeStreamMerger anyway, and the method
doesn't make general sense, and prevents making certain future
improvements to the class. So, factoring this up one level
into the TypeStreamMerger where it belongs.
llvm-svn: 319377
A couple of places in LLD were passing references to
TypeTableCollections around, which makes it hard to change the
implementation at runtime. However, these cases only needed to
iterate over the types in the collection, and TypeCollection
already provides a handy abstract interface for this purpose.
By implementing this interface, we can get rid of the need to
pass TypeTableBuilder references around, which should allow us
to swap the implementation at runtime in subsequent patches.
llvm-svn: 319345
The motivation behind this patch is that future directions require us to
be able to compute the hash value of records independently of actually
using them for de-duplication.
The current structure of TypeSerializer / TypeTableBuilder being a
single entry point that takes an unserialized type record, and then
hashes and de-duplicates it is not flexible enough to allow this.
At the same time, the existing TypeSerializer is already extremely
complex for this very reason -- it tries to be too many things. In
addition to serializing, hashing, and de-duplicating, ti also supports
splitting up field list records and adding continuations. All of this
functionality crammed into this one class makes it very complicated to
work with and hard to maintain.
To solve all of these problems, I've re-written everything from scratch
and split the functionality into separate pieces that can easily be
reused. The end result is that one class TypeSerializer is turned into 3
new classes SimpleTypeSerializer, ContinuationRecordBuilder, and
TypeTableBuilder, each of which in isolation is simple and
straightforward.
A quick summary of these new classes and their responsibilities are:
- SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of
bytes. Does not do any hashing. Every time you call it, it will
re-serialize and return bytes again. The same instance can be re-used
over and over to avoid re-allocations, and in exchange for this
optimization the bytes returned by the serializer only live until the
caller attempts to serialize a new record.
- ContinuationRecordBuilder : Turns a FieldList-like record into a series
of fragments. Does not do any hashing. Like SimpleTypeSerializer,
returns references to privately owned bytes, so the storage is
invalidated as soon as the caller tries to re-use the instance. Works
equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a
long-standing theoretical limitation of the previous implementation.
- TypeTableBuilder : Accepts sequences of bytes that the user has already
serialized, and inserts them by de-duplicating with a hash table. For
the sake of convenience and efficiency, this class internally stores a
SimpleTypeSerializer so that it can accept unserialized records. The
same is not true of ContinuationRecordBuilder. The user is required to
create their own instance of ContinuationRecordBuilder.
Differential Revision: https://reviews.llvm.org/D40518
llvm-svn: 319198
The previous implementation would only look 1 DW_AT_specification or DW_AT_abstract_origin deep. This means DWARFDie::getName() would fail in certain cases. I ran into such a case while creating a tool that used the LLVM DWARF parser to generate a symbolication format so I have seen this in the wild.
Differential Revision: https://reviews.llvm.org/D40156
llvm-svn: 319104
The existing library assumed that a stream's length would never
change. This makes some things simpler, but it's not flexible
enough for what we need, especially for writable streams where
what you really want is for each call to write to actually append.
llvm-svn: 319070
DWARF4 relative DW_AT_high_pc values are now displayed as absolute
addresses. The relative value is only shown when explicitly dumping the
forms, i.e. in show-form or verbose mode.
```
DW_AT_low_pc (0x0000000000000049)
DW_AT_high_pc (0x00000019)
```
becomes
```
DW_AT_low_pc (0x0000000000000049)
DW_AT_high_pc (0x0000000000000062)
```
Differential revision: https://reviews.llvm.org/D40317
rdar://35416943
llvm-svn: 319044
As a side effect, the .debug_line section will be dumped in physical
order, rather than in the order that compile units refer to their
associated portions of the .debug_line section. These are probably
always the same order anyway, and no tests noticed the difference.
Differential Revision: https://reviews.llvm.org/D39854
llvm-svn: 318839
It turns out this #include isn't used from Host.h anyway,
but by having it it causes circular include dependencies.
This issues only surfaced while I was working on a separate
patch, so I'm submitting this first so that it's independent
of the other, unrelated patch.
llvm-svn: 318489
Initial changes to support debugging PE/COFF files with LLDB on Windows through DIA SDK.
There is another set of changes required on the LLDB side before this does anything.
Differential Revision: https://reviews.llvm.org/D39517
llvm-svn: 318403
Patch improves next things:
* Fixes assert/crash in getOpDesc when giving it a invalid expression op code.
* DWARFExpression::print() called DWARFExpression::Operation::getEndOffset() which
returned and used uninitialized field EndOffset. Patch fixes that.
* Teaches verifier to verify DW_AT_location and error out on broken expressions.
Differential revision: https://reviews.llvm.org/D39294
llvm-svn: 316756
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.
Differential revision: https://reviews.llvm.org/D38409
llvm-svn: 316619
This fixes possible out of bound access in
DWARFDie::getFirstChild()
which might happen when .debug_info section is corrupted,
like shown in testcase.
Differential revision: https://reviews.llvm.org/D39185
llvm-svn: 316566
The type index is from the TPI stream, not the IPI stream. Fix the
dumper, fix type index discovery, and add a test in LLD.
Also improve the log message we emit when we fail to rewrite type
indices in LLD. That's how I found this bug.
llvm-svn: 316461
This adds type index discovery and dumper support for symbol record kind
0x1168, which is a list of inlined function ids. This symbol kind is
undocumented, but S_INLINEES is consistent with the existing
nomenclature.
Fixes PR34222
llvm-svn: 316398
Currently llvm-dwarfdump runs into llvm_unreachable when
faces DW_CFA_GNU_args_size. Patch implements the support.
Differential revision: https://reviews.llvm.org/D38879
llvm-svn: 315897
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.
Recommit after being reverted in r315299.
Differential revision: https://reviews.llvm.org/D36993
llvm-svn: 315316
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.
Differential revision: https://reviews.llvm.org/D36993
llvm-svn: 315297
This patch adds two new verifiers:
- It checks that the root DIE of a CU is actually a valid unit DIE.
(based on its tag)
- For DWARF5 which contains a unit type int he CU header, it checks that
this matches the type of the unit DIE.
Differential revision: https://reviews.llvm.org/D38453
llvm-svn: 315121
The test fails on Linux; see follow-up email on the llvm-commits list.
> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409
This also reverts the follow-up r314818:
> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test
llvm-svn: 314825
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.
X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.
Differential revision: https://reviews.llvm.org/D38480
llvm-svn: 314821
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.
Differential revision: https://reviews.llvm.org/D38409
llvm-svn: 314817
This implement the insertion operator for DWARF address ranges so they
are consistently printed as [LowPC, HighPC).
While a dump method might have felt more consistent, it is used
exclusively for printing error messages in the verifier and never used
for actual dumping. Hence this approach is more intuitive and creates
less clutter at the call sites.
Differential revision: https://reviews.llvm.org/D38395
llvm-svn: 314523
This patch introduces 3 helper functions: error(), warn() and note() to
make printing during verification more consistent. When supported, the
respective prefixes are printed in color using the same color scheme as
clang.
Differential revision: https://reviews.llvm.org/D38368
llvm-svn: 314498
This patch implements the dwarfdump option --find=<name>. This option
looks for a DIE in the accelerator tables and dumps it if found. This
initial patch only adds support for .apple_names to keep the review
small, adding the other sections and pubnames support should be
trivial though.
Differential Revision: https://reviews.llvm.org/D38282
llvm-svn: 314439
This patch adds a check to the DWARF verifier to detect CUs without a
unit DIE.
Differential revision: https://reviews.llvm.org/D38363
llvm-svn: 314426
When dsymutil generates the companion file, its strips all unnecessary
sections by omitting their body and setting the offset in their
corresponding load command to zero.
One such section is the .eh_frame section, as it contains runtime
information rather than debug information and is part of the __TEXT
segment. When reading this section, we would just read the number of
bytes specified in the load command, starting from offset 0 (i.e. the
beginning of the file).
Rather than trying to parse this obviously invalid section, dwarfdump
now skips this.
Differential revision: https://reviews.llvm.org/D38135
llvm-svn: 314208
This patch adds dumping of line table instructions as well as the final
state at each specified pc value in verbose mode. This is essentially
the same as the default in Darwin's dwarfdump. Dumping the actual line
table opcodes can be particularly useful for something like debugging a
bad `.debug_line` section.
Differential revision: https://reviews.llvm.org/D37971
llvm-svn: 313910
This is a bit ugly because we can't put Optional into a union. Hide all
of that behind a set of accessors and make accesses safer using asserts.
llvm-svn: 313884
This patch implements the Darwin dwarfdump option --recurse-depth=<N>,
which limits the recursion depth when selectively printing DIEs at an
offset.
Differential Revision: https://reviews.llvm.org/D38064
llvm-svn: 313778
When symbolizing large binaries, parsing every CU in a DWP file is a
significant performance penalty. Instead, use the index to only load the
CUs that are needed.
llvm-svn: 313659
This speeds up dumping specific DIEs by not parsing abbreviations for
units that are not used.
(this is also handy to have in eventually to speed up llvm-symbolizer
for .dwp files, where parsing most of the DWP file can be avoided by
using the index)
llvm-svn: 313635
This patch makes the `.eh_frame` extension an alias for `.debug_frame`.
Up till now it was only possible to dump the section using objdump, but
not with dwarfdump. Since the two are essentially interchangeable, we
dump whichever of the two is present.
As a workaround, this patch also adds parsing for 3 currently
unimplemented CFA instructions: `DW_CFA_def_cfa_expression`,
`DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the
required knowledge, I just parse the fields without actually creating
the instructions.
Finally, this also fixes the typo in the `.debug_frame` section name
which incorrectly contained a trailing `s`.
Differential revision: https://reviews.llvm.org/D37852
llvm-svn: 313530
This is the first of many commits that enable selectively dumping just
one record from the debug info.
This reapplies r313412 with some extra qualification to appease GCC and MSVC.
llvm-svn: 313419
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:
- Verify that all address ranges in a DIE are valid.
- Verify that no ranges within the DIE overlap.
- Verify that no ranges overlap with the ranges of a sibling.
- Verify that children are completely contained in its (direct)
parent's address range. (unless both are subprograms)
Differential revision: https://reviews.llvm.org/D37696
llvm-svn: 313255
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:
- Verify that all address ranges in a DIE are valid.
- Verify that no ranges within the DIE overlap.
- Verify that no ranges overlap with the ranges of a sibling.
- Verify that children are completely contained in its (direct)
parent's address range. (unless both are subprograms)
Differential revision: https://reviews.llvm.org/D37696
llvm-svn: 313250
Since users typically don't really care about the .dwo / non.dwo
distinction, this patch makes it so dwarfdump --debug-<info,...> dumps
.debug_info and (if available) also .debug_info.dwo. This simplifies
the command line interface (I've removed all dwo-specific dump
options) and makes the tool friendlier to use.
Differential Revision: https://reviews.llvm.org/D37771
llvm-svn: 313207
This patches renames "brief" to "verbose" in de DIDumpOptions and
inverts the logic to match the new behavior where brief is the default.
Changing the default value uncovered some bugs related to the
DIDumpOptions not being propagated and have been fixed as well.
Differential revision: https://reviews.llvm.org/D37745
llvm-svn: 313139
As discussed on llvm-dev in
http://lists.llvm.org/pipermail/llvm-dev/2017-September/117301.html
this changes the command line interface of llvm-dwarfdump to match the
one used by the dwarfdump utility shipping on macOS. In addition to
being shorter to type this format also has the advantage of allowing
more than one section to be specified at the same time.
In a nutshell, with this change
$ llvm-dwarfdump --debug-dump=info
$ llvm-dwarfdump --debug-dump=apple-objc
becomes
$ dwarfdump --debug-info --apple-objc
Differential Revision: https://reviews.llvm.org/D37714
llvm-svn: 312970
This patch adds prologue verification, which is already present in
Apple's dwarfdump. It checks for invalid directory indices and warns
about duplicate file paths.
Differential revision: https://reviews.llvm.org/D37511
llvm-svn: 312782
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").
Differential Revision: https://reviews.llvm.org/D37589
llvm-svn: 312744
It solves issue of wrong section index evaluating for ranges when
base address is used.
Based on David Blaikie's patch D36097.
Differential revision: https://reviews.llvm.org/D37214
llvm-svn: 312477
We have llvm-readobj for dumping CodeView from object files, and
llvm-pdbutil has always been more focused on PDB. However,
llvm-pdbutil has a lot of useful options for summarizing debug
information in aggregate and presenting high level statistical
views. Furthermore, it's arguably better as a testing tool since
we don't have to write tests to conform to a state-machine like
structure where you match multiple lines in succession, each
depending on a previous match. llvm-pdbutil dumps much more
concisely, so it's possible to use single-line matches in many
cases where as with readobj tests you have to use multi-line
matches with an implicit state machine.
Because of this, I'm adding object file support to llvm-pdbutil.
In fact, this mirrors the cvdump tool from Microsoft, which also
supports both object files and pdb files. In the future we could
perhaps rename this tool llvm-cvutil.
In the meantime, this allows us to deep dive into object files
the same way we already can with PDB files.
llvm-svn: 312358
This adds a new command line option, -udt-stats, which breaks
down the stats of S_UDT records. These are one of the biggest
contributors to the size of /DEBUG:FASTLINK PDBs, so they need
some additional tools to be able to analyze their usage. This
option will dig into each S_UDT record and determine what kind
of record it points to, and then break down the statistics by
the target type. The goal here is to identify how our object
files differ from MSVC object files in S_UDT records, so that
we can output fewer of them and reach size parity.
llvm-svn: 312276
Summary:
Based on Fred's patch here: https://reviews.llvm.org/D6771
I can't seem to commandeer the old review, so I'm creating a new one.
With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:
DW_AT_location [DW_FORM_data4] (0x00000000
0x0000000000000001 - 0x000000000000000b: DW_OP_consts +3
0x000000000000000b - 0x0000000000000012: DW_OP_consts +7
0x0000000000000012 - 0x000000000000001b: DW_OP_reg0 RAX, DW_OP_piece 0x4
0x000000000000001b - 0x0000000000000024: DW_OP_breg5 RDI+0)
And like this for debug_loc.dwo entries:
DW_AT_location [DW_FORM_sec_offset] (0x00000000
Addr idx 2 (w/ length 190): DW_OP_consts +0, DW_OP_stack_value
Addr idx 3 (w/ length 23): DW_OP_reg0 RAX, DW_OP_piece 0x4)
Simple locations without ranges are printed inline:
DW_AT_location [DW_FORM_block1] (DW_OP_reg4 RSI, DW_OP_piece 0x4, DW_OP_bit_piece 0x20 0x0)
The debug_loc(.dwo) dumping in changed accordingly to factor the code.
Reviewers: dblaikie, aprantl, friss
Subscribers: mgorny, javed.absar, hiraditya, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D37123
llvm-svn: 312042
We were using a std::vector<> and resizing to MaxRecordLength,
which is ~64KB. We would then do this repeatedly often many
times in a tight loop, which was causing measurable performance
impact when linking PDBs.
Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36940
llvm-svn: 311375
computeAddrMap function calls std::stable_sort with a comparison
function that computes deserialized symbols every time its called.
In the result deserializeAs<PublicSym32> is called 20-30 times per
symbol. It's much faster to calculate it beforehand and pass a
pointer to it to the comparison function.
Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36941
llvm-svn: 311373
This adds support for dumping a summary of module symbols
and CodeView debug chunks. This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size. Then
at the end it prints the totals for the entire file.
Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.
llvm-svn: 311338
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.
Differential Revision: https://reviews.llvm.org/D36835
llvm-svn: 311201
When dumping, we were treating the S_INLINESITESYM as referring
to a type record, when it actually refers to an id record. We
had this correct in TypeIndexDiscovery, so our merging algorithm
should be fine, but we had it wrong in the dumper, which means it
would appear to work most of the time, unless the index was out
of bounds in the type stream, when it would fail. Fixed this, and
audited a few other cases to make them match the behavior in
TypeIndexDiscovery.
Also, I've now observed a new symbol record with kind 0x1168 which
I have no clue what it is, so to avoid crashing we have to just
print "Unknown Symbol Kind".
llvm-svn: 311117
As was requested in D36313 thread,
with this patch section names and uniqueness calculated once,
and not every time when a range is dumped.
Differential revision: https://reviews.llvm.org/D36740
llvm-svn: 310923
Teaches llvm-dwarfdump to print section index and name of range
when it dumps .debug_info.
Differential revision: https://reviews.llvm.org/D36313
llvm-svn: 310915
Previously we were writing an empty globals stream. Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this. Regardless,
without it we don't have information about global variables, so
we need to fix it anyway. This patch does that.
With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.
Differential Revision: https://reviews.llvm.org/D36535
llvm-svn: 310743
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader. This caused debugging in WinDbg
to break.
We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.
llvm-svn: 310439
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams. PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.
Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.
Differential Revision: https://reviews.llvm.org/D36489
llvm-svn: 310438
The compiler outputs PROC32_ID symbols into the object files
for functions, and these symbols have an embedded type index
which, when copied to the PDB, refer to the IPI stream. However,
the symbols themselves are also converted into regular symbols
(e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular
symbol records refer to the TPI stream. So this patch applies
two fixes to function records.
1. It converts ID symbols to the proper non-ID record type.
2. After remapping the type index from the object file's index
space to the PDB file/IPI stream's index space, it then
remaps that index to the TPI stream's index space by.
Besides functions, during the remapping process we were also
discarding symbol record types which we did not recognize.
In particular, we were discarding S_BPREL32 records, which is
what MSVC uses to describe local variables on the stack. So
this patch fixes that as well by copying them to the PDB.
Differential Revision: https://reviews.llvm.org/D36426
llvm-svn: 310394
These lead to tests failing spuriously as the values after being rendered to a
string were incorrect.
Reviewers: clayborg
Differential Revision: https://reviews.llvm.org/D36319
llvm-svn: 310262
This extends the native reader to enable llvm-pdbutil to list the enums in a
PDB and it includes a simple test. It does not yet list the values in the
enumerations, which requires an actual implementation of
NativeEnumSymbol::FindChildren.
To exercise this code, use a command like:
llvm-pdbutil pretty -native -enums foo.pdb
Differential Revision: https://reviews.llvm.org/D35738
llvm-svn: 310144
Summary:
PDB section contributions are supposed to use output section indices and
offsets, not input section indices and offsets.
This allows the debugger to look up the index of the module that it
should look up in the modules stream for symbol information. With this
change, windbg can now find line tables, but it still cannot print local
variables.
Fixes PR34048
Reviewers: zturner
Subscribers: hiraditya, ruiu, llvm-commits
Differential Revision: https://reviews.llvm.org/D36285
llvm-svn: 309987
The PDB reserves certain blocks for the FPM that describe which
blocks in the file are allocated and which are free. We weren't
filling that out at all, and in some cases we were even stomping
it with incorrect data. This patch writes a correct FPM.
Differential Revision: https://reviews.llvm.org/D36235
llvm-svn: 309896
Recently problems have been discovered in the way we write the FPM
(free page map). In order to fix this, we first need to establish
a baseline about what a correct FPM looks like using an MSVC
generated PDB, so that we can then make our own generated PDBs
match. And in order to do this, the dumper needs a mode where it
can dump an FPM so that we can write tests for it.
This patch adds a command to dump the FPM, as well as a test against
a known-good PDB.
llvm-svn: 309894
Followup to r309570, fixing it slightly differently (ranges_base and
addr_base should never be read from a DWO file - so there shouldn't be
any issue with 'overriding' the values - conditionalize the code and
assert that the values aren't being overriden).
llvm-svn: 309879
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.
Differential Revision: https://reviews.llvm.org/D35290
llvm-svn: 309608
DIEs are lazily deserialized so it's possible that the DWO CU is created
before the DIE is parsed. DWO shares .debug_addr and .debug_ranges with the
object file so overwriting the offset with 0 will make the CU unusable.
No test case because I couldn't get clang to emit a non-zero range base.
llvm-svn: 309570
I was a bit lazy when I first implemented this & skipped the index
lookup - obviously for large files this becomes pretty crucial, so here
we go, do the index lookup. Speeds up large DWP symbolizing by... lots.
(20m -> 20s, actually, maybe more in a release build (that was a release
build without index lookup, compared to a debug/non-release build with
the index usage))
llvm-svn: 309507
If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.
This only provides a mechanism for specifying a single DWP file, good if
you're symbolizing a program with a single DWP file, but it's likely if
the program is dynamically linked that you might have a DWP for each
dynamic library - in which case this feature won't help (at least as
it's surfaced in llvm-symbolizer for now) - in theory it could be
extended to specify a collection of DWP files that could all be
consulted for split CU hash resolution.
llvm-svn: 309498
With ASan, we would write about 512 bytes of malloc fill value to the
PDB, with some random bits ORed in here and there. Dumping the PDB would
always fail reliably.
llvm-svn: 309331
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.
Reviewers: zturner, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D35871
llvm-svn: 309303
The PDB "symbol stream" actually contains symbol records for the publics
and the globals stream. The globals and publics streams are essentially
hash tables that point into a single stream of records. In order to
match cvdump's behavior, we need to only dump symbol records referenced
from the hash table. This patch implements that, and then implements
global stream dumping, since it's just a subset of public stream
dumping.
Now we shouldn't see S_PROCREF or S_GDATA32 records when dumping
publics, and instead we should see those record in the globals stream.
llvm-svn: 309066
Mostly just to silence a warning about an unhandled case. There don't seem to
be any tests for this operator (at least that I could find).
llvm-svn: 308901