Commit Graph

248 Commits

Author SHA1 Message Date
Zachary Turner ee8010abe3 Move some code from PDBFileBuilder to MSFBuilder.
The code to emit the pieces of the MSF file were actually in
PDBFileBuilder.  Move this to MSFBuilder so that we can
theoretically emit an MSF without having a PDB file.

llvm-svn: 335789
2018-06-27 21:18:15 +00:00
Jonas Devlieghere 43dce3edbe [CodeView] Add prefix to CodeView registers.
Adds CVReg to CodeView register names to prevent a duplicate symbol with
CR3 defined in termios.h, as suggested by Zachary on the mailing list.

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123372.html

Differential revision: https://reviews.llvm.org/D47478

rdar://39863705

llvm-svn: 333421
2018-05-29 14:35:34 +00:00
James Henderson 004b729ed1 [DWARF] Refactor callback usage for .debug_line error handling
Change the "recoverable" error callback to take an Error instaed of a
string.

Reviewed by: JDevlieghere

Differential Revision: https://reviews.llvm.org/D46831

llvm-svn: 332845
2018-05-21 15:30:54 +00:00
Peter Collingbourne f7b81db715 MC: Change the streamer ctors to take an object writer instead of a stream. NFCI.
The idea is that a client that wants split dwarf would create a
specific kind of object writer that creates two files, and use it to
create the streamer.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47050

llvm-svn: 332749
2018-05-18 18:26:45 +00:00
Nico Weber 9668e45b61 Inline a few CMake variables into their only uses.
No behavior change. Makes unittests CMakeLists.txt files more self-consistent.

llvm-svn: 332280
2018-05-14 19:23:31 +00:00
Simon Pilgrim affbc99bea Fix Wdocumentation warnings. NFCI.
llvm-svn: 332239
2018-05-14 12:22:30 +00:00
James Henderson 198a87c8f3 [DWARF] Remove unused member and fix(?) the unit-tests on big endian hosts
I can't verified the fix on a big endian host, so I'm not 100% certain it
will work.

llvm-svn: 331986
2018-05-10 14:36:24 +00:00
Roman Lebedev 85343925d7 [DWARF] DwarfGenerator.h LineTable: explicitly mark DG as unused
Just want to unbreak the build.

llvm-svn: 331984
2018-05-10 14:16:45 +00:00
Roman Lebedev e1b0e29e1e [DWARF] dwarfgen::LineTable::writeData(): pacify -Wcovered-switch-default
llvm-svn: 331983
2018-05-10 14:16:41 +00:00
Roman Lebedev a68aee07d5 [DWARF] DWARFDebugLineTest: fix a few more signed/unsigned mismatch warnings
llvm-svn: 331982
2018-05-10 14:16:37 +00:00
James Henderson 11a9de74c9 Fix signed/unsigned comparison warning and print format
The print format was causing at least 2 unit-test failures from r331971.

The signed/unsigned comparison warnings only appeared to affect two lines but
it was unclear whether it might just pop up on other lines, so I have been
explicit in all the literals in the tests.

There were other bot unit-test failures that I am still investigating.

llvm-svn: 331978
2018-05-10 12:15:43 +00:00
James Henderson a3acf99e59 [DWARF] Rework debug line parsing to use llvm::Error and callbacks
Reviewed by: dblaikie, JDevlieghere, espindola

Differential Revision: https://reviews.llvm.org/D44560

Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.

There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).

I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.

Known behaviour changes:
  - The dump function in DWARFContext now does not attempt to read subsequent
  tables when searching for a specific offset, if the unit length field of a
  table before the specified offset is a reserved value.
  - getOrParseLineTable now returns a useful Error if an invalid offset is
  encountered, rather than simply a nullptr.
  - The parse functions no longer use `WithColor::warning` directly to report
  errors, allowing LLD to call its own warning function.
  - The existing parse error messages have been updated to not specifically
  include "warning" in their message, allowing consumers to determine what
  severity the problem is.
  - If the line table version field appears to have a value less than 2, an
  informative error is returned, instead of just false.
  - If the line table unit length field uses a reserved value, an informative
  error is returned, instead of just false.
  - Dumping of .debug_line.dwo sections is now implemented the same as regular
  .debug_line sections.
  - Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
  there is a prologue error, just like non-verbose dumping.

As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.

This change also requires a change to LLD, which will be committed separately.

llvm-svn: 331971
2018-05-10 10:51:33 +00:00
Nico Weber da750079fa IWYU for llvm-config.h, removals. Also see r331184.
llvm-svn: 331190
2018-04-30 15:26:01 +00:00
Nico Weber 432a38838d IWYU for llvm-config.h in llvm, additions.
See r331124 for how I made a list of files missing the include.
I then ran this Python script:

    for f in open('filelist.txt'):
        f = f.strip()
        fl = open(f).readlines()

        found = False
        for i in xrange(len(fl)):
            p = '#include "llvm/'
            if not fl[i].startswith(p):
                continue
            if fl[i][len(p):] > 'Config':
                fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
                found = True
                break
        if not found:
            print 'not found', f
        else:
            open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.

No intended behavior change.

llvm-svn: 331184
2018-04-30 14:59:11 +00:00
Jonas Devlieghere 5c709eda07 [ObjectYAML] Add ability for DWARFYAML to calculate DIE lengths
This patch adds the ability for the ObjectYAML DWARFEmitter to calculate
the lengths of DIEs. This is accomplished by creating a DIEFixupVisitor
class which traverses the DWARF DIEs to calculate and fix up the lengths
in the Compile Unit header.

The DIEFixupVisitor can be extended in the future to enable more complex
fix ups which will enable simplified YAML string representations.

This is also very useful when using the YAML format in unit tests
because you no longer need to know the length of the compile unit when
writing the YAML string.

Differential commandeered from Chris Bieneman (beanz)

Differential revision: https://reviews.llvm.org/D30666

llvm-svn: 330421
2018-04-20 12:33:49 +00:00
David Blaikie 4333f9700d Rename *CommandFlags.def to *CommandFlags.inc
These aren't the .def style files used in LLVM that require a macro
defined before their inclusion - they're just basic non-modular includes
to stamp out command line flag variables.

llvm-svn: 329840
2018-04-11 18:49:37 +00:00
Aaron Smith 860f0a5dd8 [DebugInfoPDB] Add missing test for findSymbolByRVA and findSymbolByAddr
llvm-svn: 329733
2018-04-10 18:12:49 +00:00
Alexandre Ganea 08df84e4f0 [DebugInfo][COFF] Fix reading variable-length encoded records
While reading Codeview records which contain variable-length encoded integers,
such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS,
the record's size would be improperly calculated in cases where the value was
indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on
the next record, which would/might crash later on.

Differential Revision: https://reviews.llvm.org/D45104

llvm-svn: 329659
2018-04-10 01:58:45 +00:00
Alexandre Ganea 3241cec577 Fix line endings (CR/LF -> LF) introduced by rL329613
reviewer: zturner
llvm-svn: 329646
2018-04-10 00:09:15 +00:00
Alexandre Ganea d9e96741c4 [Debuginfo][COFF] Minimal serialization support for precompiled types records
This change adds support for the LF_PRECOMP and LF_ENDPRECOMP records required
to read/write Microsoft precompiled types .objs.
See https://en.wikipedia.org/wiki/Precompiled_header#Microsoft_Visual_C_and_C++

This also adds handling for the .debug$P section, which is actually a .debug$T
section in disguise, found only in precompiled .objs.

Differential Revision: https://reviews.llvm.org/D45283

llvm-svn: 329613
2018-04-09 20:17:56 +00:00
Rafael Espindola 4b4d85fd4d Style update. NFC.
Rename 3 functions to start with lowercase letters. Don't repeat the
name in the comments.

llvm-svn: 328848
2018-03-29 23:32:54 +00:00
Zachary Turner 3203e27473 [MSF] Default to FPM2, and always mark FPM pages allocated.
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit.  LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit.  To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.

Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated.  In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.

In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.

This has the side benefit of simplifying our code.

llvm-svn: 328812
2018-03-29 18:34:15 +00:00
Aaron Smith ed81a9db29 [DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA
This method is used to find line numbers for PDBSymbolData
that have an invalid virtual address.

llvm-svn: 328586
2018-03-26 22:13:22 +00:00
Aaron Smith 53708a5e9e [DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA
These are used in finding line numbers for PDBSymbolData

llvm-svn: 328585
2018-03-26 22:10:02 +00:00
Aaron Smith 523de05a1f [DIA] Add IPDBSectionContrib interfaces and DIA implementation
To resolve symbol context at a particular address, we need to
determine the compiland for the address. We are able to determine
the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT,
PDBSymbolTypeEnum symbols indirectly through line information. 
However no such information is availabile for PDBSymbolData, 
i.e. variables.

The Section Contribution table from PDBs has information about
each compiland's contribution to sections by address. For example,
a piece of a contribution looks like,

  VA         RelativeVA  Sect No.  Offset    Length    Compiland
  14000087B0 000087B0    0001      000077B0  000000BB  exe_main.obj

So given an address, it's possible to determine its compiland with
this information.

llvm-svn: 328178
2018-03-22 04:08:15 +00:00
Zachary Turner eb62999455 [PDB] Don't ignore bucket 0 when writing the PDB string table.
The hash table is a list of buckets, and the *value* stored in
the bucket cannot be 0 since that is reserved.  However, the code
here was incorrectly skipping over the 0'th bucket entirely.
The 0'th bucket is perfectly fine, just none of these buckets
can contain the value 0.

As a result, whenever there was a string where hash(S) % Size
was equal to 0, we would write the value in the next bucket
instead.  We never caught this in our tests due to *another*
bug, which is that we would iterate the entire list of buckets
looking for the value, only using the hash value as a starting
point.  However, the real algorithm stops when it finds 0 in
a bucket since it takes that to mean "the item is not in the
hash table".

The unit test is updated to carefully construct a set of hash
values that will cause one item to hash to 0 mod bucket count,
and the reader is also updated to return an error indicating that
the item is not found when it encounters a 0 bucket.

llvm-svn: 328162
2018-03-21 22:23:59 +00:00
Pavel Labath 3461b1e097 HashTableTest: squelch some "comparison of integers of different signs" warnings
llvm-svn: 327701
2018-03-16 10:30:26 +00:00
Zachary Turner 03028f327b Fix structure alignment issue.
llvm-svn: 327666
2018-03-15 21:12:51 +00:00
Zachary Turner ebf03f6c46 Refactor the PDB HashTable class.
It previously only worked when the key and value types were
both 4 byte integers.  We now have a use case for a non trivial
value type, so we need to extend it to support arbitrary value
types, which means templatizing it.

llvm-svn: 327647
2018-03-15 17:38:26 +00:00
Aaron Smith 40198f5905 [DebugInfo] Add a new method IPDBSession::findLineNumbersBySectOffset
Summary:
Some PDB symbols do not have a valid VA or RVA but have Addr by Section and Offset. For example, a variable in thread-local storage has the following properties:

     get_addressOffset: 0
     get_addressSection: 5
     get_lexicalParentId: 2
     get_name: g_tls
     get_symIndexId: 12
     get_typeId: 4
     get_dataKind: 6
     get_symTag: 7
     get_locationType: 2

This change provides a new method to locate line numbers by Section and Offset from those symbols.

Reviewers: zturner, rnk, llvm-commits

Subscribers: asmith, JDevlieghere

Differential Revision: https://reviews.llvm.org/D44407

llvm-svn: 327601
2018-03-15 06:04:51 +00:00
Pavel Labath 322711f529 DWARF: Unify form size handling code
Summary:
This patch replaces the two switches which are deducing the size of
various forms with a single implementation. I have put the new
implementation into BinaryFormat, to avoid introducing dependencies
between the two independent libraries (DebugInfo and CodeGen) that need
this functionality.

Reviewers: aprantl, JDevlieghere, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44418

llvm-svn: 327486
2018-03-14 09:39:54 +00:00
Zachary Turner a389c84df5 Implement pure virtual method to fix build.
llvm-svn: 327431
2018-03-13 17:58:28 +00:00
James Henderson 667026297d [DWARF] Don't attempt to parse line tables at invalid offsets
Whilst working on improvements to the error handling of the debug line
parsing code, I noticed that if an invalid offset were to be specified
in a call to getOrParseLineTable(), an entry in the LineTableMap would
still be created, even if the offset was not within the section range.
The immediate parsing attempt afterwards would fail (it would end up
getting a version of 0), and thereafter, any subsequent calls to
getOrParseLineTable or getLineTable would return the default-
constructed, invalid line table. In reality, we shouldn't even attempt
to parse this table, and we should always return a nullptr from these
two functions for this situation.

I have tested this via a unit test, which required some new framework
for unit testing debug line. My plan is to add quite a few more unit
tests for the new error reporting mechanism that will follow shortly,
hence the reason why the supporting code for the tests are written the
way they are - I intend to extend the DwarfGenerator class to support
generating debug line. At that point, I'll make sure that there are a
few positive test cases for this and the parsing code too.

Differential Revision: https://reviews.llvm.org/D44200

Reviewers: JDevlieghere, aprantl
llvm-svn: 326995
2018-03-08 10:53:34 +00:00
Aaron Smith 25409ddf2a [DebugInfoPDB] Add DIA implementation for getSrcLineOnTypeDefn
Summary: This helps to determine the line number for a PDB type with definition

Reviewers: zturner, llvm-commits, rnk

Reviewed By: zturner

Subscribers: rengolin, JDevlieghere

Differential Revision: https://reviews.llvm.org/D44119

llvm-svn: 326857
2018-03-07 00:33:09 +00:00
Aaron Smith 89a19ac38d [PDB] Check the result of setLoadAddress()
Summary: Change setLoadAddress() to return true or false on failure.

Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D43638

llvm-svn: 325843
2018-02-23 00:02:27 +00:00
Aaron Smith 9930e900e9 [PDB] Add missing override to silence buildbots
llvm-svn: 325828
2018-02-22 20:28:40 +00:00
Aaron Smith fbe65404fd [PDB] Implement more find methods for PDB symbols
Summary:
Add additional find methods on PDB raw symbols.

findChildrenByAddr()
findChildrenByVA()
findInlineFramesByAddr()
findInlineFramesByVA()
findInlineLines()
findInlineLinesByAddr()
findInlineLinesByRVA()
findInlineLinesByVA()




Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D43637

llvm-svn: 325824
2018-02-22 19:47:43 +00:00
Eric Christopher e752e68575 Silence an unsigned vs signed compare warning.
llvm-svn: 325402
2018-02-16 22:46:45 +00:00
Zachary Turner cafd476836 Fix emission of PDB string table.
This was originally reported as a bug with the symptom being "cvdump
crashes when printing an LLD-linked PDB that has an S_FILESTATIC record
in it". After some additional investigation, I determined that this was
a symptom of a larger problem, and in fact the real problem was in the
way we emitted the global PDB string table. As evidence of this, you can
take any lld-generated PDB, run cvdump -stringtable on it, and it would
return no results.

My hypothesis was that cvdump could not *find* the string table to begin
with. Normally it would do this by looking in the "named stream map",
finding the string /names, and using its value as the stream index. If
this lookup fails, then cvdump would fail to load the string table.

To test this hypothesis, I looked at the name stream map generated by a
link.exe PDB, and I emitted exactly those bytes into an LLD-generated
PDB. Suddenly, cvdump could read our string table!

This code has always been hacky and we knew there was something we
didn't understand. After all, there were some comments to the effect of
"we have to emit strings in a specific order, otherwise things don't
work". The key to fixing this was finally understanding this.

The way it works is that it makes use of a generic serializable hash map
that maps integers to other integers. In this case, the "key" is the
offset into a buffer, and the value is the stream number. If you index
into the buffer at the offset specified by a given key, you find the
name. The underlying cause of all these problems is that we were using
the identity function for the hash. i.e. if a string's offset in the
buffer was 12, the hash value was 12. Instead, we need to hash the
string *at that offset*. There is an additional catch, in that we have
to compute the hash as a uint32 and then truncate it to uint16.

Making this work is a little bit annoying, because we use the same hash
table in other places as well, and normally just using the identity
function for the hash function is actually what's desired. I'm not
totally happy with the template goo I came up with, but it works in any
case.

The reason we never found this bug through our own testing is because we
were building a /parallel/ hash table (in the form of an
llvm::StringMap<>) and doing all of our lookups and "real" hash table
work against that. I deleted all of that code and now everything goes
through the real hash table. Then, to test it, I added a unit test which
adds 7 strings and queries the associated values. I test every possible
insertion order permutation of these 7 strings, to verify that it really
does work as expected.

Differential Revision: https://reviews.llvm.org/D43326

llvm-svn: 325386
2018-02-16 20:46:04 +00:00
Zachary Turner de6a487d70 [MSF] Fix FPM interval calcluation
We have some code to try to determine how many pieces an MSF
Free Page Map is split into, and this code had an off by one
error which would cause the calculation to be incorrect when
there were exactly 4096*k + 1 blocks in an MSF file.

Original investigation and patch outline by Colden Cullen.

Differential Revision: https://reviews.llvm.org/D41742

llvm-svn: 321880
2018-01-05 18:12:14 +00:00
Alex Bradbury b22f751fa7 Thread MCSubtargetInfo through Target::createMCAsmBackend
Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. 
D20830 threaded an MCSubtargetInfo reference through 
MCAsmBackend::relaxInstruction, but this isn't the only function that would 
benefit from access. This patch removes the Triple and CPUString arguments 
from createMCAsmBackend and replaces them with MCSubtargetInfo.

This patch just changes the interface without making any intentional 
functional changes. Once in, several cleanups are possible:
* Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend
* Support 16-bit instructions when valid in MipsAsmBackend::writeNopData
* Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl
* Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221)

This change initially exposed PR35686, which has since been resolved in r321026.

Differential Revision: https://reviews.llvm.org/D41349

llvm-svn: 321692
2018-01-03 08:53:05 +00:00
Paul Robinson a06f8dcca6 Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header."
Adds missing support for DW_FORM_data16.

Update of r320852/r320886, fixing the unittest again, this time use a
raw char string for the test data.

Differential Revision: https://reviews.llvm.org/D41090

llvm-svn: 321011
2017-12-18 19:08:35 +00:00
Paul Robinson 6d0484f2b6 Revert "Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header.""
This reverts commit 0afef672f63f0e4e91938656bc73424a8c058bfc.
Still failing at runtime on bots.

llvm-svn: 320888
2017-12-15 23:21:52 +00:00
Paul Robinson 5c8f7d7de4 Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header."
Adds missing support for DW_FORM_data16.

Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.

Differential Revision: https://reviews.llvm.org/D41090

llvm-svn: 320886
2017-12-15 22:57:17 +00:00
Paul Robinson 67ca67d1b2 Revert "[DWARFv5] Dump an MD5 checksum in the line-table header."
Unit test fails on some bots.

llvm-svn: 320857
2017-12-15 20:29:25 +00:00
Paul Robinson 72546fe87b [DWARFv5] Dump an MD5 checksum in the line-table header.
Adds missing support for DW_FORM_data16.

Differential Revision: https://reviews.llvm.org/D41090

llvm-svn: 320852
2017-12-15 19:52:34 +00:00
Michael Zolotukhin 5c0ab473f2 Remove redundant includes from unittests.
llvm-svn: 320630
2017-12-13 21:31:05 +00:00
Alexandre Ganea 757026dbe6 Test commit.
llvm-svn: 320504
2017-12-12 18:00:43 +00:00
Zachary Turner ecd2684ed7 [DebugInfo] Fix register variables not showing up in pdb.
Previously, when linking against libcmt from the MSVC runtime,
lld-link /verbose would show "Ignoring unknown symbol record
with kind 0x1006".  It turns out this was because
TypeIndexDiscovery did not handle S_REGISTER records, so these
records were not getting properly remapped.

Patch by: Alexnadre Ganea
Differential Revision: https://reviews.llvm.org/D40919

llvm-svn: 320108
2017-12-07 22:51:16 +00:00
Zachary Turner 87b78e9d1e [CodeView] Add support for content hashing CodeView type records.
Currently nothing uses this, but this at least gets the core
algorithm in, and adds some test to demonstrate correctness.

Differential Revision: https://reviews.llvm.org/D40736

llvm-svn: 319854
2017-12-05 23:08:58 +00:00