Commit Graph

14 Commits

Author SHA1 Message Date
Zachary Turner 59468f5a1e Fix uninitialized read error reported by MSAN.
The problem was that our Obj -> Yaml dumper had not been taught
to handle certain types of records.  This meant that when I
generated the test input files, the records were still there but
none of its fields were filled out.  So when it did the
Yaml -> Obj conversion as part of the test, it generated records
with garbage in them.

The patch here fixes the Obj <-> Yaml converter, and additionally
updates the test file with fresh Yaml generated by the fixed
converter.

llvm-svn: 322029
2018-01-08 21:38:50 +00:00
Francis Visoiu Mistrih b213b27ee3 [YAML] Add support for non-printable characters
LLVM IR function names which disable mangling start with '\01'
(https://www.llvm.org/docs/LangRef.html#identifiers).

When an identifier like "\01@abc@" gets dumped to MIR, it is quoted, but
only with single quotes.

http://www.yaml.org/spec/1.2/spec.html#id2770814:

"The allowed character range explicitly excludes the C0 control block
allowed), the surrogate block #xD800-#xDFFF, #xFFFE, and #xFFFF."

http://www.yaml.org/spec/1.2/spec.html#id2776092:

"All non-printable characters must be escaped.
[...]
Note that escape sequences are only interpreted in double-quoted scalars."

This patch adds support for printing escaped non-printable characters
between double quotes if needed.

Should also fix PR31743.

Differential Revision: https://reviews.llvm.org/D41290

llvm-svn: 320996
2017-12-18 17:38:03 +00:00
Eugene Zelenko 28082ab0e5 [ObjectYAML] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306925
2017-07-01 01:35:55 +00:00
Richard Smith d0c0c13447 Fix ODR violations due to abuse of LLVM_YAML_IS_(FLOW_)?SEQUENCE_VECTOR
This is a short-term fix for PR33650 aimed to get the modules build bots green again.

Remove all the places where we use the LLVM_YAML_IS_(FLOW_)?SEQUENCE_VECTOR
macros to try to locally specialize a global template for a global type. That's
not how C++ works.

Instead, we now centrally define how to format vectors of fundamental types and
of string (std::string and StringRef). We use flow formatting for the former
cases, since that's the obvious right thing to do; in the latter case, it's
less clear what the right choice is, but flow formatting is really bad for some
cases (due to very long strings), so we pick block formatting. (Many of the
cases that were using flow formatting for strings are improved by this change.)

Other than the flow -> block formatting change for some vectors of strings,
this should result in no functionality change.

Differential Revision: https://reviews.llvm.org/D34907

Corresponding updates to clang, clang-tools-extra, and lld to follow.

llvm-svn: 306878
2017-06-30 20:56:57 +00:00
Reid Kleckner 91ef9de643 [codeview] YAMLize all section offsets and indices in symbol records
We forgot to serialize these because llvm-readobj didn't dump them. They
are typically all zeros in an object file. The linker fills them in with
relocations before adding them to the PDB. Now we can properly round
trip these symbols through pdb2yaml -> yaml2pdb.

I made these fields optional with a zero default so that we can elide
them from our test cases.

llvm-svn: 305857
2017-06-20 21:19:22 +00:00
Reid Kleckner 665e1c9240 [codeview] Fully initialize DataSym when mapping from YAML
In the object file, the section index and relative offset are typically
zero, so make these YAML fields optional with a default.

It looks like there may be more partially initialized symbol records,
but this should fix the msan bot.

llvm-svn: 305842
2017-06-20 20:34:37 +00:00
Reid Kleckner 18d90e17ad [CodeView] Fix dumping of public symbol record flags
I noticed nonsensical type information while dumping PDBs produced by
MSVC.

llvm-svn: 305708
2017-06-19 16:54:51 +00:00
Zachary Turner 6305545527 Resubmit "[llvm-pdbutil] rewrite the "raw" output style."
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.

It was broken due to some weird template issues, which have
since been fixed.

llvm-svn: 305517
2017-06-15 22:24:24 +00:00
Zachary Turner da504b794c Revert "[llvm-pdbutil] rewrite the "raw" output style."
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.

This is failing due to some strange template problems, so reverting
until it can be straightened out.

llvm-svn: 305505
2017-06-15 20:55:51 +00:00
Zachary Turner b560fdf3b8 [llvm-pdbutil] rewrite the "raw" output style.
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.

One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.

This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.

See the tests for some examples of what the new output looks like.

Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:

The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.

Differential Revision: https://reviews.llvm.org/D34191

llvm-svn: 305495
2017-06-15 19:34:41 +00:00
Zachary Turner 606d766538 [pdb] Don't choke on unknown symbol types.
When we get an unknown symbol type, we might as well at least
dump it.  Same goes for round-tripping through YAML, we can
dump the record contents as raw bytes even if we don't know
how to interpret it semantically.

llvm-svn: 305248
2017-06-12 23:10:31 +00:00
Zachary Turner 349c18f837 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

llvm-svn: 304738
2017-06-05 21:40:33 +00:00
Zachary Turner ebd3ae8371 [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

llvm-svn: 304484
2017-06-01 21:52:41 +00:00
Zachary Turner 1b88f4f33a [ObjectYAML] Split CodeViewYAML into 3 pieces.
The code was a mess and disorganized due to the sheer amount
of it being in one file.  So I'm splitting this into three files.
One for CodeView types, one for CodeView symbols, and one for
CodeView debug subsections.  NFC.

llvm-svn: 304278
2017-05-31 04:17:13 +00:00