Commit Graph

867 Commits

Author SHA1 Message Date
Filipe Cabecinhas fcd044b692 Check bit widths before trying to get a type.
Added a test case for it.
Also added run lines for the test case in r227566.

Bugs found with afl-fuzz

llvm-svn: 227589
2015-01-30 18:13:50 +00:00
Filipe Cabecinhas d0858e1037 [bitcode reader] Fix an assert on invalid type tables
Bug found with afl-fuzz

llvm-svn: 227566
2015-01-30 10:57:58 +00:00
Filipe Cabecinhas de968ecb05 [Bitcode] Diagnose errors instead of asserting from bad input
Eventually we can make some of these pass the error along to the caller.

Reports a fatal error if:
We find an invalid abbrev record
We try to get an invalid abbrev number
We can't fill the current word due to an EOF

Fixed an invalid bitcode test to check for output with FileCheck

Bugs found with afl-fuzz

llvm-svn: 226986
2015-01-24 04:15:05 +00:00
Duncan P. N. Exon Smith e8b5e49ffd IR: DwarfNode => DebugNode, NFC
These things are potentially used for non-DWARF data (see the discussion
in PR22235), so take the `Dwarf` out of the name.  Since the new name
gives fewer clues, update the doxygen to properly describe what they
are.

llvm-svn: 226874
2015-01-22 22:47:44 +00:00
David Majnemer 3087b22e1a Bitcode: Don't create comdats when autoupgrading macho bitcode
Don't infer COMDAT groups from older bitcode if the target is macho,
it doesn't have COMDATs.

llvm-svn: 226546
2015-01-20 05:58:07 +00:00
Duncan P. N. Exon Smith db6bc8bfdf Bitcode: Simplify MDNode subclass dispatch, NFC
llvm-svn: 226535
2015-01-20 01:03:09 +00:00
Duncan P. N. Exon Smith 6592deeab2 Bitcode: WriteMDNode() => WriteMDTuple(), NFC
llvm-svn: 226534
2015-01-20 01:01:53 +00:00
Duncan P. N. Exon Smith 9a6f64e7b8 Bitcode: Add ValueEnumerator::getMetadataOrNullID(), NFC
llvm-svn: 226533
2015-01-20 01:00:23 +00:00
Duncan P. N. Exon Smith 2bc00f4a38 IR: Merge UniquableMDNode back into MDNode, NFC
As pointed out in r226501, the distinction between `MDNode` and
`UniquableMDNode` is confusing.  When we need subclasses of `MDNode`
that don't use all its functionality it might make sense to break it
apart again, but until then this makes the code clearer.

llvm-svn: 226520
2015-01-19 23:13:14 +00:00
Duncan P. N. Exon Smith 7d82313bcd IR: Return unique_ptr from MDNode::getTemporary()
Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to
return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and
clean up call sites.  (For now, `DIBuilder` call sites just call
`release()` immediately.)

There's an accompanying change in each of clang and polly to use the new
API.

llvm-svn: 226504
2015-01-19 21:30:18 +00:00
Duncan P. N. Exon Smith 946fdcc50c IR: Remove MDNodeFwdDecl
Remove `MDNodeFwdDecl` (as promised in r226481).  Aside from API
changes, there's no real functionality change here.
`MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`,
which returns a tuple with `isTemporary()` equal to true.

The main point is that we can now add temporaries of other `MDNode`
subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the
first place because I didn't recognize this need, and thought they were
only needed to handle forward references).

A few things left out of (or highlighted by) this commit:

  - I've had to remove the (few) uses of `std::unique_ptr<>` to deal
    with temporaries, since the destructor is no longer public.
    `getTemporary()` should probably return the equivalent of
    `std::unique_ptr<T, MDNode::deleteTemporary>`.
  - `MDLocation::getTemporary()` doesn't exist yet (worse, it actually
    does exist, but does the wrong thing: `MDNode::getTemporary()` is
    inherited and returns an `MDTuple`).
  - `MDNode` now only has one subclass, `UniquableMDNode`, and the
    distinction between them is actually somewhat confusing.

I'll fix those up next.

llvm-svn: 226501
2015-01-19 20:36:39 +00:00
Rafael Espindola 12ca34f53f Bring r226038 back.
No change in this commit, but clang was changed to also produce trivial comdats when
needed.

Original message:

Don't create new comdats in CodeGen.

This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

llvm-svn: 226467
2015-01-19 15:16:06 +00:00
Timur Iskhodzhanov 60b721363c Revert r226242 - Revert Revert Don't create new comdats in CodeGen
This breaks AddressSanitizer (ninja check-asan) on Windows

llvm-svn: 226251
2015-01-16 08:38:45 +00:00
Rafael Espindola 67a79e72f5 Revert "Revert Don't create new comdats in CodeGen"
This reverts commit r226173, adding r226038 back.

No change in this commit, but clang was changed to also produce trivial comdats for
costructors, destructors and vtables when needed.

Original message:

Don't create new comdats in CodeGen.

This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

llvm-svn: 226242
2015-01-16 02:22:55 +00:00
Timur Iskhodzhanov f5adf13fac Revert Don't create new comdats in CodeGen
It breaks AddressSanitizer on Windows.

llvm-svn: 226173
2015-01-15 16:14:34 +00:00
Rafael Espindola fad1639a12 Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

llvm-svn: 226038
2015-01-14 20:55:48 +00:00
Rafael Espindola 4e74d3be35 Add support for comdats with names larger than 256 characters.
llvm-svn: 226012
2015-01-14 18:25:45 +00:00
Chandler Carruth d9903888d9 [cleanup] Re-sort all the #include lines in LLVM using
utils/sort_includes.py.

I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.

llvm-svn: 225974
2015-01-14 11:23:27 +00:00
Duncan P. N. Exon Smith 6a4848324b AsmParser/Bitcode: Add support for MDLocation
This adds assembly and bitcode support for `MDLocation`.  The assembly
side is rather big, since this is the first `MDNode` subclass (that
isn't `MDTuple`).  Part of PR21433.

(If you're wondering where the mountains of testcase updates are, we
don't need them until I update `DILocation` and `DebugLoc` to actually
use this class.)

llvm-svn: 225830
2015-01-13 21:10:44 +00:00
Duncan P. N. Exon Smith 49503f827d Bitcode: Range-based for, NFC
llvm-svn: 225716
2015-01-12 22:35:34 +00:00
Duncan P. N. Exon Smith b1ad5d39a9 Bitcode: Add abbreviation for METADATA_NAME
llvm-svn: 225715
2015-01-12 22:34:10 +00:00
Duncan P. N. Exon Smith f8dd6ad6de Bitcode: Range-based for, NFC
llvm-svn: 225714
2015-01-12 22:33:00 +00:00
Duncan P. N. Exon Smith 73d5aae74c Bitcode: Range-based for, NFC
llvm-svn: 225713
2015-01-12 22:31:35 +00:00
Duncan P. N. Exon Smith 2fcf60e78e Bitcode: Simplify emission of METADATA_BLOCK
Refactor logic so that we know up-front whether to open a block and
whether we need an MDString abbreviation.

This is almost NFC, but will start emitting `MDString` abbreviations
when the first record is not an `MDString`.

llvm-svn: 225712
2015-01-12 22:30:34 +00:00
Duncan P. N. Exon Smith 118632dbf6 IR: Split GenericMDNode into MDTuple and UniquableMDNode
Split `GenericMDNode` into two classes (with more descriptive names).

  - `UniquableMDNode` will be a common subclass for `MDNode`s that are
    sometimes uniqued like constants, and sometimes 'distinct'.

    This class gets the (short-lived) RAUW support and related API.

  - `MDTuple` is the basic tuple that has always been returned by
    `MDNode::get()`.  This is as opposed to more specific nodes to be
    added soon, which have additional fields, custom assembly syntax,
    and extra semantics.

    This class gets the hash-related logic, since other sublcasses of
    `UniquableMDNode` may need to hash based on other fields.

To keep this diff from getting too big, I've added casts to `MDTuple`
that won't really scale as new subclasses of `UniquableMDNode` are
added, but I'll clean those up incrementally.

(No functionality change intended.)

llvm-svn: 225682
2015-01-12 20:09:34 +00:00
Rafael Espindola d0b23bef6f Use the DiagnosticHandler to print diagnostics when reading bitcode.
The bitcode reading interface used std::error_code to report an error to the
callers and it is the callers job to print diagnostics.

This is not ideal for error handling or diagnostic reporting:

* For error handling, all that the callers care about is 3 possibilities:
  * It worked
  * The bitcode file is corrupted/invalid.
  * The file is not bitcode at all.

* For diagnostic, it is user friendly to include far more information
  about the invalid case so the user can find out what is wrong with the
  bitcode file. This comes up, for example, when a developer introduces a
  bug while extending the format.

The compromise we had was to have a lot of error codes.

With this patch we use the DiagnosticHandler to communicate with the
human and std::error_code to communicate with the caller.

This allows us to have far fewer error codes and adds the infrastructure to
print better diagnostics. This is so because the diagnostics are printed when
he issue is found. The code that detected the problem in alive in the stack and
can pass down as much context as needed. As an example the patch updates
test/Bitcode/invalid.ll.

Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the
caller. A simple one like llvm-dis can just use fatal errors. The gold plugin
needs a bit more complex treatment because of being passed non-bitcode files. An
hypothetical interactive tool would make all bitcode errors non-fatal.

llvm-svn: 225562
2015-01-10 00:07:30 +00:00
Duncan P. N. Exon Smith 9ed19665bb Revert "Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD"
This reverts commit r225498 (but leaves r225499, which was a worthy
cleanup).

My plan was to change `DEBUG_LOC` to store the `MDNode` directly rather
than its operands (patch was to go out this morning), but on reflection
it's not clear that it's strictly better.  (I had missed that the
current code is unlikely to emit the `MDNode` at all.)

Conflicts:
	lib/Bitcode/Reader/BitcodeReader.cpp (due to r225499)

llvm-svn: 225531
2015-01-09 17:53:27 +00:00
Duncan P. N. Exon Smith 52d0f16e1b Bitcode: Share logic for last instruction, NFC
Share logic for getting the last instruction emitted.

llvm-svn: 225499
2015-01-09 02:51:45 +00:00
Duncan P. N. Exon Smith 11fae74ae5 Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD
Prepare to simplify the `DebugLoc` record.

llvm-svn: 225498
2015-01-09 02:48:48 +00:00
Duncan P. N. Exon Smith 090a19bd3c IR: Add 'distinct' MDNodes to bitcode and assembly
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode).  This adds the `distinct` keyword to assembly.

Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:

  - self-references, which are never uniqued, and
  - nodes whose operands are replaced that hit a uniquing collision.

The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.

Part of PR22111.

llvm-svn: 225474
2015-01-08 22:38:29 +00:00
Rafael Espindola 261d25b940 clang-format. NFC.
llvm-svn: 225454
2015-01-08 16:25:01 +00:00
Rafael Espindola bec6af62b8 Explicitly handle LinkOnceODRAutoHideLinkage. NFC. We already have a test.
llvm-svn: 225449
2015-01-08 15:39:50 +00:00
Rafael Espindola 7b4b2dcd0a Update naming style and clang-format. NFC.
llvm-svn: 225448
2015-01-08 15:36:32 +00:00
Chandler Carruth d174ce4ad1 [PM] Switch the new pass manager to use a reference-based API for IR
units.

This was debated back and forth a bunch, but using references is now
clearly cleaner. Of all the code written using pointers thus far, in
only one place did it really make more sense to have a pointer. In most
cases, this just removes immediate dereferencing from the code. I think
it is much better to get errors on null IR units earlier, potentially
at compile time, than to delay it.

Most notably, the legacy pass manager uses references for its routines
and so as more and more code works with both, the use of pointers was
likely to become really annoying. I noticed this when I ported the
domtree analysis over and wrote the entire thing with references only to
have it fail to compile. =/ It seemed better to switch now than to
delay. We can, of course, revisit this is we learn that references are
really problematic in the API.

llvm-svn: 225145
2015-01-05 02:47:05 +00:00
Yaron Keren 06d6930b68 Fix Visual C++ error "'llvm::make_unique' : ambiguous call to overloaded function".
llvm-svn: 224506
2014-12-18 10:03:35 +00:00
Rafael Espindola 7d727b5f11 Modernize the getStreamedBitcodeModule interface a bit. NFC.
llvm-svn: 224499
2014-12-18 05:08:43 +00:00
Nick Lewycky ee0a3a7a2f Make ValueEnumerator::print use OS for metadata too. Noticed by inspection.
llvm-svn: 224404
2014-12-17 01:52:08 +00:00
Duncan P. N. Exon Smith 5bd34e56ce Bitcode: Add missing "Remove in 4.0" comments
llvm-svn: 224090
2014-12-12 02:11:31 +00:00
Duncan P. N. Exon Smith eca1e031d1 Bitcode: Use unsigned char to record MDStrings
`MDString`s can have arbitrary characters in them.  Prevent an assertion
that fired in `BitcodeWriter` because of sign extension by copying the
characters into the record as `unsigned char`s.

Based on a patch by Keno Fischer; fixes PR21882.

llvm-svn: 224077
2014-12-11 23:34:30 +00:00
Duncan P. N. Exon Smith 5c7006e062 Bitcode: Add METADATA_NODE and METADATA_VALUE
This reflects the typelessness of `Metadata` in the bitcode format,
removing types from all metadata operands.

`METADATA_VALUE` represents a `ValueAsMetadata`, and always has two
fields: the type and the value.

`METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`,
doesn't store types.  It stores operands at their ID+1 so that `0` can
reference `nullptr` operands.

Part of PR21532.

llvm-svn: 224073
2014-12-11 23:02:24 +00:00
Duncan P. N. Exon Smith 005f9f433c Bitcode: Add `OLD_` prefix to metadata node records
I'm about to change these, so move the old ones out of the way.

Part of PR21532.

llvm-svn: 224070
2014-12-11 22:30:48 +00:00
Duncan P. N. Exon Smith 5bf8fef580 IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532.  Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.

I have a follow-up patch prepared for `clang`.  If this breaks other
sub-projects, I apologize in advance :(.  Help me compile it on Darwin
I'll try to fix it.  FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.

This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.

Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`.  It is distinct from
    the `Value` class hierarchy.  It is typeless -- i.e., instances do
    *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
    replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph
    construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the
    result of `MDNode::getTemporary()` -- it maintains a side map of its
    uses and can RAUW itself.  Once the forward declarations are fully
    resolved RAUW support is dropped on the ground.  This means that
    uniquing collisions on changing operands cause nodes to become
    "distinct".  (This already happened fairly commonly, whenever an
    operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles,
    you need to call `MDNode::resolveCycles()` on each node (or on a
    top-level node that somehow references all of the nodes).  Also,
    don't do that.  Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known
    to be, e.g., `ConstantInt`, takes three steps: first, cast from
    `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
    third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
    metadata schema owners transition away from using `Constant`s when
    the type isn't important (and they don't care about referring to
    `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst`
    namespace that matches semantics with the old code, in order to
    avoid adding the error-prone three-step equivalent to every call
    site.  If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`.  This is a
    subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a
    `LocalAsMetadata`, which is a bridged form of non-`Constant` values
    like `Argument` and `Instruction`.  It can also refer to any other
    `Metadata` subclass.

(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)

llvm-svn: 223802
2014-12-09 18:38:53 +00:00
Duncan P. N. Exon Smith 35303fd739 IR: Disallow function-local metadata attachments
Metadata attachments to instructions cannot be function-local.

This is part of PR21532.

llvm-svn: 223574
2014-12-06 02:29:44 +00:00
Duncan P. N. Exon Smith da41af9e94 IR: Disallow complicated function-local metadata
Disallow complex types of function-local metadata.  The only valid
function-local metadata is an `MDNode` whose sole argument is a
non-metadata function-local value.

Part of PR21532.

llvm-svn: 223564
2014-12-06 01:26:49 +00:00
Rafael Espindola 2fa1e43a22 Ask the module for its the identified types.
When lazy reading a module, the types used in a function will not be visible to
a TypeFinder until the body is read.

This patch fixes that by asking the module for its identified struct types.
If a materializer is present, the module asks it. If not, it uses a TypeFinder.

This fixes pr21374.

I will be the first to say that this is ugly, but it was the best I could find.

Some of the options I looked at:

* Asking the LLVMContext. This could be made to work for gold, but not currently
  for ld64. ld64 will load multiple modules into a single context before merging
  them. This causes us to see types from future merges. Unfortunately,
  MappedTypes is not just a cache when it comes to opaque types. Once the
  mapping has been made, we have to remember it for as long as the key may
  be used. This would mean moving MappedTypes to the Linker class and having
  to drop the Linker::LinkModules static methods, which are visible from C.

* Adding an option to ignore function bodies in the TypeFinder. This would
  fix the PR by picking the worst result. It would work, but unfortunately
  we are currently quite dependent on the upfront type merging. I will
  try to reduce our dependency, but it is not clear that we will be able
  to get rid of it for now.

The only clean solution I could think of is making the Module own the types.
This would have other advantages, but it is a much bigger change. I will
propose it, but it is nice to have this fixed while that is discussed.

With the gold plugin, this patch takes the number of types in the LTO clang
binary from 52817 to 49669.

llvm-svn: 223215
2014-12-03 07:18:23 +00:00
Peter Collingbourne 51d2de7b9e Prologue support
Patch by Ben Gamari!

This redefines the `prefix` attribute introduced previously and
introduces a `prologue` attribute.  There are a two primary usecases
that these attributes aim to serve,

  1. Function prologue sigils

  2. Function hot-patching: Enable the user to insert `nop` operations
     at the beginning of the function which can later be safely replaced
     with a call to some instrumentation facility

  3. Runtime metadata: Allow a compiler to insert data for use by the
     runtime during execution. GHC is one example of a compiler that
     needs this functionality for its tables-next-to-code functionality.

Previously `prefix` served cases (1) and (2) quite well by allowing the user
to introduce arbitrary data at the entrypoint but before the function
body. Case (3), however, was poorly handled by this approach as it
required that prefix data was valid executable code.

Here we redefine the notion of prefix data to instead be data which
occurs immediately before the function entrypoint (i.e. the symbol
address). Since prefix data now occurs before the function entrypoint,
there is no need for the data to be valid code.

The previous notion of prefix data now goes under the name "prologue
data" to emphasize its duality with the function epilogue.

The intention here is to handle cases (1) and (2) with prologue data and
case (3) with prefix data.

References
----------

This idea arose out of discussions[1] with Reid Kleckner in response to a
proposal to introduce the notion of symbol offsets to enable handling of
case (3).

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html

Test Plan: testsuite

Differential Revision: http://reviews.llvm.org/D6454

llvm-svn: 223189
2014-12-03 02:08:38 +00:00
Rafael Espindola ffbfcf29f2 Add and use Type::subtypes. NFC.
llvm-svn: 222682
2014-11-24 20:44:36 +00:00
Richard Trieu e3d126cbbb Add accessor marcos to ConstantPlaceHolder, similar to those in the base class.
llvm-svn: 222502
2014-11-21 02:42:08 +00:00
Rafael Espindola 49e9bf8c74 Pass a reference to ValueEnumerator.
NFC. This will just make it easier to use std::unique_ptr in a caller.

llvm-svn: 222170
2014-11-17 20:06:27 +00:00
Reid Kleckner 9651f01a88 Silence MSVC warning on missing return after fully covered switch
llvm-svn: 221943
2014-11-13 23:07:22 +00:00