Gets a bit tricky in the ValueMapper, of course - not sure if we should
just expose a list of explicit types for each Value so that the
ValueMapper can be neutral to these special cases (it's OK for things
like load, where the explicit type is the result type - but when that's
not the case, it means plumbing through another "special" type... )
llvm-svn: 245728
As a follow-up to r244181, resolve uniquing cycles underneath distinct
nodes on the fly. This prevents uniquing cycles in early operands from
affecting later operands. It also removes an iteration through distinct
nodes' operands.
No real functional change here, just more prompt resolution of temporary
nodes.
llvm-svn: 244302
Rotate the algorithm for remapping distinct nodes in order to simplify
how uniquing cycles get resolved. This removes some of the recursion,
and, most importantly, exposes all uniquing cycles at the top-level.
Besides being a little more efficient -- temporary MDNodes won't live as
long -- the clearer logic should help protect against bugs like those
fixed in r243961 and r243976.
What are uniquing cycles? Why do they present challenges when remapping
metadata?
!0 = !{!1}
!1 = !{!0}
!0 and !1 form a simple uniquing cycle. When remapping from one
metadata graph to another, every uniquing cycle gets "duplicated"
through a dance:
!0-temp = !{!1?} ; map(!0): clone !0, VM[!0] = !0-temp
!1-temp = !{!0?} ; ..map(!1): clone !1, VM[!1] = !1-temp
!1-temp = !{!0-temp} ; ..map(!1): remap !1's operands
!2 = !{!0-temp} ; ..map(!1): uniquify: !1-temp => !2
!0-temp = !{!2} ; map(!0): remap !0's operands
!3 = !{!2} ; map(!0): uniquify: !0-temp => !3
; Result
!2 = !{!3}
!3 = !{!2}
(In the two "uniquify" steps above, the operands of !X-temp are compared
to the operands of !X. If they're the same, then !X-temp gets RAUW'ed
to !X; if they're different, then !X-temp is promoted to a new unique
node. The latter case always hits in for uniquing cycles, so we
duplicate all the nodes involved.)
Why is this a problem? Uniquable Metadata nodes that have temporary
node as transitive operands keep RAUW support until the temporary nodes
get finalized. With non-cycles, this happens automatically: when a
uniquable node's count of unresolved operands drops to zero, it
immediately sheds its own RAUW support (possibly triggering the same in
any node that references it). However, uniquing cycles create a
reference cycle, and uniqued nodes that transitively reference a
uniquing cycle are "stuck" in an unresolved state until someone calls
`MDNode::resolveCycles()` on a node in the unresolved subgraph.
Distinct nodes should help here (and mostly do): since they aren't
uniqued anywhere, they are guaranteed not to be RAUW'ed. They
effectively form a barrier between uniqued nodes, breaking some uniquing
cycles, and shielding uniqued nodes from uniquing cycles.
Unfortunately, with this barrier in place, the unresolved subgraph(s)
can be disjoint from the top-level node. The mapping algorithm needs to
find at least one representative from each disjoint subgraph. But which
nodes are *stuck*, and which will get resolved automatically? And which
nodes are in the unresolved subgraph? The old logic was conservative.
This commit rotates the logic for distinct nodes, so that we have access
to unresolved nodes at the top-level call to `llvm::MapMetadata()`.
Each time we return to the top-level, we know that all temporaries have
been RAUW'ed away. Here, it's safe (and necessary) to call
`resolveCycles()` immediately on unresolved operands.
This should also perform better than the old algorithm. The recursion
stack is shorter, temporary nodes don't live as long, and there are
fewer tracking references to unresolved nodes. As the debug info graph
introduces more 'distinct' nodes, remapping should incrementally get
cheaper and cheaper.
Aside from possible performance improvements (and reduced cruft in the
`LLVMContext`), there should be no functionality change here.
llvm-svn: 244181
Rename `remap()` to `remapOperands()`, and restrict its contract to
remapping operands. Previously, it also called `mapToMetadata()`, but
this logic is hard to reason about externally. In particular, this
refactors `mapUniquedNode()` to avoid redundant mapping calls, taking
advantage of the RAUWs that are already in place.
llvm-svn: 244168
r243883 and r243961 made a use-after-free far more likely:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/6041/steps/check-llvm%20asan/logs/stdio
Unresolved nodes get inserted into the `Cycles` array. If they later
get resolved through RAUW, we need to update the reference. It's
interesting that this never hit before (maybe an asan-ified clang
bootstrap with `-flto -g` would have hit it, but I admit I haven't tried
anything quite that crazy).
llvm-svn: 243976
r243883 started moving 'distinct' nodes instead of duplicated them in
lib/Linker. This had the side-effect of sometimes not cloning uniqued
nodes that reference them. I missed a corner case:
!named = !{!0}
!0 = !{!1}
!1 = distinct !{!0}
!0 is the entry point for "remapping", and a temporary clone (say,
!0-temp) is created and mapped in case we need to model a uniquing
cycle.
Recursive descent into !1. !1 is distinct, so we leave it alone,
but update its operand to !0-temp.
Pop back out to !0. Its only operand, !1, hasn't changed, so we don't
need to use !0-temp. !0-temp goes out of scope, and we're finished
remapping, but we're left with:
!named = !{!0}
!0 = !{!1}
!1 = distinct !{null} ; uh oh...
Previously, if !0 and !0-temp ended up with identical operands, then
!0-temp couldn't have been referenced at all. Now that distinct nodes
don't get duplicated, that assumption is invalid. We need to
!0-temp->replaceAllUsesWith(!0) before freeing !0-temp.
I found this while running an internal `-flto -g` bootstrap. Strangely,
there was no case of this in the open source bootstrap I'd done before
commit...
llvm-svn: 243961
Instead of cloning distinct `MDNode`s when linking in a module, just
move them over. The module linker destroys the source module, so the
old node would otherwise just be leaked on the context. Create the new
node in place. This also reduces the number of cloned uniqued nodes
(since it's less likely their operands have changed).
This mapping strategy is only correct when we're discarding the source,
so the linker turns it on via a ValueMapper flag, `RF_MoveDistinctMDs`.
There's nothing observable in terms of `llvm-link` output here: the
linked module should be semantically identical.
I'll be adding more 'distinct' nodes to the debug info metadata graph in
order to break uniquing cycles, so the benefits of this will partly come
in future commits. However, we should get some gains immediately, since
we have a fair number of 'distinct' `DILocation`s being linked in.
llvm-svn: 243883
This is a minor optimization to only check for unresolved operands
inside `mapDistinctNode()` if the operands have actually changed. This
shouldn't really cause any change in behaviour. I didn't actually see a
slowdown in a profile, I was just poking around nearby and saw the
opportunity.
llvm-svn: 243866
Alternatively, this type could be derived on-demand whenever
getResultElementType is called - if someone thinks that's the better
choice (simple time/space tradeoff), I'm happy to give it a go.
llvm-svn: 238716
(reverted in r235533)
Original commit message:
"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"
The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.
llvm-svn: 235651
This reverts commit r235458.
It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.
llvm-svn: 235533
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)
llvm-svn: 235458
Allow unresolved nodes through the `MapMetadata()` if
`RF_NoModuleLevelChanges`, since there's no remapping to do anyway.
This fixes PR22929. I'll add a clang test as a follow-up.
llvm-svn: 232449
Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`,
and resolve them at the end. Previously, these cycles wouldn't get
resolved.
llvm-svn: 228180
Now that the clone methods used by `MapMetadata()` don't do any
remapping (and return a temporary), they make more sense as member
functions on `MDNode` (and subclasses).
llvm-svn: 226541
As part of PR22235, introduce `DwarfNode` and `GenericDwarfNode`. The
former is a metadata node with a DWARF tag. The latter matches our
current (generic) schema of a header with string (and stringified
integer) data and an arbitrary number of operands.
This doesn't move it into place yet; that change will require a large
number of testcase updates.
llvm-svn: 226529
As pointed out in r226501, the distinction between `MDNode` and
`UniquableMDNode` is confusing. When we need subclasses of `MDNode`
that don't use all its functionality it might make sense to break it
apart again, but until then this makes the code clearer.
llvm-svn: 226520
Take advantage of the new ability of temporary nodes to mutate to
distinct and uniqued nodes to greatly simplify the `MapMetadata()`
helper functions.
llvm-svn: 226511
Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to
return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and
clean up call sites. (For now, `DIBuilder` call sites just call
`release()` immediately.)
There's an accompanying change in each of clang and polly to use the new
API.
llvm-svn: 226504
Remove `MDNodeFwdDecl` (as promised in r226481). Aside from API
changes, there's no real functionality change here.
`MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`,
which returns a tuple with `isTemporary()` equal to true.
The main point is that we can now add temporaries of other `MDNode`
subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the
first place because I didn't recognize this need, and thought they were
only needed to handle forward references).
A few things left out of (or highlighted by) this commit:
- I've had to remove the (few) uses of `std::unique_ptr<>` to deal
with temporaries, since the destructor is no longer public.
`getTemporary()` should probably return the equivalent of
`std::unique_ptr<T, MDNode::deleteTemporary>`.
- `MDLocation::getTemporary()` doesn't exist yet (worse, it actually
does exist, but does the wrong thing: `MDNode::getTemporary()` is
inherited and returns an `MDTuple`).
- `MDNode` now only has one subclass, `UniquableMDNode`, and the
distinction between them is actually somewhat confusing.
I'll fix those up next.
llvm-svn: 226501
Change `MDNode::isDistinct()` to only apply to 'distinct' nodes (not
temporaries), and introduce `MDNode::isUniqued()` and
`MDNode::isTemporary()` for the other two possibilities.
llvm-svn: 226482
Although this makes the `cast<>` assert more often, the
`assert(Node->isResolved())` on the following line would assert in all
those cases. So, no functionality change here.
llvm-svn: 225903
Split `GenericMDNode` into two classes (with more descriptive names).
- `UniquableMDNode` will be a common subclass for `MDNode`s that are
sometimes uniqued like constants, and sometimes 'distinct'.
This class gets the (short-lived) RAUW support and related API.
- `MDTuple` is the basic tuple that has always been returned by
`MDNode::get()`. This is as opposed to more specific nodes to be
added soon, which have additional fields, custom assembly syntax,
and extra semantics.
This class gets the hash-related logic, since other sublcasses of
`UniquableMDNode` may need to hash based on other fields.
To keep this diff from getting too big, I've added casts to `MDTuple`
that won't really scale as new subclasses of `UniquableMDNode` are
added, but I'll clean those up incrementally.
(No functionality change intended.)
llvm-svn: 225682
Create new copies of distinct `MDNode`s instead of following the
uniquing `MDNode` logic.
Just like self-references (or other cycles), `MapMetadata()` creates a
new node. In practice most calls use `RF_NoModuleLevelChanges`, in
which case nothing is duplicated anyway.
Part of PR22111.
llvm-svn: 225476
Instead of reusing the name `MapValue()` when mapping `Metadata`, use
`MapMetadata()`. The old name doesn't make much sense after the
`Metadata`/`Value` split.
llvm-svn: 224566