SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.
While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.
There are much better tools than llvm.trap: UBSan and ASan.
N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...
llvm-svn: 273778
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
This new verifier rule lets us unambigously pick a calling convention
when creating a new declaration for
`@llvm.experimental.deoptimize.<ty>`. It is also congruent with our
lowering strategy -- since all calls to `@llvm.experimental.deoptimize`
are lowered to calls to `__llvm_deoptimize`, it is reasonable to enforce
a unique calling convention.
Some of the tests that were breaking this verifier rule have had to be
split up into different .ll files.
The inliner was violating this rule as well, and has been fixed to avoid
producing invalid IR.
llvm-svn: 269261
When inlining a call site with llvm.mem.parallel_loop_access metadata, this
metadata needs to be propagated to all cloned memory-accessing instructions.
Otherwise, inlining parts of the loop body will invalidate the annotation.
With this functionality, we now vectorize the following as expected:
void Body(int *res, int *c, int *d, int *p, int i) {
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
void Test(int *res, int *c, int *d, int *p, int n) {
int i;
#pragma clang loop vectorize(assume_safety)
for (i = 0; i < 1600; i++) {
Body(res, c, d, p, i);
}
}
llvm-svn: 267949
They're not necessary (since the stack pointer is trivially restored on
return), and the way LLVM inserts the stackrestore calls breaks the
IR (we get a stackrestore between the deoptimize call and the return).
llvm-svn: 265101
They're not necessary (since the lifetime of the alloca is trivially
over due to the return), and the way LLVM inserts the lifetime.end
markers breaks the IR (we get a lifetime end marker between the
deoptimize call and the return).
llvm-svn: 265100
Summary:
As discussed on llvm-dev[1].
This change adds the basic boilerplate code around having this intrinsic
in LLVM:
- Changes in Intrinsics.td, and the IR Verifier
- A lowering pass to lower @llvm.experimental.guard to normal
control flow
- Inliner support
[1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html
Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18527
llvm-svn: 264976
Summary:
This intrinsic, together with deoptimization operand bundles, allow
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.
In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality. In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.
Note: this change does not address how `@llvm.experimental_deoptimize`
is lowered. That will be done in a later change.
Reviewers: chandlerc, rnk, atrick, reames
Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet
Differential Revision: http://reviews.llvm.org/D17732
llvm-svn: 263281
This patch provides the following infrastructure for PGO enhancements in inliner:
Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.
Differential Revision: http://reviews.llvm.org/D16381
llvm-svn: 262636
It is problematic if the inlinee has a cleanupret which unwinds to
caller and we inline it into a call site which doesn't unwind.
If the funclet unwinds anywhere other than to the caller,
then we will give the funclet two unwind destinations.
This will result in a verifier failure.
Seeing as how the caller wasn't an invoke (which would locally unwind)
and that the funclet cannot unwind to caller, we must conclude that an
'unwind to caller' cleanupret is dynamically unreachable.
This fixes PR26698.
Differential Revision: http://reviews.llvm.org/D17536
llvm-svn: 261656
Summary:
Funclet EH tables require that a given funclet have only one unwind
destination for exceptional exits. The verifier will therefore reject
e.g. two cleanuprets with different unwind dests for the same cleanup, or
two invokes exiting the same funclet but to different unwind dests.
Because catchswitch has no 'nounwind' variant, and because IR producers
are not *required* to annotate calls which will not unwind as 'nounwind',
it is legal to nest a call or an "unwind to caller" catchswitch within a
funclet pad that has an unwind destination other than caller; it is
undefined behavior for such a call or catchswitch to unwind.
Normally when inlining an invoke, calls in the inlined sequence are
rewritten to invokes that unwind to the callsite invoke's unwind
destination, and "unwind to caller" catchswitches in the inlined sequence
are rewritten to unwind to the callsite invoke's unwind destination.
However, if such a call or "unwind to caller" catchswitch is located in a
callee funclet that has another exceptional exit with an unwind
destination within the callee, applying the normal transformation would
give that callee funclet multiple unwind destinations for its exceptional
exits. There would be no way for EH table generation to determine which
is the "true" exit, and the verifier would reject the function
accordingly.
Add logic to the inliner to detect these cases and leave such calls and
"unwind to caller" catchswitches as calls and "unwind to caller"
catchswitches in the inlined sequence.
This fixes PR26147.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: alexcrichton, llvm-commits
Differential Revision: http://reviews.llvm.org/D16319
llvm-svn: 258273
Summary:
The overloads of CallInst::Create and InvokeInst::Create that are used to
adjust operand bundles purport to create a new instruction "identical in
every way except [for] the operand bundles", so copy the DebugLoc along
with everything else.
Reviewers: sanjoy, majnemer
Subscribers: majnemer, dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D16157
llvm-svn: 257745
`CloneAndPruneIntoFromInst` sometimes RAUW's dead instructions with
`undef` before erasing them (to avoid deleting instructions that still
have uses). This changes the `WeakVH` in `OperandBundleCallSites` to
hold an `undef`, and we need to guard for this situation in eventuality
in `llvm::InlineFunction`.
llvm-svn: 256110
SimplifyCFG allows tail merging with code which terminates in
unreachable which, in turn, makes it possible for an invoke to end up in
a funclet which it was not originally part of.
Using operand bundles on invokes allows us to determine whether or not
an invoke was part of a funclet in the source program.
Furthermore, it allows us to unambiguously answer questions about the
legality of inlining into call sites which the personality may have
trouble with.
Differential Revision: http://reviews.llvm.org/D15517
llvm-svn: 255674
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function. This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.
Depends on D15478.
Differential Revision: http://reviews.llvm.org/D15479
llvm-svn: 255522
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
but they are difficult to explain to others, even to seasoned LLVM
experts.
- catchendpad and cleanupendpad are optimization barriers. They cannot
be split and force all potentially throwing call-sites to be invokes.
This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
It is unsplittable, starts a funclet, and has control flow to other
funclets.
- The nesting relationship between funclets is currently a property of
control flow edges. Because of this, we are forced to carefully
analyze the flow graph to see if there might potentially exist illegal
nesting among funclets. While we have logic to clone funclets when
they are illegally nested, it would be nicer if we had a
representation which forbade them upfront.
Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
flow, just a bunch of simple operands; catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad. Their presence can be inferred
implicitly using coloring information.
N.B. The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for. An expert should take a
look to make sure the results are reasonable.
Reviewers: rnk, JosephTremoulet, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D15139
llvm-svn: 255422
- This simplifies the CallSite class, arg_begin / arg_end are now
simple wrapper getters.
- In several places, we were creating CallSite instances solely to call
arg_begin and arg_end. With this change, that's no longer required.
llvm-svn: 255226
`CloneAndPruneIntoFromInst` can DCE instructions after cloning them into
the new function, and so an AssertingVH is too strong. This change
switches CloneCodeInfo to use a std::vector<WeakVH>.
llvm-svn: 255148
The StringRef constructor is unnecessary (since we're converting to
std::string anyway), and having it requires an explicit call to
StringRef's or std::string's constructor.
llvm-svn: 255000
Summary:
Followed the guidelines in:
http://llvm.org/docs/CodingStandards.html#include-style
However, I noticed that uppercase named headers come before lowercase ones
throughout the codebase. So kept them as is.
Patch by Mandeep Singh Grang <mgrang@codeaurora.org>
Reviewers: majnemer, davide, jmolloy, atrick
Subscribers: sanjoy
Differential Revision: http://reviews.llvm.org/D14939
llvm-svn: 254005
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
llvm-svn: 253511
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites. The operation is described in more detail
in the LangRef changes.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14552
llvm-svn: 253438
Summary:
This change teaches the LLVM inliner to not inline through callsites
with unknown operand bundles. Currently all operand bundles are
"unknown" operand bundles but in the near future we will add support for
inlining through some select kinds of operand bundles.
Reviewers: reames, chandlerc, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14001
llvm-svn: 251141
We forgot to append the terminatepad's arguments which resulted in us
treating the old terminatepad as an argument to the new terminatepad
causing us to crash immediately. Instead, add the old terminatepad's
arguments to the new terminatepad.
This fixes PR25155.
llvm-svn: 250234
Continuing the work from last week to remove implicit ilist iterator
conversions. First related commit was probably r249767, with some more
motivation in r249925. This edition gets LLVMTransformUtils compiling
without the implicit conversions.
No functional change intended.
llvm-svn: 250142
This changes the behavior of AddAligntmentAssumptions to match its
comment. I.e, prove the asserted alignment in the context of the caller,
not the callee.
Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko
who also saw a fix for the issue.
rdar://22521387
Differential Revision: http://reviews.llvm.org/D12997
llvm-svn: 248390
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
Summary:
Add a `cleanupendpad` instruction, used to mark exceptional exits out of
cleanups (for languages/targets that can abort a cleanup with another
exception). The `cleanupendpad` instruction is similar to the `catchendpad`
instruction in that it is an EH pad which is the target of unwind edges in
the handler and which itself has an unwind edge to the next EH action.
The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad`
argument indicating which cleanup it exits. The unwind successors of a
`cleanuppad`'s `cleanupendpad`s must agree with each other and with its
`cleanupret`s.
Update WinEHPrepare (and docs/tests) to accomodate `cleanupendpad`.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12433
llvm-svn: 246751
Summary:
WinEHPrepare is going to require that cleanuppad and catchpad produce values
of token type which are consumed by any cleanupret or catchret exiting the
pad. This change updates the signatures of those operators to require/enforce
that the type produced by the pads is token type and that the rets have an
appropriate argument.
The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and
similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that
restriction, this change adds a notion of an operator constraint to both
LLParser and BitcodeReader, allowing appropriate sentinels to be constructed
for forward references and appropriate error messages to be emitted for
illegal inputs.
Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad
predecessor must have no other predecessors; this ensures that WinEHPrepare
will see the expected linear relationship between sibling catches on the
same try.
Lastly, remove some superfluous/vestigial casts from instruction operand
setters operating on BasicBlocks.
Reviewers: rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12108
llvm-svn: 245797
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.
There are several applications for such a type but my immediate
motivation stems from WinEH. Our personality routine enforces a
single-entry - single-exit regime for cleanups. After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together. We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.
Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup. This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.
What is the burden to the optimizer? Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway. There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.
Differential Revision: http://reviews.llvm.org/D11861
llvm-svn: 245029
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Differential Revision: http://reviews.llvm.org/D11097
llvm-svn: 243766
Instead of the pattern
for (auto I = x.rbegin(), E = x.end(); I != E; ++I)
we can use make_range to construct the reverse range and iterate using
that instead.
llvm-svn: 243163
preparation for de-coupling the AA implementations.
In order to do this, they had to become fake-scoped using the
traditional LLVM pattern of a leading initialism. These can't be actual
scoped enumerations because they're bitfields and thus inherently we use
them as integers.
I've also renamed the behavior enums that are specific to reasoning
about the mod/ref behavior of functions when called. This makes it more
clear that they have a very narrow domain of applicability.
I think there is a significantly cleaner API for all of this, but
I don't want to try to do really substantive changes for now, I just
want to refactor the things away from analysis groups so I'm preserving
the exact original design and just cleaning up the names, style, and
lifting out of the class.
Differential Revision: http://reviews.llvm.org/D10564
llvm-svn: 242963
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
llvm-svn: 239940
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`. The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.
Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one. It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs. YMMV of
course.
Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py. I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three. It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).
Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.
llvm-svn: 236120
This commit fixes the code which adds lifetime markers in InlineFunction to skip
zero-sized allocas instead of asserting on them.
rdar://problem/20531155
llvm-svn: 235312
Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory
(variables with it seem to be single largest `Metadata` contributer to
memory usage right now in -g -flto builds), this stops optimization and
backend passes from having to change local variables.
The 'inlinedAt:' field was used by the backend in two ways:
1. To tell the backend whether and into what a variable was inlined.
2. To create a unique id for each inlined variable.
Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg`
attachment, and change the DWARF backend to use a typedef called
`InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`.
This `DebugLoc` is already passed reliably through the backend (as
verified by r234021).
This commit removes the check from r234021, but I added a new check
(that will survive) in r235048, and changed the `DIBuilder` API in
r235041 to require a `!dbg` attachment whose 'scope:` is in the same
`MDSubprogram` as the variable's.
If this breaks your out-of-tree testcases, perhaps the script I used
(mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778
in a moment.
llvm-svn: 235050
The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics.
This patch makes sure that the Inliner does not add such an edge to the callgraph.
Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857
Differential Revision: http://reviews.llvm.org/D8231
llvm-svn: 231927
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.
I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.
I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
Summary:
DataLayout keeps the string used for its creation.
As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().
Get rid of DataLayoutPass: the DataLayout is in the Module
The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.
Make DataLayout Non-Optional in the Module
Module->getDataLayout() will never returns nullptr anymore.
Reviewers: echristo
Subscribers: resistor, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D7992
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
When two calls from the same MDLocation are inlined they currently get
treated as one inlined function call (creating difficulty debugging,
duplicate variables, etc).
Clang worked around this by including column information on inline calls
which doesn't address LTO inlining or calls to the same function from
the same line and column (such as through a macro). It also didn't
address ctor and member function calls.
By making the inlinedAt locations distinct, every call site has an
explicitly distinct location that cannot be coalesced with any other
call.
This can produce linearly (2x in the worst case where every call is
inlined and the call instruction has a non-call instruction at the same
location) more debug locations. Any increase beyond that are in cases
where the Clang workaround was insufficient and the new scheme is
creating necessary distinct nodes that were being erroneously coalesced
previously.
After this change to LLVM the incomplete workarounds in Clang. That
should reduce the number of debug locations (in a build without column
info, the default on Darwin, not the default on Linux) by not creating
pseudo-distinct locations for every call to an inline function.
(oh, and I made the inlined-at chain rebuilding iterative instead of
recursive because I was having trouble wrapping my head around it the
way it was - open to discussion on the right design for that function
(including going back to a recursive solution))
llvm-svn: 226736
Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to
return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and
clean up call sites. (For now, `DIBuilder` call sites just call
`release()` immediately.)
There's an accompanying change in each of clang and polly to use the new
API.
llvm-svn: 226504
Remove `MDNodeFwdDecl` (as promised in r226481). Aside from API
changes, there's no real functionality change here.
`MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`,
which returns a tuple with `isTemporary()` equal to true.
The main point is that we can now add temporaries of other `MDNode`
subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the
first place because I didn't recognize this need, and thought they were
only needed to handle forward references).
A few things left out of (or highlighted by) this commit:
- I've had to remove the (few) uses of `std::unique_ptr<>` to deal
with temporaries, since the destructor is no longer public.
`getTemporary()` should probably return the equivalent of
`std::unique_ptr<T, MDNode::deleteTemporary>`.
- `MDLocation::getTemporary()` doesn't exist yet (worse, it actually
does exist, but does the wrong thing: `MDNode::getTemporary()` is
inherited and returns an `MDTuple`).
- `MDNode` now only has one subclass, `UniquableMDNode`, and the
distinction between them is actually somewhat confusing.
I'll fix those up next.
llvm-svn: 226501
a cache of assumptions for a single function, and an immutable pass that
manages those caches.
The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.
Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.
For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.
llvm-svn: 225131
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532. Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.
I have a follow-up patch prepared for `clang`. If this breaks other
sub-projects, I apologize in advance :(. Help me compile it on Darwin
I'll try to fix it. FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.
This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.
Here's a quick guide for updating your code:
- `Metadata` is the root of a class hierarchy with three main classes:
`MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from
the `Value` class hierarchy. It is typeless -- i.e., instances do
*not* have a `Type`.
- `MDNode`'s operands are all `Metadata *` (instead of `Value *`).
- `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.
If you're referring solely to resolved `MDNode`s -- post graph
construction -- just use `MDNode*`.
- `MDNode` (and the rest of `Metadata`) have only limited support for
`replaceAllUsesWith()`.
As long as an `MDNode` is pointing at a forward declaration -- the
result of `MDNode::getTemporary()` -- it maintains a side map of its
uses and can RAUW itself. Once the forward declarations are fully
resolved RAUW support is dropped on the ground. This means that
uniquing collisions on changing operands cause nodes to become
"distinct". (This already happened fairly commonly, whenever an
operand went to null.)
If you're constructing complex (non self-reference) `MDNode` cycles,
you need to call `MDNode::resolveCycles()` on each node (or on a
top-level node that somehow references all of the nodes). Also,
don't do that. Metadata cycles (and the RAUW machinery needed to
construct them) are expensive.
- An `MDNode` can only refer to a `Constant` through a bridge called
`ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).
As a side effect, accessing an operand of an `MDNode` that is known
to be, e.g., `ConstantInt`, takes three steps: first, cast from
`Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
third, cast down to `ConstantInt`.
The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
metadata schema owners transition away from using `Constant`s when
the type isn't important (and they don't care about referring to
`GlobalValue`s).
In the meantime, I've added transitional API to the `mdconst`
namespace that matches semantics with the old code, in order to
avoid adding the error-prone three-step equivalent to every call
site. If your old code was:
MDNode *N = foo();
bar(isa <ConstantInt>(N->getOperand(0)));
baz(cast <ConstantInt>(N->getOperand(1)));
bak(cast_or_null <ConstantInt>(N->getOperand(2)));
bat(dyn_cast <ConstantInt>(N->getOperand(3)));
bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));
you can trivially match its semantics with:
MDNode *N = foo();
bar(mdconst::hasa <ConstantInt>(N->getOperand(0)));
baz(mdconst::extract <ConstantInt>(N->getOperand(1)));
bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2)));
bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3)));
bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));
and when you transition your metadata schema to `MDInt`:
MDNode *N = foo();
bar(isa <MDInt>(N->getOperand(0)));
baz(cast <MDInt>(N->getOperand(1)));
bak(cast_or_null <MDInt>(N->getOperand(2)));
bat(dyn_cast <MDInt>(N->getOperand(3)));
bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));
- A `CallInst` -- specifically, intrinsic instructions -- can refer to
metadata through a bridge called `MetadataAsValue`. This is a
subclass of `Value` where `getType()->isMetadataTy()`.
`MetadataAsValue` is the *only* class that can legally refer to a
`LocalAsMetadata`, which is a bridged form of non-`Constant` values
like `Argument` and `Instruction`. It can also refer to any other
`Metadata` subclass.
(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)
llvm-svn: 223802
Instead, we're going to separate metadata from the Value hierarchy. See
PR21532.
This reverts commit r221375.
This reverts commit r221373.
This reverts commit r221359.
This reverts commit r221167.
This reverts commit r221027.
This reverts commit r221024.
This reverts commit r221023.
This reverts commit r220995.
This reverts commit r220994.
llvm-svn: 221711
Change `Instruction::getMetadata()` to return `Value` as part of
PR21433.
Update most callers to use `Instruction::getMDNode()`, which wraps the
result in a `cast_or_null<MDNode>`.
llvm-svn: 221024
This restores the commit from SVN r219899 with an additional change to ensure
that the CodeGen is correct for the case that was identified as being incorrect
(originally PR7272).
In the case that during inlining we need to synthesize a value on the stack
(i.e. for passing a value byval), then any function involving that alloca must
be stripped of its tailness as the restriction that it does not access the
parent's stack no longer holds. Unfortunately, a single alloca can cause a
rippling effect through out the inlining as the value may be aliased or may be
mutated through an escaped external call. As such, we simply track if an alloca
has been introduced in the frame during inlining, and strip any tail calls.
llvm-svn: 220811
When functions are inlined, instructions without debug information are
attributed to the call site's DebugLoc. After inlining, inlined static
allocas are moved to the caller's entry block, adjacent to the caller's
original static alloca instructions. By retaining the call site's
DebugLoc, these instructions could cause instructions that were
subsequently inserted at the entry block to pick up the same DebugLoc.
Patch by Wolfgang Pieb!
llvm-svn: 220255
For pointer-typed function arguments, enhanced alignment can be asserted using
the 'align' attribute. When inlining, if this enhanced alignment information is
not otherwise available, preserve it using @llvm.assume-based alignment
assumptions.
llvm-svn: 219876
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.
As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.
The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.
Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.
This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).
llvm-svn: 217342
This adds an immutable pass, AssumptionTracker, which keeps a cache of
@llvm.assume call instructions within a module. It uses callback value handles
to keep stale functions and intrinsics out of the map, and it relies on any
code that creates new @llvm.assume calls to notify it of the new instructions.
The benefit is that code needing to find @llvm.assume intrinsics can do so
directly, without scanning the function, thus allowing the cost of @llvm.assume
handling to be negligible when none are present.
The current design is intended to be lightweight. We don't keep track of
anything until we need a list of assumptions in some function. The first time
this happens, we scan the function. After that, we add/remove @llvm.assume
calls from the cache in response to registration calls and ValueHandle
callbacks.
There are no new direct test cases for this pass, but because it calls it
validation function upon module finalization, we'll pick up detectable
inconsistencies from the other tests that touch @llvm.assume calls.
This pass will be used by follow-up commits that make use of @llvm.assume.
llvm-svn: 217334
This feeds AA through the IFI structure into the inliner so that
AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which
functions only access their arguments (instead of just hard-coding some
knowledge of memory intrinsics). Most of the information is only available from
BasicAA; this is important for preserving alias scoping information for
target-specific intrinsics when doing the noalias parameter attribute to
metadata conversion.
llvm-svn: 216866
I thought that I had fixed this problem in r216818, but I did not do a very
good job. The underlying issue is that when we add alias.scope metadata we are
asserting that this metadata completely describes the aliasing relationships
within the current aliasing scope domain, and so in the context of translating
noalias argument attributes, the pointers must all be based on noalias
arguments (as underlying objects) and have no other kind of underlying object.
In r216818 excluding appropriate accesses from getting alias.scope metadata is
done by looking for underlying objects that are not identified function-local
objects -- but that's wrong because allocas, etc. are also function-local
objects and we need to explicitly check that all underlying objects are the
noalias arguments for which we're adding metadata aliasing scopes.
This fixes the underlying-object check for adding alias.scope metadata, and
does some refactoring of the related capture-checking eligibility logic (and
adds more comments; hopefully making everything a bit clearer).
Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the
feature is still disabled by default).
llvm-svn: 216863
The previous implementation of AddAliasScopeMetadata, which adds noalias
metadata to preserve noalias parameter attribute information when inlining had
a flaw: it would add alias.scope metadata to accesses which might have been
derived from pointers other than noalias function parameters. This was
incorrect because even some access known not to alias with all noalias function
parameters could easily alias with an access derived from some other pointer.
Instead, when deriving from some unknown pointer, we cannot add alias.scope
metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4.
Furthermore, we cannot add alias.scope to functions unless we know they
access only argument-derived pointers (currently, we know this only for
memory intrinsics).
Also, we fix a theoretical problem with using the NoCapture attribute to skip
the capture check. This is incorrect (as explained in the comment added), but
would not matter in any code generated by Clang because we get only inferred
nocapture attributes in Clang-generated IR.
This functionality is not yet enabled by default.
llvm-svn: 216818
When a call site with noalias metadata is inlined, that metadata can be
propagated directly to the inlined instructions (only those that might access
memory because it is not useful on the others). Prior to inlining, the noalias
metadata could express that a call would not alias with some other memory
access, which implies that no instruction within that called function would
alias. By propagating the metadata to the inlined instructions, we preserve
that knowledge.
This should complete the enhancements requested in PR20500.
llvm-svn: 215676
When preserving noalias function parameter attributes by adding noalias
metadata in the inliner, we should do this for general function calls (not just
memory intrinsics). The logic is very similar to what already existed (except
that we want to add this metadata even for functions taking no relevant
parameters). This metadata can be used by ModRef queries in the caller after
inlining.
This addresses the first part of PR20500. Adding noalias metadata during
inlining is still turned off by default.
llvm-svn: 215657
No functional change. To be used in future commits that need to look
for such instructions.
Reviewed By: rafael
Differential Revision: http://reviews.llvm.org/D4504
llvm-svn: 215413
This functionality is currently turned off by default.
Part of the motivation for introducing scoped-noalias metadata is to enable the
preservation of noalias parameter attribute information after inlining.
Sometimes this can be inferred from the code in the caller after inlining, but
often we simply lose valuable information.
The overall process if fairly simple:
1. Create a new unqiue scope domain.
2. For each (used) noalias parameter, create a new alias scope.
3. For each pointer, collect the underlying objects. Add a noalias scope for
each noalias parameter from which we're not derived (and has not been
captured prior to that point).
4. Add an alias.scope for each noalias parameter from which we might be
derived (or has been captured before that point).
Note that the capture checks apply only if one of the underlying objects is not
an identified function-local object.
llvm-svn: 213949
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
1. To preserve noalias function attribute information when inlining
2. To provide the ability to model block-scope C99 restrict pointers
Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.
What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:
!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }
Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:
... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }
When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.
Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.
[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]
Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.
llvm-svn: 213864
This both improves basic debug info quality, but also fixes a larger
hole whenever we inline a call/invoke without a location (debug info for
the entire inlining is lost and other badness that the debug info
emission code is currently working around but shouldn't have to).
llvm-svn: 212065
Instructions from __nodebug__ functions don't have file:line
information even when inlined into no-nodebug functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.
With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.
Fixes PR19001.
llvm-svn: 210459
The allocas going out of scope are immediately killed by the return
instruction.
This is a resend of r208912, which was committed accidentally.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3792
llvm-svn: 208920
We have to iterate over all the calls that were inlined to find out if
any were musttail.
Sink another variable down to where its used.
llvm-svn: 208913
The allocas going out of scope are immediately killed by the return
instruction.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3630
llvm-svn: 208912
The interesting case is what happens when you inline a musttail call
through a musttail call site. In this case, we can't break perfect
forwarding or allow any stack growth.
Instead of merging control flow from the inlined return instruction
after a musttail call into the body of the caller, leave the inlined
return instruction in the caller so that the musttail call stays in the
tail position.
More work is required in http://reviews.llvm.org/D3630 to handle the
case where the inlined function has dynamic allocas or byval arguments.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3491
llvm-svn: 208910
The -tailcallelim pass should be checking if byval or inalloca args can
be captured before marking calls as tail calls. This was the real root
cause of PR7272.
With a better fix in place, revert the inliner change from r105255. The
test case it introduced still passes and has been moved to
test/Transforms/Inline/byval-tail-call.ll.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3403
llvm-svn: 206789