Code region is the only part of this class that is IR-specific. Code
region is moved down in the inheritance tree to a new derived class,
called DiagnosticInfoIROptimization.
All the existing remarks are derived from this new class now.
This allows the new MIR pass-remark classes to be derived from
DiagnosticInfoOptimizationBase.
Also because we keep the name DiagnosticInfoOptimizationBase, the clang
parts don't need any adjustment.
Differential Revision: https://reviews.llvm.org/D29003
llvm-svn: 293109
Floating point intrinsics in LLVM are generally not speculatively
executed, since most of them are defined to behave the same as libm
functions, which set errno.
However, the @llvm.powi.* intrinsics do not correspond to any libm
function, and lacks any defined error handling semantics in LangRef.
It most certainly does not alter errno.
llvm-svn: 293041
I found root class should be instantiated for variadic tempate to instantiate static member explicitly.
This will fix failures in mingw DLL build.
llvm-svn: 293017
a lazy-asserting PoisoningVH.
AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.
This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.
The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.
The rest is straight cleanup.
I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.
Differential Revision: https://reviews.llvm.org/D29006
llvm-svn: 292928
Verifications of dominator tree and loop info are expensive operations
so they are disabled by default. They can be enabled by command line
options -verify-dom-info and -verify-loop-info. These options however
enable checks only in files Dominators.cpp and LoopInfo.cpp. If some
transformation changes dominaror tree and/or loop info, it would be
convenient to place similar checks to the files implementing the
transformation.
This change makes corresponding flags global, so they can be used in
any file to optionally turn verification on.
llvm-svn: 292889
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
There are additional changes required in clang.
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
become unavailable.
The AssumptionCache is now immutable but it still needs to respond to
DomTree invalidation if it ended up caching one.
This lets us remove one of the explicit invalidates of LVI but the
other one continues to avoid hitting a latent bug.
llvm-svn: 292769
This is similar to what the caller (matchSelectPattern()) does. In all
cases where we succeed in matching a min/max pattern, the values in
that pattern will be the values of the 'select', so hoist that and
remove a bunch of duplicated code.
llvm-svn: 292725
Summary:
Currently we return undef, but we're in the process of changing the
LangRef so that llvm.sqrt behaves like the other math intrinsics,
matching the return value of the standard libcall but not setting errno.
This change is legal even without the LangRef change because currently
calling llvm.sqrt(x) where x is negative is spec'ed to be UB. But in
practice it's also safe because we're simply constant-folding fewer
inputs: Inputs >= -0 get constant-folded as before, but inputs < -0 now
aren't constant-folded, because ConstantFoldFP aborts if the host math
function raises an fp exception.
Reviewers: hfinkel, efriedma, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28929
llvm-svn: 292692
This adds the following to the new PM based inliner in PGO mode:
* Use block frequency analysis to derive callsite's profile count and use
that to adjust thresholds of hot and cold callsites.
* Incrementally update the BFI of the caller after a callee gets inlined
into it. This incremental update is only within an invocation of the run
method - BFI is not preserved across calls to run.
Update the function entry count of the callee after inlining it into a
caller.
* I've tuned the thresholds for the hot and cold callsites using a hacked
up version of the old inliner that explicitly computes BFI on a set of
internal benchmarks and spec. Once the new PM based pipeline stabilizes
(IIRC Chandler mentioned there are known issues) I'll benchmark this
again and adjust the thresholds if required.
Inliner PGO support.
Differential revision: https://reviews.llvm.org/D28331
llvm-svn: 292666
This is the third attemp to recommit r292526.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
llvm-svn: 292633
This is the second attemp to recommit r292526.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
llvm-svn: 292616
This recommits r292526 which is reverted in r292529 after fixing the test case.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
llvm-svn: 292570
loops in a function.
These are relatively confusing to talk about and compute correctly so it
seems really good to write down their implementation in one place. I've
replaced one place we needed this in the loop PM infrastructure and
I have another place in a pending patch that wants it.
We can't quite use this for the core loop PM walk because there we're
sometimes working on a sub-forest.
I'll add the expected unittests before committing this but wanted to
make sure folks were happy with these names / comments.
Credit goes to Richard Smith for the idea for naming the order where siblings
are in reverse program order but the tree traversal remains preorder.
Differential Revision: https://reviews.llvm.org/D28932
llvm-svn: 292569
Summary:
Fence instructions are currently marked as `ModRef` for all memory locations.
We can improve this for constant memory locations (such as constant globals),
since fence instructions cannot modify these locations.
This helps us to forward constant loads across fences (added test case in GVN).
There were no changes in behaviour for similar test cases in early-cse and licm.
Reviewers: dberlin, sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28914
llvm-svn: 292546
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
Differential Revision: https://reviews.llvm.org/D28693
llvm-svn: 292526
The scaling is done with reference to the the new frequency of a reference block.
Differential Revision: https://reviews.llvm.org/D28535
llvm-svn: 292507
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.
Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.
No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.
Differential Revision: https://reviews.llvm.org/D28587
llvm-svn: 292449
Before, it would print a sequence of:
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
...
for every single function in the module.
llvm-svn: 292442
r292188 confused MSVC because of the combined lack of a default
case and return statement.
Move the unreachable outside of the NumLibFuncs case, to make it
obvious that all cases should be handled.
llvm_unreachable is __declspec(noreturn), so I'm assuming this
does appease MSVC.
llvm-svn: 292246
Also, add the corresponding match to the AssumptionCache's 'Affected Values' list.
Differential Revision: https://reviews.llvm.org/D28485
llvm-svn: 292239
This is another step towards unifying all LibFunc prototype checks.
This work started in r267758 (D19469); add the remaining checks.
Also add a unittest that checks each libfunc declared with a known-valid
and known-invalid prototype. New libfuncs added in the future are
required to have prototype checking in place; the known-valid test will
fail otherwise.
Differential Revision: https://reviews.llvm.org/D28030
llvm-svn: 292188
When transferring affected values in the cache from an old value, identified by
the value of the current callback, to the specified new value we might need to
insert a new entry into the DenseMap which constitutes the cache. Doing so
might delete the current callback object. Move the copying logic into a new
function, a member of the assumption cache itself, so that we don't run into UB
should the callback handle itself be removed mid-copy.
Differential Revision: https://reviews.llvm.org/D28749
llvm-svn: 292133
Summary:
Use getLoopLatch in place of isLoopSimplifyForm. we do not need
to know whether the loop has a preheader nor dedicated exits.
Reviewers: hfinkel, sanjoy, atrick, mkuper
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28724
llvm-svn: 292078
events.
This pass sometimes has a pointer to BlockFrequencyInfo so it needs
custom invalidation logic. It is also otherwise immutable so we can
reduce the number of invalidations that happen substantially.
llvm-svn: 292058
a function's CFG when that CFG is unchanged.
This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.
I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.
Differential Revision: https://reviews.llvm.org/D28627
llvm-svn: 292054
mark it as never invalidated in the new PM.
The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.
This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.
llvm-svn: 292039
This fallthrough if other cases are added between fabs and default
could cause fabs to fall to the next case resulting in a bug.
Better getting rid of it immediately just to be sure.
llvm-svn: 292003
extractProfTotalWeight checks if the profile type is sample profile, but
before that we have to ensure that summary is available. Also expanded
the unittest to test the case where there is no summar
Differential Revision: https://reviews.llvm.org/D28708
llvm-svn: 291982
- For a loop body with VERY complicated exit condition evaluation, constant
evolving may run out of stack on platforms such as Windows. Need to limit the
recursion depth.
Differential Revision: https://reviews.llvm.org/D28629
llvm-svn: 291927
Running tests with expensive checks enabled exhibits some problems with
verification of pass results.
First, the pass verification may require results of analysis that are not
available. For instance, verification of loop info requires results of dominator
tree analysis. A pass may be marked as conserving loop info but does not need to
be dependent on DominatorTreePass. When a pass manager tries to verify that loop
info is valid, it needs dominator tree, but corresponding analysis may be
already destroyed as no user of it remained.
Another case is a pass that is skipped. For instance, entities with linkage
available_externally do not need code generation and such passes are skipped for
them. In this case result verification must also be skipped.
To solve these problems this change introduces a special flag to the Pass
structure to mark passes that have valid results. If this flag is reset,
verifications dependent on the pass result are skipped.
Differential Revision: https://reviews.llvm.org/D27190
llvm-svn: 291882
* Add is{Hot|Cold}CallSite methods
* Fix a bug in isHotBB where it was looking for MD_prof on a return instruction
* Use MD_prof data only if sample profiling was used to collect profiles.
* Add an unit test to ProfileSummaryInfo
Differential Revision: https://reviews.llvm.org/D28584
llvm-svn: 291878
Summary:
Memory Dependence Analysis was limited to return only local dependencies
for invariant.group handling. Now it returns NonLocal when it finds it
and then by asking getNonLocalPointerDependency we get found dep.
Thanks to this we are able to devirtualize loops!
void indirect(A &a, int n) {
for (int i = 0 ; i < n; i++)
a.foo();
}
void test(int n) {
A a;
indirect(a);
}
After inlining a.foo() will be changed to direct call, even if foo and A::A()
is external (but only if vtable definition is be available).
Reviewers: nlewycky, dberlin, chandlerc, rsmith
Subscribers: mehdi_amini, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D28137
llvm-svn: 291762
Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.
Differential Revision: https://reviews.llvm.org/D28536
llvm-svn: 291704
This is both easier to understand, and produces a tighter bound in certain
cases.
Differential Revision: https://reviews.llvm.org/D28393
llvm-svn: 291701
Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.
As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).
Differential Revision: https://reviews.llvm.org/D28459
llvm-svn: 291671
the latter to the Transforms library.
While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.
Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.
We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.
This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.
I haven't split the unittest though because testing one component
without the other seems nearly intractable.
Differential Revision: https://reviews.llvm.org/D28452
llvm-svn: 291662
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq.
In case if the real operands bitwidth <= 16.
Differential Revision: https://reviews.llvm.org/D28104
llvm-svn: 291657
arguments much like the CGSCC pass manager.
This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.
An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.
This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.
While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.
I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.
Differential Revision: https://reviews.llvm.org/D28292
llvm-svn: 291651
Functional change: Previously, if a callee is cold, we used ColdThreshold if it minimizes the existing threshold. This was irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses very low threshold to begin with and the inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For default values of -inlinethreshold and -inlinecold-threshold, this change has no effect and this simplifies the code.
NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize() and within that group threshold updates that require profile summary info.
Differential revision: https://reviews.llvm.org/D28369
llvm-svn: 291487
invalid.
This fixes use-after-free bugs that will arise with any interesting use
of SCEV.
I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.
llvm-svn: 291426
Summary:
By using stripPointerCasts we can get to the root
value and then walk down the bitcast graph
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28181
llvm-svn: 291405
Summary:
Using the linker-supplied list of "preserved" symbols, we can compute
the list of "dead" symbols, i.e. the one that are not reachable from
a "preserved" symbol transitively on the reference graph.
Right now we are using this information to mark these functions as
non-eligible for import.
The impact is two folds:
- Reduction of compile time: we don't import these functions anywhere
or import the function these symbols are calling.
- The limited number of import/export leads to better internalization.
Patch originally by Mehdi Amini.
Reviewers: mehdi_amini, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23488
llvm-svn: 291177
Should fix some more bot failures from r291108.
This should have been a DenseSet, since GUID is not a pointer type.
It caused some bots to fail, but for some reason I wasnt't getting a
build failure.
llvm-svn: 291115
Summary:
This adds a new summary flag NotEligibleToImport that subsumes
several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal
and IsNotViableToInline). It also subsumes the checking of references
on the summary that was being done during the thin link by
eligibleForImport() for each candidate. It is much more efficient to
do that checking once during the per-module summary build and record
it in the summary.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28169
llvm-svn: 291108
This code seems to be target dependent which may not be the same for all targets.
Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'.
Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general.
Differential Revision: https://reviews.llvm.org/D27518
llvm-svn: 291106
X86 target does not provide any target specific cost calculation for interleave patterns.It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost.
In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426).
* Shiffle-broadcast cost will be changed in Simon's upcoming patch.
Differential Revision: https://reviews.llvm.org/D28118
llvm-svn: 290810
I'm not sure if this was intentional, but today
isGuaranteedToTransferExecutionToSuccessor returns true for readonly and
argmemonly calls that may throw. This commit changes the function to
not implicitly infer nounwind this way.
Even if we eventually specify readonly calls as not throwing,
isGuaranteedToTransferExecutionToSuccessor is not the best place to
infer that. We should instead teach FunctionAttrs or some other such
pass to tag readonly functions / calls as nounwind instead.
llvm-svn: 290794
I don't think this hole is currently exposed, but I crashed regression tests for
jump-threading and loop-vectorize after I added calls to isKnownNonNullAt() in
InstSimplify as part of trying to solve PR28430:
https://llvm.org/bugs/show_bug.cgi?id=28430
That's because they call into value tracking with a context instruction, but no
other parts of the query structure filled in.
For more background, see the discussion in:
https://reviews.llvm.org/D27855
llvm-svn: 290786
Summary:
gep 0, 0 is equivalent to bitcast. LLVM canonicalizes it
to getelementptr because it make SROA can then handle it.
Simple case like
void g(A &a) {
z(a);
if (glob)
a.foo();
}
void testG() {
A a;
g(a);
}
was not devirtualized with -fstrict-vtable-pointers because luck of
handling for gep 0 in Memory Dependence Analysis
Reviewers: dberlin, nlewycky, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28126
llvm-svn: 290763
analyses when we're about to break apart an SCC.
We can't wait until after breaking apart the SCC to invalidate things:
1) Which SCC do we then invalidate? All of them?
2) Even if we invalidate all of them, a newly created SCC may not have
a proxy that will convey the invalidation to functions!
Previously we only invalidated one of the SCCs and too late. This led to
stale analyses remaining in the cache. And because the caching strategy
actually works, they would get used and chaos would ensue.
Doing invalidation early is somewhat pessimizing though if we *know*
that the SCC structure won't change. So it turns out that the design to
make the mutation API force the caller to know the *kind* of mutation in
advance was indeed 100% correct and we didn't do enough of it. So this
change also splits two cases of switching a call edge to a ref edge into
two separate APIs so that callers can clearly test for this and take the
easy path without invalidating when appropriate. This is particularly
important in this case as we expect most inlines to be between functions
in separate SCCs and so the common case is that we don't have to so
aggressively invalidate analyses.
The LCG API change in turn needed some basic cleanups and better testing
in its unittest. No interesting functionality changed there other than
more coverage of the returned sequence of SCCs.
While this seems like an obvious improvement over the current state, I'd
like to revisit the core concept of invalidating within the CG-update
layer at all. I'm wondering if we would be better served forcing the
callers to handle the invalidation beforehand in the cases that they
can handle it. An interesting example is when we want to teach the
inliner to *update and preserve* analyses. But we can cross that bridge
when we get there.
With this patch, the new pass manager an build all of the LLVM test
suite at -O3 and everything passes. =D I haven't bootstrapped yet and
I'm sure there are still plenty of bugs, but this gives a nice baseline
so I'm going to increasingly focus on fleshing out the missing
functionality, especially the bits that are just turned off right now in
order to let us establish this baseline.
llvm-svn: 290664
due to a call cycle.
This actually crashed the ref removal before.
I've added a unittest that covers this kind of interesting graph
structure and mutation.
llvm-svn: 290645
There is no need to do this within an analysis. That method shouldn't
even be reached if this predicate holds as the actual useful
optimization is in the analysis manager itself.
llvm-svn: 290614
The effect of the bug was that we would incorrectly create summaries
for global and weak values defined in module asm (since we were
essentially testing for bit 1 which is SF_Undefined, and the
RecordStreamer ignores local undefined references). This would have
resulted in conservatively disabling importing of anything referencing
globals and weaks defined in module asm. Added these cases to the test
which now fails without this bug fix.
Fixes PR31459.
llvm-svn: 290610
Because operand was not marked as seen it was visited twice.
It doesn't change behavior of optimization, it just saves redudant
visit, so no test changes.
llvm-svn: 290607
This requires custom handling because BasicAA caches handles to other
analyses and so it needs to trigger indirect invalidation.
This fixes one of the common crashes when using the new PM in real
pipelines. I've also tweaked a regression test to check that we are at
least handling the most immediate case.
I'm going to work at re-structuring this test some to both scale better
(rather than all being in one file) and check more invalidation paths in
a follow-up commit, but I wanted to get the basic bug fix in place.
llvm-svn: 290603
inter-analysis dependencies) to use the new invalidation infrastructure.
This teaches it to invalidate itself when any of the peer function
AA results that it uses become invalid. We do this by just tracking the
originating IDs. I've kept it in a somewhat clunky API since some users
of AAResults are outside the new PM right now. We can clean this API up
if/when those users go away.
Secondly, it uses the registration on the outer analysis manager proxy
to trigger deferred invalidation when a module analysis result becomes
invalid.
I've included test cases that specifically try to trigger use-after-free
in both of these cases and they would crash or hang pretty horribly for
me even without ASan. Now they work nicely.
The `InvalidateAnalysis` utility pass required some tweaking to be
useful in this context and it still is pretty garbage. I'd like to
switch it back to the previous implementation and teach the explicit
invalidate method on the AnalysisManager to take care of correctly
triggering indirect invalidation, but I wanted to go ahead and send this
out so folks could see how all of this stuff works together in practice.
And, you know, that it does actually work. =]
Differential Revision: https://reviews.llvm.org/D27205
llvm-svn: 290595
that require deferred invalidation.
This handles the other real-world invalidation scenario that we have
cases of: a function analysis which caches references to a module
analysis. We currently do this in the AA aggregation layer and might
well do this in other places as well.
Since this is relative rare, the technique is somewhat more cumbersome.
Analyses need to register themselves when accessing the outer analysis
manager's proxy. This proxy is already necessarily present to allow
access to the outer IR unit's analyses. By registering here we can track
and trigger invalidation when that outer analysis goes away.
To make this work we need to enhance the PreservedAnalyses
infrastructure to support a (slightly) more explicit model for "sets" of
analyses, and allow abandoning a single specific analyses even when
a set covering that analysis is preserved. That allows us to describe
the scenario of preserving all Function analyses *except* for the one
where deferred invalidation has triggered.
We also need to teach the invalidator API to support direct ID calls
instead of always going through a template to dispatch so that we can
just record the ID mapping.
I've introduced testing of all of this both for simple module<->function
cases as well as for more complex cases involving a CGSCC layer.
Much like the previous patch I've not tried to fully update the loop
pass management layer because that layer is due to be heavily reworked
to use similar techniques to the CGSCC to handle updates. As that
happens, we'll have a better testing basis for adding support like this.
Many thanks to both Justin and Sean for the extensive reviews on this to
help bring the API design and documentation into a better state.
Differential Revision: https://reviews.llvm.org/D27198
llvm-svn: 290594
We currently ignore the `allocsize` attribute on functions calls with
the `nobuiltin` attribute when trying to lower `@llvm.objectsize`. We
shouldn't care about `nobuiltin` here: `allocsize` is explicitly added
by the user, not inferred based on a function's symbol.
llvm-svn: 290588
This also makes us no longer check for `allocsize` on intrinsic calls.
This shouldn't matter, since intrinsics should provide the information
we get from `allocsize` on their own.
llvm-svn: 290585
This patch fixes some ASAN unittest failures on FreeBSD. See the
cfe-commits email thread for r290169 for more on those.
According to the LangRef, the allocsize attribute only tells us about
the number of bytes that exist at the memory location pointed to by the
return value of a function. It does not necessarily mean that the
function will only ever allocate. So, we need to be very careful about
treating functions with allocsize as general allocation functions. This
patch makes us fully conservative in this regard, though I suspect that
we have room to be a bit more aggressive if we want.
This has a FIXME that can be fixed by a relatively straightforward
refactor; I just wanted to keep this patch minimal. If this sticks, I'll
come back and fix it in a few days.
llvm-svn: 290397
declarations.
We're using a custom class here instead of the helper template, these
bits just didn't get deleted when the other bits did get deleted. This
was found by a really nice MSVC warning about explicitly instantiating
a template where some member functions aren't defined and thus can't be
instantiatied.
llvm-svn: 290327
from the old pass manager in the new one.
I'm not trying to support (initially) the numerous options that are
currently available to customize the pass pipeline. If we end up really
wanting them, we can add them later, but I suspect many are no longer
interesting. The simplicity of omitting them will help a lot as we sort
out what the pipeline should look like in the new PM.
I've also documented to the best of my ability *why* each pass or group
of passes is used so that reading the pipeline is more helpful. In many
cases I think we have some questionable choices of ordering and I've
left FIXME comments in place so we know what to come back and revisit
going forward. But for now, I've left it as similar to the current
pipeline as I could.
Lastly, I've had to comment out several places where passes are not
ported to the new pass manager or where the loop pass infrastructure is
not yet ready. I did at least fix a few bugs in the loop pass
infrastructure uncovered by running the full pipeline, but I didn't want
to go too far in this patch -- I'll come back and re-enable these as the
infrastructure comes online. But I'd like to keep the comments in place
because I don't want to lose track of which passes need to be enabled
and where they go.
One thing that seemed like a significant API improvement was to require
that we don't build pipelines for O0. It seems to have no real benefit.
I've also switched back to returning pass managers by value as at this
API layer it feels much more natural to me for composition. But if
others disagree, I'm happy to go back to an output parameter.
I'm not 100% happy with the testing strategy currently, but it seems at
least OK. I may come back and try to refactor or otherwise improve this
in subsequent patches but I wanted to at least get a good starting point
in place.
Differential Revision: https://reviews.llvm.org/D28042
llvm-svn: 290325
Each function summary has an attached list of type identifier GUIDs. The
idea is that during the regular LTO phase we would match these GUIDs to type
identifiers defined by the regular LTO module and store the resolutions in
a top-level "type identifier summary" (which will be implemented separately).
Differential Revision: https://reviews.llvm.org/D27967
llvm-svn: 290280
For vector GEPs, CastGEPIndices can end up in an infinite recursion, because
we compare the vector type to the scalar pointer type, find them different,
and then try to cast a type to itself.
Differential Revision: https://reviews.llvm.org/D28009
llvm-svn: 290260
We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.
llvm-svn: 290214
Summary:
In getRangeForAffineAR we compute ranges for affine exprs E = A + B*C,
where ranges for A, B, and C are known. To avoid overflow, we need to
operate on a bigger bitwidth, and originally we chose 2*x+1 for this
(x being the original bitwidth). However, it is safe to use just 2*x:
A+B*C <= (2^x - 1) + (2^x - 1)*(2^x - 1) =
= 2^x - 1 + 2^2x - 2^x - 2^x + 1 =
= 2^2x - 2^x <= 2^2x - 1
Unnecessary extending of bitwidths results in noticeable slowdowns: ranges
perform arithmetic operations using APInt, which are much slower when bitwidths
are bigger than 64.
Reviewers: sanjoy, majnemer, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27795
llvm-svn: 290211
Also make the summary ref and call graph vectors immutable. This means
a smaller API surface and fewer places to audit for non-determinism.
Differential Revision: https://reviews.llvm.org/D27875
llvm-svn: 290200