Commit Graph

10661 Commits

Author SHA1 Message Date
Kazu Hirata b023cdeacc [llvm] Use llvm::all_of (NFC) 2021-01-19 20:19:17 -08:00
Kazu Hirata 8857202489 [llvm] Use llvm::find (NFC) 2021-01-19 20:19:14 -08:00
Mariya Podchishchaeva 7113de301a [ScalarizeMaskedMemIntrin] Add missing dependency
The pass has dependency on 'TargetTransformInfoWrapperPass', but the
corresponding call to INITIALIZE_PASS_DEPENDENCY was missing.

Differential Revision: https://reviews.llvm.org/D94916
2021-01-19 22:33:47 +03:00
Jeroen Dobbelaere 121cac01e8 [noalias.decl] Look through llvm.experimental.noalias.scope.decl
Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93042
2021-01-19 20:09:42 +01:00
Florian Hahn 83daa49758
[LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands.
D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.

The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.

This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.

I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.

From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory, has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.

There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have be
vectorized earlier (by scalarzing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).

There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close look at actual performance and address any
follow-up issues.

I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.

Reviewed By: sanwou01

Differential Revision: https://reviews.llvm.org/D94232
2021-01-19 10:15:29 +00:00
Kazu Hirata 23b0ab2acb [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
Kazu Hirata 2082b10d10 [llvm] Use *::empty (NFC) 2021-01-16 09:40:55 -08:00
Kazu Hirata 19aacdb715 [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-16 09:40:53 -08:00
Roman Lebedev c845c724c2
[Utils][SimplifyCFG] Port SplitBlock() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Jamie Schmeiser 17d0fb7f57 Set option default for enabling memory ssa for new pass manager loop sink pass to true.
Summary:
Set the default for the option enabling memory ssa use in the loop sink
pass to true for the new pass manager.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D92486
2021-01-15 09:56:44 -05:00
Kazu Hirata 7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Kazu Hirata 9bcc0d1040 [CodeGen, Transforms] Use llvm::sort (NFC) 2021-01-14 20:30:31 -08:00
Kazu Hirata 125ea20d55 [llvm] Use llvm::stable_sort (NFC) 2021-01-13 19:14:43 -08:00
Arthur Eubanks 39e6d24237 [NewPM] Only non-trivially loop unswitch at -O3 and for non-optsize functions
This matches the legacy pipeline/pass.

Reviewed By: asbirlea, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D94559
2021-01-13 14:54:49 -08:00
David Sherwood 4cd48535ec [NFC][InstructionCost] Use InstructionCost in Transforms/Scalar/RewriteStatepointsForGC.cpp
In places where we calculate costs using TTI.getXXXCost() interfaces
I have changed the code to use InstructionCost instead of unsigned.
The change is non functional since InstructionCost behaves in the
same way as an integer for valid costs. Currently the getXXXCost()
functions used in this file do not return invalid costs.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential revision: https://reviews.llvm.org/D94484
2021-01-13 09:42:58 +00:00
Kazu Hirata 12fc9ca3a4 [llvm] Remove redundant string initialization (NFC)
Identified with readability-redundant-string-init.
2021-01-12 21:43:46 -08:00
Quentin Colombet 905623b64d [NFC][LICM] Minor improvements to debug output
Added a utility function in Value class to print block name and use
block labels for unnamed blocks.
Changed LICM to call this function in its debug output.

Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>

Differential Revision: https://reviews.llvm.org/D93577
2021-01-11 18:02:49 -08:00
Roman Lebedev ec8a6c11db
[SimplifyCFGPass] iterativelySimplifyCFG(): support lazy DomTreeUpdater
This boils down to how we deal with early-increment iterator
over function's basic blocks: not only we need to early-increment,
after that we also need to skip all the blocks
that are scheduled for removal, as per DomTreeUpdater.
2021-01-12 02:09:47 +03:00
Roman Lebedev 81afeacd37
[SimplifyCFGPass] mergeEmptyReturnBlocks(): skip blocks scheduled for removal as per DomTreeUpdater
Thus supporting lazy DomTreeUpdater mode,
where the domtree updates (and thus block removals)
aren't applied immediately, but are delayed
until last possible moment.
2021-01-12 02:09:47 +03:00
Bjorn Pettersson 675be65106 Require chained analyses in BasicAA and AAResults to be transitive
This patch fixes a bug that could result in miscompiles (at least
in an OOT target). The problem could be seen by adding checks that
the DominatorTree used in BasicAliasAnalysis and ValueTracking was
valid (e.g. by adding DT->verify() call before every DT dereference
and then running all tests in test/CodeGen).

Problem was that the LegacyPassManager calculated "last user"
incorrectly for passes such as the DominatorTree when not telling
the pass manager that there was a transitive dependency between
the different analyses. And then it could happen that an incorrect
dominator tree was used when doing alias analysis (which was a pretty
serious bug as the alias analysis result could be invalid).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=48709

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94138
2021-01-11 11:50:07 +01:00
Philip Reames 4739dd67e7 [LoopDeletion] Break backedge of outermost loops when known not taken
This is a resubmit of dd6bb367 (which was reverted due to stage2 build failures in 7c63aac), with the additional restriction added to the transform to only consider outer most loops.

As shown in the added test case, ensuring LCSSA is up to date when deleting an inner loop is tricky as we may actually need to remove blocks from any outer loops, thus changing the exit block set.   For the moment, just avoid transforming this case.  I plan to return to this case in a follow up patch and see if we can do better.

Original commit message follows...

The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.

Differential Revision: https://reviews.llvm.org/D93906
2021-01-10 16:02:33 -08:00
Roman Lebedev 8e8d214c4a
[NFCI][SimplifyCFG] Prefer to add Insert edges before Delete edges into DomTreeUpdater, if reasonable
This has a measurable impact on the number of DomTree recalculations.
While this doesn't handle all the cases,
it deals with the most obvious ones.
2021-01-11 00:30:44 +03:00
Kazu Hirata 6a6e382161 [llvm] Drop unnecessary make_range (NFC) 2021-01-09 09:25:00 -08:00
Kazu Hirata 4d92ab1669 [Transforms] Use llvm::find_if (NFC) 2021-01-09 09:24:58 -08:00
Kazu Hirata b7c5e0b02c [Target, Transforms] Use *Set::contains (NFC) 2021-01-08 18:39:54 -08:00
Kazu Hirata 33bf1cad75 [llvm] Use *Set::contains (NFC) 2021-01-07 20:29:34 -08:00
Hiroshi Yamauchi cf5415c727 [PGO][PGSO] Let unroll hints take precedence over PGSO.
Differential Revision: https://reviews.llvm.org/D94199
2021-01-07 10:10:31 -08:00
Oliver Stannard 76f6b125ce Revert "[llvm] Use BasicBlock::phis() (NFC)"
Reverting because this causes crashes on the 2-stage buildbots, for
example http://lab.llvm.org:8011/#/builders/7/builds/1140.

This reverts commit 9b228f107d.
2021-01-07 09:43:33 +00:00
Kazu Hirata 9b228f107d [llvm] Use BasicBlock::phis() (NFC) 2021-01-06 18:27:35 -08:00
Alina Sbirlea 63aeaf754a [DominatorTree] Add support for mixed pre/post CFG views.
Add support for mixed pre/post CFG views.

Update usages of the MemorySSAUpdater to use the new DT API by
requesting the DT updates to be done by the MSSAUpdater.

Differential Revision: https://reviews.llvm.org/D93371
2021-01-06 14:53:09 -08:00
Florian Hahn 494db3816b
[LoopDeletion] Also consider loops with subloops for deletion.
Currently, LoopDeletion does skip loops that have sub-loops, but this
means we currently fail to remove some no-op loops.

One example are inner loops with live-out values. Those cannot be
removed by itself. But the containing loop may itself be a no-op and the
whole loop-nest can be deleted.

The legality checks do not seem to rely on analyzing inner-loops only
for correctness.

With LoopDeletion being a LoopPass, the change means that we now
unfortunately need to do some extra work in parent loops, by checking
some conditions we already checked. But there appears to be no
noticeable compile time impact:
http://llvm-compile-time-tracker.com/compare.php?from=02d11f3cda2ab5b8bf4fc02639fd1f4b8c45963e&to=843201e9cf3b6871e18c52aede5897a22994c36c&stat=instructions

This changes patch leads to ~10 more loops being deleted on
MultiSource, SPEC2000, SPEC2006 with -O3 & LTO

This patch is also required (together with a few others) to eliminate a
no-op loop in omnetpp as discussed on llvm-dev 'LoopDeletion / removal of
empty loops.' (http://lists.llvm.org/pipermail/llvm-dev/2020-December/147462.html)

This change becomes relevant after removing potentially infinite loops
is made possible in 'must-progress' loops (D86844).

Note that I added a function call with side-effects to an outer loop in
`llvm/test/Transforms/LoopDeletion/update-scev.ll` to preserve the
original spirit of the test.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D93716
2021-01-06 14:49:00 +00:00
Atmn Patel f88a797521 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2021-01-05 09:56:16 -05:00
Jeremy Morse 914066fe38 [DebugInfo] Avoid LSR crash on large integer inputs
Loop strength reduction tries to recover debug variable values by looking
for simple offsets from PHI values. In really extreme conditions there may
be an offset used that won't fit in an int64_t, hitting an APInt assertion.

This patch adds a regression test and adjusts the equivalent value
collecting code to filter out any values where the offset can't be
represented by an int64_t. This means that for very large integers with
very large offsets, the variable location will become undef, which is the
same behaviour as before 2a6782bb9f / D87494.

Differential Revision: https://reviews.llvm.org/D94016
2021-01-05 10:25:37 +00:00
Arthur Eubanks e30fbbe9a5 [JumpThreading][NewPM] Skip when target has divergent CF
Matches the legacy pass.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D94028
2021-01-04 16:08:08 -08:00
Roman Lebedev ed9de61cc3
[SimplifyCFGPass] mergeEmptyReturnBlocks(): switch to non-permissive DomTree updates
... which requires not inserting an edge that already exists.
2021-01-05 01:26:36 +03:00
Roman Lebedev 3fb57222c4
[NFCI] SimplifyCFG: switch to non-permissive DomTree updates, where possible
Notably, this doesn't switch *every* case, remaining cases
don't actually pass sanity checks in non-permissve mode,
and therefore require further analysis.

Note that SimplifyCFG still defaults to not preserving DomTree by default,
so this is effectively a NFC change.
2021-01-05 01:26:36 +03:00
Philip Reames 7c63aac7bd Revert "[LoopDeletion] Break backedge of loops when known not taken"
This reverts commit dd6bb367d1.

Multi-stage builders are showing an assertion failure w/LCSSA not being preserved on entry to IndVars.  Reason isn't clear, reverting while investigating.
2021-01-04 09:50:47 -08:00
Philip Reames dd6bb367d1 [LoopDeletion] Break backedge of loops when known not taken
The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.

Differential Revision: https://reviews.llvm.org/D93906
2021-01-04 09:19:29 -08:00
Kazu Hirata ba82c0b315 [llvm] Call *(Set|Map)::erase directly (NFC)
We can erase an item in a set or map without checking its membership
first.
2021-01-03 09:57:47 -08:00
Juneyoung Lee 1fc992bd86 [Scalarizer] Use poison as insertelement's placeholder
This patch makes Scalarizer to use poison as insertelement's placeholder.

It contains two changes in Scalarizer.cpp, and the both changes does not change the semantics of the optimized program.
It is because the placeholder value (poison) is already completely hidden by following insertelement instructions.

The first change at visitBitCastInst() creates poison vector of MidTy and consecutively inserts FanIn times,
which is # of elems of MidTy.
The second change at ScalarizerVisitor::finish() creates poison with Op->getType(), and it is filled with
Count insertelements.

The test diffs show that the poison value is never exposed after insertelements.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93989
2021-01-04 00:35:28 +09:00
Kazu Hirata 530c5af6a4 [Transforms] Construct SmallVector with iterator ranges (NFC) 2021-01-02 09:24:17 -08:00
Roman Lebedev e08fea3b24
[SimplifyCFGPass] Ensure that DominatorTreeWrapperPass is init'd before SimplifyCFG
It's probably better than hoping that it will happen to be
already initialized.
2021-01-02 01:01:17 +03:00
Bogdan Graur 8bee4d4e8f Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
Test clang/test/Misc/loop-opt-setup.c fails when executed in Release.

This reverts commit 6f1503d598.

Reviewed By: SureYeaah

Differential Revision: https://reviews.llvm.org/D93956
2020-12-31 11:47:49 +00:00
Atmn Patel 6f1503d598 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-12-30 21:43:01 -05:00
Roman Lebedev 51879a5256
[LoopIdiom] 'left-shift until bittest': don't forget to check that PHI node is in loop header
Fixes an issue reported by Peter Collingbourne in
https://reviews.llvm.org/D91726#2475301
2020-12-30 23:58:41 +03:00
Juneyoung Lee 420d046d6b clang-format, address warnings 2020-12-30 23:05:07 +09:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Juneyoung Lee bfedd5d2b6 [ConstraintElimination] Add support for select form of and/or
This patch adds support for select form of and/or.
Currently there is an ongoing effort for moving towards using `select a, b, false` instead of `and i1 a, b` and
`select a, true, b` instead of `or i1 a, b` as well.
D93065 has links to relevant changes.

Alive2 proof: (undef input was disabled due to timeout :( )
- and: https://alive2.llvm.org/ce/z/AgvFbQ
- or: https://alive2.llvm.org/ce/z/KjLJyb

Differential Revision: https://reviews.llvm.org/D93935
2020-12-30 21:27:36 +09:00
Arthur Eubanks c2ef06d3dd [NewPM] Port infer-address-spaces
And add it to the AMDGPU opt pipeline.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D93880
2020-12-28 19:58:12 -08:00
Kazu Hirata 5d2529f28f [Scalar] Construct SmallVector with iterator ranges (NFC) 2020-12-28 19:55:18 -08:00