Commit Graph

9793 Commits

Author SHA1 Message Date
Florian Hahn b5e52bfd83 [GVN] Do PHI translations across all edges between the load and the unavailable pred.
Currently we do not properly translate addresses with PHIs if LoadBB !=
LI->getParent(), because PHITranslateAddr expects a direct predecessor as argument,
because it considers all instructions outside of the current block to
not requiring translation.

The amount of cases that trigger this should be very low, as most single
predecessor blocks should be folded into their predecessor by GVN before
we actually start with value numbering. It is still not guaranteed to
happen, so we should do PHI translation along all edges between the
loads' block and the predecessor where we have to place a load.

There are a few test cases showing current limits of the PHI translation, which
could be improved later.

Reviewers: spatel, reames, efriedma, john.brawn

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65020

llvm-svn: 369570
2019-08-21 20:06:50 +00:00
Evgeniy Stepanov 55ccd16354 Refactor isPointerOffset (NFC).
Summary:
Simplify the API using Optional<> and address comments in
         https://reviews.llvm.org/D66165

Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits, ostannard, pcc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66317

llvm-svn: 369300
2019-08-19 21:08:04 +00:00
Alina Sbirlea 1a3fdaf6a6 [MemorySSA] Rename uses when inserting memory uses.
Summary:
When inserting uses from outside the MemorySSA creation, we don't
normally need to rename uses, based on the assumption that there will be
no inserted Phis (if  Def existed that required a Phi, that Phi already
exists). However, when dealing with unreachable blocks, MemorySSA will
optimize away Phis whose incoming blocks are unreachable, and these Phis end
up being re-added when inserting a Use.
There are two potential solutions here:
1. Analyze the inserted Phis and clean them up if they are unneeded
(current method for cleaning up trivial phis does not cover this)
2. Leave the Phi in place and rename uses, the same way as whe inserting
defs.
This patch use approach 2.

Resolves first test in PR42940.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66033

llvm-svn: 369291
2019-08-19 18:57:40 +00:00
Alina Sbirlea f92109dc01 [MemorySSA] Loop passes should mark MSSA preserved when available.
This patch applies only to the new pass manager.
Currently, when MSSA Analysis is available, and pass to each loop pass, it will be preserved by that loop pass.
Hence, mark the analysis preserved based on that condition, vs the current `EnableMSSALoopDependency`. This leaves the global flag to affect only the entry point in the loop pass manager (in FunctionToLoopPassAdaptor).

llvm-svn: 369181
2019-08-17 01:02:12 +00:00
Evgeniy Stepanov 75344955fc Move isPointerOffset function to ValueTracking (NFC).
Summary: To be reused in MemTag sanitizer.

Reviewers: pcc, vitalybuka, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66165

llvm-svn: 369062
2019-08-15 22:58:28 +00:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Philip Reames 7b0515176b [SCEV] Rename getMaxBackedgeTakenCount to getConstantMaxBackedgeTakenCount [NFC]
llvm-svn: 368930
2019-08-14 21:58:13 +00:00
Philip Reames 6cca3ad43e [RLEV] Rewrite loop exit values for multiple exit loops w/o overall loop exit count
We already supported rewriting loop exit values for multiple exit loops, but if any of the loop exits were not computable, we gave up on all loop exit values. This patch generalizes the existing code to handle individual computable loop exits where possible.

As discussed in the review, this is a starting point for figuring out a better API.  The code is a bit ugly, but getting it in lets us test as we go.  

Differential Revision: https://reviews.llvm.org/D65544

llvm-svn: 368898
2019-08-14 18:27:57 +00:00
Matt Arsenault dbc1f207fa InferAddressSpaces: Move target intrinsic handling to TTI
I'm planning on handling intrinsics that will benefit from checking
the address space enums. Don't bother moving the address collection
for now, since those won't need th enums.

llvm-svn: 368895
2019-08-14 18:13:00 +00:00
Matt Arsenault 0eac2a2963 InferAddressSpaces: Remove unnecessary check for ConstantInt
The IR is invalid if this isn't a constant since immarg was added.

llvm-svn: 368893
2019-08-14 18:01:42 +00:00
Bill Wendling cc2bebe039 Ignore indirect branches from callbr.
Summary:
We can't speculate around indirect branches: indirectbr and invoke. The
callbr instruction needs to be included here.

Reviewers: nickdesaulniers, manojgupta, chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66200

llvm-svn: 368873
2019-08-14 16:44:07 +00:00
David L. Jones d4edd9d97e Revert '[LICM] Make Loop ICM profile aware' and 'Fix pass dependency for LICM'
This reverts r368526 (git commit 7e71aa24bc)
This reverts r368542 (git commit cb5a90fd31)

llvm-svn: 368800
2019-08-14 04:50:33 +00:00
Wenlei He cb5a90fd31 Fix pass dependency for LICM
Expected to address buildbot failure http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16285 caused by D65060.

llvm-svn: 368542
2019-08-11 22:54:05 +00:00
Wenlei He 7e71aa24bc [LICM] Make Loop ICM profile aware
Summary:
Hoisting/sinking instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking anly moves instruction to colder block.

Test Plan:

ninja check

Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk

Reviewed By: asbirlea

Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65060

llvm-svn: 368526
2019-08-11 06:05:35 +00:00
Sanjay Patel 21c15ef384 [Reassociate] try harder to convert negative FP constants to positive
This is an extension of a transform that tries to produce positive floating-point
constants to improve canonicalization (and hopefully lead to more reassociation
and CSE).

The original patches were:
D4904
D5363 (rL221721)

But as the test diffs show, these were limited to basic patterns by walking from
an instruction to its single user rather than recursively moving up the def-use
sequence. No fast-math is required here because we're only rearranging implicit
FP negations in intermediate ops.

A motivating bug is:
https://bugs.llvm.org/show_bug.cgi?id=32939

Differential Revision: https://reviews.llvm.org/D65954

llvm-svn: 368512
2019-08-10 13:17:54 +00:00
Bjorn Pettersson d218a3326e [InstSimplify] Report "Changed" also when only deleting dead instructions
Summary:
Make sure that we report that changes has been made
by InstSimplify also in situations when only trivially
dead instructions has been removed. If for example a call
is removed the call graph must be updated.

Bug seem to have been introduced by llvm-svn r367173
(commit 02b9e45a7e), since the code in question
was rewritten in that commit.

Reviewers: spatel, chandlerc, foad

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65973

llvm-svn: 368401
2019-08-09 07:08:25 +00:00
Cameron McInally 8416f20f2f [LICM] Support unary FNeg in LICM
Differential Revision: https://reviews.llvm.org/D65908

llvm-svn: 368350
2019-08-08 21:38:31 +00:00
Tim Corringham 4f64f1ba3c Add llvm.licm.disable metadata
For some targets the LICM pass can result in sub-optimal code in some
cases where it would be better not to run the pass, but it isn't
always possible to suppress the transformations heuristically.

Where the front-end has insight into such cases it is beneficial
to attach loop metadata to disable the pass - this change adds the
llvm.licm.disable metadata to enable that.

Differential Revision: https://reviews.llvm.org/D64557

llvm-svn: 368296
2019-08-08 13:46:17 +00:00
Cameron McInally 303b6dbfb4 [EarlyCSE] Add support for unary FNeg to EarlyCSE
Differential Revision: https://reviews.llvm.org/D65815

llvm-svn: 368171
2019-08-07 14:34:41 +00:00
Tim Renouf 5a0794327a [StructurizeCFG] Enable -structurizecfg-relaxed-uniform-regions by default
D62198 introduced an option to relax the checks for
hasOnlyUniformBranches. This commit turns the option on by default, for
better code generation in some cases in AMDGPU.

Differential Revision: https://reviews.llvm.org/D63198

Change-Id: I9cbff002a1e74d3b7eb96b4192dc8129936d537d
llvm-svn: 368042
2019-08-06 14:30:19 +00:00
Serguei Katkov de67affd00 [Loop Peeling] Introduce an option for profile based peeling disabling.
This patch adds an ability to disable profile based peeling 
causing the peeling of all iterations and as a result prohibits
further unroll/peeling attempts on that loop.

The motivation to get an ability to separate peeling usage in
pipeline where in the first part we peel only separate iterations if needed
and later in pipeline we apply the full peeling which will prohibit further peeling.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64983

llvm-svn: 367668
2019-08-02 09:32:52 +00:00
Serguei Katkov bbdcc82111 [Loop Peeling] Do not close further unroll/peel if profile based peeling was not used.
Current peeling cost model can decide to peel off not all iterations
but only some of them to eliminate conditions on phi. At the same time 
if any peeling happens the door for further unroll/peel optimizations on that
loop closes because the part of the code thinks that if peeling happened
it is profile based peeling and all iterations are peeled off.

To resolve this inconsistency the patch provides the flag which states whether
the full peeling basing on profile is enabled or not and peeling cost model
is able to modify this field like it does not PeelCount.

In a separate patch I will introduce an option to allow/disallow peeling basing
on profile.

To avoid infinite loop peeling the patch tracks the total number of peeled iteration
through llvm.loop.peeled.count loop metadata.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64972

llvm-svn: 367647
2019-08-02 04:29:23 +00:00
Roman Lebedev 081e990d08 [IR] Value: add replaceUsesWithIf() utility
Summary:
While there is always a `Value::replaceAllUsesWith()`,
sometimes the replacement needs to be conditional.

I have only cleaned a few cases where `replaceUsesWithIf()`
could be used, to both add test coverage,
and show that it is actually useful.

Reviewers: jdoerfert, spatel, RKSimon, craig.topper

Reviewed By: jdoerfert

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65528

llvm-svn: 367548
2019-08-01 12:32:08 +00:00
Philip Reames 79c27c9464 Fix a release-only build warning triggered by rL367485
llvm-svn: 367499
2019-08-01 01:16:08 +00:00
Philip Reames f8e7b53657 [IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work)
This is a prepatory patch for future work on support exit value rewriting in loops with a mixture of computable and non-computable exit counts.  The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes.

The test differences are caused by cases wherewhere getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop.

llvm-svn: 367485
2019-07-31 21:15:21 +00:00
Florian Hahn fa42f42858 [IPSCCP] Move callsite check to the beginning of the loop.
We have some code marks instructions with struct operands as overdefined,
but if the instruction is a call to a function with tracked arguments,
this breaks the assumption that the lattice values of all call sites
are not overdefined and will be replaced by a constant.

This also re-adds the assertion from D65222, with additionally skipping
non-callsite uses. This patch should address the cases reported in which
the assertion fired.

Fixes PR42738.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D65439

llvm-svn: 367430
2019-07-31 12:57:04 +00:00
Roman Lebedev 5e4e6b1fb1 [DivRemPairs] Fixup DNDEBUG build - variable is only used in assertion
llvm-svn: 367423
2019-07-31 12:26:37 +00:00
Roman Lebedev a686c60c45 [DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673)
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.

If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.

The original patch was committed in rL367288 but was reverted in rL367289
because it exposed pre-existing RAUW issues in internal data structures
of the pass; those now have been addressed in a previous patch.

https://bugs.llvm.org/show_bug.cgi?id=42673

Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner

Reviewed By: bogner

Subscribers: bogner, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65298

llvm-svn: 367419
2019-07-31 12:06:51 +00:00
Roman Lebedev 5f616901f5 [DivRemPairs] Avoid RAUW pitfalls (PR42823)
Summary:
`DivRemPairs` internally creates two maps:
* {sign, divident, divisor} -> div instruction
* {sign, divident, divisor} -> rem instruction
Then it iterates over rem map, and looks if there is an entry
in div map with the same key. Then depending on some internal logic
it may RAUW rem instruction with something else.

But if that rem instruction is an input to other div/rem,
then it was used as a key in these maps, so the old value (used in key)
is now dandling, because RAUW didn't update those maps.
And we can't even RAUW map keys in general, there's `ValueMap`,
but we don't have a single `Value` as key...

The bug was discovered via D65298, and the test there exists.
Now, i'm not sure how to expose this issue in trunk.
The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`,
but i guess this didn't miscompiled anything thus far?
I really don't think this is benin without that patch.

The fix is actually rather straight-forward - instead of trying to somehow
shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new
`ValueMap` with key being a struct of `Value`s, we can just have an intermediate
data structure - a vector, each entry containing matching `Div, Rem` pair,
and pre-filling it before doing any modifications.
This way we won't need to query map after doing RAUW, so no bug is possible.

Reviewers: spatel, bogner, RKSimon, craig.topper

Reviewed By: spatel

Subscribers: hiraditya, hans, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65451

llvm-svn: 367417
2019-07-31 12:06:38 +00:00
Florian Hahn 189efe295b Recommit "[GVN] Preserve loop related analysis/canonical forms."
This fixes some pipeline tests.
This reverts commit d0b6f42936.

llvm-svn: 367401
2019-07-31 09:27:54 +00:00
Florian Hahn d0b6f42936 Revert [GVN] Preserve loop related analysis/canonical forms.
This reverts r367332 (git commit 2d7227ec3a)

llvm-svn: 367335
2019-07-30 17:04:58 +00:00
Florian Hahn 2d7227ec3a [GVN] Preserve loop related analysis/canonical forms.
LoopInfo can be easily preserved by passing it to the functions that
modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor.
SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in
LoopInfo. The test case shows that we preserve LoopSimplify and
LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some
reason.

Also I am not sure if it is possible to preserve those in the new pass
manager, as they aren't analysis passes.

Reviewers: reames, hfinkel, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D65137

llvm-svn: 367332
2019-07-30 16:43:39 +00:00
Kit Barton de0b633999 [LoopFusion] Extend use of OptimizationRemarkEmitter
Summary:
This patch extends the use of the OptimizationRemarkEmitter to provide
information about loops that are not fused, and loops that are not eligible for
fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops
that are not eligible for fusion and the OptimizationRemarkMissed to identify
loops that cannot be fused.

It also reuses the statistics to provide the messages used in the
OptimizationRemarks. This provides common message strings between the
optimization remarks and the statistics.

I would like feedback on this approach, in general. If people are OK with this,
I will flesh out additional remarks in subsequent commits.

Subscribers: hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63844

llvm-svn: 367327
2019-07-30 15:58:43 +00:00
Roman Lebedev 8e0cf076ac Revert "[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)"
test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke:

Only PHI nodes may reference their own value!
  %sub33 = srem i32 %sub33, %ranks_in_i

This reverts commit r367288.

llvm-svn: 367289
2019-07-30 07:44:58 +00:00
Roman Lebedev c75cdd056f [DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.

If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.

https://bugs.llvm.org/show_bug.cgi?id=42673

Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner

Reviewed By: bogner

Subscribers: bogner, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65298

llvm-svn: 367288
2019-07-30 07:10:00 +00:00
Sanjay Patel 02b9e45a7e [InstSimplify] remove quadratic time looping (PR42771)
The test case from:
https://bugs.llvm.org/show_bug.cgi?id=42771
...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is
seemingly done just to avoid invalidating the instruction iterator. We can instead
delay instruction deletion until we reach the end of the block (or we could delay
until we reach the end of all blocks).

There's a test diff here for a degenerate case with llvm.assume that is not
meaningful in itself, but serves to verify this change in logic.

This change probably doesn't result in much overall compile-time improvement
because we call '-instsimplify' as a standalone pass only once in the standard
-O2 opt pipeline currently.

Differential Revision: https://reviews.llvm.org/D65336

llvm-svn: 367173
2019-07-27 14:05:51 +00:00
Florian Hahn d89f6cb299 Revert [IPSCCP] Add assertion to surface cases where we zap returns with overdefined users.
This reverts r366998 (git commit 5354c83ece)

This breaks a linux kernel build and we have reproducer to investigate.

llvm-svn: 367160
2019-07-26 22:14:08 +00:00
Wei Mi 55a68a2400 [JumpThreading] Stop searching predecessor when the current bb is in a
unreachable loop.

updatePredecessorProfileMetadata in jumpthreading tries to find the
first dominating predecessor block for a PHI value by searching upwards
the predecessor block chain.

But jumpthreading may see some temporary IR state which contains
unreachable bb not being cleaned up. If an unreachable loop happens to
be on the predecessor block chain, keeping chasing the predecessor
block will run into an infinite loop.

The patch fixes it.

Differential Revision: https://reviews.llvm.org/D65310

llvm-svn: 367154
2019-07-26 20:59:22 +00:00
Serguei Katkov 3c3a76527e [Loop Utils] Move utilty addStringMetadataToLoop to LoopUtils.cpp. NFC.
Just move the utility function to LoopUtils.cpp to re-use it in loop peeling.

Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, asbirlea, llvm-commits
Differential Revision: https://reviews.llvm.org/D65264

llvm-svn: 367085
2019-07-26 06:10:08 +00:00
JF Bastien dbc0a5df8d Allow prefetching from non-zero address spaces
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.

<rdar://problem/42662136>

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65254

llvm-svn: 367032
2019-07-25 16:11:57 +00:00
Florian Hahn 5354c83ece [IPSCCP] Add assertion to surface cases where we zap returns with overdefined users.
We should only zap returns in functions, where all live users have a
replace-able value (are not overdefined). Unused return values should be
undefined.

This should make it easier to detect bugs like in PR42738.

Alternatively we could bail out of zapping the function returns, but I
think it would be better to address those divergences between function
and call-site values where they are actually caused.

Reviewers: davide, efriedma

Reviewed By: davide, efriedma

Differential Revision: https://reviews.llvm.org/D65222

llvm-svn: 366998
2019-07-25 09:37:09 +00:00
Chen Zheng a2d74d3d90 [PowerPC] exclude more icmps in LSR which is converted in later hardware loop pass
Differential Revision: https://reviews.llvm.org/D64795

llvm-svn: 366976
2019-07-25 01:22:08 +00:00
David Bolvansky d2904ccf88 Let CorrelatedValuePropagation preserve LazyValueInfo
Summary:
This patch makes CorrelatedValuePropagation preserve LazyValueInfo by adding LazyValueInfo::eraseValue & calling it whenever an instruction is erased.

Passes `make check` , test-suite, and SPECrate 2017.

Patch by aqjune (Juneyoung Lee)

Reviewers: reames, mzolotukhin

Reviewed By: reames

Subscribers: xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59349

llvm-svn: 366942
2019-07-24 20:27:32 +00:00
Philip Reames ea5c94b497 [IndVars] Fix a subtle bug in optimizeLoopExits
The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts.  This could cause two separate bugs:
1) We might exit the loop early, and leave optimizations undone.  This is what triggered the assertion failure in the reported test case.
2) We might optimize one exit, then exit without indicating a change.  This could result in an analysis invalidaton bug if no other transform is done by the rest of indvars.

Note that the pointer exit counts are a really fragile concept.  They show up only when we have a pointer IV w/o a datalayout to provide their size.  It's really questionable to me whether the complexity implied is worth it.

llvm-svn: 366829
2019-07-23 17:45:11 +00:00
Philip Reames 6e1c3bb181 [IndVars] Speculative fix for an assertion failure seen in bots
I don't have an IR sample which is actually failing, but the issue described in the comment is theoretically possible, and should be guarded against even if there's a different root cause for the bot failures.

llvm-svn: 366241
2019-07-16 18:23:49 +00:00
Amara Emerson 228a7b4f2a [ADCE] Fix non-deterministic behaviour due to iterating over a pointer set.
Original patch by Yann Laigle-Chapuy

Differential Revision: https://reviews.llvm.org/D64785

llvm-svn: 366215
2019-07-16 15:23:10 +00:00
Rui Ueyama 49a3ad21d6 Fix parameter name comments using clang-tidy. NFC.
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
    -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
    ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}

llvm-svn: 366177
2019-07-16 04:46:31 +00:00
Alina Sbirlea db101864bd [MemorySSA] Use SetVector to avoid nondeterminism.
Summary:
Use a SetVector for DeadBlockSet.
Resolves PR42574.

Reviewers: george.burgess.iv, uabelho, dblaikie

Subscribers: jlebar, Prazek, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64601

llvm-svn: 365970
2019-07-12 22:30:30 +00:00
Sterling Augustine 6d75a9e873 The variable "Latch" is only used in an assert, which makes builds that use "-DNDEBUG" fail with unused variable messages.
Summary: Move the logic into the assert itself.

Subscribers: hiraditya, sanjoy, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64654

llvm-svn: 365943
2019-07-12 18:51:08 +00:00
Philip Reames 34495b5533 [IndVars] Use exit count reasoning to discharge obviously untaken exits
Continue in the spirit of D63618, and use exit count reasoning to prove away loop exits which can not be taken since the backedge taken count of the loop as a whole is provably less than the minimal BE count required to take this particular loop exit.

As demonstrated in the newly added tests, this triggers in a number of cases where IndVars was previously unable to discharge obviously redundant exit tests. And some not so obvious ones.

Differential Revision: https://reviews.llvm.org/D63733

llvm-svn: 365920
2019-07-12 17:05:35 +00:00
Fangrui Song b251cc0d91 Delete dead stores
llvm-svn: 365903
2019-07-12 14:58:15 +00:00
Tim Northover 27658ed512 OpaquePtr: use load instruction directly for type. NFC.
llvm-svn: 365768
2019-07-11 13:12:08 +00:00
Vitaly Buka d03bd1db59 NFC: Pass DataLayout into isBytewiseValue
Summary:
We will need to handle IntToPtr which I will submit in a separate patch as it's
not going to be NFC.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63940

llvm-svn: 365709
2019-07-10 22:53:52 +00:00
Serguei Katkov d000f8b69f [SimpleLoopUnswitch] Don't consider unswitching `switch` insructions with one unique successor
Only instructions with two or more unique successors should be considered for unswitching.

Patch Author: Daniil Suchkov.

Reviewers: reames, asbirlea, skatkov
Reviewed By: skatkov
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64404

llvm-svn: 365611
2019-07-10 10:25:22 +00:00
Tim Northover 60afa49abe OpaquePtr: add Type parameter to Loads analysis API.
This makes the functions in Loads.h require a type to be specified
independently of the pointer Value so that when pointers have no structure
other than address-space, it can still do its job.

Most callers had an obvious memory operation handy to provide this type, but a
SROA and ArgumentPromotion were doing more complicated analysis. They get
updated to merge the properties of the various instructions they were
considering.

llvm-svn: 365468
2019-07-09 11:35:35 +00:00
Serguei Katkov c6caddb73d [LoopInfo] Update getExitEdges to accept vector of pairs for non const BasicBlock
D63921 requires getExitEdges fills a vector of Edge pairs where
BasicBlocks are not constant.

The rest Loop API mostly returns non-const BasicBlocks, so to be more consistent with
other Loop API getExitEdges is modified to return non-const BasicBlocks as well.

This is an alternative solution to D64060. 

Reviewers: reames, fhahn
Reviewed By: reames, fhahn
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64309

llvm-svn: 365437
2019-07-09 04:20:43 +00:00
Philip Reames 0e344e9dc5 [LoopPred] Stylistic improvement to recently added NE/EQ normalization [NFC]
llvm-svn: 365425
2019-07-09 02:03:31 +00:00
Philip Reames 5a637cbdc7 [LoopPred] Extend LFTR normalization to the inverse EQ case
A while back, I added support for NE latches formed by LFTR.  I didn't think that quite through, as LFTR will also produce the inverse EQ form for some loops and I hadn't handled that.  This change just adds handling for that case as well.

llvm-svn: 365419
2019-07-09 01:27:45 +00:00
Cameron McInally 771769be90 [Float2Int] Add support for unary FNeg to Float2Int
Differential Revision: https://reviews.llvm.org/D63941

llvm-svn: 365324
2019-07-08 14:46:07 +00:00
Philip Reames 9e62c86408 [IRBuilder] Introduce helpers for and/or of multiple values at once
We had versions of this code scattered around, so consolidate into one location.

Not strictly NFC since the order of intermediate results may change in some places, but since these operations are associatives, should not change results.

llvm-svn: 365259
2019-07-06 03:46:18 +00:00
Chen Zheng 469f30abab [PowerPC] Hardware Loop branch instruction's condition may not be icmp.
This fixes pr42492.
Differential Revision: https://reviews.llvm.org/D64124

llvm-svn: 365104
2019-07-04 01:51:47 +00:00
Eli Friedman 41ee3977c4 [JumpThreading] Fix threading with unusual PHI nodes.
If the block being cloned contains a PHI node, in general, we need to
clone that PHI node, even though it's trivial. If the operand of the PHI
is an instruction in the block being cloned, the correct value for the
operand doesn't exist until SSAUpdater constructs it.

We usually don't hit this issue because we try to avoid threading across
loop headers, but it's possible to hit this in some cases involving
irreducible CFGs.  I added a flag to allow threading across loop headers
to make the testcase easier to understand.

Thanks to Brian Rzycki for reducing the testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42085.

Differential Revision: https://reviews.llvm.org/D63913

llvm-svn: 365094
2019-07-03 23:12:39 +00:00
Philip Reames ea06d63c35 [LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen
As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation.  

llvm-svn: 365075
2019-07-03 20:03:46 +00:00
Philip Reames 14f1543425 [LFTR] Remove a stray variable shadow *of the same value* [NFC]
llvm-svn: 365072
2019-07-03 19:08:43 +00:00
Philip Reames e7a258c6d9 [LFTR] Style and comment changes to clarify the narrow vs wide bitwidth evaluation behavior [NFC]
llvm-svn: 365071
2019-07-03 19:03:37 +00:00
Philip Reames abc8f344d6 [LFTR] Sink the decision not use truncate scheme for constants into genLoopLimit [NFC]
We might as well just evaluate the constants using SCEV, and having the cases grouped makes the logic slightly easier to read anyway.

llvm-svn: 365070
2019-07-03 18:41:03 +00:00
Philip Reames 4c80281c96 [LFTR] Remove falsely generalized (dead) code [NFC]
llvm-svn: 365067
2019-07-03 18:24:06 +00:00
Philip Reames 83cca94194 [LFTR] Hoist extend expressions outside of loops w/o waiting for LICM
The motivation for this is two fold:
1) Make the output (and thus tests)  a bit more readable to a human trying to understand the result of the transform
2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default)

llvm-svn: 365066
2019-07-03 18:18:36 +00:00
Chen Zheng dfdccbb26b [PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop.
Differential Revision: https://reviews.llvm.org/D63477

llvm-svn: 364993
2019-07-03 01:49:03 +00:00
Yevgeny Rouban d4097b4a93 [SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst
Differential Revision: https://reviews.llvm.org/D60606

llvm-svn: 364734
2019-07-01 08:43:53 +00:00
Fangrui Song 78ee2fbf98 Cleanup: llvm::bsearch -> llvm::partition_point after r364719
llvm-svn: 364720
2019-06-30 11:19:56 +00:00
Nikita Popov 8023c84433 [LFTR] Rephrase getLoopTest into "based-on" check; NFCI
What we want to know here is whether we're already using this value
for the loop condition, so make the query about that. We can extend
this to a more general "based-on" relationship, rather than a direct
icmp use later.

llvm-svn: 364715
2019-06-29 15:12:59 +00:00
Nikita Popov 61a8b62b4c [LFTR] Remove unnecessary latch check; NFCI
The whole indvars pass works on loops in simplified form, so there
is always a unique latch. Convert the condition into an assertion
in needsLFTR (though we also assert this in later LFTR functions).

Additionally update the comment on getLoopTest() now that we are
dealing with multiple exits.

llvm-svn: 364713
2019-06-29 12:41:02 +00:00
Nikita Popov 2d756c4feb [LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)
Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

llvm-svn: 364709
2019-06-29 09:24:12 +00:00
Philip Reames 1504b6ee7e [IndVars] Remove a bit of manual constant folding [NFC]
SCEV is more than capable of folding (add x, trunc(0)) to x.  

llvm-svn: 364693
2019-06-29 00:19:31 +00:00
Cameron McInally 30e5cf1d8f [NewGVN] Add unary FNeg support to NewGVN pass
Differential Revision: https://reviews.llvm.org/D63933

llvm-svn: 364680
2019-06-28 20:09:32 +00:00
Cameron McInally ab4b2364e5 [GVNSink] Add unary FNeg support to GVNSink pass
Differential Revision: https://reviews.llvm.org/D63900

llvm-svn: 364678
2019-06-28 19:57:31 +00:00
Cameron McInally 6e62a796d5 [GVN] Add support for unary FNeg to GVN pass
Differential Revision: https://reviews.llvm.org/D63896

llvm-svn: 364592
2019-06-27 21:05:02 +00:00
Gerolf Hoflehner e311a4d5c4 [SCCP] Fix non-deterministic uselists of return values (DenseMap -> MapVector)
llvm-svn: 364482
2019-06-26 21:44:37 +00:00
Philip Reames 03b2e2d986 [IndVars] Kill a redundant bit of debug output
llvm-svn: 364449
2019-06-26 17:19:09 +00:00
Clement Courbet 2851248fa1 Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
Breaks sanitizers:
    libFuzzer :: cxxstring.test
    libFuzzer :: memcmp.test
    libFuzzer :: recommended-dictionary.test
    libFuzzer :: strcmp.test
    libFuzzer :: value-profile-mem.test
    libFuzzer :: value-profile-strcmp.test

llvm-svn: 364416
2019-06-26 12:13:13 +00:00
Clement Courbet 7b3a5f0e6d [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
This allows later passes (in particular InstCombine) to optimize more
cases.

One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0.

llvm-svn: 364412
2019-06-26 11:50:18 +00:00
Philip Reames c42a357178 [LFTR] Adjust debug output to include extensions (if any)
llvm-svn: 364346
2019-06-25 20:14:08 +00:00
Clement Courbet 3bc5ad551a [ExpandMemCmp] Move all options to TargetTransformInfo.
Split off from D60318.

llvm-svn: 364281
2019-06-25 08:04:13 +00:00
Nikita Popov f1ffc4305d [CVP] Reenable nowrap flag inference
Inference of nowrap flags in CVP has been disabled, because it
triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181).
This issue has been fixed in D60935, so we should be able to reenable
nowrap flag inference now.

Differential Revision: https://reviews.llvm.org/D62776

llvm-svn: 364228
2019-06-24 20:13:13 +00:00
Sanjoy Das e2291f5af9 Fix typo in comment; NFC
llvm-svn: 364159
2019-06-23 19:22:13 +00:00
Philip Reames d22a2a9a72 [IndVars] Remove dead instructions after folding trivial loop exit
In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration).  This extends that to eliminate the dead comparison left around.  

llvm-svn: 364155
2019-06-23 17:06:57 +00:00
Philip Reames 8deb84c8ef Exploit a zero LoopExit count to eliminate loop exits
This turned out to be surprisingly effective. I was originally doing this just for completeness sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than it's isKnownPredicate reasoning.

Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.

Differential Revision: https://reviews.llvm.org/D63618

llvm-svn: 364135
2019-06-22 17:54:25 +00:00
Nikita Popov 8c8e40f763 [NewGVN] Fix copy/paste mistake in cast
llvm-svn: 364130
2019-06-22 10:20:13 +00:00
Nikita Popov e96fda726e [NewGVN] Remove dead SwitchEdges variable; NFC
llvm-svn: 364129
2019-06-22 10:20:07 +00:00
Sanjay Patel ddb9093684 [GVNSink] prevent crashing on mismatched instructions (PR42346)
Patch based on suggestion by James Molloy (@jmolloy) in:
https://bugs.llvm.org/show_bug.cgi?id=42346

llvm-svn: 364062
2019-06-21 15:17:24 +00:00
Jay Foad d9d3c91b48 [Scalarizer] Propagate IR flags
Summary:
The motivation for this was to propagate fast-math flags like nnan and
ninf on vector floating point operations to the corresponding scalar
operations to take advantage of follow-on optimizations. But I think
the same argument applies to all of our IR flags: if they apply to the
vector operation then they also apply to all the individual scalar
operations, and they might enable follow-on optimizations.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63593

llvm-svn: 364051
2019-06-21 14:10:18 +00:00
Fangrui Song dc8de6037c Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC
llvm-svn: 364006
2019-06-21 05:40:31 +00:00
Cameron McInally 1c0bd6dd2c [Reassociate] Remove bogus assert reported in PR42349.
Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.

The bogus assert with introduced in D63445.

llvm-svn: 363998
2019-06-20 23:03:55 +00:00
Alina Sbirlea d0b11698cd [LICM & MSSA] Limit unsafe sinking and hoisting.
Summary:
The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being
clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch).
If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
Given:
```
for i
  load a[i]
  store a[i]
```
The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it.
In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either
Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.

Caught by PR42294.

This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict
hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63582

llvm-svn: 363982
2019-06-20 21:09:09 +00:00
Philip Reames a7fd8a806f [LFTR] Fix a (latent?) bug related to nested loops
I can't actually come up with a test case this triggers on without an out of tree change, but in theory, it's a bug in the recently added multiple exit LFTR support.  The root issue is that an exiting block common to two loops can (in theory) have computable exit counts for both loops.  Rewriting the exit of an inner loop in terms of the outer loops IV would cause the inner loop to either a) run forever, or b) terminate on the first iteration.

In practice, we appear to get lucky and not have the exit count computable for the outer loop, except when it's trivially zero.  Given we bail on zero exit counts, we don't appear to ever trigger this.  But I can't come up with a reason we *can't* compute an exit count for the outer loop on the common exiting block, so this may very well be triggering in some cases.

llvm-svn: 363964
2019-06-20 18:45:06 +00:00
Philip Reames eda1ba65ca LFTR for multiple exit loops
Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.

This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)

I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.

I do want to note that I added one bit after the review.  When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one.  This was theoretically possible with a single exit, but the zero case covered it in practice.  With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached.  Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done.  The solution seemed obvious, so I didn't bother with another round of review.

Differential Revision: https://reviews.llvm.org/D62625

llvm-svn: 363883
2019-06-19 21:58:25 +00:00
Philip Reames ce53e2226c [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
(Resumbit of r363292 which was reverted along w/an earlier patch)

llvm-svn: 363877
2019-06-19 20:45:57 +00:00
Philip Reames f8104f01e6 [LFTR] Rename variable to minimize confusion [NFC]
(Recommit of r363293 which was reverted when a dependent patch was.)

As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here.

llvm-svn: 363875
2019-06-19 20:41:28 +00:00
Cameron McInally a027cf4764 [Reassociate] Handle unary FNeg in the Reassociate pass
Differential Revision: https://reviews.llvm.org/D63445

llvm-svn: 363813
2019-06-19 14:59:14 +00:00
Michael Liao 4f7f70e262 Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas
[SROA] Enhance SROA to handle `addrspacecast`ed allocas

- Fix typo in original change
- Add additional handling to ensure all return pointers are properly
  casted.

Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363743
2019-06-18 21:41:13 +00:00
Jordan Rupprecht 33e85ad956 Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas
This reverts r363711 (git commit 76a149ef81)

This causes stage2 build failures, e.g.:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio
http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio

llvm-svn: 363718
2019-06-18 18:40:04 +00:00
Michael Liao 76a149ef81 [SROA] Enhance SROA to handle `addrspacecast`ed allocas
Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363711
2019-06-18 17:58:49 +00:00
Philip Reames 44475363e8 Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224

llvm-svn: 363619
2019-06-17 21:06:17 +00:00
Philip Reames fe8bd96ebd Fix a bug w/inbounds invalidation in LFTR (recommit)
Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

Original commit comment follows:

This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363613
2019-06-17 20:32:22 +00:00
Joseph Tremoulet daa1ae6142 [EarlyCSE] Fix hashing of self-compares
Summary:
Update compare normalization in SimpleValue hashing to break ties (when
the same value is being compared to itself) by switching to the swapped
predicate if it has a lower numerical value.  This brings the hashing in
line with isEqual, which already recognizes the self-compares with
swapped predicates as equal.

Fixes PR 42280.

Reviewers: spatel, efriedma, nikic, fhahn, uabelho

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63349

llvm-svn: 363598
2019-06-17 19:11:28 +00:00
Whitney Tsang 15b7f5b72d PHINode: introduce setIncomingValueForBlock() function, and use it.
Summary:
There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue()
but no function to replace incoming value for a specified BasicBlock*
predecessor.
Clearly, there are a lot of places that could use that functionality.

Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn
Reviewed By: Meinersbur, fhahn
Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63338

llvm-svn: 363566
2019-06-17 14:38:56 +00:00
Matt Arsenault 1df203d78e InferAddressSpaces: Fix cloning original addrspacecast
If an addrspacecast needed to be inserted again, this was creating a
clone of the original cast for each user. Just use the original, which
also saves losing the value name.

llvm-svn: 363562
2019-06-17 14:13:29 +00:00
Matt Arsenault 282dac717e SROA: Allow eliminating addrspacecasted allocas
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.

llvm-svn: 363462
2019-06-14 21:38:31 +00:00
Florian Hahn dcdd12b68c Revert Fix a bug w/inbounds invalidation in LFTR
Reverting because it breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363289 (git commit eb88badff9)

llvm-svn: 363427
2019-06-14 17:23:09 +00:00
Florian Hahn a19809045c Revert [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
Reverting because it depends on r363289, which breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363292 (git commit 42a3fc133d)

llvm-svn: 363426
2019-06-14 17:22:56 +00:00
Florian Hahn e1b4b1b46e Revert [LFTR] Rename variable to minimize confusion [NFC]
Reverting because it depends on r363289, which breaks a green dragon
build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363293 (git commit c37be29634)

llvm-svn: 363425
2019-06-14 17:22:49 +00:00
Philip Reames c37be29634 [LFTR] Rename variable to minimize confusion [NFC]
As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop.  A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here.

llvm-svn: 363293
2019-06-13 18:40:15 +00:00
Philip Reames 42a3fc133d [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
llvm-svn: 363292
2019-06-13 18:32:55 +00:00
Philip Reames eb88badff9 Fix a bug w/inbounds invalidation in LFTR
This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying.  As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363289
2019-06-13 18:23:13 +00:00
Joseph Tremoulet 3bc6e2a7aa [EarlyCSE] Ensure equal keys have the same hash value
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`.  Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes.  This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.

This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.

It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again.  A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.

Reviewers: spatel, nikic

Reviewed By: spatel, nikic

Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62644

llvm-svn: 363274
2019-06-13 15:24:11 +00:00
Philip Reames ae2581cef3 [IndVars] Extend diagnostic -replexitval flag w/ability to bypass hard use hueristic
Note: This does mean that "always" is now more powerful than it was. 
llvm-svn: 363196
2019-06-12 19:52:05 +00:00
Matt Arsenault 86325be3d7 LoopLoadElim: Respect convergent
llvm-svn: 363162
2019-06-12 13:50:47 +00:00
Matt Arsenault 2466ba97bc LoopDistribute/LAA: Respect convergent
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.

Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.

https://reviews.llvm.org/D62607

llvm-svn: 363160
2019-06-12 13:34:19 +00:00
Alina Sbirlea 3cef1f7d64 Only passes that preserve MemorySSA must mark it as preserved.
Summary:
The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as
preserved, because it's being called in a lot of passes that do not
preserve MemorySSA.
Instead, mark the MemorySSA analysis as preserved by each pass that does
preserve it.
These changes only affect the new pass mananger.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62536

llvm-svn: 363091
2019-06-11 18:27:49 +00:00
Philip Reames a9633d5f0b [LFTR] Use recomputed BE count
This was discussed as part of D62880.  The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening.  Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion.  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.

For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

Note that before is an i32 type, and the after is an i64.  Truncating the i64 produces the i32. 

llvm-svn: 362975
2019-06-10 19:18:53 +00:00
Philip Reames 5d84ccb230 Prepare for multi-exit LFTR [NFC]
This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context.

Specifically, it turns out the existing code uses the backedge taken count from before a IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivelent) BE count for the loop when requerying after widening.  For the nestedIV example from elim-extend, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

This is the only test in tree which seems sensitive to this difference. The actual result of using the wider BETC on this example is that we actually produce slightly better code. :)

In review, we decided to accept that test change.  This patch is structured to preserve the old behavior, but a separate change will immediate follow with the behavior change.  (I wanted it separate for problem attribution purposes.)

Differential Revision: https://reviews.llvm.org/D62880

llvm-svn: 362971
2019-06-10 17:51:13 +00:00
Keno Fischer eb4a561fa3 [GVN] non-functional code movement
Summary: Move some code around, in preparation for later fixes
to the non-integral addrspace handling (D59661)

Patch By Jameson Nash <jameson@juliacomputing.com>

Reviewed By: reames, loladiro
Differential Revision: https://reviews.llvm.org/D59729

llvm-svn: 362853
2019-06-07 23:08:38 +00:00
Philip Reames 101915cfda [LoopPred] Fix a bug in unconditional latch bailout introduced in r362284
This is a really silly bug that even a simple test w/an unconditional latch would have caught.  I tried to guard against the case, but put it in the wrong if check.  Oops.

llvm-svn: 362727
2019-06-06 18:02:36 +00:00
Matt Arsenault 663d762c9a NewGVN: Handle addrspacecast
The AllConstant check needs to be moved out of the if/else if chain to
avoid a test regression. The "there is no SimplifyZExt" comment
puzzles me, since there is SimplifyCastInst. Additionally, the
Simplify* calls seem to not see the operand as constant, so this needs
to be tried if the simplify failed.

llvm-svn: 362653
2019-06-05 21:15:52 +00:00
Yevgeny Rouban a3e16719c4 Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst"
This reverts commit 5b32f60ec3.
The fix is in commit 4f9e68148b.

This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126

llvm-svn: 362583
2019-06-05 05:46:40 +00:00
Cameron McInally 5c7245b830 [Scalarizer] Add UnaryOperator visitor to scalarization pass
Differential Revision: https://reviews.llvm.org/D62858

llvm-svn: 362558
2019-06-04 23:01:36 +00:00
Cameron McInally 89f9af5487 [SCCP] Add UnaryOperator visitor to SCCP for unary FNeg
Differential Revision: https://reviews.llvm.org/D62819

llvm-svn: 362449
2019-06-03 21:53:56 +00:00
Philip Reames 9ed1673703 [LoopPred] Convert a second member function to a static helper [NFC]
(And remember to actually mark the first one static.)

llvm-svn: 362415
2019-06-03 16:23:20 +00:00
Philip Reames 0912b06f78 [LoopPred] Convert member function to free helper function [NFC]
llvm-svn: 362411
2019-06-03 16:17:14 +00:00
Nikita Popov 46d4dba6e6 [IndVarSimplify] Fixup nowrap flags during LFTR (PR31181)
Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix
for LFTR poison handling issues in general.

When LFTR moves a condition from pre-inc to post-inc, it may now
depend on value that is poison due to nowrap flags. To avoid this,
we clear any nowrap flag that SCEV cannot prove for the post-inc
addrec.

Additionally, LFTR may switch to a different IV that is dynamically
dead and as such may be arbitrarily poison. This patch will correct
nowrap flags in some but not all cases where this happens. This is
related to the adoption of IR nowrap flags for the pre-inc addrec.
(See some of the switch_to_different_iv tests, where flags are not
dropped or insufficiently dropped.)

Finally, there are likely similar issues with the handling of GEP
inbounds, but we don't have a test case for this yet.

Differential Revision: https://reviews.llvm.org/D60935

llvm-svn: 362292
2019-06-01 09:40:18 +00:00
Richard Trieu 4e875464df Inline variable into assert to fix unused variable warning.
llvm-svn: 362285
2019-06-01 03:32:20 +00:00
Philip Reames 19afdf74bb [LoopPred] Eliminate a redundant/confusing cover function [NFC]
llvm-svn: 362284
2019-06-01 03:09:28 +00:00
Philip Reames 099eca832e [LoopPred] Handle a subset of NE comparison based latches
At the moment, LoopPredication completely bails out if it sees a latch of the form:
%cmp = icmp ne %iv, %N
br i1 %cmp, label %loop, label %exit
OR
%cmp = icmp ne %iv.next, %NPlus1
br i1 %cmp, label %loop, label %exit

This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can.

For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially.

For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem.

This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later.

Differential Revision: https://reviews.llvm.org/D62748

llvm-svn: 362282
2019-06-01 00:31:58 +00:00
Nikita Popov 7bafae55c0 Reapply [CVP] Simplify non-overflowing saturating add/sub
If we can determine that a saturating add/sub will not overflow based
on range analysis, convert it into a simple binary operation. This is
a sibling transform to the existing with.overflow handling.

Reapplying this with an additional check that the saturating intrinsic
has integer type, as LVI currently does not support vector types.

Differential Revision: https://reviews.llvm.org/D62703

llvm-svn: 362263
2019-05-31 20:48:26 +00:00
Nikita Popov 23a02f6a5f [CVP] Fix assertion failure on vector with.overflow
Noticed on D62703. LVI only handles plain integers, not vectors of
integers. This was previously not an issue, because vector support
for with.overflow is only a relatively recent addition.

llvm-svn: 362261
2019-05-31 20:42:07 +00:00
Nikita Popov ccb63e0bfe Revert "[CVP] Simplify non-overflowing saturating add/sub"
This reverts commit 1e692d1777.

Causes assertion failure in builtins-wasm.c clang test.

llvm-svn: 362254
2019-05-31 19:04:47 +00:00
Nikita Popov 1e692d1777 [CVP] Simplify non-overflowing saturating add/sub
If we can determine that a saturating add/sub will not overflow
based on range analysis, convert it into a simple binary operation.
This is a sibling transform to the existing with.overflow handling.

Differential Revision: https://reviews.llvm.org/D62703

llvm-svn: 362242
2019-05-31 16:46:05 +00:00
Nikita Popov e906f2a370 [CVP] Generalize willNotOverflow(); NFC
Change argument from WithOverflowInst to BinaryOpIntrinsic, so this
function can also be used for saturating math intrinsics.

llvm-svn: 362152
2019-05-30 21:03:10 +00:00
Roman Lebedev e8578953ac [LoopIdiom] Basic OptimizationRemarkEmitter handling
Summary:
I'm adding ORE to memset/memcpy formation, with tests,
but mainly this is split off from D61144.

Reviewers: reames, anemet, thegameg, craig.topper

Reviewed By: thegameg

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62631

llvm-svn: 362092
2019-05-30 13:02:06 +00:00
Roman Lebedev fae2e46766 [LoopIdiomRecognize][NFC] Sort includes
Split off from D61144

llvm-svn: 362091
2019-05-30 13:01:53 +00:00
Matt Arsenault 79b3ea701c LoopVersioningLICM: Respect convergent and noduplicate
llvm-svn: 362031
2019-05-29 20:47:59 +00:00
Roman Lebedev 95dec50a35 [LoopIdiomRecognize][NFC] Use DEBUG_TYPE, add LLVM_DEBUG() to runOnNoncountableLoop()
Split off from D61144

llvm-svn: 362022
2019-05-29 20:11:53 +00:00
Matt Arsenault f80c4241b3 CallSiteSplitting: Respect convergent and noduplicate
llvm-svn: 361990
2019-05-29 16:59:48 +00:00
Matt Arsenault 36e7254441 SpeculateAroundPHIs: Respect convergent
llvm-svn: 361957
2019-05-29 13:14:39 +00:00
Nikita Popov 5b32f60ec3 Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst"
This reverts commit 53f2f32865.

As reported on D62126, this causes assertion failures if the switch
has incorrect branch_weights metadata, which may happen as a result
of other transforms not handling it correctly yet.

llvm-svn: 361881
2019-05-28 21:28:24 +00:00
Yevgeny Rouban 53f2f32865 [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126

llvm-svn: 361808
2019-05-28 11:33:50 +00:00
Florian Hahn 11b2f4fe50 [LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches.
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.

This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.

Fixes PR41725.

Reviewers: efriedma, mcrosier, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61576

llvm-svn: 361743
2019-05-26 23:38:25 +00:00
Nikita Popov 8b1fa07639 [CVP] Remove unnecessary checks for empty GNWR; NFC
The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.

To make this more obviously true, replace the conversative return
in makeGNWR with an assertion.

llvm-svn: 361698
2019-05-25 14:11:55 +00:00
Bjorn Pettersson b4771425f5 Use the DataLayout::typeSizeEqualsStoreSize helper. NFC
Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.

llvm-svn: 361613
2019-05-24 09:20:20 +00:00
Neil Henning 119c31ad93 StructurizeCFG: Relax uniformity checks.
This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:

1. All conditional branches that are direct children are uniform.
2. And either:
  a. All sub-regions are uniform.
  b. There is one or less conditional branches among the direct
     children.

Differential Revision: https://reviews.llvm.org/D62198

llvm-svn: 361610
2019-05-24 08:59:17 +00:00
Bjorn Pettersson d63a2bb35f [DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores
Summary:
The DeadStoreElimination pass now skips doing
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores is having a store size that is different
from the size of the type being stored.

This solves problems seen in
  https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.

The content and location of the padding bits is not
formally described (or undefined) in the LangRef
at the moment. So the solution is chosen based on
that we cannot assume anything about the padding bits
when having a store that clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949

Reviewers: spatel, efriedma, fhahn

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62250

llvm-svn: 361605
2019-05-24 08:32:02 +00:00
Alina Sbirlea d82ddfa7c3 [NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61612

llvm-svn: 361560
2019-05-23 21:52:59 +00:00
Kit Barton 987fdfd9a7 Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
This reverts r361517 (git commit 2049e4dd8f)

llvm-svn: 361553
2019-05-23 20:53:05 +00:00
Kit Barton 2049e4dd8f [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
Summary:
    This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard.

      /// Example:
      /// for (int i = lb; i < ub; i+=step)
      ///   <loop body>
      /// --- pseudo LLVMIR ---
      /// beforeloop:
      ///   guardcmp = (lb < ub)
      ///   if (guardcmp) goto preheader; else goto afterloop
      /// preheader:
      /// loop:
      ///   i1 = phi[{lb, preheader}, {i2, latch}]
      ///   <loop body>
      ///   i2 = i1 + step
      /// latch:
      ///   cmp = (i2 < ub)
      ///   if (cmp) goto loop
      /// exit:
      /// afterloop:
      ///
      /// getBounds
      ///   getInitialIVValue      --> lb
      ///   getStepInst            --> i2 = i1 + step
      ///   getStepValue           --> step
      ///   getFinalIVValue        --> ub
      ///   getCanonicalPredicate  --> '<'
      ///   getDirection           --> Increasing
      /// getGuard             --> if (guardcmp) goto loop; else goto afterloop
      /// getInductionVariable          --> i1
      /// getAuxiliaryInductionVariable --> {i1}
      /// isCanonical                   --> false

    Committed on behalf of @Whitney (Whitney Tsang).

    Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn

    Reviewed By: kbarton

    Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 361517
2019-05-23 17:56:35 +00:00
Saleem Abdulrasool 7bbefb13ee Transforms: lower fadd and fsub atomicrmw instructions
`fadd` and `fsub` have recently (r351850) been added as `atomicrmw`
operations. This diff adds lowering cases for them to the LowerAtomic
transform.

Patch by Josh Berdine!

llvm-svn: 361512
2019-05-23 17:03:43 +00:00
Clement Courbet 43882b16a3 [MergeICmps] Make the pass compatible with the new pass manager.
Reviewers: gchatelet, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62287

llvm-svn: 361490
2019-05-23 12:35:26 +00:00
Clement Courbet f8f93ba90d Re-land r361257 "[MergeICmps][NFC] Make BCEAtom move-only.""
llvm-svn: 361366
2019-05-22 09:45:40 +00:00
Clement Courbet 122c6e6f36 [MergeICmps] Make sorting strongly stable on the rhs.
Summary:
Because the sort order was not strongly stable on the RHS, whether the
chain could merge would depend on the order of the blocks in the Phi.

EXPENSIVE_CHECKS would shuffle the blocks before sorting, resulting in
non-deterministic merging.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits, RKSimon

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62193

llvm-svn: 361281
2019-05-21 17:58:42 +00:00
Clement Courbet 8361a10493 Revert r361257 "[MergeICmps][NFC] Make BCEAtom move-only."
Broke some bots.

llvm-svn: 361263
2019-05-21 14:24:46 +00:00
Clement Courbet 8fa970c2d8 [MergeICmps][NFC] Make BCEAtom move-only.
And handle for self-move. This is required so that llvm::sort can work
with EXPENSIVE_CHECKS, as it will do a random shuffle of the input
which can result in self-moves.

llvm-svn: 361257
2019-05-21 13:34:12 +00:00
Clement Courbet a95d95d392 [MergeICmps] Preserve the dominator tree.
Summary: In preparation for D60318 .

Reviewers: gchatelet, efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62068

llvm-svn: 361239
2019-05-21 11:02:23 +00:00
Matt Arsenault b04f3258dd GVN: Handle addrspacecast
llvm-svn: 361103
2019-05-18 14:36:06 +00:00
Clement Courbet 90900fbc9f [MergeICmps][NFC] Add more debug.
llvm-svn: 361024
2019-05-17 12:07:51 +00:00
Clement Courbet 632dfdda16 Re-land r360859: "[MergeICmps] Simplify the code."
With a fix for PR41917: The predecessor list was changing under our feet.

-  for (BasicBlock *Pred : predecessors(EntryBlock_)) {
+  while (!pred_empty(EntryBlock_)) {
+    BasicBlock* const Pred = *pred_begin(EntryBlock_);

llvm-svn: 361009
2019-05-17 09:43:45 +00:00
Philip Reames a74d654374 [LFTR] Strengthen assertions in genLoopLimit [NFCI]
llvm-svn: 360978
2019-05-17 02:18:03 +00:00
Philip Reames 45e7690796 [IndVars] Don't reimplement Loop::isLoopInvariant [NFC]
Using dominance vs a set membership check is indistinguishable from a compile time perspective, and the two queries return equivelent results.  Simplify code by using the existing function.

llvm-svn: 360976
2019-05-17 02:09:03 +00:00
Philip Reames 8e169cd266 [LFTR] Factor out a helper function for readability purpose [NFC]
llvm-svn: 360972
2019-05-17 01:39:58 +00:00
Philip Reames 9283f1847c Clarify comments on helpers used by LFTR [NFC]
I'm slowly wrapping my head around this code, and am making comment improvements where I can.

llvm-svn: 360968
2019-05-17 01:12:02 +00:00
Nico Weber d764e7c660 Revert r360859: "Reland r360771 "[MergeICmps] Simplify the code.""
It caused PR41917.

llvm-svn: 360963
2019-05-17 00:43:53 +00:00
Clement Courbet c4fdd717ef Reland r360771 "[MergeICmps] Simplify the code."
This revision does not seem to be the culprit.

llvm-svn: 360859
2019-05-16 06:18:02 +00:00
Hiroshi Yamauchi 7dfd087a9a [JumpThreading] A bug fix for stale loop info after unfold select
Summary:
The return value of a TryToUnfoldSelect call was not checked, which led to an
incorrectly preserved loop info and some crash.

The original crash was reported on https://reviews.llvm.org/D59514.

Reviewers: davidxl, amehsan

Reviewed By: davidxl

Subscribers: fhahn, brzycki, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61920

llvm-svn: 360780
2019-05-15 15:15:16 +00:00
Clement Courbet eaf4413d2d Revert r360771 "[MergeICmps] Simplify the code."
Breaks a bunch of builbdots.

llvm-svn: 360776
2019-05-15 14:21:59 +00:00
Clement Courbet 0d071be474 [MergeICmps] Fix r360771.
Twine references a StringRef by reference, not value...

llvm-svn: 360775
2019-05-15 14:00:45 +00:00
Clement Courbet 157ae639fa [MergeICmps] Simplify the code.
Instead of patching the original blocks, we now generate new blocks and
delete the old blocks. This results in simpler code with a less twisted
control flow (see the change in `entry-block-shuffled.ll`).

This will make https://reviews.llvm.org/D60318 simpler by making it more
obvious where control flow created and deleted.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits, spatel

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61736

llvm-svn: 360771
2019-05-15 13:04:24 +00:00
Florian Hahn 53c9d585b5 [LICM] Allow AliasSetMap to contain top-level loops.
When an outer loop gets deleted by a different pass, before LICM visits
it, we cannot clean up its sub-loops in AliasSetMap, because at the
point we receive the deleteAnalysisLoop callback for the outer loop, the loop
object is already invalid and we cannot access its sub-loops any longer.

Reviewers: asbirlea, sanjoy, chandlerc

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D61904

llvm-svn: 360704
2019-05-14 19:41:36 +00:00
Philip Reames 75ad8c5d63 Fix a release mode warning introduced in r360694
llvm-svn: 360696
2019-05-14 17:50:06 +00:00
Philip Reames bd8d309111 [IndVars] Extend reasoning about loop invariant exits to non-header blocks
Noticed while glancing through the code for other reasons.  The extension is trivial enough, decided to just do it.

llvm-svn: 360694
2019-05-14 17:20:10 +00:00
Cameron McInally 7c5c0c9fe5 Support FNeg in SpeculativeExecution pass
Differential Revision: https://reviews.llvm.org/D61910

llvm-svn: 360692
2019-05-14 16:51:18 +00:00
Amara Emerson e5248e6b41 Revert "[LSR] Tweak setup cost depth threshold to 10."
Changing the threshold might not be the best long term approach. Revert for now.

llvm-svn: 360589
2019-05-13 15:37:18 +00:00
Amara Emerson b6af291772 [LSR] Tweak setup cost depth threshold to 10.
The original change introduced a depth limit of 7 which caused a 22% regression
in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold
still ensures that the original test case doesn't hang.

rdar://50359639

llvm-svn: 360444
2019-05-10 17:29:35 +00:00
Michael Liao b284414a1b [InferAddressSpaces] Enhance the handling of cosntexpr.
Summary:
- Constant expressions may not be added in strict postorder as the
  forward instruction scan order. Thus, for a constant express (CE0), if
  its operand (CE1) is used in an previous instruction, they are not in
  postorder. However, different from
  `cloneInstructionWithNewAddressSpace`,
  `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred
  instructions for later resolving. That results in failure of inferring
  constant address.
- This patch adds the support to infer constant expression operand
  recursively, since there won't be loop, if that operand is another
  constant expression.

Reviewers: arsenm

Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61760

llvm-svn: 360431
2019-05-10 14:57:42 +00:00
Alina Sbirlea f31eba6494 [MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.

Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60833

llvm-svn: 360270
2019-05-08 17:05:36 +00:00
David Greene 6c433713e9 [Reassociation] Place moved instructions after landing pads
Reassociation's NegateValue moved instructions to the beginning of
blocks (after PHIs) without checking for exception handling pads.
It's possible for reassociation to move something into an exception
handling block so we need to make sure we don't move things too early
in the block.  This change advances the insertion point past any
exception handling pads.

If the block we want to move into contains a catchswitch, we cannot
move into it.  In that case just create a new neg as if we had not
found an existing neg to move.

Differential Revision: https://reviews.llvm.org/D61089

llvm-svn: 360262
2019-05-08 15:44:24 +00:00
Florian Hahn 3c696b3e7c [SCCP] Fix crash when trying to constant-fold terminators multiple times.
If we fold a branch/switch to an unconditional branch to another dead block we
replace the branch with unreachable, to avoid attempting to fold the
unconditional branch.

Reviewers: davide, efriedma, mssimpso, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61300

llvm-svn: 360232
2019-05-08 09:09:54 +00:00
Roman Lebedev 1a1b922177 [NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it
Summary:
It is a common thing to loop over every `PHINode` in some `BasicBlock`
and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block.
`replaceSuccessorsPhiUsesWith()` already had code to do that,
it just wasn't a function.
So outline it into a new function, and use it.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61013

llvm-svn: 359996
2019-05-05 18:59:39 +00:00
Roman Lebedev e3b1d82b53 [NFC] PHINode: introduce replaceIncomingBlockWith() function, use it
Summary:
There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()`
and `PHINode::getNumOperands()`, but no function to replace every
specified `BasicBlock*` predecessor with some other specified `BasicBlock*`.
Clearly, there are a lot of places that could use that functionality.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61011

llvm-svn: 359995
2019-05-05 18:59:30 +00:00
Yevgeny Rouban 0822bfc6de [LoopSimplifyCFG] Suppress expensive DomTree verification
This patch makes verification level lower for builds with
inexpensive checks.

Differential Revision: https://reviews.llvm.org/D61055

llvm-svn: 359446
2019-04-29 13:29:55 +00:00
Sven van Haastregt 66f612601d [InferAddressSpaces] Add AS parameter to the pass factory
This enables the pass to be used in the absence of
TargetTransformInfo. When the argument isn't passed, the factory
defaults to UninitializedAddressSpace and the flat address space is
obtained from the TargetTransformInfo as before this change. Existing
users won't have to change.

Patch by Kevin Petit.

Differential Revision: https://reviews.llvm.org/D60602

llvm-svn: 359290
2019-04-26 09:21:25 +00:00
Kit Barton 8e64f0a649 Fix unused variable warning in LoopFusion pass.
Do not wrap the contents of printFusionCandidates in the LLVM_DEBUG macro. This
fixes an unused variable warning generated when compiling without asserts but
with -DENABLE_LLVM_DUMP.

Differential Revision: https://reviews.llvm.org/D61035

llvm-svn: 359161
2019-04-25 02:10:02 +00:00
Bjorn Pettersson 71e8c6f20f Add "const" in GetUnderlyingObjects. NFC
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.

It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.

Reviewers: hfinkel, materi, jkorous

Reviewed By: jkorous

Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61038

llvm-svn: 359072
2019-04-24 06:55:50 +00:00
Fangrui Song efd94c56ba Use llvm::stable_sort
While touching the code, simplify if feasible.

llvm-svn: 358996
2019-04-23 14:51:27 +00:00
David Green 63a2aa715a [LSR] Limit the recursion for setup cost
In some circumstances we can end up with setup costs that are very complex to
compute, even though the scevs are not very complex to create. This can also
lead to setupcosts that are calculated to be exactly -1, which LSR treats as an
invalid cost. This patch puts a limit on the recursion depth for setup cost to
prevent them taking too long.

Thanks to @reames for the report and test case.

Differential Revision: https://reviews.llvm.org/D60944

llvm-svn: 358958
2019-04-23 08:52:21 +00:00
Nikita Popov 5aacc7a573 Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.

After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.

llvm-svn: 358876
2019-04-22 09:01:38 +00:00
Nikita Popov 5299e25f50 [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().

llvm-svn: 358875
2019-04-22 08:36:05 +00:00
Luqman Aden 2993661cc0 [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw.
Summary:
Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative.

Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181)

Reviewers: sanjoy, apilipenko, nikic

Reviewed By: nikic

Subscribers: hiraditya, jfb, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60036

llvm-svn: 358816
2019-04-20 13:14:18 +00:00
Vedant Kumar 282b26ec4d [GVN+LICM] Use line 0 locations for better crash attribution
This is a follow-up to r291037+r291258, which used null debug locations
to prevent jumpy line tables.

Using line 0 locations achieves the same effect, but works better for
crash attribution because it preserves the right inline scope.

Differential Revision: https://reviews.llvm.org/D60913

llvm-svn: 358791
2019-04-19 22:36:40 +00:00
Alina Sbirlea 43709f7233 [LICM & MemorySSA] Make limit flags pass tuning options.
Summary:
Make the flags in LICM + MemorySSA tuning options in the old and new
pass managers.

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60490

llvm-svn: 358772
2019-04-19 17:46:50 +00:00
Alina Sbirlea da0f71af7d [LoopUnroll] Move list of params into a struct [NFCI].
Summary: Cleanup suggested in review of r358304.

Reviewers: sanjoy, efriedma

Subscribers: jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60638

llvm-svn: 358723
2019-04-18 23:43:49 +00:00
Philip Reames 137995d8da [GuardWidening] Wire up a NPM version of the LoopGuardWidening pass
llvm-svn: 358704
2019-04-18 19:17:14 +00:00
Philip Reames adf288c5d9 [LoopPred] Fix a blatantly obvious bug in r358684
The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant.  I don't know how this got missed in the patch and review.  I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases.  Oops.

llvm-svn: 358688
2019-04-18 17:01:19 +00:00
Philip Reames 92a7177e6b [LoopPredication] Allow predication of loop invariant computations (within the loop)
The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i'
A = _a.length;
guard (i < A)
a = _a[i]
B = _b.length;
guard (i < B);
b = _b[i];
...
Z = _z.length;
guard (i < Z)
z = _z[i]
accum += a + b + ... + z;

Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form.

Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later.

As a secondary benefit, this allows LoopPred to expose peeling oppurtunities in a much more obvious manner.  See the udiv test changes as an example.  If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before.

A couple of subtleties in the implementation:
- SCEV's isSafeToExpand guarantees speculation safety (i.e. let's us expand at a new point).  It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point.
- SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located.  (i.e. it can be in the loop)  This implies we have a speculation burden to prove before expanding them outside loops.
- invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meets the SCEV definition of invariance.  I plan to sink this part into SCEV once this has baked for a bit.

Differential Revision: https://reviews.llvm.org/D60093

llvm-svn: 358684
2019-04-18 16:33:17 +00:00
Richard Trieu 7b6192025e Fix bad compare function over FusionCandidate.
Reverse the checking of the domiance order so that when a self compare happens,
it returns false.  This makes compare function have strict weak ordering.

llvm-svn: 358636
2019-04-18 01:39:45 +00:00
Denis Bakhvalov cfd25a4b0e Test commit by Denis Bakhvalov
Change-Id: I4d85123a157d957434902fb14ba50926b2d56212
llvm-svn: 358619
2019-04-17 22:27:30 +00:00
Kit Barton 3cdf87940f Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Differential Revision: https://reviews.llvm.org/D55851

llvm-svn: 358607
2019-04-17 18:53:27 +00:00
Florian Hahn 893aea58ea [LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.
Summary:
In the following cases, unrolling can be beneficial, even when
optimizing for code size:
 1) very low trip counts
 2) potential to constant fold most instructions after fully unrolling.

We can unroll in those cases, by setting the unrolling threshold to the
loop size. This might highlight some cost modeling issues and fixing
them will have a positive impact in general.

Reviewers: vsk, efriedma, dmgreen, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D60265

llvm-svn: 358586
2019-04-17 15:57:43 +00:00
Roman Lebedev 0080645846 [CVP] processOverflowIntrinsic(): don't crash if constant-holding happened
As reported by Mikael Holmén in post-commit review in
https://reviews.llvm.org/D60791#1469765

llvm-svn: 358559
2019-04-17 06:35:07 +00:00
Eric Christopher e29874eaa0 Revert "Add basic loop fusion pass." Per request.
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358553
2019-04-17 04:55:24 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Kit Barton ab70da0728 Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Phabricator: https://reviews.llvm.org/D55851
llvm-svn: 358543
2019-04-17 01:37:00 +00:00
Sanjay Patel e08783e2f5 [EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101)
This is 1 of the problems discussed in the post-commit thread for:
rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html
and filed as:
https://bugs.llvm.org/show_bug.cgi?id=41101

Instcombine tries to canonicalize some of these cases (and there's room for improvement
there independently of this patch), but it can't always do that because of extra uses.
So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar
to how we detect commuted compares and commuted min/max/abs.

Differential Revision: https://reviews.llvm.org/D60723

llvm-svn: 358523
2019-04-16 20:41:20 +00:00
Nikita Popov 52b24ee932 [CVP] Simplify umulo and smulo that cannot overflow
If a umul.with.overflow or smul.with.overflow operation cannot
overflow, simplify it to a simple mul nuw / mul nsw. After the
refactoring in D60668 this is just a matter of removing an
explicit check against multiplications.

Differential Revision: https://reviews.llvm.org/D60791

llvm-svn: 358521
2019-04-16 20:31:41 +00:00
Nikita Popov 79dffc67b5 [IR] Add WithOverflowInst class
This adds a WithOverflowInst class with a few helper methods to get
the underlying binop, signedness and nowrap type and makes use of it
where sensible. There will be two more uses in D60650/D60656.

The refactorings are all NFC, though I left some TODOs where things
could be improved. In particular we have two places where add/sub are
handled but mul isn't.

Differential Revision: https://reviews.llvm.org/D60668

llvm-svn: 358512
2019-04-16 18:55:16 +00:00
Quentin Colombet fda0426888 [LSR] Rewrite misses some fixup locations if it splits critical edge
If LSR split critical edge during rewriting phi operands and
phi node has other pending fixup operands, we need to
update those pending fixups. Otherwise formulae will not be
implemented completely and some instructions will not be eliminated.

llvm.org/PR41445

Differential Revision: https://reviews.llvm.org/D60645

Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com>

llvm-svn: 358457
2019-04-15 22:23:46 +00:00
Philip Reames e46d77d1d9 [LoopPred] Stop passing around builders [NFC]
This is a preparatory patch for D60093. This patch itself is NFC, but while preparing this I noticed and committed a small hoisting change in rL358419.

The basic structure of the new scheme is that we pass around the guard ("the using instruction"), and select an optimal insert point by examining operands at each construction point. This seems conceptually a bit cleaner to start with as it isolates the knowledge about insertion safety at the actual insertion point.

Note that the non-hoisting path is not actually used at the moment. That's not exercised until D60093 is rebased on this one.

Differential Revision: https://reviews.llvm.org/D60718

llvm-svn: 358434
2019-04-15 18:15:08 +00:00
Hiroshi Yamauchi 09e539fcae [PGO] Profile guided code size optimization.
Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets BFI/PSI is discussed in the RFC thread

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html 

Note it doesn't currently touch loop passes.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59514

llvm-svn: 358422
2019-04-15 16:49:00 +00:00
Philip Reames fbe64a2cfb [LoopPred] Hoist and of predicated checks where legal
If we have multiple range checks which can be predicated, hoist the and of the results outside the loop.  This minorly cleans up the resulting IR, but the main motivation is as a building block for D60093.

llvm-svn: 358419
2019-04-15 15:53:25 +00:00
Alina Sbirlea 2312a06c87 [SCEV] Add option to forget everything in SCEV.
Summary:
Create a method to forget everything in SCEV.
Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.

Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch.
Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build.
Testcase showcasing this cannot be opensourced and is fairly large.

The option disabled by default, but it may be desirable to enable by
default. Evidence in favor (two difference runs on different days/ToT state):

File Before (s) After (s)
clang-9.bc 7267.91 6639.14
llvm-as.bc 194.12 194.12
llvm-dis.bc 62.50 62.50
opt.bc 1855.85 1857.53

File Before (s) After (s)
clang-9.bc 8588.70 7812.83
llvm-as.bc 196.20 194.78
llvm-dis.bc 61.55 61.97
opt.bc 1739.78 1886.26

Reviewers: sanjoy

Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60144

llvm-svn: 358304
2019-04-12 19:16:07 +00:00
Nikita Popov 00a0d5d1de [CVP] Set NSW/NUW flags when simplifying with.overflow
When CVP determines that a with.overflow intrinsic cannot overflow,
it currently inserts a simple add/sub. As we already determined that
there can be no overflow, we should add the appropriate NUW/NSW flag.

Differential Revision: https://reviews.llvm.org/D60585

llvm-svn: 358298
2019-04-12 18:18:17 +00:00
Jeremy Morse 32afe6a1f8 [DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc
Bug: https://bugs.llvm.org/show_bug.cgi?id=41175

In the bug test case the DSE pass is shortening the range of memory that a
memset is working on. A getelementptr is generated so that the new
starting address can be passed to memset. This instruction was not given
a DebugLoc.

To fix the bug, copy the DebugLoc from the memset instruction.

Patch by Orlando Cazalet-Hyams!

Differential Revision: https://reviews.llvm.org/D60556

llvm-svn: 358270
2019-04-12 09:47:35 +00:00
Brian M. Rzycki 887865c1ad [JumpThreading] Fix incorrect fold conditional after indirectbr/callbr
Fixes bug 40992: https://bugs.llvm.org/show_bug.cgi?id=40992

There is potential for miscompiled code emitted from JumpThreading when
analyzing a block with one or more indirectbr or callbr predecessors. The
ProcessThreadableEdges() function incorrectly folds conditional branches
into an unconditional branch.

This patch prevents incorrect branch folding without fully pessimizing
other potential threading opportunities through the same basic block.

This IR shape was manually fed in via opt and is unclear if clang and the
full pass pipeline will ever emit similar code shapes.

Thanks to Matthias Liedtke for the bug report and simplified IR example.

Differential Revision: https://reviews.llvm.org/D60284

llvm-svn: 357930
2019-04-08 18:20:35 +00:00
Evandro Menezes 85bd3978ae [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731
2019-04-04 22:40:06 +00:00
Evandro Menezes 7c711ccf36 [IR] Create new method in `Function` class (NFC)
Create method `optForNone()` testing for the function level equivalent of
`-O0` and refactor appropriately.

Differential revision: https://reviews.llvm.org/D59852

llvm-svn: 357638
2019-04-03 21:27:03 +00:00
Philip Reames adb3ece216 [LoopPredication] Simplify widenable condition handling [NFC]
The code doesn't actually need any of the information about the widenable condition at this level.  The only thing we need is to ensure the WC call is the last thing anded in, and even that is a quirk we should really look to remove.

llvm-svn: 357448
2019-04-02 02:42:57 +00:00
Philip Reames f608678f1f [LoopPred] Rename a variable to simply a future patch [NFC]
llvm-svn: 357433
2019-04-01 22:39:54 +00:00
Nick Lewycky 1e1e212d27 [NFC] Remove dead parameter "FreeInLoop", fix some typos and trailing whitespace.
Differential Revision: https://reviews.llvm.org/D60084

llvm-svn: 357427
2019-04-01 20:37:56 +00:00
Philip Reames 05e3e554b4 [LoopPred] Be uniform about proving generated conditions
We'd been optimizing the case where the predicate was obviously true, do the same for the false case.  Mostly just for completeness sake, but also may improve compile time in loops which will exit through the guard.  Such loops are presumed rare in fastpath code, but may be present down untaken paths, so optimizing for them is still useful.

llvm-svn: 357408
2019-04-01 16:26:08 +00:00
Philip Reames d109e2a7c3 [LoopPred] Delete the old condition expressions if unused
LoopPredication was replacing the original condition, but leaving the instructions to compute the old conditions around.  This would get cleaned up by other passes of course, but we might as well do it eagerly.  That also makes the test output less confusing.  

llvm-svn: 357406
2019-04-01 16:05:15 +00:00
Philip Reames b55637b5d7 [LoopPredication] Remove stale TODO
llvm-svn: 357331
2019-03-29 23:10:01 +00:00
Philip Reames 3d4e108237 [LoopPredication] Use the builder's insertion point everywhere [NFC]
llvm-svn: 357330
2019-03-29 23:06:57 +00:00
Florian Hahn 9b41a7320d Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."
Updated to use DenseMap::insert instead of [] operator for insertion, to
avoid a crash caused by epoch checks.

This reverts commit 2b85de4383.

llvm-svn: 357257
2019-03-29 14:10:24 +00:00
Florian Hahn 2b85de4383 Revert Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."
Another buildbot failure

http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20402

clang-9: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/include/llvm/ADT/DenseMap.h:1228: llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type* llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::operator->() const [with KeyT = const llvm::Instruction*; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<const llvm::Instruction*>; Bucket = llvm::detail::DenseMapPair<const llvm::Instruction*, unsigned int>; bool IsConst = false; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::pointer = llvm::detail::DenseMapPair<const llvm::Instruction*, unsigned int>*; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type = llvm::detail::DenseMapPair<const llvm::Instruction*, unsigned int>]: Assertion `isHandleInSync() && "invalid iterator access!"' failed.

0.	Program arguments: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name ArchiveCommandLine.cpp -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu skylake-avx512 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -coverage-notes-file /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip/Output/ArchiveCommandLine.llvm.gcno -resource-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0 -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/include -I ../../../include -D _GNU_SOURCE -D __STDC_LIMIT_MACROS -D NDEBUG -D BREAK_HANDLER -D UNICODE -D _UNICODE -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/C -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/myWindows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/include_windows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP -I . -D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D NDEBUG -D _REENTRANT -D ENV_UNIX -D _7ZIP_LARGE_PAGES -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward -internal-isystem /usr/local/include -internal-isystem /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=gnu++98 -fdeprecated-macro -fdebug-compilation-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -ferror-limit 19 -fmessage-length 0 -pthread -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o Output/ArchiveCommandLine.llvm.o -x c++ /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/7zip/UI/Common/ArchiveCommandLine.cpp -faddrsig

This reverts r357222 (git commit 64cccfcc72)

llvm-svn: 357227
2019-03-29 00:22:26 +00:00
Florian Hahn 64cccfcc72 Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."
Recommitting after addressing a buildbot failure.

This reverts commit c87869ebea.

llvm-svn: 357222
2019-03-28 23:11:00 +00:00
Florian Hahn 45682fd633 [LSR] Fix signed overflow in GenerateCrossUseConstantOffsets.
For the attached test case, unchecked addition of immediate starts and
ends overflows, as they can be arbitrary i64 constants.

Proof: https://rise4fun.com/Alive/Plqc

Reviewers: qcolombet, gilr, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D59218

llvm-svn: 357217
2019-03-28 22:17:29 +00:00
Florian Hahn c87869ebea Revert [DSE] Preserve basic block ordering using OrderedBasicBlock.
This reverts r357208 (git commit c0bfd37d38)

This causes a buildbot failure:  http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/16124

FAILED: lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o
/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/install/stage2/bin/clang++   -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/IR -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR -Iinclude -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3    -UNDEBUG  -fno-exceptions -fno-rtti -MD -MT lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -MF lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o.d -o lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -c /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR/IRBuilder.cpp
clang-9: /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/Analysis/OrderedBasicBlock.cpp:38: bool llvm::OrderedBasicBlock::comesBefore(const llvm::Instruction *, const llvm::Instruction *): Assertion `!(LastInstFound == BB->end() && NextInstPos != 0) && "Instruction supposed to be in NumberedInsts"' failed.

llvm-svn: 357211
2019-03-28 20:36:24 +00:00
Florian Hahn c0bfd37d38 [DSE] Preserve basic block ordering using OrderedBasicBlock.
By extending OrderedBB to allow removing and replacing cached
instructions, we can preserve OrderedBBs in DSE easily. This eliminates
one source of quadratic compile time in DSE.

Fixes PR38829.

Reviewers: rnk, efriedma, hfinkel

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D59789

llvm-svn: 357208
2019-03-28 20:02:33 +00:00
Nikita Popov 977934f00f [ConstantRange] Add getFull() + getEmpty() named constructors; NFC
This adds ConstantRange::getFull(BitWidth) and
ConstantRange::getEmpty(BitWidth) named constructors as more readable
alternatives to the current ConstantRange(BitWidth, /* full */ false)
and similar. Additionally private getFull() and getEmpty() member
functions are added which return a full/empty range with the same bit
width -- these are commonly needed inside ConstantRange.cpp.

The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet)
constructor is now mandatory for the few usages that still make use of it.

Differential Revision: https://reviews.llvm.org/D59716

llvm-svn: 356852
2019-03-24 09:34:40 +00:00
Daniel Sanders ef8761fd3b Fix non-determinism in Reassociate caused by address coincidences
Summary:
Between building the pair map and querying it there are a few places that
erase and create Values. It's rare but the address of these newly created
Values is occasionally the same as a just-erased Value that we already
have in the pair map. These coincidences should be accounted for to avoid
non-determinism.

Thanks to Roman Tereshin for the test case.

Reviewers: rtereshin, bogner

Reviewed By: rtereshin

Subscribers: mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59401

llvm-svn: 356803
2019-03-22 20:16:35 +00:00
Simon Pilgrim 92cbcfc325 Fix -Wmisleading-indentation gcc7 warning. NFCI.
llvm-svn: 356658
2019-03-21 11:58:22 +00:00
Mikael Holmen 5b1754f93d Silence warning about unused variable in builds without asserts [NFC]
llvm-svn: 356648
2019-03-21 07:54:44 +00:00
Alina Sbirlea f69f807321 [NFC] Fix brace indentation.
llvm-svn: 356596
2019-03-20 19:18:55 +00:00
Alina Sbirlea 5baa72ea74 [LICM & MemorySSA] Don't sink/hoist stores in the presence of ordered loads.
Summary:
Before this patch, if any Use existed in the loop, with a defining
access in the loop, we conservatively decide to not move the store.
What this approach was missing, is that ordered loads are not Uses, they're Defs
in MemorySSA. So, even when the clobbering walker does not find that
volatile load to interfere, we still cannot hoist a store past a
volatile load.
Resolves PR41140.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59564

llvm-svn: 356588
2019-03-20 18:33:37 +00:00
Florian Hahn d9e88f7b7f [LSR] Check for signed overflow in NarrowSearchSpaceByDetectingSupersets.
We are adding a sign extended IR value to an int64_t, which can cause
signed overflows, as in the attached test case, where we have a formula
with BaseOffset = -1 and a constant with numeric_limits<int64_t>::min().

If the addition would overflow, skip the simplification for this
formula. Note that the target triple is required to trigger the failure.

Reviewers: qcolombet, gilr, kparzysz, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D59211

llvm-svn: 356256
2019-03-15 12:17:36 +00:00
Sam Parker a86ff8640d Fix for buildbots
Remove unused private field.

llvm-svn: 356135
2019-03-14 11:38:55 +00:00
Sam Parker eb0b8019e8 [NFC][LSR] Cleanup Cost API
Create members for Loop, ScalarEvolution, DominatorTree,
TargetTransformInfo and Formula.

Differential Revision: https://reviews.llvm.org/D58389

llvm-svn: 356131
2019-03-14 11:05:07 +00:00
Philip Reames 9b6b4fac83 [SROA] Fix a crash when trying to convert a memset to an non-integral pointer type
The included test case currently crashes on tip of tree. Rather than adding a bailout, I chose to restructure the code so that the existing helper function could be used. Given that, the majority of the diff is NFC-ish, but the key difference is that canConvertValue returns false when only one side is a non-integral pointer.

Thanks to Cherry Zhang for the test case.

Differential Revision: https://reviews.llvm.org/D59000

llvm-svn: 355962
2019-03-12 20:15:05 +00:00
Liang Zou 4a8afeb970 [format] \t => ' '
Summary:
1. \t => '  '
2. test commit access

Reviewers: Higuoxing, liangdzou

Reviewed By: Higuoxing, liangdzou

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59243

llvm-svn: 355924
2019-03-12 14:48:32 +00:00
Kristina Brooks 5b1e1c0537 Very minor typo. NFC
Typo `we we're` => `we were` in the pass EarlyCSE

Patch by liangdzou (Liang ZOU)

Differential Revision: https://reviews.llvm.org/D59241

llvm-svn: 355895
2019-03-12 07:08:19 +00:00
Jeremy Morse b60aea4131 [JumpThreading] Retain debug info when replacing branch instructions
Fixes bug 37966: https://bugs.llvm.org/show_bug.cgi?id=37966

The Jump Threading pass will replace certain conditional branch
instructions with unconditional branches when it can prove that only one
branch can occur. Prior to this patch, it would not carry the debug
info from the old instruction to the new one.

This patch fixes the bug described by copying the debug info from the
conditional branch instruction to the new unconditional branch
instruction, and adds a regression test for the Jump Threading pass that
covers this case.

Patch by Stephen Tozer!

Differential Revision: https://reviews.llvm.org/D58963

llvm-svn: 355822
2019-03-11 11:48:57 +00:00