Commit Graph

819 Commits

Author SHA1 Message Date
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00
Chandler Carruth d1a2e05991 [PM/AA] Explicitly depend on TLI rather than getting it out of the
AliasAnalysis.

Same as the other commits, the TLI access from an alias analysis is
going away and isn't very clean -- it is better to explicitly mark the
dependencies.

llvm-svn: 244785
2015-08-12 18:06:08 +00:00
Sanjay Patel fec7965b36 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244617
2015-08-11 15:56:31 +00:00
Sanjay Patel 2a3eb41deb fix code that was accidentally commented out in previous commit
llvm-svn: 244610
2015-08-11 15:08:29 +00:00
Sanjay Patel 320217668e fix typos in comments; NFC
llvm-svn: 244609
2015-08-11 15:04:51 +00:00
Sanjay Patel 25b2601bca fix typo in comment; NFC
llvm-svn: 244607
2015-08-11 14:45:08 +00:00
Tyler Nowicki c94d6ad241 Print vectorization analysis when loop hint is specified.
This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.

llvm-svn: 244555
2015-08-11 01:09:15 +00:00
Tyler Nowicki 233773837e Moved LoopVectorizeHints and related functions before LoopVectorizationLegality and LoopVectorizationCostModel.
llvm-svn: 244552
2015-08-11 00:52:54 +00:00
Tyler Nowicki 2d5802f38d Simplify processLoop() by moving loop hint verification into Hints::allowVectorization().
llvm-svn: 244550
2015-08-11 00:35:44 +00:00
Adam Nemet 5b0a479541 [LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC
This was requested by Hal in D11205.

llvm-svn: 244540
2015-08-11 00:09:37 +00:00
Tyler Nowicki 652b0dabe6 Extend late diagnostics to include late test for runtime pointer checks.
This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization.

llvm-svn: 244523
2015-08-10 23:01:55 +00:00
Tyler Nowicki c1a86f5866 Late evaluation of the fast-math vectorization requirement.
This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint. 

llvm-svn: 244489
2015-08-10 19:51:46 +00:00
Tyler Nowicki 4d62f2e039 Modify diagnostic messages to clearly indicate the why interleaving wasn't done.
Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done.

llvm-svn: 244485
2015-08-10 19:14:16 +00:00
Silviu Baranga 61bdc51339 [TTI] Add a hook for specifying per-target defaults for Interleaved Accesses
Summary:
This adds a hook to TTI which enables us to selectively turn on by default
interleaved access vectorization for targets on which we have have performed
the required benchmarking.

Reviewers: rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11901

llvm-svn: 244449
2015-08-10 14:50:54 +00:00
Benjamin Kramer df005cbe19 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
Sanjay Patel 924879ad2c wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994
2015-08-04 15:49:57 +00:00
Wei Mi d6f7252e2e [SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction.
The patch changes the SLPVectorizer::vectorizeStores to choose the immediate
succeeding or preceding candidate for a store instruction when it has multiple
consecutive candidates. In this way it has better chance to find more slp
vectorization opportunities.

Differential Revision: http://reviews.llvm.org/D10445

llvm-svn: 243666
2015-07-30 17:40:39 +00:00
Hans Wennborg 13958b739e Fix -Wextra-semi warnings.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D11400

llvm-svn: 242930
2015-07-22 20:46:11 +00:00
Chandler Carruth a1032a0f7c [PM/AA] Remove the last of the legacy update API from AliasAnalysis as
part of simplifying its interface and usage in preparation for porting
to work with the new pass manager.

Note that this will likely expose that we have dead arguments, members,
and maybe even pass requirements for AA. I'll be cleaning those up in
seperate patches. This just zaps the actual update API.

Differential Revision: http://reviews.llvm.org/D11325

llvm-svn: 242881
2015-07-22 09:49:59 +00:00
Chandler Carruth d86a4f5ec8 [PM/AA] Switch to an early-exit. NFC. This was split out of another
change because the diff is *useless*. I assure you, I just switched to
early-return in this function.

Cleanup in preparation for my next commit, as requested in code review!

llvm-svn: 242880
2015-07-22 09:44:54 +00:00
Wei Mi deee61e434 Create a wrapper pass for BlockFrequencyInfo.
This is useful when we want to do block frequency analysis
conditionally (e.g. only in PGO mode) but don't want to add
one more pass dependence.

Patch by congh.
Approved by dexonsmith.
Differential Revision: http://reviews.llvm.org/D11196

llvm-svn: 242248
2015-07-14 23:40:50 +00:00
Adam Nemet 7cdebac0c8 [LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC
I am planning to add more nested classes inside RuntimePointerCheck so
all these triple-nesting would be hard to follow.

Also rename it to RuntimePointerChecking (i.e. append 'ing').

llvm-svn: 242218
2015-07-14 22:32:44 +00:00
Benjamin Kramer e448b5be05 Avoid using Loop::getSubLoopsVector.
Passes should never modify it, just use the const version. While there
reduce copying in LoopInterchange. No functional change intended.

llvm-svn: 242041
2015-07-13 17:21:14 +00:00
Hal Finkel 9cf58c4095 Move getStrideFromPointer and friends from LoopVectorize to VectorUtils
The following functions are moved from the LoopVectorizer to VectorUtils:

  - getGEPInductionOperand
  - stripGetElementPtr
  - getUniqueCastUse
  - getStrideFromPointer

These used to be static functions in LoopVectorize, but will also be used by
the upcoming loop versioning LICM transformation.

Patch by Ashutosh Nema!

llvm-svn: 241980
2015-07-11 10:52:42 +00:00
Tyler Nowicki 3960d85262 Renamed some uses of unroll to interleave in the vectorizer.
llvm-svn: 241971
2015-07-11 00:31:11 +00:00
Jingyue Wu a277561922 [TTI] BasicTTIImpl assumes no vector registers
Summary:
Following the discussion on r241884, it's more reasonable to assume that a
target has no vector registers by default instead of letting every such
target overrides getNumberOfRegisters.

Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to
return 0 when Vector is true, and partially reverts r241884 which
modifies NVPTXTTIImpl::getNumberOfRegisters.

It also fixes a performance bug in LoopVectorizer. Even if a target has
no vector registers, vectorization may still help ILP. So, we need both
checks to be false before disabling loop vectorization all together.

Reviewers: hfinkel

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11108

llvm-svn: 241942
2015-07-10 21:14:54 +00:00
Sanjay Patel 1319446195 [SLPVectorizer] Try different vectorization factors for store chains
...and set max vector register size based on target 

This patch is based on discussion on the llvmdev mailing list:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html

and also solves:
https://llvm.org/bugs/show_bug.cgi?id=17170

Several FIXME/TODO items are noted in comments as potential improvements.

Differential Revision: http://reviews.llvm.org/D10950

llvm-svn: 241760
2015-07-08 23:40:55 +00:00
Michael Zolotukhin 97295ea7dd [LoopVectorizer] Rename BypassBlock to VectorPH, and CheckBlock to NewVectorPH. NFCI.
llvm-svn: 241742
2015-07-08 21:48:03 +00:00
Michael Zolotukhin 8c874bb2f1 [LoopVectorizer] Restructurize code for emitting RT checks. NFCI.
Place all code corresponding to a run-time check in one place.
Previously we generated some code, then proceeded to a next check, then
finished the code for the first check (like splitting blocks and
generating branches). Now the code for generating a check is
self-contained.

llvm-svn: 241741
2015-07-08 21:47:59 +00:00
Michael Zolotukhin 66f5591f9b [LoopVectorizer] Remove redundant variables PastOverflowCheck and OverflowCheckAnchor. NFCI.
llvm-svn: 241740
2015-07-08 21:47:56 +00:00
Michael Zolotukhin 00345cadd5 [LoopVectorizer] Move some code around to ease further refactoring. NFCI.
llvm-svn: 241739
2015-07-08 21:47:53 +00:00
Michael Zolotukhin 7db3063f87 [LoopVectorizer] Remove redundant variable LastBypassBlock. NFC.
llvm-svn: 241738
2015-07-08 21:47:47 +00:00
Sanjay Patel cc9fad0bf7 remove unnecessary temp variable; NFCI
llvm-svn: 241415
2015-07-05 21:21:47 +00:00
Sanjay Patel a4860f3af2 use range-based for loops; NFCI
llvm-svn: 241412
2015-07-05 20:15:21 +00:00
Sanjay Patel f73f8919ed use range-based for loops; NFCI
llvm-svn: 241395
2015-07-04 19:38:52 +00:00
Alexey Samsonov 958dab71b3 [LoopVectorize] Use ReplaceInstWithInst() helper where appropriate.
This is mostly an NFC, which increases code readability (instead of
saving old terminator, generating new one in front of old, and deleting
old, we just call a function). However, it would additionaly copy
the debug location from old instruction to replacement, which
would help PR23837.

llvm-svn: 241197
2015-07-01 22:18:30 +00:00
David Majnemer 9f3979fd78 [LoopVectorize] Pointer indicies may be wider than the pointer
If we are dealing with a pointer induction variable, isInductionPHI
gives back a step value of Stride / size of pointer.  However, we might
be indexing with a legal type wider than the pointer width.
Handle this by inserting casts where appropriate instead of crashing.

This fixes PR23954.

llvm-svn: 240877
2015-06-27 08:38:17 +00:00
David Blaikie b447ac6435 Move VectorUtils from Transforms to Analysis to correct layering violation
llvm-svn: 240804
2015-06-26 18:02:52 +00:00
Michael Zolotukhin 79ff564ef3 [LoopVectorizer] Fix bailing-out condition for OptForSize case.
With option OptForSize enabled, the Loop Vectorizer is not supposed to
create tail loop. The condition checking that was invalid and was not
matching to the comment above.

Patch by Marianne Mailhot-Sarrasin.

llvm-svn: 240556
2015-06-24 17:26:24 +00:00
Alexander Kornienko f00654e31b Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Michael Zolotukhin 4d8ffa082c [SLP] Vectorize for all-constant entries.
Differential Revision: http://reviews.llvm.org/D10531

llvm-svn: 240144
2015-06-19 17:40:15 +00:00
Alexander Kornienko 70bc5f1398 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
Chandler Carruth ac80dc7532 [PM/AA] Remove the Location typedef from the AliasAnalysis class now
that it is its own entity in the form of MemoryLocation, and update all
the callers.

This is an entirely mechanical change. References to "Location" within
AA subclases become "MemoryLocation", and elsewhere
"AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps
out-of-tree folks update.

llvm-svn: 239885
2015-06-17 07:18:54 +00:00
Tyler Nowicki 27b2c39eb3 Refactor RecurrenceInstDesc
Moved RecurrenceInstDesc into RecurrenceDescriptor to simplify the namespaces.

llvm-svn: 239862
2015-06-16 22:59:45 +00:00
Tyler Nowicki 0a91310c7f Rename Reduction variables/structures to Recurrence.
A reduction is a special kind of recurrence. In the loop vectorizer we currently
identify basic reductions. Future patches will extend this to identifying basic
recurrences.

llvm-svn: 239835
2015-06-16 18:07:34 +00:00
Hao Liu 405f1d1651 [LoopVectorize] Revert the enabling of interleaved memory access in Loop Vectorizor, which was wrongly committed in r239514.
llvm-svn: 239515
2015-06-11 09:18:07 +00:00
Hao Liu 4566d18e89 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true"

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
       %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
       %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)

llvm-svn: 239514
2015-06-11 09:05:02 +00:00
Hao Liu 32c0539691 [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
       a = A[i];         // load of even element
       b = A[i+1];       // load of odd element
       ...               // operations on a, b, c, d
       A[i] = c;         // store of even element
       A[i+1] = d;       // store of odd element
     }

  The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
     %wide.vec = load <8 x i32>, <8 x i32>* %ptr
     %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
     %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>

  The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
     %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> 
     store <8 x i32> %interleaved.vec, <8 x i32>* %ptr

This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. 

llvm-svn: 239291
2015-06-08 06:39:56 +00:00
Chandler Carruth 70c61c1a8a [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and
port it to the new pass manager.

All this does is extract the inner "location" class used by AA into its
own full fledged type. This seems *much* cleaner as MemoryDependence and
soon MemorySSA also use this heavily, and it doesn't make much sense
being inside the AA infrastructure.

This will also make it much easier to break apart the AA infrastructure
into something that stands on its own rather than using the analysis
group design.

There are a few places where this makes APIs not make sense -- they were
taking an AliasAnalysis pointer just to build locations. I'll try to
clean those up in follow-up commits.

Differential Revision: http://reviews.llvm.org/D10228

llvm-svn: 239003
2015-06-04 02:03:15 +00:00
Benjamin Kramer f5e2fc474d Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.


Call sites were found with the ASTMatcher + some semi-automated cleanup.

memberCallExpr(
    argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
    on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
    hasArgument(0, bindTemporaryExpr(
                       hasType(recordDecl(hasNonTrivialDestructor())),
                       has(constructExpr()))),
    unless(isInTemplateInstantiation()))

No functional change intended.

llvm-svn: 238602
2015-05-29 19:43:39 +00:00