Commit Graph

1075 Commits

Author SHA1 Message Date
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Matthew Simpson 3f69195b9e [SLP] Make RecursionMaxDepth a command line option (NFC)
llvm-svn: 278343
2016-08-11 15:28:45 +00:00
Benjamin Kramer b7d3311c77 Move helpers into anonymous namespaces. NFC.
llvm-svn: 277916
2016-08-06 11:13:10 +00:00
Michael Kuperstein 3ceac2bbd5 [LV, X86] Be more optimistic about vectorizing shifts.
Shifts with a uniform but non-constant count were considered very expensive to
vectorize, because the splat of the uniform count and the shift would tend to
appear in different blocks. That made the splat invisible to ISel, and we'd
scalarize the shift at codegen time.

Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
are able to select the appropriate vector shifts. This updates the cost model to
to take this into account by making shifts by a uniform cheap again.

Differential Revision: https://reviews.llvm.org/D23049

llvm-svn: 277782
2016-08-04 22:48:03 +00:00
Alina Sbirlea 6f937b1144 LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments.
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.

Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
- x86 failing tests either did not have the right target, or the right alignment.
- NVPTX failing tests did not have the right alignment.
- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info
  considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks
  for a maximum size allowed and relies on the next condition checking for %4 for correctness.
  This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).

Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.

Reviewers: arsenm, jlebar, tstellarAMD

Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23068

llvm-svn: 277735
2016-08-04 16:38:44 +00:00
Gil Rapaport e7a8fab275 [Loop Vectorizer] Move store-predication into its own function, remove obsolete comment (NFC)
Differential Revision: https://reviews.llvm.org/D23013

llvm-svn: 277595
2016-08-03 13:23:43 +00:00
Wei Mi dc7001afb2 [LoopVectorize] Change comment for isOutOfScope in collectLoopUniforms, NFC
Update comment for isOutOfScope and add a testcase for uniform value being used
out of scope.

Differential Revision: https://reviews.llvm.org/D23073

llvm-svn: 277515
2016-08-02 20:27:49 +00:00
Matthew Simpson 18d8898317 [LV] Generate both scalar and vector integer induction variables
This patch enables the vectorizer to generate both scalar and vector versions
of an integer induction variable for a given loop. Previously, we only
generated a scalar induction variable if we knew all its users were going to be
scalar. Otherwise, we generated a vector induction variable. In the case of a
loop with both scalar and vector users of the induction variable, we would
generate the vector induction variable and extract scalar values from it for
the scalar users. With this patch, we now generate both versions of the
induction variable when there are both scalar and vector users and select which
version to use based on whether the user is scalar or vector.

Differential Revision: https://reviews.llvm.org/D22869

llvm-svn: 277474
2016-08-02 15:25:16 +00:00
Matthew Simpson 58f562887b [LV] Untangle the concepts of uniform and scalar
This patch refactors the logic in collectLoopUniforms and
collectValuesToIgnore, untangling the concepts of "uniform" and "scalar". It
adds isScalarAfterVectorization along side isUniformAfterVectorization to
distinguish the two. Known scalar values include those that are uniform,
getelementptr instructions that won't be vectorized, and induction variables
and induction variable update instructions whose users are all known to be
scalar.

This patch includes the following functional changes:

- In collectLoopUniforms, we mark uniform the pointer operands of interleaved
  accesses. Although non-consecutive, these pointers are treated like
  consecutive pointers during vectorization.

- In collectValuesToIgnore, we insert a value into VecValuesToIgnore if it
  isScalarAfterVectorization rather than isUniformAfterVectorization. This
  differs from the previous functionaly in that we now add getelementptr
  instructions that will not be vectorized into VecValuesToIgnore.

This patch also removes the ValuesNotWidened set used for induction variable
scalarization since, after the above changes, it is now equivalent to
isScalarAfterVectorization.

Differential Revision: https://reviews.llvm.org/D22867

llvm-svn: 277460
2016-08-02 14:29:41 +00:00
Benjamin Kramer a0053cc0af [LoadStoreVectorizer] Don't use a linear walk for an existence check in a SmallPtrSet
No functionality change intended.

llvm-svn: 277436
2016-08-02 09:35:17 +00:00
Matthew Simpson 228f973189 [LV] Move isGatherOrScatterLegal into LoopVectorizationLegality (NFC)
llvm-svn: 277376
2016-08-01 20:11:25 +00:00
Matthew Simpson 1ce88ff6a7 [LV] Use getPointerOperand helper where appropriate (NFC)
llvm-svn: 277375
2016-08-01 20:08:09 +00:00
Alina Sbirlea 64acfb57bd Revert r277038 until clearing why tests fail.
llvm-svn: 277039
2016-07-28 21:35:20 +00:00
Alina Sbirlea 7116eb6e16 Remove TargetBaseAlign. Keep alignment for stack adjustments.
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.

Reviewers: llvm-commits, jlebar

Subscribers: mzolotukhin, arsenm

Differential Revision: https://reviews.llvm.org/D22936

llvm-svn: 277038
2016-07-28 21:26:40 +00:00
Wei Mi 315bb33f27 Fix the assertion error in collectLoopUniforms caused by empty Worklist before expanding.
Contributed-by: David Callahan

Differential Revision: https://reviews.llvm.org/D22886

llvm-svn: 276943
2016-07-27 23:53:58 +00:00
Justin Lebar 37f4e0e096 [LSV] Use Instruction*s rather than Value*s where possible.
Summary:
Given the crash in D22878, this patch converts the load/store vectorizer
to use explicit Instruction*s wherever possible.  This is an overall
simplification and should be an improvement in safety, as we have fewer
naked cast<>s, and now where we use Value*, we really mean something
different from Instruction*.

This patch also gets rid of some cast<>s around Value*s returned by
Builder.  Given that Builder constant-folds everything, we can't assume
much about what we get out of it.

One downside of this patch is that we have to copy our chain before
calling propagateMetadata.  But I don't think this is a big deal, as our
chains are very small (usually 2 or 4 elems).

Reviewers: asbirlea

Subscribers: mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22887

llvm-svn: 276938
2016-07-27 23:06:00 +00:00
Justin Lebar 23a9686011 [LSV] Don't assume that bitcast ops are Instructions.
Summary:
When we ask the builder to create a bitcast on a constant, we get back a
constant, not an instruction.

Reviewers: asbirlea

Subscribers: jholewinski, mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22878

llvm-svn: 276922
2016-07-27 21:45:48 +00:00
Elena Demikhovsky 376a18bd92 [Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this:
float *A;
float x = init;
for (int i=0; i < N; ++i) {
  A[i] = x;
  x -= fp_inc;
}

The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.

Differential Revision: https://reviews.llvm.org/D21330

llvm-svn: 276554
2016-07-24 07:24:54 +00:00
Michael Kuperstein 38e7298093 [SLPVectorizer] Vectorize reverse-order loads in horizontal reductions
When vectorizing a tree rooted at a store bundle, we currently try to sort the
stores before building the tree, so that the stores can be vectorized. For other
trees, the order of the root bundle - which determines the order of all other
bundles - is arbitrary. That is bad, since if a leaf bundle of consecutive loads
happens to appear in the wrong order, we will not vectorize it.

This is partially mitigated when the root is a binary operator, by trying to
build a "reversed" tree when that's considered profitable. This patch extends the
workaround we have for binops to trees rooted in a horizontal reduction.

This fixes PR28474.

Differential Revision: https://reviews.llvm.org/D22554

llvm-svn: 276477
2016-07-22 21:28:48 +00:00
Matthew Simpson 102729cf1b [LV] Move vector int induction update to end of latch
This patch moves the update instruction for vectorized integer induction phi
nodes to the end of the latch block. This ensures consistent placement of all
induction updates across all the kinds of int inductions we create (scalar,
splat vector, or vector phi).

Differential Revision: https://reviews.llvm.org/D22416

llvm-svn: 276339
2016-07-21 21:20:15 +00:00
Adam Nemet 7cfd5971ab [OptDiag,LV] Add hotness attribute to applied-optimization remarks
Test coverage is provided by modifying the function in the FP-math
testcase that we are allowed to vectorize.

llvm-svn: 276223
2016-07-21 01:07:13 +00:00
Adam Nemet 0e0e2d5d26 [OptDiag,LV] Add hotness attribute to the derived analysis remarks
This includes FPCompute and Aliasing.

Testcase is based on no_fpmath.ll.

llvm-svn: 276211
2016-07-20 23:50:32 +00:00
Adam Nemet 5b3a5cf6b0 [OptDiag,LV] Add hotness attribute to analysis remarks
The earlier change added hotness attribute to missed-optimization
remarks.  This follows up with the analysis remarks (the ones explaining
the reason for the missed optimization).

llvm-svn: 276192
2016-07-20 21:44:26 +00:00
Justin Lebar a272c12b73 [LSV] Don't move stores across may-load instrs, and loosen restrictions on moving loads.
Summary:
Previously we wouldn't move loads/stores across instructions that had
side-effects, where that was defined as may-write or may-throw.  But
this is not sufficiently restrictive: Stores can't safely be moved
across instructions that may load.

This patch also adds a DEBUG check that all instructions in our chain
are either loads or stores.

Reviewers: asbirlea

Subscribers: llvm-commits, jholewinski, arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22547

llvm-svn: 276171
2016-07-20 20:07:37 +00:00
Justin Lebar 62b03e344e [LSV] Vectorize up to side-effecting instructions.
Summary:
Previously if we had a chain that contained a side-effecting
instruction, we wouldn't vectorize it at all.  Now we'll vectorize
everything that comes before the side-effecting instruction.

Reviewers: asbirlea

Subscribers: arsenm, jholewinski, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22536

llvm-svn: 276170
2016-07-20 20:07:34 +00:00
Adam Nemet 67c8929a2c [LV] Add hotness attribute to missed-optimization remarks
The new OptimizationRemarkEmitter analysis pass is hooked up to both new
and old PM passes.

llvm-svn: 276080
2016-07-20 04:03:43 +00:00
Justin Lebar 6114b37838 [LSV] Don't assume that loads/stores appear in address order in the BB.
Summary:
getVectorizablePrefix previously didn't work properly in the face of
aliasing loads/stores.  It unwittingly assumed that the loads/stores
appeared in the BB in address order.  If they didn't, it would do the
wrong thing.

Reviewers: asbirlea, tstellarAMD

Subscribers: arsenm, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22535

llvm-svn: 276072
2016-07-20 00:55:12 +00:00
Justin Lebar 8778c62629 [LSV] Insert stores at the right point.
Summary:
Previously, the insertion point for stores was the last instruction in
Chain *before calling getVectorizablePrefixEndIdx*.  Thus if
getVectorizablePrefixEndIdx didn't return Chain.size(), we still would
insert at the last instruction in Chain.

This patch changes our internal API a bit in an attempt to make it less
prone to this sort of error.  As a result, we end up recalculating the
Chain's boundary instructions, but I think worrying about the speed hit
of this is a premature optimization right now.

Reviewers: asbirlea, tstellarAMD

Subscribers: mzolotukhin, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D22534

llvm-svn: 276056
2016-07-19 23:19:20 +00:00
Justin Lebar 2cf2c22870 [LSV] Use make_range, and reformat a DEBUG message. NFC
Summary:
The DEBUG message was hard to read because two Values were being printed
on the same line with only the delimiter "aliases".  This change makes
us print each Value on its own line.

Reviewers: asbirlea

Subscribers: llvm-commits, arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22533

llvm-svn: 276055
2016-07-19 23:19:18 +00:00
Justin Lebar 4ee8a2d024 [LSV] Nix two global (ish) variables in the LoadStoreVectorizer. NFC
Reviewers: asbirlea

Subscribers: mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22532

llvm-svn: 276054
2016-07-19 23:19:16 +00:00
Wei Mi 79997a24d7 Recommit the patch "Use uniforms set to populate VecValuesToIgnore".
For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.

The change will make the vector RegUsages estimation less conservative.

Differential Revision: https://reviews.llvm.org/D20474

The recommit fixed the testcase global_alias.ll.

llvm-svn: 275936
2016-07-19 00:50:43 +00:00
Wei Mi f9afff71a2 Revert rL275912.
llvm-svn: 275915
2016-07-18 21:14:43 +00:00
Wei Mi 1fd25726af Use uniforms set to populate VecValuesToIgnore.
For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.

The change will make the vector RegUsages estimation less conservative.

Differential Revision: https://reviews.llvm.org/D20474

llvm-svn: 275912
2016-07-18 20:59:53 +00:00
Matthew Simpson f855346f0b [LV] Swap A and B in interleaved access analysis (NFC)
This patch swaps A and B in the interleaved access analysis and clarifies
related comments. The algorithm is more intuitive if we let access A precede
access B in program order rather than the reverse. This change was requested in
the review of D19984.

llvm-svn: 275567
2016-07-15 15:22:43 +00:00
Matthew Simpson 96e881deb5 [LV] Rename StrideAccesses to AccessStrideInfo (NFC)
We now collect all accesses with a constant stride, not just the ones with a
stride greater than one. This change was requested in the review of D19984.

llvm-svn: 275473
2016-07-14 21:05:08 +00:00
Matthew Simpson 65ca32b83c [LV] Allow interleaved accesses in loops with predicated blocks
This patch allows the formation of interleaved access groups in loops
containing predicated blocks. However, the predicated accesses are prevented
from forming groups.

Differential Revision: https://reviews.llvm.org/D19694

llvm-svn: 275471
2016-07-14 20:59:47 +00:00
Matthew Simpson 3c3b4a257b [LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions
This patch prevents increases in the number of instructions, pre-instcombine,
due to induction variable scalarization. An increase in instructions can lead
to an increase in the compile-time required to simplify the induction
variables. We now maintain a new map for scalarized induction variables to
prevent us from converting between the scalar and vector forms.

This patch should resolve compile-time regressions seen after r274627.

llvm-svn: 275419
2016-07-14 14:36:06 +00:00
Alina Sbirlea 640a61cd8b Extended LoadStoreVectorizer to vectorize subchains.
Summary:
LSV used to abort vectorizing a chain for interleaved load/store accesses that alias.
Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain.

Reviewers: llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22119

llvm-svn: 275317
2016-07-13 21:20:01 +00:00
David Majnemer 81d877b392 [LoopVectorize] Further cleanups
No functional change is intended, just a minor cleanup.

llvm-svn: 275243
2016-07-13 03:24:38 +00:00
Michael Kuperstein 51078b81ca [LV] Do not invalidate use-lists we're iterating over.
Should make sanitizers happier.

llvm-svn: 275230
2016-07-12 23:11:34 +00:00
Michael Kuperstein a99c46cc73 [LV] Remove wrong assumption about LCSSA
The LCSSA pass itself will not generate several redundant PHI nodes in a single
exit block. However, such redundant PHI nodes don't violate LCSSA form, and may
be introduced by passes that preserve LCSSA, and/or preserved by the LCSSA pass
itself. So, assuming a single PHI node per exit block is not safe.

llvm-svn: 275217
2016-07-12 21:24:06 +00:00
David Majnemer 9330b78431 [LoopVectorize] Assorted cleanups
Use range-based for loops instead of doing everything manually.
Use auto when appropriate.

No functional change is intended.

llvm-svn: 275205
2016-07-12 19:35:15 +00:00
Alina Sbirlea cbc6ac2afd Correct ordering of loads/stores.
Summary:
Aiming to correct the ordering of loads/stores. This patch changes the
insert point for loads to the position of the first load.
It updates the ordering method for loads to insert before, rather than after.

Before this patch the following sequence:
"load a[1], store a[1], store a[0], load a[2]"
Would incorrectly vectorize to "store a[0,1], load a[1,2]".
The correctness check was assuming the insertion point for loads is at
the position of the first load, when in practice it was at the last
load. An alternative fix would have been to invert the correctness check.
The current fix changes insert position but also requires reordering of
instructions before the vectorized load.

Updated testcases to reflect the changes.

Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22071

llvm-svn: 275117
2016-07-11 22:34:29 +00:00
Alina Sbirlea 327955e057 Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.

Reviewers: llvm-commits, jlebar

Subscribers: arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21935

llvm-svn: 275100
2016-07-11 20:46:17 +00:00
Benjamin Kramer 4d09892e9a Give helper classes/functions internal linkage. NFC.
llvm-svn: 275014
2016-07-10 11:28:51 +00:00
Sean Silva db90d4d9c1 [PM] Port LoopVectorize to the new PM.
llvm-svn: 275000
2016-07-09 22:56:50 +00:00
Sean Silva 0dacbd8f31 [PM] Fix a think-o. mv {Scalar,Vectorize}/SLPVectorize.h
llvm-svn: 274960
2016-07-09 03:11:29 +00:00
Xinliang David Li 7853c1dd73 Rename LoopAccessAnalysis to LoopAccessLegacyAnalysis /NFC
llvm-svn: 274927
2016-07-08 20:55:26 +00:00