Commit Graph

713 Commits

Author SHA1 Message Date
Zachary Turner 3bd47cee78 Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects.
This allows IDEs to recognize the entire set of header files for
each of the core LLVM projects.

Differential Revision: http://reviews.llvm.org/D7526
Reviewed By: Chris Bieneman

llvm-svn: 228798
2015-02-11 03:28:02 +00:00
Bjorn Steinbrink 5ec7522771 Correctly combine alias.scope metadata by a union instead of intersecting
Summary:
The alias.scope metadata represents sets of things an instruction might
alias with. When generically combining the metadata from two
instructions the result must be the union of the original sets, because
the new instruction might alias with anything any of the original
instructions aliased with.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7490

llvm-svn: 228525
2015-02-08 17:07:14 +00:00
Adam Nemet 7206d7a5d2 [LV] Move addRuntimeCheck to LoopAccessAnalysis
This will allow it to be shared with the new Loop Distribution pass.

getFirstInst is currently duplicated across LoopVectorize.cpp and
LoopAccessAnalysis.cpp.  This is a short-term work-around until we figure out
a better solution.

NFC.  (The code moved is adjusted a bit for the name of the Loop member and
that PtrRtCheck is now a reference rather than a pointer.)

llvm-svn: 228418
2015-02-06 18:31:04 +00:00
Adam Nemet 5add5d9d85 [LV] Split off memcheck block really at the first check
I've noticed this while trying to move addRuntimeCheck to LoopAccessAnalysis.

I think that the intention was to early exit from the overflow checking before
the code for the memchecks.  This is the entire reason why we compute
FirstCheckInst but then we don't use that as the splitting instruction but the
final check.  Looks like an oversight.

llvm-svn: 228056
2015-02-03 22:45:39 +00:00
Adam Nemet b60295a525 [LoopVectorize] Fix rebase glitch in r227751
LoopVectorizationLegality::{getNumLoads,getNumStores} should forward to
LoopAccessAnalysis now.

Thanks to Takumi for noticing this!

llvm-svn: 227992
2015-02-03 17:59:53 +00:00
NAKAMURA Takumi c7f8bfc5e5 Resurrect initializers for NumLoads and NumStores in LoopVectorizationLegality to suppress undefined behavior.
FIXME: Shall they be managed in LAA?
llvm-svn: 227940
2015-02-03 03:55:06 +00:00
Erik Eckstein 7330e358f6 Fix: SLPVectorizer crashes with assertion when vectorizing a cmp instruction.
The commit r225977 uncovered this bug. The problem was that the vectorizer tried to
read the second operand of an already deleted instruction.
The bug didn't show up before r225977 because the freed memory still contained a non-null pointer.
With r225977 deletion of instructions is delayed and the read operand pointer is always null.

llvm-svn: 227800
2015-02-02 12:45:34 +00:00
Benjamin Kramer ad960019b0 LoopVectorize: Remove initializer list that blocks MSVC.
llvm-svn: 227766
2015-02-01 21:13:26 +00:00
Adam Nemet 0456327cfb [LoopVectorize] Move LoopAccessAnalysis to its own module
Other than moving code and adding the boilerplate for the new files, the code
being moved is unchanged.

There are a few global functions that are shared with the rest of the
LoopVectorizer.  I moved these to the new module as well (emitLoopAnalysis,
stripIntegerCast, replaceSymbolicStrideSCEV) along with the Report class used
by emitLoopAnalysis.  There is probably room for further improvement in this
area.

I kept DEBUG_TYPE "loop-vectorize" because it's used as the PassName with
emitOptimizationRemarkAnalysis.  This will obviously have to change.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227756
2015-02-01 16:56:15 +00:00
Adam Nemet 3f737d935d [LoopVectorize] Move RuntimePointerCheck under LoopAccessAnalysis
This class needs to remain public because it's used by
LoopVectorizationLegality::addRuntimeCheck.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227755
2015-02-01 16:56:11 +00:00
Adam Nemet c237f68fc9 [LoopVectorize] Pass parameters explicitly to MemoryDepChecker
Rather than using globals use a structure to pass parameters from the
vectorizer.  This prepares the class to be moved outside the LoopVectorizer.

It's not great how all this is passed through in LoopAccessAnalysis but this
is all expected to change once the class start servicing the Loop Distribution
pass as well where some of these parameters make no sense.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227754
2015-02-01 16:56:09 +00:00
Adam Nemet c28ffdcf35 [LoopVectorize] Split out LoopAccessAnalysis from LoopVectorizationLegality
Move the canVectorizeMemory functionality from LoopVectorizationLegality to a
new class LoopAccessAnalysis and forward users.

Currently the collection of the symbolic stride information is kept with
LoopVectorizationLegality and it becomes an input to LoopAccessAnalysis.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227751
2015-02-01 16:56:04 +00:00
Adam Nemet 5985971fa5 [LoopVectorize] Add accessors for Num{Stores,Loads,PredStores} in AccessAnalysis
These members are moving to LoopAccessAnalysis.  The accessors help to hide
this.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227750
2015-02-01 16:56:02 +00:00
Adam Nemet a2abc59dac [LoopVectorize] Rename the Report class to VectorizationReport
This class will become public in the new LoopAccessAnalysis header so the name
needs to be more global.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227749
2015-02-01 16:56:00 +00:00
Adam Nemet ac71cecdc6 [LoopVectorize] Factor out duplicated code into Report::emitAnalysis
The logic in emitAnalysis is duplicated across multiple functions.  This
splits it into a function.  Another use will be added by the patchset.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227748
2015-02-01 16:55:58 +00:00
Adam Nemet 6854b9203a [LoopVectorize] Split out RuntimePointerCheck from LoopVectorizationLegality
RuntimePointerCheck will be used through LoopAccessAnalysis in
LoopVectorizationLegality.  Later in the patchset it will become a local class
of LoopAccessAnalysis.

NFC.  This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.

llvm-svn: 227747
2015-02-01 16:55:56 +00:00
Chandler Carruth fdb9c573f7 [multiversion] Thread a function argument through all the callers of the
getTTI method used to get an actual TTI object.

No functionality changed. This just threads the argument and ensures
code like the inliner can correctly look up the callee's TTI rather than
using a fixed one.

The next change will use this to implement per-function subtarget usage
by TTI. The changes after that should eliminate the need for FTTI as that
will have become the default.

llvm-svn: 227730
2015-02-01 12:01:35 +00:00
Chandler Carruth 705b185f90 [PM] Change the core design of the TTI analysis to use a polymorphic
type erased interface and a single analysis pass rather than an
extremely complex analysis group.

The end result is that the TTI analysis can contain a type erased
implementation that supports the polymorphic TTI interface. We can build
one from a target-specific implementation or from a dummy one in the IR.

I've also factored all of the code into "mix-in"-able base classes,
including CRTP base classes to facilitate calling back up to the most
specialized form when delegating horizontally across the surface. These
aren't as clean as I would like and I'm planning to work on cleaning
some of this up, but I wanted to start by putting into the right form.

There are a number of reasons for this change, and this particular
design. The first and foremost reason is that an analysis group is
complete overkill, and the chaining delegation strategy was so opaque,
confusing, and high overhead that TTI was suffering greatly for it.
Several of the TTI functions had failed to be implemented in all places
because of the chaining-based delegation making there be no checking of
this. A few other functions were implemented with incorrect delegation.
The message to me was very clear working on this -- the delegation and
analysis group structure was too confusing to be useful here.

The other reason of course is that this is *much* more natural fit for
the new pass manager. This will lay the ground work for a type-erased
per-function info object that can look up the correct subtarget and even
cache it.

Yet another benefit is that this will significantly simplify the
interaction of the pass managers and the TargetMachine. See the future
work below.

The downside of this change is that it is very, very verbose. I'm going
to work to improve that, but it is somewhat an implementation necessity
in C++ to do type erasure. =/ I discussed this design really extensively
with Eric and Hal prior to going down this path, and afterward showed
them the result. No one was really thrilled with it, but there doesn't
seem to be a substantially better alternative. Using a base class and
virtual method dispatch would make the code much shorter, but as
discussed in the update to the programmer's manual and elsewhere,
a polymorphic interface feels like the more principled approach even if
this is perhaps the least compelling example of it. ;]

Ultimately, there is still a lot more to be done here, but this was the
huge chunk that I couldn't really split things out of because this was
the interface change to TTI. I've tried to minimize all the other parts
of this. The follow up work should include at least:

1) Improving the TargetMachine interface by having it directly return
   a TTI object. Because we have a non-pass object with value semantics
   and an internal type erasure mechanism, we can narrow the interface
   of the TargetMachine to *just* do what we need: build and return
   a TTI object that we can then insert into the pass pipeline.
2) Make the TTI object be fully specialized for a particular function.
   This will include splitting off a minimal form of it which is
   sufficient for the inliner and the old pass manager.
3) Add a new pass manager analysis which produces TTI objects from the
   target machine for each function. This may actually be done as part
   of #2 in order to use the new analysis to implement #2.
4) Work on narrowing the API between TTI and the targets so that it is
   easier to understand and less verbose to type erase.
5) Work on narrowing the API between TTI and its clients so that it is
   easier to understand and less verbose to forward.
6) Try to improve the CRTP-based delegation. I feel like this code is
   just a bit messy and exacerbating the complexity of implementing
   the TTI in each target.

Many thanks to Eric and Hal for their help here. I ended up blocked on
this somewhat more abruptly than I expected, and so I appreciate getting
it sorted out very quickly.

Differential Revision: http://reviews.llvm.org/D7293

llvm-svn: 227669
2015-01-31 03:43:40 +00:00
Reid Kleckner f0d389c139 Silence "not all paths return a value" warning in MSVC
llvm-svn: 227614
2015-01-30 21:30:57 +00:00
Chandler Carruth 6b1aa9be6a Fix a warning introduced by r227557 due to a default label in a fully
covering switch.

llvm-svn: 227575
2015-01-30 13:30:43 +00:00
Hao Liu 8de4f8b1b5 [LoopVectorize] Induction variables: support arbitrary constant step.
Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support
arbitrary constant steps.
Initial patch by Alexey Volkov. Some bug fixes are added in the following version.

Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193

llvm-svn: 227557
2015-01-30 05:02:21 +00:00
Erik Eckstein 98df6da740 SLPVectorizer: fix wrong scheduling of atomic load/stores.
This fixes PR22306.

llvm-svn: 227077
2015-01-26 09:07:04 +00:00
Aaron Ballman 9ada6cd0f6 Silencing a -Wsign-compare warning (all uses of this constant are within unsigned expressions anyway); NFC.
llvm-svn: 226826
2015-01-22 13:57:41 +00:00
Erik Eckstein 96cfb9c655 SLPVectorizer: add a second limit for the number of alias checks.
Even with the current limit on the number of alias checks, the containing loop has quadratic complexity.
This begins to hurt for blocks containing > 1K load/store instructions.
This commit introduces a limit for the loop count. It reduces the runtime for such very large blocks.

llvm-svn: 226792
2015-01-22 08:20:51 +00:00
Elena Demikhovsky 079b2d8c0c Fixed a bug in masked load/store in reversed loop.
Added a test.

The bug was submitted to bugzilla:
http://llvm.org/bugs/show_bug.cgi?id=22225

llvm-svn: 226791
2015-01-22 08:20:06 +00:00
Karthik Bhat 0b0f4660fa Fix Operandreorder logic in SLPVectorizer to generate longer vectorizable chain.
This patch fixes 2 issues in reorderInputsAccordingToOpcode
1) AllSameOpcodeLeft and AllSameOpcodeRight was being calculated incorrectly resulting in code not being vectorized in few cases.
2) Adds logic to reorder operands if we get longer chain of consecutive loads enabling vectorization. Handled the same for cases were we have AltOpcode.
Thanks Michael for inputs and review.
Review: http://reviews.llvm.org/D6677

llvm-svn: 226547
2015-01-20 06:11:00 +00:00
Erik Eckstein 76cb53a839 SLPVectorizer: limit the number of alias checks to reduce the runtime.
In case of blocks with many memory-accessing instructions, alias checking can take lot of time
(because calculating the memory dependencies has quadratic complexity).
I chose a limit which resulted in no changes when running the benchmarks.

llvm-svn: 226439
2015-01-19 09:33:38 +00:00
Chandler Carruth 691addc25f [PM] Now that LoopInfo isn't in the Pass type hierarchy, it is much
cleaner to derive from the generic base.

Thise removes a ton of boiler plate code and somewhat strange and
pointless indirections. It also remove a bunch of the previously needed
friend declarations. To fully remove these, I also lifted the verify
logic into the generic LoopInfoBase, which seems good anyways -- it is
generic and useful logic even for the machine side.

llvm-svn: 226385
2015-01-18 01:25:51 +00:00
Chandler Carruth 4f8f307c77 [PM] Split the LoopInfo object apart from the legacy pass, creating
a LoopInfoWrapperPass to wire the object up to the legacy pass manager.

This switches all the clients of LoopInfo over and paves the way to port
LoopInfo to the new pass manager. No functionality change is intended
with this iteration.

llvm-svn: 226373
2015-01-17 14:16:18 +00:00
Alexander Kornienko 8c0809c7f8 Replace size method call of containers to empty method where appropriate
This patch was generated by a clang tidy checker that is being open sourced.
The documentation of that checker is the following:

/// The emptiness of a container should be checked using the empty method
/// instead of the size method. It is not guaranteed that size is a
/// constant-time function, and it is generally more efficient and also shows
/// clearer intent to use empty. Furthermore some containers may implement the
/// empty method but not implement the size method. Using empty whenever
/// possible makes it easier to switch to another container in the future.

Patch by Gábor Horváth!

llvm-svn: 226161
2015-01-15 11:41:30 +00:00
Chandler Carruth b98f63dbdb [PM] Separate the TargetLibraryInfo object from the immutable pass.
The pass is really just a means of accessing a cached instance of the
TargetLibraryInfo object, and this way we can re-use that object for the
new pass manager as its result.

Lots of delta, but nothing interesting happening here. This is the
common pattern that is developing to allow analyses to live in both the
old and new pass manager -- a wrapper pass in the old pass manager
emulates the separation intrinsic to the new pass manager between the
result and pass for analyses.

llvm-svn: 226157
2015-01-15 10:41:28 +00:00
NAKAMURA Takumi 24ebfcb619 Update libdeps since TLI was moved from Target to Analysis in r226078.
llvm-svn: 226126
2015-01-15 05:21:00 +00:00
Erik Eckstein 13c4ab89ba reapply: SLPVectorizer: Cache results from memory alias checking.
This speeds up the dependency calculations for blocks with many load/store/call instructions.
Beside the improved runtime, there is no functional change.

Compared to the original commit, this re-applied commit contains a bug fix which ensures that there are
no incorrect collisions in the alias cache.

llvm-svn: 225977
2015-01-14 11:24:47 +00:00
Hao Liu e28d154cd5 Fix a wrong comment in LoopVectorize.
I.E. more than two -> exactly two
Fix a typo function name in LoopVectorize.
  I.E. collectStrideAcccess() -> collectStrideAccess()

llvm-svn: 225935
2015-01-14 03:02:16 +00:00
Julien Lerouge 0473cb5ab7 Fix non-determinism issue in SLP
The issue was introduced in r214638:

+  for (auto &BSIter : BlocksSchedules) {
+    scheduleBlock(BSIter.second.get());
+  }

Because BlocksSchedules is a DenseMap with BasicBlock* keys, blocks are
scheduled in non-deterministic order, resulting in unpredictable IR.

Patch by Daniel Reynaud!

llvm-svn: 225821
2015-01-13 19:45:52 +00:00
Erik Eckstein a168ef753f Revert "SLPVectorizer: Cache results from memory alias checking."
The alias cache has a problem of incorrect collisions in case a new instruction is allocated at the same address as a previously deleted instruction.

llvm-svn: 225790
2015-01-13 14:36:46 +00:00
Erik Eckstein 4a445c047f SLPVectorizer: Cache results from memory alias checking.
This speeds up the dependency calculations for blocks with many load/store/call instructions.
Beside the improved runtime, there is no functional change.

llvm-svn: 225786
2015-01-13 11:37:51 +00:00
Michael Zolotukhin d9ade185b9 Update comment.
llvm-svn: 225553
2015-01-09 22:15:06 +00:00
Michael Zolotukhin 1c38bc12de Remove duplicating code. NFC.
The removed condition is checked in the previous loop.

llvm-svn: 225542
2015-01-09 20:36:19 +00:00
Suyog Sarda 85d0473650 Assumption that "VectorizedValue" will always be an Instruction is not correct.
It can be a constant or a vector argument.

ex :

define i32 @hadd(<4 x i32> %a) #0 {
entry:
  %vecext = extractelement <4 x i32> %a, i32 0
  %vecext1 = extractelement <4 x i32> %a, i32 1
  %add = add i32 %vecext, %vecext1
  %vecext2 = extractelement <4 x i32> %a, i32 2
  %add3 = add i32 %add, %vecext2
  %vecext4 = extractelement <4 x i32> %a, i32 3
  %add5 = add i32 %add3, %vecext4
  ret i32 %add5
}

llvm-svn: 225517
2015-01-09 10:23:48 +00:00
Jiangning Liu 40c1b35292 Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm.
{code}
// loop body
   ... = a[i]          (1)
    ... = a[i+1]       (2)
 .......
a[i+1] = ....          (3)
   a[i] = ...          (4)
{code}

The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed.

For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized.

The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly.

llvm-svn: 225159
2015-01-05 10:08:58 +00:00
Chandler Carruth 66b3130cda [PM] Split the AssumptionTracker immutable pass into two separate APIs:
a cache of assumptions for a single function, and an immutable pass that
manages those caches.

The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.

Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.

For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.

llvm-svn: 225131
2015-01-04 12:03:27 +00:00
Elena Demikhovsky 84d1997b95 Some code improvements in Masked Load/Store.
No functional changes.

llvm-svn: 224986
2014-12-30 14:28:14 +00:00
Elena Demikhovsky fb81b93e17 Masked Load/Store - Changed the order of parameters in intrinsics.
No functional changes.
The documentation is coming.

llvm-svn: 224829
2014-12-25 07:49:20 +00:00
Tilmann Scheller be98f3c4ec [BBVectorize] Remove two more redundant assignments.
Found by the Clang static analyzer.

llvm-svn: 224590
2014-12-19 17:21:38 +00:00
Tilmann Scheller 945ce0ae00 [BBVectorize] Remove redundant assignment.
Found by the Clang static analyzer.

llvm-svn: 224589
2014-12-19 17:13:12 +00:00
Tilmann Scheller b811030b47 [LoopVectorize] Remove redundant assignment.
Found by the Clang static analyzer.

llvm-svn: 224587
2014-12-19 17:02:31 +00:00
Suyog Sarda 43fae93da8 Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads,
and vectorizes it." 

This was re-ordering floating point data types resulting in mismatch in output.

llvm-svn: 224424
2014-12-17 10:34:27 +00:00
Elena Demikhovsky f5b72afff4 Masked Load and Store Intrinsics in loop vectorizer.
The loop vectorizer optimizes loops containing conditional memory
accesses by generating masked load and store intrinsics.
This decision is target dependent.

http://reviews.llvm.org/D6527

llvm-svn: 224334
2014-12-16 11:50:42 +00:00
Elena Demikhovsky 3fcafa2cdb Loop Vectorizer minor changes in the code -
some comments, function names, identation.

Reviewed here: http://reviews.llvm.org/D6527

llvm-svn: 224218
2014-12-14 09:43:50 +00:00