288 Commits

Author SHA1 Message Date
Chandler Carruth 04ece8623e Fix a slew of indentation and parameter naming style issues. 80% of
this patch was brought to you by the tool clang-format.

I wanted to fix up the names of constructor parameters because they
followed a bit of an anti-pattern by naming initialisms with CamelCase:
'Tti', 'Se', etc. This appears to have been in an attempt to not overlap
with the names of member variables 'TTI', 'SE', etc. However,
constructor arguments can very safely alias members, and in fact that's
the conventional way to pass in members. I've fixed all of these I saw,
along with making some strange abbreviations such as 'Lp' be simpler 'L',
or 'Lgl' be the word 'Legal'.

However, the code I was touching had indentation and formatting somewhat
all over the map. So I ran clang-format and fixed them.

I also fixed a few other formatting or doxygen formatting issues such as
using ///< on trailing comments so they are associated with the correct
entry.

There is still a lot of room for improvement of the formatting and
cleanliness of this code. ;] At least a few parts of the coding
standards or common practices in LLVM's code aren't followed, the enum
naming rules jumped out at me. I may fix some of these while I'm here,
but not all of them.

llvm-svn: 171719
2013-01-07 09:57:00 +00:00
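The naming point in the commit above (constructor parameters may safely share the names of the members they initialize) can be illustrated with a minimal sketch. The class CostModelSketch and the two empty stand-in structs are hypothetical and are not the LLVM classes of the same name:

```cpp
#include <cassert>

// Hypothetical stand-ins for the analyses a vectorizer class would hold.
struct ScalarEvolution { int Id; };
struct TargetTransformInfo { int Id; };

class CostModelSketch {
public:
  // The parameters deliberately share the members' names. In 'SE(SE)' the
  // name before the parentheses can only denote a member, while the
  // expression inside the parentheses resolves to the constructor parameter.
  CostModelSketch(ScalarEvolution *SE, const TargetTransformInfo &TTI)
      : SE(SE), TTI(TTI) {}

  int seId() const { return SE->Id; }
  int ttiId() const { return TTI.Id; }

private:
  ScalarEvolution *SE;
  const TargetTransformInfo &TTI;
};
```

Inside the constructor body the parameter would shadow the member, so this style works best when the body is empty, as it is here.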
Chandler Carruth 2109f47d97 Fix the enumerator names for ShuffleKind to match the coding standards,
and make its comments doxygen comments.

llvm-svn: 171688
2013-01-07 03:20:02 +00:00
Chandler Carruth d3e73556d6 Move TargetTransformInfo to live under the Analysis library. This no
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
2013-01-07 03:08:10 +00:00
Chandler Carruth 21b3c586ab Switch the loop vectorizer from VTTI to just use TTI directly.
llvm-svn: 171620
2013-01-05 10:16:02 +00:00
Chandler Carruth 7c4f91dea5 Switch the BB vectorizer from the VTTI interface to the simple TTI
interface.

llvm-svn: 171618
2013-01-05 10:05:28 +00:00
Nadav Rotem e9f5bfd5e9 LoopVectorize: Non-commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS.
PR14803.

llvm-svn: 171583
2013-01-05 01:15:47 +00:00
Paul Redmond 874f01e956 Do not vectorize loops with subtraction reductions
Since subtraction does not commute the loop vectorizer incorrectly vectorizes
reductions such as x = A[i] - x.

Disabling for now.

llvm-svn: 171537
2013-01-04 22:10:16 +00:00
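To see why a reduction of the form x = A[i] - x cannot be split across lanes like a sum, here is a small sketch. The function names and the two-lane "vector" are assumptions made for illustration; the naive version mimics what treating the subtraction as an ordinary add-style reduction would compute:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference semantics: every iteration depends on the previous value of X.
int scalarReduce(const std::vector<int> &A, int X) {
  for (int V : A)
    X = V - X;
  return X;
}

// Naive two-lane version: each lane runs the recurrence on every other
// element and the lanes are summed at the end, as one would do for '+'.
int naiveTwoLaneReduce(const std::vector<int> &A, int X) {
  assert(A.size() % 2 == 0 && "sketch handles even lengths only");
  int Lane[2] = {X, 0};
  for (std::size_t I = 0; I < A.size(); I += 2) {
    Lane[0] = A[I] - Lane[0];
    Lane[1] = A[I + 1] - Lane[1];
  }
  return Lane[0] + Lane[1];
}
```

For A = {1, 2, 3, 4} and a start value of 0 the two functions disagree, which is the miscompile the commit guards against.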
Nadav Rotem 93bd30be9b Fix a warning
llvm-svn: 171525
2013-01-04 21:08:44 +00:00
Nadav Rotem e1d5c4b8b9 LoopVectorizer:
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.

llvm-svn: 171469
2013-01-04 17:48:25 +00:00
Nadav Rotem 72f984b596 LoopVectorizer: Add support for loop-unrolling during vectorization for increasing the ILP. At the moment this feature is disabled by default and this commit should not cause any functional changes.
llvm-svn: 171436
2013-01-03 00:52:27 +00:00
Nadav Rotem 4897392360 Avoid vectorization when the function has the "noimplicitfloat" attribute.
llvm-svn: 171429
2013-01-02 23:54:43 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Benjamin Kramer 614b5e85b9 Add IRBuilder::CreateVectorSplat and use it to simplify code.
llvm-svn: 171349
2013-01-01 19:55:16 +00:00
Bill Wendling 698e84fc4f Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
2012-12-30 10:32:01 +00:00
Nadav Rotem 0b37f14371 LoopVectorizer: Fix a bug in the code that updates the loop exiting block.
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.

PR14725

llvm-svn: 171251
2012-12-30 07:47:00 +00:00
Nadav Rotem 5350cd314b If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified.
PR14719.

llvm-svn: 171124
2012-12-26 23:30:53 +00:00
Nadav Rotem 3f7c4f36ba LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1
llvm-svn: 171114
2012-12-26 19:08:17 +00:00
Hal Finkel 30e95a8ebb BBVectorize: Use VTTI to compute costs for intrinsics vectorization
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.

Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.

llvm-svn: 171079
2012-12-26 01:36:57 +00:00
Hal Finkel b44f890133 LoopVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171076
2012-12-25 23:21:29 +00:00
Hal Finkel 2a456112ec BBVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171075
2012-12-25 22:36:08 +00:00
Nadav Rotem 5f7c12cfbd LoopVectorizer: When checking for vectorizable types, also check
the StoreInst operands.

PR14705.

llvm-svn: 171023
2012-12-24 09:14:18 +00:00
Nadav Rotem bd5d1d832a LoopVectorizer: Fix an endless loop in the code that looks for reductions.
The bug was in the code that detects PHIs in if-then-else block sequence.

PR14701.

llvm-svn: 171008
2012-12-24 01:22:06 +00:00
Benjamin Kramer 28691400dd LoopVectorize: Fix accidentally inverted condition.
llvm-svn: 171001
2012-12-23 13:21:41 +00:00
Benjamin Kramer 855ba03408 LoopVectorize: For scalars and void types there is no need to compute vector insert/extract costs.
Fixes an assert during the build of oggenc in the test suite.

llvm-svn: 171000
2012-12-23 13:19:18 +00:00
Nadav Rotem 2cade68025 Loop Vectorizer: Update the cost model of scatter/gather operations and make
them more expensive.

llvm-svn: 170995
2012-12-23 07:23:55 +00:00
Bill Wendling c79e42c5ce Change 'AttrVal' to 'AttrKind' to better reflect that it's a kind of attribute instead of the value of the attribute.
llvm-svn: 170972
2012-12-22 00:37:52 +00:00
Roman Divacky a229186a82 Remove duplicate includes.
llvm-svn: 170902
2012-12-21 17:06:44 +00:00
Nadav Rotem 3b850b70b3 Enable if-conversion.
llvm-svn: 170841
2012-12-21 04:47:54 +00:00
Nadav Rotem a4b53f20a3 BB-Vectorizer: Check the cost of the store pointer type
and not the return type, which is void. A number of test
cases fail after adding the assertion in TTImpl.

llvm-svn: 170828
2012-12-21 01:24:36 +00:00
Nadav Rotem e7785686a5 Fix a bug in the code that checks if we can vectorize loops while using dynamic
memory bound checks.  Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:

for ( k=1 ; k<n ; k++ )
  x[k] = x[k-1] + y[k];

llvm-svn: 170811
2012-12-21 00:07:35 +00:00
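The Livermore loop quoted above carries a dependence from one iteration to the next, so it must not be vectorized. A rough sketch of what goes wrong, assuming a vector width of 2 and hypothetical function names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference semantics of: for (k = 1; k < n; k++) x[k] = x[k-1] + y[k];
std::vector<int> scalarLoop(std::vector<int> X, const std::vector<int> &Y) {
  assert(X.size() == Y.size());
  for (std::size_t K = 1; K < X.size(); ++K)
    X[K] = X[K - 1] + Y[K];
  return X;
}

// Width-2 "vectorized" form: both x[k-1] and x[k] are loaded before either
// store happens, so the second lane reads a stale value.
std::vector<int> chunkedLoop(std::vector<int> X, const std::vector<int> &Y) {
  assert(X.size() == Y.size());
  std::size_t K = 1;
  for (; K + 1 < X.size(); K += 2) {
    int Lo = X[K - 1], Hi = X[K]; // models one wide load of x[k-1..k]
    X[K] = Lo + Y[K];
    X[K + 1] = Hi + Y[K + 1];
  }
  for (; K < X.size(); ++K) // scalar remainder
    X[K] = X[K - 1] + Y[K];
  return X;
}
```

With x = {1, 0, 0, 0, 0} and y = {0, 1, 1, 1, 1} the scalar loop produces a running sum while the chunked loop does not.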
Nadav Rotem 2ababf68d7 LoopVectorize: Fix a bug in the scalarization of instructions.
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.

This fixes External/SPEC/CINT95/099_go/099_go

llvm-svn: 170756
2012-12-20 20:24:40 +00:00
Nadav Rotem 8b20c0a814 Loop Vectorizer: turn-off if-conversion.
llvm-svn: 170708
2012-12-20 17:42:53 +00:00
Nadav Rotem 7bdc45b570 Loop Vectorizer: Enable if-conversion.
llvm-svn: 170632
2012-12-20 02:00:02 +00:00
Nadav Rotem 28408a20c9 whitespace
llvm-svn: 170626
2012-12-20 00:49:56 +00:00
Benjamin Kramer e300004bd5 LoopVectorize: Make iteration over induction variables not depend on pointer values.
MapVector is a bit heavyweight, but I don't see a simpler way. Also the
InductionList is unlikely to be large. This should help 3-stage selfhost
compares (PR14647).

llvm-svn: 170528
2012-12-19 11:09:15 +00:00
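The idea behind the commit above is that iterating a container keyed on pointers in hash or address order makes the output depend on where objects happen to be allocated. A minimal sketch of an insertion-ordered map follows; InsertionOrderedMap is a hypothetical name, and this is an illustration of the concept rather than LLVM's MapVector:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

template <typename K, typename V> class InsertionOrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // values in insertion order

public:
  // Returns the value for Key, appending a default-constructed one the
  // first time the key is seen.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  // Iteration walks the vector, so the order never depends on key values.
  typename std::vector<std::pair<K, V>>::const_iterator begin() const {
    return Entries.begin();
  }
  typename std::vector<std::pair<K, V>>::const_iterator end() const {
    return Entries.end();
  }
  std::size_t size() const { return Entries.size(); }
};
```

The cost is a second container, which matches the commit's remark that such a structure is somewhat heavyweight but acceptable for a small list.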
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Benjamin Kramer f0e5d2f032 LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations.
For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
	movdqa	%xmm0, %xmm1
	movhlps	%xmm1, %xmm1            ## xmm1 = xmm1[1,1]
	paddw	%xmm0, %xmm1
	pshufd	$1, %xmm1, %xmm0        ## xmm0 = xmm1[1,0,0,0]
	paddw	%xmm1, %xmm0
	phaddw	%xmm0, %xmm0
	pextrb	$0, %xmm0, %edx

instead of
	pextrb	$2, %xmm0, %esi
	pextrb	$0, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$4, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$6, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$8, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$10, %xmm0, %edi
	pextrb	$14, %xmm0, %edx
	addb	%sil, %dil
	pextrb	$12, %xmm0, %esi
	addb	%dil, %sil
	addb	%sil, %dl

llvm-svn: 170439
2012-12-18 18:40:20 +00:00
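The log2(vectorsize) scheme from the commit above can be sketched on a plain array. This is only a model, assuming 8 lanes and a hypothetical function name; each step stands in for one shuffle plus one vector add:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Adds the upper half of the live lanes onto the lower half and halves the
// live width each step: 8 -> 4 -> 2 -> 1, i.e. three steps for eight lanes.
int shuffleReduceAdd(std::array<int, 8> V) {
  for (std::size_t Width = V.size() / 2; Width >= 1; Width /= 2)
    for (std::size_t I = 0; I < Width; ++I)
      V[I] += V[I + Width];
  return V[0]; // lane 0 holds the full sum
}
```

A lane-by-lane extraction would instead need seven scalar adds for the same eight elements.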
Nadav Rotem e5e28b48c8 Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today.
llvm-svn: 170157
2012-12-13 23:11:54 +00:00
Nadav Rotem 36510f7194 Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
llvm-svn: 170051
2012-12-13 00:21:03 +00:00
Nadav Rotem 6027bdf898 Fix indentation.
llvm-svn: 170005
2012-12-12 19:39:36 +00:00
Nadav Rotem d0bb22bba3 LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
llvm-svn: 170004
2012-12-12 19:29:45 +00:00
Nadav Rotem 6798a04b15 Fix the ascii drawing that was ruined when I split the H and CPP
llvm-svn: 169955
2012-12-12 01:33:47 +00:00
Nadav Rotem 4fa2e3d5af fix a typo.
llvm-svn: 169953
2012-12-12 01:31:10 +00:00
Nadav Rotem aeb17df802 LoopVectorizer: When -Os is used, vectorize only loops that don't require a tail loop. There is no testcase because I don't know of a way to initialize the loop vectorizer pass without adding an additional hidden flag.
llvm-svn: 169950
2012-12-12 01:11:46 +00:00
Nadav Rotem f707bf4ca3 PR14574. Fix a bug in the code that calculates the mask of the converted PHIs in if-conversion.
llvm-svn: 169916
2012-12-11 21:30:14 +00:00
Nadav Rotem e266efb70b Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars.
llvm-svn: 169904
2012-12-11 18:58:10 +00:00
Nadav Rotem dbb3328194 Fix PR14565. Don't if-convert loops that have switch statements in them.
llvm-svn: 169813
2012-12-11 04:55:10 +00:00
Nadav Rotem 07df5ac1a1 Split the LoopVectorizer into H and CPP.
llvm-svn: 169771
2012-12-10 21:39:02 +00:00
Nadav Rotem 7b5b55c195 Add support for reverse induction variables. For example:
while (i--)
 sum+=A[i];

llvm-svn: 169752
2012-12-10 19:25:06 +00:00
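A reverse induction variable like the one in the commit above can still be processed in wide chunks by walking the array from the high end. The sketch below assumes 4 lanes and integer addition (where reordering is harmless); the function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference semantics of: while (i--) sum += A[i];
int reverseSumScalar(const std::vector<int> &A) {
  int Sum = 0;
  std::size_t I = A.size();
  while (I--)
    Sum += A[I];
  return Sum;
}

// Chunked form: step the index down by 4 and accumulate per lane, then
// finish the low elements that do not fill a chunk one at a time.
int reverseSumChunked(const std::vector<int> &A) {
  int Lane[4] = {0, 0, 0, 0};
  std::size_t I = A.size();
  while (I >= 4) {
    I -= 4;
    for (std::size_t L = 0; L < 4; ++L)
      Lane[L] += A[I + L];
  }
  int Sum = Lane[0] + Lane[1] + Lane[2] + Lane[3];
  while (I--) // scalar remainder
    Sum += A[I];
  return Sum;
}
```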
Paul Redmond 2adb13c100 LoopVectorize: support vectorizing intrinsic calls
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav

llvm-svn: 169711
2012-12-09 20:42:17 +00:00
Paul Redmond f7cd6b391a test commit.
llvm-svn: 169709
2012-12-09 19:46:31 +00:00
Nadav Rotem a8f026e2d4 LoopVectorizer: Increase the number of pointers that can be tested at runtime. If we can't prove statically that the pointers are disjoint, then we add the runtime check.
llvm-svn: 169334
2012-12-04 23:25:24 +00:00
Nadav Rotem 87fc988c5d Enable if-conversion during vectorization.
llvm-svn: 169331
2012-12-04 22:59:52 +00:00
Nadav Rotem 93fa5ef957 Fix a bug in vectorization of if-converted reduction variables. If the
reduction variable is not used outside the loop then we ran into an
endless loop. This change checks if we found the original PHI.

llvm-svn: 169324
2012-12-04 22:40:22 +00:00
Nadav Rotem a10b311aec Add support for reduction variables when IF-conversion is enabled.
llvm-svn: 169288
2012-12-04 18:17:33 +00:00
Nadav Rotem 07674cb566 Give scalar if-converted blocks half the score because they are not always executed due to CF.
llvm-svn: 169223
2012-12-04 07:11:52 +00:00
Nadav Rotem 628c2dba60 Add the last part that is needed for vectorization of if-converted code.
Added the code that actually performs the if-conversion during vectorization.

We can now vectorize this code:

for (int i=0; i<n; ++i) {
  unsigned k = 0;

  if (a[i] > b[i])   // IF inside the loop.
    k = k * 5 + 3;

  a[i] = k;          // k is a phi node that becomes a vector-select.
}

llvm-svn: 169217
2012-12-04 06:15:11 +00:00
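The if-conversion described in the commit above replaces the branch with a select, so every lane executes the same straight-line code. A scalar sketch of both forms, with hypothetical function names, looks like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: the assignment to K is guarded by a branch.
std::vector<unsigned> branchy(std::vector<unsigned> A,
                              const std::vector<unsigned> &B) {
  assert(A.size() == B.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    unsigned K = 0;
    if (A[I] > B[I])
      K = K * 5 + 3;
    A[I] = K;
  }
  return A;
}

// If-converted form: both candidate values are computed unconditionally
// and the comparison result picks one, like a per-lane select.
std::vector<unsigned> ifConverted(std::vector<unsigned> A,
                                  const std::vector<unsigned> &B) {
  assert(A.size() == B.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    unsigned K0 = 0;
    unsigned K1 = K0 * 5 + 3;
    bool Mask = A[I] > B[I];
    A[I] = Mask ? K1 : K0;
  }
  return A;
}
```

The two functions return the same values; the second simply has no control flow inside the loop body.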
NAKAMURA Takumi f99b535fdb LoopVectorize.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 169195
2012-12-04 00:49:34 +00:00
NAKAMURA Takumi 8b07bc579b Fix whitespace.
llvm-svn: 169194
2012-12-04 00:49:28 +00:00
Nadav Rotem d479a57f68 minor renaming, documentation and cleanups.
llvm-svn: 169175
2012-12-03 22:57:09 +00:00
Nadav Rotem fad16be973 IF-conversion: teach the cost-model how to grade if-converted loops.
llvm-svn: 169171
2012-12-03 22:46:31 +00:00
Nadav Rotem eee203d885 Now that we have a basic if-conversion infrastructure we can rename the
"single basic block loop vectorizer" to "innermost loop vectorizer".

llvm-svn: 169158
2012-12-03 21:33:08 +00:00
Nadav Rotem a30aba7a01 Add initial support for IF-conversion. This patch implements the first 1/3,
which is the legality of the if-conversion transformation. The next step is to
implement the cost-model for the if-converted code as well as the
vectorization itself.

llvm-svn: 169152
2012-12-03 21:06:35 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Nadav Rotem 3ae24ee08a minor cleanups
llvm-svn: 169048
2012-11-30 22:37:11 +00:00
Nadav Rotem 6b494be886 Remove the use of LPPassManager. We can remove LPM because we don't need to run any additional loop passes on the new vector loop.
llvm-svn: 169016
2012-11-30 17:27:53 +00:00
Nadav Rotem 8dd6ee8df5 When broadcasting invariant scalars into vectors, place the broadcast code in the preheader.
llvm-svn: 168927
2012-11-29 19:25:41 +00:00
Hal Finkel 88ee6b0082 BBVectorize: Correctly merge SubclassOptionalData
When two instructions are combined into a vector instruction,
the resulting instruction must have the most-conservative flags.

llvm-svn: 168765
2012-11-28 03:04:10 +00:00
Nadav Rotem caf5acfd14 Move the code that uses SCEVs prior to creating the new loops.
llvm-svn: 168601
2012-11-26 19:51:46 +00:00
Nadav Rotem ee7ede76f4 Move the max vector width to a constant parameter. No functionality change.
llvm-svn: 168570
2012-11-25 16:48:08 +00:00
Nadav Rotem ef33b5076c Fix the document style.
llvm-svn: 168569
2012-11-25 16:39:01 +00:00
Nadav Rotem 12192f19eb Refactor the ptr runtime check generation code. No functionality change.
llvm-svn: 168568
2012-11-25 16:27:16 +00:00
Nadav Rotem b15d9fe24d Rename method. No functionality change.
llvm-svn: 168560
2012-11-25 09:13:57 +00:00
Nadav Rotem bf5173460f The induction-pointer work is inspired by a research paper. This commit adds a reference.
llvm-svn: 168559
2012-11-25 09:09:26 +00:00
Nadav Rotem ea3824f160 Add support for pointer induction variables even when there is no integer induction variable.
llvm-svn: 168558
2012-11-25 08:41:35 +00:00
Nadav Rotem c3c07e62e8 LoopVectorizer: Add initial support for pointer induction variables (for example: *dst++ = *src++).
At the moment we still require an integer induction variable (for example: i++).

llvm-svn: 168231
2012-11-17 00:27:03 +00:00
Nadav Rotem 0565b5a279 LoopVectorize: Division reductions generate incorrect code. Remove the part of the code that deals with divs.
Thanks to Paul Redmond for catching this while reviewing the code.

llvm-svn: 168142
2012-11-16 06:51:17 +00:00
Hal Finkel e9740a4692 Replace std::vector -> SmallVector in BBVectorize
For now, this uses 8 on-stack elements. I'll need to do some profiling
to see if this is the best number.

Pointed out by Jakob in post-commit review.

llvm-svn: 167966
2012-11-14 19:53:27 +00:00
Hal Finkel 1b7f0aba48 Fix the largest offender of determinism in BBVectorize
Iterating over the children of each node in the potential vectorization
plan must happen in a deterministic order (because it affects which children
are erased when two children conflict). There was no need for this data
structure to be a map in the first place, so replacing it with a vector
is a small change.

I believe that this was the last remaining instance of iterating over the
elements of a Dense* container where the iteration order could matter.
There are some remaining iterations over std::*map containers where the order
might matter, but so long as the Value* for instructions in a block increase
with the order of the instructions in the block (or decrease) monotonically,
then this will appear to be deterministic.

llvm-svn: 167942
2012-11-14 18:38:11 +00:00
Nadav Rotem a43bcddc8d use the getSplat API. Patch by Paul Redmond.
llvm-svn: 167892
2012-11-14 00:02:13 +00:00
Hal Finkel b51bdd20d3 BBVectorize: Remove temporary assert used for debugging
llvm-svn: 167817
2012-11-13 05:54:54 +00:00
Hal Finkel 2a1df367d4 BBVectorize: Don't vectorize vector-manipulation chains
Don't choose a vectorization plan containing only shuffles and
vector inserts/extracts. Due to imperfections in the cost model,
these can lead to infinite recursion.

llvm-svn: 167811
2012-11-13 03:12:40 +00:00
Hal Finkel 3b79f55c5f BBVectorize: Only some insert element operand pairs are free.
This fixes another infinite recursion case when using target costs.
We can only replace insert element input chains that are pure (end
with inserting into an undef).

llvm-svn: 167784
2012-11-12 23:55:36 +00:00
Hal Finkel 9cf3372931 BBVectorize: Use a more sophisticated check for input cost
The old checking code, which assumed that input shuffles and insert-elements
could always be folded (and thus were free) is too simple.
This can only happen in special circumstances.
Using the simple check caused infinite recursion.

llvm-svn: 167750
2012-11-12 21:21:02 +00:00
Hal Finkel f8326b6052 BBVectorize: Check the types of compare instructions
The pass would previously assert when trying to compute the cost of
compare instructions with illegal vector types (like struct pointers).

llvm-svn: 167743
2012-11-12 19:41:38 +00:00
Hal Finkel ef53df0f9f BBVectorize: Check the input types of shuffles for legality
This fixes a bug where shuffles were being fused such that the
resulting input types were not legal on the target. This would
occur only when both inputs and dependencies were also foldable
operations (such as other shuffles) and there were other connected
pairs in the same block.

llvm-svn: 167731
2012-11-12 14:50:59 +00:00
Nadav Rotem 12930749ab Fix a comment typo and add comments.
llvm-svn: 167684
2012-11-11 05:15:00 +00:00
Nadav Rotem 1cfef3e9ee Add support for memory runtime check. When we can, we calculate array bounds.
If the arrays are found to be disjoint then we run the vectorized version of
the loop. If they are not, we run the scalar code.

llvm-svn: 167608
2012-11-09 07:09:44 +00:00
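The runtime check described in the commit above can be sketched as a bounds comparison followed by a choice between two versions of the loop. The helper names and the width of 4 are assumptions; std::less_equal is used because it gives a well-defined order for pointers into different arrays:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// True when [A, A+N) and [B, B+N) do not overlap.
bool rangesDisjoint(const int *A, const int *B, std::size_t N) {
  std::less_equal<const int *> LE;
  return LE(A + N, B) || LE(B + N, A);
}

// Computes Dst[i] = Src[i] + 1. Returns true if the wide path was taken.
bool addOne(int *Dst, const int *Src, std::size_t N) {
  constexpr std::size_t VF = 4; // assumed vector width
  if (!rangesDisjoint(Dst, Src, N)) {
    for (std::size_t I = 0; I < N; ++I) // scalar fallback keeps the order
      Dst[I] = Src[I] + 1;
    return false;
  }
  std::size_t I = 0;
  for (; I + VF <= N; I += VF) {
    int Tmp[VF];
    for (std::size_t L = 0; L < VF; ++L) // models one wide load and add
      Tmp[L] = Src[I + L] + 1;
    for (std::size_t L = 0; L < VF; ++L) // models one wide store
      Dst[I + L] = Tmp[L];
  }
  for (; I < N; ++I) // scalar remainder
    Dst[I] = Src[I] + 1;
  return true;
}
```

When the destination overlaps the source (for example Dst = Buf + 1, Src = Buf), the check fails and the scalar loop preserves the original iteration order.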
Chandler Carruth acc748b2b5 Fix sign compare warning. Patch by Mahesha HS.
llvm-svn: 167282
2012-11-02 05:24:00 +00:00
Hal Finkel 560545b85f BBVectorize: Use target costs for incoming and outgoing values instead of the depth heuristic.
When target cost information is available, compute explicit costs of inserting and
extracting values from vectors. At this point, all costs are estimated using the
target information, and the chain-depth heuristic is not needed. As a result, it is now, by
default, disabled when using target costs.

llvm-svn: 167256
2012-11-01 21:50:12 +00:00
Hal Finkel c89e75e93e BBVectorize: Account for internal shuffle costs
When target costs are available, use them to account for the costs of
shuffles on internal edges of the DAG of candidate pairs.

Because the shuffle costs here are currently for only the internal edges,
the current target cost model is trivial, and the chain depth requirement
is still in place, I don't yet have an easy test
case. Nevertheless, by looking at the debug output, it does seem to do the right
thing to the effective "size" of each DAG of candidate pairs.

llvm-svn: 167217
2012-11-01 06:26:34 +00:00
Nadav Rotem 4cb8cdab5e LoopVectorize: Preserve NSW, NUW and IsExact flags.
llvm-svn: 167174
2012-10-31 21:40:39 +00:00
Nadav Rotem ec3ab49dda Put the threshold magic number in a variable.
llvm-svn: 167134
2012-10-31 16:22:16 +00:00
Nadav Rotem 1265ea8f8d Remove enum values since they are not used anymore.
llvm-svn: 167131
2012-10-31 16:14:06 +00:00
Hal Finkel 842ad0b621 BBVectorize: Choose pair ordering to minimize shuffles
BBVectorize would, except for loads and stores, always fuse instructions
so that the first instruction (in the current source order) would always
represent the low part of the input vectors and the second instruction
would always represent the high part. This led to too many shuffles
being produced because sometimes the opposite order produces fewer of them.

With this change, BBVectorize tracks the kind of pair connections that form
the DAG of candidate pairs, and uses that information to reorder the pairs to
avoid excess shuffles. Using this information, a future commit will be able
to add VTTI-based shuffle costs to the pair selection procedure. Importantly,
the number of remaining shuffles can now be estimated during pair selection.

There are some trivial instruction reorderings in the test cases, and one
simple additional test where we certainly want to do a reordering to
avoid an unnecessary shuffle.

llvm-svn: 167122
2012-10-31 15:17:07 +00:00
Nadav Rotem ce77ab0c24 LoopVectorize: Do not vectorize loops with tiny constant trip counts.
llvm-svn: 167101
2012-10-31 03:31:07 +00:00
Nadav Rotem ff7889196b Add support for loops that don't start with Zero.
This is important for loops in the LAPACK test-suite.
These loops start at 1 because they are auto-converted from fortran.

llvm-svn: 167084
2012-10-31 00:45:26 +00:00
Nadav Rotem 47a299dcc9 Add documentation.
llvm-svn: 167055
2012-10-30 22:06:26 +00:00
Hal Finkel 08f34ac9dd BBVectorize: Cache fixed-order pairs instead of recomputing pointer info.
Instead of recomputing relative pointer information just prior to fusing,
cache this information (which also needs to be computed during the
candidate-pair selection process). This cuts down on the total number of
SE queries made, and also is a necessary intermediate step on the road toward
including shuffle costs in the pair selection procedure.

No functionality change is intended.

llvm-svn: 167049
2012-10-30 20:17:37 +00:00
Hal Finkel 2eaadd1a2d BBVectorize: Fix a small bug introduced in r167042.
We need to make sure that we take the correct load/store alignment
when the inputs are flipped.

llvm-svn: 167044
2012-10-30 19:47:37 +00:00
Hal Finkel f384890961 BBVectorize: Simplify how input swapping is handled.
Stop propagating the FlipMemInputs variable into the routines that
create the replacement instructions. Instead, just flip the arguments
of those routines. This allows for some associated cleanup (not all
of which is done here). No functionality change is intended.

llvm-svn: 167042
2012-10-30 19:35:29 +00:00
Hal Finkel eac2887143 BBVectorize: Don't make calls to SE when the result is unused.
SE was being called during the instruction-fusion process (when the result
is unreliable, and thus ignored). No functionality change is intended.

llvm-svn: 167037
2012-10-30 18:55:49 +00:00
Nadav Rotem bc21aceb19 LoopVectorize: Add support for write-only loops when the write destination is a single pointer.
Speedup SciMark by 1%

llvm-svn: 167035
2012-10-30 18:36:45 +00:00
Nadav Rotem b3e8e688da LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-ones
while XOR and OR need to start at zero.

llvm-svn: 167032
2012-10-30 18:12:36 +00:00
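The starting values named in the commit above are the identity elements of each operation: the extra lanes of the vector accumulator must not change the result when they are combined at the end. A small sketch with two lanes, using hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class BitOp { And, Or, Xor };

// Identity element: x & ~0 == x, x | 0 == x, x ^ 0 == x.
unsigned identityFor(BitOp Op) { return Op == BitOp::And ? ~0u : 0u; }

unsigned apply(BitOp Op, unsigned L, unsigned R) {
  switch (Op) {
  case BitOp::And: return L & R;
  case BitOp::Or:  return L | R;
  case BitOp::Xor: return L ^ R;
  }
  return 0; // not reached for valid enumerators
}

// Two-lane reduction: lanes start at the identity, are combined after the
// loop, and the original start value is folded in last.
unsigned twoLaneReduce(BitOp Op, const std::vector<unsigned> &A,
                       unsigned Start) {
  unsigned Lane[2] = {identityFor(Op), identityFor(Op)};
  std::size_t I = 0;
  for (; I + 1 < A.size(); I += 2) {
    Lane[0] = apply(Op, Lane[0], A[I]);
    Lane[1] = apply(Op, Lane[1], A[I + 1]);
  }
  unsigned R = apply(Op, Lane[0], Lane[1]);
  for (; I < A.size(); ++I) // scalar remainder
    R = apply(Op, R, A[I]);
  return apply(Op, R, Start);
}
```

Starting an AND lane at zero would force the whole result to zero, which is the kind of bug the commit describes.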
Nadav Rotem 73ddcfe03f LoopVectorizer: change debug prints: Print the module identifier when deciding to vectorize. When deciding not to vectorize do not print the called function name because it can be null.
llvm-svn: 166989
2012-10-30 00:40:39 +00:00
Nadav Rotem 5ad045a8c5 LoopVectorize: Update and preserve the dominator tree info.
llvm-svn: 166970
2012-10-29 21:52:38 +00:00
Hal Finkel bad10bb2f3 Update BBVectorize to use the new VTTI instr. cost interfaces.
The monolithic interface for instruction costs has been split into
several functions. This is the corresponding change. No functionality
change is intended.

llvm-svn: 166865
2012-10-27 04:33:48 +00:00
Nadav Rotem 859366f93f 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result.
2. Change the maximum vectorization width from 4 to 8.
3. A test for both.

llvm-svn: 166864
2012-10-27 04:11:32 +00:00
Nadav Rotem afae78edab Refactor the VectorTargetTransformInfo interface.
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.

llvm-svn: 166836
2012-10-26 23:49:28 +00:00
Hal Finkel 4863448dca Use VTTI->getNumberOfParts in BBVectorize.
This change reflects VTTI refactoring; no functionality change intended.

llvm-svn: 166752
2012-10-26 04:28:06 +00:00
Hal Finkel 41a6ded4a0 Disable generation of pointer vectors by BBVectorize.
Once vector-of-pointer support works, then this can be reverted.

llvm-svn: 166741
2012-10-26 00:05:26 +00:00
Hal Finkel 20a49d6f2c BBVectorize, when using VTTI, should not form types that will be split.
This is needed so that perl's SHA can be compiled (otherwise
BBVectorize takes far too long to find its fixed point).

I'll try to come up with a reduced test case.

llvm-svn: 166738
2012-10-25 23:47:16 +00:00
Hal Finkel cbf9365f4c Begin incorporating target information into BBVectorize.
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

 1. Target information is used to determine if it is profitable to fuse two
    instructions. This means that the cost of the vector operation must not
    be more expensive than the cost of the two original operations. Pairs that
    are not profitable are no longer considered (because current cost information
    is incomplete, for intrinsics for example, equal-cost pairs are still
    considered).

 2. The 'cost savings' computed for the profitability check are also used to
    rank the DAGs that represent the potential vectorization plans. Specifically,
    for nodes of non-trivial depth, the cost savings is used as the node
    weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

llvm-svn: 166716
2012-10-25 21:12:23 +00:00
Nadav Rotem 579042f71b LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1.
llvm-svn: 166715
2012-10-25 21:03:48 +00:00
Nadav Rotem 5ffb049a55 Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649
2012-10-25 00:08:41 +00:00
Nadav Rotem 4a87683a41 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00
Nadav Rotem e4f491e7ee whitespace
llvm-svn: 166622
2012-10-24 20:58:40 +00:00
Nadav Rotem a721b21c64 LoopVectorizer: Add a basic cost model which uses the VTTI interface.
llvm-svn: 166620
2012-10-24 20:36:32 +00:00
Micah Villmow 51e7246cb4 Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!
llvm-svn: 166596
2012-10-24 17:25:11 +00:00
Micah Villmow 6a8f3f9e20 Delete a directory that wasn't supposed to be checked in yet.
llvm-svn: 166591
2012-10-24 17:20:04 +00:00
Nadav Rotem 5bed7b4fad Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news.
PR14158.

llvm-svn: 166491
2012-10-23 18:44:18 +00:00
Nadav Rotem 1c7fc71e69 Don't crash if the load/store pointer is not a GEP.
Fix by Shivarama Rao <Shivarama.Rao@amd.com>

llvm-svn: 166427
2012-10-22 18:27:56 +00:00
Hal Finkel 931c52b84c BBVectorize should ignore unreachable blocks.
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.

llvm-svn: 166423
2012-10-22 18:00:55 +00:00
Nadav Rotem f17cd27362 Rename a variable.
llvm-svn: 166410
2012-10-22 04:53:05 +00:00
Nadav Rotem 03011f1393 Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector).
llvm-svn: 166409
2012-10-22 04:38:00 +00:00
Nadav Rotem c9741887c3 Update the loop vectorizer docs.
llvm-svn: 166408
2012-10-22 03:52:53 +00:00
Anders Carlsson 7d8991c778 Avoid an extra hash lookup when inserting a value into the widen map.
llvm-svn: 166395
2012-10-21 16:26:35 +00:00
Jakub Staszak baa063bd03 Simplify code. No functionality change.
llvm-svn: 166393
2012-10-21 15:36:03 +00:00
Jakub Staszak 9694ab8ffa Simplify code. No functionality change.
llvm-svn: 166392
2012-10-21 15:29:19 +00:00
Nadav Rotem fe88c67161 Fix a bug in the vectorization of wide load/store operations.
We used a SCEV to detect that A[X] is consecutive. We assumed that X was
the induction variable. But X can be any expression that uses the induction
for example: X = i + 2;

llvm-svn: 166388
2012-10-21 06:49:10 +00:00
Nadav Rotem c1679a95b6 Add support for reduction variables that do not start at zero.
This is important for nested-loop reductions such as the one below.

In the innermost loop, the reduction variable does not start at zero:

for (i = 0 .. n)
 for (j = 0 .. m)
  sum += ...

llvm-svn: 166387
2012-10-21 05:52:51 +00:00
Nadav Rotem 364bd30641 Document change. Describe the pass and some papers that inspired the design of the pass.
llvm-svn: 166386
2012-10-21 04:04:25 +00:00
Nadav Rotem 7e1084d36c Vectorizer: fix a bug in the classification of induction/reduction phis.
llvm-svn: 166384
2012-10-21 02:38:01 +00:00
Nadav Rotem e5dc57d4fb Fix an infinite loop in the loop-vectorizer.
PR14134.

llvm-svn: 166379
2012-10-20 20:45:01 +00:00
Nadav Rotem d189b82a9b Vectorize: teach canVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x.
If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then
it is unsafe to vectorize it because we may hit an ordering issue.

llvm-svn: 166371
2012-10-20 08:26:33 +00:00
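The ordering issue mentioned in the commit above shows up as soon as the index array contains duplicates. The sketch below assumes a width of 2 and models the vector form as gather, add, scatter; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference semantics of: for (i) A[B[i]] += X;
std::vector<int> scalarIndexedAdd(std::vector<int> A,
                                  const std::vector<std::size_t> &B, int X) {
  for (std::size_t I = 0; I < B.size(); ++I)
    A[B[I]] += X;
  return A;
}

// Width-2 form: both elements are gathered before either is written back,
// so two lanes that hit the same index lose one of the updates.
std::vector<int> gatherScatterAdd(std::vector<int> A,
                                  const std::vector<std::size_t> &B, int X) {
  assert(B.size() % 2 == 0 && "sketch handles even lengths only");
  for (std::size_t I = 0; I < B.size(); I += 2) {
    int Lo = A[B[I]], Hi = A[B[I + 1]]; // gather
    A[B[I]] = Lo + X;                   // scatter lane 0
    A[B[I + 1]] = Hi + X;               // scatter lane 1
  }
  return A;
}
```

With consecutive indices (the A[i] += x case) no two lanes can alias, which is why that form is safe to vectorize.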
Nadav Rotem 3940bafb54 Fix a typo
llvm-svn: 166367
2012-10-20 05:03:27 +00:00
Nadav Rotem f70ca3ceed Vectorizer: refactor the memory checks to a new function. No functionality change.
llvm-svn: 166366
2012-10-20 04:59:06 +00:00
Nadav Rotem 550f7f7e19 LoopVectorize: Keep the IRBuilder on the stack.
llvm-svn: 166354
2012-10-19 23:27:19 +00:00
Nadav Rotem 4f7f72702b Vectorizer: Add support for loop reductions.
For example:

  for (i=0; i<n; i++)
   sum += A[i] +  B[i] + i;

llvm-svn: 166351
2012-10-19 23:05:40 +00:00
Benjamin Kramer 319cb771b2 LoopVectorize: Keep the IRBuilder on the stack.
No functionality change.

llvm-svn: 166274
2012-10-19 08:42:02 +00:00
Nadav Rotem ced93f3a05 vectorizer: Add support for reading and writing from the same memory location.
llvm-svn: 166255
2012-10-19 01:24:18 +00:00
Nadav Rotem 1667324f22 cleanup the comment.
llvm-svn: 166247
2012-10-18 23:21:01 +00:00
Nadav Rotem d45a6b93df fix a naming typo
llvm-svn: 166232
2012-10-18 21:45:31 +00:00
Nadav Rotem f8a1396882 Avoid reconstructing the pointer set when searching for duplicated read/write pointers.
llvm-svn: 166205
2012-10-18 18:34:50 +00:00
Nadav Rotem a031c57417 When looking for a vector representation of a scalar, do a single lookup. Also, cache the result of the broadcast instruction.
No functionality change.

llvm-svn: 166191
2012-10-18 17:31:49 +00:00
Nadav Rotem 7a1728094c remove unused variable to fix a warning.
llvm-svn: 166170
2012-10-18 06:09:21 +00:00
Nadav Rotem 642efbcdd8 Remove the use of dominators and AA.
llvm-svn: 166167
2012-10-18 05:33:02 +00:00
Nadav Rotem b52f717411 Vectorizer: Add support for loops with an unknown count. For example:
for (i=0; i<n; i++) {
  a[i] = b[i+1] + c[i+3];
}

llvm-svn: 166165
2012-10-18 05:29:12 +00:00
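When the trip count is only known at run time, a common shape is a wide loop over the largest multiple of the vector width followed by a scalar loop for the leftover iterations. The sketch below applies that shape to the loop from the commit above; the function name and the width of 4 are assumptions, not details taken from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes a[i] = b[i+1] + c[i+3] for i in [0, N), where N is a run-time
// value. Returns how many elements the wide loop handled.
std::size_t vectorizeWithTail(std::vector<int> &A, const std::vector<int> &B,
                              const std::vector<int> &C, std::size_t N) {
  constexpr std::size_t VF = 4; // assumed vector width
  assert(A.size() >= N && B.size() >= N + 1 && C.size() >= N + 3);
  const std::size_t VecEnd = N - N % VF; // largest multiple of VF <= N
  std::size_t I = 0;
  for (; I < VecEnd; I += VF)
    for (std::size_t L = 0; L < VF; ++L) // models one wide operation
      A[I + L] = B[I + L + 1] + C[I + L + 3];
  for (; I < N; ++I) // scalar tail for the remaining N % VF elements
    A[I] = B[I + 1] + C[I + 3];
  return VecEnd;
}
```

For N = 10 the wide loop covers the first 8 elements and the scalar tail handles the last 2.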
NAKAMURA Takumi 7857415785 LoopVectorize.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 166153
2012-10-17 23:40:15 +00:00
Jakub Staszak 68e5dfddcb Remove redundant SetInsertPoint call.
llvm-svn: 166138
2012-10-17 23:06:37 +00:00