Commit Graph

313 Commits

Author SHA1 Message Date
Arnold Schwaighofer 88e7fddc8c LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvert
We only want to check this once, not for every conditional block in the loop.

No functionality change (except that we don't perform a check redudantly
anymore).

llvm-svn: 181942
2013-05-15 22:38:14 +00:00
Arnold Schwaighofer 09cee97270 LoopVectorize: Fix comments
No functionality change.

llvm-svn: 181862
2013-05-15 02:02:45 +00:00
Arnold Schwaighofer 2d920477a4 LoopVectorize: Hoist conditional loads if possible
InstCombine can be uncooperative to vectorization and sink loads into
conditional blocks. This prevents vectorization.

Undo this optimization if there are unconditional memory accesses to the same
addresses in the loop.

radar://13815763

llvm-svn: 181860
2013-05-15 01:44:30 +00:00
Arnold Schwaighofer 2e7a922a15 LoopVectorize: Handle loops with multiple forward inductions
We used to give up if we saw two integer inductions. After this patch, we base
further induction variables on the chosen one like we do in the reverse
induction and pointer induction case.

Fixes PR15720.

radar://13851975

llvm-svn: 181746
2013-05-14 00:21:18 +00:00
Duncan Sands 0480b9b54e Suppress GCC compiler warnings in release builds about variables that are only
read in asserts.

llvm-svn: 181689
2013-05-13 07:50:47 +00:00
Nadav Rotem 33dcf0a70f SLPVectorizer: Swap LHS and RHS. No functionality change.
llvm-svn: 181684
2013-05-13 05:13:13 +00:00
Nadav Rotem ce42cc6d4d SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users.
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract.

llvm-svn: 181674
2013-05-12 22:58:45 +00:00
Nadav Rotem cbf6d24d50 SLPVectorizer: Clear the map that maps between scalars to vectors after each round of vectorization.
Testcase in the next commit.

llvm-svn: 181673
2013-05-12 22:55:57 +00:00
Arnold Schwaighofer f2305e4467 LoopVectorize: Use the widest induction variable type
Use the widest induction type encountered for the cannonical induction variable.

We used to turn the following loop into an empty loop because we used i8 as
induction variable type and truncated 1024 to 0 as trip count.

int a[1024];
void fail() {
  int reverse_induction = 1023;
  unsigned char forward_induction = 0;
  while ((reverse_induction) >= 0) {
    forward_induction++;
    a[reverse_induction] = forward_induction;
    --reverse_induction;
  }
}

radar://13862901

llvm-svn: 181667
2013-05-11 23:04:28 +00:00
Arnold Schwaighofer a544fefa32 LoopVectorize: Use variable instead of repeated function call
No functionality change intended.

llvm-svn: 181666
2013-05-11 23:04:26 +00:00
Arnold Schwaighofer 1ba84df437 LoopVectorize: Use IRBuilder interface in more places
No functionality change intended.

llvm-svn: 181665
2013-05-11 23:04:24 +00:00
Nadav Rotem cdfb48d2fe SLPVectorizer: Add support for trees with external users.
For example:
bar() {
  int a = A[i];
  int b = A[i+1];
  B[i] = a;
  B[i+1] = b;
  foo(a);  <--- a is used outside the vectorized expression.
}

llvm-svn: 181648
2013-05-10 22:59:33 +00:00
Nadav Rotem 0686e5cb05 Add a debug print
llvm-svn: 181647
2013-05-10 22:56:18 +00:00
Arnold Schwaighofer 2e8c69cf97 LoopVectorizer: Don't assert on the absence of induction variables
A computable loop exit count does not imply the presence of an induction
variable. Scalar evolution can return a value for an infinite loop.

Fixes PR15926.

llvm-svn: 181495
2013-05-09 00:32:18 +00:00
Arnold Schwaighofer 3610139ac5 LoopVectorizer: Improve reduction variable identification
The two nested loops were confusing and also conservative in identifying
reduction variables. This patch replaces them by a worklist based approach.

llvm-svn: 181369
2013-05-07 21:55:37 +00:00
Arnold Schwaighofer e78b76fbed LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.

Should fix PR15882.

llvm-svn: 181286
2013-05-07 04:37:05 +00:00
Nadav Rotem 632b25b743 Update the comment to mention that we use TTI.
llvm-svn: 181178
2013-05-06 03:06:36 +00:00
Benjamin Kramer 3e3f2a4b8d LoopVectorize: Print values instead of pointers in debug output.
llvm-svn: 181157
2013-05-05 14:54:52 +00:00
Arnold Schwaighofer d96e427eac LoopVectorize: Add support for floating point min/max reductions
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.

radar://13723044

llvm-svn: 181144
2013-05-05 01:54:48 +00:00
Arnold Schwaighofer f5183729db LoopVectorizer: Cleanup of miminimum/maximum pattern match code
No need for setting the operands. The pointers are going to be bound by the
matcher.

radar://13723044

llvm-svn: 181142
2013-05-05 01:54:44 +00:00
Arnold Schwaighofer a670a0a3aa LoopVectorize: We don't need an identity element for min/max reductions
We can just use the initial element that feeds the reduction.

  max(max(x, y), z) == max(max(x,y), max(x,z))

radar://13723044

llvm-svn: 181141
2013-05-05 01:54:42 +00:00
Dmitri Gribenko 3238fb7595 Add ArrayRef constructor from None, and do the cleanups that this constructor enables
Patch by Robert Wilhelm.

llvm-svn: 181138
2013-05-05 00:40:33 +00:00
Nadav Rotem 4ce060b3da LoopVectorizer: Add support for if-conversion of PHINodes with 3+ incoming values.
By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements.

We can now vectorize this loop:

int foo(int *A, int *B, int n) {
  for (int i=0; i < n; i++) {
    int x = 9;
    if (A[i] > B[i]) {
      if (A[i] > 19) {
        x = 3;
      } else if (B[i] < 4 ) {
        x = 4;
      } else {
        x = 5;
      }
    }
    A[i] = x;
  }
}

llvm-svn: 181037
2013-05-03 17:42:55 +00:00
Filip Pizlo dec20e43c0 This patch breaks up Wrap.h so that it does not have to include all of
the things, and renames it to CBindingWrapping.h.  I also moved 
CBindingWrapping.h into Support/.

This new file just contains the macros for defining different wrap/unwrap 
methods.

The calls to those macros, as well as any custom wrap/unwrap definitions 
(like for array of Values for example), are put into corresponding C++ 
headers.

Doing this required some #include surgery, since some .cpp files relied 
on the fact that including Wrap.h implicitly caused the inclusion of a 
bunch of other things.

This also now means that the C++ headers will include their corresponding 
C API headers; for example Value.h must include llvm-c/Core.h.  I think 
this is harmless, since the C API headers contain just external function 
declarations and some C types, so I don't believe there should be any 
nasty dependency issues here.

llvm-svn: 180881
2013-05-01 20:59:00 +00:00
Nadav Rotem 9feda6071a Fix a typo
llvm-svn: 180806
2013-04-30 21:04:51 +00:00
Nadav Rotem 13306816fc LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes.
llvm-svn: 180593
2013-04-26 05:08:59 +00:00
Nadav Rotem f43cbeee15 LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers.
llvm-svn: 180570
2013-04-25 19:55:03 +00:00
Arnold Schwaighofer 3fa801fbc2 LoopVectorizer: Change variable name Stride to ConsecutiveStride
This makes it easier to read the code.

No functionality change.

llvm-svn: 180197
2013-04-24 16:16:03 +00:00
Arnold Schwaighofer a6578f7056 LoopVectorize: Scalarize padded types
This patch disables memory-instruction vectorization for types that need padding
bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on
x86_64. Because the load/store vectorization is performed by the bit casting to
a packed vector, which has incompatible memory layout due to the lack of padding
bytes, the present vectorizer produces inconsistent result for memory
instructions of those types.
This patch checks an equality of the AllocSize of a scalar type and allocated
size for each vector element, to ensure that there is no padding bytes and the
array can be read/written using vector operations.

Patch by Daisuke Takahashi!

Fixes PR15758.

llvm-svn: 180196
2013-04-24 16:16:01 +00:00
Arnold Schwaighofer 23a0589bce LoopVectorizer: Bail out if we don't have datalayout we need it
llvm-svn: 180195
2013-04-24 16:15:58 +00:00
Nadav Rotem 71c9d6d333 LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order.
This fixes a miscompilation in FreeBSD's regex library.

llvm-svn: 180121
2013-04-23 17:12:42 +00:00
Pekka Jaaskelainen d3c90e132a Call the potentially costly isAnnotatedParallel() only once.
Made the uniform write test's checks a bit stricter.

llvm-svn: 180119
2013-04-23 16:44:43 +00:00
Pekka Jaaskelainen 6f2f66b63f Refuse to (even try to) vectorize loops which have uniform writes,
even if erroneously annotated with the parallel loop metadata.

Fixes Bug 15794: 
"Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata"

llvm-svn: 180081
2013-04-23 08:08:51 +00:00
Eric Christopher 04d4e9312c Move C++ code out of the C headers and into either C++ headers
or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

llvm-svn: 180063
2013-04-22 22:47:22 +00:00
Nadav Rotem c57af326a4 SLPVectorize: Add support for vectorization of casts.
llvm-svn: 179975
2013-04-21 08:05:59 +00:00
Nadav Rotem 98ad5f0f4c SLPVectorizer: Fix a bug in the code that scans the tree in search of nodes with multiple users.
We did not terminate the switch case and we executed the search routine twice.

llvm-svn: 179974
2013-04-21 07:37:56 +00:00
Nadav Rotem 8aca44a623 Fix PR15800. Do not try to vectorize vectors and structs.
llvm-svn: 179960
2013-04-20 22:29:43 +00:00
Benjamin Kramer 519b2e3087 VecUtils: Clean up uses of dyn_cast.
llvm-svn: 179936
2013-04-20 10:36:17 +00:00
Benjamin Kramer 4600bcc337 SLPVectorizer: Strength reduce SmallVectors to ArrayRefs.
Avoids a couple of copies and allows more flexibility in the clients.

llvm-svn: 179935
2013-04-20 09:49:10 +00:00
Nadav Rotem ce2660d639 SLPVectorizer: Reduce the compile time by eliminating the search for some of the more expensive patterns. After this change will only check basic arithmetic trees that start at cmpinstr.
llvm-svn: 179933
2013-04-20 07:29:34 +00:00
Nadav Rotem 998e035cae refactor tryToVectorizePair to a new method that supports vectorization of lists.
llvm-svn: 179932
2013-04-20 07:22:58 +00:00
Nadav Rotem 890387289e Fix an unused variable warning.
llvm-svn: 179931
2013-04-20 06:40:28 +00:00
Nadav Rotem 83c7c41bc2 SLPVectorizer: Improve the cost model for loop invariant broadcast values.
llvm-svn: 179930
2013-04-20 06:13:47 +00:00
Nadav Rotem dfe1c93ca4 Report the number of stores that were found in the debug message.
llvm-svn: 179929
2013-04-20 05:23:11 +00:00
Nadav Rotem dfd8fcbb00 Fix the header comment.
llvm-svn: 179928
2013-04-20 05:18:51 +00:00
Nadav Rotem 5ed99674e9 Use 64bit arithmetic for calculating distance between pointers.
llvm-svn: 179927
2013-04-20 05:17:47 +00:00
Arnold Schwaighofer 5146940316 LoopVectorizer: Use matcher from PatternMatch.h for the min/max patterns
Also make some static function class functions to avoid having to mention the
class namespace for enums all the time.

No functionality change intended.

llvm-svn: 179886
2013-04-19 21:03:36 +00:00
Dmitri Gribenko d29ea04446 Fix a -Wdocumentation warning
llvm-svn: 179789
2013-04-18 20:13:04 +00:00
Arnold Schwaighofer 4cd6aa110c LoopVectorizer: Recognize min/max reductions
A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y)
sequence in LLVM. If we see such a sequence we can treat it just as any other
commutative binary instruction and reduce it.

This appears to help bzip2 by about 1.5% on an imac12,2.

radar://12960601

llvm-svn: 179773
2013-04-18 17:22:34 +00:00
Benjamin Kramer 8df2cfb858 LoopVectorize: Use a set to avoid longer cycles in the reduction chain too.
Fixes PR15748.

llvm-svn: 179757
2013-04-18 14:29:13 +00:00