Commit Graph

53 Commits

Author SHA1 Message Date
Nicolas Vasilache 047400ed82 [mlir][LLVMIR] Add support for InlineAsmOp
The InlineAsmOp mirrors the underlying LLVM semantics with a notable
exception: the embedded `asm_string` is not allowed to define or reference
any symbol or any global variable: only the operands of the op may be read,
written, or referenced.
Attempting to define or reference any symbol or any global variable is
considered undefined behavior at this time.

The asm dialect syntax is currently specified with an integer (0 [default] for the AT&T dialect, 1 for the Intel dialect) to circumvent the ODS limitation on string enums.

Translation to LLVM is provided; it highlights the fact that the asm constraints string must be well-formed with respect to in/out operands. No check is performed on the asm_string itself.
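
For illustration only, a minimal sketch of a use (function, operand names, types and the exact assembly format are assumptions, not taken from the patch):

```
llvm.func @copy(%arg0: !llvm.i32) -> !llvm.i32 {
  // Only the operand %arg0 is read; no symbols or globals are referenced.
  %0 = llvm.inline_asm "mov $1, $0", "=r,r" %arg0 : (!llvm.i32) -> !llvm.i32
  llvm.return %0 : !llvm.i32
}
```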

An InlineAsm instruction in LLVM is a special call operation to a function that is constructed on the fly.
It does not fit the current model of MLIR calls with symbols.
As a consequence, the current implementation constructs the function type in ModuleTranslation.cpp.
This should be refactored in the future.

The mlir-cpu-runner is augmented with the global initialization of the X86 asm parser to allow proper execution in JIT mode. Previously, only the X86 asm printer was initialized.

Differential revision: https://reviews.llvm.org/D92166
2020-11-30 08:32:02 +00:00
Stephan Herhut 87568c07f0 [mlir][linalg] Mark linalg.yield as ReturnLike
This change is required so that bufferization can properly identify
the linalg.yield as a terminator with an associated parent op.
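
For reference, a minimal sketch of the op in context (names and maps are illustrative): `linalg.yield` terminates the body region of ops such as `linalg.generic`, which is the associated parent op bufferization needs to see.

```
func @copy(%in: memref<8xf32>, %out: memref<8xf32>) {
  linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>,
                                   affine_map<(d0) -> (d0)>],
                  iterator_types = ["parallel"]}
      ins(%in : memref<8xf32>) outs(%out : memref<8xf32>) {
  ^bb0(%a: f32, %b: f32):
    // linalg.yield is the terminator of the body region; its parent op is
    // the surrounding linalg.generic.
    linalg.yield %a : f32
  }
  return
}
```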

Differential Revision: https://reviews.llvm.org/D92173
2020-11-26 14:44:08 +01:00
Eugene Zhulenev f2df67e2a6 [mlir] Fix async microbench integration test
Differential Revision: https://reviews.llvm.org/D91912
2020-11-21 07:02:24 -08:00
Eugene Zhulenev d4f1a3c6e2 [mlir] Add microbenchmark for linalg+async-parallel-for
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D91896
2020-11-21 03:47:14 -08:00
Eugene Zhulenev a86a9b5ef7 [mlir] Automatic reference counting for Async values + runtime support for ref counted objects
Depends On D89963

**Automatic reference counting algorithm outline:**

1. `ReturnLike` operations forward the reference counted values without
    modifying the reference count.
2. Use liveness analysis to find blocks in the CFG where the lifetime of
   reference counted values ends, and insert `drop_ref` operations after
   the last use of the value.
3. Insert `add_ref` before the `async.execute` operation capturing the
   value, and a matching `drop_ref` before the async body region terminator,
   to release the captured reference counted value when execution
   completes.
4. If the reference counted value is passed only to some of the block
   successors, insert `drop_ref` operations at the beginning of the blocks
   that have no uses of the reference counted value (see the sketch below).
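
As an illustration of steps 2-3, the inserted operations might look roughly as follows (op and attribute syntax is sketched and may not match the patch exactly):

```
func @capture(%value: !async.value<f32>) {
  // Step 3: bump the count before capturing %value in the async region.
  async.add_ref %value {count = 1 : i32} : !async.value<f32>
  %token = async.execute {
    %0 = async.await %value : !async.value<f32>
    "use"(%0) : (f32) -> ()
    // Step 3: release the captured reference when the body completes.
    async.drop_ref %value {count = 1 : i32} : !async.value<f32>
    async.yield
  }
  // Step 2: liveness says the lifetime of %value ends here in this block.
  async.drop_ref %value {count = 1 : i32} : !async.value<f32>
  return
}
```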

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D90716
2020-11-20 03:08:44 -08:00
Benjamin Kramer c25e1edf61 [MLIR] Fix up integration tests after b7382ed3fe 2020-11-17 15:42:45 +01:00
Rahul Joshi b7382ed3fe [MLIR] Extend Symbol verification to reject public symbol declarations.
- Extend the Symbol interface with `isDeclaration` to identify operations that declare
  a symbol as opposed to defining it.
- Extend verification to disallow public declarations, as per the discussion in
  https://llvm.discourse.group/t/rfc-symbol-definition-declaration-x-visibility-checks/2140
- Adopt the new interface for `FuncOp` and fix tests and code to not have/create public
  function declarations.

Differential Revision: https://reviews.llvm.org/D91456
2020-11-16 16:05:32 -08:00
Eugene Zhulenev c30ab6c2a3 [mlir] Transform scf.parallel to scf.for + async.execute
Depends On D89958

1. Adds `async.group` support (`async.create_group`, `async.add_to_group`, `async.await_all`) to group together multiple async tokens/values.
2. Rewrites the scf.parallel operation into multiple concurrent async.execute operations over non-overlapping subranges of the original loop.

Example:

```
   scf.parallel (%i, %j) = (%lbi, %lbj) to (%ubi, %ubj) step (%si, %sj) {
     "do_some_compute"(%i, %j): () -> ()
   }
```

Converted to:

```
   %c0 = constant 0 : index
   %c1 = constant 1 : index

   // Compute blocks sizes for each induction variable.
   %num_blocks_i = ... : index
   %num_blocks_j = ... : index
   %block_size_i = ... : index
   %block_size_j = ... : index

   // Create an async group to track async execute ops.
   %group = async.create_group

   scf.for %bi = %c0 to %num_blocks_i step %c1 {
     %block_start_i = ... : index
     %block_end_i   = ... : index

     scf.for %bj = %c0 to %num_blocks_j step %c1 {
       %block_start_j = ... : index
       %block_end_j   = ... : index

       // Execute the body of original parallel operation for the current
       // block.
       %token = async.execute {
         scf.for %i = %block_start_i to %block_end_i step %si {
           scf.for %j = %block_start_j to %block_end_j step %sj {
             "do_some_compute"(%i, %j): () -> ()
           }
         }
       }

       // Add produced async token to the group.
       async.add_to_group %token, %group
     }
   }

   // Await completion of all async.execute operations.
   async.await_all %group
```
In this example the outer loop launches the inner block-level loops as separate
async.execute operations, which execute concurrently.

At the end it waits for the completion of all async.execute operations.

Reviewed By: ftynse, mehdi_amini

Differential Revision: https://reviews.llvm.org/D89963
2020-11-13 04:02:56 -08:00
Sean Silva faa66b1b2c [mlir] Bufferize tensor constant ops
We lower them to a std.global_memref (uniqued by constant value) + a
std.get_global_memref to produce the corresponding memref value.
This allows removing Linalg's somewhat hacky lowering of tensor
constants, now that std properly supports this.
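
A rough sketch of the lowering for a small constant (names, attribute values and exact syntax are illustrative):

```
// Before: %cst = constant dense<[1.0, 2.0, 3.0, 4.0]> : tensor<4xf32>

// After bufferization: a uniqued global at module scope ...
global_memref "private" constant @__constant_4xf32 : memref<4xf32> = dense<[1.0, 2.0, 3.0, 4.0]>

func @constant_tensor() -> tensor<4xf32> {
  // ... plus a get_global_memref, with tensor_load bridging back to the
  // tensor world while other ops are still unconverted.
  %0 = get_global_memref @__constant_4xf32 : memref<4xf32>
  %1 = tensor_load %0 : memref<4xf32>
  return %1 : tensor<4xf32>
}
```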

Differential Revision: https://reviews.llvm.org/D91306
2020-11-12 14:56:10 -08:00
Sean Silva ad2f9f6745 [mlir] Fix subtensor_insert bufferization.
It was incorrect in the presence of a tensor argument with multiple
uses.

The bufferization of subtensor_insert was writing into a converted
memref operand, but there is no guarantee that the converted memref for
that operand is safe to write into. In this case, the same converted
memref is written to in-place by the subtensor_insert bufferization,
violating the tensor-level semantics.
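
A hypothetical example of the hazard (shapes and the offsets/sizes/strides syntax are sketched, not taken from the patch): %t has a second use besides the subtensor_insert, so writing the insertion into %t's converted memref in place would clobber the value that the second use observes.

```
func @insert(%t: tensor<8xf32>, %small: tensor<4xf32>) -> (tensor<8xf32>, tensor<8xf32>) {
  %c0 = constant 0 : index
  %c1 = constant 1 : index
  %c4 = constant 4 : index
  // Bufferizing this by storing into %t's converted memref in place ...
  %r = subtensor_insert %small into %t[%c0] [%c4] [%c1]
      : tensor<4xf32> into tensor<8xf32>
  // ... would corrupt the unmodified %t that is still returned below.
  return %r, %t : tensor<8xf32>, tensor<8xf32>
}
```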

I left some comments in a TODO about ways forward on this. I will be
working actively on this problem in the coming days.

Differential Revision: https://reviews.llvm.org/D91371
2020-11-12 14:56:09 -08:00
Sean Silva 53a0d45db6 [mlir] Add pass to convert elementwise ops to linalg.
This patch converts elementwise ops on tensors to linalg.generic ops
with the same elementwise op in the payload (except rewritten to
operate on scalars, obviously). This is a great form for later fusion to
clean up.

E.g.

```
// Compute: %arg0 + %arg1 - %arg2
func @f(%arg0: tensor<?xf32>, %arg1: tensor<?xf32>, %arg2: tensor<?xf32>) -> tensor<?xf32> {
  %0 = addf %arg0, %arg1 : tensor<?xf32>
  %1 = subf %0, %arg2 : tensor<?xf32>
  return %1 : tensor<?xf32>
}
```

Running this through
`mlir-opt -convert-elementwise-to-linalg -linalg-fusion-for-tensor-ops` we get:

```
func @f(%arg0: tensor<?xf32>, %arg1: tensor<?xf32>, %arg2: tensor<?xf32>) -> tensor<?xf32> {
  %0 = linalg.generic {indexing_maps = [#map0, #map0, #map0, #map0], iterator_types = ["parallel"]} ins(%arg0, %arg1, %arg2 : tensor<?xf32>, tensor<?xf32>, tensor<?xf32>) {
  ^bb0(%arg3: f32, %arg4: f32, %arg5: f32):  // no predecessors
    %1 = addf %arg3, %arg4 : f32
    %2 = subf %1, %arg5 : f32
    linalg.yield %2 : f32
  } -> tensor<?xf32>
  return %0 : tensor<?xf32>
}
```

So the elementwise ops on tensors have nicely collapsed into a single
linalg.generic, which is the form we want for further transformations.

Differential Revision: https://reviews.llvm.org/D90354
2020-11-10 13:44:44 -08:00
Alexander Belyaev 9d02e0e38d [mlir][std] Add ExpandOps pass.
The pass combines the patterns of the ExpandAtomic, ExpandMemRefReshape, and
StdExpandDivs passes. It is meant to legalize the std dialect for conversion to LLVM.

Differential Revision: https://reviews.llvm.org/D91082
2020-11-09 21:58:28 +01:00
Nicolas Vasilache 6fc3a44394 [mlir][Linalg] Add support for bufferization of SubTensorOp and SubTensorInsertOp
This revision adds support for bufferization by using a mix of `tensor_load`, `subview`, `linalg.copy` and `tensor_to_memref`.
2020-11-09 16:55:36 +00:00
Alexandre Eichenberger 0795715616 [mlir][std] Add SignedCeilDivIOp and SignedFloorDivIOp with std-to-std lowering triggered by the -std-expand-divs option. The new operations support positive/negative numerator/denominator values.
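
A sketch of the rounding behavior; the std mnemonics below (floordivi_signed, ceildivi_signed) are assumed from the op names in the title:

```
func @divs() -> (i32, i32) {
  %a = constant -7 : i32
  %b = constant 2 : i32
  // floordiv rounds toward negative infinity, ceildiv toward positive infinity:
  //   floordivi_signed(-7, 2) = -4,  ceildivi_signed(-7, 2) = -3,
  // while the existing divi_signed(-7, 2) = -3 truncates toward zero.
  %f = floordivi_signed %a, %b : i32
  %c = ceildivi_signed %a, %b : i32
  return %f, %c : i32, i32
}
```
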
Differential Revision: https://reviews.llvm.org/D89726

Signed-off-by: Alexandre Eichenberger <alexe@us.ibm.com>
2020-11-04 14:16:23 -05:00
Sean Silva eb8d386d51 [mlir] Make linalg-bufferize a composable bufferization pass
Previously, linalg-bufferize was a "finalizing" bufferization pass (it
did a "full" conversion). This wasn't great because it couldn't be used
composably with other bufferization passes like std-bufferize and
scf-bufferize.

This patch makes linalg-bufferize a composable bufferization pass.
Notice that the integration tests are switched over to using a pipeline
of std-bufferize, linalg-bufferize, and (to finalize the conversion)
func-bufferize. It all "just works" together.

While doing this transition, I ran into a nasty bug in the 1-use special
case logic for forwarding init tensors. That logic, while
well-intentioned, was fundamentally flawed, because it assumed that if
the original tensor value had one use, then the converted memref could
be mutated in place. That assumption is wrong in many cases. For
example:

```
  %0 = some_tensor : tensor<4xf32>
  br ^bb0(%0, %0: tensor<4xf32>, tensor<4xf32>)
^bb0(%bbarg0: tensor<4xf32>, %bbarg1: tensor<4xf32>):
  // %bbarg0 is an alias of %bbarg1. We cannot safely write
  // to it without analyzing uses of %bbarg1.
  linalg.generic ... init(%bbarg0) {...}
```

A similar example can happen in many scenarios with function arguments.
Even more sinister, if the converted memref is produced by a
`std.get_global_memref` of a constant global memref, then we might
attempt to write into read-only statically allocated storage! Not all
memrefs are writable!

Clearly, this 1-use check is not a local transformation that we can do
on the fly in this pattern, so I removed it.

The test is now drastically shorter and I basically rewrote the CHECK
lines from scratch because:
- the new composable linalg-bufferize just doesn't do as much, so there
is less to test
- a lot of the tests were related to the 1-use check, which is now gone,
so there is less to test
- the `-buffer-hoisting -buffer-deallocation` is no longer mixed in, so
the checks related to that had to be rewritten

Differential Revision: https://reviews.llvm.org/D90657
2020-11-04 10:16:55 -08:00
Thomas Raoux 5d45f758f0 [mlir][vector] Improve vector distribute integration test and fix block distribution
Fix the semantics in the distribute integration test based on offline feedback. This
exposed a bug in block distribution: we need to make sure the id is multiplied
by the stride of the vector. Fix the transformation and unit test.

Differential Revision: https://reviews.llvm.org/D89291
2020-10-29 14:54:53 -07:00
Kazuaki Ishizaki 41b09f4eff [mlir] NFC: fix trivial typos
fix typos in comments and documents

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D90089
2020-10-29 04:05:22 +09:00
Sean Silva 9ca97cde85 [mlir] Linalg refactor for using "bufferize" terminology.
Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89261
2020-10-14 12:39:15 -07:00
Nicolas Vasilache 6121117484 [mlir][Linalg] Fix TensorConstantOp bufferization in Linalg.
TensorConstantOp bufferization currently uses the vector dialect to store constant data into memory.
Due to natural vector size and alignment properties, this is problematic with n>1-D vectors whose most minor dimension is not naturally aligned.

Instead, this revision linearizes the constant and introduces a linalg.reshape to go back to the desired shape.

Still, this is to be considered a workaround, and a better long-term solution will probably involve `llvm.global`.

Differential Revision: https://reviews.llvm.org/D89311
2020-10-13 16:36:56 +00:00
Nicolas Vasilache 81ead8a535 [mlir][Linalg] Temporarily circumvent TensorConstant bufferize bug
The TensorConstantOp bufferize conversion pattern has a bug that
makes it incorrect in the case of vectors whose alignment is not
the natural alignment. Circumvent it temporarily by using a power of 2.

Differential Revision: https://reviews.llvm.org/D89265
2020-10-12 20:23:57 +00:00
Nicolas Vasilache 422aaf31da [mlir][Linalg] Add named Linalg ops on tensor to buffer support.
This revision introduces support for buffer allocation for any named linalg op.
To avoid template instantiating many ops, a new ConversionPattern is created to capture the LinalgOp interface.

Some APIs are updated to remain consistent with MLIR style:
`OwningRewritePatternList * -> OwningRewritePatternList &`
`BufferAssignmentTypeConverter * -> BufferAssignmentTypeConverter &`

Differential revision: https://reviews.llvm.org/D89226
2020-10-12 11:20:23 +00:00
Thomas Raoux 19119dda16 [mlir][vector] Add integration test for vector distribute transformation
Differential Revision: https://reviews.llvm.org/D89062
2020-10-08 14:45:56 -07:00
Jakub Lichman e547b1e243 [mlir] Rank reducing subview conversion to LLVM
This commit adjusts SubViewOp lowering to take rank reduction into account.

Differential Revision: https://reviews.llvm.org/D88883
2020-10-08 13:47:22 +00:00
Nicolas Vasilache 30e6033b45 [mlir][Linalg] Add TensorsToBuffers support for Constant ops.
This revision also inserts an end-to-end test that lowers tensors to buffers all the way to executable code on CPU.

Differential revision: https://reviews.llvm.org/D88998
2020-10-08 13:15:45 +00:00
Aart Bik 4065a0d98f [mlir] [sparse] Rename getSparseMatrix to getMatrix
Rationale:
More consistent with the other names. Also forward-looking to reading
in other kinds of matrices. Also fixes a lint issue on the hard-coded %llu.

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D89005
2020-10-07 14:25:05 -07:00
Amara Emerson 322d0afd87 [llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle legacy intrinsics.

Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

Differential Revision: https://reviews.llvm.org/D88787
2020-10-07 10:36:44 -07:00
Aart Bik c6c67f643d [mlir] [sparse] convenience runtime support to read Matrix Market format
Setting up input data for benchmarks and integration tests can be tedious in
pure MLIR. With more sparse tensor work planned, this convenience library
simplifies reading sparse matrices in the popular Matrix Market Exchange
Format (see https://math.nist.gov/MatrixMarket). Note that this library
is *not* part of core MLIR. It is merely intended as a convenience library
for benchmarking and integration testing.

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D88856
2020-10-06 13:17:05 -07:00
Jakub Lichman 0b17d4754a [mlir][Linalg] Tile sizes for Conv ops vectorization added as pass arguments
The current setup for conv op vectorization does not enable the user to specify tile
sizes or the dimensions to vectorize. In this commit we change that by
adding tile sizes as pass arguments. Every dimension with a corresponding tile
size > 1 is automatically vectorized.

Differential Revision: https://reviews.llvm.org/D88533
2020-09-30 11:31:28 +00:00
Aart Bik e9628955f5 [mlir] [VectorOps] Relaxed restrictions on vector.reduction types even more
Recently, restrictions on vector reductions were relaxed to accept signless
integers of any width and floating-point types. This CL relaxes the restrictions
even more by including unsigned and signed integers.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D88442
2020-09-28 13:38:03 -07:00
Aart Bik 54759cefdb [mlir] [VectorOps] changes to printing support for integers
(1) simplify integer printing logic by always using 64-bit print
(2) add index support (since vector<16xindex> is planned to be added)
(3) adjust naming convention print_x -> printX

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D88436
2020-09-28 11:43:31 -07:00
Aart Bik b8880f5f97 [mlir] [VectorOps] generalize printing support for integers
This generalizes printing beyond just i1, i32, i64 and also accounts
for signed and unsigned interpretation in the output.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D88290
2020-09-25 04:52:21 -07:00
Nicolas Vasilache 93fd30bac3 [mlir][Linalg] Evolve named ops to use assembly form and support linalg on tensors.
This revision allows representing a reduction at the level of linalg on tensors for named ops. When a structured op has a reduction and returns tensor(s), new conventions are added and documented.

As an illustration, the syntax for a `linalg.matmul` writing into a buffer is:

```
  linalg.matmul ins(%a, %b : memref<?x?xf32>, tensor<?x?xf32>)
               outs(%c : memref<?x?xf32>)
```

, whereas the syntax for a `linalg.matmul` returning a new tensor is:

```
  %d = linalg.matmul ins(%a, %b : tensor<?x?xf32>, memref<?x?xf32>)
                    init(%c : memref<?x?xf32>)
                      -> tensor<?x?xf32>
```

Other parts of linalg will be extended accordingly to allow mixed buffer/tensor semantics in the presence of reductions.
2020-09-18 06:14:30 -04:00
Jakub Lichman 347d59b16c [mlir][Linalg] Convolution tiling added to ConvOp vectorization pass
ConvOp vectorization currently supports only convolutions of static shapes with dimensions
of size either 3 (vectorized) or 1 (not vectorized), as the underlying vectors have to be
of static shape as well. In this commit we add support for convolutions of any size, as
well as dynamic shapes, by leveraging the existing matmul infrastructure to tile both the
input and the kernel to sizes accepted by the previous version of ConvOp vectorization.
In the future this pass can be extended to take a "tiling mask" as user input, which
will enable vectorization of user-specified dimensions.

Differential Revision: https://reviews.llvm.org/D87676
2020-09-17 09:39:41 +00:00
Jakub Lichman c20852300a [mlir][integration_test] Linalg Conv folder renamed to CPU
Changing directory name to reflect naming convention discussed here:
https://llvm.discourse.group/t/vectorops-rfc-add-suite-of-integration-tests-for-vector-dialect-operations/1213

Differential Revision: https://reviews.llvm.org/D87678
2020-09-15 10:28:20 +00:00
Jakub Lichman edf244217a [mlir][Linalg] Integration tests for convolutions added.
This commit introduces end-to-end integration tests for
convolutions that exercise multiple ways of lowering ConvOps.

Differential Revision: https://reviews.llvm.org/D87277
2020-09-09 11:37:28 +00:00
Benjamin Kramer 239eff502b [mlir][VectorOps] Redo the scalar loop emission in VectoToSCF to pad instead of clipping
This replaces the select chain for edge-padding with an scf.if that
performs the memory operation when the index is in bounds and uses the
pad value when it's not. For transfer_write the same mechanism is used,
skipping the store when the index is out of bounds.
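
A minimal sketch of the per-element guard this produces for a read (function and operand names are made up for illustration):

```
func @padded_load(%buf: memref<?xf32>, %i: index, %dim: index, %pad: f32) -> f32 {
  // Load the element if %i is in bounds; otherwise use the padding value.
  %in_bounds = cmpi "slt", %i, %dim : index
  %val = scf.if %in_bounds -> (f32) {
    %x = load %buf[%i] : memref<?xf32>
    scf.yield %x : f32
  } else {
    scf.yield %pad : f32
  }
  return %val : f32
}
```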

The integration test has a bunch of cases of how I believe this should
work.

Differential Revision: https://reviews.llvm.org/D87241
2020-09-08 11:15:25 +02:00
Nicolas Vasilache 8d64df9f13 [mlir][Vector] Revisit VectorToSCF.
Vector to SCF conversion still had issues due to the interaction with the natural alignment derived from the LLVM data layout. One traditional workaround is to allocate with explicit alignment. However, this does not always work for vector sizes that are not powers of 2.

This revision implements a more portable mechanism where the intermediate allocation is always a memref of elemental vector type. AllocOp is extended to use the natural LLVM DataLayout alignment for non-scalar types, when the alignment is not specified in the first place.
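
For example, for a 5x5 f32 transfer the intermediate buffer might be allocated as a memref of the elemental vector type rather than a plain 2-D scalar memref (a hedged sketch, not the exact code generated):

```
func @transfer_buffer() -> memref<5xvector<5xf32>> {
  // One vector<5xf32> element per row; AllocOp now picks the LLVM DataLayout's
  // natural alignment for vector<5xf32> when no alignment is specified.
  %buf = alloc() : memref<5xvector<5xf32>>
  return %buf : memref<5xvector<5xf32>>
}
```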

An integration test is added that exercises the transfer to scf.for + scalar lowering with a 5x5 transposition.

Differential Revision: https://reviews.llvm.org/D87150
2020-09-07 05:19:43 -04:00
Kazuaki Ishizaki a23d055912 [mlir] NFC: fix trivial typo under test and tools
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D86648
2020-08-27 15:37:42 +09:00
aartbik 39379916a7 [mlir] [VectorOps] Add masked load/store operations to Vector dialect
The intrinsics were already supported and vector.transfer_read/write lowered
directly into these operations. By providing them as individual ops, however,
clients can use them directly, and it opens up progressively lowering transfer
operations at higher levels (rather than direct lowering to LLVM IR as done now).
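
A sketch of direct use (operand order and type syntax are assumptions and may differ slightly from the patch):

```
func @masked(%base: memref<?xf32>, %mask: vector<16xi1>, %pass_thru: vector<16xf32>) {
  %c0 = constant 0 : index
  // Load only the lanes where %mask is set; other lanes take %pass_thru.
  %v = vector.maskedload %base[%c0], %mask, %pass_thru
      : memref<?xf32>, vector<16xi1>, vector<16xf32> into vector<16xf32>
  // Store back only the lanes where %mask is set.
  vector.maskedstore %base[%c0], %mask, %v
      : memref<?xf32>, vector<16xi1>, vector<16xf32>
  return
}
```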

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D85357
2020-08-05 16:45:24 -07:00
aartbik e8dcf5f87d [mlir] [VectorOps] Add expand/compress operations to Vector dialect
Introduces the expand and compress operations to the Vector dialect
(important memory operations for sparse computations), together
with a first reference implementation that lowers to the LLVM IR
dialect to enable running on CPU (and other targets that support
the corresponding LLVM IR intrinsics).

Reviewed By: reidtatge

Differential Revision: https://reviews.llvm.org/D84888
2020-08-04 12:00:42 -07:00
Alex Zinenko ec1f4e7c3b [mlir] switch the modeling of LLVM types to use the new mechanism
A new first-party modeling for LLVM IR types in the LLVM dialect has been
developed in parallel to the existing modeling based on wrapping LLVM `Type *`
instances. It resolves the long-standing problem of modeling identified
structure types, including recursive structures, and enables future removal of
LLVMContext and related locking mechanisms from LLVMDialect.
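
For instance, the new modeling can express a self-referential identified struct directly in the IR (a hedged sketch of the type syntax):

```
// A linked-list node: an identified struct that refers to itself through a
// pointer, which the old wrapper-based modeling could not represent directly.
!node = type !llvm.struct<"node", (i32, ptr<struct<"node">>)>
```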

This commit only switches the modeling by (a) renaming LLVMTypeNew to LLVMType,
(b) removing the old implementation of LLVMType, and (c) updating the tests. It
is intentionally minimal. Separate commits will remove the infrastructure built
for the transition and update API uses where appropriate.

Depends On D85020

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D85021
2020-08-04 14:29:25 +02:00
Nicolas Vasilache 4bfbf74e57 [MLIR] Add an integration test for 2 D vector.transfer_read
Added a "clone" of the 1D vector's test_transfer_read and added a second dimensionality. The test is not as generic as I would like it to be, because more generic versions appear to break the compiler or the runtime at this stage. As bug are fixed, I will be happy to add another more complete test.

Differential Revision: https://reviews.llvm.org/D83096
2020-08-04 04:28:28 -04:00
aartbik 4f92ad508f [mlir] [VectorOps] [integration_test] Sparse matrix times vector (jagged SAXPY version)
Transposed jagged diagonal storage yields longer vector lengths. Also, in
contrast with naive SAXPY (one gather/scatter), this only performs one gather.

Reviewed By: reidtatge

Differential Revision: https://reviews.llvm.org/D84699
2020-07-29 13:25:56 -07:00
aartbik 7832d0f63d [mlir] [VectorOps] [integration_test] Sparse matrix times vector (DOT version)
Integration test that illustrates the gather operation with a
real-world operation expressed in mostly the Vector dialect.
Uses jagged diagonal storage.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D84571
2020-07-27 11:22:28 -07:00
aartbik 19dbb230a2 [mlir] [VectorOps] Add scatter/gather operations to Vector dialect
Introduces the scatter/gather operations to the Vector dialect
(important memory operations for sparse computations), together
with a first reference implementation that lowers to the LLVM IR
dialect to enable running on CPU (and other targets that support
the corresponding LLVM IR intrinsics).

The operations can be used directly where applicable, or can be used
during progressively lowering to bring other memory operations closer to
hardware ISA support for a gather/scatter. The semantics of the operation
closely correspond to those of the corresponding llvm intrinsics.

Note that the operation allows for a dynamic index vector (which is
important for sparse computations). However, this first reference
lowering implementation "serializes" the address computation when
base + index_vector is converted to a vector of pointers. Exploring
how to use SIMD properly during this step is TBD. More general
memrefs and idiomatic versions of striding are also TBD.

Reviewed By: arpith-jacob

Differential Revision: https://reviews.llvm.org/D84039
2020-07-21 10:57:40 -07:00
Nicolas Vasilache affbc0cd1c [mlir] Add alignment attribute to LLVM memory ops and use in vector.transfer
Summary: The native alignment may generally not be used when lowering a vector.transfer to the underlying load/store operation. This revision fixes the unmasked load/store alignment to match that of the masked path.

Differential Revision: https://reviews.llvm.org/D83684
2020-07-13 17:35:20 -04:00
aartbik 9bf6354301 [mlir] [VectorOps] Allow AXPY to be expressed as special case of OUTERPRODUCT
This specialization allows sharing more code: an AXPY follows naturally
in cases where an OUTERPRODUCT on a scalar would be generated.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D83453
2020-07-10 12:23:24 -07:00
aartbik 6404fb428a [mlir] [VectorOps] [integration-test] Add i64 typed outer product
Yields proper SIMD vpmullq/vpaddq on x86.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D83328
2020-07-07 12:34:41 -07:00
aartbik 4a80f19078 [mlir] [VectorOps] Extend vector reduction integration test with reassoc=true cases.
Reviewed By: reidtatge

Differential Revision: https://reviews.llvm.org/D82674
2020-06-29 13:28:20 -07:00
aartbik ba690195d1 [mlir] [integration-test] Let target check-mlir imply target check-mlir-integration too
Note that this does not mean that check-mlir will run the check-mlir-integration
tests for all configurations. You still need to configure with the flag
MLIR_INCLUDE_INTEGRATION_TESTS set to ON in order to activate the integration tests.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D82413
2020-06-23 15:22:39 -07:00