9299 Commits

Author SHA1 Message Date
Matthias Springer ee1bf18672 [mlir][SCF] Further simplify affine maps during `for-loop-canonicalization`
* Implement `FlatAffineConstraints::getConstantBound(EQ)`.
* Inject a simpler constraint for loops that have at most 1 iteration.
* Take into account constant EQ bounds of FlatAffineConstraints dims/symbols when canonicalizing the resulting affine map in `canonicalizeMinMaxOp`.

Differential Revision: https://reviews.llvm.org/D114138
2021-11-25 12:44:19 +09:00
Matthias Springer 8a8c655fe7 [mlir][SCF] Fix off-by-one bug in affine analysis
This change is NFC. There were two issues when passing/reading upper bounds into/from FlatAffineConstraints that negate each other, so the bug was not apparent. However, it made debugging harder because some constraints in the FlatAffineConstraints were off by one when dumping all constraints.

Differential Revision: https://reviews.llvm.org/D114137
2021-11-25 12:37:02 +09:00
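A minimal sketch of how two off-by-one errors on an upper bound can cancel out. It assumes constraints are stored in the form `expr >= 0`, so an exclusive bound `iv < ub` has to be written as `-iv + (ub - 1) >= 0`. `ToyInequality` and the helper names are hypothetical and are not the FlatAffineConstraints API; the commit message does not spell out the two issues, so the "missing `- 1` on the way in, missing `+ 1` on the way out" reading is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Toy inequality over a single variable: coeff * iv + constant >= 0.
struct ToyInequality {
  std::int64_t coeff;
  std::int64_t constant;
};

// Encode the exclusive bound `iv < ub` as `-iv + (ub - 1) >= 0`.
// Dropping the `- 1` here would store a constraint that is off by one.
ToyInequality fromExclusiveUpperBound(std::int64_t ub) {
  return {-1, ub - 1};
}

// Recover the exclusive bound from the stored inequality.
// Dropping the `+ 1` here would hide a missing `- 1` on the way in.
std::int64_t toExclusiveUpperBound(const ToyInequality &ineq) {
  assert(ineq.coeff == -1 && "expected an upper-bound inequality on iv");
  return ineq.constant + 1;
}

// Check whether a concrete value satisfies the stored inequality.
bool holds(const ToyInequality &ineq, std::int64_t iv) {
  return ineq.coeff * iv + ineq.constant >= 0;
}
```

With both conversions in place, a round trip returns the original bound and the stored constraint itself is also correct, which is what a dump of all constraints would show.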
Uday Bondhugula 23d505571d [NFC] Improve debug message in getAsIntegerSet
Improve debug message in getAsIntegerSet. Add missing trailing newline
and position info.

Differential Revision: https://reviews.llvm.org/D114511
2021-11-25 08:50:21 +05:30
Matthias Springer d3bb4fec2a [mlir][linalg][bufferize][NFC] Move arith interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the arith dialect.

Differential Revision: https://reviews.llvm.org/D114219
2021-11-25 10:21:02 +09:00
bakhtiyar 7bd87a03fd Promote readability by factoring out creation of min/max operation. Remove unnecessary divisions.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110680
2021-11-24 16:17:23 -08:00
Lei Zhang cb395f66ac [mlir][spirv] Change the return type for {Min|Max}VersionBase
For synthesizing an op's implementation of the generated interface
from {Min|Max}Version, we need to define an `initializer` and
`mergeAction`. The `initializer` specifies the initial version,
and `mergeAction` specifies how version specifications from
different parts of the op should be merged to generate a final
version requirement.

Previously we used the specified version enum as the type for both
the initializer and thus the final return type. This means we need
to perform `static_cast` over some hopefully not used number (`~0u`)
as the initializer. This is quite opaque and sort of not guaranteed
to work. Also, there are ops that have an enum attribute where some
values declare version requirements (e.g., enumerant `B` requires
v1.1+) but some not (e.g., enumerant `A` requires nothing). Then a
concrete op instance with `A` will still declare it implements the
version interface (because interface implementation is static for
an op) but actually there are no requirements for version.

So this commit changes to use a more explicit `llvm::Optional`
to wrap around the returned version enum. This should make it
clearer.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D108312
2021-11-24 17:33:01 -05:00
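A rough sketch of the initializer / merge-action idea described in the commit above, using `std::optional` as a stand-in for `llvm::Optional`. The enum, its enumerants, and both function names are hypothetical and are not the generated SPIR-V interface code.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Hypothetical version enum; the enumerants are illustrative only.
enum class Version : unsigned { V_1_0 = 0, V_1_1 = 1, V_1_2 = 2 };

// Merge action for a *minimum* version requirement: the stricter (larger)
// requirement wins, and "no requirement" (nullopt) leaves the other side
// unchanged.
std::optional<Version> mergeMinVersion(std::optional<Version> acc,
                                       std::optional<Version> next) {
  if (!acc)
    return next;
  if (!next)
    return acc;
  return std::max(*acc, *next);
}

// Fold the requirements contributed by different parts of an op, starting
// from an empty optional rather than a sentinel value such as ~0u.
std::optional<Version>
computeMinVersion(const std::vector<std::optional<Version>> &parts) {
  std::optional<Version> result; // initializer: no requirement yet
  for (const auto &part : parts)
    result = mergeMinVersion(result, part);
  return result;
}
```

An op whose parts all contribute `std::nullopt` ends up with an empty result, which expresses "no version requirement" without casting a magic number to the enum type.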
Tobias Gysi b6e7b1be73 [mlir][linalg] Simplify padding test (NFC).
The padding tests previously contained the tile loops. This revision removes the tile loops since padding itself does not consider the loops. Instead the induction variables are passed in as function arguments which promotes them to symbols in the affine expressions. Note that the pad-and-hoist.mlir test still exercises padding in the context of the full loop nest.

Depends On D114175

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114227
2021-11-24 19:21:50 +00:00
Tobias Gysi 86f186efea [mlir][linalg] Add makeComposedPadHighOp.
Add the makeComposedPadHighOp method which creates a new PadTensorOp if necessary. If the source to pad is actually the result of a sequence of padded LinalgOps, the method checks if padding is needed or if we can use the padded result of the padded LinalgOp sequence directly.

Example:
```
%0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1]
%1 = linalg.pad_tensor %0 low[0, 0] high[...] { linalg.yield %cst }
%2 = linalg.matmul ins(...) outs(%1)
%3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1]
```
when padding %3 return %2 instead of introducing
```
%4 = linalg.pad_tensor %3 low[0, 0] high[...] { linalg.yield %cst }
```

Depends On D114161

Reviewed By: nicolasvasilache, pifon2a

Differential Revision: https://reviews.llvm.org/D114175
2021-11-24 19:18:59 +00:00
Tobias Gysi a4fd8cb76f [mlir][linalg] Update failure conditions for padOperandToSmallestStaticBoundingBox.
Change the failure condition of padOperandToSmallestStaticBoundingBox to never fail if the operand is already statically sized.

In particular:
- if the padding value computation fails -> return failure if the operand shape is dynamic and success if it is static.
- if there is no extract slice op -> return failure if the operand shape is dynamic and success if it is static.

The latter change prevents padding from failing if the output operand passed by iteration argument is statically sized, since in this case the extract / insert slice pairs are removed by canonicalization.

Depends On D114153

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114161
2021-11-24 19:10:50 +00:00
Florian Hahn fb46e64a01 Revert "[ThreadPool] Do not return shared futures."
This reverts commit a5fff58781.

The offending commit broke building with LLVM_ENABLE_THREADS=OFF.
2021-11-24 19:01:47 +00:00
MaheshRavishankar 0a58982b08 [mlir][Linalg] Remove alloc/dealloc pair as a callback.
The alloc dealloc pair generation callback is really central to the
bufferization algorithm, it modifies the state in a way that affects
correctness. This is not really a configurable option. Moving it to
BufferizationState removes what was probably the reason it was added
as a callback.

Differential Revision: https://reviews.llvm.org/D114417
2021-11-24 10:36:34 -08:00
Nicolas Vasilache 1cfa9b4d70 [mlir][Vector] NFC - Apply some clangd suggested fixes.
2021-11-24 15:55:58 +00:00
Matthias Springer ca9d149e07 [mlir][linalg][bufferize][NFC] Move vector interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the vector dialect.

Differential Revision: https://reviews.llvm.org/D114218
2021-11-24 19:36:12 +09:00
Matthias Springer bb273a35a0 [mlir][linalg][bufferize][NFC] Move tensor interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the tensor dialect.

Differential Revision: https://reviews.llvm.org/D114217
2021-11-24 18:25:17 +09:00
Butygin 7f5d9bf13a [mlir][scf] Canonicalize scf.while with unused results
Differential Revision: https://reviews.llvm.org/D114291
2021-11-24 11:11:22 +03:00
Bixia Zheng 02710413a3 Accept symmetric sparse matrix in Matrix Market Exchange Format.
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114402
2021-11-23 19:53:17 -08:00
Uday Bondhugula 8bd08a9fd7 [MLIR] Remove duplicate `Pass` suffix from ViewOpGraph class name
Remove duplicate `Pass` suffix from view-op-graph pass class name. The
extra suffix would lead to methods like registerViewOpGraphPassPass
being generated.

Differential Revision: https://reviews.llvm.org/D114459
2021-11-24 08:00:16 +05:30
wren romano d7d7ffe254 [mlir][sparse] Adding wrappers for constantOverheadTypeEncoding
Minor code cleanup

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114392
2021-11-23 18:30:06 -08:00
Butygin 75a1bee05d [mlir][spirv] Add math to OpenCL conversion
Differential Revision: https://reviews.llvm.org/D113780
2021-11-24 02:31:21 +03:00
Rob Suderman 0f1e52afa9 [mlir][tosa] Materialize tosa.pad value and fold noop pads
Padding now can explicitly specify the padding value when non-zero is wanted.
This also includes bypassing pads when the pad does nothing.

Differential Revision: https://reviews.llvm.org/D113611
2021-11-23 12:23:42 -08:00
Rob Suderman 54eec7cafc [mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support
Transpose convolution decomposition is now performed in a separate pass. This
allows padding / constant propagation to be performed at the TOSA level. It
also adds support for striding when there is no dilation.

Differential Revision: https://reviews.llvm.org/D114409
2021-11-23 12:16:44 -08:00
MaheshRavishankar b57e2f071a [mlir][Linalg] Add pad vectorization patterns into LinalgStrategyVectorize passes.
Add an option to control whether these patterns are added to the
pattern list or not.

Differential Revision: https://reviews.llvm.org/D114290
2021-11-23 11:47:54 -08:00
wren romano 286248db2c [mlir][sparse] Moving integration tests that merely use the Python API
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114192
2021-11-23 10:59:38 -08:00
Nicolas Vasilache 3ff4e5f2a4 [mlir][Vector] Thread 0-d vectors through InsertElementOp.
This revision makes concrete use of 0-d vectors to extend the semantics of
InsertElementOp.

Reviewed By: dcaballe, pifon2a

Differential Revision: https://reviews.llvm.org/D114388
2021-11-23 12:55:11 +00:00
Nicolas Vasilache e7026aba00 [mlir][Vector] Thread 0-d vectors through ExtractElementOp.
This revision starts making concrete use of 0-d vectors to extend the semantics of
ExtractElementOp.
In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in.

Differential Revision: https://reviews.llvm.org/D114387
2021-11-23 12:39:44 +00:00
Matthias Springer f24d9313cc [mlir][linalg][bufferize][NFC] Specify bufferize traversal in `bufferize`
The interface method `bufferize` controls how (and in what order) nested ops are traversed. This simplifies bufferization of scf::ForOps and scf::IfOps, which used to need special rules in scf::YieldOp.

Differential Revision: https://reviews.llvm.org/D114057
2021-11-23 21:33:19 +09:00
Florian Hahn a5fff58781 [ThreadPool] Do not return shared futures.
The only user of futures returned from ThreadPool is llvm-reduce after
D113857.

There should be no cases where multiple threads wait on the same future,
so there should be no need to return std::shared_future<>. Instead return
plain std::future<>.

If users need to share a future between multiple threads, they can share
the futures themselves.

Reviewed By: Meinersbur, mehdi_amini

Differential Revision: https://reviews.llvm.org/D114363
2021-11-23 10:06:08 +00:00
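A small standard-library sketch of the point made in the commit above: an API can hand out a plain, move-only `std::future`, and a caller that needs several waiters can opt in with `share()`. `makeReadyFuture` and `readTwice` are hypothetical helpers and do not use `llvm::ThreadPool`.

```cpp
#include <future>

// Producer side: returns a plain, move-only std::future.
std::future<int> makeReadyFuture(int value) {
  std::promise<int> promise;
  std::future<int> future = promise.get_future();
  promise.set_value(value); // the shared state keeps the value alive
  return future;
}

// Consumer side: opts into sharing explicitly. Each copy of the
// std::shared_future could be handed to a different thread; here both
// copies are read on the calling thread to keep the sketch simple.
int readTwice(std::future<int> future) {
  std::shared_future<int> shared = future.share();
  std::shared_future<int> copy = shared;
  return shared.get() + copy.get();
}
```

Calling `get()` twice on a plain `std::future` is not allowed, so the conversion through `share()` is what makes the second read valid.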
Alexander Belyaev c7cc70c8f8 Revert "Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td.""
This reverts and fixes commit de18b7dee6.
2021-11-23 10:49:26 +01:00
Nicolas Vasilache b2729fda60 [mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Differential Revision: https://reviews.llvm.org/D114393
2021-11-23 07:31:22 +00:00
Sandeep Dasgupta e5a8c8c883 [mlir] Refactoring a few Parser APIs
Refactored two new parser APIs, parseGenericOperationAfterOperands and
parseCustomOperationName, out of parseGenericOperation and parseCustomOperation.

Motivation: Sometimes an op can be printed in a special way if certain criteria
are met. While parsing, we need to handle all the versions.
`parseGenericOperationAfterOperands` is handy in situations where we have already
parsed the operands and decide to fall back to default parsing.

`parseCustomOperationName` is useful when we need to know details (dialect,
operation name, etc.) about a parsed token meant to be an MLIR operation.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113719
2021-11-23 06:11:01 +00:00
Matthias Springer fb99686bfd [mlir][linalg][bufferize] Limited support for scf.execute_region
Add support for analysis only.

Differential Revision: https://reviews.llvm.org/D114055
2021-11-23 12:20:39 +09:00
Matthias Springer 26c0dd83ab [mlir][linalg][bufferize][NFC] Move helper function to op interface
This is in preparation of changing the op traversal during bufferization.

Differential Revision: https://reviews.llvm.org/D114040
2021-11-23 11:59:47 +09:00
Matthias Springer 8d0994ed21 [mlir][linalg][bufferize][NFC] Remove special casing of CallOps
Differential Revision: https://reviews.llvm.org/D113966
2021-11-23 11:14:10 +09:00
Matthias Springer b1083830d6 [mlir][linalg][bufferize][NFC] Clean up headers and function visibility
Differential Revision: https://reviews.llvm.org/D113964
2021-11-23 10:29:26 +09:00
Benjamin Kramer 966b720983 [mlir][memref] Fix expanded shape ops memref.cast folding with changed type
`memref.expand_shape` has verification logic to make sure that a
result dim is static if all the collapsing src dims are static.

This can be relaxed once expand_shape supports more dynamism.

Differential Revision: https://reviews.llvm.org/D114391
2021-11-22 22:56:15 +01:00
Christian Ulmann f6718fc6d3 [mlir] FlatAffineConstraint parsing for unit tests
This patch adds functionality to parse FlatAffineConstraints from a
StringRef with the intention to be used for unit tests. This should
make the construction of FlatAffineConstraints easier for testing
purposes.

The patch contains an example usage of the functionality in a unit test that
uses FlatAffineConstraints.

Reviewed By: bondhugula, grosser

Differential Revision: https://reviews.llvm.org/D113275
2021-11-23 03:04:30 +05:30
Groverkss 98daa4e425 [MLIR] Fix incorrect removal of source loop in loop fusion
This patch fixes a bug in loop fusion pass where the source loop was removed
even when the fused loop did not cover all iterations of the source loop.

This was because the fast heuristic check for whether the source loop and
fused loop have the same iterations did not take loop steps into account.

Reviewed By: dcaballe, bondhugula

Differential Revision: https://reviews.llvm.org/D114164
2021-11-23 02:54:09 +05:30
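A toy illustration of why a bounds-only comparison is not enough, assuming loops of the form `for (i = lb; i < ub; i += step)` with a positive step. `ToyLoop` and the helper names are hypothetical; this is not the code of the loop fusion pass.

```cpp
#include <cassert>
#include <cstdint>

// Toy loop descriptor for `for (i = lb; i < ub; i += step)`.
struct ToyLoop {
  std::int64_t lb;
  std::int64_t ub;
  std::int64_t step;
};

// Number of iterations the loop executes.
std::int64_t tripCount(const ToyLoop &loop) {
  assert(loop.step > 0 && "sketch only handles positive steps");
  if (loop.ub <= loop.lb)
    return 0;
  return (loop.ub - loop.lb + loop.step - 1) / loop.step;
}

// Bounds-only check: ignores the step, so it can report a match for
// loops that visit different iterations.
bool sameBounds(const ToyLoop &a, const ToyLoop &b) {
  return a.lb == b.lb && a.ub == b.ub;
}

// Check that also compares the step.
bool coversSameIterations(const ToyLoop &a, const ToyLoop &b) {
  return sameBounds(a, b) && a.step == b.step;
}
```

For example, loops over `[0, 16)` with steps 1 and 2 have identical bounds but run 16 and 8 iterations respectively, so removing the step-1 source loop would drop half of its iterations.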
Alexander Belyaev de18b7dee6 Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td."
This reverts commit 3028bca6a9.
For some reason using FallbackModel works with CMake and does not work
with bazel. Using `ExternalModel` works. I will check what's going on
and resubmit tomorrow.
2021-11-22 21:35:20 +01:00
Alexander Belyaev 3028bca6a9 [mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td.
Remove the interface from op defs in MemRefOps.td and make it an external model.

This is the first PR of many that will move bufferization-related ops, interfaces, passes to Dialect/Bufferize.
RFC: https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712
It is still debated if the comprehensive bufferization has to be moved there as well, so for now I am just moving the "gradual" bufferization.

Differential Revision: https://reviews.llvm.org/D114147
2021-11-22 21:00:59 +01:00
Mehdi Amini e0b7bee7cf Revert "[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)"
This reverts commit a9e236bed8.
This broke the Windows build:

mlir\include\mlir/Dialect/X86Vector/Transforms.h(28): error C2061: syntax error: identifier 'uint'
2021-11-22 19:23:18 +00:00
Lei Zhang 93284120f2 [mlir][vector] Fix TransferOpReduceRank for 0-D tensors
We cannot unconditionally generate memref.load ops for such cases;
need to check the source's type.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114376
2021-11-22 12:30:46 -05:00
Alex Zinenko 9c5982ef8e [mlir] support recursive types in type conversion infra
MLIR supports recursive types but they could not be handled by the conversion
infrastructure directly as it would result in infinite recursion in
`convertType` for elemental types. Support this case by keeping the "call
stack" of nested type conversions in the TypeConverter class and by passing it
as an optional argument to the individual conversion callback. The callback can
then check if a specific type is present on the stack more than once to detect
and handle the recursive case.

This approach is preferred to the alternative approach of having a separate
callback dedicated to handling only the recursive case as the latter was
observed to introduce ~3% time overhead on a 50MB IR file even if it did not
contain recursive types.

This approach is also preferred to keeping a local stack in type converters
that need to handle recursive types as that would compose poorly in case of
out-of-tree or cross-project extensions.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113579
2021-11-22 18:16:02 +01:00
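A toy sketch of the call-stack idea described in the commit above: the conversion routine receives the stack of types currently being converted and uses it to detect a recursive occurrence. `ToyType` and this `convertType` are hypothetical and string-based; they are not the MLIR `TypeConverter` API, and the check here runs before pushing rather than looking for a second occurrence on the stack.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy type: a name plus element types, which may refer back to the type
// itself to model a recursive type.
struct ToyType {
  std::string name;
  std::vector<const ToyType *> elements;
};

// Converts a type to a string form. The call stack holds the types whose
// conversion is in progress, so a type that is already on it is emitted as
// a reference instead of being expanded again.
std::string convertType(const ToyType &type,
                        std::vector<const ToyType *> &callStack) {
  if (std::find(callStack.begin(), callStack.end(), &type) != callStack.end())
    return "ref<" + type.name + ">";

  callStack.push_back(&type);
  std::string result = "conv_" + type.name;
  if (!type.elements.empty()) {
    result += "{";
    for (std::size_t i = 0; i < type.elements.size(); ++i) {
      if (i != 0)
        result += ", ";
      result += convertType(*type.elements[i], callStack);
    }
    result += "}";
  }
  callStack.pop_back();
  return result;
}
```

Without the stack check, converting a type that contains itself as an element would recurse without end.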
Arjun P 0512bf3540 [MLIR] PresburgerSetTest: fix comment and add a test case
2021-11-22 20:00:56 +05:30
Tobias Gysi 247a1a55eb [mlir][linalg] Use getAsOpFoldResult in padding (NFC).
After padding, we introduce an ExtractSliceOp to get the final unpadded result. This revision uses getAsOpFoldResult to compute the size of the unpadded result, which guarantees the result type has a partially static shape if some of the sizes of the unpadded result are statically known. At the moment, we rely on canonicalization to clean up the types after padding.

Depends On D114085

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114153
2021-11-22 13:15:19 +00:00
Tobias Gysi 32c43241e7 [mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors.
Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies followup transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit.

Depends On D114067

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114085
2021-11-22 13:12:43 +00:00
Tres Popp 106f307499 Rename MlirExecutionEngine lookup to lookupPacked
The purpose of the change is to make clear whether the user is
retrieving the original function or the wrapper function, in line with
the invoke commands. This new functionality is useful for users that
already have defined their own packed interface, so they do not want the
extra layer of indirection, or for users wanting to look at the
resulting primary function rather than the wrapper function.

All locations except the Python bindings now have a `lookupPacked`
method that matches the original `lookup` functionality. `lookup`
still exists, but with new semantics.

- `lookup` returns the function with a given name. If `bool f(int,int)`
is compiled, `lookup` will return a reference to `bool(*f)(int,int)`.
- `lookupPacked` returns the packed wrapper of the function with the
given name. If `bool f(int,int)` is compiled, `lookupPacked` will return
`void(*mlir_f)(void**)`.

Differential Revision: https://reviews.llvm.org/D114352
2021-11-22 14:12:09 +01:00
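A hand-written sketch of the difference between the two function shapes named in the commit above. The packed layout assumed here (one pointer per argument, followed by a pointer to the result slot) and the `callPacked` helper are assumptions for illustration; the wrapper is written by hand rather than produced by the execution engine.

```cpp
// The function as the user wrote it: bool f(int, int).
bool f(int a, int b) { return a < b; }

// Stand-in for a packed wrapper with the signature void(void **).
// Assumed layout: args[0] and args[1] point to the two int arguments,
// args[2] points to the bool result slot.
void mlir_f(void **args) {
  int a = *static_cast<int *>(args[0]);
  int b = *static_cast<int *>(args[1]);
  *static_cast<bool *>(args[2]) = f(a, b);
}

// Calls a packed wrapper by building the pointer array by hand.
bool callPacked(void (*packed)(void **), int a, int b) {
  bool result = false;
  void *args[] = {&a, &b, &result};
  packed(args);
  return result;
}
```

A caller holding the plain function pointer invokes `f(1, 2)` directly, while a caller holding the packed pointer has to marshal arguments as in `callPacked(mlir_f, 1, 2)`.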
Tobias Gysi f7751a3a42 [mlir][linalg] Remove tile and fuse test pass (NFC).
Remove the tile and fuse test pass that has been replaced by codegen strategy.

Depends On D114067

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114068
2021-11-22 12:33:31 +00:00
Nicolas Vasilache 050cc1cd6e [mlir] Add InitializeNativeTargetAsmParser to ExecutionEngine.
This is required to allow python to work with lowerings that use inline_asm.

Differential Revision: https://reviews.llvm.org/D114338
2021-11-22 11:28:14 +00:00
Tobias Gysi e3d386ea27 [mlir][linalg] Add a tile and fuse on tensors pattern.
Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests.

Depends On D114012

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114067
2021-11-22 11:13:21 +00:00
Nicolas Vasilache 789c88e80e [mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim
Differential Revision: https://reviews.llvm.org/D113933
2021-11-22 10:51:50 +00:00