Commit Graph

9259 Commits

Author SHA1 Message Date
Lei Zhang 93284120f2 [mlir][vector] Fix TransferOpReduceRank for 0-D tensors
We cannot unconditionally generate memref.load ops for such cases;
we need to check the source's type.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114376
2021-11-22 12:30:46 -05:00
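An illustrative C++ sketch of the kind of source-type check described above (not the actual code from D114376; the helper and variable names are hypothetical):
```
// Hypothetical helper: choose the load-like op based on the source type
// instead of unconditionally emitting memref.load.
#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/IR/Builders.h"

static mlir::Value loadScalar(mlir::OpBuilder &b, mlir::Location loc,
                              mlir::Value source, mlir::ValueRange indices) {
  if (source.getType().isa<mlir::MemRefType>())
    return b.create<mlir::memref::LoadOp>(loc, source, indices);
  // A (possibly 0-D) tensor source: extract instead of load.
  return b.create<mlir::tensor::ExtractOp>(loc, source, indices);
}
```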
Alex Zinenko 9c5982ef8e [mlir] support recursive types in type conversion infra
MLIR supports recursive types but they could not be handled by the conversion
infrastructure directly as it would result in infinite recursion in
`convertType` for elemental types. Support this case by keeping the "call
stack" of nested type conversions in the TypeConverter class and by passing it
as an optional argument to the individual conversion callback. The callback can
then check if a specific type is present on the stack more than once to detect
and handle the recursive case.

This approach is preferred to the alternative approach of having a separate
callback dedicated to handling only the recursive case as the latter was
observed to introduce ~3% time overhead on a 50MB IR file even if it did not
contain recursive types.

This approach is also preferred to keeping a local stack in type converters
that need to handle recursive types as that would compose poorly in case of
out-of-tree or cross-project extensions.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113579
2021-11-22 18:16:02 +01:00
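An illustrative C++ sketch of the mechanism described above, assuming the optional callback parameter is an `ArrayRef<Type>` call stack as the commit suggests; `MyRecursiveType` is a hypothetical recursive type:
```
#include "mlir/Transforms/DialectConversion.h"
#include "llvm/ADT/STLExtras.h"

void addRecursiveTypeConversion(mlir::TypeConverter &converter) {
  converter.addConversion(
      [](MyRecursiveType type, llvm::SmallVectorImpl<mlir::Type> &results,
         llvm::ArrayRef<mlir::Type> callStack)
          -> llvm::Optional<mlir::LogicalResult> {
        // If this type already appears further up the conversion stack,
        // return it as-is to break the recursion in convertType.
        if (llvm::count(callStack, mlir::Type(type)) > 1) {
          results.push_back(type);
          return mlir::success();
        }
        // Otherwise convert the element types as usual (elided here).
        return llvm::None;
      });
}
```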
Arjun P 0512bf3540 [MLIR] PresburgerSetTest: fix comment and add a test case 2021-11-22 20:00:56 +05:30
Tobias Gysi 247a1a55eb [mlir][linalg] Use getAsOpFoldResult in padding (NFC).
After padding, we introduce an ExtractSliceOp to get the final unpadded result. This revision uses getAsOpFoldResult to compute the size of the unpadded result, which guarantees the result type has a partially static shape if some of the sizes of the unpadded result are statically known. At the moment, we rely on canonicalization to clean up the types after padding.

Depends On D114085

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114153
2021-11-22 13:15:19 +00:00
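A hedged C++ sketch of what getAsOpFoldResult provides for the unpadded sizes; the header location and exact behavior are my understanding of the utility, not taken from the patch:
```
#include "mlir/Dialect/Utils/StaticValueUtils.h"

// For a size produced by `arith.constant 4 : index`, this returns the
// IntegerAttr 4, letting the ExtractSliceOp result type carry a static
// dimension; otherwise the Value itself is returned and the dimension
// stays dynamic.
mlir::OpFoldResult sizeAsOpFoldResult(mlir::Value unpaddedSize) {
  return mlir::getAsOpFoldResult(unpaddedSize);
}
```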
Tobias Gysi 32c43241e7 [mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors.
Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies followup transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit.

Depends On D114067

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114085
2021-11-22 13:12:43 +00:00
Tres Popp 106f307499 Rename MlirExecutionEngine lookup to lookupPacked
The purpose of the change is to make clear whether the user is
retrieving the original function or the wrapper function, in line with
the invoke commands. This new functionality is useful for users that
have already defined their own packed interface and do not want the
extra layer of indirection, or for users wanting to look at the
resulting primary function rather than the wrapper function.

All locations, except the python bindings now have a `lookupPacked`
method that matches the original `lookup` functionality. `lookup`
still exists, but with new semantics.

- `lookup` returns the function with a given name. If `bool f(int,int)`
is compiled, `lookup` will return a reference to `bool(*f)(int,int)`.
- `lookupPacked` returns the packed wrapper of the function with the
given name. If `bool f(int,int)` is compiled, `lookupPacked` will return
`void(*mlir_f)(void**)`.

Differential Revision: https://reviews.llvm.org/D114352
2021-11-22 14:12:09 +01:00
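A hedged C++ sketch of the two lookups for a compiled `bool f(int, int)`; the return types and the packed argument layout follow the description above and may not match the exact API:
```
#include "mlir/ExecutionEngine/ExecutionEngine.h"
#include "llvm/Support/Error.h"

void callBoth(mlir::ExecutionEngine &engine) {
  // `lookup` resolves the original function, i.e. bool(*)(int, int).
  llvm::Expected<void *> raw = engine.lookup("f");
  if (!raw) {
    llvm::consumeError(raw.takeError());
    return;
  }
  auto *f = reinterpret_cast<bool (*)(int, int)>(*raw);
  bool direct = f(1, 2);

  // `lookupPacked` resolves the packed wrapper, i.e. void(*)(void **).
  auto packed = engine.lookupPacked("f");
  if (!packed) {
    llvm::consumeError(packed.takeError());
    return;
  }
  int a = 1, b = 2;
  bool viaWrapper = false;
  void *args[] = {&a, &b, &viaWrapper}; // arguments, then result storage
  (*packed)(args);
  (void)direct;
  (void)viaWrapper;
}
```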
Tobias Gysi f7751a3a42 [mlir][linalg] Remove tile and fuse test pass (NFC).
Remove the tile and fuse test pass that has been replaced by codegen strategy.

Depends On D114067

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114068
2021-11-22 12:33:31 +00:00
Nicolas Vasilache 050cc1cd6e [mlir] Add InitializeNativeTargetAsmParser to ExecutionEngine.
This is required to allow Python to work with lowerings that use inline_asm.

Differential Revision: https://reviews.llvm.org/D114338
2021-11-22 11:28:14 +00:00
Tobias Gysi e3d386ea27 [mlir][linalg] Add a tile and fuse on tensors pattern.
Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests.

Depends On D114012

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114067
2021-11-22 11:13:21 +00:00
Nicolas Vasilache 789c88e80e [mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim
Differential Revision: https://reviews.llvm.org/D113933
2021-11-22 10:51:50 +00:00
Tobias Gysi 0ccc44cec0 [mlir][linalg] Fix tile and fuse for outermost reduction.
Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114012
2021-11-22 10:44:15 +00:00
Nicolas Vasilache a9e236bed8 [mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics-based and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Reviewed By: ftynse, dcaballe

Differential Revision: https://reviews.llvm.org/D114335
2021-11-22 10:32:34 +00:00
Arjun P d92aabc336 [MLIR][NFC] Simplex: remove repeated words in comment 2021-11-22 15:50:03 +05:30
Jacques Pienaar e5a4d0f149 [mlir] Fix unused function warning (NFC)
Delete function no longer needed as all derived classes override
printer.
2021-11-21 15:06:08 -08:00
Jacques Pienaar 6f9cceb775 [mlir] Move trait to InferTypeOpInterface
Step towards removing the hard-coded behavior for this trait and instead using the common interface.

Differential Revision: https://reviews.llvm.org/D114208
2021-11-21 14:41:12 -08:00
Arjun P ad48ef1e31 [MLIR][NFC] Simplex::restoreRow: improve documentation 2021-11-21 19:23:55 +05:30
Arnab Dutta ec7b0d4d34 [MLIR] Simplify Semi-affine expressions by rule based matching and replacing "expr - q * (expr floordiv q)" with "expr mod q" expression.
Add rule-based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic expression, in the simplifyAdd function.

Reviewed By: bondhugula, dcaballe

Differential Revision: https://reviews.llvm.org/D112985
2021-11-20 21:05:36 +05:30
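An illustrative C++ check of the new rule (assuming simplifyAffineExpr is the public entry point that reaches simplifyAdd):
```
#include "mlir/IR/AffineExpr.h"
#include "mlir/IR/MLIRContext.h"

void simplifySemiAffineExample(mlir::MLIRContext *ctx) {
  mlir::AffineExpr d0 = mlir::getAffineDimExpr(0, ctx);
  mlir::AffineExpr s0 = mlir::getAffineSymbolExpr(0, ctx);

  // expr = d0 + s0 * 2, q = s0 (a symbolic divisor).
  mlir::AffineExpr expr = d0 + s0 * 2;
  mlir::AffineExpr semiAffine = expr - s0 * expr.floorDiv(s0);

  // With this rule, the expression folds to (d0 + s0 * 2) mod s0.
  mlir::AffineExpr simplified =
      mlir::simplifyAffineExpr(semiAffine, /*numDims=*/1, /*numSymbols=*/1);
  (void)simplified;
}
```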
Arnab Dutta 1f9ca5adba [MLIR] Avoid creation of buggy affine maps while replacing dimension and symbol
Previously, the call to AffineMap::get() was made before the newly composed
dimensions and symbols were appended to the dimension and symbol lists whose
sizes are passed to AffineMap::get(), resulting in a wrong dimCount and
symbolCount being passed as arguments. We move the call to AffineMap::get()
to after the dimension and symbol lists are updated.

Differential Revision: https://reviews.llvm.org/D114237
2021-11-20 12:01:29 +05:30
Krzysztof Drewniak a6f53afbcb [MLIR][GPU] Link in device libraries during HSA compilation if needed
To perform some operations, such as sin() or printf(), code compiled
for AMD GPUs must be linked to a series of device libraries. This
commit adds support for linking in these libraries.

However, since these device libraries are delivered as LLVM bitcode,
raising the possibility of version incompatibilities, this commit only
links in libraries when the functions from those libraries are called
by the code being compiled.

This code also sets the math flags to their most conservative values,
as MLIR doesn't have a `-ffast-math` equivalent.

Depends on D114114

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114117
2021-11-19 22:29:37 +00:00
rdzhabarov d729f4c38f [mlir] Bug fix. Stream must outlive the pass manager.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D114277
2021-11-19 21:45:43 +00:00
Krzysztof Drewniak 20f79f8caa [MLIR][GPU] Make the path to ROCm a runtime option
Our current build assumes that the path to ROCm we find at build time
will be the path at which ROCm is located when the built code is
executed. This commit adds a --rocm-path option to SerializeToHsaco,
and removes the HIP dependency that SerializeToHsaco previously had.

Depends on D114113

(though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107)

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114114
2021-11-19 20:51:54 +00:00
Stella Laurenzo 3fcdd182e9 NFC: Callout restriction on folding 0-result ops in documentation.
Differential Revision: https://reviews.llvm.org/D114271
2021-11-19 20:35:01 +00:00
Krzysztof Drewniak bd22554af0 [MLIR][GPU] Run generic LLVM optimizations when serializing (on AMD)
- Adds hooks that allow SerializeTo* passes to arbitrarily transform
the produced LLVM Module before it is passed to the code generation
passes.

- Uses these hooks within the SerializeToHsaco pass in order to run
LLVM optimizations and to set the optimization level on the
TargetMachine.

- Adds an optLevel parameter to SerializeToHsaco

Future work may include moving much of what's been added to
SerializeToHsaco to SerializeToBlob, but that would require
confirmation from the NVVM backend maintainers that it would be
appropriate to do so.

Depends on D114107

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114113
2021-11-19 19:21:24 +00:00
Thomas Raoux 47555d73f6 [mlir][gpu] Extend shuffle op modes and add nvvm lowering
Add up, down, and idx modes to gpu shuffle ops, and change the mode from a
string to an enum.

Differential Revision: https://reviews.llvm.org/D114188
2021-11-19 11:14:31 -08:00
Thomas Raoux 7cde516513 [mlir][vector] NFC, move some vector patterns in a separate file
Move patterns related to dropping lead unit dim into their own file.

Differential Revision: https://reviews.llvm.org/D114265
2021-11-19 10:39:29 -08:00
Thomas Raoux 06dbb28569 [mlir][vector] Remove usage of shapecast to remove unit dim
Instead of using the shape_cast op in the pattern removing leading unit
dimensions, we use extract/broadcast ops. This is part of the effort to
further restrict ShapeCastOp in the future and only allow it to convert
to or from 1D vectors.

This also adds extra canonicalization to fill the gaps in simplifying
broadcast/extract ops.

Differential Revision: https://reviews.llvm.org/D114205
2021-11-19 10:25:21 -08:00
Krzysztof Drewniak f849640a0c [MLIR] Make the ROCM integration tests runnable
- Move the #defines from GPU Ops to the GPU Transform library so that
SerializeToHsaco is non-trivially compiled

- Add required includes to SerializeToHsaco

- Move MCSubtargetInfo creation to the correct point in the
compilation process

- Change mlir in ROCM tests to account for renamed/moved ops

Differential Revision: https://reviews.llvm.org/D114184
2021-11-19 17:09:53 +00:00
Valentin Clement 78d69182b7 [mlir] Expose region utils functions
As discussed in D109579, this patch exposes `runRegionDCE` and
`eraseUnreachableBlocks` so they can be used as separate utilities in
other passes.

Reviewed By: rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D114160
2021-11-19 09:24:39 +01:00
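A hedged sketch of how a pass might use the exposed utilities; the signatures (in particular the RewriterBase parameter and the meaning of the returned LogicalResult) are assumptions based on the existing region-simplification code:
```
#include "mlir/IR/PatternMatch.h"
#include "mlir/Transforms/RegionUtils.h"

mlir::LogicalResult cleanupRegions(mlir::Operation *op) {
  mlir::IRRewriter rewriter(op->getContext());
  // Assumed: success() indicates the utility changed the IR.
  bool changed = false;
  changed |=
      mlir::succeeded(mlir::eraseUnreachableBlocks(rewriter, op->getRegions()));
  changed |= mlir::succeeded(mlir::runRegionDCE(rewriter, op->getRegions()));
  return mlir::success(changed);
}
```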
Mogball 7c5ecc8b7e [mlir][vector] Insert/extract element can accept index
`vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position`
operand changed to accept `AnySignlessIntegerOrIndex` for better operability with
operations that use `index`, such as affine loops.

LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering
directly to these operations without explicitly inserting casts is allowed. SPIRV's
equivalent ops can also accept `i64`.

Reviewed By: nicolasvasilache, jpienaar

Differential Revision: https://reviews.llvm.org/D114139
2021-11-18 22:40:29 +00:00
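An illustrative builder-side sketch (header path approximate for this revision):
```
#include "mlir/Dialect/Vector/VectorOps.h"
#include "mlir/IR/Builders.h"

// Before this change the position had to be a signless integer; an
// `index`-typed value such as an affine loop induction variable is now
// accepted directly, without an explicit cast.
mlir::Value extractAt(mlir::OpBuilder &b, mlir::Location loc, mlir::Value vec,
                      mlir::Value indexTypedPos) {
  return b.create<mlir::vector::ExtractElementOp>(loc, vec, indexTypedPos);
}
```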
Arjun P 3b7b4a8041 [MLIR][NFC] Simplex::markRowRedundant: assert that row is not already marked redundant 2021-11-19 03:43:25 +05:30
MaheshRavishankar d26beb0be2 [mlir][Linalg] Add method to check if LinalgTransformationFilter has been applied.
Differential Revision: https://reviews.llvm.org/D114170
2021-11-18 13:45:30 -08:00
Markus Böck 0a8a5902a6 [mlir] Fully qualify default generated type/attribute printer and parser
This patch makes it possible to use the newly added useDefaultAttributePrinterParser and useDefaultTypePrinterParser dialect options without any using namespace declarations. Two things had to be done to make this possible:

* Fully qualify any type usages or functions from the mlir namespace in the generated C++ code
* Make sure to emit the printers and parsers inside the same namespace as the Dialect

Differential Revision: https://reviews.llvm.org/D114168
2021-11-18 20:24:00 +01:00
Quinn Pham a1504281b6 [NFC][mlir] Inclusive language: Replace an instance of master in docs
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `SPIR-V.md`.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114091
2021-11-18 13:10:32 -06:00
MaheshRavishankar 526dfe3f4d [mlir][Linalg] Do not return failure when all tile sizes are zero.
Returning failure when all tile sizes are zero prevents the change to
the marker. This makes the pattern rewriter run the pattern multiple times
only to exit when it hits a limit. Instead, just clone the operation
(since tiling is essentially cloning in this case). The transformation
filter then kicks in and prevents the pattern rewriter from being
invoked many times.

Differential Revision: https://reviews.llvm.org/D113949
2021-11-18 09:28:25 -08:00
Jacques Pienaar 1dc1c944d8 [mlir][doc] Avoid name overlap that confuses copy_docs.sh (NFC) 2021-11-18 09:03:49 -08:00
Krzysztof Drewniak fb1a06aa13 [MLIR][GPU] Add target arguments to SerializeToHsaco
Compiling code for AMD GPUs requires knowledge of which chipset is
being targeted, especially if the code uses chipset-specific
intrinsics (which is the case in a downstream convolution generator).
This commit adds `target`, `chipset` and `features` arguments to the
SerializeToHsaco constructor to enable passing in this required
information.

It also amends the ROCm integration tests to pass in the target
chipset, which is set to the chipset of the first GPU on the system
executing the tests.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114107
2021-11-18 16:28:44 +00:00
Jacques Pienaar a3f2be18b8 [mlir][doc] Rename doc to match previous name
Previous change inadvertently changed link.
2021-11-18 08:23:49 -08:00
Michal Terepeta 54c9984207 [mlir][Python] Fix generation of accessors for Optional
Previously, if there was only one `Optional` operand/result within
the list, we would always return `None` from the accessor; e.g., for a
single optional result we would generate:

```
return self.operation.results[0] if len(self.operation.results) > 1 else None
```

But what we really want is to return `None` only if the length of
`results` is smaller than the total number of element groups (i.e.,
the optional operand/result is in fact missing).

This commit also renames a few local variables in the generator to make
the distinction between `isVariadic()` and `isVariableLength()` a bit
more clear.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D113855
2021-11-18 09:42:57 +01:00
Matthias Springer ebf8d74e92 [mlir][linalg][bufferize] Fix bufferize bug where non-tensor ops are not skipped
`BufferizableOpInterface::bufferize` will only be called on ops that
have tensor operands and/or results.

Differential Revision: https://reviews.llvm.org/D113962
2021-11-18 16:20:22 +09:00
Matthias Springer 26e90423f4 [mlir][linalg][bufferize][NFC] Decouple ComprehensiveBufferize from tensor dialect
Add a new BufferizableOpInterface method `isNotConflicting` that can be used to implement custom analysis rules.

Differential Revision: https://reviews.llvm.org/D113961
2021-11-18 16:11:24 +09:00
River Riddle 0c7890c844 [mlir] Convert NamedAttribute to be a class
NamedAttribute is currently represented as an std::pair, but this
creates an extremely clunky .first/.second API. This commit
converts it to a class, with better accessors (getName/getValue)
and also opens the door for more convenient API in the future.

Differential Revision: https://reviews.llvm.org/D113956
2021-11-18 05:39:29 +00:00
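An illustrative C++ sketch of the accessor change (the iteration code is generic, not from the patch):
```
#include "mlir/IR/Operation.h"
#include "llvm/Support/raw_ostream.h"

void printAttrNames(mlir::Operation *op) {
  for (mlir::NamedAttribute attr : op->getAttrs()) {
    // Previously: attr.first / attr.second.
    llvm::outs() << attr.getName() << " = " << attr.getValue() << "\n";
  }
}
```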
Aart Bik 1ce77b562d [mlir][sparse] refine lexicographic insertion to any tensor
The first version was vectors only. With some clever "path" insertion,
we now support any d-dimensional tensor. Up next: reductions.

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D114024
2021-11-17 18:08:42 -08:00
Robert Suderman 6e41a06911 [mlir][tosa] Revert add-0 canonicalization for floating-point
Floating-point optimization can produce incorrect numerical results for
the -0.0 + 0.0 optimization, as the result needs to be -0.0.

Reviewed By: eric-k256

Differential Revision: https://reviews.llvm.org/D114127
2021-11-17 17:29:57 -08:00
J. Ryan Stinnett 1f7827e6aa [MLIR][Docs] Fix link syntax in Rationale.md 2021-11-17 23:38:19 +00:00
Rob Suderman 044e7e013e [mlir][tosa] Fixed shape inference for tosa.transpose_conv2d
Transpose conv2d shape inference was incorrect, and the tests did not properly
validate that the shape inference was executing. Corrected the shape inference
and extended the tests to actually execute it.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D114026
2021-11-17 14:59:52 -08:00
River Riddle edc6c0ecb9 [mlir] Refactor AbstractOperation and OperationName
The current implementation is quite clunky; OperationName stores either an Identifier
or an AbstractOperation that corresponds to an operation. This has several problems:

* OperationNames created before and after an operation is registered are different
* Accessing the identifier name/dialect/etc. from an OperationName is overly branchy
  - they need to dyn_cast a PointerUnion to check the state

This commit refactors this such that we create a single information struct for every
operation name, even operations that aren't registered yet. When an OperationName is
created for an unregistered operation, we only populate the name field. When the
operation is registered, we populate the remaining fields. With this we now have two
new classes: OperationName and RegisteredOperationName. These both point to the
same underlying operation information struct, but only RegisteredOperationName can
assume that the operation is actually registered. This leads to a much cleaner API, and
we can also move some AbstractOperation functionality directly to OperationName.

Differential Revision: https://reviews.llvm.org/D114049
2021-11-17 22:29:57 +00:00
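A hedged sketch of the resulting API shape; the method names (getRegisteredInfo, getDialect) are assumptions based on the commit description:
```
#include "mlir/IR/OperationSupport.h"

void inspect(mlir::OperationName name) {
  // Available whether or not the operation is registered.
  llvm::StringRef opName = name.getStringRef();
  (void)opName;

  // Registered-only information lives behind RegisteredOperationName.
  if (llvm::Optional<mlir::RegisteredOperationName> info =
          name.getRegisteredInfo()) {
    // Traits, interfaces, the owning dialect, etc. can be queried here.
    mlir::Dialect &dialect = info->getDialect();
    (void)dialect;
  }
}
```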
Jacques Pienaar 0d0c46a35b [mlir] Improve documentation of shape dialect
Add small example of usage (brief which will be further refined).
2021-11-17 14:07:06 -08:00
Alex Zinenko bca003dea8 [mlir] Fix wrong variable name in Linalg OpDSL
The name seems to have been left over from a renaming effort on an unexercised
codepath that is difficult to catch in Python. Fix it and add a test that
exercises the codepath.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D114004
2021-11-17 22:55:35 +01:00
Michal Terepeta ddf2d62c7d [mlir][Vector] First step for 0D vector type
There seems to be a consensus that we should allow 0D vectors:
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097

This commit is only the first step: it changes the verifier and the parser to
allow vectors like `vector<f32>` (but does not allow explicit 0 dimensions,
i.e., `vector<0xf32>` is not allowed).

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114086
2021-11-17 14:58:24 +00:00
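An illustrative C++ sketch of building the newly allowed rank-0 vector type:
```
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/MLIRContext.h"

mlir::VectorType makeZeroDVector(mlir::MLIRContext *ctx) {
  mlir::Type f32 = mlir::FloatType::getF32(ctx);
  // Empty shape, printed as `vector<f32>`. An explicit zero dimension
  // (`vector<0xf32>`) remains disallowed.
  return mlir::VectorType::get(/*shape=*/{}, f32);
}
```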
Mogball 209dadf269 [mlir] Fix formatting in Ops.td files (NFC)
MemRefOps.td has some inconsistencies in its formatting of argument
lists.
2021-11-17 00:59:42 +00:00