Commit Graph

7036 Commits

Author SHA1 Message Date
Gaurav Shukla 8e3075c2b0 [MLIR] Fix lowering of Affine IfOp in the presence of yield values.
This commit fixes the lowering of `Affine.IfOp` to `SCF.IfOp` in the
presence of yield values. These changes have been made as a part of
`-lower-affine` pass.

Differential Revision: https://reviews.llvm.org/D98760
2021-03-17 16:33:32 +05:30
River Riddle e60d57451e [mlir][Python] Fix test broken after D98474 2021-03-16 16:52:20 -07:00
River Riddle caa7038a89 [mlir][IR] Move the remaining builtin attributes to ODS.
With this revision, all builtin attributes and types will have been moved to the ODS generator.

Differential Revision: https://reviews.llvm.org/D98474
2021-03-16 16:31:53 -07:00
River Riddle 425e11eea1 [mlir][AttrTypeDefGen] Add support for custom parameter comparators
Some parameters to attributes and types rely on special comparison routines other than operator== to ensure equality. This revision adds support for those parameters by allowing them to specify a `comparator` code block that determines if `$_lhs` and `$_rhs` are equal. An example of one of these parameters is APFloat, which requires `bitwiseIsEqual` for bitwise comparison (which we want for attribute equality).

Differential Revision: https://reviews.llvm.org/D98473
2021-03-16 16:31:53 -07:00
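To illustrate why `operator==` is not enough for a parameter such as APFloat, the following Python sketch contrasts value equality with bitwise equality on IEEE-754 doubles. It uses only the standard library. `bitwise_is_equal` is a hypothetical stand-in for the idea behind `bitwiseIsEqual`, not MLIR's or LLVM's API.

```python
import struct


def bits(x: float) -> int:
    """Return the raw 64-bit IEEE-754 encoding of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bitwise_is_equal(lhs: float, rhs: float) -> bool:
    """Hypothetical bitwise comparator: equal iff the encodings match."""
    return bits(lhs) == bits(rhs)


nan = float("nan")

# Value equality: NaN never equals itself, and +0.0 equals -0.0.
value_eq_nan = (nan == nan)
value_eq_zero = (0.0 == -0.0)

# Bitwise equality: the same NaN encoding matches, and the two zeros differ.
bitwise_eq_nan = bitwise_is_equal(nan, nan)
bitwise_eq_zero = bitwise_is_equal(0.0, -0.0)
```

With value equality, two attributes holding the same NaN would never compare equal, and `0.0` and `-0.0` would be conflated; a bitwise comparator avoids both problems.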
River Riddle 1f13963ec1 [mlir][pdl] Cast the OperationPosition to Position to fix MSVC miscompile
If we don't cast, MSVC picks an overload that hasn't been defined yet (not sure why) and miscompiles.
2021-03-16 16:11:14 -07:00
Eugene Zhulenev 74f6138bd9 [mlir] Add lowering from math::Log1p to LLVM
Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D98662
2021-03-16 15:59:09 -07:00
River Riddle 85ab413b53 [mlir][PDL] Add support for variadic operands and results in the PDL byte code
Supporting ranges in the byte code requires additional complexity, given that a range can't easily be represented as an opaque void *, as is possible with the existing bytecode value types (Attribute, Type, Value, etc.). To enable representing a range with void *, auxiliary storage is used for the actual range itself, with the pointer being passed around in the normal byte code memory. For type ranges, a TypeRange is stored. For value ranges, a ValueRange is stored. This problem accounts for the majority of the complexity in this revision; the rest is adapting/adding byte code operations to support the changes made to the PDL interpreter in the parent revision.

After this revision, PDL will have initial end-to-end support for variadic operands/results.

Differential Revision: https://reviews.llvm.org/D95723
2021-03-16 13:20:19 -07:00
River Riddle 3a833a0e0e [mlir][PDL] Add support for variadic operands and results in the PDL Interpreter
This revision extends the PDL Interpreter dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, five new operations have been added that closely match their single-value variants:
* pdl_interp.check_types : Compare a range of types with a known range.
* pdl_interp.create_types : Create a constant range of types.
* pdl_interp.get_operands : Get a range of operands from an operation.
* pdl_interp.get_results : Get a range of results from an operation.
* pdl_interp.switch_types : Switch on a range of types.

This revision handles adding support in the interpreter dialect and the conversion from PDL to PDLInterp. Support for variadic operands and results in the bytecode will be added in a followup revision.

Differential Revision: https://reviews.llvm.org/D95722
2021-03-16 13:20:19 -07:00
River Riddle 1eb6994d6a [mlir][PDL] Add support for variadic operands and results in PDL
This revision extends the PDL dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, three new operations have been added that closely match the single variant:
* pdl.operands : Define a range of input operands.
* pdl.results : Extract a result group from an operation.
* pdl.types : Define a handle to a range of types.

Support for these in the pdl interpreter dialect and byte code will be added in followup revisions.

Differential Revision: https://reviews.llvm.org/D95721
2021-03-16 13:20:18 -07:00
River Riddle 02c4c0d5b2 [mlir][pdl] Remove CreateNativeOp in favor of a more general ApplyNativeRewriteOp.
This has numerous benefits, given the overly clunky nature of CreateNativeOp:
* Users can now call into arbitrary rewrite functions from inside of PDL, allowing for more natural interleaving of PDL/C++ and enabling more of the pattern to be in PDL.
* Removes the need for an additional set of C++ functions/registry/etc. The new ApplyNativeRewriteOp will use the same PDLRewriteFunction as the existing RewriteOp. This reduces the API surface area exposed to users.

This revision also introduces a new PDLResultList class. This class is used to provide results of native rewrite functions back to PDL. We introduce a new class instead of using a SmallVector to simplify the work necessary for variadics, given that ranges will require some changes to the structure of PDLValue.

Differential Revision: https://reviews.llvm.org/D95720
2021-03-16 13:20:18 -07:00
River Riddle 242762c9a3 [mlir][pdl] Restructure how results are represented.
Up until now, results have been represented as additional results to a pdl.operation. This is fairly clunky, as it mismatches the representation of the rest of the IR constructs (e.g. pdl.operand) and also isn't a viable representation for operations returned by pdl.create_native. This representation also creates much more difficult problems when factoring in support for variadic result groups, optional results, etc. To resolve some of these problems, and to simplify adding support for variable-length results, this revision extracts the representation for results out of pdl.operation in the form of a new `pdl.result` operation. This operation returns the result of an operation at a given index, e.g.:

```
%root = pdl.operation ...
%result = pdl.result 0 of %root
```

Differential Revision: https://reviews.llvm.org/D95719
2021-03-16 13:20:18 -07:00
Aart Bik b85d3e27ad [mlir][amx] reformatted examples
Examples were missing the underscore of the actual ops format.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D98723
2021-03-16 10:24:57 -07:00
Aart Bik b388bbd3f9 [mlir][amx] blocked tilezero integration test
This adds a new integration test. It also adapts
existing tests to a recent memref.XXX change.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D98680
2021-03-16 08:49:31 -07:00
Nicolas Vasilache b661788b77 [mlir] NFC - Expose GlobalCreator so it can be reused. 2021-03-16 12:29:04 +00:00
Adrian Kuegel 2995e161b0 [mlir]: Add canonicalization for dim of 1D alloc of size rank.
Differential Revision: https://reviews.llvm.org/D97542
2021-03-16 10:38:57 +01:00
David Zarzycki 1d297f9064 [lit] Sort test start times based on prior test timing data
Lit as it exists today has three hacks that allow users to run tests earlier:

1) An entire test suite can set the `is_early` boolean.
2) A very recently introduced "early_tests" feature.
3) The `--incremental` flag forces failing tests to run first.

All of these approaches have problems.

1) The `is_early` feature was until very recently undocumented. It still lacks testing and is an imprecise way of optimizing test starting times.
2) The `early_tests` feature requires manual updates and doesn't scale.
3) `--incremental` is undocumented, untested, and it requires modifying the *source* file system by "touching" the file. This "touch" based approach is arguably a hack because it confuses editors (because it looks like the test was modified behind the back of the editor) and "touching" the test source file doesn't work if the test suite is read only from the perspective of `lit` (via advanced filesystem/build tricks).

This patch attempts to simplify and address all of the above problems.

This patch formalizes, documents, tests, and defaults lit to recording the execution time of tests and then reordering all tests during the next execution. By reordering the tests, high core count machines run faster, sometimes significantly so.

This patch also always runs failing tests first, which is a positive user experience win for those that didn't know about the hidden `--incremental` flag.

Finally, if users want, they can _optionally_ commit the test timing data (or a subset thereof) back to the repository to accelerate bots and first-time runs of the test suite.

Reviewed By: jhenderson, yln

Differential Revision: https://reviews.llvm.org/D98179
2021-03-16 05:23:04 -04:00
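The reordering described in the lit commit above can be sketched in a few lines of Python. This is a toy model, not lit's implementation: the `PriorRun` record, the `order_tests` helper, and the choice to start the slowest tests first are assumptions made for illustration. The commit itself only states that failing tests run first and that tests are reordered using recorded execution times.

```python
from typing import Dict, List, Tuple

# Hypothetical record of the previous run: test name -> (failed, seconds).
PriorRun = Dict[str, Tuple[bool, float]]


def order_tests(tests: List[str], prior: PriorRun) -> List[str]:
    """Order tests: previous failures first, then longest-running first.

    Tests without prior data are treated as passing with zero duration,
    which is an arbitrary choice made for this sketch.
    """
    def key(name: str):
        failed, seconds = prior.get(name, (False, 0.0))
        # `not failed` is False for failures, and False sorts before True.
        # Negating `seconds` makes slower tests start earlier.
        return (not failed, -seconds, name)

    return sorted(tests, key=key)


prior = {
    "a.mlir": (False, 0.2),
    "b.mlir": (False, 7.5),
    "c.mlir": (True, 0.1),
}
ordered = order_tests(["a.mlir", "b.mlir", "c.mlir", "d.mlir"], prior)
```

Starting long tests early keeps all cores busy until the end of the run, which is one plausible reason high core count machines benefit the most.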
Lorenzo Chelini fd7eee64c5 scf::ForOp: Fold away iterator arguments with no use and for which the corresponding input is yielded
Enhance 'ForOpIterArgsFolder' to remove unused iteration arguments in a
scf::ForOp. If the block argument corresponding to the given iterator has no
use and the yielded value equals the input, we fold it away.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D98503
2021-03-16 07:01:25 +00:00
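The folding condition in the scf::ForOp commit above can be modeled as a small predicate. The Python sketch below is a simplified illustration, not the actual `ForOpIterArgsFolder` code: `IterArg`, `foldable`, and `fold_iter_args` are hypothetical names, and SSA values are represented as plain strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IterArg:
    """Toy model of one scf.for iteration argument (hypothetical)."""
    init: str      # value passed in as the initial operand
    yielded: str   # value yielded for this position by the loop body
    num_uses: int  # uses of the corresponding block argument in the body


def foldable(arg: IterArg) -> bool:
    """An iter arg can be dropped if it is unused and the loop yields its input."""
    return arg.num_uses == 0 and arg.yielded == arg.init


def fold_iter_args(args: List[IterArg]) -> List[IterArg]:
    """Keep only the iteration arguments that the rule cannot remove."""
    return [a for a in args if not foldable(a)]


args = [
    IterArg(init="%a", yielded="%a", num_uses=0),    # unused, yields input: folded
    IterArg(init="%b", yielded="%sum", num_uses=2),  # a real reduction: kept
    IterArg(init="%c", yielded="%c", num_uses=1),    # still used in the body: kept
]
kept = fold_iter_args(args)
```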
Aart Bik 6ad7b97e20 [mlir][amx] Add Intel AMX dialect (architectural-specific vector dialect)
The Intel Advanced Matrix Extensions (AMX) provides a tile matrix
multiply unit (TMUL), a tile control register (TILECFG), and eight
tile registers TMM0 through TMM7 (TILEDATA). This new MLIR dialect
provides a bridge between MLIR concepts like vectors and memrefs
and the lower level LLVM IR details of AMX.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D98470
2021-03-15 17:59:05 -07:00
Alex Zinenko b868a3edad [mlir] fix SPIR-V CPU and Vulkan runners after e2310704d8
The commit in question changed the syntax but did not update the runner
tests. This also required registering the MemRef dialect for custom
parser to work correctly.
2021-03-15 18:36:58 +01:00
Christopher Tetreault 39970764af [CMake] Require python 3.6 if enabling LLVM test targets
The lit test suite uses python 3.6 features. Rather than a strange
python syntax error upon running the lit tests, we will require the
correct version in CMake.

Reviewed By: serge-sans-paille, yln

Differential Revision: https://reviews.llvm.org/D95635
2021-03-15 09:50:39 -07:00
Alex Zinenko 0aceb61665 [mlir] make memref.cast implement ViewLikeOpInterface
This was seemingly dropped in e2310704d8,
potentially due to a misrebase. The absence of this trait makes aliasing
analysis incorrect, leading to, e.g., buffer deallocation pass inserting
deallocations too early.
2021-03-15 17:21:27 +01:00
Alex Zinenko 7aa6f3aa0c [mlir] fix integration tests post e2310704d8
The commit in question moved some ops across dialects but did not update
some of the target-specific integration tests that use these ops,
presumably because the corresponding target hardware was not available.
Fix these tests.
2021-03-15 14:41:27 +01:00
Alex Zinenko e82a30bdce [mlir] enable Python bindings for the MemRef dialect
A previous commit moved multiple ops from Standard to MemRef dialect.
Some of these ops are exercised in Python bindings. Enable bindings for
the newly created MemRef dialect and update a test accordingly.
2021-03-15 14:07:51 +01:00
Alex Zinenko 0fb4a201c0 [mlir] fix shared-lib build fallout of e2310704d8
The patch in question broke the build with shared libraries due to
missing dependencies, one of which would have been circular between
MLIRStandard and MLIRMemRef if added. Fix this by moving more code
around and swapping the dependency direction. MLIRMemRef now depends on
MLIRStandard, but MLIRStandard does _not_ depend on MLIRMemRef.
Arguably, this is the right direction anyway since numerous libraries
depend on MLIRStandard and don't necessarily need to depend on
MLIRMemRef.

Other notable changes include:
- some EDSC code is moved inline to MemRef/EDSC/Intrinsics.h because it
  creates MemRef dialect operations;
- a utility function related to shape moved to BuiltinTypes.h/cpp
  because it only relates to shaped types and not any particular dialect
  (standard dialect is erroneously believed to contain MemRefType);
- a Python test for the standard dialect is disabled completely because
  the ops it tests moved to the new MemRef dialect, but it is not
  exposed to Python bindings, and the change for that is non-trivial.
2021-03-15 13:41:38 +01:00
Julian Gross e2310704d8 [MLIR] Create memref dialect and move dialect-specific ops from std.
Create the memref dialect and move dialect-specific ops
from std dialect to this dialect.

Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
AssumeAlignmentOp -> MemRef_AssumeAlignmentOp
DeallocOp -> MemRef_DeallocOp
DimOp -> MemRef_DimOp
MemRefCastOp -> MemRef_CastOp
MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
LoadOp -> MemRef_LoadOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
SubViewOp -> MemRef_SubViewOp
TransposeOp -> MemRef_TransposeOp
TensorLoadOp -> MemRef_TensorLoadOp
TensorStoreOp -> MemRef_TensorStoreOp
TensorToMemRefOp -> MemRef_BufferCastOp
ViewOp -> MemRef_ViewOp

The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667

Differential Revision: https://reviews.llvm.org/D98041
2021-03-15 11:14:09 +01:00
Alex Zinenko a88371490d [mlir] better formatting in interface docs
Start the description from a new line instead of putting the first
paragraph in the section header. Wrap the class name in backticks to
make it clear that it relates to the code.
2021-03-15 11:10:32 +01:00
Alex Zinenko 03085156ec [mlir] fix cmake for generating data layout documentation 2021-03-15 11:02:03 +01:00
Alex Zinenko 40d8e4d3f9 Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants."
This reverts commit b5d9a3c923.

The commit introduced a memory error in canonicalization/operation
walking that is exposed when compiled with ASAN. It leads to crashes in
some "release" configurations.
2021-03-15 10:27:55 +01:00
Frederik Gossen b55f424ffc [MLIR] Add canonicalization for `shape.broadcast`
Remove redundant operands and fold if only one left.

Differential Revision: https://reviews.llvm.org/D98402
2021-03-15 10:11:28 +01:00
Frederik Gossen 2a71f95767 [MLIR] Allow compatible shapes in `Elementwise` operations
Differential Revision: https://reviews.llvm.org/D98186
2021-03-15 09:56:20 +01:00
Matthias Springer 581672be04 [mlir][AVX512] Add while loop-based sparse vector-vector dot product variants.
Differential Revision: https://reviews.llvm.org/D98480
2021-03-15 16:59:10 +09:00
Chris Lattner 91a6ad5ad8 [m_Constant] Check #operands/results before hasTrait()
We know that all ConstantLike operations have one result and no operands,
so check this first before doing the trait check.  This change speeds up
Canonicalize on a CIRCT testcase by ~5%.

Differential Revision: https://reviews.llvm.org/D98615
2021-03-14 20:14:19 -07:00
Chris Lattner b5d9a3c923 [Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants.
Two changes:
 1) Change the canonicalizer to walk the function in top-down order instead of
    bottom-up order.  This composes well with the "top down" nature of constant
    folding and simplification, reducing iterations and re-evaluation of ops in
    simple cases.
 2) Explicitly enter existing constants into the OperationFolder table before
    canonicalizing.  Previously we would "constant fold" them and rematerialize
    them, wastefully recreating a bunch of constants, which led to pointless
    memory traffic.

Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.

One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aesthetic
change to the IR but does permute some testcases.

Differential Revision: https://reviews.llvm.org/D98609
2021-03-14 18:21:42 -07:00
Aart Bik e7ee4eaaf7 [mlir][sparse] disable nonunit stride dense vectorization
This is a temporary work-around to get our all-annotations-all-flags
stress testing effort to run clean. In the long run, we want to provide
efficient implementations of strided loads and stores, though.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D98563
2021-03-12 16:49:32 -08:00
Nikita Popov 42eb658f65 [OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC)
This removes some (but not all) uses of type-less CreateGEP()
and CreateInBoundsGEP() APIs, which are incompatible with opaque
pointers.

There are still a number of tricky uses left, as well as many
more variation APIs for CreateGEP.
2021-03-12 21:01:16 +01:00
Eugene Zhulenev 39b2cd4009 [mlir] Annotate functions used only in debug mode with LLVM_ATTRIBUTE_UNUSED
Functions used only in `assert` cause warnings in release mode.

Reviewed By: mehdi_amini, dcaballe, ftynse

Differential Revision: https://reviews.llvm.org/D98476
2021-03-12 11:25:46 -08:00
Alex Zinenko 4affd0c40e [mlir] fix a memory leak in NestedPattern
NestedPattern uses a BumpPtrAllocator to store child (nested) pattern
objects to decrease the overhead of dynamic allocation. This assumes all
allocations happen inside the allocator that will be freed as a whole.
However, NestedPattern contains `std::function` as a member, which
allocates internally using `new`, unaware of the BumpPtrAllocator. Since
NestedPattern only holds pointers to the nested patterns allocated in
the BumpPtrAllocator, it never calls their destructors, so the
destructors of the `std::function`s they contain are never called either,
leaking the allocated memory.

Make NestedPattern explicitly call destructors of nested patterns. This
additionally requires actually copying the nested patterns in
copy-construction and copy-assignment instead of just sharing the
pointer to the arena-allocated list of children to avoid double-free. An
alternative solution would be to add reference counting to the
arena-allocated list of children.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D98485
2021-03-12 18:52:14 +01:00
Christian Sigg 1ef544d4a9 [mlir] Remove mlir-cuda-runner
Change CUDA integration tests to use mlir-opt + mlir-cpu-runner instead.

Depends On D98203

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D98396
2021-03-12 14:06:43 +01:00
Alex Zinenko be5b844a35 [mlir] fix memory leak on failure path in parser
Forward references to blocks lead to `Block`s being allocated in the
parser, but they are not necessarily included into a region if parsing
fails, leading to a leak. Clean them up in parser destructor.

Reviewed By: rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D98403
2021-03-12 09:24:08 +01:00
Marius Brehler 849f8183fb [mlir] Fix ConstantOp verifier
This restricts the attributes to integers for constants of type
IndexType. So far an attribute like StringAttr as in

  %c1 = constant "" : index

is valid.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D98216
2021-03-12 08:49:25 +01:00
Sergei Grechanik fd2b08969b [mlir][Vector] Lowering of transfer_read/write to vector.load/store
This patch introduces progressive lowering patterns for rewriting
vector.transfer_read/write to vector.load/store and vector.broadcast
in certain supported cases.

Reviewed By: dcaballe, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97822
2021-03-11 18:17:51 -08:00
Sergei Grechanik 46ef6ffdaf [NFC] Test commit. Add empty lines. 2021-03-11 17:31:20 -08:00
Mehdi Amini e1364f1068 Replace use of OperationState with builder::create in GPU Kernel Outlining (NFC)
OperationState is a low-level API that is rarely the right choice; the
builder API's convenience wrapper is preferred when possible.
2021-03-12 00:14:02 +00:00
Diego Caballero 0fd0fb5329 Reland: [mlir][Affine][Vector] Add initial support for 'iter_args' to Affine vectorizer.
This patch adds support for vectorizing loops with 'iter_args' when those loops
are not a vector dimension. This allows vectorizing outer loops with an inner
'iter_args' loop (e.g., reductions). Vectorizing scenarios where 'iter_args'
loops are vector dimensions would require more work (e.g., analysis,
generating horizontal reduction, etc.) not included in this patch.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97892
2021-03-12 01:08:28 +02:00
Diego Caballero 96891f0418 Reland: [mlir][Vector][Affine] Improve affine vectorizer algorithm
This patch replaces the root-terminal vectorization approach implemented in the
Affine vectorizer with a topological order approach that vectorizes all the
operations within the target loop nest. These are the most important changes
introduced by the new algorithm:
  * Removed tracking of root and terminal ops. Existing vectorization
    functionality is preserved and extended so that loop nests without
    root-terminal chains can be vectorized.
  * Vectorizing a loop nest now only requires a single topological traversal.
  * A new vector loop nest is incrementally built along the vectorization
    process. The original scalar loop is kept intact. No cloning guard is needed
    to recover the scalar loop if vectorization fails. This approach also
    simplifies the challenging task of replacing a loop operation amid the
    vectorization process without invalidating the analysis information that
    depends on the original loop.
  * Vectorization of specific operations has been implemented independently,
    preparing them to be moved to a potential vectorization interface.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97442
2021-03-12 00:19:50 +02:00
River Riddle 31bb8efd69 [mlir][StorageUniquer] Properly call the destructor on non-trivially destructible storage instances
This allows for storage instances to store data that isn't uniqued in the context, or contain otherwise non-trivial logic, in the rare situations that they occur. Storage instances with trivial destructors will still have their destructor skipped. A consequence of this is that the storage instance definition must be visible from the place that registers the type.

Differential Revision: https://reviews.llvm.org/D98311
2021-03-11 11:35:32 -08:00
Diego Caballero ed193bce9d [mlir][Vector][Affine] Fix heap-use-after-free in vectorizer
This patch fixes a heap-use-after-free introduced by the recent changes
in the vectorizer: https://reviews.llvm.org/rG95db7b4aeaad590f37720898e339a6d54313422f
The problem is due to the way candidate loops are visited. All candidate loops
are pattern-matched beforehand using the 'NestedMatch' utility. These matches may
intersect with each other so it may happen that we try to vectorize a loop that
was previously vectorized. The new vectorization algorithm replaces the original
loops that are vectorized with new loops and, therefore, any reference to the
original loops in the pre-computed matches becomes invalid.

This patch fixes the problem by classifying the candidate matches into buckets
before vectorization. Each bucket contains all the matches that intersect. The
vectorizer uses these buckets to make sure that we only vectorize *one* match from
each bucket, at most.

Differential Revision: https://reviews.llvm.org/D98382
2021-03-11 20:44:07 +02:00
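The bucketing step described in the vectorizer fix above can be sketched as follows. This Python model is an illustration only, not the code from the patch. It assumes that each match is a set of loop names, that two matches intersect when they share a loop, and that intersection is applied transitively (via a small union-find). `bucket_matches` is a hypothetical helper name.

```python
from typing import Dict, FrozenSet, List

Match = FrozenSet[str]


def bucket_matches(matches: List[Match]) -> List[List[Match]]:
    """Group matches so that any two matches sharing a loop land in one bucket."""
    parent = list(range(len(matches)))

    def find(i: int) -> int:
        # Follow parent links to the bucket representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner: Dict[str, int] = {}  # loop name -> index of first match containing it
    for idx, match in enumerate(matches):
        for loop in match:
            if loop in owner:
                # Shared loop: merge this match's bucket with the owner's bucket.
                parent[find(idx)] = find(owner[loop])
            else:
                owner[loop] = idx

    buckets: Dict[int, List[Match]] = {}
    for idx, match in enumerate(matches):
        buckets.setdefault(find(idx), []).append(match)
    return list(buckets.values())


matches = [frozenset({"i", "j"}), frozenset({"j", "k"}), frozenset({"m"})]
buckets = bucket_matches(matches)

# Vectorize at most one match per bucket; here simply the first one.
chosen = [bucket[0] for bucket in buckets]
```

Because only one match per bucket is used, no later match can refer to a loop that an earlier vectorization already replaced.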
Nikita Popov f3f0c6cd47 [mlir] Remove uses of type-less CreateLoad() APIs (NFC)
For the use in LLVMOps.td I used the getPointerElementType()
escape hatch, as it's not obvious to me how the load type
should be properly obtained here.
2021-03-11 18:39:20 +01:00
Alex Zinenko 27104390e8 [mlir] fix cmake build 2021-03-11 18:22:00 +01:00
Alex Zinenko 3ba14fa0ce [mlir] Introduce data layout modeling subsystem
Data layout information allows answering questions about the size and alignment
properties of a type. It enables, among others, the generation of various
linear memory addressing schemes for containers of abstract types and deeper
reasoning about vectors. This introduces the subsystem for modeling data
layouts in MLIR.

The data layout subsystem is designed to scale to MLIR's open type and
operation system. At the top level, it consists of attribute interfaces that
can be implemented by concrete data layout specifications; type interfaces that
should be implemented by types subject to data layout; operation interfaces
that must be implemented by operations that can serve as data layout scopes
(e.g., modules); and dialect interfaces for data layout properties unrelated to
specific types. Built-in types are handled specially to decrease the overall
query cost.

A concrete default implementation of these interfaces is provided in the new
Target dialect. Defaults for built-in types that match the current behavior are
also provided.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97067
2021-03-11 16:54:47 +01:00
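As a rough illustration of how size and alignment queries feed a linear addressing scheme, the Python sketch below computes element offsets in a densely packed container. It does not use MLIR's data layout API: the `LAYOUT` table, the type names and their sizes, and the helpers `align_to` and `element_offset` are all hypothetical.

```python
# Hypothetical layout table: type name -> (size in bytes, alignment in bytes).
LAYOUT = {
    "i8": (1, 1),
    "i32": (4, 4),
    "f64": (8, 8),
    "!toy.triple": (12, 8),  # made-up abstract type whose size is not a multiple of its alignment
}


def align_to(value: int, alignment: int) -> int:
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def element_offset(type_name: str, index: int) -> int:
    """Byte offset of element `index` in a contiguous container of `type_name`."""
    size, alignment = LAYOUT[type_name]
    # Each element starts at an aligned address, so the stride is the padded size.
    stride = align_to(size, alignment)
    return index * stride


offset_i32 = element_offset("i32", 3)             # stride 4
offset_triple = element_offset("!toy.triple", 2)  # stride 16 after padding
```

The point of the sketch is that code generating addresses only needs the two queried properties, not any knowledge of what the element type actually is.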