Commit Graph

4478 Commits

Author SHA1 Message Date
Alex Zinenko e41805fdab [mlir] Drop forward-declaration of loop::TerminatorOp.
This Op has been deleted in favor of loop::YieldOp, but the forward
declaration remain in the header.
2020-05-07 18:28:31 +02:00
Lei Zhang 16027bbc3b [mlir][spirv] Serialize all operands together if possible
SPIR-V ops can mix operands and attributes in the definition. These
operands and attributes are serialized in the exact order of the definition
to match SPIR-V binary format requirements. It can cause excessive
generated code bloat because we are emitting code to handle each
operand/attribute separately. So here we probe first to check whether all
the operands are ahead of attributes. Then we can serialize all operands
together.

This removes ~1000 lines of code from the generated inc file.

Differential Revision: https://reviews.llvm.org/D79446
2020-05-07 09:32:03 -04:00
Lei Zhang a2634748cd [mlir][spirv] Remove template functions for getting op's opcode
These template functions are used in the serializer, where we can
actually directly query the opcode from the op's definition and
use that in the auto-generated serialization logic.

This removes a set of templates accounting for 319 lines from
the auto-generated inc file.

Differential Revision: https://reviews.llvm.org/D79444
2020-05-07 09:32:03 -04:00
Alexander Belyaev a6b2877f4c [MLIR] Make ParallelLoopFusion pass scan through all nested regions.
Differential Revision: https://reviews.llvm.org/D79558
2020-05-07 13:47:30 +02:00
Alex Zinenko 54c927b988 [mlir] Add a test exercising partial constant folding of affine min/max
This functionality was introduced in a87db48e6f
but only only tested indirectly though Linalg tests. Add direct tests.
2020-05-07 12:42:03 +02:00
Alex Zinenko 4809580463 [mlir] Add a test for OperationFolder
Adds a test exercising the rewriting pattern in the test dialect that calls
OperationFolder.create.
2020-05-07 12:39:24 +02:00
Alex Zinenko a87db48e6f [mlir] Support partial folding of affine.min/max
Originally, these operations were folded only if all expressions in their
affine maps could be folded to a constant expression that can be then subject
to numeric min/max computation. This introduces a more advanced version that
partially folds the affine map by lifting individual constant expression in it
even if some of the expressions remain variable. The folding can update the
operation in place to use a simpler map. Note that this is not as powerful as
canonicalization, in particular this does not remove dimensions or symbols that
became useless. This allows for better composition of Linalg tiling and
promotion transformation, where the latter can handle some canonical forms of
affine.min that the folding can now produce.

Differential Revision: https://reviews.llvm.org/D79502
2020-05-07 12:30:04 +02:00
Wen-Heng (Jack) Chung f649aca9f3 [mlir][rocdl] Fix typo. NFC.
ROCLD -> ROCDL.

Differential Revision: https://reviews.llvm.org/D79441
2020-05-07 11:55:47 +02:00
Alex Zinenko a99f62c40a [mlir] VectorToLLVM: propagate up from replaceTransferOp
In the Vector to LLVM conversion, the `replaceTransferOp` function calls
into a type converter that may fail and suppresses the status. Change
the function to return the failure status instead, Since it is called
from a pattern, the failure can be readily propagated to the rest of
infrastructure.
2020-05-07 11:53:48 +02:00
Wen-Heng (Jack) Chung a23f190213 [mlir][vector] set alignment when lowering transfer_read and transfer_write.
When emitting masked load / store, set alignment from data layout.

Differential Revision: https://reviews.llvm.org/D79246
2020-05-07 11:44:25 +02:00
Uday Bondhugula 2affcd664e [MLIR] Fix affine fusion bug/efficiency issue / enable more fusion
The list of destination load ops while evaluating producer-consumer
fusion wasn't being maintained as a set, and as such, duplicate load ops
were being added to it. Although this is harmless correctness-wise, it's
a killer efficiency-wise and it prevents interesting/useful fusions
(including for eg. reshapes into a matmul). The reason the latter
fusions would be missed is that a slice union would be unnecessarily
needed due to the duplicate load ops on a memref added to the 'dst
loads' list. Since slice union is unimplemented for the local var case,
a single destination load op that leads to local vars (like a floordiv /
mod producing fusion), a common case, would not get fused due to an
unnecessary union being tried with itself.  (The union would actually be
the same thing but we would bail out.)

Besides the above, this would also significantly speed up fusion as all
the unnecessary slice computations / unions, checks, etc. due to the
duplicates go away.

Differential Revision: https://reviews.llvm.org/D79547
2020-05-07 10:51:34 +05:30
Uday Bondhugula 57d361bd2f [MLIR][NFC] Rename op trait PolyhedralScope -> AffineScope
Rename op trait PolyhedralScope -> AffineScope for consistency.

Differential Revision: https://reviews.llvm.org/D79503
2020-05-07 00:19:56 +05:30
Alex Zinenko 26f93d9f37 [mlir] OperationFolder: fix crash in creation of single-result-ops with in-place folds
When the folding is performed in place, the `::fold` function does not populate
its `results` argument to indicate that. (In the folding hook for single-result
operations, the result of the original operation is expected to be returned,
but it is then ignored by the wrapper.) `OperationFolder::create` would
erronously rely on the _operation_ having zero results instead of on the
_folding_ producing zero new results to populate the list of results with those
of the original operation. This would lead to a crash for single-result ops
with in-place folds where the first result is accessed uncondtionally because
the list of results was not properly populated. Use the list of values produced
by the folding instead.

Differential Revision: https://reviews.llvm.org/D79497
2020-05-06 20:40:32 +02:00
Sean Silva e382b3770e Fix ShapeBase.td
Summary:
- Add license header.
- Remove TODO about extracting ShapeBase.td

Differential Revision: https://reviews.llvm.org/D79506
2020-05-06 10:43:16 -07:00
Renato Golin 5010b5b7e6 Check type for forward reference definition
The types of forward references are checked that they match with other
uses, but they do not check they match with the definition.

    func @forward_reference_type_check() -> (i8) {
      br ^bb2

    ^bb1:
      return %1 : i8

    ^bb2:
      %1 = "bar"() : () -> (f32)
      br ^bb1
    }

Would be parsed and the use site of '%1' would be silently changed to
'f32'.

This commit adds a test for this case, and a check during parsing for
the types to match.

Patch by Matthew Parkinson <mattpark@microsoft.com>

Closes D79317.
2020-05-06 14:34:18 +01:00
Nicolas Vasilache 94438c86ad [mlir] Add a MemRefCastOp canonicalization pattern.
Summary:
This revision adds a conservative canonicalization pattern for MemRefCastOp that are typically inserted during ViewOp and SubViewOp canonicalization.
Ideally such canonicalizations would propagate the type to consumers but this is not a local behavior. As a consequence MemRefCastOp are introduced to keep type compatibility but need to be cleaned up later, in the case where more dynamic behavior than necessary is introduced.

Differential Revision: https://reviews.llvm.org/D79438
2020-05-06 09:10:05 -04:00
Uday Bondhugula ca09dab303 [MLIR][NFC] Fix/update debug messages for analysis utils and affine fusion
Drop trailing period in debug messages. Add an extra line for fusion
debug info.

Differential Revision: https://reviews.llvm.org/D79471
2020-05-06 12:27:59 +05:30
Reid Kleckner 932f0276ea [Support] Move LLD's parallel algorithm wrappers to support
Essentially takes the lld/Common/Threads.h wrappers and moves them to
the llvm/Support/Paralle.h algorithm header.

The changes are:
- Remove policy parameter, since all clients use `par`.
- Rename the methods to `parallelSort` etc to match LLVM style, since
  they are no longer C++17 pstl compatible.
- Move algorithms from llvm::parallel:: to llvm::, since they have
  "parallel" in the name and are no longer overloads of the regular
  algorithms.
- Add range overloads
- Use the sequential algorithm directly when 1 thread is requested
  (skips task grouping)
- Fix the index type of parallelForEachN to size_t. Nobody in LLVM was
  using any other parameter, and it made overload resolution hard for
  for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.

Remove Threads.h and update LLD for that.

This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.

Reviewed By: MaskRay, aganea

Differential Revision: https://reviews.llvm.org/D79390
2020-05-05 15:21:05 -07:00
Sean Silva b40d073e53 [mlir][shape] Extract ShapeBase.td 2020-05-05 13:39:19 -07:00
River Riddle 4e9a7c8f5c [mlir][DenseStringElementsAttr] Fix AttributeElementIterator in the case of a splat. 2020-05-05 12:42:37 -07:00
River Riddle 24ad385884 [mlir][DenseElementsAttr] Add support for opaque APFloat/APInt complex values.
This revision allows for creating DenseElementsAttrs and accessing elements using std::complex<APInt>/std::complex<APFloat>. This allows for opaquely accessing and transforming complex values. This is used by the printer/parser to provide pretty printing for complex values. The form for complex values matches that of std::complex, i.e.:

```
// `(` element `,` element `)`
dense<(10,10)> : tensor<complex<i64>>
```

Differential Revision: https://reviews.llvm.org/D79296
2020-05-05 12:42:37 -07:00
River Riddle da2a6f4e3b [mlir][DenseElementsAttr] Add support for ComplexType elements
This revision adds support for storing ComplexType elements inside of a DenseElementsAttr. We store complex objects as an array of two elements, matching the  definition of std::complex. There is no current attribute storage for ComplexType, but DenseElementsAttr provides API for access/creation using std::complex<>. Given that the internal implementation of DenseElementsAttr is already fairly opaque, the only real complexity here is in the printing/parsing. This revision keeps it simple for now and always uses hex when printing complex elements. A followup will add prettier syntax for this.

Differential Revision: https://reviews.llvm.org/D79281
2020-05-05 12:42:37 -07:00
Stephen Neuendorffer c296d2dc53 [MLIR] mlir-opt needs PUBLIC dependence
We see intermittent build errors on the windows buildbot because
mlir-opt is including Linalg headers which haven't been built yet.
This dependence should be resolved by declaring a PUBLIC dependence
on the Linalg library when building MLIROptMain.
2020-05-05 12:39:28 -07:00
Lei Zhang 6f790f784e [mlir] Specify CMAKE_CXX_STANDARD to standalone dialect
This addresses a compilation failure on GCC 5:

error: #error This file requires compiler and library support for the
ISO C++ 2011 standard. This support must be enabled with the -std=c++11
or -std=gnu++11 compiler options.
 #error This file requires compiler and library support

Differential Revision: https://reviews.llvm.org/D79439
2020-05-05 15:26:55 -04:00
Alex Zinenko 9d273c0ef0 [mlir] Harden verifiers for DMA ops
DMA operation classes in the Standard dialect (`DmaStartOp` and `DmaWaitOp`)
provide helper functions that make numerous assumptions about the number and
order of operands, and about their types. However, these assumptions were not
checked in the verifier, leading to assertion failures or crashes when helper
functions were used on ill-formed ops. Some of the assuptions were checked in
the custom parser (and thus could not check assumption violations in ops
constructed programmatically, e.g., during rewrites) and others were not
checked at all. Introduce the verifiers for all these assumptions and drop
unnecessary checks in the parser that are now covered by the verifier.

Addresses PR45560.

Differential Revision: https://reviews.llvm.org/D79408
2020-05-05 20:40:41 +02:00
Andy Davis 93d1108801 [MLIR][LoopOps] Adds the loop unroll transformation for loop::ForOp.
Summary:
Adds the loop unroll transformation for loop::ForOp.
Adds support for promoting the body of single-iteration loop::ForOps into its containing block.
Adds check tests for loop::ForOps with dynamic and static lower/upper bounds and step.
Care was taken to share code (where possible) with the AffineForOp unroll transformation to ease maintenance and potential future transition to a LoopLike construct on which loop transformations for different loop types can implemented.

Reviewers: ftynse, nicolasvasilache

Reviewed By: ftynse

Subscribers: bondhugula, mgorny, zzheng, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79184
2020-05-05 10:42:36 -07:00
Lei Zhang 6fb7e9a195 [mlir] Add missing dependency to MLIRMlirOptMain
Differential Revision: https://reviews.llvm.org/D79429
2020-05-05 13:32:00 -04:00
Stephen Neuendorffer 175a3df9c7 [MLIR] Add a tests for out of tree dialect example.
This attempts to ensure that out of tree usage remains stable.

Differential Revision: https://reviews.llvm.org/D78656
2020-05-05 09:22:49 -07:00
Stephen Neuendorffer e78ef9385c [MLIR] GPUToCUDA conversion: MC is only needed if NVPTX is enabled.
This patch conditionally links with MC
2020-05-05 08:55:17 -07:00
Ehsan Toosi 6ccaf73887 [MLIR][LINALG] Convert Linalg on Tensors to Buffers
This is a basic pass to convert Linalg.GenericOp which works on tensors to use
buffers instead.

Differential Revision: https://reviews.llvm.org/D78996
2020-05-05 15:48:07 +02:00
Jean-Michel Gorius 98b8b36d00 [mlir][standalone] NFC: Update CMakeLists.txt to reflect best practices
Update to follow the changes introduced in 5469f43 and documented in 93f7e52.
2020-05-05 13:37:37 +02:00
Alexander Belyaev 72700fea2b [MLIR] Link MLIRStandardOpsTransforms with MLIRTransforms.
Summary: This fixes shared lib build.

Differential Revision: https://reviews.llvm.org/D79403
2020-05-05 13:18:15 +02:00
Alex Zinenko 898f74c35d [mlir] NFC: update ::build signature in the tutorial document
This was missing from the original commit that changed the interface of
`::build` methods to take `OpBuilder &` instead of `Builder *.
2020-05-05 11:22:14 +02:00
Alexander Belyaev b79751e83d [MLIR] Add conversion from AtomicRMWOp -> GenericAtomicRMWOp.
Adding this pattern reduces code duplication. There is no need to have a
custom implementation for lowering to llvm.cmpxchg.

Differential Revision: https://reviews.llvm.org/D78753
2020-05-05 10:32:13 +02:00
Stephen Neuendorffer 93f7e525f5 [MLIR] Update documentation of cmake best practices 2020-05-04 20:47:58 -07:00
Stephen Neuendorffer 5469f434bb [MLIR] Reapply: Adjust libMLIR building to more closely follow libClang
This reverts commit ab1ca6e60f.
2020-05-04 20:47:57 -07:00
Stephen Neuendorffer 146192ade4 [MLIR] Normalize usage of intrinsics_gen
Portions of MLIR which depend on LLVMIR generally need to depend on
intrinsics_gen, to ensure that tablegen'd header files from LLVM are built
first.  Without this, we get errors, typically about llvm/IR/Attributes.inc
not being found.

Note that previously the Linalg Dialect depended on intrinsics_gen, but it
doesn't need to, since it doesn't use LLVMIR.

Differential Revision: https://reviews.llvm.org/D79389
2020-05-04 20:47:57 -07:00
River Riddle 469c02d058 [mlir] Add support for merging identical blocks during canonicalization
This revision adds support for merging identical blocks, or those with the same operations that branch to the same successors. Operands that mismatch between the different blocks are replaced with new block arguments added to the merged block.

Differential Revision: https://reviews.llvm.org/D79134
2020-05-04 19:56:46 -07:00
Geoffrey Martin-Noble 13090ec7dd [mlir] Remove tabs from predecessor comments
This change removes tabs from the comments printed by the asmprinter after basic
block declarations in favor of two spaces. This is currently the only place in
the printed IR that uses tabs.

Differential Revision: https://reviews.llvm.org/D79377
2020-05-05 02:15:23 +00:00
Nicolas Vasilache 036772acfd [mlir][EDSC] Fix off-by-one BlockBuilder insertion point.
Summary:
In the particular case of an insertion in a block without a terminator, the BlockBuilder insertion point should be block->end().

Adding a unit test to exercise this.

Differential Revision: https://reviews.llvm.org/D79363
2020-05-04 21:07:48 -04:00
River Riddle 1e4faf23ff [mlir][IR] Add a Region::getOps method that returns a range of immediately nested operations
This allows for walking the operations nested directly within a region, without traversing nested regions.

Differential Revision: https://reviews.llvm.org/D79056
2020-05-04 17:46:25 -07:00
River Riddle 6bce7d8d67 [mlir][mlir-opt] Disable multithreading when parsing the input module.
This removes the unnecessary/costly context synchronization when parsing, as the context is guaranteed to not be used by any other threads.
2020-05-04 17:29:56 -07:00
Hanhan Wang 5d10613b6e [mlir][StandardToSPIRV] Emulate bitwidths not supported for store op.
Summary:
As D78974, this patch implements the emulation for store op. The emulation is
done with atomic operations. E.g., if the storing value is i8, rewrite the
StoreOp to:

 1) load a 32-bit integer
 2) clear 8 bits in the loading value
 3) store 32-bit value back
 4) load a 32-bit integer
 5) modify 8 bits in the loading value
 6) store 32-bit value back

The step 1 to step 3 are done by AtomicAnd as one atomic step, and the step 4
to step 6 are done by AtomicOr as another atomic step.

Differential Revision: https://reviews.llvm.org/D79272
2020-05-04 15:18:44 -07:00
Haruki Imai 3a7be241f2 [mlir] Support big endian in DenseElementsAttr
This std::copy_n copies 8 byte data (APInt raw data) by 1 byte from the
beginning of char array. This is no problem in little endian, but the
data is not copied correctly in big endian because the data should be
copied from the end of the char array.

- Example of 4 byte data (such as float32)

Little endian (First 4 bytes):
Address | 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08
Data    | 0xcd 0xcc 0x8c 0x3f 0x00 0x00 0x00 0x00

Big endian (Last 4 bytes):
Address | 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08
Data    | 0x00 0x00 0x00 0x00 0x3f 0x8c 0xcc 0xcd

In general, when it copies N(N<8) byte data in big endian, the start
address should be incremented by (8 - N) bytes.
The original code has no problem when it includes 8 byte data(such as
 double) even in big endian.

Differential Revision: https://reviews.llvm.org/D78076
2020-05-04 22:17:05 +00:00
Stephen Neuendorffer ab1ca6e60f Revert "[MLIR] Adjust libMLIR building to more closely follow libClang"
This reverts commit 4f0f436749.

This seems to show some compile dependence problems, and also breaks flang.
2020-05-04 12:40:12 -07:00
Valentin Churavy 4f0f436749 [MLIR] Adjust libMLIR building to more closely follow libClang
- Exports MLIR targets to be used out-of-tree.
- mimicks `add_clang_library` and `add_flang_library`.
- Fixes libMLIR.so

After https://reviews.llvm.org/D77515 libMLIR.so was no longer containing
any object files. We originally had a cludge there that made it work with
the static initalizers and when switchting away from that to the way the
clang shlib does it, I noticed that MLIR doesn't create a `obj.{name}` target,
and doesn't export it's targets to `lib/cmake/mlir`.

This is due to MLIR using `add_llvm_library` under the hood, which adds
the target to `llvmexports`.

Differential Revision: https://reviews.llvm.org/D78773

[MLIR] Fix libMLIR.so and LLVM_LINK_LLVM_DYLIB

Primarily, this patch moves all mlir references to LLVM libraries into
either LLVM_LINK_COMPONENTS or LINK_COMPONENTS.  This enables magic in
the llvm cmake files to automatically replace reference to LLVM components
with references to libLLVM.so when necessary.  Among other things, this
completes fixing libMLIR.so, which has been broken for some configurations
since D77515.

Unlike previously, the pattern is now that mlir libraries should almost
always use add_mlir_library.  Previously, some libraries still used
add_llvm_library.  However, this confuses the export of targets for use
out of tree because libraries specified with add_llvm_library are exported
by LLVM.  Instead users which don't need/can't be linked into libMLIR.so
can specify EXCLUDE_FROM_LIBMLIR

A common error mode is linking with LLVM libraries outside of LINK_COMPONENTS.
This almost always results in symbol confusion or multiply defined options
in LLVM when the same object file is included as a static library and
as part of libLLVM.so.  To catch these errors more directly, there's now
mlir_check_all_link_libraries.

To simplify usage of add_mlir_library, we assume that all mlir
libraries depend on LLVMSupport, so it's not necessary to separately specify
it.

tested with:
BUILD_SHARED_LIBS=on,
BUILD_SHARED_LIBS=off + LLVM_BUILD_LLVM_DYLIB,
BUILD_SHARED_LIBS=off + LLVM_BUILD_LLVM_DYLIB + LLVM_LINK_LLVM_DYLIB.

By: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
Differential Revision: https://reviews.llvm.org/D79067

[MLIR] Move from using target_link_libraries to LINK_LIBS

This allows us to correctly generate dependencies for derived targets,
such as targets which are created for object libraries.

By: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
Differential Revision: https://reviews.llvm.org/D79243

Three commits have been squashed to avoid intermediate build breakage.
2020-05-04 11:40:46 -07:00
Nicolas Vasilache 307cfdf533 [mlir][Linalg] Mostly NFC - Refactor Linalg patterns and transformations.
Linalg transformations are currently exposed as DRRs.
Unfortunately RewriterGen does not play well with the line of work on named linalg ops which require variadic operands and results.
Additionally, DRR is arguably not the right abstraction to expose compositions of such patterns that don't rely on SSA use-def semantics.

This revision abandons DRRs and exposes manually written C++ patterns.

Refactorings and cleanups are performed to uniformize APIs.
This refactoring will allow replacing the currently manually specified Linalg named ops.

A collateral victim of this refactoring is the `tileAndFuse` DRR, and the one associated test, which will be revived at a later time.

Lastly, the following 2 tests do not add value and are altered:
- a dot_perm tile + interchange test does not test anything new and is removed
- a dot tile + lower to loops does not need 2-D tiling and is trimmed.
2020-05-04 11:17:37 -04:00
Frederik Gossen 031265ad8a [MLIR] Add complex numbers to standard dialect
Add `CreateComplexOp`, `ReOp`, and `ImOp` to the standard dialect.
This is the first step to support complex numbers.

Differential Revision: https://reviews.llvm.org/D79159
2020-05-04 14:04:28 +00:00
Marcel Koester 67b466deda [mlir] Removed tight coupling of BufferPlacement pass to Alloc and Dealloc.
The current BufferPlacement implementation tries to find Alloc and Dealloc
operations in order to move them. However, this is a tight coupling to
standard-dialect ops which has been removed in this CL.

Differential Revision: https://reviews.llvm.org/D78993
2020-05-04 14:23:15 +02:00
Wen-Heng (Jack) Chung bc23c1d85e [mlir][rocdl] add rocdl.barier op.
- Add rocdl.barrier op.
- Lower gpu.barier to rocdl.barrier in -convert-gpu-to-rocdl.

Differential Revision: https://reviews.llvm.org/D79126
2020-05-04 10:35:01 +02:00