Each hardware that supports SPV_C_CooperativeMatrixNV has a list of
configurations that are supported natively. Add an attribute to
specify the configurations supported to the `spv.target_env`.
Reviewed By: antiagainst, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D89364
The current fusion on tensors fuses reshape ops with generic ops by
linearizing the indexing maps of the fused tensor in the generic
op. This has some limitations
- It only works for static shapes
- The resulting indexing map has a linearization that would be
potentially prevent fusion later on (for ex. tile + fuse).
Instead, try to fuse the reshape consumer (producer) with generic op
producer (consumer) by expanding the dimensionality of the generic op
when the reshape is expanding (folding). This approach conflicts with
the linearization approach. The expansion method is used instead of
the linearization method.
Further refactoring that changes the fusion on tensors to be a
collection of patterns.
Differential Revision: https://reviews.llvm.org/D89002
This CL allows user to specify the same name for the operands in the source pattern which implicitly enforces equality on operands with the same name.
E.g., Pat<(OpA $a, $b, $a) ... > would create a matching rule for checking equality for the first and the last operands. Equality of the operands is enforced at any depth, e.g., OpA ($a, $b, OpB($a, $c, OpC ($a))).
Example usage: Pat<(Reshape $arg0, (Shape $arg0)), (replaceWithValue $arg0)>
Note, this feature only covers operands but not attributes.
Current use cases are based on the operand equality and explicitly add the constraint into the pattern. Attribute equality will be worked out on the different CL.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D89254
The buffers are used as source or destination of transfer commands
so always add VK_BUFFER_USAGE_TRANSFER_{DST,SRC}_BIT to their usage
flags.
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
This patch adds a couple missing LLVM IR dialect floating point types to
the legality check.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D89350
This revision adds a programmable codegen strategy from linalg based on staged rewrite patterns. Testing is exercised on a simple linalg.matmul op.
Differential Revision: https://reviews.llvm.org/D89374
* It reads as more of a TODO for the future and has been long obsoleted by later work.
* One of the authors of the referenced paper called this out as "weird stuff from two years ago" when reviewing the more recent TOSA RFC.
Differential Revision: https://reviews.llvm.org/D89329
This reverts commit 7271c1bcb9.
This broke the gcc-5 build:
/usr/include/c++/5/ext/new_allocator.h:120:4: error: no matching function for call to 'std::pair<const std::__cxx11::basic_string<char>, mlir::tblgen::SymbolInfoMap::SymbolInfo>::pair(llvm::StringRef&, mlir::tblgen::SymbolInfoMap::SymbolInfo)'
{ ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }
^
In file included from /usr/include/c++/5/utility:70:0,
from llvm/include/llvm/Support/type_traits.h:18,
from llvm/include/llvm/Support/Casting.h:18,
from mlir/include/mlir/Support/LLVM.h:24,
from mlir/include/mlir/TableGen/Pattern.h:17,
from mlir/lib/TableGen/Pattern.cpp:14:
/usr/include/c++/5/bits/stl_pair.h:206:9: note: candidate: template<class ... _Args1, long unsigned int ..._Indexes1, class ... _Args2, long unsigned int ..._Indexes2> std::pair<_T1, _T2>::pair(std::tuple<_Args1 ...>&, std::tuple<_Args2 ...>&, std::_Index_tuple<_Indexes1 ...>, std::_Index_tuple<_Indexes2 ...>)
pair(tuple<_Args1...>&, tuple<_Args2...>&,
^
Adds a TypeDef class to OpBase and backing generation code. Allows one
to define the Type, its parameters, and printer/parser methods in ODS.
Can generate the Type C++ class, accessors, storage class, per-parameter
custom allocators (for the storage constructor), and documentation.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D86904
This CL allows user to specify the same name for the operands in the source pattern which implicitly enforces equality on operands with the same name.
E.g., Pat<(OpA $a, $b, $a) ... > would create a matching rule for checking equality for the first and the last operands. Equality of the operands is enforced at any depth, e.g., OpA ($a, $b, OpB($a, $c, OpC ($a))).
Example usage: Pat<(Reshape $arg0, (Shape $arg0)), (replaceWithValue $arg0)>
Note, this feature only covers operands but not attributes.
Current use cases are based on the operand equality and explicitly add the constraint into the pattern. Attribute equality will be worked out on the different CL.
Differential Revision: https://reviews.llvm.org/D89254
This is the same diff as https://reviews.llvm.org/D88809/ except side effect
free check is removed for involution and a FIXME is added until the dependency
is resolved for shared builds. The old diff has more details on possible fixes.
Reviewed By: rriddle, andyly
Differential Revision: https://reviews.llvm.org/D89333
Update linalg-to-loops lowering for pooling operations to perform
padding of the input when specified by the corresponding attribute.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D88911
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
The dependency target "Core" of target "mlir-cuda-runner" does not exist.
Call Stack (most recent call first):
llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
The dependency target "LINK_COMPONENTS" of target "mlir-cuda-runner" does
not exist.
Call Stack (most recent call first):
llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
The dependency target "Support" of target "mlir-cuda-runner" does not
exist.
Call Stack (most recent call first):
llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
* Extends Context/Operation interning to cover Module as well.
* Implements Module.context, Attribute.context, Type.context, and Location.context back-references (facilitated testing and also on the TODO list).
* Adds method to create an empty Module.
* Discovered missing in npcomp.
Differential Revision: https://reviews.llvm.org/D89294
For some reason the variable `cumulativeSizeInBytes` in
`getCumulativeSizeInBytes` was actually storing number of elements. I decided
to fix it and refactor the function a bit.
Differential Revision: https://reviews.llvm.org/D89336
The build of MLIR occasionally fails (especially on Windows) because there is missing dependency between MLIRLLVMIR and MLIROpenMPOpsIncGen.
1) LLVMDialect.cpp includes LLVMDialect.h
2) LLVMDialect.h includes OpenMPDialect.h
3) OpenMPDialect.h includes OpenMPOpsDialect.h.inc, OpenMPOpsEnums.h.inc and OpenMPOps.h.inc
The OpenMP .inc files are generated by MLIROpenMPOpsIncGen, so MLIRLLVMIR which builds LLVMDialect.cpp should depend on MLIROpenMPOpsIncGen
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D89275
TensorConstantOp bufferization currently uses the vector dialect to store constant data into memory.
Due to natural vector size and alignment properties, this is problematic with n>1-D vectors whose most minor dimension is not naturally aligned.
Instead, this revision linearizes the constant and introduces a linalg.reshape to go back to the desired shape.
Still this is still to be considered a workaround and a better longer term solution will probably involve `llvm.global`.
Differential Revision: https://reviews.llvm.org/D89311
This combines two separate ops (D88972: `gpu.create_token`, D89043: `gpu.host_wait`) into one.
I do after all like the idea of combining the two ops, because it matches exactly the pattern we are
going to have in the other gpu ops that will implement the AsyncOpInterface (launch_func, copies, alloc):
If the op is async, we return a !gpu.async.token. Otherwise, we synchronize with the host and don't return a token.
The use cases for `gpu.wait async` and `gpu.wait` are further apart than those of e.g. `gpu.h2d async` and `gpu.h2d`,
but I like the consistent meaning of the `async` keyword in GPU ops.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D89160
This PR adds support for identified and recursive structs.
This includes: parsing, printing, serializing, and
deserializing such structs.
The following C struct:
```C
struct A {
A* next;
};
```
which is translated to the following MLIR code as:
```mlir
!spv.struct<A, (!spv.ptr<!spv.struct<A>, Generic>)>
```
would be represented in the SPIR-V module as:
```spirv
OpName %A "A"
OpTypeForwardPointer %APtr Generic
%A = OpTypeStruct %APtr
%APtr = OpTypePointer Generic %A
```
In particular the following changes are included:
- SPIR-V structs can now be either identified or literal
(i.e. non-identified).
- All structs now have their members surrounded by a ()-pair.
- For recursive references,
(1) an OpTypeForwardPointer instruction is emitted before
the OpTypeStruct instruction defining the recursive struct
(2) an OpTypePointer instruction is emitted after the
OpTypeStruct instruction which actually defines the recursive
pointer to struct type.
Reviewed By: antiagainst, rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D87206
* Links against libMLIR.so if the project is built for DYLIBs.
* Puts things in the right place in build and install time python/ trees so that RPaths line up.
* Adds install actions to install both the extension and sources.
* Copies py source files to the build directory to match (consistent layout between build/install time and one place to point a PYTHONPATH for tests and interactive use).
* Finally, "import mlir" from an installed LLVM just works.
Differential Revision: https://reviews.llvm.org/D89167
The TensorConstantOp bufferize conversion pattern has a bug that
makes it incorrect in the case of vectors whose alignment is not
the natural alignment. Circumvent it temporarily by using a power of 2.
Differential Revision: https://reviews.llvm.org/D89265
This revision reduces the number of places that specific information needs to be modified when adding new named Linalg ops.
Differential Revision: https://reviews.llvm.org/D89223
This revision introduces support for buffer allocation for any named linalg op.
To avoid template instantiating many ops, a new ConversionPattern is created to capture the LinalgOp interface.
Some APIs are updated to remain consistent with MLIR style:
`OwningRewritePatternList * -> OwningRewritePatternList &`
`BufferAssignmentTypeConverter * -> BufferAssignmentTypeConverter &`
Differential revision: https://reviews.llvm.org/D89226
The buffer placement preparation tests in
test/Transforms/buffer-placement-preparation* are using Linalg as a test
dialect which leads to confusion and "copy-pasta", i.e. Linalg is being
extended now and when TensorsToBuffers.cpp is changed, TestBufferPlacement is
sometimes kept in-sync, which should not be the case.
This has led to the unnoticed bug, because the tests were in a different directory and the patterns were slightly off.
Differential Revision: https://reviews.llvm.org/D89209
This patch introduces the acc.enter_data operation that represents an OpenACC Enter Data directive.
Operands and attributes are dervied from clauses in the spec 2.6.6.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88941
This is required or broadcasting with operands of different ranks will lead to
failures as the select op requires both possible outputs and its output type to
be the same.
Differential Revision: https://reviews.llvm.org/D89134
The patch adds a canonicalization pattern that removes the unused results of scf.if operation. As a result, cse may remove unused computations in the then and else regions of the scf.if operation.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D89029
This patch introduces the acc.exit_data operation that represents an OpenACC Exit Data directive.
Operands and attributes are derived from clauses in the spec 2.6.6.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88969
There is an atomic_rmw and a generic_atomic_rmw operation.
The doc of the latter incorrectly referred to former though.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89172
They are currently marked as unsupported when windows is part of the triple, but they actually fail when they are run on Windows, so they are unsupported on system-windows
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D89169
This revision belongs to a series of patches that reduce reliance of Linalg transformations on templated rewrite and conversion patterns.
Instead, this uses a MatchAnyTag pattern for the vast majority of cases and dispatches internally.
Differential revision: https://reviews.llvm.org/D89133
This revision belongs to a series of patches that reduce reliance of Linalg transformations on templated rewrite and conversion patterns.
Instead, this uses a MatchAnyTag pattern for the vast majority of cases and dispatches internally.
Differential Revision: https://reviews.llvm.org/D89133