Add an extra RewritePattern that does not convert types to rewrite a CopyOp that has non-identity permutations into a sequence of TransposeOp followed by a CopyOp without such permutations.
This RewitePattern is made to fail in the non-permutation case so that the conversion pattern can kick in to lower to LLVM.
This is an instance of A->A->B lowering where A->A is done by a RewritePattern in case_1 and A->B is done by a ConversionPatternRewriter when not(case_1).
PiperOrigin-RevId: 265171380
Add a conversion pattern that transforms a linalg.transpose op into:
1. A function entry `alloca` operation to allocate a ViewDescriptor.
2. A load of the ViewDescriptor from the pointer allocated in 1.
3. Updates to the ViewDescriptor to introduce the data ptr, offset, size
and stride. Size and stride are permutations of the original values.
4. A store of the resulting ViewDescriptor to the alloca'ed pointer.
The linalg.transpose op is replaced by the alloca'ed pointer.
PiperOrigin-RevId: 265169112
A linalg.transpose op is a pure metadata operation that takes a view + permutation map and produces
another view of the same underlying data, with a different reindexing. This is a
pure metadata operation that does not touch the underlying data.
Example:
```
%t = linalg.transpose %v (i, j) -> (j, i) : !linalg.view<?x?xf32>
```
PiperOrigin-RevId: 265139429
This CL extends support for lowering of linalg to external C++ libraries with CopyOp. Currently this can only work when the permutation maps in the copies are identity. Future support for permutations will be added later.
PiperOrigin-RevId: 265093025
This will allow iterating the values of a non-opaque ElementsAttr, with all of the types currently supported by DenseElementsAttr. This should help reduce the amount of specialization on DenseElementsAttr.
PiperOrigin-RevId: 264968151
linalg.subview used to lower to a slice with a bounded range resulting in correct bounded accesses. However linalg.slice could still index out of bounds. This CL moves the bounding to linalg.slice.
LLVM select and cmp ops gain a more idiomatic builder.
PiperOrigin-RevId: 264897125
This commit adds `PositiveI32Attr` and `PositiveI64Attr` to match positive
integers but not zero nor negative integers. This commit also adds
`HasAnyRankOfPred` to match tensors with the specified ranks.
PiperOrigin-RevId: 264867046
Previously Module and Function are builtinn constructs in MLIR.
Due to the structural requirements we must wrap the SPIR-V
module inside a Function inside a Module. Now the requirement
is lifted and we can remove the wrapping function! :)
PiperOrigin-RevId: 264736051
This will allow iterating the values of a non-opaque ElementsAttr, with all of the types currently supported by DenseElementsAttr. This should help reduce the amount of specialization on DenseElementsAttr.
PiperOrigin-RevId: 264637293
This CL extends declarative rewrite rules to support matching and
generating ops with variadic operands/results. For this, the
generated `matchAndRewrite()` method for each pattern now are
changed to
* Use "range" types for the local variables used to store captured
values (`operand_range` for operands, `ArrayRef<Value *>` for
values, *Op for results). This allows us to have a unified way
of handling both single values and value ranges.
* Create local variables for each operand for op creation. If the
operand is variadic, then a `SmallVector<Value*>` will be created
to collect all values for that operand; otherwise a `Value*` will
be created.
* Use a collective result type builder. All result types are
specified via a single parameter to the builder.
We can use one result pattern to replace multiple results of the
matched root op. When that happens, it will require specifying
types for multiple results. Add a new collective-type builder.
PiperOrigin-RevId: 264588559
In SPIR-V binary format, constants are placed at the module level
and referenced by instructions inside functions using their result
<id>s. To model this natively (using SSA values for result <id>s),
it means we need to have implicit capturing functions. We will
lose the ability to have function passes if going down that path.
Instead, this CL changes to materialize constants at their use
sites in deserialization. It's cheap to copy constants in MLIR
given that attributes is uniqued to MLIRContext. By localizing
constants into functions, we can preserve isolated functions.
PiperOrigin-RevId: 264582532
Similar to global variables, specialization constants also live
in the module scope and can be referenced by instructions in
functions in native SPIR-V. A direct modelling would be to allow
functions in the SPIR-V dialect to implicit capture, but it means
we are losing the ability to write passes for Functions. While
in SPIR-V normally we want to process the module as a whole,
it's not common to see multiple functions get used so we'd like
to leave the door open for those cases. Therefore, similar to
global variables, we introduce spv.specConstant to model three
SPIR-V instructions: OpSpecConstantTrue, OpSpecConstantFalse,
and OpSpecConstant. They do not return SSA value results;
instead they have symbols and can only be referenced by the
symbols. To use it in a function, we need to have another op
spv._reference_of to turn the symbol into an SSA value. This
breaks the tie and makes functions still explicit capture.
Previously specialization constants were handled similarly as
normal constants. That is incorrect given that specialization
constant actually acts more like variable (without need to
load and store). E.g., they cannot be de-duplicated like normal
constants.
This CL also refines various documents and comments.
PiperOrigin-RevId: 264455172
tensorflow/mlir#58 fixed and exercised
verification of load/store ops using empty affine maps. Unfortunately,
it didn't exercise the creation of them. This PR addresses that aspect.
It removes the assumption of AffineMap having at least one result and
stores a pointer to MLIRContext as member of AffineMap.
* Add empty map support to affine.store + test
* Move MLIRContext to AffineMapStorage
Closestensorflow/mlir#74
PiperOrigin-RevId: 264416260
This conversion has been using a stack-allocated array of i8 to store the
null-terminated kernel name in order to pass it to the CUDA wrappers expecting
a C string because the LLVM dialect was missing support for globals. Now that
the suport is introduced, use a global instead.
Refactor global string construction from GenerateCubinAccessors into a common
utility function living in the LLVM namespace.
PiperOrigin-RevId: 264382489
JitRunner can use as entry points functions that produce either a single
'!llvm.f32' value or a list of memrefs. Memref support is legacy and was
introduced before MLIR could lower memref allocation and deallocation to
malloc/free calls so as to allocate the memory externally, and is likely to be
dropped in the future since it unconditionally runs affine+standard-to-llvm
lowering on the module instead of accepting the LLVM dialect. CUDA runner
relies on memref-based flow in the runner without actually returning anything.
Introduce a runner flow to use functions that return void as entry points.
PiperOrigin-RevId: 264381686
LLVM intrinsics have an open name space and their names can potentially overlap
with names of LLVM instructions (LLVM intrinsics are functions, not
instructions). In MLIR, LLVM intrinsics are modeled as operations, so it needs
to make sure their names cannot clash with the instructions. Use the "intr."
prefix for intrinsics in the LLVM dialect.
PiperOrigin-RevId: 264372173
This CL allows binary operations on n-D vector types to be lowered to LLVMIR by performing an (n-1)-D extractvalue, 1-D vector operation and an (n-1)-D insertvalue.
PiperOrigin-RevId: 264339118
- fix missing check while simplifying an expression with floordiv to a
mod
- fixes issue tensorflow/mlir#82
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#84
PiperOrigin-RevId: 264338353
This will allow for naming values the same as existing SSA values for regions attached to operations that are isolated from above. This fits in with how the system already allows separate name scopes for sibling regions. This name shadowing can be enabled in the custom parser of operations by setting the 'enableNameShadowing' flag to true when calling 'parseRegion'.
%arg = constant 10 : i32
foo.op {
%arg = constant 10 : i32
}
PiperOrigin-RevId: 264255999
This CL adds an integer attribute to linalg.buffer_alloc and lowering to LLVM.
The alignment is constrained to be a positive power of 2.
Lowering to LLVM produces the pattern:
```
%[[alloc:.*]] = llvm.call @malloc(%[[s]]) : (!llvm.i64) -> !llvm<"i8*">
%[[cast:.*]] = llvm.bitcast %[[alloc]] : !llvm<"i8*"> to !llvm.i64
%[[rem:.*]] = llvm.urem %[[cast]], %[[c16]] : !llvm.i64
%[[drem:.*]] = llvm.sub %[[c16]], %[[rem]] : !llvm.i64
%[[off:.*]] = llvm.urem %[[drem]], %[[c16]] : !llvm.i64
llvm.getelementptr %{{.*}}[%[[off]]] : (!llvm<"i8*">, !llvm.i64) -> !llvm<"i8*">
```
where `ptr` is aligned on `align` by computing the address
`ptr + (align - ptr % align) % align`.
To allow dealloc op to still be able to free memory, additional information is needed in
the buffer type. The buffer type is thus extended with an extra i8* for the base allocation address.
PiperOrigin-RevId: 264244455
Change the prining/parsing of spv.globalVariable to print the type of
the variable after the ':' to be consistent with MLIR convention.
The spv._address_of should print the variable type after the ':'. It was
mistakenly printing the address of the return value. Add a (missing)
test that should have caught that.
Also move spv.globalVariable and spv._address_of tests to
structure-ops.mlir.
PiperOrigin-RevId: 264204686
This CL adds the spv.ReturnValue op and its tests. Also adds a
InFunctionScope trait to make sure that the op stays inside
a function. To be consistent, ModuleOnly trait is changed to
InModuleScope.
PiperOrigin-RevId: 264193081
The linalg.view type used to be lowered to a struct containing a data pointer, offset, sizes/strides information. This was problematic when passing to external functions due to ABI, struct padding and alignment issues.
The linalg.view type is now lowered to LLVMIR as a *pointer* to a struct containing the data pointer, offset and sizes/strides. This simplifies the interfacing with external library functions and makes it trivial to add new functions without creating a shim that would go from a value type struct to a pointer type.
The consequences are that:
1. lowering explicitly uses llvm.alloca in lieu of llvm.undef and performs the proper llvm.load/llvm.store where relevant.
2. the shim creation function `getLLVMLibraryCallDefinition` disappears.
3. views are passed by pointer, scalars are passed by value. In the future, other structs will be passed by pointer (on a per-need basis).
PiperOrigin-RevId: 264183671
Switch to C++14 standard method as llvm::make_unique has been removed (
https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next
integrates.
PiperOrigin-RevId: 263953918
FuncOps in MLIR use explicit capture. So global variables defined in
module scope need to have a symbol name and this should be used to
refer to the variable within the function. This deviates from SPIR-V
spec, which assigns an SSA value to variables at all scopes that can
be used to refer to the variable, which requires SPIR-V functions to
allow implicit capture. To handle this add a new op,
spirv::GlobalVariableOp that can be used to define module scope
variables.
Since instructions need an SSA value, an new spirv::AddressOfOp is
added to convert a symbol reference to an SSA value for use with other
instructions.
This also means the spirv::EntryPointOp instruction needs to change to
allow initializers to be specified using symbol reference instead of
SSA value
The current spirv::VariableOp which returns an SSA value (as defined
by SPIR-V spec) can still be used to define function-scope variables.
PiperOrigin-RevId: 263951109
This CL adds an optional third argument to the vector.outerproduct instruction.
When such a third argument is specified, it is added to the result of the outerproduct and is lowered to FMA intrinsic when the lowering supports it.
In the future, we can add an attribute on the `vector.outerproduct` instruction to modify the operations for which to emit code (e.g. "+/*", "max/+", "min/+", "log/exp" ...).
This CL additionally performs minor cleanups in the vector lowering and adds tests to improve coverage.
This has been independently verified to result in proper fma instructions for haswell as follows.
Input:
```
func @outerproduct_add(%arg0: vector<17xf32>, %arg1: vector<8xf32>, %arg2: vector<17x8xf32>) -> vector<17x8xf32> {
%2 = vector.outerproduct %arg0, %arg1, %arg2 : vector<17xf32>, vector<8xf32>
return %2 : vector<17x8xf32>
}
}
```
Command:
```
mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
```
Output:
```
outerproduct_add: # @outerproduct_add
# %bb.0:
...
vmovaps 112(%rbp), %ymm8
vbroadcastss %xmm0, %ymm0
...
vbroadcastss 64(%rbp), %ymm15
vfmadd213ps 144(%rbp), %ymm8, %ymm0 # ymm0 = (ymm8 * ymm0) + mem
...
vfmadd213ps 400(%rbp), %ymm8, %ymm9 # ymm9 = (ymm8 * ymm9) + mem
...
```
PiperOrigin-RevId: 263743359
Generate the EnumAttr to represent BuiltIns in SPIR-V dialect. The
builtIn can be specified as a StringAttr with value being the
name of the builtin. Extend Decoration (de)serialization to handle
BuiltIns.
Also fix an error in the SPIR-V dialect generator script.
PiperOrigin-RevId: 263596624
All 'getValue' variants now require that the index is valid, queryable via 'isValidIndex'. 'getSplatValue' now requires that the attribute is a proper splat. This allows for querying these methods on DenseElementAttr with all possible value types; e.g. float, int, APInt, etc. This also allows for removing unnecessary conversions to Attribute that really want the underlying value.
PiperOrigin-RevId: 263437337
This CL moves the linalg.load/range/store ops to ODS.
Minor cleanups are performed.
Additional invalid IR tests are added for coverage.
PiperOrigin-RevId: 263432110
This CL fuses the emission of size and stride information and makes it clearer which indexings are stepped over when querying the positions. This refactor was motivated by an index calculation bug in the stride computation.
PiperOrigin-RevId: 263341610
This CL fixes the stepping through operands when emitting the view sizes of linalg.slice to LLVMIR. This is now consistent with the strides emission.
A relevant test is added.
Fix suggested by Alex Zinenko, thanks!
PiperOrigin-RevId: 263150922
This operation is important to achieve decent performance in computational
kernels. In LLVM, it is implemented as an intrinsic (through function
declaration and function call). Thanks to MLIR's extendable set of operations,
it does not have to differentiate between built-ins and intrinsics, so fmuladd
is introduced as a general type-polymorphic operation. Custom printing and
parsing will be added later.
PiperOrigin-RevId: 263106305
The GenerateCubinAccessors was generating functions that fill
dynamically-allocated memory with the binary constant of a CUBIN attached as a
stirng attribute to the GPU kernel. This approach was taken to circumvent the
missing support for global constants in the LLVM dialect (and MLIR in general).
Global constants were recently added to the LLVM dialect. Change the
GenerateCubinAccessors pass to emit a global constant array of characters and a
function that returns a pointer to the first character in the array.
PiperOrigin-RevId: 263092052
Since raw pointers are always passed around for IR construct without
implying any ownership transfer, it can be error prone to have implicit
ownership transferred the same way.
For example this code can seem harmless:
Pass *pass = ....
pm.addPass(pass);
pm.addPass(pass);
pm.run(module);
PiperOrigin-RevId: 263053082
This instruction is a local counterpart of llvm.global that takes a symbol
reference to a global and produces an SSA value containing the pointer to it.
Used in combination, these two operations allow one to use globals with other
operations expecting SSA values. At a cost of IR indirection, we make sure the
functions don't implicitly capture the surrounding SSA values and remain
suitable for parallel processing.
PiperOrigin-RevId: 262908622
This CL is step 3/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds support for converting MLIR n-D vector types to (n-1)-D arrays of 1-D LLVM vectors and a conversion VectorToLLVM that lowers the `vector.extractelement` and `vector.outerproduct` instructions to the proper mix of `llvm.vectorshuffle`, `llvm.extractelement` and `llvm.mulf`.
This has been independently verified to produce proper avx2 code.
Input:
```
func @vec_1d(%arg0: vector<4xf32>, %arg1: vector<8xf32>) -> vector<8xf32> {
%2 = vector.outerproduct %arg0, %arg1 : vector<4xf32>, vector<8xf32>
%3 = vector.extractelement %2[0 : i32]: vector<4x8xf32>
return %3 : vector<8xf32>
}
```
Command:
```
mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
```
Output:
```
vec_1d: # @vec_1d
# %bb.0:
vbroadcastss %xmm0, %ymm0
vmulps %ymm1, %ymm0, %ymm0
retq
```
PiperOrigin-RevId: 262895929
There are currently several different terms used to refer to a parent IR unit in 'get' methods: getParent/getEnclosing/getContaining. This cl standardizes all of these methods to use 'getParent*'.
PiperOrigin-RevId: 262680287
This will allow for reusing the same pattern list, which may be costly to continually reconstruct, on multiple invocations.
PiperOrigin-RevId: 262664599
Unlike regular constant values, strings must be placed in some memory and
referred to through a pointer to that memory. Until now, they were not
supported in function-local constant declarations with `llvm.constant`.
Introduce support for global strings using `llvm.global`, which would translate
them into global arrays in LLVM IR and thus make sure they have some memory
allocated for storage.
PiperOrigin-RevId: 262569316
This CL introduces the ability to generate the external library name for Linalg operations.
The problem is that neither mlir or C support overloading and we want a simplified form of name mangling that is still reasonable to read.
This CL creates the name of the external call that Linalg expects from the operation name and the type of its arguments.
The interface library names are updated and use new cases are added for FillOp.
PiperOrigin-RevId: 262556833
This CL adds the ability for linalg.view to act as a bitcast operation.
This will be used when promoting views into faster memory and casting to vector types.
In the process, linalg.view is moved to ODS.
PiperOrigin-RevId: 262556246
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the vector.outerproduct operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.
PiperOrigin-RevId: 262552027
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.
PiperOrigin-RevId: 262545089
This CL is step 1/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.
This CL adds the 3 instructions `llvm.extractelement`, `llvm.insertelement` and `llvm.shufflevector` as documented in the LLVM LangRef "Vector Instructions" section.
The "Experimental Vector Reduction Intrinsics" are left out for now and can be added in the future on a per-need basis.
Appropriate roundtrip and LLVM Target tests are added.
PiperOrigin-RevId: 262542095
Introduce an operation that defines global constants and variables in the LLVM
dialect, to reflect the corresponding LLVM IR capability. This operation is
expected to live in the top-level module and behaves similarly to
llvm.constant. It currently does not model many of the attributes supported by
the LLVM IR for global values (memory space, alignment, thread-local, linkage)
and will be extended as the relevant use cases appear.
PiperOrigin-RevId: 262539445
This adds support for fcmp to the LLVM dialect and adds any necessary lowerings, as well as support for EDSCs.
Closestensorflow/mlir#69
PiperOrigin-RevId: 262475255
LLVM function type has first-class support for variadic functions. In the
current lowering pipeline, it is emulated using an attribute on functions of
standard function type. In LLVMFuncOp that has LLVM function type, this can be
modeled directly. Introduce parsing support for variadic arguments to the
function and use it to support variadic function declarations in LLVMFuncOp.
Function definitions are currently not supported as that would require modeling
va_start/va_end LLVM intrinsics in the dialect and we don't yet have a
consistent story for LLVM intrinsics.
PiperOrigin-RevId: 262372651
Now that modules are also operations, nothing prevents one from defining SSA
values in the module. Doing so in an implicit top-level module, i.e. outside
of a `module` operation, was leading to a crash because the implicit module was
not associated with an SSA name scope. Create a name scope before parsing the
top-level module to fix this.
PiperOrigin-RevId: 262366891
This CL introduces canonicalization patterns for linalg.dim.
This allows the dimenions of chains of view, slice and subview operations to simplify.
Down the line, when mixed with cse, this also allows better composition of linalg tiling and fusion by tracking operations that give the same result (not in this CL).
PiperOrigin-RevId: 262365865
Verification complained when using zero-dimensional memrefs in
affine.load, affine.store, std.load and std.store. This PR extends
verification so that those memrefs can be used.
Closestensorflow/mlir#58
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/58 from dcaballe:dcaballe/zero-dim 49bcdcd45c52c48beca776431328e5ce551dfa9e
PiperOrigin-RevId: 262164916
This CL extends the Linalg GenericOp with an alternative way of specifying the body of the computation based on a single block region. The "fun" attribute becomes optional.
Either a SymbolRef "fun" attribute or a single block region must be specified to describe the side-effect-free computation. Upon lowering to loops, the new region body is inlined in the innermost loop.
The parser, verifier and pretty printer are extended.
Appropriate roundtrip, negative and lowering to loop tests are added.
PiperOrigin-RevId: 261895568
This CL modifies the LowerLinalgToLoopsPass to use RewritePattern.
This will make it easier to inline Linalg generic functions and regions when emitting to loops in a subsequent CL.
PiperOrigin-RevId: 261894120
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.
PiperOrigin-RevId: 261816316
This trait provides the ensureTerminator() utility function and
the checks to make sure a spv.module is indeed terminated with
spv._module_end.
PiperOrigin-RevId: 261664153
Similar to all LLVM dialect operations, llvm.func needs to have the custom
syntax. Use the generic FunctionLike printer and parser to implement it.
PiperOrigin-RevId: 261641755
llvm ir printer was changed at LLVM r367755.
Prints value numbers for unnamed functions argument.
Closestensorflow/mlir#67
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/67 from denis0x0D:sandbox/fix_mlir_translate ae46844e66f34a02e0cf86782ddadc5bce58b30d
PiperOrigin-RevId: 261640048
This CL introduces a linalg.generic op to represent generic tensor contraction operations on views.
A linalg.generic operation requires a numbers of attributes that are sufficient to emit the computation in scalar form as well as compute the appropriate subviews to enable tiling and fusion.
These attributes are very similar to the attributes for existing operations such as linalg.matmul etc and existing operations can be implemented with the generic form.
In the future, most existing operations can be implemented using the generic form.
This CL starts by splitting out most of the functionality of the linalg::NInputsAndOutputs trait into a ViewTrait that queries the per-instance properties of the op. This allows using the attribute informations.
This exposes an ordering of verifiers issue where ViewTrait::verify uses attributes but the verifiers for those attributes have not been run. The desired behavior would be for the verifiers of the attributes specified in the builder to execute first but it is not the case atm. As a consequence, to emit proper error messages and avoid crashing, some of the
linalg.generic methods are defensive as such:
```
unsigned getNumInputs() {
// This is redundant with the `n_views` attribute verifier but ordering of verifiers
// may exhibit cases where we crash instead of emitting an error message.
if (!getAttr("n_views") || n_views().getValue().size() != 2)
return 0;
```
In pretty-printed form, the specific attributes required for linalg.generic are factored out in an independent dictionary named "_". When parsing its content is flattened and the "_name" is dropped. This allows using aliasing for reducing boilerplate at each linalg.generic invocation while benefiting from the Tablegen'd verifier form for each named attribute in the dictionary.
For instance, implementing linalg.matmul in terms of linalg.generic resembles:
```
func @mac(%a: f32, %b: f32, %c: f32) -> f32 {
%d = mulf %a, %b: f32
%e = addf %c, %d: f32
return %e: f32
}
#matmul_accesses = [
(m, n, k) -> (m, k),
(m, n, k) -> (k, n),
(m, n, k) -> (m, n)
]
#matmul_trait = {
doc = "C(m, n) += A(m, k) * B(k, n)",
fun = @mac,
indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",
n_views = [2, 1],
n_loop_types = [2, 1, 0]
}
```
And can be used in multiple places as:
```
linalg.generic #matmul_trait %A, %B, %C [other-attributes] :
!linalg.view<?x?xf32>, !linalg.view<?x?xf32>, !linalg.view<?x?xf32>
```
In the future it would be great to have a mechanism to alias / register a new
linalg.op as a pair of linalg.generic, #trait.
Also, note that with one could theoretically only specify the `doc` string and parse all the attributes from it.
PiperOrigin-RevId: 261338740
Add StdIndexedValue to EDSC helper so that we can use it
to generated std.load and std.store in EDSC.
Closestensorflow/mlir#59
PiperOrigin-RevId: 261324965
This CL extends the existing spv.constant op to also support
specialization constant by adding an extra unit attribute
on it.
PiperOrigin-RevId: 261194869
verifyUnusedValue is a bit strange given that it is specified in a
result pattern but used to generate match statements. Now we are
able to support multi-result ops better, we can retire it and replace
it with a HasNoUseOf constraint. This reduces the number of mechanisms.
PiperOrigin-RevId: 261166863
We allow to generate more ops than what are needed for replacing
the matched root op. Only the last N static values generated are
used as replacement; the others serve as auxiliary ops/values for
building the replacement.
With the introduction of multi-result op support, an op, if used
as a whole, may be used to replace multiple static values of
the matched root op. We need to consider this when calculating
the result range an generated op is to replace.
For example, we can have the following pattern:
```tblgen
def : Pattern<(ThreeResultOp ...),
[(OneResultOp ...), (OneResultOp ...), (OneResultOp ...)]>;
// Two op to replace all three results
def : Pattern<(ThreeResultOp ...),
[(TwoResultOp ...), (OneResultOp ...)]>;
// One op to replace all three results
def : Pat<(ThreeResultOp ...), (ThreeResultOp ...)>;
def : Pattern<(ThreeResultOp ...),
[(AuxiliaryOp ...), (ThreeResultOp ...)]>;
```
PiperOrigin-RevId: 261017235
Previously we use one single method with lots of branches to
generate multiple builders. This makes the method difficult
to follow and modify. This CL splits the method into multiple
dedicated ones, by extracting common logic into helper methods
while leaving logic specific to each builder in their own
methods.
PiperOrigin-RevId: 261011082
During serialization, the operand number must be used to get the
values assocaited with an operand. Using the argument number in Op
specification was wrong since some of the elements in the arguments
list might be attributes on the operation. This resulted in a segfault
during serialization.
Add a test that exercise that path.
PiperOrigin-RevId: 260977758
Extend the recently introduced support for hexadecimal float literals to tensor
literals, which may also contain special floating point values such as
infinities and NaNs.
Modify TensorLiteralParser to store the list of tokens representing values
until the type is parsed instead of trying to guess the tensor element type
from the token kinds (hexadecimal values can be either integers or floats, and
can be mixed with both). Maintain the error reports as close as possible to
the existing implementation to avoid disturbing the tests. They can be
improved in a separate clean-up if deemed necessary.
PiperOrigin-RevId: 260794716
All non-argument attributes specified for an operation are treated as
decorations on the result value and (de)serialized using OpDecorate
instruction. An error is generated if an attribute is not an argument,
and the name doesn't correspond to a Decoration enum. Name of the
attributes that represent decoerations are to be the snake-case-ified
version of the Decoration name.
Add utility methods to convert to snake-case and camel-case.
PiperOrigin-RevId: 260792638
MLIR does not have support for parsing special floating point values such as
infinities and NaNs. If programmatically constructed, these values are printed
as NaN and (+-)Inf and cannot be parsed back. Add parser support for
hexadecimal literals in float attributes, following LLVM IR. The literal
corresponds to the in-memory representation of the floating point value.
IEEE 754 defines a range of possible values for NaNs, storing the bitwise
representation allows MLIR to properly roundtrip NaNs with different bit values
of significands.
The initial version of this commit was missing support for float literals that
used to be printed in decimal notation as a fallback, but ended up being
printed in hexadecimal format which became the fallback for special values.
The decimal fallback behavior was not exercised by tests. It is currently
reinstated and tested by the newly added test @f32_potential_precision_loss in
parser.mlir.
PiperOrigin-RevId: 260790900
This CL adds an initial implementation for translation of kernel
function in GPU Dialect (used with a gpu.launch_kernel) op to a
spv.Module. The original function is translated into an entry
function.
Most of the heavy lifting is done by adding TypeConversion and other
utility functions/classes that provide most of the functionality to
translate from Standard Dialect to SPIR-V Dialect. These are intended
to be reusable in implementation of different dialect conversion
pipelines.
Note : Some of the files for have been renamed to be consistent with
the norm used by the other Conversion frameworks.
PiperOrigin-RevId: 260759165
RewriterGen was emitting invalid C++ code if the pattern required to create a
zero-result operation due to the absence of a special case that would avoid
generating a spurious comma. Handle this case. Also add rewriter tests for
zero-argument operations.
PiperOrigin-RevId: 260576998
The code was written with the assumption that on failure an error would be
issued by another verifier. However verification is stopping on the first
failure which lead to an empty output. Instead we make sure an error is
displayed.
Also add tests in the test dialect for this trait.
PiperOrigin-RevId: 260541290
Automatic generation of spirv::AccessChainOp (de)serialization needs
the (de)serialization emitters to handle argument specified as
Variadic<...>. To handle this correctly, this argument can only be
the last entry in the arguments list.
Add a test to (de)serialize spirv::AccessChainOp
PiperOrigin-RevId: 260532598
This CL adds a few specializations for sgemm.
A minor change to alpha is made in cblas_interface.cpp to be compatible with actual BLAS calls.
For now this is for internal testing purposes only.
PiperOrigin-RevId: 260129027
It's quite common that we want to put further constraints on the matched
multi-result op's specific results. This CL enables referencing symbols
bound to source op with the `__N` syntax.
PiperOrigin-RevId: 260122401
In the backward slice computation, BlockArgument coming from function arguments represent a natural boundary for the traversal and should not trigger llvm_unreachable.
This CL also improves the error message and adds a relevant test.
PiperOrigin-RevId: 260118630
This CL provides a fix that makes linal_matmul_impl compliant with the BLAS interface. Before this CL it would compute either C += A * B when called with cblas.cpp:cblas_sgemm implementation and C = A * B with other implementations.
PiperOrigin-RevId: 260117367
Clipping creates non-affine memory accesses, use std_load and std_store instead of affine_load and affine_store.
In the future we may also want a fill with the neutral element rather than clip, this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to pointwise copies.
PiperOrigin-RevId: 260110503
AccessChainOp creates a pointer into a composite object that can be used with
OpLoad and OpStore.
Closestensorflow/mlir#52
PiperOrigin-RevId: 260035676
MLIR does not have support for parsing special floating point values such as
infinities and NaNs. If programmatically constructed, these values are printed
as NaN and (+-)Inf and cannot be parsed back. Add parser support for
hexadecimal literals in float attributes, following LLVM IR. The literal
corresponds to the in-memory representation of the floating point value.
IEEE 754 defines a range of possible values for NaNs, storing the bitwise
representation allows MLIR to properly roundtrip NaNs with different bit values
of significands.
PiperOrigin-RevId: 260018802
This mode analyzes which operations are legalizable to the given target if a conversion were to be applied, i.e. no rewrites are ever performed even on success. This mode is useful for device partitioning or other utilities that may want to analyze the effect of conversion to different targets before performing it.
The analysis method currently just fills a provided set with the operations that were found to be legalizable. This can be extended in the future to capture more information as necessary.
PiperOrigin-RevId: 259987105
This CL fixes an oversight with dealing with loops in slicing analysis.
The forward slice computation properly propagates through loops but not the backward slice.
Add relevant unit tests.
PiperOrigin-RevId: 259903396
Per tacit agreement, individual dialects should now live in lib/Dialect/Name
with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.
PiperOrigin-RevId: 259896851
This CL adds support for SubViewOp in the alias analysis to permit multiple Linalg fusion passes to compose. The debugging messages are also improved for better readability. The readability benefits came in handy when tracking this issue.
A 2-level fusion test is added to capture the new behavior.
PiperOrigin-RevId: 259720246
Conversion from integers (window or input size, padding etc) to floating point is required to express many ML kernels, for example average pooling.
PiperOrigin-RevId: 259575284
The loop parallelism detection utility only collects the affine.load and
affine.store operations appearing inside the loop to analyze the access
patterns for the absence of dependences. However, any operation, including
unregistered operations, can appear in a body of an affine loop. If such
operation has side effects, the result of parallelism analysis is incorrect.
Conservatively assume affine loops are not parallel in presence of operations
other than affine.load, affine.store, affine.for, affine.terminator that may
have side effects.
This required to update the loop-fusion unit test that relies on parallelism
analysis and was exercising loop fusion in presence of an unregistered
operation.
PiperOrigin-RevId: 259560935
A recent commit introduced UnitAttr into the ODS but did not include the
support for using UnitAttrs in operation definitions (only patterns were
supported). Extend the ODS definition of UnitAttr to be usable in operation
definition by providing a trivial builder and an accessor that returns "true"
if the unit attribute is present since the attribute presence itself has
meaning.
Additionally, test that unit attributes are effectively rewritten in patterns
in addition to the already available FileCheck tests of the generated rewriter
code.
PiperOrigin-RevId: 259560653
Originally, MLIR only supported functions of the built-in FunctionType. On the
conversion path to LLVM IR, we were creating MLIR functions that contained LLVM
dialect operations and used LLVM IR types for everything expect top-level
functions (e.g., a second-order function would have a FunctionType that consume
or produces a wrapped LLVM function pointer type). With MLIR functions
becoming operations, it is now possible to introduce non-built-in function
operations. This will let us use conversion patterns for function conversion,
simplify the MLIR-to-LLVM translation by removing the knowledge of the MLIR
built-in function types, and provide stronger correctness verifications (e.g.
LLVM functions only accept LLVM types).
Furthermore, we can currently construct a situation where the same function is
used with two different types: () -> () when its specified and called directly,
and !llvm<"void ()"> when it's passed somewhere on called indirectly. Having a
special function-op that is always of !llvm<"void ()"> type makes the function
model and the llvm dialect type system more consistent.
Introduce LLVMFuncOp to represent a function in the LLVM dialect. Unlike
standard FuncOp, this function has an LLVMType wrapping an LLVM IR function
type. Generalize the common behavior of function-defining operations
(functions live in a symbol table of a module, contain a single region, are
iterable as a list of blocks, and support argument attributes).
This only defines the operation. Custom syntax, conversion and translation
rules will be added in follow-ups.
The operation name mentions LLVM explicitly to avoid confusion with standard
FuncOp, especially in multiple files that use both `mlir` and `mlir::LLVM`
namespaces.
PiperOrigin-RevId: 259550940
- introduce parseRegionArgumentList (similar to parseOperandList) to parse a
list of region arguments with a delimiter
- allows defining custom parse for op's with multiple/variadic number of
region arguments
- use this on the gpu.launch op (although the latter has a fixed number
of region arguments)
- add a test dialect op to test region argument list parsing (with the
no delimiter case)
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#40
PiperOrigin-RevId: 259442536
SPIR-V has multiple constant instructions covering different
constant types:
* `OpConstantTrue` and `OpConstantFalse` for boolean constants
* `OpConstant` for scalar constants
* `OpConstantComposite` for composite constants
* `OpConstantNull` for null constants
* ...
We model them all with a single spv.constant op for uniformity
and friendliness to transformations. This does mean that when
doing (de)serialization, we need to poke spv.constant's type
to determine which SPIR-V binary instruction to use.
This CL only covers the case of bool and integer spv.constant.
The rest will follow.
PiperOrigin-RevId: 259311698
In the trait verifier of SingleBlockImplicitTerminator, report the name of the
unexpected terminator op found in the end of the block in addition to the name
of the expected terminator op. This may simplify debugging, especially in
cases where the terminator is omitted for brevity and/or after a long series of
conversions.
PiperOrigin-RevId: 259287452
This cl enforces that the conversion of the type signatures for regions, and thus their entry blocks, is handled via ConversionPatterns. A new hook 'applySignatureConversion' is added to the ConversionPatternRewriter to perform the desired conversion on a region. This also means that the handling of rewriting the signature of a FuncOp is moved to a pattern. A default implementation is provided via 'mlir::populateFuncOpTypeConversionPattern'. This removes the hacky implicit 'dynamically legal' status of FuncOp that was present previously, and leaves it up to the user to decide when/how to convert the signature of a function.
PiperOrigin-RevId: 259161999
Since the serialization of EntryPointOp contains the name of the
function as well, the function serialization emits the function name
using OpName instruction, which is used during deserialization to get
the correct function name.
PiperOrigin-RevId: 259158784
The TypeUtilities.{cpp,h}, currently living in {lib,include/mlir}/Support, do
not belong to the Support library. Instead, they form a separate utility
library that depends on the IR library. The operations it provides relate to
standard types (tensors, memrefs) as well as to operation manipulation, making
them a better fit for the main IR library.
PiperOrigin-RevId: 259108314
When printing the value attribute in spv.constant, OpAsmPrinter
already attaches a trailing type. So we don't need to duplicate
it again unless it's an array attribute, which does not have
type by default but we use it for spirv::ArrayType.
PiperOrigin-RevId: 258994197
This CL changes the Op definition of spirv::EntryPointOp and
spirv::ExecutionModeOp to be consistent with the SPIR-V spec.
1) The EntryPointOp doesn't return a value
2) The ExecutionModeOp takes as argument, the SymbolRefAttr to refer
to the function, instead of the result of the EntryPointOp.
Following this, the spirv::EntryPointType is no longer necessary, and
is removed.
PiperOrigin-RevId: 258964027
Several groups of operations in different dialects (e.g. AffineForOp,
AffineIfOp; loop::ForOp, loop::IfOp) share the requirement for their regions to
contain 0 or 1 block, and for blocks to always have a specific terminator type.
Furthermore, this terminator may be omitted from the custom syntax. Generalize
this behavior into OpTrait::SingleBlockImplicitTerminator, parameterized by the
terminator operation type. This trait provides the verifier that checks the
presence of the terminator, and utility functions adding the terminator in case
of absence.
PiperOrigin-RevId: 258957180
This CL introduces a simple loop utility function which rewrites the bounds and step of a loop so as to become mappable on a regular grid of processors whose identifiers are given by SSA values.
A corresponding unit test is added.
For example, using CUDA terminology, and assuming a 2-d grid with processorIds = [blockIdx.x, threadIdx.x] and numProcessors = [gridDim.x, blockDim.x], the loop:
```
loop.for %i = %lb to %ub step %step {
...
}
```
is rewritten into a version resembling the following pseudo-IR:
```
loop.for %i = %lb + threadIdx.x + blockIdx.x * blockDim.x to %ub
step %gridDim.x * blockDim.x {
...
}
```
PiperOrigin-RevId: 258945942
This CL adapts the recently introduced parametric tiling to have an API matching the tiling
of AffineForOp. The transformation using stripmineSink is more general and produces imperfectly nested loops.
Perfect nesting invariants of the tiled version are obtained by selectively applying hoisting of ops to isolate perfectly nested bands. Such hoisting may fail to produce a perfect loop nest in cases where ForOp transitively depend on enclosing induction variables. In such cases, the API provides a LogicalResult return but the SimpleParametricLoopTilingPass does not currently use this result.
A new unit test is added with a triangular loop for which the perfect nesting property does not hold. For this example, the old behavior was to produce IR that did not verify (some use was not dominated by its def).
PiperOrigin-RevId: 258928309
This allows for providing specific handling for dynamically legal operations/dialects without overriding the general 'isDynamicallyLegal' hook. This also means that a derived ConversionTarget class need not always be defined when some operations are dynamically legal.
Example usage:
ConversionTarget target(...);
target.addDynamicallyLegalOp<ReturnOp>([](ReturnOp op) {
return ...
};
target.addDynamicallyLegalDialect<StandardOpsDialect>([](Operation *op) {
return ...
};
PiperOrigin-RevId: 258884753
This specific PatternRewriter will allow for exposing hooks in the future that are only useful for the conversion framework, e.g. type conversions.
PiperOrigin-RevId: 258818122
Some TensorFlow simulated quantize ops such as QuantizeAndDequantizeV2Op have
attribute for the sign of the quantization, so quant_ConstFakeQuant should be
able to represent it with the new attribute is added.
The method for converting these attributes to an QuantizedType is updated to
handle this new argument.
PiperOrigin-RevId: 258810290
We already parse boolean "true"/"false" as ElementsAttr elements.
This CL makes it round-trippable that we are printing the same way.
PiperOrigin-RevId: 258784962
For ops in SPIR-V dialect that are a direct mirror of SPIR-V
operations, the serialization/deserialization methods can be
automatically generated from the Op specification. To enable this an
'autogenSerialization' field is added to SPV_Ops. When set to
non-zero, this will enable the automatic (de)serialization function
generation
Also adding tests that verify the spv.Load, spv.Store and spv.Variable
ops are serialized and deserialized correctly. To fully support these
tests also add serialization and deserialization of float types and
spv.ptr types
PiperOrigin-RevId: 258684764
We only verify broadcastable trait verifier and don't care about mutations so removed all CHECK statements and FileCheck invocation.
PiperOrigin-RevId: 258662882
This cl standardizes the printing of the type of dialect attributes to work the same as other attribute kinds. The type of dialect attributes will trail the dialect specific portion:
`#` dialect-namespace `<` attr-data `>` `:` type
The attribute parsing hooks on Dialect have been updated to take an optionally null expected type for the attribute. This matches the respective parseAttribute hooks in the OpAsmParser.
PiperOrigin-RevId: 258661298
Currently, Broadcastable trait also rejects instances when the op result has shape other than what can be statically inferred based on the operand shapes even if the result shape is compatible with the inferred broadcasted shape.
For example,
(tensor<3x2xi32>, tensor<*xi32>) -> tensor<4x3x2xi32>
(tensor<2xi32>, tensor<2xi32>) -> tensor<*xi32>
PiperOrigin-RevId: 258647493
This cl begins a large refactoring over how signature types are converted in the DialectConversion infrastructure. The signatures of blocks are now converted on-demand when an operation held by that block is being converted. This allows for handling the case where a region is created as part of a pattern, something that wasn't possible previously.
This cl also generalizes the region signature conversion used by FuncOp to work on any region of any operation. This generalization allows for removing the 'apply*Conversion' functions that were specific to FuncOp/ModuleOp. The implementation currently uses a new hook on TypeConverter, 'convertRegionSignature', but this should ideally be removed in favor of using Patterns. That depends on adding support to the PatternRewriter used by ConversionPattern to allow applying signature conversions to regions, which should be coming in a followup.
PiperOrigin-RevId: 258645733
This explicit tag is useful is several ways:
*) This simplifies how to mark sub sections of a dialect as explicitly unsupported, e.g. my target supports all operations in the foo dialect except for these select few. This is useful for partial lowerings between dialects.
*) Partial conversions will now verify that operations that were explicitly marked as illegal must be converted. This provides some guarantee that the operations that need to be lowered by a specific pass will be.
PiperOrigin-RevId: 258582879
Users generally want several different modes of conversion. This cl refactors DialectConversion to provide two:
* Partial (applyPartialConversion)
- This mode allows for illegal operations to exist in the IR, and does not fail if an operation fails to be legalized.
* Full (applyFullConversion)
- This mode fails if any operation is not properly legalized to the conversion target. This allows for ensuring that the IR after a conversion only contains operations legal for the target.
PiperOrigin-RevId: 258412243
Mostly one would use the type specification directly on the operand, but for
cases where the type of the operand depends on other operand types, `TypeIs`
attribute can be used to construct verification methods.
PiperOrigin-RevId: 258411758
With the introduction of the Loop dialect, uses of the `linalg.for` operation can now be subsumed 1-to-1 by `loop.for`.
This CL performs the replacement and tests are updated accordingly.
PiperOrigin-RevId: 258322565
This CL extends the linalg ops that can be tiled and fused to operations that take either views, scalar or vector operands.
PiperOrigin-RevId: 258159734
Multiple (perfectly) nested loops with independent bounds can be combined into
a single loop and than subdivided into blocks of arbitrary size for load
balancing or more efficient parallelism exploitation. However, MLIR wants to
preserve the multi-dimensional multi-loop structure at higher levels of
abstraction. Introduce a transformation that coalesces nested loops with
independent bounds so that they can be further subdivided by tiling.
PiperOrigin-RevId: 258151016
This CL extends the linalg ops that can be tiled and fused to operations that take either views, scalar or vector operands.
PiperOrigin-RevId: 258149291
These ops should not belong to the std dialect.
This CL extracts them in their own dialect and updates the corresponding conversions and tests.
PiperOrigin-RevId: 258123853
When using a RewritePattern and replacing an operation with an existing value, that value may have already been replaced by something else. This cl ensures that only the final value is used when applying rewrites.
PiperOrigin-RevId: 258058488
following SPIRV Instructions serializaiton/deserialization are added
as well
OpFunction
OpFunctionParameter
OpFunctionEnd
OpReturn
PiperOrigin-RevId: 257869806
* Changed SPIR-V types to all use unsigned for member type count and index
* Used the same method name for getting element type and count
* Improved spv.CompositeExtract verification a bit
PiperOrigin-RevId: 257862580
This CL splits the lowering of affine to LLVM into 2 parts:
1. affine -> std
2. std -> LLVM
The conversions mostly consists of splitting concerns between the affine and non-affine worlds from existing conversions.
Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process, it can be reintroduced later if needed.
LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build.
PiperOrigin-RevId: 257794436
This allows for the attribute to hold symbolic references to other operations than FuncOp. This also allows for removing the dependence on FuncOp from the base Builder.
PiperOrigin-RevId: 257650017
Affine load and store operations take a variadic number of arguments, most of
which are interpreted as subscripts for the multi-dimensional memref they
access. Add a verifier check that ensures the number of operands is equal to
the number affine remapping inputs if present and to the rank of the acessed
memref otherwise. Although it is impossible to obtain such operations by
parsing the custom syntax, it is possible to construct them using the generic
syntax or programmatically.
PiperOrigin-RevId: 257605902
This changes the top-level module parser to handle the case where the top-level module is defined with the module operation syntax, i.e:
module ... {
}
The printer is also updated to always print the top-level module in this form. This allows for cleanly round-tripping the location and attributes of the top-level module.
PiperOrigin-RevId: 257492069
Standard load and store operations are evolving to be separated from the Affine
constructs. Special affine.load/store have been introduced to uphold the
restrictions of the Affine control flow constructs on their operands.
EDSC-produced loads and stores were originally intended to uphold those
restrictions as well so they should use affine.load/store instead of
std.load/store.
PiperOrigin-RevId: 257443307
This was an arbitrary restriction caused by the way that modules were printed. Now that that has been fixed, this restriction can be removed.
PiperOrigin-RevId: 257240329
Change the AsmPrinter to number values breadth-first so that values in adjacent regions can have the same name. This allows for ModuleOp to contain operations that produce results. This also standardizes the special name of region entry arguments to "arg[0-9+]" now that Functions are also operations.
PiperOrigin-RevId: 257225069
Parametric tiling can be used to extract outer loops with fixed number of
iterations. This in turn enables mapping to GPU kernels on a fixed grid
independently of the range of the original loops, which may be unknown
statically, making the kernel adaptable to different sizes. Provide a utility
function that also computes the parametric tile size given the range of the
loop. Exercise the utility function through a simple pass that applies it to
all top-level loop nests. Permutability or parallelism checks must be
performed before calling this utility function in actual passes.
Note that parametric tiling cannot be implemented in a purely affine way,
although it can be encoded using semi-affine maps. The choice to implement it
on standard loops is guided by them being the common representation between
Affine loops, Linalg and GPU kernels.
PiperOrigin-RevId: 257180251
Extend the utility that converts affine loop nests to support other types of
loops by abstracting away common behavior through templates. This also
slightly simplifies the existing Affine to GPU conversion by always passing in
the loop step as an additional kernel argument even though it is a known
constant. If it is used, it will be propagated into the loop body by the
existing canonicalization pattern and can be further constant-folded, otherwise
it will be dropped by canonicalization.
This prepares for the common loop abstraction that will be used for converting
to GPU kernels, which is conceptually close to Linalg loops, while maintaining
the existing conversion operational.
PiperOrigin-RevId: 257172216
Now both functions and modules are just general ops and we do not require
top-level entities in a module's block to be the old builtin functions
any more. Removing the wrapping functions to simplify the tests.
PiperOrigin-RevId: 257003572
This CL adds an "std.if" op to represent an if-then-else construct whose condition is an arbitrary value of type i1.
This is necessary to lower all the existing examples from affine and linalg to std.for + std.if.
This CL introduces the op and adds the relevant positive and negative unit test. Lowering will be done in a separate followup CL.
PiperOrigin-RevId: 256649138
Some operations need to override the default behavior of builders, in
particular region-holding operations such as affine.for or tf.graph want to
inject default terminators into the region upon construction, which default
builders won't do. Provide a flag that disables the generation of default
builders so that the custom builders could use the same function signatures.
This is an intentionally low-level and heavy-weight feature that requires the
entire builder to be implemented, and it should be used sparingly. Injecting
code into the end of a default builder would depend on the naming scheme of the
default builder arguments that is not visible in the ODS. Checking that the
signature of a custom builder conflicts with that of a default builder to
prevent emission would require teaching ODG to differentiate between types and
(optional) argument names in the generated C++ code. If this flag ends up
being used a lot, we should consider adding traits that inject specific code
into the default builder.
PiperOrigin-RevId: 256640069
This tool allows to execute MLIR IR snippets written in the GPU dialect
on a CUDA capable GPU. For this to work, a working CUDA install is required
and the build has to be configured with MLIR_CUDA_RUNNER_ENABLED set to 1.
PiperOrigin-RevId: 256551415
This CL introduces a new syntax for creating multi-result ops and access their
results in result patterns. Specifically, if a multi-result op is unbound or
bound to a name without a trailing `__N` suffix, it will act as a value pack
and expand to all its values. If a multi-result op is bound to a symbol with
`__N` suffix, only the N-th result will be extracted and used.
PiperOrigin-RevId: 256465208
This is an important step in allowing for the top-level of the IR to be extensible. FuncOp and ModuleOp contain all of the necessary functionality, while using the existing operation infrastructure. As an interim step, many of the usages of Function and Module, including the name, will remain the same. In the future, many of these will be relaxed to allow for many different types of top-level operations to co-exist.
PiperOrigin-RevId: 256427100
In most places, this is just a name change (with the exception of affine.dma_start swapping the operand positions of its tag memref and num_elements operands).
Significant code changes occur here:
*) Vectorization: LoopAnalysis.cpp, Vectorize.cpp
*) Affine Transforms: Transforms/Utils/Utils.cpp
PiperOrigin-RevId: 256395088
This CL refactors tiling to enable tiling of views that are not just specified by a simple permutation. This allows the tiling of convolutions for which a new example is added.
PiperOrigin-RevId: 256346028
This CL is the first step of a refactoring unification of the control flow abstraction used in different dialects. The `std.for` loop accepts unrestricted indices to encode min, max and step and will be used as a common abstraction on the way to lower level dialects.
PiperOrigin-RevId: 256331795
This CL uses the generic CopyOp to promote a subview (constructed during tiling) into a new buffer + copy by:
1. Creating a new buffer for the subview.
2. Taking a view into the buffer and copying into it.
3. Adapting the linalg op to operating on the view from point 2.
Tiling is extended with a boolean flag to enable promoting views (all or nothing for now).
More specifically, the current implementation creates a buffer that is always of the full size of the ranges of the subview. This produces a buffer whose size may be bigger
than the actual size of the `subView` at the boundaries and is related to the full/partial tile problem.
In practice, we introduce a `buffer`, a `fullLocalView` and a `partialLocalView` such that:
* `buffer` is always the size of the subview in the full tile case.
* `fullLocalView` is a dense contiguous view into that buffer.
* `partialLocalView` is a dense non-contiguous slice of `fullLocalView`
that corresponds to the size of `subView` and accounting for boundary
effects.
The point of the full tile buffer is that constant static tile sizes are
folded and result in a buffer type with statically known size and alignment
properties.
Padding is introduced on the boundary tiles with a `fill` op followed by a partial `copy` op.
These behaviors will be refined later, on a per-need basis.
PiperOrigin-RevId: 256237319
As with Functions, Module will soon become an operation, which are value-typed. This eases the transition from Module to ModuleOp. A new class, OwningModuleRef is provided to allow for owning a reference to a Module, and will auto-delete the held module on destruction.
PiperOrigin-RevId: 256196193
about the buffer size. This is needed to resolve the operand
correctly. Add that information to view op
serialization/deserialization
Also modify the parsing of buffer type by splitting at 'x' to
side-step issues with StringRef number parsing.
PiperOrigin-RevId: 256188319
This saves us the excessive string conversions and comparisons in
verification and transformation and scopes them only to parsing
and printing, which are meant for I/O so string conversions should
be fine.
In order to do this, changed the custom assembly format of
spv.module regarding addressing model and memory model.
PiperOrigin-RevId: 256149856
The GPU Launch operation may take constants as arguments, in particular
affine-to-GPU mapping pass automatically forwards potentially constant lower
bounds of loops into the kernel. Define a canonicalization pattern for
LaunchOp that recreates the constants inside the kernel region instead of
accepting them as kernel arguments. This is currently restricted to standard
constants but may be extended to other constant operations.
Also fix an off-by-one indexing bug in OperandStorage::eraseOperand.
PiperOrigin-RevId: 256035437
Move the data members out of Function and into a new impl storage class 'FunctionStorage'. This allows for Function to become value typed, which will greatly simplify the transition of Function to FuncOp(given that FuncOp is also value typed).
PiperOrigin-RevId: 255983022
Type conversion does not necessarily affect all types, some of them may remain
untouched. The type conversion tool from the dialect conversion framework will
unconditionally insert a temporary cast operation from the type to itself
anyway, and will try to materialize it to a real conversion operation if there
are remaining uses. Simply use the original value instead.
PiperOrigin-RevId: 255975450
The RUN line was missing a call to FileCheck making the test always pass. Add
the call to FileCheck and temporarily disable one of the tests that does not
produce the expected result.
PiperOrigin-RevId: 255974805
In ODS, right now we use StringAttrs to emulate enum attributes. It is
suboptimal if the op actually can and wants to store the enum as a
single integer value; we are paying extra cost on storing and comparing
the attribute value.
This CL introduces a new enum attribute subclass that are backed by
IntegerAttr. The downside with IntegerAttr-backed enum attributes is
that the assembly form now uses integer values, which is less obvious
than the StringAttr-backed ones. However, that can be remedied by
defining custom assembly form with the help of the conversion utility
functions generated via EnumsGen.
Choices are given to the dialect writers to decide which one to use for
their enum attributes.
PiperOrigin-RevId: 255935542
This functionality is now moved to a new class, ModuleManager. This class allows for inserting functions into a module, and will auto-rename them on insert to ensure a unique name. This now means that users adding new functions to a module must ensure that the function name is unique, as the Module will no longer do it automatically. This also means that Module::getNamedFunction now operates in O(N) instead of the O(c) time it did before. This simplifies the move of Modules to Operations as the ModuleOp will not be able to have this functionality.
PiperOrigin-RevId: 255846088
This CL also fixes a parsing issue in the BufferType, adds LLVM lowering support for handling the static constant buffer size and a roundtrip test.
PiperOrigin-RevId: 255834356
These ops are analogues of the current standard ops dma_start/wait, with the exception that the memref operands are affine expressions of loop IVs and symbols (analogous to affine.load/store).
The addition of these operations will enable changes to affine transformation and analysis passes which operate on memory dereferencing operations.
PiperOrigin-RevId: 255658382
During conversion, if a type conversion has dangling uses a type conversion must persist after conversion has finished to maintain valid IR. In these cases, we now query the TypeConverter to materialize a conversion for us. This allows for the default case of a full conversion to continue working as expected, but also handle the degenerate cases more robustly.
PiperOrigin-RevId: 255637171
constant then it is represented as <size x elementType>. If the size
is not a compile time constant, then it is represented as
<? x elementType>.
PiperOrigin-RevId: 255619400
Remove the ability to print an attribute without a type, but allow for attributes to elide the type under certain circumstances. This fixes a bug where attributes within ArrayAttr, and other collection attributes, would never print the type.
PiperOrigin-RevId: 255306974
annotations.
Getters are required as there are currently no global constants in MLIR and this
is an easy way to unblock CUDA execution while waiting for those.
PiperOrigin-RevId: 255169002
The actual transformation from PTX source to a CUDA binary is now factored out,
enabling compiling and testing the transformations independently of a CUDA
runtime.
MLIR has still to be built with NVPTX target support for the conversions to be
built and tested.
PiperOrigin-RevId: 255167139
Now that Locations are attributes, they have direct access to the MLIR context. This allows for simplifying error emission by removing unnecessary context lookups.
PiperOrigin-RevId: 255112791
The current syntax separates the name and value with ':', but ':' is already overloaded by several other things(e.g. trailing types). This makes the syntax difficult to parse in some situtations:
Old:
"foo: 10 : i32"
New:
"foo = 10 : i32"
PiperOrigin-RevId: 255097928
This is the standard syntax for types on operations, and is also already used by IntegerAttr and FloatAttr.
Example:
dense<5> : tensor<i32>
dense<[3]> : tensor<1xi32>
PiperOrigin-RevId: 255069157
The OperationFolder currently just inserts into the entry block of a Function, but regions may be isolated above, i.e. explicit capture only, and blindly inserting constants may break the invariants of these regions.
PiperOrigin-RevId: 254987796
GPU dialect operations (launch and launch_func) use `index` type for thread and
block index values inside the kernel, for compatibility with affine loops.
NVVM dialect operations, following the NVVM intrinsics, use `!llvm.i32` type,
which does not necessarily have the same bit width as the lowered `index` type.
Optionally sign-extend (indices are signed) or truncate the result of the NVVM
dialect operation to the bit width of the lowered `index` type before passing
it to other operations. This behavior is consistent with `std.index_cast`. We
cannot use the latter since we are targeting LLVM dialect types directly,
rather than standard integer types.
PiperOrigin-RevId: 254980868
PTX backend in LLVM expects additional module-level metadata
`!nvvm.annotations` that lists functions that can be used as GPU kernels.
Generate this metadata based on the `gpu.kernel` attribute attached to
functions. This attribute is added automatically by the kernel outlining pass
in the GPU dialect lowering flow.
PiperOrigin-RevId: 254957345
The error would look like:
path/filename.mlir:32:23: error: use of value '%28' expects different type than prior uses: ''i32'' vs ''!_tf.control''
PiperOrigin-RevId: 254874859
This CL makes use of view_slice in tiling and fusion.
Using a higher level IR element greatly simplifies the IR produced during tiling and fusion.
Lowering to LLVM is updated to first translate view_slice into a sequence of dim, range and cmpi.
This level will also be useful when lowering to affine.
PiperOrigin-RevId: 254767814
Enable reusing the real mlir-opt main from unit tests and in case where
additional initialization needs to happen before main is invoked (e.g., when
using different command line flag libraries).
PiperOrigin-RevId: 254764575
The new operations affine.load and affine.store will take composed affine maps by construction.
These operations will eventually replace load and store operations currently used in affine regions and operated on by affine transformation and analysis passes.
PiperOrigin-RevId: 254754048
The ModuleOp contains a single region that must contain a single block. This block must be terminated by a new pseudo operation 'module_terminator'. The syntax for this operations is as follows:
`module` (`attributes` attr-dict)? region
Example:
module {
...
}
module attributes { ... } {
...
}
PiperOrigin-RevId: 254513752
This CL adds a conv op that corresponds to the TF description along with its lowering to loops (https://www.tensorflow.org/api_docs/python/tf/nn/convolution).
The dimension of the convolution is inferred from the rank of the views. The other logical
dimensions correspond to the TF description.
The computation of tiled views need to be updated to work for the input tensor. This is left for a future CL.
PiperOrigin-RevId: 254505644
This will allow for locations to be used in the same contexts as attributes. Given that attributes are nullable types, the 'Location' class now represents a non-nullable wrapper around a 'LocationAttr'. This preserves the desired semantics we have for non-optional locations.
PiperOrigin-RevId: 254505278
This CL adds the basic SPIR-V serializer and deserializer for converting
SPIR-V module into the binary format and back. Right now only an empty
module with addressing model and memory model is supported; (de)serialize
other components will be added gradually with subsequent CLs.
The purpose of this library is to enable importing SPIR-V binary modules
to run transformations on them and exporting SPIR-V modules to be consumed
by execution environments. The focus is transformations, which inevitably
means changes to the binary module; so it is not designed to be a general
tool for investigating the SPIR-V binary module and does not guarantee
roundtrip equivalence (at least for now).
PiperOrigin-RevId: 254473019
This CL adds support for O-D ops in Linalg ops by:
1. making the CopyOp maps optional instead of default valued
2. allowing certain map operations to accept and return empty maps
3. making linalg::LowerToLoops aware of these changes
4. providing a proper 0-D impl for CopyOp and FillOp
5. adding the relevant tests
PiperOrigin-RevId: 254381908
https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.html#OpTypeImage.
Add new enums to describe Image dimensionality, Image Depth, Arrayed
information, Sampling, Sampler User information, and Image format.
Doesn's support the Optional Access qualifier at this stage
Fix Enum generator for tblgen to add "_" at the beginning if the enum
starts with a number.
PiperOrigin-RevId: 254091423
* Support for 1->0 type mappings, i.e. when the argument is being removed.
* Reordering types when converting a type signature.
* Adding new inputs when converting a type signature.
This cl also lays down the initial foundation for supporting 1->N type mappings, but full support will come in a followup.
Moving forward, function signature changes will be driven by populating a SignatureConversion instance. This class contains all of the necessary information for adding/removing/remapping function signatures; e.g. addInputs, addResults, remapInputs, etc.
PiperOrigin-RevId: 254064665
These were likely added in error because of confusion about the flag when it was just called "-verify". The extra flag doesn't cause much harm, but it does make mlir-opt do more work and clutter the RUN line
PiperOrigin-RevId: 254037016
This name has caused some confusion because it suggests that it's running op verification (and that this verification isn't getting run by default).
PiperOrigin-RevId: 254035268