To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere. For ops that
don't have a SPIR-V spec counterpart, we use spv.mlir.snake_case.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D98014
Normally tensors will be stored in buffers before converting to SPIR-V,
given that is how a large amount of data is sent to the GPU. However,
SPIR-V supports converting from tensors directly too. This is for the
cases where the tensor just contains a small number of elements and it
makes sense to directly inline them as a small data array in the shader.
To handle this, internally the conversion might create new local
variables. SPIR-V consumers in GPU drivers may or may not optimize that
away. So this has implications for register pressure. Therefore, a
threshold is used to control when the patterns should kick in.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D98052
The two dialects are largely redundant. The former was introduced as a mirror
of the latter operating on LLVM dialect types. This is no longer necessary
since the LLVM dialect operates on built-in types. Combine the two dialects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D98060
With the new vector.load/store operations, there is no need to go through
unmasked transfer operations (which will be canonicalized to load/store anyway).
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D98056
This patch is a follow-up on D97217. It adds a new 'Skip' result to the Operation visitor
so that a callback can stop the ongoing visit of an operation/block/region and
continue visiting the next one without fully interrupting the walk. Skipping is
needed to be able to erase an operation/block in pre-order without continuing to
visit the internals of that operation/block.
Related to the skipping mechanism, the patch also introduces the following changes:
* Added new TestIRVisitors pass with basic testing for the IR visitors.
* Fixed missing early increment ranges in visitor implementation.
* Updated documentation of walk methods to include erasure information and walk
order information.
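
A minimal sketch of the skipping behavior described above, written as a toy
Python model rather than the actual C++ visitor. The `Node` class, the
`WalkResult` enum values and the helper names are hypothetical stand-ins
chosen for illustration only.

```python
from enum import Enum


class WalkResult(Enum):
    ADVANCE = 0    # keep walking normally
    SKIP = 1       # do not visit the internals of the current node
    INTERRUPT = 2  # abort the whole walk


class Node:
    """Hypothetical stand-in for an operation with nested children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk_pre_order(node, callback):
    """Visit `node` before its children; return INTERRUPT if the walk was aborted."""
    result = callback(node)
    if result is WalkResult.INTERRUPT:
        return WalkResult.INTERRUPT
    if result is WalkResult.SKIP:
        # Do not descend; the caller continues with the next sibling.
        return WalkResult.ADVANCE
    # Iterate over a copy so the callback may detach children safely
    # (a rough analogue of an early-increment range).
    for child in list(node.children):
        if walk_pre_order(child, callback) is WalkResult.INTERRUPT:
            return WalkResult.INTERRUPT
    return WalkResult.ADVANCE


def visited_names(root, skip=(), stop_at=None):
    """Return the names visited when skipping `skip` and interrupting at `stop_at`."""
    seen = []

    def callback(node):
        seen.append(node.name)
        if node.name == stop_at:
            return WalkResult.INTERRUPT
        if node.name in skip:
            return WalkResult.SKIP
        return WalkResult.ADVANCE

    walk_pre_order(root, callback)
    return seen


tree = Node("module", [
    Node("func_a", [Node("op1"), Node("op2")]),
    Node("func_b", [Node("op3")]),
])
```

In this model, skipping `func_a` still visits `func_a` itself and then moves
on to `func_b`, whereas interrupting stops the walk entirely.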
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97820
This patch extends the Region, Block and Operation visitors to also support pre-order walks.
We introduce a new template argument that dictates the walk order (only pre-order and
post-order are supported for now). The default order for Regions, Blocks and Operations is
post-order. Mixed orders (e.g., Region/Block pre-order + Operation post-order) could easily
be implemented, as shown in NumberOfExecutions.cpp.
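
As a rough illustration of the difference between the two orders, here is a
toy Python generator over nested `(name, children)` tuples. It is not the
C++ template interface; the `order` parameter and the tuple encoding are
assumptions made for this sketch.

```python
def walk(node, order="post"):
    """Yield node names of a nested (name, children) tree in pre- or post-order."""
    if order not in ("pre", "post"):
        raise ValueError("order must be 'pre' or 'post'")
    name, children = node
    if order == "pre":
        yield name  # visit the parent before its children
    for child in children:
        yield from walk(child, order)
    if order == "post":
        yield name  # visit the parent after its children


tree = ("func", [("loop", [("add", []), ("mul", [])]), ("ret", [])])
pre_order = list(walk(tree, "pre"))
post_order = list(walk(tree, "post"))
```

Pre-order sees `func` first and post-order sees it last, which is why
erasing a node safely interacts differently with each order.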
Reviewed By: rriddle, frgossen, bondhugula
Differential Revision: https://reviews.llvm.org/D97217
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere. For ops that
don't have a SPIR-V spec counterpart, we use spv.mlir.snake_case.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D98016
In .mlir modules with large amounts of attributes, e.g. a function with a large number of argument attributes, the string comparison filtering greatly affects compile time. This revision switches to using a SmallDenseSet in these situations, resulting in over a 10x speed up in some situations.
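
The idea can be sketched in Python, assuming the filtering drops attributes
whose names appear in a list of names to exclude. The function and parameter
names below are hypothetical, and a Python `frozenset` stands in for
SmallDenseSet.

```python
def filter_attrs_linear(attrs, excluded_names):
    """Compare every attribute name against every excluded name (quadratic)."""
    return [
        (name, value)
        for name, value in attrs
        if not any(name == excluded for excluded in excluded_names)
    ]


def filter_attrs_set(attrs, excluded_names):
    """Build a set once, then do one hashed lookup per attribute."""
    excluded = frozenset(excluded_names)
    return [(name, value) for name, value in attrs if name not in excluded]


attrs = [("arg%d.attr" % i, i) for i in range(1000)]
excluded_names = ["arg%d.attr" % i for i in range(0, 1000, 2)]
```

Both functions return the same result; only the cost of the membership test
differs, which matters once the number of attributes grows large.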
Differential Revision: https://reviews.llvm.org/D97980
To unify the naming scheme across all ops in the SPIR-V dialect,
we are moving from spv.camelCase to spv.CamelCase everywhere.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D97918
* Mostly imported from experimental repo as-is with cosmetic changes.
* Temporarily left out emission code (for building ops at runtime) to keep review size down.
* Documentation and lit tests added fresh.
* Sample op library that represents current Linalg named ops included.
Differential Revision: https://reviews.llvm.org/D97995
Reduction updates should be masked, just like the loads and stores.
Note that alternatively, we could use the fact that masked values are
zero for += updates and mask invariants to get this working but that
would not work for *= updates. Masking the update itself is cleanest.
This change also replaces the constant mask with a broadcast of "true"
since this constant folds much better for various folding patterns.
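
A small Python model of why masking the update matters, using plain lists
as stand-ins for vector lanes. The helper names are hypothetical; the
zero padding value mirrors the "masked values are zero" remark above.

```python
import operator


def masked_load(values, mask, pad=0):
    """Lanes that are masked off read as the padding value."""
    return [v if m else pad for v, m in zip(values, mask)]


def unmasked_update(acc, lanes, op):
    """Apply the reduction update on every lane, including masked-off ones."""
    return [op(a, x) for a, x in zip(acc, lanes)]


def masked_update(acc, lanes, mask, op):
    """Apply the reduction update only where the mask is set."""
    return [op(a, x) if m else a for a, x, m in zip(acc, lanes, mask)]


acc = [1, 2, 3, 4]
mask = [True, True, False, False]
lanes = masked_load([10, 20, 30, 40], mask)

add_unmasked = unmasked_update(acc, lanes, operator.add)
add_masked = masked_update(acc, lanes, mask, operator.add)
mul_unmasked = unmasked_update(acc, lanes, operator.mul)
mul_masked = masked_update(acc, lanes, mask, operator.mul)
```

For `+=` the zero padding happens to leave the accumulator untouched, but
for `*=` it clobbers the masked-off lanes unless the update itself is masked.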
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98000
Canonicalize the iter_args of an scf::ForOp that involve a tensor_load and
for which only the last loop iteration is actually visible outside of the
loop. The canonicalization looks for a pattern such as:
```
%t0 = ... : tensor_type
%0 = scf.for ... iter_args(%bb0 = %t0) -> (tensor_type) {
...
// %m is either tensor_to_memref(%bb0) or defined above the loop
%m... : memref_type
... // uses of %m with potential inplace updates
%new_tensor = tensor_load %m : memref_type
...
scf.yield %new_tensor : tensor_type
}
```
`%bb0` may have either 0 or 1 use. If it has 1 use it must be exactly a
`%m = tensor_to_memref %bb0` op that feeds into the yielded `tensor_load`
op.
If no aliasing write of `%new_tensor` occurs between tensor_load and yield
then the value %0 visible outside of the loop is the last `tensor_load`
produced in the loop.
For now, we approximate the absence of aliasing by only supporting the case
when the tensor_load is the operation immediately preceding the yield.
The canonicalization rewrites the pattern as:
```
// %m is either a tensor_to_memref or defined above
%m... : memref_type
scf.for ... { // no iter_args
... // uses of %m with potential inplace updates
}
%0 = tensor_load %m : memref_type
```
Differential revision: https://reviews.llvm.org/D97953
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D97919
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from `spv.camelCase` to `spv.CamelCase` everywhere.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D97917
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere.
Differential Revision: https://reviews.llvm.org/D97920
Now that attributes can be generated using ODS, we can move the builtin attributes as well. This revision removes a majority of the builtin attributes with a few left for followup revisions. The attributes moved to ODS in this revision are: AffineMapAttr, ArrayAttr, DictionaryAttr, IntegerSetAttr, StringAttr, SymbolRefAttr, TypeAttr, and UnitAttr.
Differential Revision: https://reviews.llvm.org/D97591
The value type of the attribute can be specified by either overriding the typeBuilder field on the AttrDef, or by providing a parameter of type `AttributeSelfTypeParameter`. This removes the need to define custom storage class constructors for attributes that have a value type other than NoneType.
Differential Revision: https://reviews.llvm.org/D97590
This function simplifies calling the getChecked methods on Attributes and Types from within the parser, and removes any need to use `getEncodedSourceLocation` for these methods (by using an SMLoc instead). This is much more efficient than using an mlir::Location, as the encoding process to produce an mlir::Location is inefficient and undesirable for parsing (locations used during parsing should not persist afterwards unless otherwise necessary).
Differential Revision: https://reviews.llvm.org/D97900
`tensor_load(tensor_to_memref(x)) -> x` is an incorrect folding because it ignores potential aliasing.
This revision approximates no-aliasing by restricting the folding to occur only when tensor_to_memref
is immediately followed by tensor_load in the same block. This is a conservative step back towards
correctness until better alias analysis becomes available.
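
The problem can be illustrated with a toy Python model in which a tensor is
an immutable tuple and a memref is a mutable list. This is only a sketch of
the aliasing concern, not of the actual bufferization semantics, and the
`roundtrip` helper is hypothetical.

```python
def tensor_to_memref(tensor):
    """Toy model: materialize the tensor contents in a mutable buffer."""
    return list(tensor)


def tensor_load(memref):
    """Toy model: read the current buffer contents back as an immutable value."""
    return tuple(memref)


def roundtrip(x, writes=()):
    """Compute tensor_load(tensor_to_memref(x)) with optional writes in between."""
    m = tensor_to_memref(x)
    for index, value in writes:
        m[index] = value  # in-place update through the buffer
    return tensor_load(m)


x = (1, 2, 3)
back_to_back = roundtrip(x)
with_write = roundtrip(x, writes=[(0, 42)])
```

Folding to `x` is only safe in the back-to-back case; with an intervening
write the loaded value differs from `x`.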
Context: https://llvm.discourse.group/t/properly-using-bufferization-related-passes/2913/6
Differential Revision: https://reviews.llvm.org/D97957
Add a folder to rewrite a sequence such as:
```
%t1 = ...
%v = vector.transfer_read %t0[%c0...], {masked = [false...]} :
tensor<static_sizesxf32>, vector<static_sizesxf32>
%t2 = vector.transfer_write %v, %t1[%c0...] {masked = [false...]} :
vector<static_sizesxf32>, tensor<static_sizesxf32>
```
into:
```
%t0
```
The producer of t1 may or may not be DCE'd depending on whether it is a
block argument or has side effects.
Differential revision: https://reviews.llvm.org/D97934
There is no need for the interface implementations to be exposed; opaque
registration functions are sufficient for all users, similarly to passes.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97852
Add a Loop Option attribute and generate LLVM metadata attached to
branch instructions to control code generation.
Reviewed By: ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D96820
Exhaustive testing found that a while loop can
appear in between chainable for loops. As long as we don't
scalarize reductions in while loops, this means we need to
terminate the chain at the while. This also refactors the
reduction code into more readable helper methods.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D97886
The support for attributes closely maps that of Types (basically 1-1) given that Attributes are defined in exactly the same way as Types. All of the current ODS TypeDef classes get an Attr equivalent. The generation of the attribute classes themselves shares the same generator as types.
Differential Revision: https://reviews.llvm.org/D97589
This better matches the actual IR concept that is being modeled, and is consistent with how the rest of PDL is structured.
Differential Revision: https://reviews.llvm.org/D95718
This type represents a range of positional values. It will be used in followup revisions to add support for variadic constructs to PDL, such as operand and result ranges.
Differential Revision: https://reviews.llvm.org/D95717
The current implementation of Value involves a pointer int pair with several different kinds of owners, i.e. BlockArgumentImpl*, Operation *, TrailingOpResult*. This design arose from the desire to save memory overhead for operations that have a very small number of results (generally 0-2). There are, unfortunately, many problematic aspects of the current implementation that make Values difficult to work with or just inefficient.
Operation result types are stored as a separate array on the Operation. This is very inefficient for many reasons: we use TupleType for multiple results, which can lead to huge amounts of memory usage if multi-result operations change types frequently (they do). It also means that simple methods like Value::getType/Value::setType now require complex logic to get to the desired type.
Value only has one pointer bit free, severely limiting the ability to use it in things like PointerUnion/PointerIntPair. Given that we store the kind of a Value along with the "owner" pointer, we only leave one bit free for users of Value. This creates situations where we end up nesting PointerUnions to be able to use Value in one.
As noted above, most of the methods in Value need to branch on at least 3 different cases, which is inefficient, possibly error-prone, and verbose. The current storage of results also creates problems for utilities like ValueRange/TypeRange, which want to efficiently store base pointers to ranges (of which Operation* isn't really useful as one).
This revision greatly simplifies the implementation of Value by the introduction of a new ValueImpl class. This class contains all of the state shared between all of the various derived value classes; i.e. the use list, the type, and the kind. This shared implementation class provides several large benefits:
* Most of the methods on value are now branchless, and often one-liners.
* The "kind" of the value is now stored in ValueImpl instead of Value
This frees up all of Value's pointer bits, allowing for users to take full advantage of PointerUnion/PointerIntPair/etc. It also allows for storing more operation results as "inline", 6 now instead of 2, freeing up 1 word per new inline result.
* Operation result types are now stored in the result, instead of a side array
This drops the size of zero-result operations by 1 word. It also removes the memory-crushing use of TupleType for operation results (which could lead to hundreds of megabytes of "dead" TupleTypes in the context). This also allowed restructuring ValueRange, making it simpler and one word smaller.
This revision does come with two conceptual downsides:
* Operation::getResultTypes no longer returns an ArrayRef<Type>
This conceptually makes some usages slower, as the iterator increment is slightly more complex.
* OpResult::getOwner is slightly more expensive, as it now requires a little bit of arithmetic
From profiling, neither of the conceptual downsides has resulted in any perceivable hit to performance. Given the advantages of the new design, most compiles are slightly faster.
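
The bit budget behind these numbers can be sketched in Python. The sketch
assumes 8-byte-aligned pointers (so 3 low bits are available) and that two
kind values are reserved for non-inline cases such as out-of-line results
and block arguments; both are assumptions made for illustration, and the
helper names are hypothetical.

```python
def free_low_bits(alignment_bytes):
    """Number of always-zero low bits in a pointer with the given alignment."""
    return alignment_bytes.bit_length() - 1  # log2 for powers of two


def inline_result_kinds(kind_bits, reserved_kinds=2):
    """Kind values left for inline results after reserving the other kinds."""
    return 2 ** kind_bits - reserved_kinds


def pack(pointer, tag, tag_bits):
    """Store a small integer tag in the low bits of an aligned pointer."""
    assert pointer % (1 << tag_bits) == 0, "pointer is not sufficiently aligned"
    assert 0 <= tag < (1 << tag_bits), "tag does not fit"
    return pointer | tag


def unpack(word, tag_bits):
    """Recover the (pointer, tag) pair from a packed word."""
    mask = (1 << tag_bits) - 1
    return word & ~mask, word & mask


bits_available = free_low_bits(8)
left_for_users_old = bits_available - 2  # kind stored next to the owner pointer
left_for_users_new = bits_available - 0  # kind stored in the shared impl object
```

Under these assumptions, a 2-bit kind leaves room for 2 inline results and
1 free bit, while a 3-bit kind stored elsewhere leaves room for 6 inline
results and all pointer bits free.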
Differential Revision: https://reviews.llvm.org/D97804
The SubTensorInsertOp has a requirement that dest type and result
type match. Just folding the tensor.cast operation violates this and
creates verification errors during canonicalization. Also fix other
canonicalization methods that weren't inserting casts properly.
Differential Revision: https://reviews.llvm.org/D97800
Different from the definition in TensorFlow and TOSA, the output is [N,H,W,C,M]. This can make transforms easier in Linalg because the indexing maps are plain. E.g., to determine whether the fill op has a dependency on the depthwise conv op, the current pipeline only recognizes the dependency if all of the maps are projected affine maps.
Reviewed By: asaadaldien
Differential Revision: https://reviews.llvm.org/D97798
This offers the ability to create a JIT and invoke a function by passing
ctypes pointers to the argument and the result.
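
The calling convention can be illustrated with plain `ctypes`, without the
MLIR bindings. In this sketch `fake_jitted_add` is a hypothetical Python
stand-in for a JIT-compiled function that receives an array of pointers to
its arguments and result; it is not the bindings' actual API.

```python
import ctypes


def pack_pointers(*values):
    """Build a void*[] array whose entries point at each ctypes value."""
    packed = (ctypes.c_void_p * len(values))()
    for i, value in enumerate(values):
        packed[i] = ctypes.addressof(value)
    return packed


def fake_jitted_add(packed):
    """Hypothetical stand-in for compiled code: read two floats, write one."""
    lhs = ctypes.c_float.from_address(packed[0])
    rhs = ctypes.c_float.from_address(packed[1])
    res = ctypes.c_float.from_address(packed[2])
    res.value = lhs.value + rhs.value


def add_via_pointers(a, b):
    """Pass arguments and the result slot by pointer, then read the result back."""
    # The ctypes objects must stay alive for as long as the pointers are used.
    lhs, rhs, res = ctypes.c_float(a), ctypes.c_float(b), ctypes.c_float(0.0)
    fake_jitted_add(pack_pointers(lhs, rhs, res))
    return res.value
```

The caller owns the storage for both the inputs and the result, and the
callee only ever sees addresses.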
Differential Revision: https://reviews.llvm.org/D97523
This adds minimalistic bindings for the execution engine, making it possible to
invoke the JIT from the C API. This is still quite early and
experimental and shouldn't be considered stable in any way.
Differential Revision: https://reviews.llvm.org/D96651