Adds optional attribute to support tensor cores on F32 datatype by lowering to `mma.sync` with TF32 operands. Since, TF32 is not a native datatype in LLVM we are adding `tf32Enabled` as an attribute to allow the IR to be aware of `MmaSyncOp` datatype. Additionally, this patch adds placeholders for nvgpu-to-nvgpu transformation targeting higher precision tf32x3.
For mma.sync on f32 input using tensor cores there are two possibilites:
(a) tf32 (1 `mma.sync` per warp-level matrix-multiply-accumulate)
(b) tf32x3 (3 `mma.sync` per warp-level matrix-multiply-accumulate)
Typically, tf32 tensor core acceleration comes at a cost of accuracy from missing precision bits. While f32 has 23 precision bits, tf32 has only 10 precision bits. tf32x3 aims to recover the precision bits by splitting each operand into two tf32 values and issue three `mma.sync` tensor core operations.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D130294
This patch removes the `type` field from `Attribute` along with the
`Attribute::getType` accessor.
Going forward, this means that attributes in MLIR will no longer have
types as a first-class concept. This patch lays the groundwork to
incrementally remove or refactor code that relies on generic attributes
being typed. The immediate impact will be on attributes that rely on
`Attribute` containing a type, such as `IntegerAttr`,
`DenseElementsAttr`, and `ml_program::ExternAttr`, which will now need
to define a type parameter on their storage classes. This will save
memory as all other attribute kinds will no longer contain a type.
Moreover, it will not be possible to generically query the type of an
attribute directly. This patch provides an attribute interface
`TypedAttr` that implements only one method, `getType`, which can be
used to generically query the types of attributes that implement the
interface. This interface can be used to retain the concept of a "typed
attribute". The ODS-generated accessor for a `type` parameter
automatically implements this method.
Next steps will be to refactor the assembly formats of certain operations
that rely on `parseAttribute(type)` and `printAttributeWithoutType` to
remove special handling of type elision until `type` can be removed from
the dialect parsing hook entirely; and incrementally remove uses of
`TypedAttr`.
Reviewed By: lattner, rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D130092
Added a commutativity utility pattern and a function to populate it. The pattern sorts the operands of an op in ascending order of the "key" associated with each operand iff the op is commutative. This sorting is stable.
The function is intended to be used inside passes to simplify the matching of commutative operations. After the application of the above-mentioned pattern, since the commutative operands now have a deterministic order in which they occur in an op, the matching of large DAGs becomes much simpler, i.e., requires much less number of checks to be written by a user in her/his pattern matching function.
The "key" associated with an operand is the list of the "AncestorKeys" associated with the ancestors of this operand, in a breadth-first order.
The operand of any op is produced by a set of ops and block arguments. Each of these ops and block arguments is called an "ancestor" of this operand.
Now, the "AncestorKey" associated with:
1. A block argument is `{type: BLOCK_ARGUMENT, opName: ""}`.
2. A non-constant-like op, for example, `arith.addi`, is `{type: NON_CONSTANT_OP, opName: "arith.addi"}`.
3. A constant-like op, for example, `arith.constant`, is `{type: CONSTANT_OP, opName: "arith.constant"}`.
So, if an operand, say `A`, was produced as follows:
```
`<block argument>` `<block argument>`
\ /
\ /
`arith.subi` `arith.constant`
\ /
`arith.addi`
|
returns `A`
```
Then, the block arguments and operations present in the backward slice of `A`, in the breadth-first order are:
`arith.addi`, `arith.subi`, `arith.constant`, `<block argument>`, and `<block argument>`.
Thus, the "key" associated with operand `A` is:
```
{
{type: NON_CONSTANT_OP, opName: "arith.addi"},
{type: NON_CONSTANT_OP, opName: "arith.subi"},
{type: CONSTANT_OP, opName: "arith.constant"},
{type: BLOCK_ARGUMENT, opName: ""},
{type: BLOCK_ARGUMENT, opName: ""}
}
```
Now, if "keyA" is the key associated with operand `A` and "keyB" is the key associated with operand `B`, then:
"keyA" < "keyB" iff:
1. In the first unequal pair of corresponding AncestorKeys, the AncestorKey in operand `A` is smaller, or,
2. Both the AncestorKeys in every pair are the same and the size of operand `A`'s "key" is smaller.
AncestorKeys of type `BLOCK_ARGUMENT` are considered the smallest, those of type `CONSTANT_OP`, the largest, and `NON_CONSTANT_OP` types come in between. Within the types `NON_CONSTANT_OP` and `CONSTANT_OP`, the smaller ones are the ones with smaller op names (lexicographically).
---
Some examples of such a sorting:
Assume that the sorting is being applied to `foo.commutative`, which is a commutative op.
Example 1:
> %1 = foo.const 0
> %2 = foo.mul <block argument>, <block argument>
> %3 = foo.commutative %1, %2
Here,
1. The key associated with %1 is:
```
{
{CONSTANT_OP, "foo.const"}
}
```
2. The key associated with %2 is:
```
{
{NON_CONSTANT_OP, "foo.mul"},
{BLOCK_ARGUMENT, ""},
{BLOCK_ARGUMENT, ""}
}
```
The key of %2 < the key of %1
Thus, the sorted `foo.commutative` is:
> %3 = foo.commutative %2, %1
Example 2:
> %1 = foo.const 0
> %2 = foo.mul <block argument>, <block argument>
> %3 = foo.mul %2, %1
> %4 = foo.add %2, %1
> %5 = foo.commutative %1, %2, %3, %4
Here,
1. The key associated with %1 is:
```
{
{CONSTANT_OP, "foo.const"}
}
```
2. The key associated with %2 is:
```
{
{NON_CONSTANT_OP, "foo.mul"},
{BLOCK_ARGUMENT, ""}
}
```
3. The key associated with %3 is:
```
{
{NON_CONSTANT_OP, "foo.mul"},
{NON_CONSTANT_OP, "foo.mul"},
{CONSTANT_OP, "foo.const"},
{BLOCK_ARGUMENT, ""},
{BLOCK_ARGUMENT, ""}
}
```
4. The key associated with %4 is:
```
{
{NON_CONSTANT_OP, "foo.add"},
{NON_CONSTANT_OP, "foo.mul"},
{CONSTANT_OP, "foo.const"},
{BLOCK_ARGUMENT, ""},
{BLOCK_ARGUMENT, ""}
}
```
Thus, the sorted `foo.commutative` is:
> %5 = foo.commutative %4, %3, %2, %1
Signed-off-by: Srishti Srivastava <srishti.srivastava@polymagelabs.com>
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D124750
The current Parser library is solely focused on providing API for
the textual MLIR format, but MLIR will soon also provide a binary
format. This commit renames the current Parser library to AsmParser to
better correspond to what the library is actually intended for. A new
Parser library is added which will act as a unified parser interface
between both text and binary formats. Most parser clients are
unaffected, given that the unified interface is essentially the same as
the current interface. Only clients that rely on utilizing the
AsmParserState, or those that want to parse Attributes/Types need to be
updated to point to the AsmParser library.
Differential Revision: https://reviews.llvm.org/D129605
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.
Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.
@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.
Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.
To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.
For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D117977
The `tileAndFuseLinalgOps` is a legacy approach for tiling + fusion of
Linalg operations. Since it was also intended to work on operations
with buffer operands, this method had fairly complex logic to make
sure tile and fuse was correct even with side-effecting linalg ops.
While complex, it still wasnt robust enough. This patch deprecates
this method and thereby deprecating the tiling + fusion method for ops
with buffer semantics. Note that the core transformation to do fusion
of a producer with a tiled consumer still exists. The deprecation here
only removes methods that auto-magically tried to tile and fuse
correctly in presence of side-effects.
The `tileAndFuseLinalgOps` also works with operations with tensor
semantics. There are at least two other ways the same functionality
exists.
1) The `tileConsumerAndFuseProducers` method. This does a similar
transformation, but using a slightly different logic to
automatically figure out the legal tile + fuse code. Note that this
is also to be deprecated soon.
2) The prefered way uses the `TilingInterface` for tile + fuse, and
relies on the caller to set the tiling options correctly to ensure
that the generated code is correct.
As proof that (2) is equivalent to the functionality provided by
`tileAndFuseLinalgOps`, relevant tests have been moved to use the
interface, where the test driver sets the tile sizes appropriately to
generate the expected code.
Differential Revision: https://reviews.llvm.org/D129901
This one required more changes than ideal due to overlapping generated name
with different return types. Changed getIndexingMaps to getIndexingMapsArray to
move it out of the way/highlight that it returns (more expensively) a
SmallVector and uses the prefixed name for the Attribute.
Differential Revision: https://reviews.llvm.org/D129919
For AttrDef declarations, place specified code in extraClassDefinition into the generated *.cpp.inc file.
Reviewed By: Mogball, rriddle
Differential Revision: https://reviews.llvm.org/D129574
This patch allows custom attribute and type builders to return
something other than the C++ type of the attribute or type.
This is useful for attributes or types that may perform extra work during
construction (e.g. canonicalization) that could result in a different
kind of attribute or type being returned.
Reviewed By: rriddle, lattner
Differential Revision: https://reviews.llvm.org/D129792
This patch adds a pattern to decompose a `linalg.generic` operations
that
- has only parallel iterator types
- has more than 2 statements (including the yield)
into multiple `linalg.generic` operation such that each operation has
a single statement and a yield.
The pattern added here just splits the matching `linalg.generic` into
two `linalg.generic`s, one containing the first statement, and the
other containing the remaining. The same pattern can be applied
repeatedly on the second op to ultimately fully decompose the generic
op.
Differential Revision: https://reviews.llvm.org/D129704
This pass tests patterns that are already tested elsewhere by applying them in a semi-targeted
fashion using anchor function and op names.
From now on, targeted tests should use the transform dialect interpreter.
Differential Revision: https://reviews.llvm.org/D129627
This required changing a bit of how attributes/types are parsed. A new
`KeywordSwitch` class was added to AsmParser that provides a StringSwitch
like API for parsing keywords with a set of potential matches. It intends to
both provide a cleaner API, and enable injection for code completion. This
required changing the API of `generated(Attr|Type)Parser` to handle the
parsing of the keyword, instead of having the user do it. Most upstream
dialects use the autogenerated handling and didn't require a direct update.
Differential Revision: https://reviews.llvm.org/D129267
Previously default attributes were only usable by way of the ODS generated
accessors, but this was undesirable as
1. The ODS getters could construct Attribute each get request;
2. For non-C++ uses this would require either duplicating some of tee default
attribute generating or generating additional bindings to generate methods;
3. Accessing op.getAttr("foo") and op.getFoo() would return different results;
Generate method to populate default attributes that can be used to address
these.
This merely adds this facility but does not employ by default on any path.
Differential Revision: https://reviews.llvm.org/D128962
With SCCP and integer range analysis ported to the new framework, this old framework is redundant. Delete it.
Depends on D128866
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D128867
This patch introduces an implementation of dense data-flow analysis. Dense
data-flow analysis attaches a lattice before and after the execution of every
operation. The lattice state is propagated across operations by a user-defined
transfer function. The state is joined across control-flow and callgraph edges.
Thge patch provides an example pass that uses both a dense and a sparse analysis
together.
Depends on D127139
Reviewed By: rriddle, phisiart
Differential Revision: https://reviews.llvm.org/D127173
By making TypeInterfaces and AttrInterfaces, Types and Attrs respectively it'd then be possible to use them anywhere where a Type or Attr may go. That is within the arguments and results of an Op definition, in a RewritePattern etc.
Prior to this change users had to separately define a Type or Attr, with a predicate to check whether a type or attribute implements a given interface. Such code will be redundant now.
Removing such occurrences in upstream dialects will be part of a separate patch.
As part of implementing this patch, slight refactoring had to be done. In particular, Interfaces cppClassName field was renamed to cppInterfaceName as it "clashed" with TypeConstraints cppClassName. In particular Interfaces cppClassName expected just the class name, without any namespaces, while TypeConstraints cppClassName expected a fully qualified class name.
Differential Revision: https://reviews.llvm.org/D129209
With the exceptions of AttrOrTypeParameter and DerivedAttr, all of MLIR consistently uses $_ctxt as the substitute variable for the MLIRContext in TableGen C++ code.
Usually this does not matter unless one where to reuse some code in multiple fields but it is still needlessly inconsistent and prone to error.
This patch fixes that by consistently using _$ctxt everywhere.
Differential Revision: https://reviews.llvm.org/D129153
This patch implements the analysis state classes needed for sparse data-flow analysis and implements a dead-code analysis using those states to determine liveness of blocks, control-flow edges, region predecessors, and function callsites.
Depends on D126751
Reviewed By: rriddle, phisiart
Differential Revision: https://reviews.llvm.org/D127064
`enableSplitting` simply enables/disables whether we should split
or use the full buffer. `insertMarkerInOutput` toggles if split markers
should be inserted in between prcessed output chunks.
These options allow for merging the duplicate code paths we have
when splitting is optional.
Differential Revision: https://reviews.llvm.org/D128764
This patch adds a `convertFromStorage` field to attribute or type parameters that can implement more complex logic for converting from the parameter's C++ storage type (e.g. `Optional<SmallVector<T>>`) to its C++ type (e.g. `Optional<ArrayRef<T>>`).
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D128293
The 'emit_c_wrappers' option in the FuncToLLVM conversion requests C interface
wrappers to be emitted for every builtin function in the module. While this has
been useful to bootstrap the interface, it is problematic in the longer term as
it may unintentionally affect the functions that should retain their existing
interface, e.g., libm functions obtained by lowering math operations (see
D126964 for an example). Since D77314, we have a finer-grain control over
interface generation via an attribute that avoids the problem entirely. Remove
the 'emit_c_wrappers' option. Introduce the '-llvm-request-c-wrappers' pass
that can be run in any pipeline that needs blanket emission of functions to
annotate all builtin functions with the attribute before performing the usual
lowering that accounts for the attribute.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D127952
Where a constraint also has a def, emit the def only to avoid duplicate
output (and def has more complete info). Also move attributes and types
to the end rather than some on top and some at end.
Differential Revision: https://reviews.llvm.org/D127823
When specifying an op attribute with a default value (via DefaultValuedAttr), the default value is a string of C++ code. In the general case, the default value of such an attribute cannot be translated to Python when generating the bindings. However, we can hard-code default Python values for frequently-used C++ default values.
This change adds a Python default value for empty ArrayAttrs.
Differential Revision: https://reviews.llvm.org/D127750
Add static assertions into the various `attachInterface` methods, which are
used for adding external interface implementations to attributes, operations
and types, that ensure `ExternalModel` interface classes are instantiated for
the same concrete operation for the concrete base (potentially self) attribute
or type as they are attached to. `FallbackModel`s remain usable for generic
interface models that should support more than one kind of entities.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D127850
Removes one element of the pointer union to make it work on 32-bit
systems.
This patch introduces a generic data-flow analysis framework to MLIR. The framework implements a fixed-point iteration algorithm and a dependency graph between lattice states and analysis. Lattice states and points are fully extensible to support highly-customizable analyses.
Reviewed By: phisiart, rriddle
Differential Revision: https://reviews.llvm.org/D126751
This patch introduces a generic data-flow analysis framework to MLIR. The framework implements a fixed-point iteration algorithm and a dependency graph between lattice states and analysis. Lattice states and points are fully extensible to support highly-customizable analyses.
Reviewed By: phisiart, rriddle
Differential Revision: https://reviews.llvm.org/D126751
The generated attribute and type def accessors are changed to match the setting on the dialect. Most importantly, "prefixed" will now correctly convert snake case to camel case (e.g. `weight_zp` -> `getWeightZp`)
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D127688
This patch adds support for tiling operations that implement the
TilingInterface.
- It separates the loop constructs that are used to iterate over tile
from the implementation of the tiling itself. For example, the use
of destructive updates is more related to use of scf.for for
iterating over tiles that are tensors.
- To test the transformation, TilingInterface is implemented for
LinalgOps. The separation of the looping constructs used from the
implementation of tile code generation greatly simplifies the
latter.
- The implementation of TilingInterface for LinalgOp is kept as an
external model for now till this approach can be fully flushed out
to replace the existing tiling + fusion approaches in Linalg.
Differential Revision: https://reviews.llvm.org/D127133
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.
Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.
@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.
Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.
To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.
For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D117977
This commit enables providing long-form documentation more seamlessly to the LSP
by revamping decl documentation. For ODS imported constructs, we now also import
descriptions and attach them to decls when possible. For PDLL constructs, the LSP will
now try to provide documentation by parsing the comments directly above the decls
location within the source file. This commit also adds a new parser flag
`enableDocumentation` that gates the import and attachment of ODS documentation,
which is unnecessary in the normal build process (i.e. it should only be used/consumed
by tools).
Differential Revision: https://reviews.llvm.org/D124881