Start the description from a new line instead of putting the first
paragraph in the section header. Wrap the class name in backticks to
make it clear that it relates to the code.
Change CUDA integration tests to use mlir-opt + mlir-cpu-runner instead.
Depends On D98203
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D98396
Data layout information allows to answer questions about the size and alignment
properties of a type. It enables, among others, the generation of various
linear memory addressing schemes for containers of abstract types and deeper
reasoning about vectors. This introduces the subsystem for modeling data
layouts in MLIR.
The data layout subsystem is designed to scale to MLIR's open type and
operation system. At the top level, it consists of attribute interfaces that
can be implemented by concrete data layout specifications; type interfaces that
should be implemented by types subject to data layout; operation interfaces
that must be implemented by operations that can serve as data layout scopes
(e.g., modules); and dialect interfaces for data layout properties unrelated to
specific types. Built-in types are handled specially to decrease the overall
query cost.
A concrete default implementation of these interfaces is provided in the new
Target dialect. Defaults for built-in types that match the current behavior are
also provided.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97067
Clean-up after D98279, remove one call to createConvertGPUKernelToBlobPass().
Depends On D98203
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D98360
Provide default for gpuBinaryAnnotation so that we don't need to specify it in tests.
The annotation likely only needs to be target specific if we want to lower to e.g. both CUDA and ROCDL.
Reviewed By: herhut, bondhugula
Differential Revision: https://reviews.llvm.org/D98168
This allows the caller to distinguish between a parse error or an
unmatched keyword. It fixes the redundant error that was emitted by the
caller when the generated parser would fail.
Differential Revision: https://reviews.llvm.org/D98162
Use `MLIR_LINALG_ODS_GEN` and `MLIR_LINALG_ODS_YAML_GEN` variables
instead of `MLIR_LINALG_ODS_GEN_EXE` and `MLIR_LINALG_ODS_YAML_GEN_EXE`.
The former are defined in PARENT SCOPE only, so the `if` condition
is never evaluates to `TRUE`.
The logic should be the following (taken from tblgen part):
1. `TOOL_NAME` - CACHE variable (default equal to target name).
User can override it to actual executable path.
2. `TOOL_NAME_EXE` - internal variable, initialized to `${TOOL_NAME}` first.
In case of cross-compilation (`LLVM_USE_HOST_TOOLS == TRUE`) if user
didn't set own path to native executable via `TOOL_NAME` variable,
CMake will create separate targets to build native tool and
will override `TOOL_NAME_EXE` to the executable produced by this target.
3. `TOOL_NAME_TARGET` - internal variable, which points to tool target name.
If the native tool is built as described above, it will point to the
target correspondant to that native tool.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D98025
* Only leaf packages are non-namespace packages. This allows most of the top levels to be split into different directories or deployment packages. In the previous state, the presence of __init__.py files at each level meant that the entire tree could only ever exist in one physical directory on the path.
* This changes the API usage slightly: `import mlir` will no longer do a deep import of `mlir.ir`, etc. This may necessitate some client code changes.
* Dialect gen code was restructured so that the user is responsible for providing the `my_dialect.py` file, which then must import its peer `_my_dialect_ops_gen`. This gives complete control of the dialect namespace to the user instead of to tablegen code, allowing further dialect-specific python APIs.
* Correspondingly, the previous extension modules `_my_dialect.py` are now `_my_dialect_ops_ext.py`.
* Now that the `linalg` namespace is open, moved the `linalg_opdsl` tool into it.
* This may require some corresponding downstream adjustments to npcomp, circt, et al:
* Probably some shallow imports need to be converted to deep imports (i.e. not `import mlir` brings in the world).
* Each tablegen generated dialect now needs an explicit `foo.py` which does a `from ._foo_ops_gen import *`. This is similar to the way that generated code operates in the C++ world.
* If providing dialect op extensions, those need to be moved from `_foo.py` -> `_foo_ops_ext.py`.
Differential Revision: https://reviews.llvm.org/D98096
This patch is a follow-up on D97217. It adds a new 'Skip' result to the Operation visitor
so that a callback can stop the ongoing visit of an operation/block/region and
continue visiting the next one without fully interrupting the walk. Skipping is
needed to be able to erase an operation/block in pre-order and do not continue
visiting the internals of that operation/block.
Related to the skipping mechanism, the patch also introduces the following changes:
* Added new TestIRVisitors pass with basic testing for the IR visitors.
* Fixed missing early increment ranges in visitor implementation.
* Updated documentation of walk methods to include erasure information and walk
order information.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97820
The value type of the attribute can be specified by either overriding the typeBuilder field on the AttrDef, or by providing a parameter of type `AttributeSelfTypeParameter`. This removes the need to define custom storage class constructors for attributes that have a value type other than NoneType.
Differential Revision: https://reviews.llvm.org/D97590
There is no need for the interface implementations to be exposed, opaque
registration functions are sufficient for all users, similarly to passes.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97852
The support for attributes closely maps that of Types (basically 1-1) given that Attributes are defined in exactly the same way as Types. All of the current ODS TypeDef classes get an Attr equivalent. The generation of the attribute classes themselves share the same generator as types.
Differential Revision: https://reviews.llvm.org/D97589
Use `StringLiteral` for function return type if it is known to return
constant string literals only.
This will make it visible to API users, that such values can be safely
stored, since they refers to constant data, which will never be deallocated.
`StringRef` is general is not safe to store for a long term,
since it might refer to temporal data allocated in heap.
Add `inline` and `constexpr` methods support to `OpMethod`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97390
Some variables are unused after D97383 landed. We should generate one symbol for one attrUse.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D97794
These warnings are raised when compiling with gcc due to either having too few or too many commas, or in the case of lldb, the possibility of a nullptr.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97586
This enables this kind of construct in the DSL to generate a named op that is polymorphic over numeric type variables `T` and `U`, generating the correct arithmetic casts at construction time:
```
@tc_def_op
def polymorphic_matmul(A=TensorDef(T1, S.M, S.K),
B=TensorDef(T2, S.K, S.N),
C=TensorDef(U, S.M, S.N, output=True)):
implements(ContractionOpInterface)
C[D.m, D.n] += cast(U, A[D.m, D.k]) * cast(U, B[D.k, D.n])
```
Presently, this only supports type variables that are bound to the element type of one of the arguments, although a further extension that allows binding a type variable to an attribute would allow some more expressiveness and may be useful for some formulations. This is left to a future patch. In addition, this patch does not yet materialize the verifier support which ensures that types are bound correctly (for such simple examples, failing to do so will yield IR that fails verification, it just won't yet fail with a precise error).
Note that the full grid of extensions/truncation/int<->float conversions are supported, but many of them are lossy and higher level code needs to be mindful of numerics (it is not the job of this level).
As-is, this should be sufficient for most integer matmul scenarios we work with in typical quantization schemes.
Differential Revision: https://reviews.llvm.org/D97603
This also exposed a bug in Dialect loading where it was not correctly identifying identifiers that had the dialect namespace as a prefix.
Differential Revision: https://reviews.llvm.org/D97431
Allows querying regions too via OpAdaptor's generated. This does not yet move region verification to adaptor nor require regions for ops where needed.
Differential Revision: https://reviews.llvm.org/D97519
If one operand is not used in the formula, it will be considered a
shaped operand. And the result of indexing map of the operand will be the first
reduction dims.
Depends On D97383
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D97384
This fixes the documentation emitted for type parameters. Also adds a
missing empty line, rendered as line break in mark down.
Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97267
This will allow us to define select(pred, in, out) for TC ops, which is useful
for pooling ops.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D97312
The cuda-runner registers two pass pipelines for nested passes,
so that we don't have to use verbose textual pass pipeline specification.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D97091
`verifyConstructionInvariants` is intended to allow for verifying the invariants of an attribute/type on construction, and `getChecked` is intended to enable more graceful error handling aside from an assert. There are a few problems with the current implementation of these methods:
* `verifyConstructionInvariants` requires an mlir::Location for emitting errors, which is prohibitively costly in the situations that would most likely use them, e.g. the parser.
This creates an unfortunate code duplication between the verifier code and the parser code, given that the parser operates on llvm::SMLoc and it is an undesirable overhead to pre-emptively convert from that to an mlir::Location.
* `getChecked` effectively requires duplicating the definition of the `get` method, creating a quite clunky workflow due to the subtle different in its signature.
This revision aims to talk the above problems by refactoring the implementation to use a callback for error emission. Using a callback allows for deferring the costly part of error emission until it is actually necessary.
Due to the necessary signature change in each instance of these methods, this revision also takes this opportunity to cleanup the definition of these methods by:
* restructuring the signature of `getChecked` such that it can be generated from the same code block as the `get` method.
* renaming `verifyConstructionInvariants` to `verify` to match the naming scheme of the rest of the compiler.
Differential Revision: https://reviews.llvm.org/D97100
* It was decided that this was the end of the line for the existing custom tc parser/generator, and this is the first step to replacing it with a declarative format that maps well to mathy source languages.
* One such source language is implemented here: https://github.com/stellaraccident/mlir-linalgpy/blob/main/samples/mm.py
* In fact, this is the exact source of the declarative `polymorphic_matmul` in this change.
* I am working separately to clean this python implementation up and add it to MLIR (probably as `mlir.tools.linalg_opgen` or equiv). The scope of the python side is greater than just generating named ops: the ops are callable and directly emit `linalg.generic` ops fully dynamically, and this is intended to be a feature for frontends like npcomp to define custom linear algebra ops at runtime.
* There is more work required to handle full type polymorphism, especially with respect to integer formulations, since they require more specificity wrt types.
* Followups to this change will bring the new generator to feature parity with the current one and delete the current. Roughly, this involves adding support for interface declarations and attribute symbol bindings.
Differential Revision: https://reviews.llvm.org/D97135
The functions translating enums to LLVM IR are generated in a single
file included in many places, not all of which use all translations.
Generate functions with "unused" attribute to silence compiler warnings.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96880
A series of preceding patches changed the mechanism for translating MLIR to
LLVM IR to use dialect interface with delayed registration. It is no longer
necessary for specific dialects to derive from ModuleTranslation. Remove all
virtual methods from ModuleTranslation and factor out the entry point to be a
free function.
Also perform some cleanups in ModuleTranslation internals.
Depends On D96774
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96775
Port the translation of five dialects that define LLVM IR intrinsics
(LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect
interface-based mechanism. This allows us to remove individual translations
that were created for each of these dialects and just use one common
MLIR-to-LLVM-IR translation that potentially supports all dialects instead,
based on what is registered and including any combination of translatable
dialects. This removal was one of the main goals of the refactoring.
To support the addition of GPU-related metadata, the translation interface is
extended with the `amendOperation` function that allows the interface
implementation to post-process any translated operation with dialect attributes
from the dialect for which the interface is implemented regardless of the
operation's dialect. This is currently applied to "kernel" functions, but can
be used to construct other metadata in dialect-specific ways without
necessarily affecting operations.
Depends On D96591, D96504
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96592
Currently, vector.contract joins the intermediate result and the accumulator
argument (of ranks K) using summation. We desire more joining operations ---
such as max --- to help vector.contract express reductions. This change extends
Vector_ContractionOp to take an optional attribute (called "kind", of enum type
CombiningKind) specifying the joining operation to be add/mul/min/max for int/fp
, and and/or/xor for int only. By default this attribute has value "add".
To implement this we also need to extend vector.outerproduct, since
vector.contract gets transformed to vector.outerproduct (and that to
vector.fma). The extension for vector.outerproduct is also an optional kind
attribute that uses the same enum type and possible values. The default is
"add". In case of max/min we transform vector.outerproduct to a combination of
compare and select.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D93280
This revision takes advantage of the newly extended `ref` directive in assembly format
to allow better region handling for LinalgOps. Specifically, FillOp and CopyOp now build their regions explicitly which allows retiring older behavior that relied on specific op knowledge in both lowering to loops and vectorization.
This reverts commit 3f22547fd1 and reland 973e133b76 with a workaround for
a gcc bug that does not accept lambda default parameters:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59949
Differential Revision: https://reviews.llvm.org/D96598
This reverts commit 973e133b76.
It triggers an issue in gcc5 that require investigation, the build is
broken with:
/tmp/ccdpj3B9.s: Assembler messages:
/tmp/ccdpj3B9.s:5821: Error: symbol `_ZNSt17_Function_handlerIFvjjEUljjE2_E9_M_invokeERKSt9_Any_dataOjS6_' is already defined
/tmp/ccdpj3B9.s:5860: Error: symbol `_ZNSt14_Function_base13_Base_managerIUljjE2_E10_M_managerERSt9_Any_dataRKS3_St18_Manager_operation' is already defined
Migrate the translation of the OpenMP dialect operations to LLVM IR to the new
dialect-based mechanism.
Depends On D96503
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96504
The existing approach to translation to the LLVM IR relies on a single
translation supporting the base LLVM dialect, extensible through inheritance to
support intrinsic-based dialects also derived from LLVM IR such as NVVM and
AVX512. This approach does not scale well as it requires additional
translations to be created for each new intrinsic-based dialect and does not
allow them to mix in the same module, contrary to the rest of the MLIR
infrastructure. Furthermore, OpenMP translation ingrained itself into the main
translation mechanism.
Start refactoring the translation to LLVM IR to operate using dialect
interfaces. Each dialect that contains ops translatable to LLVM IR can
implement the interface for translating them, and the top-level translation
driver can operate on interfaces without knowing about specific dialects.
Furthermore, the delayed dialect registration mechanism allows one to avoid a
dependency on LLVM IR in the dialect that is translated to it by implementing
the translation as a separate library and only registering it at the client
level.
This change introduces the new mechanism and factors out the translation of the
"main" LLVM dialect. The remaining dialects will follow suit.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96503