When we vectorize a scalar constant, the vector constant is inserted before its
first user if the scalar constant is defined outside the loops to be vectorized.
It is possible that the vector constant does not dominate all its users. To fix
the problem, we find the innermost vectorized loop that encloses that first user
and insert the vector constant at the top of the loop body.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106609
Make broadcastable needs the output shape to determine whether the operation
includes additional broadcasting. Include some canonicalizations for TOSA
to remove unneeded reshape.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D106846
`PadTensorOp` has verification logic to make sure
result dim must be static if all the padding values are static.
Cast folding might add more static information for the src operand
of `PadTensorOp` which might change a valid operation to be invalid.
Change the canonicalizing pattern to fix this.
* Adds source targets (not included in the full set that downstreams use by default) to bundle mlir-c/ headers into the mlir/_mlir_libs/include directory.
* Adds a minimal entry point to get include and library directories.
* Used by npcomp to export a full CAPI (which is then used by the Torch extension to link npcomp).
Reviewed By: mikeurbach
Differential Revision: https://reviews.llvm.org/D107090
Currently TFRT does not support top-level coroutines, so this functionality will allow to have a single blocking await at the top level until TFRT implements the necessary functionality.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D106730
Change the formatting of the debug print outs to elide unnecessary information.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106661
This is redundant with the callback variant and untested. Also remove
the callback-less methods for adding a dynamically legal op, as they
are no longer useful.
Differential Revision: https://reviews.llvm.org/D106786
This allows to use OperationEquivalence to track structural comparison for equality
between two operations.
Differential Revision: https://reviews.llvm.org/D106422
* For python projects that don't need JIT/ExecutionEngine, cuts the number of files to compile roughly in half (with similar reduction in end binary size).
Differential Revision: https://reviews.llvm.org/D106992
By making an explicit template specialization for the TypeID provided by these classes,
the compiler will not emit an inline weak definition and rely on the linker to unique it.
Instead a single definition will be emitted in the C++ file alongside the implementation
for these classes. That will turn into a linker error what is now a hard-to-debug runtime
behavior where instances of the same class may be using a different TypeID inside of
different DSOs.
Recommit 660a56956c after fixing gcc5
build.
Differential Revision: https://reviews.llvm.org/D105903
Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though)
Differential Revision: https://reviews.llvm.org/D105149
Interop parallelism requires needs awaiting on results. Blocking awaits are bad for performance. TFRT supports lightweight resumption on threads, and coroutines are an abstraction than can be used to lower the kernels onto TFRT threads.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D106508
By making an explicit template specialization for the TypeID provided by these classes,
the compiler will not emit an inline weak definition and rely on the linker to unique it.
Instead a single definition will be emitted in the C++ file alongside the implementation
for these classes. That will turn into a linker error what is now a hard-to-debug runtime
behavior where instances of the same class may be using a different TypeID inside of
different DSOs.
Differential Revision: https://reviews.llvm.org/D105903
In translation from MLIR to another IR, run the MLIR verifier on the parsed
module to ensure only valid modules are given to the translation. Previously,
we would send any module that could be parsed to the translation, including
semantically invalid modules, leading to surprising errors or lack thereof down
the pipeline.
Depends On D106937
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106938
The verifier of the llvm.call operation was not checking for mismatches between
the number of operation results and the number of results in the signature of
the callee. Furthermore, it was possible to construct an llvm.call operation
producing an SSA value of !llvm.void type, which should not exist. Add the
verification and treat !llvm.void result type as absence of call results.
Update the GPU conversions to LLVM that were mistakenly assuming that it was
fine for llvm.call to produce values of !llvm.void type and ensure these calls
do not produce results.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106937
- Fixed symbol insertion into `symNameToModuleMap`. Insertion
needs to happen whether symbols are renamed or not.
- Added check for the VCE triple and avoid dropping it.
- Disabled function deduplication. It requires more careful
rules. Right now it can remove different functions.
- Added tests for symbol rename listener.
- And some other code/comment cleanups.
Reviewed By: ergawy
Differential Revision: https://reviews.llvm.org/D106886
Specialize the DeduplicateInputs and RemoveIdentityLinalgOps patterns for GenericOp instead of implementing them for the LinalgOp interface.
This revsion is based on https://reviews.llvm.org/D105622 that moves the logic to erase identity CopyOps in a separate pattern.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105291
Split out an EraseIdentityCopyOp from the existing RemoveIdentityLinalgOps pattern. Introduce an additional check to ensure the pattern checks the permutation maps match. This is a preparation step to specialize RemoveIdentityLinalgOps to GenericOp only.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105622
`memref.collapse_shape` has verification logic to make sure
result dim must be static if all the collapsing src dims are static.
Cast folding might add more static information for the src operand
of `memref.collapse_shape` which might change a valid collapsing
operation to be invalid. Add `CollapseShapeOpMemRefCastFolder` pattern
to fix this.
Minor changes to `convertReassociationIndicesToExprs` to use `context`
instead of `builder` to avoid extra steps to construct temporary
builders.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D106670
By making an explicit template specialization for the TypeID provided by these classes,
the compiler will not emit an inline weak definition and rely on the linker to unique it.
Instead a single definition will be emitted in the C++ file alongside the implementation
for these classes. That will turn into a linker error what is now a hard-to-debug runtime
behavior where instances of the same class may be using a different TypeID inside of
different DSOs.
Differential Revision: https://reviews.llvm.org/D105903
RewriteEndOp was a fake terminator operation that is no longer needed now that blocks are not required to have terminators.
Differential Revision: https://reviews.llvm.org/D106911
* Implements all of the discussed features:
- Links against common CAPI libraries that are self contained.
- Stops using the 'python/' directory at the root for everything, opening the namespace up for multiple projects to embed the MLIR python API.
- Separates declaration of sources (py and C++) needed to build the extension from building, allowing external projects to build custom assemblies from core parts of the API.
- Makes the core python API relocatable (i.e. it could be embedded as something like 'npcomp.ir', 'npcomp.dialects', etc). Still a bit more to do to make it truly isolated but the main structural reset is done.
- When building statically, installed python packages are completely self contained, suitable for direct setup and upload to PyPi, et al.
- Lets external projects assemble their own CAPI common runtime library that all extensions use. No more possibilities for TypeID issues.
- Begins modularizing the API so that external projects that just include a piece pay only for what they use.
* I also rolled in a re-organization of the native libraries that matches how I was packaging these out of tree and is a better layering (i.e. all libraries go into a nested _mlir_libs package). There is some further cleanup that I resisted since it would have required source changes that I'd rather do in a followup once everything stabilizes.
* Note that I made a somewhat odd choice in choosing to recompile all extensions for each project they are included into (as opposed to compiling once and just linking). While not leveraged yet, this will let us set definitions controlling the namespacing of the extensions so that they can be made to not conflict across projects (with preprocessor definitions).
* This will be a relatively substantial breaking change for downstreams. I will handle the npcomp migration and will coordinate with the circt folks before landing. We should stage this and make sure it isn't causing problems before landing.
* Fixed a couple of absolute imports that were causing issues.
Differential Revision: https://reviews.llvm.org/D106520
The order of testing in two sparse tensor ops was incorrect,
which could cause an invalid cast (crashing the compiler instead
of reporting the error). This revision fixes that bug.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106841
This aligns the structure of the Affine dialect on all the other dialects.
In particular this makes the ODS C++ generated code independent of the
enclosing namespace.
Retaining old interface and should be constructable as previous, change would have been NFC except it this doesn't implicitly work with OpAdaptor generated in C++14.
Differential Revision: https://reviews.llvm.org/D106772
Tosa shape verification prevent shape propagation when coming from a dialect
of known shape. Relax this constraint to allow ingestion / shape propagation
from these other dialects.
Differential Revision: https://reviews.llvm.org/D106610