`vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position`
operand changed to accept `AnySignlessIntegerOrIndex` for better operability with
operations that use `index`, such as affine loops.
LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering
directly to these operations without explicitly inserting casts is allowed. SPIRV's
equivalent ops can also accept `i64`.
Reviewed By: nicolasvasilache, jpienaar
Differential Revision: https://reviews.llvm.org/D114139
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113955
* It works similar to scf.for coversion, but convert condition and yield ops as part of scf.whille pattern so it don't need to maintain external state
Differential Revision: https://reviews.llvm.org/D113007
This reverts commit 94992670fc.
Build is broken with:
tools/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.cpp.inc:23996:3: error: no matching function for call to 'printSwitchOpCases'
printSwitchOpCases(_odsPrinter, *this, getValue().getType(), getCaseValuesAttr(), getCaseDestinations(), getCaseOperands(), getCaseOperands().getTypes());
^~~~~~~~~~~~~~~~~~
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113955
Split tosa.reshape into three individual lowerings: collapse, expand and a
combination of both. Add simple dynamic shape support.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D113936
FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times.
Without this, FMAOps with 3+D do not lower to LLVM.
Differential Revision: https://reviews.llvm.org/D113886
Names should be consistent across all operations otherwise painful bugs will surface.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D113762
Support load with broadcast, elementwise divf op and remove the
hardcoded restriction on the vector size. Picking the right size should
be enfored by user and will fail conversion to llvm/spirv if it is not
supported.
Differential Revision: https://reviews.llvm.org/D113618
New TOSA pad operation can support explicitly specifying the pad value. Added
lowering to linalg that uses the explicit value.
Differential Revision: https://reviews.llvm.org/D113515
Use existing helper instead of handling only a subset of indices lowering
arithmetic. Also relax the restriction on the memref rank for the GPU mma ops
as we can now support any rank.
Differential Revision: https://reviews.llvm.org/D113383
Given that LLVM dialect types may now optionally contain types from other
dialects, which itself is motivated by dialect interoperability and progressive
lowering, the conversion should no longer assume that the outermost LLVM
dialect type can be left as is. Instead, it should inspect the types it
contains and attempt to convert them to the LLVM dialect. Introduce this
capability for LLVM array, pointer and structure types. Only literal structures
are currently supported as handling identified structures requires the
converison infrastructure to have a mechanism for avoiding infite recursion in
case of recursive types.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D112550
In order to support fusion with mma matrix type we need to be able to
execute elementwise operations on them. This add an op to be able to
support some basic elementwise operations. This is a is not a full
solution as it only supports a limited scope or operations. Ideally we would
want to be able to fuse with more kind of operations.
Differential Revision: https://reviews.llvm.org/D112857
wmma intrinsics have a large number of combinations, ideally we want to be able
to target all the different variants. To avoid a combinatorial explosion in the
number of mlir op we use attributes to represent the different variation of
load/store/mma ops. We also can generate with tablegen helpers to know which
combinations are available. Using this we can avoid having too hardcode a path
for specific shapes and can support more types.
This patch also adds boiler plates for tf32 op support.
Differential Revision: https://reviews.llvm.org/D112689
Add the shufflevector conversion. It only handles the static, i.e., IntegerAttr, index.
Co-authored: Xinyi Liu <xyliuhelen@gmail.com>
Reviewed by: antiagainst
Differential revision: https://reviews.llvm.org/D112161
Allow lowering of wmma ops with 64bits indexes. Change the default
version of the test to use default layout.
Differential Revision: https://reviews.llvm.org/D112479
This also fixes the vector.shuffle C++ builder which had an incorrect type assumption that triggers with this new rewrite.
The vector.shuffle semantics were correct though.
Differential revision: https://reviews.llvm.org/D112578
The current implementation invokes materializations
whenever an input operand does not have a mapping for the
desired type, i.e. it requires materialization at the earliest possible
point. This conflicts with goal of dialect conversion (and also the
current documentation) which states that a materialization is only
required if the materialization is supposed to persist after the
conversion process has finished.
This revision refactors this such that whenever a target
materialization "might" be necessary, we insert an
unrealized_conversion_cast to act as a temporary materialization.
This allows for deferring the invocation of the user
materialization hooks until the end of the conversion process,
where we actually have a better sense if it's actually
necessary. This has several benefits:
* In some cases a target materialization hook is no longer
necessary
When performing a full conversion, there are some situations
where a temporary materialization is necessary. Moving forward,
these users won't need to provide any target materializations,
as the temporary materializations do not require the user to
provide materialization hooks.
* getRemappedValue can now handle values that haven't been
converted yet
Before this commit, it wasn't well supported to get the remapped
value of a value that hadn't been converted yet (making it
difficult/impossible to convert multiple operations in many
situations). This commit updates getRemappedValue to properly
handle this case by inserting temporary materializations when
necessary.
Another code-health related benefit is that with this change we
can move a majority of the complexity related to materializations
to the end of the conversion process, instead of handling adhoc
while conversion is happening.
Differential Revision: https://reviews.llvm.org/D111620
Specification specified the output type for quantized average pool should be
an i32. Only accumulator should be an i32, result type should match the input
type.
Caused in https://reviews.llvm.org/D111590
Reviewed By: sjarus, GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D112484
This is the only lowering to Linalg Tosa has, so it's needlessly
verbose. Likely this was a carry over from IREE's usage where we
originally lowered to linalg on buffers (the only linalg that existed at
the time), so the everything on tensors needed the suffix. We're dropping
it in IREE also, having transitioned entirely to using Linalg on
tensors.
Reviewed By: sjarus
Differential Revision: https://reviews.llvm.org/D111911
Part of the arith update broke UiToFp32. Fixed the lowering and included a new
test to detect a regression.
Differential Revision: https://reviews.llvm.org/D111772
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
1. To avoid two ExecutionModeOp using the same name, adding the value of execution mode in name when converting to LLVM dialect.
2. To avoid syntax error in spv.OpLoad, add OpTypeSampledImage into SPV_Type.
Reviewed by:antiagainst
Differential revision:https://reviews.llvm.org/D111193
Average pool assumed the same input/output type. Result type for integers
is always an i32, should be updated appropriately.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D111590
This patch is mainly to propogate location attribute from spv.GlobalVariable to llvm.mlir.global.
It also contains three small changes.
1. Remove the restriction on UniformConstant In SPIRVToLLVM.cpp;
2. Remove the errorCheck on relaxedPrecision when deserializering SPIR-V in Deserializer.cpp
3. In SPIRVOps.cpp, let ConstantOp take signedInteger too.
Co-authered: Alan Liu <alanliu.yf@gmail.com> and Xinyi Liu <xyliuhelen@gmail.com>
Reviewed by:antiagainst
Differential revision: https://reviews.llvm.org/D110207
Add support for dynamic shared memory for GPU launch ops: add an
optional operand to gpu.launch and gpu.launch_func ops to specify the
amount of "dynamic" shared memory to use. Update lowerings to connect
this operand to the GPU runtime.
Differential Revision: https://reviews.llvm.org/D110800
The conversion pattern is particularly useful for conversion of
block arguments in the master op.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D109610
Conversion to the LLVM dialect is being refactored to be more progressive and
is now performed as a series of independent passes converting different
dialects. These passes may produce `unrealized_conversion_cast` operations that
represent pending conversions between built-in and LLVM dialect types.
Historically, a more monolithic Standard-to-LLVM conversion pass did not need
these casts as all operations were converted in one shot. Previous refactorings
have led to the requirement of running the Standard-to-LLVM conversion pass to
clean up `unrealized_conversion_cast`s even though the IR had no standard
operations in it. The pass must have been also run the last among all to-LLVM
passes, in contradiction with the partial conversion logic. Additionally, the
way it was set up could produce invalid operations by removing casts between
LLVM and built-in types even when the consumer did not accept the uncasted
type, or could lead to cryptic conversion errors (recursive application of the
rewrite pattern on `unrealized_conversion_cast` as a means to indicate failure
to eliminate casts).
In fact, the need to eliminate A->B->A `unrealized_conversion_cast`s is not
specific to to-LLVM conversions and can be factored out into a separate type
reconciliation pass, which is achieved in this commit. While the cast operation
itself has a folder pattern, it is insufficient in most conversion passes as
the folder only applies to the second cast. Without complex legality setup in
the conversion target, the conversion infra will either consider the cast
operations valid and not fold them (a separate canonicalization would be
necessary to trigger the folding), or consider the first cast invalid upon
generation and stop with error. The pattern provided by the reconciliation pass
applies to the first cast operation instead. Furthermore, having a separate
pass makes it clear when `unrealized_conversion_cast`s could not have been
eliminated since it is the only reason why this pass can fail.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D109507
OpenMP reductions need a neutral element, so we match some known reduction
kinds (integer add/mul/or/and/xor, float add/mul, integer and float min/max) to
define the neutral element and the atomic version when possible to express
using atomicrmw (everything except float mul). The SCF-to-OpenMP pass becomes a
module pass because it now needs to introduce new symbols for reduction
declarations in the module.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D107549
Previously only await inside the async function (coroutine after lowering to async runtime) would check the error state
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D109229
Create a gpu memset op and corresponding CUDA and ROCm wrappers.
Reviewed By: herhut, lorenrose1013
Differential Revision: https://reviews.llvm.org/D107548
FuncOp always lowers to an LLVM external linkage presently. This makes it impossible to define functions in mlir which are local to the current module. Until MLIR FuncOps have a more formal linkage specification, this commit allows funcop's to have an optionally specified llvm.linkage attribute, whose value will be used as the linkage of the llvm funcop when lowered.
Differential Revision: https://reviews.llvm.org/D108524
Support LLVM linkage
Currently the builtin dialect is the default namespace used for parsing
and printing. As such module and func don't need to be prefixed.
In the case of some dialects that defines new regions for their own
purpose (like SpirV modules for example), it can be beneficial to
change the default dialect in order to improve readability.
Differential Revision: https://reviews.llvm.org/D107236