This enables querying shapes/values as shapes without mutating the IR
directly (e.g., towards enabling inference in analysis &
application steps, inferring a function's shape with constants from the callsite,
...). Add a new ShapeAdaptor that abstracts over whether the shape comes
from a Type, ShapedTypeComponents, or DenseIntElementsAttr. This adds new
accessors to ValueShapeRange to get the shape and the value as a shape, but
doesn't restrict or remove the previous way of accessing the Type via the
Value for now; that does mean a less refined shape could be accidentally
queried, which will be restricted in a follow-up.
The Value query is currently restricted to what can be represented as a
shape, so it only supports cases where the output of constant subgraph
evaluation is a shape. I had considered making it more general, but
without a TBD extern attribute concept or some such, a user today cannot
uniformly avoid the overhead.
Update TOSA ops and also the shape inference pass.
Differential Revision: https://reviews.llvm.org/D107768
Replace some code snippets with scf::ForOp methods. Additionally,
share a listener at one more point (although this pattern is still
not safe to roll back currently)
Differential Revision: https://reviews.llvm.org/D107754
Implements lowering of dense-to-sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.
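For example (a sketch; the sparse encoding shown is illustrative):
#CSR = #sparse_tensor.encoding<{ dimLevelType = [ "dense", "compressed" ] }>
%1 = sparse_tensor.convert %0 : tensor<8x8xf64> to tensor<8x8xf64, #CSR>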
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107681
Perform scalar constant propagation for FPTruncOp only if the resulting value can be represented without precision loss or rounding.
Example:
%cst = constant 1.000000e+00 : f32
%0 = fptrunc %cst : f32 to bf16
-->
%cst = constant 1.000000e+00 : bf16
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D107518
CastOp::areCastCompatible does not check whether casts are definitely compatible.
When going from dynamic to static offset or stride, the canonicalization cannot
know whether it is really cast compatible. In that case, it can only canonicalize
to an alloc plus copy.
Differential Revision: https://reviews.llvm.org/D107545
Tested with gcc-10. Other compilers may generate additional warnings. This does not fix all warnings. There are a few extra ones in LLVMCore and MLIR.
* `OpEmitter::getAttrNameIndex`: -Wunused-function (function is private and not used anywhere)
* `PrintOpPass` copy constructor: -Wextra ("Base class should be explicitly initialized in the copy constructor")
* `LegalizeForLLVMExport.cpp`: -Woverflow (overflow is expected, silence warning by making the cast explicit)
Differential Revision: https://reviews.llvm.org/D107525
The existing vector transforms reduce the dimension of transfer_read
ops. However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector. This patch handles that case.
Note that in the longer term, it may be preferable to support
zero-dimension vectors. See
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.
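A sketch of the kind of rewrite involved (names are illustrative):
%v = vector.transfer_read %m[%i], %pad : memref<8xf32>, vector<1xf32>
-->
%s = memref.load %m[%i] : memref<8xf32>
%v = vector.broadcast %s : f32 to vector<1xf32>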
Differential Revision: https://reviews.llvm.org/D103432
We can propagate the shape from tosa.cond_if operands into the true/false
regions, then through the connected blocks. Then, using the tosa.yield ops,
we can determine all the possible return types.
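For example (a sketch in generic op syntax): both yields produce
tensor<4xf32>, so the unranked result can be refined to tensor<4xf32>:
%0 = "tosa.cond_if"(%cond, %a, %b) ({
^bb0(%t0: tensor<4xf32>, %t1: tensor<4xf32>):
  "tosa.yield"(%t0) : (tensor<4xf32>) -> ()
}, {
^bb0(%t0: tensor<4xf32>, %t1: tensor<4xf32>):
  "tosa.yield"(%t1) : (tensor<4xf32>) -> ()
}) : (tensor<i1>, tensor<4xf32>, tensor<4xf32>) -> tensor<*xf32>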
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105940
Handles shape inference for identity, cast, and rescale. These were missed
during the initial elementwise work. This also adds resize shape propagation,
based on both the attributes and the input types.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105845
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in Section 2.17.1 of the OpenMP 5
standard. A name and a hint can be specified. The name is a global entity
or has external linkage; it is modelled as a FlatSymbolRefAttr. The hint
is modelled as an integer enum attribute.
The construct is also lowered to LLVM IR using the OpenMP IRBuilder.
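A sketch of the construct (the exact syntax and hint spelling may differ):
omp.critical(@mutex) hint(contended) {
  // body: executed by at most one thread at a time
  omp.terminator
}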
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107135
Silence clang-tidy warning in AffineOps.cpp due to the inability to see
through the typeswitch. NFC.
Differential Revision: https://reviews.llvm.org/D106125
Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration.
This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only in the last iteration (the scf.if), slower masked loads/stores are used.
Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling.
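A rough sketch with illustrative names (the pass computes the split bound
so that the main loop executes only full steps):
%diff = subi %ub, %lb : index
%rem = remi_signed %diff, %step : index
%split = subi %ub, %rem : index
scf.for %iv = %lb to %split step %step {
  // main loop: full iterations, unmasked fast path
}
%c0 = constant 0 : index
%has_tail = cmpi ne, %rem, %c0 : index
scf.if %has_tail {
  // last, partial iteration: masked slow path
}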
Differential Revision: https://reviews.llvm.org/D105804
Introduces a conversion from one (sparse) tensor type to another
(sparse) tensor type. See the operation doc for details. Actual
codegen for all cases is still TBD.
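For example, converting between two sparse encodings (a sketch; the
encodings are illustrative):
#CSR = #sparse_tensor.encoding<{ dimLevelType = [ "dense", "compressed" ] }>
#CSC = #sparse_tensor.encoding<{ dimLevelType = [ "dense", "compressed" ], dimOrdering = affine_map<(i, j) -> (j, i)> }>
%1 = sparse_tensor.convert %0 : tensor<8x8xf64, #CSR> to tensor<8x8xf64, #CSC>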
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107205
When we vectorize a scalar constant, the vector constant is inserted before its
first user if the scalar constant is defined outside the loops to be vectorized.
It is possible that the vector constant does not dominate all its users. To fix
the problem, we find the innermost vectorized loop that encloses that first user
and insert the vector constant at the top of the loop body.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106609
The tosa-make-broadcastable pass needs the output shape to determine whether
the operation includes additional broadcasting. Include some canonicalizations
for TOSA to remove unneeded reshape ops.
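For instance, a reshape of a reshape can collapse into a single reshape
(a sketch of one plausible such pattern):
%0 = "tosa.reshape"(%arg0) {new_shape = [6]} : (tensor<2x3xf32>) -> tensor<6xf32>
%1 = "tosa.reshape"(%0) {new_shape = [3, 2]} : (tensor<6xf32>) -> tensor<3x2xf32>
-->
%1 = "tosa.reshape"(%arg0) {new_shape = [3, 2]} : (tensor<2x3xf32>) -> tensor<3x2xf32>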
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D106846
`PadTensorOp` has verification logic to ensure that a result dim is
static if the corresponding source dim and all the padding values are static.
Cast folding might add more static information for the src operand
of `PadTensorOp`, which might change a valid operation into an invalid one.
Change the canonicalization pattern to fix this.
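Example (a sketch; %pad and the other names are illustrative): the fixed
pattern infers a static result type and casts back for existing users.
%pad = constant 0.000000e+00 : f32
%0 = tensor.cast %arg0 : tensor<4x4xf32> to tensor<?x?xf32>
%1 = linalg.pad_tensor %0 low[1, 1] high[1, 1] {
^bb0(%i: index, %j: index):
  linalg.yield %pad : f32
} : tensor<?x?xf32> to tensor<?x?xf32>
-->
%2 = linalg.pad_tensor %arg0 low[1, 1] high[1, 1] {
^bb0(%i: index, %j: index):
  linalg.yield %pad : f32
} : tensor<4x4xf32> to tensor<6x6xf32>
%1 = tensor.cast %2 : tensor<6x6xf32> to tensor<?x?xf32>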
Currently TFRT does not support top-level coroutines, so this functionality will allow having a single blocking await at the top level until TFRT implements the necessary functionality.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D106730
Change the formatting of the debug printouts to elide unnecessary information.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106661
Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though)
Differential Revision: https://reviews.llvm.org/D105149
Interop parallelism requires awaiting on results. Blocking awaits are bad for performance. TFRT supports lightweight resumption on threads, and coroutines are an abstraction that can be used to lower the kernels onto TFRT threads.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D106508
The verifier of the llvm.call operation was not checking for mismatches between
the number of operation results and the number of results in the signature of
the callee. Furthermore, it was possible to construct an llvm.call operation
producing an SSA value of !llvm.void type, which should not exist. Add the
verification and treat !llvm.void result type as absence of call results.
Update the GPU conversions to LLVM that were mistakenly assuming that it was
fine for llvm.call to produce values of !llvm.void type and ensure these calls
do not produce results.
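For example:
%0 = llvm.call @foo() : () -> !llvm.void
-->
llvm.call @foo() : () -> ()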
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106937
- Fixed symbol insertion into `symNameToModuleMap`. Insertion
needs to happen whether symbols are renamed or not.
- Added check for the VCE triple and avoid dropping it.
- Disabled function deduplication. It requires more careful
  rules; right now it can remove functions that are actually different.
- Added tests for symbol rename listener.
- And some other code/comment cleanups.
Reviewed By: ergawy
Differential Revision: https://reviews.llvm.org/D106886
Specialize the DeduplicateInputs and RemoveIdentityLinalgOps patterns for GenericOp instead of implementing them for the LinalgOp interface.
This revision is based on https://reviews.llvm.org/D105622, which moves the logic to erase identity CopyOps into a separate pattern.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105291
Split out an EraseIdentityCopyOp pattern from the existing RemoveIdentityLinalgOps pattern. Introduce an additional check to ensure the permutation maps match. This is a preparation step to specialize RemoveIdentityLinalgOps to GenericOp only.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105622
`memref.collapse_shape` has verification logic to ensure that a result
dim is static if all the corresponding collapsed src dims are static.
Cast folding might add more static information for the src operand
of `memref.collapse_shape`, which might change a valid collapsing
operation into an invalid one. Add a `CollapseShapeOpMemRefCastFolder`
pattern to fix this.
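Example (a sketch): the pattern folds the cast, recomputes a static
result type, and casts back for existing users.
%0 = memref.cast %arg0 : memref<4x4xf32> to memref<?x?xf32>
%1 = memref.collapse_shape %0 [[0, 1]] : memref<?x?xf32> into memref<?xf32>
-->
%2 = memref.collapse_shape %arg0 [[0, 1]] : memref<4x4xf32> into memref<16xf32>
%1 = memref.cast %2 : memref<16xf32> to memref<?xf32>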
Minor changes to `convertReassociationIndicesToExprs` to use `context`
instead of `builder` to avoid extra steps to construct temporary
builders.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D106670
The order of testing in two sparse tensor ops was incorrect,
which could cause an invalid cast (crashing the compiler instead
of reporting the error). This revision fixes that bug.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106841
Retaining the old interface; it should be constructible as before. The change would have been NFC, except that this doesn't implicitly work with the OpAdaptor generated in C++14.
Differential Revision: https://reviews.llvm.org/D106772
This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be
passed to successor regions. Similar to the BranchOpInterface, it allows the
involved operands to be defined freely. However, in contrast to the BranchOpInterface, it expects an additional region
number to distinguish between various use cases which might require different operands passed to
different regions.
Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and
getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equipped with the
ReturnLike trait and/or implementing the newly added interface. This simplifies reasoning about
terminators in the scope of nested regions.
We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities.
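For example, scf.condition terminates the "before" region of scf.while and
forwards its trailing operands either to the "after" region or as the overall
results (a sketch, using the std/scf syntax of the time):
func @count_up(%init: i32, %limit: i32) -> i32 {
  %c1 = constant 1 : i32
  %res = scf.while (%arg = %init) : (i32) -> i32 {
    %cond = cmpi slt, %arg, %limit : i32
    scf.condition(%cond) %arg : i32
  } do {
  ^bb0(%arg2: i32):
    %next = addi %arg2, %c1 : i32
    scf.yield %next : i32
  }
  return %res : i32
}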
Differential Revision: https://reviews.llvm.org/D105018
When the output indexing map has a permutation, we need to take it into
account when computing the contraction vector type.
Differential Revision: https://reviews.llvm.org/D106469