This is consistent to the other methods of the class, as well as
AffineExpr::replaceDimsAndSymbols.
Differential Revision: https://reviews.llvm.org/D80266
Add a new type to SPIRV dialect for cooperative matrix and add new op for
cooperative matrix load. This is missing most instructions to support
cooperative matrix extension but this is a stop-gap patch to avoid creating big
review.
Differential Revision: https://reviews.llvm.org/D80043
This patch introduces interfaces for read and write ops with affine
restrictions. I used `read`/`write` intead of `load`/`store` for the
interfaces so that they can also be implemented by dma ops.
For now, they are only implemented by affine.load, affine.store,
affine.vector_load and affine.vector_store.
For testing purposes, this patch also migrates affine loop fusion and
required analysis to use the new interfaces. No other changes are made
beyond that.
Co-authored-by: Alex Zinenko <zinenko@google.com>
Reviewed By: bondhugula, ftynse
Differential Revision: https://reviews.llvm.org/D79829
Summary:
This is a basic op needed for creating shapes from SSA values
representing the extents.
Differential Revision: https://reviews.llvm.org/D79833
Making these two converters more generic. FunctionAndBlockSignatureConverter now
moves only memref results (after type conversion) to the function argument and
keeps other legal function results unchanged. NonVoidToVoidReturnOpConverter is
renamed to NoBufferOperandsReturnOpConverter. It removes only the buffer
operands from the operands of the converted ReturnOp and inserts CopyOps to copy
each buffer to the target function argument.
Differential Revision: https://reviews.llvm.org/D79329
Thanks to a recent change that made `::build` functions take an instance of
`OpBuilder`, it is now possible to build operations within a region attached to
the operation about to be created. Exercise this on `scf::ForOp` by taking a
callback that populates the loop body while the loop is being created.
Additionally, provide helper functions to build perfect nests of `ForOp`s,
with support for iteration arguments. These functions provide the same
functionality as EDSC LoopNestBuilder with simpler implementation, without
relying on edsc::ScopedContext, and using `OpBuilder` in an unambiguous way.
Compatibility functions for EDSC are provided, but may be removed in the
future.
Differential Revision: https://reviews.llvm.org/D79688
Implemented tangent op from SPIR-V's GLSL extended instruction set.
Added a round-trip and serialization/deserialization tests for the op.
Differential Revision: https://reviews.llvm.org/D80152
Summary:
This patch adds support for flush operation in OpenMP dialect and translation of this construct to LLVM IR.
The OpenMP IRBuilder is used for this translation.
The patch includes code changes and testcase modifications.
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D79937
For now the promoted buffer is indexed using the `full view`. The full view might be
slightly bigger than the partial view (which is accounting for boundaries).
Unfortunately this does not compose easily with other transformations when multiple buffers
with shapes related to each other are involved.
Take `linalg.matmul A B C` (with A of size MxK, B of size KxN and C of size MxN) and suppose we are:
- Tiling over M by 100
- Promoting A only
This is producing a `linalg.matmul promoted_A B subview_C` where `promoted_A` is a promoted buffer
of `A` of size (100xK) and `subview_C` is a subview of size mxK where m could be smaller than 100 due
to boundaries thus leading to a possible incorrect behavior.
We propose to:
- Add a new parameter to the tiling promotion allowing to enable the use of the full tile buffer.
- By default all promoted buffers will be indexed by the partial view.
Note that this could be considered as a breaking change in comparison to the way the tiling promotion
was working.
Differential Revision: https://reviews.llvm.org/D79927
Summary:
Vector transfer ops semantic is extended to allow specifying a per-dimension `masked`
attribute. When the attribute is false on a particular dimension, lowering to LLVM emits
unmasked load and store operations.
Differential Revision: https://reviews.llvm.org/D80098
Summary:
This revision makes the use of vector transfer operatons more idiomatic by
allowing to omit and inferring the permutation_map.
Differential Revision: https://reviews.llvm.org/D80092
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
The JitRunner library is logically very close to the execution engine,
and shares similar dependencies.
find -name "*.cpp" -exec sed -i "s/Support\/JitRunner/ExecutionEngine\/JitRunner/" "{}" \;
Differential Revision: https://reviews.llvm.org/D79899
It is possible for optimizations to create SSA code which violates
the dominance property in unreachable blocks. Equivalently, dominance
computed using normal mechanisms is undefined in unreachable blocks.
See discussion here: https://llvm.discourse.group/t/rfc-allowing-dialects-to-relax-the-ssa-dominance-condition/833/51
This patch only checks the dominance condition inside blocks which are
reachable from the the entry block of their region. Note that the
dominance conditions of regions contained in an unreachable block are
still checked.
Differential Revision: https://reviews.llvm.org/D79922
The following Conversions are affected: LoopToStandard -> SCFToStandard,
LoopsToGPU -> SCFToGPU, VectorToLoops -> VectorToSCF. Full file paths are
affected. Additionally, drop the 'Convert' prefix from filenames living under
lib/Conversion where applicable.
API names and CLI options for pass testing are also renamed when applicable. In
particular, LoopsToGPU contains several passes that apply to different kinds of
loops (`for` or `parallel`), for which the original names are preserved.
Differential Revision: https://reviews.llvm.org/D79940
This revision starts decoupling the include the kitchen sink behavior of Linalg to LLVM lowering by inserting a -convert-linalg-to-std pass.
The lowering of linalg ops to function calls was previously lowering to memref descriptors by having both linalg -> std and std -> LLVM patterns in the same rewrite.
When separating this step, a new issue occurred: the layout is automatically type-erased by this process. This revision therefore introduces memref casts to perform these type erasures explicitly. To connect everything end-to-end, the LLVM lowering of MemRefCastOp is relaxed because it is artificially more restricted than the op semantics. The op semantics already guarantee that source and target MemRefTypes are cast-compatible. An invalid lowering test now becomes valid and is removed.
Differential Revision: https://reviews.llvm.org/D79468
This patch adds `affine.vector_load` and `affine.vector_store` ops to
the Affine dialect and lowers them to `vector.transfer_read` and
`vector.transfer_write`, respectively, in the Vector dialect.
Reviewed By: bondhugula, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D79658
Summary:
The assembly format for std.rank expects the operand type and not the
result type after the colon.
Differential Revision: https://reviews.llvm.org/D79857
All ops of the SCF dialect now use the `scf.` prefix instead of `loop.`. This
is a part of dialect renaming.
Differential Revision: https://reviews.llvm.org/D79844
The existing implementation of SubViewOp::getRanges relies on all
offsets/sizes/strides to be dynamic values and does not work in
combination with canonicalization. This revision adds a
SubViewOp::getOrCreateRanges to create the missing constants in the
canonicalized case.
This allows reactivating the fused pass with staged pattern
applications.
However another issue surfaces that the SubViewOp verifier is now too
strict to allow folding. The existing folding pattern is turned into a
canonicalization pattern which rewrites memref_cast + subview into
subview + memref_cast.
The transform-patterns-matmul-to-vector can then be reactivated.
Differential Revision: https://reviews.llvm.org/D79759
The current standard Alloca node is not annotated with the
MemEffect<Alloc> trait. This CL updates the Alloc and Alloca
memory-effect annotations using the latest Resource objects. Alloca
nodes will use a newly defined AutomaticAllocationScopeResource to
distinguish between Alloc and Alloca memory effects.
Differential Revision: https://reviews.llvm.org/D79620
This is only valid if the source tensors (result tensor) is static
shaped with all unit-extents when the reshape is collapsing
(expanding) dimensions.
Differential Revision: https://reviews.llvm.org/D79764
The main objective of this revision is to change the way static information is represented, propagated and canonicalized in the SubViewOp.
In the current implementation the issue is that canonicalization may strictly lose information because static offsets are combined in irrecoverable ways into the result type, in order to fit the strided memref representation.
The core semantics of the op do not change but the parser and printer do: the op always requires `rank` offsets, sizes and strides. These quantities can now be either SSA values or static integer attributes.
The result type is automatically deduced from the static information and more powerful canonicalizations (as powerful as the representation with sentinel `?` values allows). Previously static information was inferred on a best-effort basis from looking at the source and destination type.
Relevant tests are rewritten to use the idiomatic `offset: x, strides : [...]`-form. Bugs are corrected along the way that were not trivially visible in flattened strided memref form.
Lowering to LLVM is updated, simplified and now supports all cases.
A mixed static-dynamic mode test that wouldn't previously lower is added.
It is an open question, and a longer discussion, whether a better result type representation would be a nicer alternative. For now, the subview op carries the required semantic.
Differential Revision: https://reviews.llvm.org/D79662
Conversion/ folders were originally intended to store patterns for
DialectA->DialectB conversions that depend on both dialects and do not
conceptually belong to either of the dialects. As such, DialectA->DialectA
conversion does not make sense under Conversion/ and should rather live with
the dialect it operates on.
Differential Revision: https://reviews.llvm.org/D79569
This normalize the name of the tablegen file with the name of the generated
files (SideEffectInterfaces.h.inc) and the other Interface tablegen files,
which all end in Interface(s).td
Differential Revision: https://reviews.llvm.org/D79517
This reverts commit 80d133b24f.
Per Stephan Herhut: The canonicalizer pattern that was added creates
forms of the subview op that cannot be lowered.
This is shown by failing Tensorflow XLA tests such as:
tensorflow/compiler/xla/service/mlir_gpu/tests:abs.hlo.test
Will provide more details offline, they rely on logs from private CI.
Summary:
- Fix comments in several places
- Eliminate extra ' in AST dump and adjust tests accordingly
Differential Revision: https://reviews.llvm.org/D78399
Summary:
The main objective of this revision is to change the way static information is represented, propagated and canonicalized in the SubViewOp.
In the current implementation the issue is that canonicalization may strictly lose information because static offsets are combined in irrecoverable ways into the result type, in order to fit the strided memref representation.
The core semantics of the op do not change but the parser and printer do: the op always requires `rank` offsets, sizes and strides. These quantities can now be either SSA values or static integer attributes.
The result type is automatically deduced from the static information and more powerful canonicalizations (as powerful as the representation with sentinel `?` values allows). Previously static information was inferred on a best-effort basis from looking at the source and destination type.
Relevant tests are rewritten to use the idiomatic `offset: x, strides : [...]`-form. Bugs are corrected along the way that were not trivially visible in flattened strided memref form.
It is an open question, and a longer discussion, whether a better result type representation would be a nicer alternative. For now, the subview op carries the required semantic.
Reviewers: ftynse, mravishankar, antiagainst, rriddle!, andydavis1, timshen, asaadaldien, stellaraccident
Reviewed By: mravishankar
Subscribers: aartbik, bondhugula, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, bader, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79662
Summary:
This revision introduces a helper function to allow applying rewrite patterns, interleaved with more global transformations, in a staged fashion:
1. the first stage consists of an OwningRewritePatternList. The RewritePattern in this list are applied once and in order.
2. the second stage consists of a single OwningRewritePattern that is applied greedily until convergence.
3. the third stage consists of applying a lambda, generally used for non-local transformation effects.
This allows creating custom fused transformations where patterns can be ordered and applied at a finer granularity than a sequence of traditional compiler passes.
A test that exercises these behaviors is added.
Differential Revision: https://reviews.llvm.org/D79518
Summary:
This makes a common pattern of
`dyn_cast_or_null<OpTy>(v.getDefiningOp())` more concise.
Differential Revision: https://reviews.llvm.org/D79681
This [discussion](https://llvm.discourse.group/t/viewop-isnt-expressive-enough/991/2) raised some concerns with ViewOp.
In particular, the handling of offsets is incorrect and does not match the op description.
Note that with an elemental type change, offsets cannot be part of the type in general because sizeof(srcType) != sizeof(dstType).
Howerver, offset is a poorly chosen term for this purpose and is renamed to byte_shift.
Additionally, for all intended purposes, trying to support non-identity layouts for this op does not bring expressive power but rather increases code complexity.
This revision simplifies the existing semantics and implementation.
This simplification effort is voluntarily restrictive and acts as a stepping stone towards supporting richer semantics: treat the non-common cases as YAGNI for now and reevaluate based on concrete use cases once a round of simplification occurred.
Differential revision: https://reviews.llvm.org/D79541
Summary: This revision introduces LinalgPromotionOptions to more easily control the application of promotion patterns. It also simplifies the different entry points into Promotion in preparation for some behavior change in subsequent revisions.
Differential Revision: https://reviews.llvm.org/D79489
This dialect contains various structured control flow operaitons, not only
loops, reflect this in the name. Drop the Ops suffix for consistency with other
dialects.
Note that this only moves the files and changes the C++ namespace from 'loop'
to 'scf'. The visible IR prefix remains the same and will be updated
separately. The conversions will also be updated separately.
Differential Revision: https://reviews.llvm.org/D79578
Summary:
Cast from a value interpreted as floating-point to the corresponding signed
integer value. Similar to an element-wise `static_cast` in C++, performs an
element-wise conversion operation.
Differential Revision: https://reviews.llvm.org/D79373
Region has a default constructor that is called when a region is constructed
while an operation is being created, and therefore before the region can be
attached to this operation. The `container` field is uninitialized, which makes
it impossible to check programmatically if a Region is attached to an operation
or not, leading to sly memory errors when this field is read. Initialize it to
nullptr by default and thus make sure one can check if a region is attached to
an operation or not.