This CL allows the programmatic control of the target hardware vector size when creating a MaterializeVectorsPass.
This is useful for registering passes for the tutorial.
PiperOrigin-RevId: 240996136
A integer number can be specified in the pattern definition and used as the
adjustment to the default benefit score in the generated rewrite pattern C++
definition.
PiperOrigin-RevId: 240994192
This CL removes the reliance of the vectorize pass on the specification of a `fastestVaryingDim` parameter. This parameter is a restriction meant to more easily target a particular loop/memref combination for vectorization and is mainly used for testing.
This also had the side-effect of restricting vectorization patterns to only the ones in which all memrefs were contiguous along the same loop dimension. This simple restriction prevented matmul to vectorize in 2-D.
this CL removes the restriction and adds the matmul test which vectorizes in 2-D along the parallel loops. Support for reduction loops is left for future work.
PiperOrigin-RevId: 240993827
These fail with:
could not convert ‘module’ from ‘llvm::orc::ThreadSafeModule’ to
‘llvm::Expected<llvm::orc::ThreadSafeModule>’
PiperOrigin-RevId: 240892583
Most of the tests have been ported to be unit-tests and this pass is problematic in the way it depends on TableGen-generated files. This pass is also non-deterministic during multi-threading and a blocker to turning it on by default.
PiperOrigin-RevId: 240889154
Originally, the conversion to the LLVM IR dialect had been implemented as pass.
The common conversion infrastructure was factored into DialectConversion from
which the conversion pass inherited. The conversion being a pass is
undesirable for callers that only need the conversion done, for example as a
part of sequence of conversions or outside the pass manager infrastructure.
Split the LLVM IR Dialect conversion into the conversion proper and the
conversion pass, where the latter contains the former instead of inheriting.
NFC.
PiperOrigin-RevId: 240874740
Dialect conversion currently clones the operations that did not match any
pattern. This includes cloning any regions that belong to these operations.
Instead, apply conversion recursively to the nested regions.
Note that if an operation matched one of the conversion patterns, it is up to
the pattern rewriter to fill in the regions of the converted operation. This
may require calling back to the converter and is left for future work.
PiperOrigin-RevId: 240872410
Example:
%call:2 = call @multi_return() : () -> (f32, i32)
use(%calltensorflow/mlir#0, %calltensorflow/mlir#1)
This cl also adds parser support for uniquely named result values. This means that a test writer can now write something like:
%foo, %bar = call @multi_return() : () -> (f32, i32)
use(%foo, %bar)
Note: The printer will still print the collapsed form.
PiperOrigin-RevId: 240860058
This CL allows vectorization to be called and configured in other ways than just via command line arguments.
This allows triggering vectorization programmatically.
PiperOrigin-RevId: 240638208
When multi-threading is enabled in the pass manager the meaning of the display
slightly changes. First, a new timing column is added, `User Time`, that
displays the total time spent across all threads. Secondly, the `Wall Time`
column displays the longest individual time spent amongst all of the threads.
This means that the `Wall Time` column will continue to give an indicator on the
perceived time, or clock time, whereas the `User Time` will display the total
cpu time.
Example:
$ mlir-opt foo.mlir -experimental-mt-pm -cse -canonicalize -convert-to-llvmir -pass-timing
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0078 seconds
---User Time--- ---Wall Time--- --- Name ---
0.0175 ( 88.3%) 0.0055 ( 70.4%) Function Pipeline
0.0018 ( 9.3%) 0.0006 ( 8.1%) CSE
0.0013 ( 6.3%) 0.0004 ( 5.8%) (A) DominanceInfo
0.0017 ( 8.7%) 0.0006 ( 7.1%) FunctionVerifier
0.0128 ( 64.6%) 0.0039 ( 50.5%) Canonicalizer
0.0011 ( 5.7%) 0.0004 ( 4.7%) FunctionVerifier
0.0004 ( 2.1%) 0.0004 ( 5.2%) ModuleVerifier
0.0010 ( 5.3%) 0.0010 ( 13.4%) LLVMLowering
0.0009 ( 4.3%) 0.0009 ( 11.0%) ModuleVerifier
0.0198 (100.0%) 0.0078 (100.0%) Total
PiperOrigin-RevId: 240636269
The spec allows zero-dimensional memrefs to exist and treats them essentially
as single-element buffers. Unlike single-dimensional memrefs of static shape
<1xTy>, zero-dimensional memrefs do not require indices to access the only
element they store. Add support of zero-dimensional memrefs to the LLVM IR
conversion. In particular, such memrefs are converted into bare pointers, and
accesses to them are converted to bare loads and stores, without the overhead
of `getelementptr %buffer, 0`.
PiperOrigin-RevId: 240579456
When converting to the LLVM IR Dialect, it is possible for the input IR to
contain LLVM IR Dialect operation and/or types, for example, some functions may
have been coverted to the LLVM IR Dialect already, or may have been created
using this dialect directly. Make sure that type conversion keeps LLVM IR
Dialect types unmodified and does not error out. Operations are already kept
as is.
PiperOrigin-RevId: 240574972
Due to legacy reasons (ML/CFG function separation), regions in affine control
flow operations require contained blocks not to have terminators. This is
inconsistent with the notion of the block and may complicate code motion
between regions of affine control operations and other regions.
Introduce `affine.terminator`, a special terminator operation that must be used
to terminate blocks inside affine operations and transfers the control back to
he region enclosing the affine operation. For brevity and readability reasons,
allow `affine.for` and `affine.if` to omit the `affine.terminator` in their
regions when using custom printing and parsing format. The custom parser
injects the `affine.terminator` if it is missing so as to always have it
present in constructed operations.
Update transformations to account for the presence of terminator. In
particular, most code motion transformation between loops should leave the
terminator in place, and code motion between loops and non-affine blocks should
drop the terminator.
PiperOrigin-RevId: 240536998
Before this CL, the result type of the pattern match results need to be as same
as the first operand type, operand broadcast type or a generic tensor type.
This CL adds a new trait to set the result type by attribute. For example, the
TFL_ConstOp can use this to set the output type to its value attribute.
PiperOrigin-RevId: 240441249
Currently, regions can only be constructed by passing in a `Function` or an
`Instruction` pointer referencing the parent object, unlike `Function`s or
`Instruction`s themselves that can be created without a parent. It leads to a
rather complex flow in operation construction where one has to create the
operation first before being able to work with its regions. It may be
necessary to work with the regions before the operation is created. In
particular, in `build` and `parse` functions that are executed _before_ the
operation is created in cases where boilerplate region manipulation is required
(for example, inserting the hypothetical default terminator in affine regions).
Allow creating standalone regions. Such regions are meant to own a list of
blocks and transfer them to other regions on demand.
Each instruction stores a fixed number of regions as trailing objects and has
ownership of them. This decreases the size of the Instruction object for the
common case of instructions without regions. Keep this behavior intact. To
allow some flexibility in construction, make OperationState store an owning
vector of regions. When the Builder creates an Instruction from
OperationState, the bodies of the regions are transferred into the
instruction-owned regions to minimize copying. Thus, it becomes possible to
fill standalone regions with blocks and move them to an operation when it is
constructed, or move blocks from a region to an operation region, e.g., for
inlining.
PiperOrigin-RevId: 240368183
a pointer. This makes it consistent with all the other methods in
FunctionPass, as well as with ModulePass::getModule(). NFC.
PiperOrigin-RevId: 240257910
This combined match/rewrite functionality allows simplifying the majority of existing RewritePatterns, as they do not benefit from separate match and rewrite functions.
Some of the existing canonicalization patterns in StandardOps have been modified to take advantage of this functionality.
PiperOrigin-RevId: 240187856
Previously we have multiple mechanisms to specify op definition and match constraints:
TypeConstraint, AttributeConstraint, Type, Attr, mAttr, mAttrAnyOf, mPat. These variants
are not added because there are so many distinct cases we need to model; essentially,
they are all carrying a predicate. It's just an artifact of implementation.
It's quite confusing for users to grasp these variants and choose among them. Instead,
as the OpBase TableGen file, we need to strike to provide an unified mechanism. Each
dialect has the flexibility to define its own aliases if wanted.
This CL removes mAttr, mAttrAnyOf, mPat. A new base class, Constraint, is added. Now
TypeConstraint and AttrConstraint derive from Constraint. Type and Attr further derive
from TypeConstraint and AttrConstraint, respectively.
Comments are revised and examples are added to make it clear how to use constraints.
PiperOrigin-RevId: 240125076