Summary:
LLVM matrix intrinsics recently introduced an option to support row-major mode.
This matches the MLIR vector model, this revision switches to row-major.
A corner case related to degenerate sizes was also fixed upstream.
This revision removes the guard against this corner case.
A bug was uncovered on the output vector construction which this revision also fixes.
Lastly, this has been tested on a small size and benchmarked independently: no visible performance regression is observed.
In the future, when matrix intrinsics support per op attribute, we can more aggressively translate to that and avoid inserting MLIR-level transposes.
This has been tested independently to work on small matrices.
Differential Revision: https://reviews.llvm.org/D77761
Summary:
This revision adds support to lower 1-D vector transfers to LLVM.
A mask of the vector length is created that compares the base offset + linear index to the dim of the vector.
In each position where this does not overflow (i.e. offset + vector index < dim), the mask is set to 1.
A notable fact is that the lowering uses llvm.dialect_cast to allow writing code in the simplest form by targeting the simplest mix of vector and LLVM dialects and
letting other conversions kick in.
Differential Revision: https://reviews.llvm.org/D77703
The "cblas" lib under mlir/test is meant as a simple integration demonstration.
However it is installed and ends up conflicting with external projects who want to
define the real cblas.
Rename to avoid conflicts.
Differential revision: https://reviews.llvm.org/D76615
Summary: Some pattern rewriters, like dialect conversion, prohibit the unbounded recursion(or reapplication) of patterns on generated IR. Most patterns are not written with recursive application in mind, so will generally explode the stack if uncaught. This revision adds a hook to RewritePattern, `hasBoundedRewriteRecursion`, to signal that the pattern can safely be applied to the generated IR of a previous application of the same pattern. This allows for establishing a contract between the pattern and rewriter that the pattern knows and can handle the potential recursive application.
Differential Revision: https://reviews.llvm.org/D77782
Minor fixes and cleanup for ShapedType accessors, use
ShapedType::kDynamicSize, add ShapedType::isDynamicDim.
Differential Revision: https://reviews.llvm.org/D77710
Summary: This avoids the need for having global static initializers within the JITRunner support library, and only constructs the options when the runner is invoked.
Differential Revision: https://reviews.llvm.org/D77760
Summary: ClassID is used as a type id and must be unique in the face of shared libraries to ensure correctness. This fixes failures related to BUILD_SHARED_LIBs on macos.
Differential Revision: https://reviews.llvm.org/D77764
This revision builds a simple "fused pass" consisting of 2 levels of tiling, memory promotion and vectorization using linalg transformations written as composable pattern rewrites.
Summary: Pass options are a better choice for various reasons and avoid the need for static constructors.
Differential Revision: https://reviews.llvm.org/D77707
Summary:
Update ShapeCastOp folder to use producer-consumer value forwarding.
Support is added for tracking sub-vectors through trivial shape cast operations,
where the sub-vector shape is preserved across shape cast operations and only
leading ones are added or removed.
Support is preserved for cancelling shape cast operations.
One unit test is added and two are updated.
Reviewers: aartbik, nicolasvasilache
Reviewed By: aartbik, nicolasvasilache
Subscribers: frgossen, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77253
820c420d4e1c630b5ead285917c6ecdd2f5092ad did not really fix all build
issues by D77528. This gets rid of two unnecessary 'using' declarations.
Differential Revision: https://reviews.llvm.org/D77726
Support to recognize and deal with aligned_alloc was recently added to
LLVM's TLI/MemoryBuiltins and its various optimization passes. This
revision adds support for generation of aligned_alloc's when lowering
AllocOp from std to LLVM. Setting 'use-aligned_alloc=1' will lead to
aligned_alloc being used for all heap allocations. An alignment and size
that works with the constraints of aligned_alloc is chosen.
Using aligned_alloc is preferable to "using malloc and adjusting the
allocated pointer to align for indexing" because the pointer access
arithmetic done for the latter only makes it harder for LLVM passes to
deal with for analysis, optimization, attribute deduction, and rewrites.
Differential Revision: https://reviews.llvm.org/D77528
This revision removes the reliance of Promotion on `linalg.slice` which is meant
for the rank-reducing case.
Differential Revision: https://reviews.llvm.org/D77676
Invoke `keep()` on the output file of `mlir-opt` in case the invocation of `MlirOptMain` was successful, to make sure the output file is not deleted on exit from `mlir-opt`.
Fixes a similar problem in `standalone-opt` from the example for an out-of-tree, standalone MLIR dialect.
This revision also adds a missing parameter to the invocation of `MlirOptMain` in `standalone-opt`.
Differential Revision: https://reviews.llvm.org/D77643
Summary: 'it' may get invalidated when recursing into optional groups. This revision refactors the inner loop to avoid the need to compare the iterator after invalidation.
Differential Revision: https://reviews.llvm.org/D77686
Error messages for the custom assembly format are difficult to understand
because there are no line numbers. This happens because the assembly format
is parsed as a standalone line, separate from it's parent file, with no useful
location information. Fixing this properly probably requires quite a bit
of invasive plumbing through the SourceMgr, similar to how included files
are handled
This proposal is a less invasive short term solution. When generating an
error message we generate an additional note which at least properly describes
the operation definition the error occured in, if not the actual line number
of the assemblyFormat definition.
A typical message is like:
error: type of operand #0, named 'operand', is not buildable and a buildable type cannot be inferred
$operand type($result) attr-dict
^
/src/llvm-project/mlir/test/mlir-tblgen/op-format-spec.td:296:1: note: in custom assembly format for this operation
def ZCoverageInvalidC : TestFormat_Op<"variable_invalid_c", [{
^
note: suggest adding a type constraint to the operation or adding a 'type($operand)' directive to the custom assembly format
$operand type($result) attr-dict
^
Differential Revision: https://reviews.llvm.org/D77488
The messages are somewhat cryptic, since they are not complete sentences,
include lots of ambiguous words, like 'format' which are hard to parse,
and include names from the users code which may, or may not make sense in
the context of the message. Start to clean this up and provide some
guidance for fixes.
Also, add a test for one of the messages which didn't have a test at all.
Differential Revision: https://reviews.llvm.org/D77449
Summary:
This is much cleaner, and fits the same structure as many other tablegen backends. This was not done originally as the CRTP in the pass classes made it overly verbose/complex.
Differential Revision: https://reviews.llvm.org/D77367
This revision removes all of the CRTP from the pass hierarchy in preparation for using the tablegen backend instead. This creates a much cleaner interface in the C++ code, and naturally fits with the rest of the infrastructure. A new utility class, PassWrapper, is added to replicate the existing behavior for passes not suitable for using the tablegen backend.
Differential Revision: https://reviews.llvm.org/D77350
ModulePass doesn't provide any special utilities and thus doesn't give enough benefit to warrant a special pass class. This revision replaces all usages with the more general OperationPass.
Differential Revision: https://reviews.llvm.org/D77339
Summary:
Add directive to indicate the location to give to op being created. This
directive is optional and if unused the location will still be the fused
location of all source operations.
Currently this directive only works with other op locations, reusing an
existing op location or a fusion of op locations. But doesn't yet support
supplying metadata for the FusedLoc.
Based off initial revision by antiagainst@ and effectively mirrors GlobalIsel
debug_locations directive.
Differential Revision: https://reviews.llvm.org/D77649
Summary:
* Removal of FxpMathOps was discussed on the mailing list.
* Will send a courtesy note about also removing the Quantizer (which had some dependencies on FxpMathOps).
* These were only ever used for experimental purposes and we know how to get them back from history as needed.
* There is a new proposal for more generalized quantization tooling, so moving these older experiments out of the way helps clean things up.
Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77479
Summary: Diagnostics may be cached in the parallel diagnostic handler to preserve proper ordering. Storing the Operation as a DiagnosticArgument is problematic as the operation may be erased or changed before it finally gets printed.
Differential Revision: https://reviews.llvm.org/D77675
If we have two back-to-back loops with block arguments, the OpPhi
instructions generated for the second loop's block arguments should
have use the merge block of the first SPIR-V loop structure as
their incoming parent block.
Differential Revision: https://reviews.llvm.org/D77543
Introduce the alloca op for stack memory allocation. When converting to the
LLVM dialect, this is lowered to an llvm.alloca. Refactor the std to
llvm conversion for alloc op to reuse with alloca. Drop useAlloca option
with alloc op lowering.
Differential Revision: https://reviews.llvm.org/D76602
Fix point-wise copy generation to work with bounds that have max/min.
Change structure of copy loop nest to use absolute loop indices and
subtracting base from the indexes of the fast buffers. Update supporting
utilities: Fix FlatAffineConstraints::getLowerAndUpperBound to look at
equalities as well and for a missing division. Update unionBoundingBox
to not discard common constraints (leads to a tighter system). Update
MemRefRegion::getConstantBoundingSizeAndShape to add memref dimension
constraints. Run removeTrivialRedundancy at the end of
MemRefRegion::compute. Run single iteration loop promotion and
load/store canonicalization after affine data copy (in its test pass as
well).
Differential Revision: https://reviews.llvm.org/D77320
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.
In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.
In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)
Differential Revision: https://reviews.llvm.org/D75661
The rewriter generates a call to build that is not handled by opdef generator
and so will fail to compile. Also if this is a root node being replaced
(depth 0) then using the more generic build method in the rewrite suffices.
Summary: This revision updates the value numbering when printing to number from the next parent operation that is isolated from above. This is the highest level to number from that still ensures thread-safety. This revision also changes the behavior of Operator::operator<< to use local scope to avoid thread races when numbering operations.
Differential Revision: https://reviews.llvm.org/D77525
Summary:
This revision adds a tensor_reshape operation that operates on tensors.
In the tensor world the constraints are less stringent and we can allow more
arbitrary dynamic reshapes, as long as they are contractions.
The expansion of a dynamic dimension into multiple dynamic dimensions is under-specified and is punted on for now.
Differential Revision: https://reviews.llvm.org/D77360
This will fix the case:
$ toyc -emit=jit test.toy
$ cat test.toy
def main() {
var a = 1;
print(a);
}
Without this patch it would trigger an assertion.
Differential Revision: https://reviews.llvm.org/D77464
The current return type sometimes leads to code like
to_vector<2>(ValueRange(loop.getInductionIvs())). It would be nice to
shorten it. Users who need access to Block::BlockArgListType (if there
are any), can always call getBody()->getArguments(); if needed.
Also remove getNumInductionVars(), since there is getNumLoops().
Differential Revision: https://reviews.llvm.org/D77526
Summary:
This revision performs several cleanups on the translation infra:
* Removes the TranslateCLParser library and consolidates into Translation
- This was a weird library that existed in Support, and didn't really justify being a standalone library.
* Cleans up the internal registration and consolidates all of the translation functions within one registry.
Differential Revision: https://reviews.llvm.org/D77514
Summary: Blocks are numbered locally within a region, so numbering above the parent region is unnecessary.
Differential Revision: https://reviews.llvm.org/D77510
Summary: This updates the canonicalization documentation, and properly documents the different ways of canonicalizing operations.
Differential Revision: https://reviews.llvm.org/D77490
Summary:
This revision adds a section to WritingAPass to document the declarative specification, and how to use it.
Differential Revision: https://reviews.llvm.org/D77102
Even if this indicates in general a problem at call sites, the printer
is used for debugging and avoiding crashing is friendlier for example
when used in diagnostics or other printer.
Differential Revision: https://reviews.llvm.org/D77481