This revision adds a new `printNewline` hook to OpAsmPrinter that allows for printing a newline within the custom format of an operation, that is then indented to the start of the operation. Support for the declarative assembly format is also added, in the form of a `\n` literal.
Differential Revision: https://reviews.llvm.org/D93151
Fix a bug causing to pick the wrong vector size to broadcast to when the source
vectors have different ranks.
Differential Revision: https://reviews.llvm.org/D93118
This commit splits SPIR-V's serialization and deserialization code
into separate libraries. The motiviation being that the serializer
is used more often the deserializer and therefore lumping them
together unnecessarily increases binary size for the most common
case.
This commit also moves these libraries into the Target/ directory
to follow MLIR convention.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D91548
This avoids dumping the module post emitting a reproducer, which results in
many MB logs where a reproducer has already been neatly generated.
Differential Revision: https://reviews.llvm.org/D93165
Adds support for 3 ternary ops from SPIR-V extended instructions for
GLSL. Namely, adds support for FClamp, UClamp, and SClamp.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D92859
Zero bit integer types are supported by IntegerType for consistency,
but the asmparser never got updated. Allow them to be parsed, as
required to fix CIRCT issue #316
Differential Revision: https://reviews.llvm.org/D93089
When printing verification errors for ops with the incorrect number of
operand segments, print the required number as well as the actual
number. Split off from D93005.
Differential Revision: https://reviews.llvm.org/D93145
This mirror the C++ API for NamedAttribute, and has the advantage or
internalizing earlier in the Context and not requiring the caller to
keep the StringRef alive beyong this call.
Differential Revision: https://reviews.llvm.org/D93133
This reverts commit 0d48d265db.
This reapplies the following commit, with a fix for CAPI/ir.c:
[mlir] Start splitting the `tensor` dialect out of `std`.
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.
This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.
Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2
Differential Revision: https://reviews.llvm.org/D92991
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.
This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.
Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2
Differential Revision: https://reviews.llvm.org/D92991
Introduce support for inlining into affine operations. This uses the generic
inline infrastructure and boils down to checking that, if applied, the inlining
doesn't violate the affine dimension/symbol value categorization. Given valid
IR, only the values that are valid dimensions/symbols thanks to being top-level
in their affine scope need special handling.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D92770
OperationFolder currently uses ConstantOp as a backup when trying to materialize a constant after an operation is folded. This dependency isn't really useful or necessary given that dialects can/should provide a `materializeConstant` implementation.
Fixes PR#44866
Differential Revision: https://reviews.llvm.org/D92980
This fixes a subtle bug where SCCP could incorrectly optimize a private callable while waiting for its arguments to be resolved.
Fixes PR#48457
Differential Revision: https://reviews.llvm.org/D92976
This fixes a crash when no elements are parsed, but the type expects at least one.
Fixes PR#47763
Differential Revision: https://reviews.llvm.org/D92982
This was missed when supported for unsigned/signed integer types was first added, and results in crashes if a user tries to create/print a constant with the incorrect integer type.
Fixes PR#46222
Differential Revision: https://reviews.llvm.org/D92981
- use ConvertOpToLLVMPattern to avoid explicit casting and in most cases the
constructor can be reused to save a few lines of code.
Differential Revision: https://reviews.llvm.org/D92989
The current implementation of the translation to LLVM IR relies on the
existence of a one-to-one mapping between MLIR blocks and LLVM IR basic blocks
in order to configure PHI nodes with appropriate source blocks. The one-to-one
mapping model is broken in presence of OpenMP operations that use LLVM's
OpenMPIRBuilder, which produces multiple blocks under the hood. This can lead
to invalid LLVM IR being emitted if OpenMPIRBuilder moved the branch operation
into a basic block different from the one it was originally created in;
specifically, a block that is not a direct predecessor could be used in the PHI
node. Instead, keep track of the mapping between MLIR LLVM dialect branch
operations and their LLVM IR counterparts and take the parent basic block of
the LLVM IR instruction at the moment of connecting the PHI nodes to
predecessors.
This behavior cannot be triggered as of now, but will be once we introduce the
conversion of OpenMP workshare loops.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D92845
The documentation has become a bit stale with age, and doesn't include great documentation for some newer concepts. This revision tidies up a majority of it, with some more cleanup to come in the future. The documentation for the declarative specification is also moved from OpDefinitions.md to Interfaces.md, which is a much more logical place for it to live.
Differential Revision: https://reviews.llvm.org/D92895
- Remove some unused types from the Shape dialect
- Fix from_extent_tensor to only allow 1D index tensors
- Fix assuming_yield to only allow shape.assuming as the parent op.
- Fix some documentation typos and reword some things.
Differential Revision: https://reviews.llvm.org/D92901
This is to prevent assertion failures on scf.if and shape.assuming
operations where this is not enough information currently to handle any
aliasing information.
Differential Revision: https://reviews.llvm.org/D92963
This patch fixes a bug that allowed vectorizing of loops with loads and
stores having indexing functions varying along different memory
dimensions.
Reviewed By: aartbik, dcaballe
Differential Revision: https://reviews.llvm.org/D92702
Update all reference from the specification to the new OpenACC 3.1
document.
Reviewed By: SouraVX
Differential Revision: https://reviews.llvm.org/D92120
Add an option to pass the number of worker threads to select the number of async regions for parallel for transformation.
```
std::unique_ptr<OperationPass<FuncOp>> createAsyncParallelForPass(int numWorkerThreads);
```
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D92835
This reverts commit 02c9050155.
Painted myself into a corner with -Wvirtual_overload, private access, and final.
Differential Revision: https://reviews.llvm.org/D92855
This is needed so a listener hears all changes during the dialect
conversion to allow correct rollbacks upon failure.
Differential Revision: https://reviews.llvm.org/D92685
In RewritePattern, only expose `matchAndRewrite` as a public function. `match` can be protected (but needs to be protected because we want to call it from an override of `matchAndRewrite`). `rewrite` can be private.
For classes deriving from RewritePattern, all 3 functions can be private.
Side note: I didn't understand the need for the `using RewritePattern::matchAndRewrite` in derived classes, and started poking around. They are gone now, and I think the result is (only very slightly) cleaner.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D92670
This commit adds initial support for SPIR-V OpSpecConstantOp
instruction. The following is introdcued:
- A new `spv.specConstantOperation` operation consisting of a single
region and of 2 operations within that regions (more details in the
docs of the op itself).
- A new `spv.yield` instruction that acts a terminator for
`spv.specConstantOperation`.
For now, the generic form of the new op is supported (i.e. no custom
parsing or printing). This will be done in a follow up patch.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D92232
Add a pass option to control the number of nested parallel loops produced by
the parallelization passes. This is useful to build end-to-end passes targeting
systems that don't need multiple parallel dimensions (e.g., CPUs typically need
only one).
Reviewed By: wsmoses, chelini
Differential Revision: https://reviews.llvm.org/D92765
The existing implementation of the affine parallelization silently copies over
the lower and upper bound maps from affine.for to affine.parallel. However, the
semantics of these maps differ between these two ops: in affine.for, a max(min)
of results is taken for the lower(upper) bound; in affine.parallel, multiple
induction variables can be defined an each result corresponds to one induction
variable. Thus the existing implementation could generate invalid IR or IR that
passes the verifier but has different semantics than the original code. Fix the
parallelization utility to emit dedicated min/max operations before the
affine.parallel in such cases. Disallow parallelization if min/max would have
been in an operation without the AffineScope trait, e.g., in another loop,
since the result of these operations is not considered a valid affine dimension
identifier and may not be properly handled by the affine analyses.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D92763
The default exception handling isn't very user friendly and does not
point accurately to the issue. Instead we can indicate which of the
operands isn't valid and provide contextual information in the error
message.
Differential Revision: https://reviews.llvm.org/D92710
After bufferization, the backend has much more trouble hoisting loop invariant
loads from the loops generated by the sparse compiler. Therefore, this is done
during sparse code generation. Note that we don't bother hoisting derived
invariant expressions on SSA values, since the backend does that very well.
Still TBD: scalarize reductions to avoid load-add-store cycles
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D92534
Add support to normalize affine.for ops i.e., convert the lower bound to zero
and loop step to one. The Upper bound is set to the trip count of the loop.
The exact value of loopIV is calculated just inside the body of affine.for.
Currently loops with lower bounds having single result are supported. No such
restriction exists on upper bounds.
Differential Revision: https://reviews.llvm.org/D92233
Some Ops in OMP dialect have regions associated with them i.e
`ParallelOp` `MasterOp`. Lowering of these regions involves interfacing
with `OMPIRBuilder` using callbacks, yet there still exist opportunities
for sharing common code in between.
This patch factors out common code into a separate function and adds
support for lowering `MasterOp` using that. Lowering of `ParallelOp` is
also modified appropriately.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D87247
Trailing objects are really nice for storing additional data inline with the main class, and is something that we heavily take advantage of for Operation(and many other classes). To get the address of the inline data you need to compute the address by doing some pointer arithmetic taking into account any objects stored before the object you want to access. Most classes keep the count of the number of objects, so this is relatively cheap to compute. This is not the case for results though, which have two different types(inline and trailing) that are not necessarily as cheap to compute as the count for other objects. This revision moves the storage for results to before the operation and stores them in reverse order. This allows for getting results to still be very fast given that they are never iterated directly in order, and also greatly improves the speed when accessing the other trailing objects of an operation(operands/regions/blocks/etc).
This reduced compile time when compiling a decently sized mlir module by about ~400ms, or 2.17s -> 1.76s.
Differential Revision: https://reviews.llvm.org/D92687
The check for formatting enum attributes was missing a call to get the base attribute, which is necessary to strip off the top-level OptionalAttr<> wrapper.
Differential Revision: https://reviews.llvm.org/D92713
Make UnrollVectorPattern inherit from RewritePattern instead of
OpRewritePattern so that we don't need to create many patterns when applying to
many different type of ops. Since we may want to apply the pattern to all
arithmetic op, it is more convenient to filter dynamically.
Differential Revision: https://reviews.llvm.org/D92635
- Instead of hardcoding the parameters and return types of 'inferReturnTypes', use the
InferTypeOpInterface trait to generate the method declaration.
- Fix InferTypeOfInterface to use fully qualified type for inferReturnTypes results.
Differential Revision: https://reviews.llvm.org/D92585
In the past, the reshape op can be folded only if the indexing map is
permutation in consumer's usage. We can relax to condition to be projected
permutation.
This patch still limits the fusion for scalar cases. Scalar case is a corner
case, because we need to decide where to put extra dims.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D92466
This is part of a larger refactoring the better congregates the builtin structures under the BuiltinDialect. This also removes the problematic "standard" naming that clashes with the "standard" dialect, which is not defined within IR/. A temporary forward is placed in StandardTypes.h to allow time for downstream users to replaced references.
Differential Revision: https://reviews.llvm.org/D92435
The definitions of ModuleOp and FuncOp are now within BuiltinOps.h, making the individual files obsolete.
Differential Revision: https://reviews.llvm.org/D92622
A separate AVX512 lowering pass does not compose well with the regular
vector lowering pass. As such, it is at risk of code duplication and
lowering inconsistencies. This change removes the separate AVX512 lowering
pass and makes it an "option" in the regular vector lowering pass
(viz. vector dialect "augmented" with AVX512 dialect).
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D92614
This was important when ModuleOp was the only top level operation, but that isn't necessarily the case anymore. This is one of the last remaining aspects of the infrastructure that is hardcoded to ModuleOp.
Differential Revision: https://reviews.llvm.org/D92605
This was a somewhat important restriction in the past when ModuleOp was distinctly the top-level container operation, as well as before the pass manager had support for running nested pass managers natively. With these two issues fading away, there isn't really a good reason to enforce that a ModuleOp is the thing running within a pass manager. As such, this revision removes the restriction and allows for users to pass in the name of the operation that the pass manager will be scheduled on.
The only remaining dependency on BuiltinOps from Pass after this revision is due to FunctionPass, which will be resolved in a followup revision.
Differential Revision: https://reviews.llvm.org/D92450
There isn't a good reason for anything within IR to specifically reference any of the builtin operations. The only place that had a good reason in the past was AsmPrinter, but the behavior there doesn't need to hardcode ModuleOp anymore.
Differential Revision: https://reviews.llvm.org/D92448
Add support for vectorization for linalg.generic representing element-wise ops.
Those are converted to transfer_read + vector ops + transfer_write.
Also re-organize the vectorization tests to be together.
Implementation derived from the work of @burmako, @agrue and
@fedelebron.
Differential Revision: https://reviews.llvm.org/D92540
This fixes the build on gcc5 toolchain where sizeof is required for
types used in SmallVector now.
This is a consequence of using std::is_trivially_copy_constructible
instead of the LLVM variant: https://reviews.llvm.org/D92543
This reduces the chances of segfault. While it is a good practice to ensure
robust custom printers, it is unfortunately common to have them crash on
invalid input.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D92536
Memrefs with affine_map in the results of normalizable operation were
not normalized by `--normalize-memrefs` option. This patch normalizes
them.
Differential Revision: https://reviews.llvm.org/D88719
Extended promote buffers to stack pass to support dynamically shaped allocas.
The conversion is limited by the rank of the underlying tensor.
An option is added to the pass to adjust the given rank.
Differential Revision: https://reviews.llvm.org/D91969
Given that OpState already implicit converts to Operator*, this seems reasonable.
The alternative would be to add more functions to OpState which forward to Operation.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D92266
OpenMPIRBuilder::createParallel outlines the body region of the parallel
construct into a new function that accepts any value previously defined outside
the region as a function argument. This function is called back by OpenMP
runtime function __kmpc_fork_call, which expects trailing arguments to be
pointers. If the region uses a value that is not of a pointer type, e.g. a
struct, the produced code would be invalid. In such cases, make createParallel
emit IR that stores the value on stack and pass the pointer to the outlined
function instead. The outlined function then loads the value back and uses as
normal.
Reviewed By: jdoerfert, llitchev
Differential Revision: https://reviews.llvm.org/D92189
The test process of the ir_array_attributes.py depends on numpy. This patch checks numpy in Python bindings configuration.
- Add NumPy in find_package as a required component to check numpy.
- If numpy is found, print the version and include directory.
Differential Revision: https://reviews.llvm.org/D92276
PDL patterns are now supported via a new `PDLPatternModule` class. This class contains a ModuleOp with the pdl::PatternOp operations representing the patterns, as well as a collection of registered C++ functions for native constraints/creations/rewrites/etc. that may be invoked via the pdl patterns. Instances of this class are added to an OwningRewritePatternList in the same fashion as C++ RewritePatterns, i.e. via the `insert` method.
The PDL bytecode is an in-memory representation of the PDL interpreter dialect that can be efficiently interpreted/executed. The representation of the bytecode boils down to a code array(for opcodes/memory locations/etc) and a memory buffer(for storing attributes/operations/values/any other data necessary). The bytecode operations are effectively a 1-1 mapping to the PDLInterp dialect operations, with a few exceptions in cases where the in-memory representation of the bytecode can be more efficient than the MLIR representation. For example, a generic `AreEqual` bytecode op can be used to represent AreEqualOp, CheckAttributeOp, and CheckTypeOp.
The execution of the bytecode is split into two phases: matching and rewriting. When matching, all of the matched patterns are collected to avoid the overhead of re-running parts of the matcher. These matched patterns are then considered alongside the native C++ patterns, which rewrite immediately in-place via `RewritePattern::matchAndRewrite`, for the given root operation. When a PDL pattern is matched and has the highest benefit, it is passed back to the bytecode to execute its rewriter.
Differential Revision: https://reviews.llvm.org/D89107
- Change InferTypeOpInterface::inferResultTypes to use fully qualified types matching
the ones generated by genTypeInterfaceMethods, so the redundancy can be detected.
- Move genTypeInterfaceMethods() before genOpInterfaceMethods() so that the
inferResultTypes method generated by genTypeInterfaceMethods() takes precedence
over the declaration that might be generated by genOpInterfaceMethods()
- Modified an op in the test dialect to exercise this (the modified op would fail to
generate valid C++ code due to duplicate inferResultTypes methods).
Differential Revision: https://reviews.llvm.org/D92414
ExecutionEngine/LLJIT do not run globals destructors in loaded dynamic libraries when destroyed, and threads managed by ThreadPool can race with program termination, and it leads to segfaults.
TODO: Re-enable threading after fixing a problem with destructors, or removing static globals from dynamic library.
Differential Revision: https://reviews.llvm.org/D92368
- Address TODO in scf-bufferize: the argument materialization issue is
now fixed and the code is now in Transforms/Bufferize.cpp
- Tighten up finalizing-bufferize to avoid creating invalid IR when
operand types potentially change
- Tidy up the testing of func-bufferize, and move appropriate tests
to a new finalizing-bufferize.mlir
- The new stricter checking in finalizing-bufferize revealed that we
needed a DimOp conversion pattern (found when integrating into npcomp).
Previously, the converion infrastructure was blindly changing the
operand type during finalization, which happened to work due to
DimOp's tensor/memref polymorphism, but is generally not encouraged
(the new pattern is the way to tell the conversion infrastructure that
it is legal to change that type).