Analyze ops in a pseudo-random order to see if any assertions are triggered. Randomizing the order of analysis likely worsens the quality of the bufferization result (more out-of-place bufferizations). However, assertions should never fail, as that would indicate a problem with our implementation.
Differential Revision: https://reviews.llvm.org/D112581
This patch supports the atomic construct (read and write) following
section 2.17.7 of OpenMP 5.0 standard. Also added tests and
verifier for the same.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D111992
This also fixes the vector.shuffle C++ builder which had an incorrect type assumption that triggers with this new rewrite.
The vector.shuffle semantics were correct though.
Differential revision: https://reviews.llvm.org/D112578
This patch fixes:
mlir/lib/Transforms/Utils/DialectConversion.cpp:2775:5: error:
default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
by removing the default case. This way, the compiler should issue a
warning in the future when somebody adds a new enum value without a
corresponding case in the switch statement.
The current implementation invokes materializations
whenever an input operand does not have a mapping for the
desired type, i.e. it requires materialization at the earliest possible
point. This conflicts with goal of dialect conversion (and also the
current documentation) which states that a materialization is only
required if the materialization is supposed to persist after the
conversion process has finished.
This revision refactors this such that whenever a target
materialization "might" be necessary, we insert an
unrealized_conversion_cast to act as a temporary materialization.
This allows for deferring the invocation of the user
materialization hooks until the end of the conversion process,
where we actually have a better sense if it's actually
necessary. This has several benefits:
* In some cases a target materialization hook is no longer
necessary
When performing a full conversion, there are some situations
where a temporary materialization is necessary. Moving forward,
these users won't need to provide any target materializations,
as the temporary materializations do not require the user to
provide materialization hooks.
* getRemappedValue can now handle values that haven't been
converted yet
Before this commit, it wasn't well supported to get the remapped
value of a value that hadn't been converted yet (making it
difficult/impossible to convert multiple operations in many
situations). This commit updates getRemappedValue to properly
handle this case by inserting temporary materializations when
necessary.
Another code-health related benefit is that with this change we
can move a majority of the complexity related to materializations
to the end of the conversion process, instead of handling adhoc
while conversion is happening.
Differential Revision: https://reviews.llvm.org/D111620
Dyn-cast should be checked and bailed out if the dyn_cast failed.
Reviewed By: sjarus, NatashaKnk
Differential Revision: https://reviews.llvm.org/D112574
Rationale:
The currently used trait was demanding that all types are the same
which is not true (since the sparse part may change and the dim sizes
may be relaxed). This revision uses the correct trait and makes the
rank match test explicit in the verify method.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D112576
This refactoring adds a few "event" functions (start/end loop-seq/loop) for
readability of the core function of codegen. This also prepares sparse tensor
output codegen, where these "event" functions will provide convenient
placeholders to start or stop insertion bookkeeping.
This revision also includes a few various minor changes that kept on
pending in my local workspace.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D112506
Polynomial approximation can be extented to support N-d vectors.
N-dimensional vectors are useful when vectorizing operations on N-dimensional
tiles. Before lowering to LLVM these vectors are usually unrolled or flattened
to 1-dimensional vectors.
Differential Revision: https://reviews.llvm.org/D112566
Added a notification in the placeholder section. While writing things
like preciate of an attribute, we may embed certain placeholder in the C
expression. Note that the type of the placeholder is only guaranteed to
be the base type like mlir::Type, it's better not to use the derived
type which is based on the implementation.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112396
1.Combining kind min/max of Vector reduction op has been changed to
minf/maxf, minsi/maxsi, and minui/maxui. Modify getVectorReductionOp
accordingly.
2.Add min/max to supported reductions.
Reviewed By: dcaballe, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112246
Fix AffineExpr `getLargestKnownDivisor` for ceil/floor div cases.
In these cases, nothing can be inferred on the divisor of the
result.
Add test case for `mod` as well.
Differential Revision: https://reviews.llvm.org/D112523
The current behavior is conveniently allowing to iterate on the regions of an operation
implicitly by exposing an operation as Iterable. However this is also error prone and
code that may intend to iterate on the results or the operands could end up "working"
apparently instead of throwing a runtime error.
The lack of static type checking in Python contributes to the ambiguity here, it seems
safer to not do this and require and explicit qualification to iterate (`op.results`, `op.regions`, ...).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D111697
Specification specified the output type for quantized average pool should be
an i32. Only accumulator should be an i32, result type should match the input
type.
Caused in https://reviews.llvm.org/D111590
Reviewed By: sjarus, GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D112484
Using callbacks for allocation/deallocation allows users to override
the default.
Also add an option to comprehensive bufferization pass to use `alloca`
instead of `alloc`s. Note that this option is just for testing. The
option to use `alloca` does not work well with the option to allow for
returning memrefs.
Even though tensor.cast is not part of the sparse tensor dialect,
it may be used to cast static dimension sizes to dynamic dimension
sizes for sparse tensors without changing the actual sparse tensor
itself. Those cases should be lowered properly when replacing sparse
tensor types with their opaque pointers. Likewise, no op sparse
conversions are handled by this revision in a similar manner.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D112173
Using callbacks for allocation/deallocation allows users to override
the default.
Also add an option to comprehensive bufferization pass to use `alloca`
instead of `alloc`s. Note that this option is just for testing. The
option to use `alloca` does not work well with the option to allow for
returning memrefs.
Differential Revision: https://reviews.llvm.org/D112166
In several cases, operation result types can be unambiguously inferred from
operands and attributes at operation construction time. Stop requiring the user
to provide these types as arguments in the ODS-generated constructors in Python
bindings. In particular, handle the SameOperandAndResultTypes and
FirstAttrDerivedResultType traits as well as InferTypeOpInterface using the
recently added interface support. This is a significant usability improvement
for IR construction, similar to what C++ ODS provides.
Depends On D111656
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D111811
Introduce the initial support for operation interfaces in C API and Python
bindings. Interfaces are a key component of MLIR's extensibility and should be
available in bindings to make use of full potential of MLIR.
This initial implementation exposes InferTypeOpInterface all the way to the
Python bindings since it can be later used to simplify the operation
construction methods by inferring their return types instead of requiring the
user to do so. The general infrastructure for binding interfaces is defined and
InferTypeOpInterface can be used as an example for binding other interfaces.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D111656
Splitting the WsLoop tests they were getting harder to debug with the offsets over 100 for some of them.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D112407
This removes duplication and makes nesting more clear.
It also reduces the amount of changes necessary for exposing future options.
Differential revision: https://reviews.llvm.org/D112344
This allows to clear an OpPassManager and populated it again with a new
pipeline, while preserving all the other options (including instrumentations).
Differential Revision: https://reviews.llvm.org/D112393
This patch fixes a bug in implementation `mergeSymbolIds` where symbol
identifiers were not unique after merging them. Asserts for checking uniqueness
before and after the merge are also added. The asserts checking uniqueness
after the merge fail without the fix on existing test cases.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D111958
This removes duplication and makes nesting more clear.
It also reduces the amount of changes necessary for exposing future options.
Differential revision: https://reviews.llvm.org/D112344
This patch adds a polynomial approximation that matches the
approximation in Eigen.
Note that the approximation only applies to vectorized inputs;
the scalar rsqrt is left unmodified.
The approximation is protected with a flag since it emits an AVX2
intrinsic (generated via the X86Vector). This is the only reasonably
clean way that I could find to generate the exact approximation that
I wanted (i.e. an identical one to Eigen's).
I considered two alternatives:
1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet.
I believe this is because there is no definition of Rsqrt that
all backends could agree on, since hardware instructions that
implement it have widely varying degrees of precision.
This is something that the standard could mandate, but Rsqrt is
not part of IEEE754, so I don't think this option is feasible.
2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal
transformations. Although portable, this doesn't allow us
to generate exactly the code we want; it is the LLVM backend,
and not MLIR, who controls what code is generated based on the
target CPU.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D112192