Thus far IntegerType has been signless: a value of IntegerType does
not have a sign intrinsically and it's up to the specific operation
to decide how to interpret those bits. For example, std.addi does
two's complement arithmetic, and std.divis/std.diviu treats the first
bit as a sign.
This design choice was made some time ago when we did't have lots
of dialects and dialects were more rigid. Today we have much more
extensible infrastructure and different dialect may want different
modelling over integer signedness. So while we can say we want
signless integers in the standard dialect, we cannot dictate for
others. Requiring each dialect to model the signedness semantics
with another set of custom types is duplicating the functionality
everywhere, considering the fundamental role integer types play.
This CL extends the IntegerType with a signedness semantics bit.
This gives each dialect an option to opt in signedness semantics
if that's what they want and helps code sharing. The parser is
modified to recognize `si[1-9][0-9]*` and `ui[1-9][0-9]*` as
signed and unsigned integer types, respectively, leaving the
original `i[1-9][0-9]*` to continue to mean no indication over
signedness semantics. All existing dialects are not affected (yet)
as this is a feature to opt in.
More discussions can be found at:
https://groups.google.com/a/tensorflow.org/d/msg/mlir/XmkV8HOPWpo/7O4X0Nb_AQAJ
Differential Revision: https://reviews.llvm.org/D72533
This change adds some missing arithmetic and logical operators to
`TemplatedIndexedValue` for EDSC usage.
Differential Revision: https://reviews.llvm.org/D74686
Summary: DenseElementsAttr is used to store tensor data, which in some cases can become extremely large(100s of mb). In these cases it is much more efficient to format the data as a string of hex values instead.
Differential Revision: https://reviews.llvm.org/D74922
This fixes test failures caused by a change to the name of the main
dylib, now called 'main'. It also hardens the engine against potential
future changes to the name.
Summary:
This implements the last step for lowering vector.contract progressively
to LLVM IR (except for masks). Multi-dimensional reductions that remain
after expanding all parallel dimensions are lowered into into simpler
vector.contract operations until a trivial 1-dim reduction remains.
Reviewers: nicolasvasilache, andydavis1
Reviewed By: andydavis1
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74880
Summary:
The current structure suffers from several problems, but the main one is that a construction failure is impossible to debug when using the 'get' methods. This is because we only optionally emit errors, so there is no context given to the user about the problem. This revision restructures this so that errors are always emitted, and the 'get' methods simply pass in an UnknownLoc to emit to. This allows for removing usages of the more constrained "emitOptionalLoc", as well as removing the need for the context parameter.
Fixes [PR#44964](https://bugs.llvm.org/show_bug.cgi?id=44964)
Differential Revision: https://reviews.llvm.org/D74876
Summary:
Lowers all free/batch dimensions in a vector.contract progressively
into simpler vector.contract operations until a direct vector.reduction
operation is reached. Then lowers 1-D reductions into vector.reduce.
Still TBD:
multi-dimensional contractions that remain after removing all the parallel dims
Reviewers: nicolasvasilache, andydavis1, rriddle
Reviewed By: andydavis1
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74797
It replaces DenseMap output with a SmallVector and it
removes empty loop levels from the output.
Reviewed By: andydavis1, mehdi_amini
Differential Revision: https://reviews.llvm.org/D74658
Summary: DenseElementsAttr stores float values as raw bits internally, so creating attributes just to have them unwrapped is extremely inefficient.
Differential Revision: https://reviews.llvm.org/D74818
Summary:
This trait takes three arguments: lhs, rhs, transformer. It verifies that the type of 'rhs' matches the type of 'lhs' when the given 'transformer' is applied to 'lhs'. This allows for adding constraints like: "the type of 'a' must match the element type of 'b'". A followup revision will add support in the declarative parser for using these equality constraints to port more c++ parsers to the declarative form.
Differential Revision: https://reviews.llvm.org/D74647
Summary:
This could trigger an assertion due to the block argument being used by
this block's own successor operands.
Reviewers: rriddle!
Subscribers: mehdi_amini, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74583
In some dialects, attributes may have default values that may be
determined only after shape inference. For example, attributes that
are dependent on the rank of the input cannot be assigned a default
value until the rank of the tensor is inferred.
While we can set attributes without explicit setters, referring to
the attributes via accessors instead of having to use the string
interface is better for compile time verification.
The proposed patch add one method per operation attribute that let us
set its value. The code is a very small modification of the existing
getter methods.
Differential Revision: https://reviews.llvm.org/D74143
Add an initial version of mlir-vulkan-runner execution driver.
A command line utility that executes a MLIR file on the Vulkan by
translating MLIR GPU module to SPIR-V and host part to LLVM IR before
JIT-compiling and executing the latter.
Differential Revision: https://reviews.llvm.org/D72696
Summary:
the .row.col variant turns out to be the popular one, contrary to what I
thought as .row.row. Since .row.col is so prevailing (as I inspect
cuDNN's behavior), I'm going to remove the .row.row support here, which
makes the patch a little bit easier.
Reviewers: ftynse
Subscribers: jholewinski, bixia, sanjoy.google, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74655
Summary:
This revision refactors the TypeConverter class to not use inheritance to add type conversions. It instead moves to a registration based system, where conversion callbacks are added to the converter with `addConversion`. This method takes a conversion callback, which must be convertible to any of the following forms(where `T` is a class derived from `Type`:
* Optional<Type> (T type)
- This form represents a 1-1 type conversion. It should return nullptr
or `llvm::None` to signify failure. If `llvm::None` is returned, the
converter is allowed to try another conversion function to perform
the conversion.
* Optional<LogicalResult>(T type, SmallVectorImpl<Type> &results)
- This form represents a 1-N type conversion. It should return
`failure` or `llvm::None` to signify a failed conversion. If the new
set of types is empty, the type is removed and any usages of the
existing value are expected to be removed during conversion. If
`llvm::None` is returned, the converter is allowed to try another
conversion function to perform the conversion.
When attempting to convert a type, the TypeConverter walks each of the registered converters starting with the one registered most recently.
Differential Revision: https://reviews.llvm.org/D74584
Fixing a bug where using a zero-rank shaped type operand to
linalg.generic ops hit an unrelated assert. This also meant that
lowering the operation to loops was not supported. Adding roundtrip
tests and lowering to loops test for zero-rank shaped type operand
with fixes to make the test pass.
Differential Revision: https://reviews.llvm.org/D74638
Summary: This class wraps around the various different ways to construct a range of Type, without forcing the materialization of that range into a contiguous vector.
Differential Revision: https://reviews.llvm.org/D74646
Summary:
Refer folks to the main website and make it explicit that the rendered
output is what is of interest and that the GitHub viewing experience may
not match (even though we are trying to keep it as close as possible, the
renderers do differ).
Differential Revision: https://reviews.llvm.org/D74739
The existing name is an artifact dating back to the times when we did not have
a dedicated TypeConverter infrastructure. It is also confusing with with the
name of classes using it.
Differential revision: https://reviews.llvm.org/D74707
We have one title in every doc which corresponds to `#`, in the some
there are multiple and it is expected to be h1 headers (visual elements
rather than organizational). Indent every nesting by one in all of the
docs with multiple titles.
Also fixing trailing whitespace.
Summary:
Linalg's promotion pass was only supporting f32 buffers due to how the
zero value was build for the `fill` operation.
Moreover, `promoteSubViewOperands` was returning a vector with one entry
per float subview while omitting integer subviews. For a program
with only integer subviews the return vector would be of size 0.
However, `promoteSubViewsOperands` would try to access a non zero
number of entries of this vector, resulting in a sefgault.
Reviewers: nicolasvasilache, ftynse
Reviewed By: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74532
Summary: getEncodedSourceLocation can be very costly to compute, especially if the input line becomes very long. This revision inlines some of the verification of a few `getChecked` methods to avoid the materialization of an encoded source location.
Differential Revision: https://reviews.llvm.org/D74587
This patch extends affine data copy optimization utility with an
optional memref filter argument. When the memref filter is used, data
copy optimization will only generate copies for such a memref.
Note: this patch is just porting the memref filter feature from Uday's
'hop' branch: https://github.com/bondhugula/llvm-project/tree/hop.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D74342
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
This pass would currently build, but fail to run when this backend isn't
linked in. On the other hand, we'd like it to initialize only the NVPTX
backend, which isn't possible if we continue to build it without the
backend available. Instead of building a broken configuration, let's
skip building the pass entirely.
Differential Revision: https://reviews.llvm.org/D74592
The commit switching the calling convention for memrefs (5a1778057)
inadvertently introduced a bug in the function argument attribute conversion:
due to incorrect indexing of function arguments it was not assigning the
attributes to the arguments beyond those generated from the first original
argument. This was not caught in the commit since the test suite does have a
test for converting multi-argument functions with argument attributes. Fix the
bug and add relevant tests.
Summary: This revision adds support to the declarative parser for formatting enum attributes in the symbolized form. It uses this new functionality to port several of the SPIRV parsers over to the declarative form.
Differential Revision: https://reviews.llvm.org/D74525
Summary:
This sets the basic framework for lowering vector.contract progressively
into simpler vector.contract operations until a direct vector.reduction
operation is reached. More details will be filled out progressively as well.
Reviewers: nicolasvasilache
Reviewed By: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74520
Implement a pass to convert gpu.launch_func op into a sequence of
Vulkan runtime calls. The Vulkan runtime API surface is huge so currently we
don't expose separate external functions in IR for each of them, instead we
expose a few external functions to wrapper libraries which manages
Vulkan runtime.
Differential Revision: https://reviews.llvm.org/D74549
Summary:
To unblock other work, this implements basic lowering based on mapping
attributes that have to be provided on all loop.parallel. The lowering
does not yet support reduce.
Differential Revision: https://reviews.llvm.org/D73893
Summary:
On some platforms the build fails "std::function is not found". The include is used in
PassManager::IRPrinterConfig::enableIRPrinting.
Differential Revision: https://reviews.llvm.org/D74469
In code generators, one can automate the translation of typed ArrayAttrs
if element attribute translators are already implemented. However, the
type of the element attribute is lost at the construction of
TypedArrayAttrBase. With this change one can inspect the element type
and generate the translation logic automatically, which will reduce the
code repetition.
Differential Revision: https://reviews.llvm.org/D73579
Summary:
As discussed in https://llvm.discourse.group/t/rfc-add-affine-parallel/350, this is the first in a series of patches to bring in support for the `affine.parallel` operation.
This first patch adds the IR representation along with custom printer/parser implementations.
Reviewers: bondhugula, herhut, mehdi_amini, nicolasvasilache, rriddle, earhart, jbruestle
Reviewed By: bondhugula, nicolasvasilache, rriddle, earhart, jbruestle
Subscribers: jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74288
This patch adapts the method MemRefDescriptor::fromStaticShape to
support static non-zero offsets. The updated method uses the
getStridesAndOffset method to extract strides and offset. The patch also
adapts the test cases since sizes and strides are now set in forward
instead of reverse order.
Differential Revision: https://reviews.llvm.org/D74474
This revision prepares the ground for declaratively defining Linalg "named" ops.
Such named ops form the backbone of operations that are ubiquitous in the ML
application domain.
This revision closely related to the definition of a "Tensor Computation
Primitives Dialect" and demonstrates that ops can be expressed as declarative
configurations of the `linalg.generic` op.
Differential Revision: https://reviews.llvm.org/D74491
Summary:
This revision allows model builder to create a linalg_matmul whose body
is a vector.contract. This shows the abstractions compose nicely.
Differential Revision: https://reviews.llvm.org/D74457
Summary: This was a missed case when ValueRange was originally added, and allows for constructing a ValueRange from the arguments of a block.
Differential Revision: https://reviews.llvm.org/D74363
A memref_cast casting to a memref with a non identity map can't be
lowered to llvm. Take the following case:
```
func @invalid_memref_cast(%arg0: memref<?x?xf64>) {
%c1 = constant 1 : index
%c0 = constant 0 : index
%5 = memref_cast %arg0 : memref<?x?xf64> to memref<?x?xf64, #map1>
%25 = std.subview %5[%c0, %c0][%c1, %c1][] : memref<?x?xf64, #map1> to memref<?x?xf64, #map1>
return
}
```
When lowering the subview mlir was assuming `%5` to have an llvm type
(which is not the case as mlir failed to lower the memref_cast).
Differential Revision: https://reviews.llvm.org/D74466
Summary:
This was broken recently when moving from dialect registration via
static initializers to explicit intialization.
Differential Revision: https://reviews.llvm.org/D74480
Thus far we have been using builtin func op to model SPIR-V functions.
It was because builtin func op used to have special treatment in
various parts of the core codebase (e.g., pass pipelines, etc.) and
it's easy to bootstrap the development of the SPIR-V dialect. But
nowadays with general op concepts and region support we don't have
such limitations and it's time to tighten the SPIR-V dialect for
completeness.
This commits introduces a spv.func op to properly model SPIR-V
functions. Compared to builtin func op, it can provide the following
benefits:
* We can control the full op so we can integrate SPIR-V information
bits (e.g., function control) in a more integrated way and define
our own assembly form and enforcing better verification.
* We can have a better dialect and library boundary. At the current
moment only functions are modelled with an external op. With this
change, all ops modelling SPIR-V concpets will be spv.* ops and
registered to the SPIR-V dialect.
* We don't need to special-case func op anymore when creating
ConversionTarget declaring SPIR-V dialect as legal. This is quite
important given we'll see more and more conversions in the future.
In the process, bumps a few FuncOp methods to the FunctionLike trait.
Differential Revision: https://reviews.llvm.org/D74226
In the previous state, we were relying on forcing the linker to include
all libraries in the final binary and the global initializer to self-register
every piece of the system. This change help moving away from this model, and
allow users to compose pieces more freely. The current change is only "fixing"
the dialect registration and avoiding relying on "whole link" for the passes.
The translation is still relying on the global registry, and some refactoring
is needed to make this all more convenient.
Differential Revision: https://reviews.llvm.org/D74461
* Rename CMake target MLIROptMain to MLIROptLib:
The target provides the main library
* Rename CMake target MLIRMlirOptLib to MLIRMlirOptMain:
The target provides the main() entry function
At the moment, the Bazel configuration of TenorFlow maps the target
MlirOptLib to "lib/Support/MlirOptMain.cpp" and MlirOptMain to
"tools/mlir-opt/mlir-opt.cpp". This is the other way around in the CMake
configuration. As discussed in the context of the pull request
https://github.com/tensorflow/tensorflow/pull/36301, it seems useful to
revise the naming in the MLIR repo.
Differential Revision: https://reviews.llvm.org/D73778
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Includes ASAN bug fix for D73190.
Reviewers: bondhugula, dcaballe
Reviewed By: bondhugula, dcaballe
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74330
Follow-up on D72802. Turn -convert-std-to-llvm-use-alloca and
-convert-std-to-llvm-bare-ptr-memref-call-conv into pass flags
of LLVMLoweringPass.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D73912
Defines a tablegen class RankedIntElementsAttr. This is an integer
version of RankedFloatElementsAttr.
Differential Revision: https://reviews.llvm.org/D73764
Summary:
The lowering to NVVM and ROCm handles tanh operations differently by
mapping them to NVVM/ROCm specific intrinsics. This conflicts with
the lowering to LLVM, which uses the default llvm intrinsic. This change
declares the LLVM intrinsics to be illegal, hence disallowing the
correspondign rewrite.
Differential Revision: https://reviews.llvm.org/D74389
We have spv.entry_point_abi for specifying the local workgroup size.
It should be decorated onto input gpu.func ops to drive the SPIR-V
CodeGen to generate the proper SPIR-V module execution mode. Compared
to using command-line options for specifying the configuration, using
attributes also has the benefits that 1) we are now able to use
different local workgroup for different entry points and 2) the
tests contains the configuration directly.
Differential Revision: https://reviews.llvm.org/D74012
Summary:
After D72555 has been landed, `linalg.indexed_generic` also accepts ranked
tensor as input and output. Add a test for it.
Differential Revision: https://reviews.llvm.org/D74267
Summary:
This revision adds EDSC support for VectorOps to enable the creation of a `vector_matmul` declaratively. The `vector_matmul` is a simple configuration
of the `vector.contract` op that follows the StructuredOps abstraction.
Differential Revision: https://reviews.llvm.org/D74284
This CL refactors EDSCs to layer them better and break unnecessary
dependencies. After this refactoring, the top-level EDSC target only
depends on IR but not on Dialects anymore and each dialect has its
own EDSC directory.
This simplifies the layering and breaks cyclic dependencies.
In particular, the declarative builder + folder are made explicit and
are now confined to Linalg.
As the refactoring occurred, certain classes and abstractions that were not
paying for themselves have been removed.
Differential Revision: https://reviews.llvm.org/D74302
The current standard to llvm conversion pass lowers subview ops only if
dynamic offsets are provided. This commit extends the lowering with a
code path that uses the constant offset of the target memref for the
subview op lowering (see Example 3 of the subview op definition for an
example) if no dynamic offsets are provided.
Differential Revision: https://reviews.llvm.org/D74280
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.
Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.
Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.
The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.
This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
The existing lowering of gpu.block_dim added a global variable with
the WorkGroupSize decoration. This raises an error within
Vulkan/SPIR-V validation since Vulkan requires this to have a constant
initializer. This is not yet supported in SPIR-V dialect. Changing the
lowering to return the workgroup size as a constant value instead,
obtained from spv.entry_point_abi attribute gets around the issue for
now. The validation goes through since the workgroup size is specified
using spv.execution_mode operation.
This revision adds support in the declarative assembly form for printing attributes with buildable types without the type, and moves several more parsers over to the declarative form.
Differential Revision: https://reviews.llvm.org/D74276
Summary:
This revision adds a utility to generate debug locations from the IR during compilation, by snapshotting to a output stream and using the locations that operations were dumped in that stream. The new locations may either;
* Replace the original location of the operation.
old:
loc("original_source.cpp":1:1)
new:
loc("snapshot_source.mlir":10:10)
* Fuse with the original locations as NamedLocs with a specific tag.
old:
loc("original_source.cpp":1:1)
new:
loc(fused["original_source.cpp":1:1, "snapshot"("snapshot_source.mlir":10:10)])
This feature may be used by a debugger to display the code at various different levels of the IR. It would also be able to show the different levels of IR attached to a specific source line in the original source file.
This feature may also be used to generate locations for operations generated during compilation, that don't necessarily have a user source location to attach to.
This requires changes in the printer to track the locations of operations emitted in the stream. Moving forward we need to properly(and efficiently) track the number of newlines emitted to the stream during printing.
Differential Revision: https://reviews.llvm.org/D74019
Summary: This is the most common operation performed on a CallOpInterface. This just moves the existing functionality from the CallGraph so that other users can access it.
Differential Revision: https://reviews.llvm.org/D74250
Summary: This document provides insight on the rationale and the design of Symbols in MLIR, and why they are necessary.
Differential Revision: https://reviews.llvm.org/D73590
Summary:
This revision adds support for printing pass options as part of the normal help description. This also moves registered passes and pipelines into different sections of the help.
Example:
```
Compiler passes to run
--pass-pipeline - ...
Passes:
--affine-data-copy-generate - ...
--convert-gpu-to-spirv - ...
--workgroup-size=<long> - ...
--test-options-pass - ...
--list=<int> - ...
--string=<string> - ...
--string-list=<string> - ...
Pass Pipelines:
--test-options-pass-pipeline - ...
--list=<int> - ...
--string=<string> - ...
--string-list=<string> - ...
```
Differential Revision: https://reviews.llvm.org/D74246
Summary:
The `vector.fma` operation is portable enough across targets that we do not want
to keep it wrapped under `vector.outerproduct` and `llvm.intrin.fmuladd`.
This revision lifts the op into the vector dialect and implements the lowering to LLVM by using two patterns:
1. a pattern that lowers from n-D to (n-1)-D by unrolling when n > 2
2. a pattern that converts from 1-D to the proper LLVM representation
Reviewers: ftynse, stellaraccident, aartbik, dcaballe, jsetoain, tetuante
Reviewed By: aartbik
Subscribers: fhahn, dcaballe, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74075
Summary:
This revision exposes the portable `llvm.fma` intrinsic in LLVMOps and uses it
in lieu of `llvm.fmuladd` when lowering the `vector.outerproduct` op to LLVM.
This guarantees proper `fma` instructions will be emitted if the target ISA
supports it.
`llvm.fmuladd` does not have this guarantee in its semantics, despite evidence
that the proper x86 instructions are emitted.
For more details, see https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic.
Reviewers: ftynse, aartbik, dcaballe, fhahn
Reviewed By: aartbik
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74219
The initial implementation of the fusion operation exposes a method to
fuse a consumer with its producer, when
- both the producer and consumer operate on tensors
- the producer has only a single result value
- the producer has only "parallel" iterator types
A new interface method hasTensorSemantics is added to verify that an
operation has all operands and results of type RankedTensorType.
Differential Revision: https://reviews.llvm.org/D74172
Summary: In some edge cases the default APFloat printer will generate something that we can't parse back in. In these cases, fallback to using hex instead.
Differential Revision: https://reviews.llvm.org/D74181
Technically a leak in tblgen is harmless, but this makes asan builds of
mlir very noisy. Just use a SpecificBumpPtrAllocator that knows how to
clean up after itself.
This reverts commit 64871f778d.
ASAN indicates a use-after-free in in mlir::canFuseLoops(mlir::AffineForOp, mlir::AffineForOp, unsigned int, mlir::ComputationSliceState*) lib/Transforms/Utils/LoopFusionUtils.cpp:202:41
mlir-opt needs to link against MLIRLoopAnalysis
This shouldn't be needed but MLIR "hack" for
"whole-archive" linking is not compatible with
CMake transitive dependencies management.
Differential Revision: https://reviews.llvm.org/D74097
Summary:
This revision adds basic support for emitting line table information when exporting to LLVMIR. We don't yet have a story for supporting all of the LLVM debug metadata, so this revision stubs some features(like subprograms) to enable emitting line tables.
Differential Revision: https://reviews.llvm.org/D73934
Summary:
Previously, vector.contract did not allow an empty set of
free or batch dimensions (K = 0) which defines a basic
reduction into a scalar (like a dot product). This CL
relaxes that restriction. Also adds constraints on
element type of operands and results. With tests.
Reviewers: nicolasvasilache, andydavis1, rriddle
Reviewed By: andydavis1
Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74014
Summary:
[mlir][VectorOps] Support vector transfer_read/write unrolling for memrefs with vector element type. When unrolling vector transfer read/write on memrefs with vector element type, the indices used to index the memref argument must be updated to reflect the unrolled operation. However, in the case of memrefs with vector element type, we need to be careful to only update the relevant memref indices.
For example, a vector transfer read with the following source/result types, memref<6x2x1xvector<2x4xf32>>, vector<2x1x2x4xf32>, should only update memref indices 1 and 2 during unrolling.
Reviewers: nicolasvasilache, aartbik
Reviewed By: nicolasvasilache, aartbik
Subscribers: lebedev.ri, Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72965
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Reviewers: bondhugula, dcaballe, nicolasvasilache
Reviewed By: bondhugula, dcaballe
Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73190
Summary:
Add ShapeCastOp to the vector ops dialect.
The shape_cast operation casts between an n-D source vector shape and a k-D result vector shape (the element type remains the same).
Reviewers: nicolasvasilache, aartbik
Reviewed By: nicolasvasilache
Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73635
Summary: Optional regions are supported in the generic op print/parse form, update the docs to match.
Differential Revision: https://reviews.llvm.org/D74061
Summary:
MLIRAnalysis depended on MLIRVectorOps
MLIRVectorOps depended on MLIRAnalysis for Loop information.
Both of these can be solved by factoring out libraries related to loop
analysis into their own library. The new MLIRLoopAnalysis might be
better off with the Loop Dialect in the future.
Reviewers: nicolasvasilache, rriddle!, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: Joonsoo, vchuravy, merge_guards_bot, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73655
Summary:
This breaks a cyclic library dependency where MLIRPass used the verifier
in MLIRAnalysis, but MLIRAnalysis also contained passes used for testing.
The presence of the test passes here is archaeology, predating
test/lib/Transform.
Reviewers: rriddle
Reviewed By: rriddle
Subscribers: merge_guards_bot, mgorny, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74067
The recent refactoring of build files broke building with the MIR CUDA
integration enabled. This fixes it by adding some additional
dependencies to mlir-opt.
Differential Revision: https://reviews.llvm.org/D74041
Summary:
It is often needed to map entire ranges rather than single values. To avoid
writing the same for loop every time, I have added an overload to the map
method.
Differential Revision: https://reviews.llvm.org/D73894
This binplaces `mlir-translate`, `mlir-cuda-runner`, and `mlir-cpu-runner` when building the CMake install target.
Differential Revision: https://reviews.llvm.org/D73986
We were using normal dictionary attribute for target environment
specification. It becomes cumbersome with more and more fields.
This commit changes the modelling to a dialect-specific attribute,
where we can have control over its storage and assembly form.
Differential Revision: https://reviews.llvm.org/D73959
This is fixing a build error:
error: non-constant-expression cannot be narrowed from type 'unsigned int' to 'Region::iterator::difference_type' (aka 'int') in initializer list
Fix pr44767
Summary:
This patch is a step towards enabling BUILD_SHARED_LIBS=on, which
builds most libraries as DLLs instead of statically linked libraries.
The main effect of this is that incremental build times are greatly
reduced, since usually only one library need be relinked in response
to isolated code changes.
The bulk of this patch is fixing incorrect usage of cmake, where library
dependencies are listed under add_dependencies rather than under
target_link_libraries or under the LINK_LIBS tag. Correct usage should be
like this:
add_dependencies(MLIRfoo MLIRfooIncGen)
target_link_libraries(MLIRfoo MLIRlib1 MLIRlib2)
A separate issue is that in cmake, dependencies between static libraries
are automatically included in dependencies. In the above example, if MLIBlib1
depends on MLIRlib2, then it is sufficient to have only MLIRlib1 in the
target_link_libraries. When compiling with shared libraries, it is necessary
to have both MLIRlib1 and MLIRlib2 specified if MLIRfoo uses symbols from both.
Reviewers: mravishankar, antiagainst, nicolasvasilache, vchuravy, inouehrs, mehdi_amini, jdoerfert
Reviewed By: nicolasvasilache, mehdi_amini
Subscribers: Joonsoo, merge_guards_bot, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73653
This commit adds two resource limits, max_compute_workgroup_size
and max_compute_workgroup_invocations as resource limits to
the target environment. They are not used at the current moment,
but they will affect the SPIR-V CodeGen. Adding for now to have
a proper target environment modelling.
Differential Revision: https://reviews.llvm.org/D73905
This revision makes sure that errors emitted outside of testing are treated as fatal errors. This avoids the current silent failures that occur when the format is invalid.
Summary:
Currently BuildableType is assumed to be preceded by a builder. This prevents constructing types that don't have a callable 'get' method with the builder. This revision reworks the format to be like attribute builders, i.e. by accepting $_builder within the format itself.
Differential Revision: https://reviews.llvm.org/D73736
Summary: This revision add support for accepting a few type constraints, e.g. AllTypesMatch, when inferring types for operands and results. This is used to remove the c++ parsers for several additional operations.
Differential Revision: https://reviews.llvm.org/D73735
Summary:
Replace the generic zero- and one-result builders in LLVM::CallOp with a custom
builder that takes an LLVMFuncOp, which can be used to extract the result type
and create the symbol reference attribute. This is merely a convenience for
upcoming changes. The ODS-generated builders remain present.
Introduce LLVM::LLVMType::isVoidTy by analogy with the underlying LLVM type.
Differential Revision: https://reviews.llvm.org/D73895
LinalgDependenceGraph was not updated after successful producer-consumer
fusion for linalg ops. In this patch it is fixed by reconstructing
LinalgDependenceGraph on every iteration. This is very ineffective and
should be improved by updating LDGraph only when it is necessary.
The patterns for converting `std.alloc` and `std.dealoc` can be configured to
use `llvm.alloca` instead of calling `malloc` and `free`. This configuration
has been only possible through a command-line flag, despite the presence of a
(misleading) parameter in the pass constructor. Use the parameter instead and
only initalize it from the command line flags if the pass is constructed from
the mlir-opt registration.
Summary:
These hooks were originally introduced to support passes deriving the
StandardToLLVM conversion, in particular converting types from different
dialects to LLVM types in a single-step conversion. They are no longer in use
since the pass and conversion infrastructure has evolved sufficiently to make
defining new passes with exactly the same functionality simple through the use
of populate* functions, conversion targets and type converters. Remove the
hooks. Any users of this hooks can call the dialect conversion infrastructure
directly instead, which is likely to require less LoC than these hooks.
Differential Revision: https://reviews.llvm.org/D73795
Summary:
In the original design, gpu.launch required explicit capture of uses
and passing them as operands to the gpu.launch operation. This was
motivated by infrastructure restrictions rather than design. This
change lifts the requirement and removes the concept of kernel
arguments from gpu.launch. Instead, the kernel outlining
transformation now does the explicit capturing.
This is a breaking change for users of gpu.launch.
Differential Revision: https://reviews.llvm.org/D73769
Summary:
Start filling in some requirements for the shape function descriptions
that will be used to derive shape computations. This requiement part may
later be reworked to be part of the "context" section of shape dialect. Without
examples this may be a bit too abstract but I hope not (given mappings to
existing shape functions).
Differential Revision: https://reviews.llvm.org/D73572
Seen on gcc 8, in release mode & assertions off warnings about logger,
made all statements referencing logger inside LLVM_DEBUG blocks and
ifdef a few variables only used in debug.
This is mechanical fix to get CI green.
Summary:
This patch introduces an alternative calling convention for
MemRef function arguments in LLVM dialect. It converts MemRef
function arguments to LLVM bare pointers to the MemRef element
type instead of creating a MemRef descriptor. Bare pointers are
then promoted to a MemRef descriptors at the beginning of the
function. This calling convention is only enabled with a flag.
Reviewers: ftynse, bondhugula, nicolasvasilache, rriddle, mehdi_amini
Reviewed By: ftynse, rriddle, mehdi_amini
Subscribers: Joonsoo, flaub, merge_guards_bot, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72802
This revision does the following post-commit cleanups:
1. don't use -1 magic constants,
2. drop commented out old test that does not belong here,
3. reformat and add a proper clang-format off on a CHECK directive.
Summary:
This revision beefs up the debug logging within dialect conversion. Given the nature of multi-level legalization, and legalization in general, it is one of the harder pieces of infrastructure to debug. This revision adds nice formatting to make the output log easier to parse:
```
Legalizing operation : 'std.constant'(0x608000002420) {
* Fold {
} -> FAILURE : unable to fold
* Pattern : 'std.constant -> ()' {
} -> FAILURE : pattern failed to match
* Pattern : 'std.constant -> ()' {
} -> FAILURE : pattern failed to match
* Pattern : 'std.constant -> (spv.constant)' {
** Insert : 'spv.constant'(0x608000002c20)
** Replace : 'std.constant'(0x608000002420)
//===-------------------------------------------===//
Legalizing operation : 'spv.constant'(0x608000002c20) {
} -> SUCCESS : operation marked legal by the target
//===-------------------------------------------===//
} -> SUCCESS : pattern applied successfully
} -> SUCCESS
```
Differential Revision: https://reviews.llvm.org/D73747
Summary:
Rationale:
When lowering to LLVM for different rank insert (n vs k), the offset
arrays needs to drop one dimension (becomes n-1), but the strides
array needs to be preserved (remains k). With regression test.
Note that this example was actually in the documentation, so
extra important to do it right :-)
Reviewers: nicolasvasilache, andydavis1, ftynse
Reviewed By: nicolasvasilache, ftynse
Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73733
Summary:
After the `subview` operation was migrated from Linalg to Standard, it changed
semantics and does not guarantee the absence of out-of-bounds accesses through
the created view anymore. Compute the size of the subview to make sure it
always fits within the view (subviews in last iterations of the loops may be
smaller than those in other iterations).
Differential Revision: https://reviews.llvm.org/D73614
This commit adds a pattern to lower linalg.generic for reduction
to spv.GroupNonUniform* ops. Right now this only supports integer
reduction on 1-D input memref. Shader entry point ABI is queried
to make sure that the input memref's shape matches the local
workgroup's invocation configuration. This makes sure that the
workload fits in one local workgroup so that we can leverage
SPIR-V group non-uniform operations.
linglg.generic is a structured op that preserves the right level
of information. It is easier to recognize reduction at this level
than performing analysis on loops.
This commit also exposes `getElementPtr` in SPIRVLowering.h given
that it's a generally useful utility function.
Differential Revision: https://reviews.llvm.org/D73437
Summary:
The current code assumes that one always maps at least one loop to block
dimensions and at least one loop to thread dimensions. If either is not
the case, a loop would get mapped twice.
Differential Revision: https://reviews.llvm.org/D73685
The refactored MemRefType::get() calls all intend to clone from another
memref type, with some modifications. In fact, some calls dropped memory space
during the cloning. Migrate them to the cloning API so that nothing gets
dropped if they are not explicitly listed.
It's close to NFC but not quite, as it helps with propagating memory spaces in
some places.
Differential Revision: https://reviews.llvm.org/D73296
Summary:
MLIR materializes various enumeration-based LLVM IR operands as enumeration
attributes using ODS. This requires bidirectional conversion between different
but very similar enums, currently hardcoded. Extend the ODS modeling of
LLVM-specific enumeration attributes to include the name of the corresponding
enum in the LLVM C++ API as well as the names of specific enumerants. Use this
new information to automatically generate the conversion functions between enum
attributes and LLVM API enums in the two-way conversion between the LLVM
dialect and LLVM IR proper.
Differential Revision: https://reviews.llvm.org/D73468
Summary:
This revision switches over many operations to use the declarative methods for defining the assembly specification. This updates operations in the NVVM, ROCDL, Standard, and VectorOps dialects.
Differential Revision: https://reviews.llvm.org/D73407
Summary:
This revision add support, and testing, for generating the parser and printer from the declarative operation format.
Differential Revision: https://reviews.llvm.org/D73406
Summary:
This is the first revision in a series that adds support for declaratively specifying the asm format of an operation. This revision
focuses solely on parsing the format. Future revisions will add support for generating the proper parser/printer, as well as
transitioning the syntax definition of many existing operations.
This was originally proposed here:
https://llvm.discourse.group/t/rfc-declarative-op-assembly-format/340
Differential Revision: https://reviews.llvm.org/D73405
Summary:
In some cases, one may want to use different names for C++ symbol of an
enumerand from its string representation. In particular, in the LLVM dialect
for, e.g., Linkage, we would like to preserve the same enumerand names as LLVM
API and the same textual IR form as LLVM IR, yet the two are different
(CamelCase vs snake_case with additional limitations on not being a C++
keyword).
Modify EnumAttrCaseInfo in OpBase.td to include both the integer value and its
string representation. By default, this representation is the same as C++
symbol name. Introduce new IntStrAttrCaseBase that allows one to use different
names. Exercise it for LLVM Dialect Linkage attribute. Other attributes will
follow as separate changes.
Differential Revision: https://reviews.llvm.org/D73362
Summary:
In the scope of the lowering phase from GPU to ROCDL, the intructions for the conversion patterns seems to be wrong.
According to https://github.com/ROCm-Developer-Tools/HIP/blob/master/include/hip/hcc_detail/math_fwd.h the instructions need two underscores in the beginning instead of one.
Reviewers: nicolasvasilache, herhut, rriddle
Reviewed By: herhut, rriddle
Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73535
Summary:
The 'gpu.terminator' operation is used as the terminator for the
regions of gpu.launch. This is to disambugaute them from the
return operation on 'gpu.func' functions.
This is a breaking change and users of the gpu dialect will need
to adapt their code when producting 'gpu.launch' operations.
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73620
StrEq has some magic inside that should do the explicit conversion from
StringRef to std::string, but apparently this doesn't work with GCC 5.
Just use EXPECT_EQ, it does the same thing with less magic.
Summary:
This will help catch improper use of the MLIR API's. In particular, this
catches an error that was manifesting as nondeterministic assertion
failures (the nondeterminism was due to the failure happening only when the
StorageUniquer's DenseMap's probing happened to compare two specific
keys).
No test. The fact that all the existing tests pass with this additional
invariant gives confidence that it is correct/useful.
Differential Revision: https://reviews.llvm.org/D73645
Summary:
Canonicalization and folding patterns in StandardOps may interfere with the needs
of Linalg. This revision introduces specific foldings for dynamic memrefs that can
be proven to be static.
Very concretely:
Determines whether it is possible to fold it away in the parent Linalg op:
```mlir
%1 = memref_cast %0 : memref<8x16xf32> to memref<?x?xf32>
%2 = linalg.slice %1 ... : memref<?x?xf32> ...
// or
%1 = memref_cast %0 : memref<8x16xf32, affine_map<(i, j)->(16 * i + j)>>
to memref<?x?xf32>
linalg.generic(%1 ...) : memref<?x?xf32> ...
```
into
```mlir
%2 = linalg.slice %0 ... : memref<8x16xf32> ...
// or
linalg.generic(%0 ... : memref<8x16xf32, affine_map<(i, j)->(16 * i + j)>>
```
Reviewers: ftynse, aartbik, jsetoain, tetuante, asaadaldien
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73565
Summary:
Barrier is a simple operation that takes no arguments and returns
nothing, but implies a side effect (synchronization of all threads)
Reviewers: jdoerfert
Subscribers: mgorny, guansong, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72400
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
This diff adds a transformation patter to rewrite linalg.fill as broadcasting a scaler into a vector.
It uses the same preconditioning as matmul (memory is contiguous).
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73391
Summary:
Operation represents all of the uses of each result with one use list, so manipulating the use list of a specific result requires filtering the main use list. This revision adds an optimization for the case of single result operations to avoid this filtering.
Differential Revision: https://reviews.llvm.org/D73430
Summary:
This also removes the explicit pattern for loop.terminator to ensure
that the terminator is only erased if the parent op is rewritten.
Reductions are not yet supported.
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73348
Summary:
The intrinsic operation added multiple type annotations to the llvm intrinsic operations, but only one is needed.
The related tests in llvmir-intrinsics.mlir checked the wrong number and are adjusted as well.
Reviewers: nicolasvasilache, ftynse
Reviewed By: ftynse
Subscribers: merge_guards_bot, ftynse, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73470
Summary:
The tanh lowering from Standard dialect to NVVM and ROCDL was not working.
The conversion pattern are inserted in the lowering files.
The test cases for the lowerings were added in the test files.
Reviewers: nicolasvasilache, ftynse, herhut
Reviewed By: ftynse, herhut
Subscribers: merge_guards_bot, ftynse, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73471