Summary:
Refer folks to the main website and make it explicit that the rendered
output is what is of interest and that the GitHub viewing experience may
not match (even though we are trying to keep it as close as possible, the
renderers do differ).
Differential Revision: https://reviews.llvm.org/D74739
The existing name is an artifact dating back to the times when we did not have
a dedicated TypeConverter infrastructure. It is also confusing with with the
name of classes using it.
Differential revision: https://reviews.llvm.org/D74707
We have one title in every doc which corresponds to `#`, in the some
there are multiple and it is expected to be h1 headers (visual elements
rather than organizational). Indent every nesting by one in all of the
docs with multiple titles.
Also fixing trailing whitespace.
Summary:
Linalg's promotion pass was only supporting f32 buffers due to how the
zero value was build for the `fill` operation.
Moreover, `promoteSubViewOperands` was returning a vector with one entry
per float subview while omitting integer subviews. For a program
with only integer subviews the return vector would be of size 0.
However, `promoteSubViewsOperands` would try to access a non zero
number of entries of this vector, resulting in a sefgault.
Reviewers: nicolasvasilache, ftynse
Reviewed By: ftynse
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74532
Summary: getEncodedSourceLocation can be very costly to compute, especially if the input line becomes very long. This revision inlines some of the verification of a few `getChecked` methods to avoid the materialization of an encoded source location.
Differential Revision: https://reviews.llvm.org/D74587
This patch extends affine data copy optimization utility with an
optional memref filter argument. When the memref filter is used, data
copy optimization will only generate copies for such a memref.
Note: this patch is just porting the memref filter feature from Uday's
'hop' branch: https://github.com/bondhugula/llvm-project/tree/hop.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D74342
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
This pass would currently build, but fail to run when this backend isn't
linked in. On the other hand, we'd like it to initialize only the NVPTX
backend, which isn't possible if we continue to build it without the
backend available. Instead of building a broken configuration, let's
skip building the pass entirely.
Differential Revision: https://reviews.llvm.org/D74592
The commit switching the calling convention for memrefs (5a1778057)
inadvertently introduced a bug in the function argument attribute conversion:
due to incorrect indexing of function arguments it was not assigning the
attributes to the arguments beyond those generated from the first original
argument. This was not caught in the commit since the test suite does have a
test for converting multi-argument functions with argument attributes. Fix the
bug and add relevant tests.
Summary: This revision adds support to the declarative parser for formatting enum attributes in the symbolized form. It uses this new functionality to port several of the SPIRV parsers over to the declarative form.
Differential Revision: https://reviews.llvm.org/D74525
Summary:
This sets the basic framework for lowering vector.contract progressively
into simpler vector.contract operations until a direct vector.reduction
operation is reached. More details will be filled out progressively as well.
Reviewers: nicolasvasilache
Reviewed By: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74520
Implement a pass to convert gpu.launch_func op into a sequence of
Vulkan runtime calls. The Vulkan runtime API surface is huge so currently we
don't expose separate external functions in IR for each of them, instead we
expose a few external functions to wrapper libraries which manages
Vulkan runtime.
Differential Revision: https://reviews.llvm.org/D74549
Summary:
To unblock other work, this implements basic lowering based on mapping
attributes that have to be provided on all loop.parallel. The lowering
does not yet support reduce.
Differential Revision: https://reviews.llvm.org/D73893
Summary:
On some platforms the build fails "std::function is not found". The include is used in
PassManager::IRPrinterConfig::enableIRPrinting.
Differential Revision: https://reviews.llvm.org/D74469
In code generators, one can automate the translation of typed ArrayAttrs
if element attribute translators are already implemented. However, the
type of the element attribute is lost at the construction of
TypedArrayAttrBase. With this change one can inspect the element type
and generate the translation logic automatically, which will reduce the
code repetition.
Differential Revision: https://reviews.llvm.org/D73579
Summary:
As discussed in https://llvm.discourse.group/t/rfc-add-affine-parallel/350, this is the first in a series of patches to bring in support for the `affine.parallel` operation.
This first patch adds the IR representation along with custom printer/parser implementations.
Reviewers: bondhugula, herhut, mehdi_amini, nicolasvasilache, rriddle, earhart, jbruestle
Reviewed By: bondhugula, nicolasvasilache, rriddle, earhart, jbruestle
Subscribers: jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74288
This patch adapts the method MemRefDescriptor::fromStaticShape to
support static non-zero offsets. The updated method uses the
getStridesAndOffset method to extract strides and offset. The patch also
adapts the test cases since sizes and strides are now set in forward
instead of reverse order.
Differential Revision: https://reviews.llvm.org/D74474
This revision prepares the ground for declaratively defining Linalg "named" ops.
Such named ops form the backbone of operations that are ubiquitous in the ML
application domain.
This revision closely related to the definition of a "Tensor Computation
Primitives Dialect" and demonstrates that ops can be expressed as declarative
configurations of the `linalg.generic` op.
Differential Revision: https://reviews.llvm.org/D74491
Summary:
This revision allows model builder to create a linalg_matmul whose body
is a vector.contract. This shows the abstractions compose nicely.
Differential Revision: https://reviews.llvm.org/D74457
Summary: This was a missed case when ValueRange was originally added, and allows for constructing a ValueRange from the arguments of a block.
Differential Revision: https://reviews.llvm.org/D74363
A memref_cast casting to a memref with a non identity map can't be
lowered to llvm. Take the following case:
```
func @invalid_memref_cast(%arg0: memref<?x?xf64>) {
%c1 = constant 1 : index
%c0 = constant 0 : index
%5 = memref_cast %arg0 : memref<?x?xf64> to memref<?x?xf64, #map1>
%25 = std.subview %5[%c0, %c0][%c1, %c1][] : memref<?x?xf64, #map1> to memref<?x?xf64, #map1>
return
}
```
When lowering the subview mlir was assuming `%5` to have an llvm type
(which is not the case as mlir failed to lower the memref_cast).
Differential Revision: https://reviews.llvm.org/D74466
Summary:
This was broken recently when moving from dialect registration via
static initializers to explicit intialization.
Differential Revision: https://reviews.llvm.org/D74480
Thus far we have been using builtin func op to model SPIR-V functions.
It was because builtin func op used to have special treatment in
various parts of the core codebase (e.g., pass pipelines, etc.) and
it's easy to bootstrap the development of the SPIR-V dialect. But
nowadays with general op concepts and region support we don't have
such limitations and it's time to tighten the SPIR-V dialect for
completeness.
This commits introduces a spv.func op to properly model SPIR-V
functions. Compared to builtin func op, it can provide the following
benefits:
* We can control the full op so we can integrate SPIR-V information
bits (e.g., function control) in a more integrated way and define
our own assembly form and enforcing better verification.
* We can have a better dialect and library boundary. At the current
moment only functions are modelled with an external op. With this
change, all ops modelling SPIR-V concpets will be spv.* ops and
registered to the SPIR-V dialect.
* We don't need to special-case func op anymore when creating
ConversionTarget declaring SPIR-V dialect as legal. This is quite
important given we'll see more and more conversions in the future.
In the process, bumps a few FuncOp methods to the FunctionLike trait.
Differential Revision: https://reviews.llvm.org/D74226
In the previous state, we were relying on forcing the linker to include
all libraries in the final binary and the global initializer to self-register
every piece of the system. This change help moving away from this model, and
allow users to compose pieces more freely. The current change is only "fixing"
the dialect registration and avoiding relying on "whole link" for the passes.
The translation is still relying on the global registry, and some refactoring
is needed to make this all more convenient.
Differential Revision: https://reviews.llvm.org/D74461
* Rename CMake target MLIROptMain to MLIROptLib:
The target provides the main library
* Rename CMake target MLIRMlirOptLib to MLIRMlirOptMain:
The target provides the main() entry function
At the moment, the Bazel configuration of TenorFlow maps the target
MlirOptLib to "lib/Support/MlirOptMain.cpp" and MlirOptMain to
"tools/mlir-opt/mlir-opt.cpp". This is the other way around in the CMake
configuration. As discussed in the context of the pull request
https://github.com/tensorflow/tensorflow/pull/36301, it seems useful to
revise the naming in the MLIR repo.
Differential Revision: https://reviews.llvm.org/D73778
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Includes ASAN bug fix for D73190.
Reviewers: bondhugula, dcaballe
Reviewed By: bondhugula, dcaballe
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74330
Follow-up on D72802. Turn -convert-std-to-llvm-use-alloca and
-convert-std-to-llvm-bare-ptr-memref-call-conv into pass flags
of LLVMLoweringPass.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D73912
Defines a tablegen class RankedIntElementsAttr. This is an integer
version of RankedFloatElementsAttr.
Differential Revision: https://reviews.llvm.org/D73764
Summary:
The lowering to NVVM and ROCm handles tanh operations differently by
mapping them to NVVM/ROCm specific intrinsics. This conflicts with
the lowering to LLVM, which uses the default llvm intrinsic. This change
declares the LLVM intrinsics to be illegal, hence disallowing the
correspondign rewrite.
Differential Revision: https://reviews.llvm.org/D74389
We have spv.entry_point_abi for specifying the local workgroup size.
It should be decorated onto input gpu.func ops to drive the SPIR-V
CodeGen to generate the proper SPIR-V module execution mode. Compared
to using command-line options for specifying the configuration, using
attributes also has the benefits that 1) we are now able to use
different local workgroup for different entry points and 2) the
tests contains the configuration directly.
Differential Revision: https://reviews.llvm.org/D74012
Summary:
After D72555 has been landed, `linalg.indexed_generic` also accepts ranked
tensor as input and output. Add a test for it.
Differential Revision: https://reviews.llvm.org/D74267
Summary:
This revision adds EDSC support for VectorOps to enable the creation of a `vector_matmul` declaratively. The `vector_matmul` is a simple configuration
of the `vector.contract` op that follows the StructuredOps abstraction.
Differential Revision: https://reviews.llvm.org/D74284
This CL refactors EDSCs to layer them better and break unnecessary
dependencies. After this refactoring, the top-level EDSC target only
depends on IR but not on Dialects anymore and each dialect has its
own EDSC directory.
This simplifies the layering and breaks cyclic dependencies.
In particular, the declarative builder + folder are made explicit and
are now confined to Linalg.
As the refactoring occurred, certain classes and abstractions that were not
paying for themselves have been removed.
Differential Revision: https://reviews.llvm.org/D74302
The current standard to llvm conversion pass lowers subview ops only if
dynamic offsets are provided. This commit extends the lowering with a
code path that uses the constant offset of the target memref for the
subview op lowering (see Example 3 of the subview op definition for an
example) if no dynamic offsets are provided.
Differential Revision: https://reviews.llvm.org/D74280
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.
Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.
Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.
The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.
This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
The existing lowering of gpu.block_dim added a global variable with
the WorkGroupSize decoration. This raises an error within
Vulkan/SPIR-V validation since Vulkan requires this to have a constant
initializer. This is not yet supported in SPIR-V dialect. Changing the
lowering to return the workgroup size as a constant value instead,
obtained from spv.entry_point_abi attribute gets around the issue for
now. The validation goes through since the workgroup size is specified
using spv.execution_mode operation.
This revision adds support in the declarative assembly form for printing attributes with buildable types without the type, and moves several more parsers over to the declarative form.
Differential Revision: https://reviews.llvm.org/D74276
Summary:
This revision adds a utility to generate debug locations from the IR during compilation, by snapshotting to a output stream and using the locations that operations were dumped in that stream. The new locations may either;
* Replace the original location of the operation.
old:
loc("original_source.cpp":1:1)
new:
loc("snapshot_source.mlir":10:10)
* Fuse with the original locations as NamedLocs with a specific tag.
old:
loc("original_source.cpp":1:1)
new:
loc(fused["original_source.cpp":1:1, "snapshot"("snapshot_source.mlir":10:10)])
This feature may be used by a debugger to display the code at various different levels of the IR. It would also be able to show the different levels of IR attached to a specific source line in the original source file.
This feature may also be used to generate locations for operations generated during compilation, that don't necessarily have a user source location to attach to.
This requires changes in the printer to track the locations of operations emitted in the stream. Moving forward we need to properly(and efficiently) track the number of newlines emitted to the stream during printing.
Differential Revision: https://reviews.llvm.org/D74019
Summary: This is the most common operation performed on a CallOpInterface. This just moves the existing functionality from the CallGraph so that other users can access it.
Differential Revision: https://reviews.llvm.org/D74250
Summary: This document provides insight on the rationale and the design of Symbols in MLIR, and why they are necessary.
Differential Revision: https://reviews.llvm.org/D73590
Summary:
This revision adds support for printing pass options as part of the normal help description. This also moves registered passes and pipelines into different sections of the help.
Example:
```
Compiler passes to run
--pass-pipeline - ...
Passes:
--affine-data-copy-generate - ...
--convert-gpu-to-spirv - ...
--workgroup-size=<long> - ...
--test-options-pass - ...
--list=<int> - ...
--string=<string> - ...
--string-list=<string> - ...
Pass Pipelines:
--test-options-pass-pipeline - ...
--list=<int> - ...
--string=<string> - ...
--string-list=<string> - ...
```
Differential Revision: https://reviews.llvm.org/D74246
Summary:
The `vector.fma` operation is portable enough across targets that we do not want
to keep it wrapped under `vector.outerproduct` and `llvm.intrin.fmuladd`.
This revision lifts the op into the vector dialect and implements the lowering to LLVM by using two patterns:
1. a pattern that lowers from n-D to (n-1)-D by unrolling when n > 2
2. a pattern that converts from 1-D to the proper LLVM representation
Reviewers: ftynse, stellaraccident, aartbik, dcaballe, jsetoain, tetuante
Reviewed By: aartbik
Subscribers: fhahn, dcaballe, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74075
Summary:
This revision exposes the portable `llvm.fma` intrinsic in LLVMOps and uses it
in lieu of `llvm.fmuladd` when lowering the `vector.outerproduct` op to LLVM.
This guarantees proper `fma` instructions will be emitted if the target ISA
supports it.
`llvm.fmuladd` does not have this guarantee in its semantics, despite evidence
that the proper x86 instructions are emitted.
For more details, see https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic.
Reviewers: ftynse, aartbik, dcaballe, fhahn
Reviewed By: aartbik
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74219
The initial implementation of the fusion operation exposes a method to
fuse a consumer with its producer, when
- both the producer and consumer operate on tensors
- the producer has only a single result value
- the producer has only "parallel" iterator types
A new interface method hasTensorSemantics is added to verify that an
operation has all operands and results of type RankedTensorType.
Differential Revision: https://reviews.llvm.org/D74172
Summary: In some edge cases the default APFloat printer will generate something that we can't parse back in. In these cases, fallback to using hex instead.
Differential Revision: https://reviews.llvm.org/D74181
Technically a leak in tblgen is harmless, but this makes asan builds of
mlir very noisy. Just use a SpecificBumpPtrAllocator that knows how to
clean up after itself.
This reverts commit 64871f778d.
ASAN indicates a use-after-free in in mlir::canFuseLoops(mlir::AffineForOp, mlir::AffineForOp, unsigned int, mlir::ComputationSliceState*) lib/Transforms/Utils/LoopFusionUtils.cpp:202:41
mlir-opt needs to link against MLIRLoopAnalysis
This shouldn't be needed but MLIR "hack" for
"whole-archive" linking is not compatible with
CMake transitive dependencies management.
Differential Revision: https://reviews.llvm.org/D74097
Summary:
This revision adds basic support for emitting line table information when exporting to LLVMIR. We don't yet have a story for supporting all of the LLVM debug metadata, so this revision stubs some features(like subprograms) to enable emitting line tables.
Differential Revision: https://reviews.llvm.org/D73934
Summary:
Previously, vector.contract did not allow an empty set of
free or batch dimensions (K = 0) which defines a basic
reduction into a scalar (like a dot product). This CL
relaxes that restriction. Also adds constraints on
element type of operands and results. With tests.
Reviewers: nicolasvasilache, andydavis1, rriddle
Reviewed By: andydavis1
Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74014
Summary:
[mlir][VectorOps] Support vector transfer_read/write unrolling for memrefs with vector element type. When unrolling vector transfer read/write on memrefs with vector element type, the indices used to index the memref argument must be updated to reflect the unrolled operation. However, in the case of memrefs with vector element type, we need to be careful to only update the relevant memref indices.
For example, a vector transfer read with the following source/result types, memref<6x2x1xvector<2x4xf32>>, vector<2x1x2x4xf32>, should only update memref indices 1 and 2 during unrolling.
Reviewers: nicolasvasilache, aartbik
Reviewed By: nicolasvasilache, aartbik
Subscribers: lebedev.ri, Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72965
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Reviewers: bondhugula, dcaballe, nicolasvasilache
Reviewed By: bondhugula, dcaballe
Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73190
Summary:
Add ShapeCastOp to the vector ops dialect.
The shape_cast operation casts between an n-D source vector shape and a k-D result vector shape (the element type remains the same).
Reviewers: nicolasvasilache, aartbik
Reviewed By: nicolasvasilache
Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73635
Summary: Optional regions are supported in the generic op print/parse form, update the docs to match.
Differential Revision: https://reviews.llvm.org/D74061
Summary:
MLIRAnalysis depended on MLIRVectorOps
MLIRVectorOps depended on MLIRAnalysis for Loop information.
Both of these can be solved by factoring out libraries related to loop
analysis into their own library. The new MLIRLoopAnalysis might be
better off with the Loop Dialect in the future.
Reviewers: nicolasvasilache, rriddle!, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: Joonsoo, vchuravy, merge_guards_bot, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73655
Summary:
This breaks a cyclic library dependency where MLIRPass used the verifier
in MLIRAnalysis, but MLIRAnalysis also contained passes used for testing.
The presence of the test passes here is archaeology, predating
test/lib/Transform.
Reviewers: rriddle
Reviewed By: rriddle
Subscribers: merge_guards_bot, mgorny, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74067
The recent refactoring of build files broke building with the MIR CUDA
integration enabled. This fixes it by adding some additional
dependencies to mlir-opt.
Differential Revision: https://reviews.llvm.org/D74041
Summary:
It is often needed to map entire ranges rather than single values. To avoid
writing the same for loop every time, I have added an overload to the map
method.
Differential Revision: https://reviews.llvm.org/D73894
This binplaces `mlir-translate`, `mlir-cuda-runner`, and `mlir-cpu-runner` when building the CMake install target.
Differential Revision: https://reviews.llvm.org/D73986
We were using normal dictionary attribute for target environment
specification. It becomes cumbersome with more and more fields.
This commit changes the modelling to a dialect-specific attribute,
where we can have control over its storage and assembly form.
Differential Revision: https://reviews.llvm.org/D73959
This is fixing a build error:
error: non-constant-expression cannot be narrowed from type 'unsigned int' to 'Region::iterator::difference_type' (aka 'int') in initializer list
Fix pr44767
Summary:
This patch is a step towards enabling BUILD_SHARED_LIBS=on, which
builds most libraries as DLLs instead of statically linked libraries.
The main effect of this is that incremental build times are greatly
reduced, since usually only one library need be relinked in response
to isolated code changes.
The bulk of this patch is fixing incorrect usage of cmake, where library
dependencies are listed under add_dependencies rather than under
target_link_libraries or under the LINK_LIBS tag. Correct usage should be
like this:
add_dependencies(MLIRfoo MLIRfooIncGen)
target_link_libraries(MLIRfoo MLIRlib1 MLIRlib2)
A separate issue is that in cmake, dependencies between static libraries
are automatically included in dependencies. In the above example, if MLIBlib1
depends on MLIRlib2, then it is sufficient to have only MLIRlib1 in the
target_link_libraries. When compiling with shared libraries, it is necessary
to have both MLIRlib1 and MLIRlib2 specified if MLIRfoo uses symbols from both.
Reviewers: mravishankar, antiagainst, nicolasvasilache, vchuravy, inouehrs, mehdi_amini, jdoerfert
Reviewed By: nicolasvasilache, mehdi_amini
Subscribers: Joonsoo, merge_guards_bot, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73653
This commit adds two resource limits, max_compute_workgroup_size
and max_compute_workgroup_invocations as resource limits to
the target environment. They are not used at the current moment,
but they will affect the SPIR-V CodeGen. Adding for now to have
a proper target environment modelling.
Differential Revision: https://reviews.llvm.org/D73905
This revision makes sure that errors emitted outside of testing are treated as fatal errors. This avoids the current silent failures that occur when the format is invalid.
Summary:
Currently BuildableType is assumed to be preceded by a builder. This prevents constructing types that don't have a callable 'get' method with the builder. This revision reworks the format to be like attribute builders, i.e. by accepting $_builder within the format itself.
Differential Revision: https://reviews.llvm.org/D73736
Summary: This revision add support for accepting a few type constraints, e.g. AllTypesMatch, when inferring types for operands and results. This is used to remove the c++ parsers for several additional operations.
Differential Revision: https://reviews.llvm.org/D73735
Summary:
Replace the generic zero- and one-result builders in LLVM::CallOp with a custom
builder that takes an LLVMFuncOp, which can be used to extract the result type
and create the symbol reference attribute. This is merely a convenience for
upcoming changes. The ODS-generated builders remain present.
Introduce LLVM::LLVMType::isVoidTy by analogy with the underlying LLVM type.
Differential Revision: https://reviews.llvm.org/D73895
LinalgDependenceGraph was not updated after successful producer-consumer
fusion for linalg ops. In this patch it is fixed by reconstructing
LinalgDependenceGraph on every iteration. This is very ineffective and
should be improved by updating LDGraph only when it is necessary.
The patterns for converting `std.alloc` and `std.dealoc` can be configured to
use `llvm.alloca` instead of calling `malloc` and `free`. This configuration
has been only possible through a command-line flag, despite the presence of a
(misleading) parameter in the pass constructor. Use the parameter instead and
only initalize it from the command line flags if the pass is constructed from
the mlir-opt registration.
Summary:
These hooks were originally introduced to support passes deriving the
StandardToLLVM conversion, in particular converting types from different
dialects to LLVM types in a single-step conversion. They are no longer in use
since the pass and conversion infrastructure has evolved sufficiently to make
defining new passes with exactly the same functionality simple through the use
of populate* functions, conversion targets and type converters. Remove the
hooks. Any users of this hooks can call the dialect conversion infrastructure
directly instead, which is likely to require less LoC than these hooks.
Differential Revision: https://reviews.llvm.org/D73795
Summary:
In the original design, gpu.launch required explicit capture of uses
and passing them as operands to the gpu.launch operation. This was
motivated by infrastructure restrictions rather than design. This
change lifts the requirement and removes the concept of kernel
arguments from gpu.launch. Instead, the kernel outlining
transformation now does the explicit capturing.
This is a breaking change for users of gpu.launch.
Differential Revision: https://reviews.llvm.org/D73769
Summary:
Start filling in some requirements for the shape function descriptions
that will be used to derive shape computations. This requiement part may
later be reworked to be part of the "context" section of shape dialect. Without
examples this may be a bit too abstract but I hope not (given mappings to
existing shape functions).
Differential Revision: https://reviews.llvm.org/D73572
Seen on gcc 8, in release mode & assertions off warnings about logger,
made all statements referencing logger inside LLVM_DEBUG blocks and
ifdef a few variables only used in debug.
This is mechanical fix to get CI green.
Summary:
This patch introduces an alternative calling convention for
MemRef function arguments in LLVM dialect. It converts MemRef
function arguments to LLVM bare pointers to the MemRef element
type instead of creating a MemRef descriptor. Bare pointers are
then promoted to a MemRef descriptors at the beginning of the
function. This calling convention is only enabled with a flag.
Reviewers: ftynse, bondhugula, nicolasvasilache, rriddle, mehdi_amini
Reviewed By: ftynse, rriddle, mehdi_amini
Subscribers: Joonsoo, flaub, merge_guards_bot, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72802
This revision does the following post-commit cleanups:
1. don't use -1 magic constants,
2. drop commented out old test that does not belong here,
3. reformat and add a proper clang-format off on a CHECK directive.
Summary:
This revision beefs up the debug logging within dialect conversion. Given the nature of multi-level legalization, and legalization in general, it is one of the harder pieces of infrastructure to debug. This revision adds nice formatting to make the output log easier to parse:
```
Legalizing operation : 'std.constant'(0x608000002420) {
* Fold {
} -> FAILURE : unable to fold
* Pattern : 'std.constant -> ()' {
} -> FAILURE : pattern failed to match
* Pattern : 'std.constant -> ()' {
} -> FAILURE : pattern failed to match
* Pattern : 'std.constant -> (spv.constant)' {
** Insert : 'spv.constant'(0x608000002c20)
** Replace : 'std.constant'(0x608000002420)
//===-------------------------------------------===//
Legalizing operation : 'spv.constant'(0x608000002c20) {
} -> SUCCESS : operation marked legal by the target
//===-------------------------------------------===//
} -> SUCCESS : pattern applied successfully
} -> SUCCESS
```
Differential Revision: https://reviews.llvm.org/D73747
Summary:
Rationale:
When lowering to LLVM for different rank insert (n vs k), the offset
arrays needs to drop one dimension (becomes n-1), but the strides
array needs to be preserved (remains k). With regression test.
Note that this example was actually in the documentation, so
extra important to do it right :-)
Reviewers: nicolasvasilache, andydavis1, ftynse
Reviewed By: nicolasvasilache, ftynse
Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73733
Summary:
After the `subview` operation was migrated from Linalg to Standard, it changed
semantics and does not guarantee the absence of out-of-bounds accesses through
the created view anymore. Compute the size of the subview to make sure it
always fits within the view (subviews in last iterations of the loops may be
smaller than those in other iterations).
Differential Revision: https://reviews.llvm.org/D73614
This commit adds a pattern to lower linalg.generic for reduction
to spv.GroupNonUniform* ops. Right now this only supports integer
reduction on 1-D input memref. Shader entry point ABI is queried
to make sure that the input memref's shape matches the local
workgroup's invocation configuration. This makes sure that the
workload fits in one local workgroup so that we can leverage
SPIR-V group non-uniform operations.
linglg.generic is a structured op that preserves the right level
of information. It is easier to recognize reduction at this level
than performing analysis on loops.
This commit also exposes `getElementPtr` in SPIRVLowering.h given
that it's a generally useful utility function.
Differential Revision: https://reviews.llvm.org/D73437
Summary:
The current code assumes that one always maps at least one loop to block
dimensions and at least one loop to thread dimensions. If either is not
the case, a loop would get mapped twice.
Differential Revision: https://reviews.llvm.org/D73685
The refactored MemRefType::get() calls all intend to clone from another
memref type, with some modifications. In fact, some calls dropped memory space
during the cloning. Migrate them to the cloning API so that nothing gets
dropped if they are not explicitly listed.
It's close to NFC but not quite, as it helps with propagating memory spaces in
some places.
Differential Revision: https://reviews.llvm.org/D73296
Summary:
MLIR materializes various enumeration-based LLVM IR operands as enumeration
attributes using ODS. This requires bidirectional conversion between different
but very similar enums, currently hardcoded. Extend the ODS modeling of
LLVM-specific enumeration attributes to include the name of the corresponding
enum in the LLVM C++ API as well as the names of specific enumerants. Use this
new information to automatically generate the conversion functions between enum
attributes and LLVM API enums in the two-way conversion between the LLVM
dialect and LLVM IR proper.
Differential Revision: https://reviews.llvm.org/D73468
Summary:
This revision switches over many operations to use the declarative methods for defining the assembly specification. This updates operations in the NVVM, ROCDL, Standard, and VectorOps dialects.
Differential Revision: https://reviews.llvm.org/D73407
Summary:
This revision add support, and testing, for generating the parser and printer from the declarative operation format.
Differential Revision: https://reviews.llvm.org/D73406
Summary:
This is the first revision in a series that adds support for declaratively specifying the asm format of an operation. This revision
focuses solely on parsing the format. Future revisions will add support for generating the proper parser/printer, as well as
transitioning the syntax definition of many existing operations.
This was originally proposed here:
https://llvm.discourse.group/t/rfc-declarative-op-assembly-format/340
Differential Revision: https://reviews.llvm.org/D73405
Summary:
In some cases, one may want to use different names for C++ symbol of an
enumerand from its string representation. In particular, in the LLVM dialect
for, e.g., Linkage, we would like to preserve the same enumerand names as LLVM
API and the same textual IR form as LLVM IR, yet the two are different
(CamelCase vs snake_case with additional limitations on not being a C++
keyword).
Modify EnumAttrCaseInfo in OpBase.td to include both the integer value and its
string representation. By default, this representation is the same as C++
symbol name. Introduce new IntStrAttrCaseBase that allows one to use different
names. Exercise it for LLVM Dialect Linkage attribute. Other attributes will
follow as separate changes.
Differential Revision: https://reviews.llvm.org/D73362
Summary:
In the scope of the lowering phase from GPU to ROCDL, the intructions for the conversion patterns seems to be wrong.
According to https://github.com/ROCm-Developer-Tools/HIP/blob/master/include/hip/hcc_detail/math_fwd.h the instructions need two underscores in the beginning instead of one.
Reviewers: nicolasvasilache, herhut, rriddle
Reviewed By: herhut, rriddle
Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73535
Summary:
The 'gpu.terminator' operation is used as the terminator for the
regions of gpu.launch. This is to disambugaute them from the
return operation on 'gpu.func' functions.
This is a breaking change and users of the gpu dialect will need
to adapt their code when producting 'gpu.launch' operations.
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73620
StrEq has some magic inside that should do the explicit conversion from
StringRef to std::string, but apparently this doesn't work with GCC 5.
Just use EXPECT_EQ, it does the same thing with less magic.
Summary:
This will help catch improper use of the MLIR API's. In particular, this
catches an error that was manifesting as nondeterministic assertion
failures (the nondeterminism was due to the failure happening only when the
StorageUniquer's DenseMap's probing happened to compare two specific
keys).
No test. The fact that all the existing tests pass with this additional
invariant gives confidence that it is correct/useful.
Differential Revision: https://reviews.llvm.org/D73645
Summary:
Canonicalization and folding patterns in StandardOps may interfere with the needs
of Linalg. This revision introduces specific foldings for dynamic memrefs that can
be proven to be static.
Very concretely:
Determines whether it is possible to fold it away in the parent Linalg op:
```mlir
%1 = memref_cast %0 : memref<8x16xf32> to memref<?x?xf32>
%2 = linalg.slice %1 ... : memref<?x?xf32> ...
// or
%1 = memref_cast %0 : memref<8x16xf32, affine_map<(i, j)->(16 * i + j)>>
to memref<?x?xf32>
linalg.generic(%1 ...) : memref<?x?xf32> ...
```
into
```mlir
%2 = linalg.slice %0 ... : memref<8x16xf32> ...
// or
linalg.generic(%0 ... : memref<8x16xf32, affine_map<(i, j)->(16 * i + j)>>
```
Reviewers: ftynse, aartbik, jsetoain, tetuante, asaadaldien
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73565
Summary:
Barrier is a simple operation that takes no arguments and returns
nothing, but implies a side effect (synchronization of all threads)
Reviewers: jdoerfert
Subscribers: mgorny, guansong, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72400
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
This diff adds a transformation patter to rewrite linalg.fill as broadcasting a scaler into a vector.
It uses the same preconditioning as matmul (memory is contiguous).
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73391
Summary:
Operation represents all of the uses of each result with one use list, so manipulating the use list of a specific result requires filtering the main use list. This revision adds an optimization for the case of single result operations to avoid this filtering.
Differential Revision: https://reviews.llvm.org/D73430
Summary:
This also removes the explicit pattern for loop.terminator to ensure
that the terminator is only erased if the parent op is rewritten.
Reductions are not yet supported.
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73348
Summary:
The intrinsic operation added multiple type annotations to the llvm intrinsic operations, but only one is needed.
The related tests in llvmir-intrinsics.mlir checked the wrong number and are adjusted as well.
Reviewers: nicolasvasilache, ftynse
Reviewed By: ftynse
Subscribers: merge_guards_bot, ftynse, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73470
Summary:
The tanh lowering from Standard dialect to NVVM and ROCDL was not working.
The conversion pattern are inserted in the lowering files.
The test cases for the lowerings were added in the test files.
Reviewers: nicolasvasilache, ftynse, herhut
Reviewed By: ftynse, herhut
Subscribers: merge_guards_bot, ftynse, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73471
Summary:
The dead function elimination pass in toy was a temporary stopgap until we had proper dead function elimination support in MLIR. Now that this functionality is available, this pass is no longer necessary.
Differential Revision: https://reviews.llvm.org/D72483
Summary: This pass deletes all symbols that are found to be unreachable. This is done by computing the set of operations that are known to be live, propagating that liveness to other symbols, and then deleting all symbols that are not within this live set.
Differential Revision: https://reviews.llvm.org/D72482
Summary: This revision refactors the implementation of the symbol use-list functionality to be a bit cleaner, as well as easier to reason about. Aside from code cleanup, this revision updates the user contract to never recurse into operations if they define a symbol table. The current functionality, which does recurse, makes it difficult to examine the uses held by a symbol table itself. Moving forward users may provide a specific region to examine for uses instead.
Differential Revision: https://reviews.llvm.org/D73427
Summary: The new internal representation of operation results now allows for accessing the result types to be more efficient. Changing the API to ArrayRef is more efficient and removes the need to explicitly materialize vectors in several places.
Differential Revision: https://reviews.llvm.org/D73429
Summary: This allows for providing a default "catchall" legality check that is not dependent on specific operations or dialects. For example, this can be useful to check legality based on the specific types of operation operands or results.
Differential Revision: https://reviews.llvm.org/D73379
Summary:
Remove 'valuesToRemoveIfDead' from PatternRewriter API. The removal
functionality wasn't implemented and we decided [1] not to implement it in
favor of having more powerful DCE approaches.
[1] https://github.com/tensorflow/mlir/pull/212
Reviewers: rriddle, bondhugula
Reviewed By: rriddle
Subscribers: liufengdb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72545
Summary:
Affine minimum computation will be used in tiling transformation. The
implementation is mostly boilerplate as we already lower the minimum in the
upper bound of an affine loop.
Differential Revision: https://reviews.llvm.org/D73488
Summary:
LLVM importer to MLIR was implemented mostly as a prototype. As such, it did
not deal handle errors in a consistent way, reporting them out stderr in some
cases and continuing the execution in the error state until eventually
crashing. This is not desirable for a user-facing tool. Make sure errors are
returned from functions, consistently checked at call sites and propagated
further. Functions returning nullable IR values return nullptr to denote the
error state. Other functions return LogicalResult. LLVM importer in
mlir-translate should no longer crash on unsupported inputs.
The errors are reported without association with the source file (and therefore
cannot be checked using -verify-diagnostics). Attaching them to the actual
input file is left for future work.
Differential Revision: https://reviews.llvm.org/D72839
Summary:
Implement the handling of llvm::ConstantDataSequential and
llvm::ConstantAggregate for (nested) array and vector types when imporitng LLVM
IR to MLIR. In all cases, the result is a DenseElementsAttr that can be used in
either a `llvm.mlir.global` or a `llvm.mlir.constant`. Nested aggregates are
unpacked recursively until an element or a constant data is found. Nested
arrays with innermost scalar type are represented as DenseElementsAttr of
tensor type. Nested arrays with innermost vector type are represented as
DenseElementsAttr with (multidimensional) vector type.
Constant aggregates of struct type are not yet supported as the LLVM dialect
does not have a well-defined way of modeling struct-type constants.
Differential Revision: https://reviews.llvm.org/D72834
This commit changes the logic of `getBuiltinVariableValue` to get
or create the builtin variable in the nearest symbol table. This
will allow us to use this function in other partial conversion
cases where we haven't created the spv.module yet.
Differential Revision: https://reviews.llvm.org/D73416
This commit exposes the func op conversion pattern via a new
`populateBuiltinFuncToSPIRVPatterns` function from the standard
to SPIR-V conversion passs. This is structurally better given
that func op belongs to the builtin dialect. More importantly,
this makes the pattern reusable to other dialect to SPIR-V
dialect conversion as other dialect can well adopt builtin
func op instead of having its own. Besides, it's very common
to use func ops as test wrappers in lit tests, so test passes
will need to handle func ops too.
Differential Revision: https://reviews.llvm.org/D73421
Thus far certain SPIR-V ops have been required to be in spv.module.
While this provides strong verification to catch unexpected errors,
it's quite rigid and makes progressive lowering difficult. Sometimes
we would like to partially lower ops from other dialects, which may
involve creating ops like global variables that should be placed in
other module-like ops. So this commit relaxes the requirement of
such SPIR-V ops' scope to module-like ops. Similarly for function-
like ops.
Differential Revision: https://reviews.llvm.org/D73415
Summary:
Rewrites the extract/insert_slices operation in terms of
strided_slice/insert_strided_slice ops with intermediate
tuple uses (that should get optimimized away with typical
usage). This is done in a separate "pass" to enable testing
this particular rewriting in isolation.
Reviewers: nicolasvasilache, andydavis1, ftynse
Reviewed By: nicolasvasilache
Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73295
Summary:
Rationale:
Some examples were using "offsets : [0, 2]" syntax which
should use a "=" instead. The same examples were referring
to the integer attribute array as k-dimensional, which is
a bit confusing (it is 1-dimensional, with k elements).
Changed to "k-sized".
Reviewers: nicolasvasilache, andydavis1, ftynse
Reviewed By: nicolasvasilache
Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73293
Summary:
LLVMIRIntrinsicGen is using LLVM_Op as the base class for intrinsics.
This works for LLVM intrinsics in the LLVM Dialect, but when we are
trying to convert custom intrinsics that originate from a custom
LLVM dialect (like NVVM or ROCDL) these usually have a different
"cppNamespace" that needs to be applied to these dialect.
These dialect specific characteristics (like "cppNamespace")
are typically organized by creating a custom op (like NVVM_Op or
ROCDL_Op) that passes the correct dialect to the LLVM_OpBase class.
It seems natural to allow LLVMIRIntrinsicGen to take that into
consideration when generating the conversion code from one of these
dialect to a set of target specific intrinsics.
Reviewers: rriddle, andydavis1, antiagainst, nicolasvasilache, ftynse
Subscribers: jdoerfert, mehdi_amini, jpienaar, burmako, shauheen, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73233
This reverts commit eec36909c1.
This modeling is incorrect. baseAttr is intended for attribute
decorators that are not backed by C++ attribute classes. It essentially
says DerivedAttr isa BaseAttr, which is wrong for ArrayAttr classes.
If one needs to store the element type, it should be stored as a
separate filed in the tablegen class.
Summary:
This diff extends the Linalg EDSC builders so we can easily create mixed
tensor/buffer linalg.generic ops. This is expected to be useful for
HLO -> Linalg lowering.
The StructuredIndexed struct is made to derive from ValueHandle and can
now capture a type + indexing expressions. This is used to represent return
tensors.
Pointwise unary and binary builders are extended to allow both output buffers
and return tensors. This has implications on the number of region arguments.
Reviewers: ftynse, hanchung, asaadaldien
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73149