Commit Graph

753 Commits

Rob Suderman 90478251c7 [mlir][tosa] Tosa reverse to linalg supporting dynamic shapes
Needed to switch to extract to support tosa.reverse using dynamic shapes.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108744
2021-08-26 13:23:59 -07:00
Rob Suderman 0600bb4d18 [mlir][tosa] Elementwise operation dynamic shape support
Added dynamic shape support for elementwise operations. This assumes equal
sizes, as broadcasting a dynamic dimension of length 1 is problematic.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108730
2021-08-26 11:18:58 -07:00
Rob Suderman 5541a05d6a [mlir][tosa] Quantized tosa.avg_pool2d lowering to linalg
Includes the quantized version of the average pool lowering to the linalg
dialect, along with a lit test for the transform. It is not 100% correct, as
the multiplier/shift should be computed in i64; however, the rounding
difference is negligible.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108676
2021-08-24 18:54:23 -07:00
Rob Suderman 4ef1770abd [mlir][tosa] Table did not apply offset before extract on i8 input
Lowering to table was incorrect, as it did not apply a 128 offset before
extracting the value from the table. Fixed this and corrected the tensor
length of the input table.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108436
2021-08-24 18:52:33 -07:00
Rob Suderman a7bf93807b [mlir][tosa] Fix conv/depthwise conv padding for quantized values
When padding quantized operations, the padding needs to equal the zero point
of the input value. Corrected the pass to use the input zero point as the
padding value when the operation is quantized.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108440
2021-08-24 18:13:22 -07:00
William S. Moses 973cb2c326 [MLIR][OMP] Ensure nested scf.parallel execute all iterations
Presently, the lowering of nested scf.parallel loops to OpenMP creates one omp.parallel region, with two (nested) OpenMP worksharing loops on the inside. When lowered to LLVM and executed, this produces incorrect results. The reason for this is as follows:

An OpenMP parallel region results in the code being run with whatever number of threads available to OpenMP. Within a parallel region a worksharing loop divides up the total number of requested iterations by the available number of threads, and distributes accordingly. For a single ws loop in a parallel region, this works as intended.

Now consider nested ws loops as follows:

omp.parallel {
   A: omp.ws %i = 0...10 {
      B: omp.ws %j = 0...10 {
          code(%i, %j)
      }
   }
}

Suppose we ran this on two threads. The first workshare loop would decide to execute iterations 0, 1, 2, 3, 4 on thread 0, and iterations 5, 6, 7, 8, 9 on thread 1. The second workshare loop would decide the same for its iterations. This means thread 0 would execute i in [0, 5) and j in [0, 5), and thread 1 would execute i in [5, 10) and j in [5, 10). Consequently, the iterations with i in [5, 10), j in [0, 5) and with i in [0, 5), j in [5, 10) never get executed, which is clearly wrong.

This permits two options for a remedy:
1) Change the semantics of the omp.wsloop to be distinct from that of the OpenMP runtime call or, equivalently, #pragma omp for. This could then allow some lowering transformation to remedy the aforementioned issue. I don't think this is desirable from an abstraction standpoint.
2) When lowering an scf.parallel, always surround the wsloop with a new parallel region (thereby causing the innermost wsloop to use the number of threads available only to it).

This PR implements the latter change.
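
A sketch of the fixed lowering in the same pseudo-notation (illustrative only); the inner workshare loop now redistributes its iterations across the threads of its own parallel region for every value of %i:

omp.parallel {
   A: omp.ws %i = 0...10 {
      omp.parallel {
         B: omp.ws %j = 0...10 {
             code(%i, %j)
         }
      }
   }
}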

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108426
2021-08-20 19:06:28 -04:00
Rob Suderman 3205ee7e81 [mlir][tosa] Support UInt8 inputs and outputs for tosa.rescale
Tosa rescale can operate on uint8 types. Added support for these types
using an unrealized conversion cast. Ideally it would be better to
use bitcast; however, bitcast does not support unsigned integers.
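
A minimal sketch of the bridging cast (types illustrative): the unsigned value is cast to a signless integer so the existing arithmetic lowering can operate on it.

    %0 = unrealized_conversion_cast %arg0 : tensor<4xui8> to tensor<4xi8>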

Differential Revision: https://reviews.llvm.org/D108427
2021-08-19 18:58:44 -07:00
Robert Suderman 76c9712196 [mlir][tosa] Fix clamp to restrict only within valid bitwidth range
It's possible for the clamp to have min/max values outside the valid range of
its bitwidth. To fix this, we validate the min/max values and restrict them to
a valid range.
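
An illustrative case (attribute names per the TOSA dialect; values hypothetical): i8 can only represent [-128, 127], so out-of-range bounds are tightened before lowering.

    %0 = "tosa.clamp"(%arg0) {min_int = -200 : i64, max_int = 300 : i64,
        min_fp = 0.0 : f32, max_fp = 0.0 : f32} : (tensor<4xi8>) -> tensor<4xi8>
    // is handled as if min_int = -128 and max_int = 127, the valid i8 range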

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108256
2021-08-18 12:14:01 -07:00
William S. Moses 8c2ff7b69e [MLIR] Correct linkage of lowered globalop
LLVM considers a global variable marked as external to be defined within the module if it is initialized (including to undef). Other external globals are considered to be defined externally and imported into the current translation unit. Lowering of MLIR Global Ops did not properly propagate undefined initializers, resulting in a global that is expected to be defined within the current TU not actually being defined.
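
A minimal sketch of the intended output (LLVM dialect syntax of the time assumed): keeping the undef initializer region makes LLVM treat the global as defined in this module.

    llvm.mlir.global external @g() : i32 {
      %0 = llvm.mlir.undef : i32
      llvm.return %0 : i32
    }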

Differential Revision: https://reviews.llvm.org/D108252
2021-08-18 11:09:43 -04:00
natashaknk ba0997ca09 [mlir][tosa] Fix depthwise_conv2D strides/dilation and name
Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D107997
2021-08-12 15:43:41 -07:00
Rob Suderman 7de439b2be [mlir][tosa] Migrate tosa to more efficient linalg.conv
The existing linalg.conv2d is not well optimized for performance. Changed to a
version that is better aligned for optimization, and included the
corresponding transposes needed to use this optimized version.

This also splits the conv and depthwise conv into separate implementations
to avoid overly complex lowerings.
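
A rough sketch of the new lowering shape, assuming the NHWC-input/HWCF-filter named op of the time and illustrative shapes:

    // TOSA weights arrive as [O, H, W, I]; a transpose rearranges them to
    // [H, W, I, O] so the HWCF-layout convolution can be used.
    %w = "tosa.transpose"(%weights, %perms)
           : (tensor<64x3x3x8xf32>, tensor<4xi32>) -> tensor<3x3x8x64xf32>
    %0 = linalg.conv_2d_nhwc_hwcf
           {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
           ins(%input, %w : tensor<1x32x32x8xf32>, tensor<3x3x8x64xf32>)
           outs(%init : tensor<1x30x30x64xf32>) -> tensor<1x30x30x64xf32>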

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D107504
2021-08-11 11:05:12 -07:00
Alex Zinenko a0d8a08e3e [mlir] Add std.bitcast -> llvm.bitcast conversion
The conversion is a straightforward one-to-one mapping with optional unrolling
for nD vectors, similarly to other cast operations.
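
A minimal sketch of the scalar case (1-D vectors map the same way; nD vectors are unrolled):

    // the produced LLVM dialect op for a scalar bitcast
    %0 = llvm.bitcast %arg0 : f32 to i32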

Depends On D107889

Reviewed By: cota, akuegel

Differential Revision: https://reviews.llvm.org/D107891
2021-08-11 16:30:21 +02:00
Rob Suderman 2b2ebb6f98 [mlir][tosa] Add folders for trivial tosa operation cases
Some folding cases are trivial to fold away, specifically no-op cases where
an operation's input and output are the same. Canonicalizing these away
removes unneeded operations.

The current version includes tensor cast operations to resolve shape
discrepancies that occur when an operation's result type differs from the
input type. These are resolved during a tosa shape propagation pass.
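
An illustrative no-op fold (shape and attribute spelling per the TOSA dialect of the time): reshaping to the identical shape simply forwards the input.

    %0 = "tosa.reshape"(%arg0) {new_shape = [4, 8]} : (tensor<4x8xf32>) -> tensor<4x8xf32>
    // folds away; uses of %0 are replaced with %arg0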

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D107321
2021-08-10 14:43:00 -07:00
Rob Suderman 86858c62ba [mlir][tosa] Add dilation to tosa.transpose_conv2d lowering
Dilation only requires increasing the padding on the left/right side of the
input, and including dilation in the convolution. This implementation still
lacks support for strided convolutions.
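
For reference: a kernel of size k with dilation d covers an effective extent of d * (k - 1) + 1, which is what the added left/right padding has to account for.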

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D107680
2021-08-10 14:36:11 -07:00
natashaknk a1f46569a1 [mlir][tosa] Add quantized and unquantized versions for tosa.depthwise_conv2d lowering
Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D107855
2021-08-10 14:29:26 -07:00
Alex Zinenko 8a7c657c4d [mlir] support nD vector forms of shifts in std-to-llvm conversion
These ops were not ported to the nD vector conversion when it was introduced
and nobody has needed them so far.
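
A sketch of a now-supported form (shapes illustrative); the 2-D vector shift is unrolled into per-row 1-D LLVM shift operations:

    %0 = shift_left %arg0, %arg1 : vector<2x4xi32>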

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D107750
2021-08-09 12:00:41 +02:00
Eugene Zhulenev b537c5b414 [mlir] Async: clone constants into async.execute functions and parallel compute functions
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107007
2021-08-02 12:17:41 -07:00
Lei Zhang 0065bd2ad5 [mlir][spirv] Fix loading bool with proper storage capabilities
If the source value to load is bool, and we have native storage
capability support for the source bitwidth, we still cannot directly
rewrite uses; we need to perform casting to bool first.
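
A sketch of the load-side pattern (op spellings assumed): the i8 read from memory is converted back to bool by comparing against zero.

    %zero = spv.Constant 0 : i8
    %v = spv.Load "StorageBuffer" %ptr : i8
    %b = spv.INotEqual %v, %zero : i8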

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D107119
2021-07-30 18:06:11 -04:00
Lei Zhang 9f5300c8be [mlir][spirv] Fix storing bool with proper storage capabilities
If the source value to store is bool, and we have native storage
capability support for the target bitwidth, we still cannot directly
store; we need to perform casting to match the target memref
element's bitwidth.
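
A sketch of the store-side counterpart (op spellings assumed): the bool selects between 0 and 1 at the memref element bitwidth before the store.

    %zero = spv.Constant 0 : i8
    %one = spv.Constant 1 : i8
    %v = spv.Select %b, %one, %zero : i1, i8
    spv.Store "StorageBuffer" %ptr, %v : i8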

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D107114
2021-07-30 18:06:10 -04:00
Lei Zhang 26be7fe27c [mlir] NFC: split MemRef to SPIR-V conversion into their own files
Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D107094
2021-07-29 16:34:10 -04:00
Lei Zhang 995c3984ef [mlir] NFC: split Math to SPIR-V conversion into their own files
Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D107093
2021-07-29 16:34:10 -04:00
River Riddle f8479d9de5 [mlir] Set the namespace of the BuiltinDialect to 'builtin'
Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to clean up some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though)

Differential Revision: https://reviews.llvm.org/D105149
2021-07-28 21:00:10 +00:00
Alex Zinenko c1f719d1a7 [mlir] harden result type verification in llvm.call
The verifier of the llvm.call operation was not checking for mismatches between
the number of operation results and the number of results in the signature of
the callee. Furthermore, it was possible to construct an llvm.call operation
producing an SSA value of !llvm.void type, which should not exist. Add the
verification and treat !llvm.void result type as absence of call results.
Update the GPU conversions to LLVM that were mistakenly assuming that it was
fine for llvm.call to produce values of !llvm.void type and ensure these calls
do not produce results.
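
A sketch of the two now-verified shapes (callee names hypothetical):

    // a call with a non-void result must produce exactly one SSA value ...
    %0 = llvm.call @compute(%arg0) : (i64) -> i64
    // ... and a void call must produce no results at all
    llvm.call @log_event() : () -> ()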

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106937
2021-07-28 18:15:56 +02:00
Adrian Kuegel fb978f092c [mlir][Complex]: Add lowerings for AddOp and SubOp from Complex dialect to Standard.

Differential Revision: https://reviews.llvm.org/D106429
2021-07-23 12:43:45 +02:00
Rob Suderman cf8a1f6208 [mlir][tosa] Quantized Conv2DOp lowering to linalg added.
Includes a quantized version of the conv2D operation with a lowering from TOSA
to linalg and a corresponding test. We keep the quantized and non-quantized
variants as separate named ops to avoid the additional operations for non-quantized
convolutions.

Differential Revision: https://reviews.llvm.org/D106407
2021-07-22 15:42:26 -07:00
Nicolas Vasilache a664c14001 [mlir][LLVM] Revert bareptr calling convention handling as an argument materialization.
Type conversion and argument materialization are context-free: there is no available information on which op / branch is currently being converted.
As a consequence, bare ptr convention cannot be handled as an argument materialization: it would apply irrespectively of the parent op.
This doesn't typecheck in the case of non-funcOp and we would see cases where a memref descriptor would be inserted in place of the pointer in another memref descriptor.

For now the proper behavior is to revert to a specific BarePtrFunc implementation and drop the blanket argument materialization logic.

This reverts the relevant piece of the conversion to LLVM to what it was before https://reviews.llvm.org/D105880 and adds a relevant test and documentation to avoid the mistake by whoever attempts this again in the future.

Reviewed By: arpith-jacob

Differential Revision: https://reviews.llvm.org/D106495
2021-07-21 22:06:50 +00:00
Rob Suderman 40a02fae87 [mlir][tosa] Added tosa to linalg lowering to unstrided transposed conv
The unstrided transposed conv can be represented as a regular convolution.
Lower to this variant to handle the basic case. This includes transitioning from
the TC-defined convolution operation to a YAML-defined one.
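
For reference, this relies on a standard identity: a stride-1 transposed convolution equals a regular convolution with the kernel reversed along its spatial dimensions and with padding k - 1 - p on each side (k = kernel size, p = the original padding).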

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D106389
2021-07-20 15:07:08 -07:00
Rob Suderman 6bf0f6a4f7 [mlir][tosa] Add quantized lowering for matmul and fully_connected
Added the named op variants for quantized matmul and quantized batch matmul
with the necessary lowerings/tests from tosa's matmul/fully connected ops.
Current version does not use the contraction op interface as its verifiers
are not compatible with scalar operations.
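
A sketch of the quantized named-op form (shapes illustrative): the two zero points travel as extra scalar inputs alongside the operands.

    %0 = linalg.quantized_matmul
           ins(%a, %b, %a_zp, %b_zp : tensor<4x8xi8>, tensor<8x16xi8>, i32, i32)
           outs(%acc : tensor<4x16xi32>) -> tensor<4x16xi32>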

Differential Revision: https://reviews.llvm.org/D105063
2021-07-20 12:58:02 -07:00
Yi Zhang 381c3b9299 Dynamic shape support for memref reassociation reshape ops
Only memrefs with identity layout maps are supported for now.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D106180
2021-07-19 15:14:36 -07:00
Hanhan Wang 9c49195330 [mlir][Linalg] Migrate 2D pooling ops from tc definition to yaml definition.
This deletes all the pooling ops in LinalgNamedStructuredOpsSpec.tc. All the
uses are replaced with the yaml pooling ops.

Reviewed By: gysit, rsuderman

Differential Revision: https://reviews.llvm.org/D106181
2021-07-19 09:24:02 -07:00
Matthias Springer d1a9e9a7cb [mlir][vector] Remove vector.transfer_read/write to LLVM lowering
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.

* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)
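
A sketch of the single remaining path (shapes illustrative): transfer ops are first rewritten to plain vector loads/stores, and only those lower to LLVM.

    %v = vector.transfer_read %mem[%i], %pad : memref<128xf32>, vector<4xf32>
    // becomes, in the in-bounds unit-stride case:
    %v = vector.load %mem[%i] : memref<128xf32>, vector<4xf32>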

Differential Revision: https://reviews.llvm.org/D106118
2021-07-17 14:07:27 +09:00
Alex Zinenko 881dc34f73 [mlir] replace llvm.mlir.cast with unrealized_conversion_cast
The dialect-specific cast between builtin (ex-standard) types and LLVM
dialect types was introduced a long time before built-in support for
unrealized_conversion_cast. It has a similar purpose, but is restricted
to compatible builtin and LLVM dialect types, which may hamper
progressive lowering and composition with types from other dialects.
Replace llvm.mlir.cast with unrealized_conversion_cast, and drop the
operation that became unnecessary.

Also make unrealized_conversion_cast legal by default in
LLVMConversionTarget, as the majority of conversions using it are partial
conversions that actually want the casts to persist in the IR. The
standard-to-llvm conversion, which is still expected to run last, cleans
up the remaining casts.
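
A minimal before/after sketch of the replacement:

    // before
    %0 = llvm.mlir.cast %arg0 : index to i64
    // after
    %0 = unrealized_conversion_cast %arg0 : index to i64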

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105880
2021-07-16 15:14:09 +02:00
Alexander Belyaev 46ef86b5d8 [mlir] Move linalg::Expand/CollapseShapeOp to memref dialect.
RFC: https://llvm.discourse.group/t/rfc-reshape-ops-restructuring/3310

Differential Revision: https://reviews.llvm.org/D106141
2021-07-16 13:32:17 +02:00
Adrian Kuegel 74b88807ae [mlir][rocdl] Add math::Exp2Op lowering to ROCDL
Differential Revision: https://reviews.llvm.org/D106057
2021-07-15 14:33:04 +02:00
Adrian Kuegel ffe6a58325 [mlir][nvvm]: Add math::Exp2Op lowering to NVVM.
Differential Revision: https://reviews.llvm.org/D106050
2021-07-15 13:06:30 +02:00
Alex Zinenko 26e59cc19f [mlir] factor math-to-llvm out of standard-to-llvm
After the Math has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the Math
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D105702
2021-07-12 11:09:42 +02:00
Alex Zinenko c282d55a38 [mlir] add support for reductions in OpenMP WsLoopOp
Use a modeling similar to SCF ParallelOp to support arbitrary parallel
reductions. The two main differences are: (1) reductions are named and declared
beforehand similarly to functions using a special op that provides the neutral
element, the reduction code and optionally the atomic reduction code; (2)
reductions go through memory instead because this is closer to the OpenMP
semantics.

See https://llvm.discourse.group/t/rfc-openmp-reduction-support/3367.
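
A sketch of the declaration shape this describes (exact assembly assumed; see the revision for the authoritative form):

    omp.reduction.declare @add_f32 : f32
    init {
    ^bb0(%arg: f32):
      %0 = constant 0.0 : f32
      omp.yield (%0 : f32)
    }
    combiner {
    ^bb0(%lhs: f32, %rhs: f32):
      %sum = addf %lhs, %rhs : f32
      omp.yield (%sum : f32)
    }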

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D105358
2021-07-09 17:54:20 +02:00
Alex Zinenko 75e5f0aac9 [mlir] factor memref-to-llvm lowering out of std-to-llvm
After the MemRef has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: herhut, silvas

Differential Revision: https://reviews.llvm.org/D105625
2021-07-09 14:49:52 +02:00
William S. Moses 9a11c70c18 [SCF] Handle lowering of Execute region to Standard CFG
Lower scf.execute_region to LLVM by essentially inlining the region and replacing the return
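
A minimal sketch of the op being lowered (illustrative):

    %r = scf.execute_region -> i32 {
      %c = constant 42 : i32
      scf.yield %c : i32
    }
    // after lowering, the body is inlined into the enclosing CFG and the
    // scf.yield becomes a branch carrying %c to the continuation block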

Differential Revision: https://reviews.llvm.org/D105567
2021-07-07 15:27:21 -04:00
William S. Moses eaf22ba011 [MLIR] Provide lowering of std switch to llvm switch
This patch allows lowering of std switch to llvm switch
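
A minimal sketch of the source form being lowered (case values illustrative); each case maps to the corresponding llvm.switch successor:

    switch %flag : i32, [
      default: ^bb1,
      0: ^bb2,
      1: ^bb3
    ]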

Differential Revision: https://reviews.llvm.org/D105580
2021-07-07 15:25:55 -04:00
thomasraoux 291025389c [mlir][vector] Refactor Vector Unrolling and remove Tuple ops
Simplify the vector unrolling pattern to be more aligned with the rest of the
patterns and closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change, the Tuple-based ops no longer
have any uses, so they can be removed.

This allows removing a significant amount of dead code and will allow
extending the unrolling code going forward.
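
An illustrative slice of the kind the new implementation uses when unrolling a vector<4x4xf32> into 2x2 tiles:

    %0 = vector.extract_strided_slice %v
           {offsets = [0, 0], sizes = [2, 2], strides = [1, 1]}
           : vector<4x4xf32> to vector<2x2xf32>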

Differential Revision: https://reviews.llvm.org/D105381
2021-07-07 11:11:26 -07:00
Adrian Kuegel 6e80e3bd1b Add Log1pOp to complex dialect.
Also add a lowering pattern from Complex to Standard/Math dialect.

Differential Revision: https://reviews.llvm.org/D105538
2021-07-07 11:33:54 +02:00
Adrian Kuegel bf17ee1950 Add MulOp lowering from Complex dialect to Standard/Math dialect.
The lowering handles special cases with NaN or infinity like C++.

Differential Revision: https://reviews.llvm.org/D105270
2021-07-05 12:51:51 +02:00
Adrian Kuegel 380fa71fb0 [mlir] Add LogOp lowering from Complex dialect to Standard/Math dialect.
Differential Revision: https://reviews.llvm.org/D105342
2021-07-05 09:33:45 +02:00
Rob Suderman 8dea784b3e [mlir][tosa] Add tosa shape inference with InferReturnTypeComponent
Added InferReturnTypeComponents for NAry operations, reshape, and reverse.
With the additional tosa-infer-shapes pass, we can infer/propagate shapes
across a set of TOSA operations. The current version does not modify the
FuncOp type; instead, it inserts an unrealized conversion cast prior to any
new non-matching returns.
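
An illustrative before/after under the tosa-infer-shapes pass:

    // before: the result shape is unknown
    %0 = "tosa.add"(%a, %b) : (tensor<4xf32>, tensor<4xf32>) -> tensor<*xf32>
    // after: the inferred shape is propagated into the result type
    %0 = "tosa.add"(%a, %b) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>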

Differential Revision: https://reviews.llvm.org/D105312
2021-07-01 16:04:26 -07:00
Matthias Springer c0a6318d96 [mlir][tensor] Add tensor.dim operation
* Split memref.dim into two operations: memref.dim and tensor.dim. Both ops have the same builder interface and op argument names, so that they can be used with templates in patterns that apply to both tensors and memrefs (e.g., some patterns in Linalg).
* Add constant materializer to TensorDialect (needed for folding in affine.apply etc.).
* Remove some MemRefDialect dependencies, make some explicit.
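
A sketch of the resulting split (shapes illustrative):

    %c0 = constant 0 : index
    %d0 = memref.dim %m, %c0 : memref<?x?xf32>
    %d1 = tensor.dim %t, %c0 : tensor<?x?xf32>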

Differential Revision: https://reviews.llvm.org/D105165
2021-07-01 10:00:19 +09:00
thomasraoux 0298f2cfb1 [mlir] Fix wrong type in WmmaConstantOpToNVVMLowering
InsertElement takes a scalar integer attribute, not an array of integers.

Differential Revision: https://reviews.llvm.org/D105174
2021-06-30 09:10:02 -07:00
thomasraoux 4392841949 [mlir][VectorToGPU] Support converting vector.broadcast to MMA op
Differential Revision: https://reviews.llvm.org/D105175
2021-06-30 09:08:55 -07:00
Eugene Zhulenev d43b23608a [mlir:Async] Add the size parameter to the async.group
Specify the `!async.group` size (the number of tokens that will be added to it) at construction time. The `async.await_all` operation can potentially race with `async.execute` operations that keep updating the group; for this reason, it is required to know upfront how many tokens will be added to the group.
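
A sketch of the resulting construction (assembly assumed): the group is created already knowing it will receive %size tokens.

    %group = async.create_group %size : !async.group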

Reviewed By: ftynse, herhut

Differential Revision: https://reviews.llvm.org/D104780
2021-06-25 10:26:50 -07:00
thomasraoux 1a86559276 [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands
Differential Revision: https://reviews.llvm.org/D104134
2021-06-24 15:42:28 -07:00