Commit Graph

164 Commits

Author SHA1 Message Date
Thomas Raoux 15bcc36eed [mlir][gpu] Move async copy ops to NVGPU and add caching hints
Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.

Differential Revision: https://reviews.llvm.org/D125244
2022-05-10 22:30:24 +00:00
Thomas Raoux 09fc685ce6 [mlir][nvvm] Add attribute to nvvm.cpAsyncOp to control l1 bypass
Add attribute to be able to generate the intrinsic version of async copy
generating a copy with l1 bypass. This correspond to
cp.async.cg.shared.global in ptx.

Differential Revision: https://reviews.llvm.org/D125241
2022-05-09 19:34:48 +00:00
River Riddle 58ceae9561 [mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespace
FuncOp has been moved to the `func` namespace for a little over a month, the
using directive can be dropped now.
2022-04-18 12:01:55 -07:00
Thomas Raoux 894a591cf6 [mlir][nvgpu] Move mma.sync and ldmatrix in nvgpu dialect
Move gpu operation mma.sync and ldmatrix in nvgpu as they are specific
to nvidia target.

Differential Revision: https://reviews.llvm.org/D123824
2022-04-14 23:44:52 +00:00
Christopher Bate 77d2c815f5 [MLIR][GPU] Add GPU ops nvvm.mma.sync, nvvm.mma.ldmatrix, lane_id
This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123647
2022-04-13 22:50:07 +00:00
River Riddle 3655069234 [mlir] Move the Builtin FuncOp to the Func dialect
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.

Differential Revision: https://reviews.llvm.org/D121266
2022-03-16 17:07:03 -07:00
River Riddle 1d7120c69a [mlir] Split out AttrDef/TypeDef and pattern constructs from OpBase.td
OpBase.td has formed into a huge monolith of all ODS constructs. This
commits starts to rectify that by splitting out some constructs to their
own .td files.

Differential Revision: https://reviews.llvm.org/D118636
2022-03-15 00:18:03 -07:00
River Riddle 5a7b919409 [mlir][NFC] Rename StandardToLLVM to FuncToLLVM
The current StandardToLLVM conversion patterns only really handle
the Func dialect. The pass itself adds patterns for Arithmetic/CFToLLVM, but
those should be/will be split out in a followup. This commit focuses solely
on being an NFC rename.

Aside from the directory change, the pattern and pass creation API have been renamed:
 * populateStdToLLVMFuncOpConversionPattern -> populateFuncToLLVMFuncOpConversionPattern
 * populateStdToLLVMConversionPatterns -> populateFuncToLLVMConversionPatterns
 * createLowerToLLVMPass -> createConvertFuncToLLVMPass

Differential Revision: https://reviews.llvm.org/D120778
2022-03-07 11:25:23 -08:00
River Riddle 1f971e23f0 [mlir] Trim a huge number of unnecessary dependencies on the Func dialect
The Func has a large number of legacy dependencies carried over from the old
Standard dialect, which was pervasive and contained a large number of varied
operations. With the split of the standard dialect and its demise, a lot of lingering
dead dependencies have survived to the Func dialect. This commit removes a
large majority of then, greatly reducing the dependence surface area of the
Func dialect.
2022-03-01 12:10:04 -08:00
Tres Popp b4e0507ce0 Rename PatternRewriteSet::insert to add
insert is soft deprecated, so remove all references so it's less likely
to be used and can be easily removed in the future.

Differential Revision: https://reviews.llvm.org/D120021
2022-02-18 12:18:41 +01:00
Thomas Raoux 5ab04bc068 [mlir][gpu] Add device side async copy operations
Add new operations to the gpu dialect to represent device side
asynchronous copies. This also add the lowering of those operations to
nvvm dialect.
Those ops are meant to be low level and map directly to llvm dialects
like nvvm or rocdl.

We can further add higher level of abstraction by building on top of
those operations.
This has been discuss here:
https://discourse.llvm.org/t/modeling-gpu-async-copy-ampere-feature/4924

Differential Revision: https://reviews.llvm.org/D119191
2022-02-10 17:25:59 -08:00
River Riddle ace01605e0 [mlir] Split out a new ControlFlow dialect from Standard
This dialect is intended to model lower level/branch based control-flow constructs. The initial set
of operations are: AssertOp, BranchOp, CondBranchOp, SwitchOp; all split out from the current
standard dialect.

See https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061

Differential Revision: https://reviews.llvm.org/D118966
2022-02-06 14:51:16 -08:00
River Riddle 38abdddf6f [mlir][NFC] Update AMX/LLVM/NVVM/X86 vector operations to use `hasVerifier` instead of `verifier`
The verifier field is deprecated, and slated for removal.

Differential Revision: https://reviews.llvm.org/D118819
2022-02-02 13:34:29 -08:00
harsh e01e4c9115 Fix bugs in GPUToNVVM lowering
The current lowering from GPU to NVVM does
not correctly handle the following cases when
lowering the gpu shuffle op.

1. When the active width is set to 32 (all lanes),
then the current approach computes (1 << 32) -1 which
results in poison values in the LLVM IR. We fix this by
defining the active mask as (-1) >> (32 - width).

2. In the case of shuffle up, the computation of the third
operand c has to be different from the other 3 modes due to
the op definition in the ISA reference.
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html)
Specifically, the predicate value is computed as j >= maxLane
for up and j <= maxLane for all other modes. We fix this by
computing maskAndClamp as 32 - width for this mode.

TEST: We modify the existing test and add more checks for the up mode.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D118086
2022-01-25 03:24:14 +00:00
Mogball aae5125550 [mlir] Replace StrEnumAttr -> EnumAttr in core dialects
Removes uses of `StrEnumAttr` in core dialects

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D117514
2022-01-18 17:15:00 +00:00
Mehdi Amini be0a7e9f27 Adjust "end namespace" comment in MLIR to match new agree'd coding style
See D115115 and this mailing list discussion:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html

Differential Revision: https://reviews.llvm.org/D115309
2021-12-08 06:05:26 +00:00
Thomas Raoux 47555d73f6 [mlir][gpu] Extend shuffle op modes and add nvvm lowering
Add up, down and idx modes to gpu shuffle ops, also change the mode from
string to enum

Differential Revision: https://reviews.llvm.org/D114188
2021-11-19 11:14:31 -08:00
River Riddle 195730a650 [mlir][NFC] Replace references to Identifier with StringAttr
This is part of the replacement of Identifier with StringAttr.

Differential Revision: https://reviews.llvm.org/D113953
2021-11-16 17:36:26 +00:00
Thomas Raoux e7969240dc [mlir][VectorToGPU] Support more cases in conversion to MMA ops
Support load with broadcast, elementwise divf op and remove the
hardcoded restriction on the vector size. Picking the right size should
be enfored by user and will fail conversion to llvm/spirv if it is not
supported.

Differential Revision: https://reviews.llvm.org/D113618
2021-11-11 13:10:38 -08:00
thomasraoux f309939d06 [mlir][nvvm] Remove special case ptr arithmetic lowering in gpu to nvvm
Use existing helper instead of handling only a subset of indices lowering
arithmetic. Also relax the restriction on the memref rank for the GPU mma ops
as we can now support any rank.

Differential Revision: https://reviews.llvm.org/D113383
2021-11-10 10:00:12 -08:00
thomasraoux d88cc07943 [mlir][gpuTonvvm] Remove hardcoded values in MMAType to llvm struct
Also relax the types allowed in GPU wmma ops

Differential Revision: https://reviews.llvm.org/D112969
2021-11-02 08:12:27 -07:00
thomasraoux 8a992b20db [mlir][gpu] Add basic support to do elementwise ops on mma matrix type
In order to support fusion with mma matrix type we need to be able to
execute elementwise operations on them. This add an op to be able to
support some basic elementwise operations. This is a is not a full
solution as it only supports a limited scope or operations. Ideally we would
want to be able to fuse with more kind of operations.

Differential Revision: https://reviews.llvm.org/D112857
2021-11-01 11:51:19 -07:00
thomasraoux 77eafb8430 [mlir][nvvm] Generalize wmma ops to handle more types and shapes
wmma intrinsics have a large number of combinations, ideally we want to be able
to target all the different variants. To avoid a combinatorial explosion in the
number of mlir op we use attributes to represent the different variation of
load/store/mma ops. We also can generate with tablegen helpers to know which
combinations are available. Using this we can avoid having too hardcode a path
for specific shapes and can support more types.
This patch also adds boiler plates for tf32 op support.

Differential Revision: https://reviews.llvm.org/D112689
2021-11-01 10:27:26 -07:00
thomasraoux eacd6e1ebe [mlir][GPUtoNVVM] Relax restriction on wmma op lowering
Allow lowering of wmma ops with 64bits indexes. Change the default
version of the test to use default layout.

Differential Revision: https://reviews.llvm.org/D112479
2021-10-27 21:31:55 -07:00
Mogball a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
River Riddle ef976337f5 [mlir:OpConversion] Remove the remaing usages of the deprecated matchAndRewrite methods
This commits updates the remaining usages of the ArrayRef<Value> based
matchAndRewrite/rewrite methods in favor of the new OpAdaptor
overload.

Differential Revision: https://reviews.llvm.org/D110360
2021-09-24 17:51:41 +00:00
Adrian Kuegel ffe6a58325 [mlir][nvvm]: Add math::Exp2Op lowering to NVVM.
Differential Revision: https://reviews.llvm.org/D106050
2021-07-15 13:06:30 +02:00
Alex Zinenko 75e5f0aac9 [mlir] factor memref-to-llvm lowering out of std-to-llvm
After the MemRef has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: herhut, silvas

Differential Revision: https://reviews.llvm.org/D105625
2021-07-09 14:49:52 +02:00
Alex Zinenko b5d847b1b9 [mlir] factor out common parts of the converstion to the LLVM dialect
"Standard-to-LLVM" conversion is one of the oldest passes in existence. It has
become quite large due to the size of the Standard dialect itself, which is
being split into multiple smaller dialects. Furthermore, several conversion
features are useful for any dialect that is being converted to the LLVM
dialect, which, without this refactoring, creates a dependency from those
conversions to the "standard-to-llvm" one.

Put several of the reusable utilities from this conversion to a separate
library, namely:
- type converter from builtin to LLVM dialect types;
- utility for building and accessing values of LLVM structure type;
- utility for building and accessing values that represent memref in the LLVM
  dialect;
- lowering options applicable everywhere.

Additionally, remove the type wrapping/unwrapping notion from the type
converter that is no longer relevant since LLVM types has been reimplemented as
first-class MLIR types.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D105534
2021-07-07 10:51:08 +02:00
Uday Bondhugula 4acf3807e3 [MLIR] Split out GPU ops library from Transforms
Split out GPU ops library from GPU transforms. This allows libraries to
depend on GPU Ops without needing/building its transforms.

Differential Revision: https://reviews.llvm.org/D105472
2021-07-07 11:26:49 +05:30
thomasraoux 0298f2cfb1 [mlir] Fix wrong type in WmmaConstantOpToNVVMLowering
InsertElement takes a scalar integer attribute not an array of integer.

Differential Revision: https://reviews.llvm.org/D105174
2021-06-30 09:10:02 -07:00
Matthias Springer 66f878cee9 [mlir][NFC] Remove Standard dialect dependency on MemRef dialect
* Remove dependency: Standard --> MemRef
* Add dependencies: GPUToNVVMTransforms --> MemRef, Linalg --> MemRef, MemRef --> Tensor
* Note: The `subtensor_insert_propagate_dest_cast` test case in MemRef/canonicalize.mlir will be moved to Tensor/canonicalize.mlir in a subsequent commit, which moves over the remaining Tensor ops from the Standard dialect to the Tensor dialect.

Differential Revision: https://reviews.llvm.org/D104506
2021-06-21 17:55:23 +09:00
thomasraoux 428a62f65f [mlir][gpu] Add op to create MMA constant matrix
This allow creating a matrix with all elements set to a given value. This is
needed to be able to implement a simple dot op.

Differential Revision: https://reviews.llvm.org/D103870
2021-06-10 08:34:04 -07:00
thomasraoux 9b496c2373 [mlir][gpu][NFC] Simplify conversion of MMA type to NVVM
Consolidate the type conversion in a single function to make it simpler
to use. This allow to re-use the type conversion for up coming ops.

Differential Revision: https://reviews.llvm.org/D103868
2021-06-09 09:33:38 -07:00
thomasraoux b44007bec2 [mlir][gpu] Relax restriction on MMA store op to allow chain of mma ops.
In order to allow large matmul operations using the MMA ops we need to chain
operations this is not possible unless "DOp" and "COp" type have matching
layout so remove the "DOp" layout and force accumulator and result type to
match.
Added a test for the case where the MMA value is accumulated.

Differential Revision: https://reviews.llvm.org/D103023
2021-05-27 09:13:51 -07:00
Navdeep Kumar eaaf7a6a09 [MLIR][GPU][NVVM] Add conversion of warp synchronous matrix-multiply accumulate GPU ops
Add conversion of warp synchronous matrix-multiply
accumulate GPU ops
Add conversion of warp synchronous matrix-multiply accumulate GPU ops to
NVVM ops. The following conversions are added :-
  1.) subgroup_mma_load_matrix -> wmma.m16n16k16.load.[a,b,c]..row.stride
  2.) subgroup_mma_store_matrix -> wmma.m16n16k16.store.d.[f16,f32].row.stride
  3.) subgroup_mma_compute -> wmma.m16n16k16.mma.row.row.[f16,f32].[f16,f32]

Reviewed By: bondhugula, ftynse

Differential Revision: https://reviews.llvm.org/D95331
2021-05-21 21:20:33 +05:30
Julian Gross 1fbb484ea4 [WIP][mlir] Resolve memref dependency in canonicalize pass.
Splitting the memref dialect lead to an introduction of several dependencies
to avoid compilation issues. The canonicalize pass also depends on the
memref dialect, but it shouldn't. This patch resolves the dependencies
and the unintuitive includes are removed. However, the dependency moves
to the constructor of the std dialect.

Differential Revision: https://reviews.llvm.org/D102060
2021-05-17 11:33:38 +02:00
Alex Zinenko b3386a734e [mlir] introduce data layout entry for index type
Index type is an integer type of target-specific bitwidth present in many MLIR
operations (loops, memory accesses). Converting values of this type to
fixed-size integers has always been problematic. Introduce a data layout entry
to specify the bitwidth of `index` in a given layout scope, defaulting to 64
bits, which is a commonly used assumption, e.g., in constants.

Port builtin-to-LLVM type conversion to use this data layout entry when
converting `index` type and untie it from pointer size. This is particularly
relevant for GPU targets. Keep a possibility to forcibly override the index
type in lowerings.

Depends On D98525

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D98937
2021-03-24 15:13:42 +01:00
Chris Lattner dc4e913be9 [PatternMatch] Big mechanical rename OwningRewritePatternList -> RewritePatternSet and insert -> add. NFC
This doesn't change APIs, this just cleans up the many in-tree uses of these
names to use the new preferred names.  We'll keep the old names around for a
couple weeks to help transitions.

Differential Revision: https://reviews.llvm.org/D99127
2021-03-22 17:20:50 -07:00
Chris Lattner 1d909c9a35 Remove the extraneous MLIRContext argument from populateWithGenerated. NFC. 2021-03-21 10:38:35 -07:00
Chris Lattner 3a506b31a3 Change OwningRewritePatternList to carry an MLIRContext with it.
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters.  There are many many more to be removed.

Differential Revision: https://reviews.llvm.org/D99028
2021-03-21 10:06:31 -07:00
Julian Gross e2310704d8 [MLIR] Create memref dialect and move dialect-specific ops from std.
Create the memref dialect and move dialect-specific ops
from std dialect to this dialect.

Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
AssumeAlignmentOp -> MemRef_AssumeAlignmentOp
DeallocOp -> MemRef_DeallocOp
DimOp -> MemRef_DimOp
MemRefCastOp -> MemRef_CastOp
MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
LoadOp -> MemRef_LoadOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
SubViewOp -> MemRef_SubViewOp
TransposeOp -> MemRef_TransposeOp
TensorLoadOp -> MemRef_TensorLoadOp
TensorStoreOp -> MemRef_TensorStoreOp
TensorToMemRefOp -> MemRef_BufferCastOp
ViewOp -> MemRef_ViewOp

The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667

Differential Revision: https://reviews.llvm.org/D98041
2021-03-15 11:14:09 +01:00
Vladislav Vinogradov 37eca08e5b [mlir][NFC] Rename `MemRefType::getMemorySpace` to `getMemorySpaceAsInt`
Just a pure method renaming.

It is a preparation step for replacing "memory space as raw integer"
with more generic "memory space as attribute", which will be done in
separate commit.

The `MemRefType::getMemorySpace` method will return `Attribute` and
become the main API, while `getMemorySpaceAsInt` will be declared as
deprecated and will be replaced in all in-tree dialects (also in separate
commits).

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D97476
2021-03-02 11:08:54 +03:00
Adrian Kuegel 07cc77187a Lower math.expm1 to intrinsics in the GPUToNVVM and GPUToROCDL conversions.
This adds the lowering for expm1 for GPU backends.

Differential Revision: https://reviews.llvm.org/D96756
2021-02-16 10:23:42 +01:00
Alex Zinenko 4c4876c314 [mlir] Use target-specific GPU kernel attributes in lowering pipelines
Until now, the GPU translation to NVVM or ROCDL intrinsics relied on the
presence of the generic `gpu.kernel` attribute to attach additional LLVM IR
metadata to the relevant functions. This would be problematic if each dialect
were to handle the conversion of its own options, which is the intended
direction for the translation infrastructure. Introduce `nvvm.kernel` and
`rocdl.kernel` in addition to `gpu.kernel` and base translation on these new
attributes instead.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D96591
2021-02-12 14:09:24 +01:00
Stephan Herhut 4348d8ab7f [mlir][math] Split off the math dialect.
This does not split transformations, yet. Those will be done as future clean ups.

Differential Revision: https://reviews.llvm.org/D96272
2021-02-12 10:55:12 +01:00
River Riddle e21adfa32d [mlir] Mark LogicalResult as LLVM_NODISCARD
This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way.

Differential Revision: https://reviews.llvm.org/D95841
2021-02-04 15:10:10 -08:00
Frederik Gossen 294e2544c9 Add log1p lowering from standard to NVVM intrinsics
Differential Revision: https://reviews.llvm.org/D95130
2021-01-21 14:00:38 +01:00
Alex Zinenko 2230bf99c7 [mlir] replace LLVMIntegerType with built-in integer type
The LLVM dialect type system has been closed until now, i.e. did not support
types from other dialects inside containers. While this has had obvious
benefits of deriving from a common base class, it has led to some simple types
being almost identical with the built-in types, namely integer and floating
point types. This in turn has led to a lot of larger-scale complexity: simple
types must still be converted, numerous operations that correspond to LLVM IR
intrinsics are replicated to produce versions operating on either LLVM dialect
or built-in types leading to quasi-duplicate dialects, lowering to the LLVM
dialect is essentially required to be one-shot because of type conversion, etc.
In this light, it is reasonable to trade off some local complexity in the
internal implementation of LLVM dialect types for removing larger-scale system
complexity. Previous commits to the LLVM dialect type system have adapted the
API to support types from other dialects.

Replace LLVMIntegerType with the built-in IntegerType plus additional checks
that such types are signless (these are isolated in a utility function that
replaced `isa<LLVMType>` and in the parser). Temporarily keep the possibility
to parse `!llvm.i32` as a synonym for `i32`, but add a deprecation notice.

Reviewed By: mehdi_amini, silvas, antiagainst

Differential Revision: https://reviews.llvm.org/D94178
2021-01-07 19:48:31 +01:00
Alex Zinenko c69c9e0f0f [mlir] Remove LLVMType, LLVM dialect types now derive Type directly
BEGIN_PUBLIC
[mlir] Remove LLVMType, LLVM dialect types now derive Type directly

This class has become a simple `isa` hook with no proper functionality.
Removing will allow us to eventually make the LLVM dialect type infrastructure
open, i.e., support non-LLVM types inside container types, which itself will
make the type conversion more progressive.

Introduce a call `LLVM::isCompatibleType` to be used instead of
`isa<LLVMType>`. For now, this is strictly equivalent.
END_PUBLIC

Depends On D93681

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D93713
2021-01-05 17:36:54 +01:00