Commit Graph

1495 Commits

Author SHA1 Message Date
Nicolas Vasilache bc4984e4f7 Add TODO to revisit coupling of CallOp to MemRefType lowering
PiperOrigin-RevId: 271619132
2019-09-27 12:03:00 -07:00
Nicolas Vasilache ddf737c5da Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering.
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues.
Solving the ABI problem generally is a very complex problem and most likely involves taking
a dependence on clang that we do not want atm.

A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement.

PiperOrigin-RevId: 271591708
2019-09-27 09:57:36 -07:00
Christian Sigg 116dac00ba Add AllReduceOp to GPU dialect with lowering to NVVM.
The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup".

The implementation is currently limited to sizes that are multiple 32 (warp size) and no larger than 1024.

PiperOrigin-RevId: 271290265
2019-09-26 00:17:50 -07:00
Uday Bondhugula 458ede8775 Introduce splat op + provide its LLVM lowering
- introduce splat op in standard dialect (currently for int/float/index input
  type, output type can be vector or statically shaped tensor)
- implement LLVM lowering (when result type is 1-d vector)
- add constant folding hook for it
- while on Ops.cpp, fix some stale names

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#141

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/141 from bondhugula:splat 48976a6aa0a75be6d91187db6418de989e03eb51
PiperOrigin-RevId: 270965304
2019-09-24 12:44:58 -07:00
Nicolas Vasilache 42d8fa667b Normalize lowering of MemRef types
The RFC for unifying Linalg and Affine compilation passes into an end-to-end flow with a predictable ABI and linkage to external function calls raised the question of why we have variable sized descriptors for memrefs depending on whether they have static or dynamic dimensions  (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).

This CL standardizes the ABI on the rank of the memrefs.
The LLVM struct for a memref becomes equivalent to:
```
template <typename Elem, size_t Rank>
struct {
  Elem *ptr;
  int64_t sizes[Rank];
};
```

PiperOrigin-RevId: 270947276
2019-09-24 11:21:49 -07:00
Mehdi Amini 5583252173 Add convenience methods to set an OpBuilder insertion point after an Operation (NFC)
PiperOrigin-RevId: 270727180
2019-09-23 11:54:55 -07:00
Christian Sigg b8676da1fc Outline GPU kernel function into a nested module.
Roll forward of commit 5684a12.

When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.

PiperOrigin-RevId: 270639748
2019-09-23 03:17:01 -07:00
Christian Sigg c900d4994e Fix a number of Clang-Tidy warnings.
PiperOrigin-RevId: 270632324
2019-09-23 02:34:27 -07:00
Manuel Freiberger 2c11997d48 Add integer sign- and zero-extension and truncation to standard.
This adds sign- and zero-extension and truncation of integer types to the
standard dialects. This allows to perform integer type conversions without
having to go to the LLVM dialect and introduce custom type casts (between
standard and LLVM integer types).

Closes tensorflow/mlir#134

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/134 from ombre5733:sext-zext-trunc-in-std c7657bc84c0ca66b304e53ec03797e09152e4d31
PiperOrigin-RevId: 270479722
2019-09-21 16:14:56 -07:00
George Karpenkov 2df646bef6 Automated rollback of commit 5684a12434
PiperOrigin-RevId: 270126672
2019-09-19 14:34:30 -07:00
MLIR Team 5684a12434 Outline GPU kernel function into a nested module.
When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module.

PiperOrigin-RevId: 269987720
2019-09-19 01:51:28 -07:00
MLIR Team 1c73be76d8 Unify error messages to start with lower-case.
PiperOrigin-RevId: 269803466
2019-09-18 07:45:17 -07:00
MLIR Team 1da0290c4b Error out when kernel function is not found while translating GPU calls.
PiperOrigin-RevId: 269327909
2019-09-16 07:19:36 -07:00
Alex Zinenko 6755dfdec9 Drop makePositionAttr and the like in favor of Builder::getI64ArrayAttr
The helper functions makePositionAttr() and positionAttr() were originally
introduced in the lowering-to-LLVM-dialect pass to construct integer array
attributes that are used for static positions in extract/insertelement.
Constructing an integer array attribute being fairly common, a utility function
Builder::getI64ArrayAttr was later introduced into the Builder API.  Drop
makePositionAttr and similar homegrown functions and use that API instead.
PiperOrigin-RevId: 269295836
2019-09-16 03:31:09 -07:00
Mahesh Ravishankar 9814b3fa0d Add mechanism to specify extended instruction sets in SPIR-V.
Add support for specifying extended instructions sets. The operations
in SPIR-V dialect are named as 'spv.<extension-name>.<op-name>'. Use
this mechanism to define a 'Exp' operation from GLSL(450)
instructions.
Later CLs will add support for (de)serialization of these operations,
and update the dialect generation scripts to auto-generate the
specification using the spec directly.

Additional changes:
Add a Type Constraint to OpBase.td to check for vector of specified
lengths. This is used to check that the vector type used in SPIR-V
dialect are of lengths 2, 3 or 4.
Update SPIRVBase.td to use this Type constraints for vectors.

PiperOrigin-RevId: 269234377
2019-09-15 19:40:07 -07:00
Lei Zhang 113aadddf9 Update SPIR-V symbols and use GLSL450 instead of VulkanKHR
SPIR-V recently publishes v1.5, which brings a bunch of symbols
into core. So the suffix "KHR"/"EXT"/etc. is removed from the
symbols. We use a script to pull information from the spec
directly.

Also changed conversion and tests to use GLSL450 instead of
VulkanKHR memory model. GLSL450 is still the main memory model
supported by Vulkan shaders and it does not require extra
capability to enable.

PiperOrigin-RevId: 268992661
2019-09-13 15:26:32 -07:00
River Riddle f1b100c77b NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase.
These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.

PiperOrigin-RevId: 268970259
2019-09-13 13:34:27 -07:00
MLIR Team 36508528c7 Overload LLVM::TerminatorOp::build() for empty operands list.
PiperOrigin-RevId: 268041584
2019-09-09 11:39:03 -07:00
MLIR Team b5652720c1 Retain address space during MLIR > LLVM conversion.
PiperOrigin-RevId: 267206460
2019-09-04 12:26:52 -07:00
Lei Zhang 5593e005c6 Add folding rule and dialect materialization hook for spv.constant
This will allow us to use MLIR's folding infrastructure to deduplicate
SPIR-V constants.

This CL also changed isValidSPIRVType in SPIRVDialect to a static method.

PiperOrigin-RevId: 266984403
2019-09-03 12:09:58 -07:00
Mehdi Amini 765d60fd4d Add missing lowering to CFG in mlir-cpu-runner + related cleanup
- the list of passes run by mlir-cpu-runner included -lower-affine and
  -lower-to-llvm but was missing -lower-to-cfg (because -lower-affine at
  some point used to lower straight to CFG); add -lower-to-cfg in
  between. IR with affine ops can now be run by mlir-cpu-runner.

- update -lower-to-cfg to be consistent with other passes (create*Pass methods
  were changed to return unique ptrs, but -lower-to-cfg appears to have been
  missed).

- mlir-cpu-runner was unable to parse custom form of affine op's - fix
  link options

- drop unnecessary run options from test/mlir-cpu-runner/simple.mlir
  (none of the test cases had loops)

- -convert-to-llvmir was changed to -lower-to-llvm at some point, but the
  create pass method name wasn't updated (this pass converts/lowers to LLVM
  dialect as opposed to LLVM IR). Fix this.

(If we prefer "convert", the cmd-line options could be changed to
"-convert-to-llvm/cfg" then.)

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#115

PiperOrigin-RevId: 266666909
2019-09-01 11:33:22 -07:00
River Riddle 4bfae66d70 Refactor the 'walk' methods for operations.
This change refactors and cleans up the implementation of the operation walk methods. After this refactoring is that the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example:

    op->walk<AffineForOp>([](AffineForOp op) { ... });

is now accomplished via:

    op->walk([](AffineForOp op) { ... });

PiperOrigin-RevId: 266209552
2019-08-29 13:04:50 -07:00
Stephan Herhut 545c3e489f Port mlir-cuda-runner to use dialect conversion framework.
Instead of lowering the program in two steps (Standard->LLVM followed
by GPU->NVVM), leading to invalid IR inbetween, the runner now uses
one pattern based rewrite step to go directly from Standard+GPU to
LLVM+NVVM.

PiperOrigin-RevId: 265861934
2019-08-28 01:50:57 -07:00
Mahesh Ravishankar 4ced99c085 Enhance GPU To SPIR-V conversion to support builtins and load/store ops.
To support a conversion of a simple load-compute-store kernel from GPU
dialect to SPIR-V dialect, the conversion of operations like
"gpu.block_dim", "gpu.thread_id" which allow threads to get the launch
conversion is needed. In SPIR-V these are specified as global
variables with builin attributes. This CL adds support to specify
builtin variables in SPIR-V conversion framework. This is used to
convert the relevant operations from GPU dialect to SPIR-V dialect.
Also add support for conversion of load/store operation in Standard
dialect to SPIR-V dialect.
To simplify the conversion add a method to build a spv.AccessChain
operation that automatically determines the return type based on the
base pointer type and the indices provided.

PiperOrigin-RevId: 265718525
2019-08-27 10:50:23 -07:00
Nicolas Vasilache fa592908af Let LLVMOpLowering specify a PatternBenefit - NFC
Currently the benefit is always set to 1 which limits the ability to do A->B->C lowering

PiperOrigin-RevId: 264854146
2019-08-22 09:38:42 -07:00
River Riddle ffde975e21 NFC: Move AffineOps dialect to the Dialect sub-directory.
PiperOrigin-RevId: 264482571
2019-08-20 15:36:39 -07:00
Alex Zinenko 006fcce44a ConvertLaunchFuncToCudaCalls: use LLVM dialect globals
This conversion has been using a stack-allocated array of i8 to store the
null-terminated kernel name in order to pass it to the CUDA wrappers expecting
a C string because the LLVM dialect was missing support for globals.  Now that
the suport is introduced, use a global instead.

Refactor global string construction from GenerateCubinAccessors into a common
utility function living in the LLVM namespace.

PiperOrigin-RevId: 264382489
2019-08-20 07:52:01 -07:00
Nicolas Vasilache f55ac5c076 Add support for LLVM lowering of binary ops on n-D vector types
This CL allows binary operations on n-D vector types to be lowered to LLVMIR by performing an (n-1)-D extractvalue, 1-D vector operation and an (n-1)-D insertvalue.

PiperOrigin-RevId: 264339118
2019-08-20 02:00:22 -07:00
Nicolas Vasilache b628194013 Move Linalg and VectorOps dialects to the Dialect subdir - NFC
PiperOrigin-RevId: 264277760
2019-08-19 17:11:38 -07:00
River Riddle ba0fa92524 NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory.
PiperOrigin-RevId: 264193915
2019-08-19 11:01:25 -07:00
Nicolas Vasilache 9bf69e6a2e Refactor linalg lowering to LLVM
The linalg.view type used to be lowered to a struct containing a data pointer, offset, sizes/strides information. This was problematic when passing to external functions due to ABI, struct padding and alignment issues.

The linalg.view type is now lowered to LLVMIR as a *pointer* to a struct containing the data pointer, offset and sizes/strides. This simplifies the interfacing with external library functions and makes it trivial to add new functions without creating a shim that would go from a value type struct to a pointer type.

The consequences are that:
1. lowering explicitly uses llvm.alloca in lieu of llvm.undef and performs the proper llvm.load/llvm.store where relevant.
2. the shim creation function `getLLVMLibraryCallDefinition` disappears.
3. views are passed by pointer, scalars are passed by value. In the future, other structs will be passed by pointer (on a per-need basis).

PiperOrigin-RevId: 264183671
2019-08-19 10:21:40 -07:00
Jacques Pienaar 79f53b0cf1 Change from llvm::make_unique to std::make_unique
Switch to C++14 standard method as llvm::make_unique has been removed (
https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next
integrates.

PiperOrigin-RevId: 263953918
2019-08-17 11:06:03 -07:00
Mahesh Ravishankar d745101339 Add spirv::GlobalVariableOp that allows module level definition of variables
FuncOps in MLIR use explicit capture. So global variables defined in
module scope need to have a symbol name and this should be used to
refer to the variable within the function. This deviates from SPIR-V
spec, which assigns an SSA value to variables at all scopes that can
be used to refer to the variable, which requires SPIR-V functions to
allow implicit capture. To handle this add a new op,
spirv::GlobalVariableOp that can be used to define module scope
variables.
Since instructions need an SSA value, an new spirv::AddressOfOp is
added to convert a symbol reference to an SSA value for use with other
instructions.
This also means the spirv::EntryPointOp instruction needs to change to
allow initializers to be specified using symbol reference instead of
SSA value
The current spirv::VariableOp which returns an SSA value (as defined
by SPIR-V spec) can still be used to define function-scope variables.
PiperOrigin-RevId: 263951109
2019-08-17 10:20:13 -07:00
Nicolas Vasilache f826ceef3c Extend vector.outerproduct with an optional 3rd argument
This CL adds an optional third argument to the vector.outerproduct instruction.
When such a third argument is specified, it is added to the result of the outerproduct and  is lowered to FMA intrinsic when the lowering supports it.

In the future, we can add an attribute on the `vector.outerproduct` instruction to modify the operations for which to emit code (e.g. "+/*", "max/+", "min/+", "log/exp" ...).

This CL additionally performs minor cleanups in the vector lowering and adds tests to improve coverage.

This has been independently verified to result in proper fma instructions for haswell as follows.

Input:
```
func @outerproduct_add(%arg0: vector<17xf32>, %arg1: vector<8xf32>, %arg2: vector<17x8xf32>) -> vector<17x8xf32> {
  %2 = vector.outerproduct %arg0, %arg1, %arg2 : vector<17xf32>, vector<8xf32>
  return %2 : vector<17x8xf32>
}
}
```

Command:
```
mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
```

Output:
```
outerproduct_add:                       # @outerproduct_add
# %bb.0:
        ...
        vmovaps 112(%rbp), %ymm8
        vbroadcastss    %xmm0, %ymm0
        ...
        vbroadcastss    64(%rbp), %ymm15
        vfmadd213ps     144(%rbp), %ymm8, %ymm0 # ymm0 = (ymm8 * ymm0) + mem
        ...
        vfmadd213ps     400(%rbp), %ymm8, %ymm9 # ymm9 = (ymm8 * ymm9) + mem
        ...
```
PiperOrigin-RevId: 263743359
2019-08-16 03:53:26 -07:00
Mahesh Ravishankar cc980aa416 Simplify the classes that support SPIR-V conversion.
Modify the Type converters to have a SPIRVBasicTypeConverter which
only handles conversion from standard types to SPIRV types. Rename
SPIRVEntryFnConverter to SPIRVTypeConverter. This contains the
SPIRVBasicTypeConverter within it.

Remove SPIRVFnLowering class and have separate utility methods to
lower a function as entry function or a non-entry function. The
current setup could end with diamond inheritence that is not very
friendly to use.  For example, you could define the following Op
conversion methods that lower from a dialect "Foo" which resuls in
diamond inheritance.

template<typename OpTy>
class FooDialect : public SPIRVOpLowering<OpTy> {...};
class FooFnLowering : public FooDialect, SPIRVFnLowering {...};

PiperOrigin-RevId: 263597101
2019-08-15 10:54:46 -07:00
Alex Zinenko 88de8b2a2b GenerateCubinAccessors: use LLVM dialect constants
The GenerateCubinAccessors was generating functions that fill
dynamically-allocated memory with the binary constant of a CUBIN attached as a
stirng attribute to the GPU kernel.  This approach was taken to circumvent the
missing support for global constants in the LLVM dialect (and MLIR in general).
Global constants were recently added to the LLVM dialect.  Change the
GenerateCubinAccessors pass to emit a global constant array of characters and a
function that returns a pointer to the first character in the array.

PiperOrigin-RevId: 263092052
2019-08-13 01:39:21 -07:00
Mehdi Amini 926fb685de Express ownership transfer in PassManager API through std::unique_ptr (NFC)
Since raw pointers are always passed around for IR construct without
implying any ownership transfer, it can be error prone to have implicit
ownership transferred the same way.
For example this code can seem harmless:

  Pass *pass = ....
  pm.addPass(pass);
  pm.addPass(pass);
  pm.run(module);

PiperOrigin-RevId: 263053082
2019-08-12 19:13:12 -07:00
Nicolas Vasilache 252ada4932 Add lowering of vector dialect to LLVM dialect.
This CL is step 3/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.

This CL adds support for converting MLIR n-D vector types to (n-1)-D arrays of 1-D LLVM vectors and a conversion VectorToLLVM that lowers the `vector.extractelement` and `vector.outerproduct` instructions to the proper mix of `llvm.vectorshuffle`, `llvm.extractelement` and `llvm.mulf`.

This has been independently verified to produce proper avx2 code.

Input:
```
func @vec_1d(%arg0: vector<4xf32>, %arg1: vector<8xf32>) -> vector<8xf32> {
  %2 = vector.outerproduct %arg0, %arg1 : vector<4xf32>, vector<8xf32>
  %3 = vector.extractelement %2[0 : i32]: vector<4x8xf32>
  return %3 : vector<8xf32>
}
```

Command:
```
mlir-opt vector-to-llvm.mlir -vector-lower-to-llvm-dialect --disable-pass-threading | mlir-opt -lower-to-cfg -lower-to-llvm | mlir-translate --mlir-to-llvmir | opt -O3 | llc -O3 -march=x86-64 -mcpu=haswell -mattr=fma,avx2
```

Output:
```
vec_1d:                                 # @vec_1d
# %bb.0:
        vbroadcastss    %xmm0, %ymm0
        vmulps  %ymm1, %ymm0, %ymm0
        retq
```
PiperOrigin-RevId: 262895929
2019-08-12 04:08:57 -07:00
River Riddle 41968fb475 NFC: Update usages of OwningRewritePatternList to pass by & instead of &&.
This will allow for reusing the same pattern list, which may be costly to continually reconstruct, on multiple invocations.

PiperOrigin-RevId: 262664599
2019-08-09 17:20:29 -07:00
Nagy Mostafa 48fdc8d7a3 Add support for floating-point comparison 'fcmp' to the LLVM dialect.
This adds support for fcmp to the LLVM dialect and adds any necessary lowerings, as well as support for EDSCs.

Closes tensorflow/mlir#69

PiperOrigin-RevId: 262475255
2019-08-08 18:29:48 -07:00
River Riddle a0df3ebd15 NFC: Implement OwningRewritePatternList as a class instead of a using directive.
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.

PiperOrigin-RevId: 261816316
2019-08-05 18:38:22 -07:00
Mehdi Amini d682877eb3 Remove non-needed includes from ConvertControlFlowToCFG.cpp (NFC)
The includes related to the LLVM dialect are not used in this file and
introduce an implicit dependencies between the two libraries which isn't
reflected in the CMakeLists.txt, causing non-deterministic build failures.

PiperOrigin-RevId: 261576935
2019-08-04 10:59:18 -07:00
Mahesh Ravishankar ea56025f1e Initial implementation to translate kernel fn in GPU Dialect to SPIR-V Dialect
This CL adds an initial implementation for translation of kernel
function in GPU Dialect (used with a gpu.launch_kernel) op to a
spv.Module. The original function is translated into an entry
function.
Most of the heavy lifting is done by adding TypeConversion and other
utility functions/classes that provide most of the functionality to
translate from Standard Dialect to SPIR-V Dialect. These are intended
to be reusable in implementation of different dialect conversion
pipelines.
Note : Some of the files for have been renamed to be consistent with
the norm used by the other Conversion frameworks.
PiperOrigin-RevId: 260759165
2019-07-30 11:55:55 -07:00
Alex Zinenko 60965b4612 Move GPU dialect to {lib,include/mlir}/Dialect
Per tacit agreement, individual dialects should now live in lib/Dialect/Name
with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.

PiperOrigin-RevId: 259896851
2019-07-25 00:41:17 -07:00
Mahesh Ravishankar 2ad92b6e50 Add a utility function to populate StdOp to SPIRV Conversion Patterns
The function populateStdOpsToSPIRVPatterns appends the conversion
patterns automatically generated from StdOpsToSPIRVConversion.td to a
list of patterns

PiperOrigin-RevId: 259677890
2019-07-23 22:38:51 -07:00
MLIR Team 8cb82c9478 Add sitofp to the standard dialect
Conversion from integers (window or input size, padding etc) to floating point is required to express many ML kernels, for example average pooling.

PiperOrigin-RevId: 259575284
2019-07-23 11:23:40 -07:00
River Riddle 3edbd8bf80 NFC: Update the LoopToStd conversion patterns to use RewritePattern instead of ConversionPattern.
These patterns don't require type changes so they don't need to be using ConversionPattern.

PiperOrigin-RevId: 259393151
2019-07-22 13:22:49 -07:00
River Riddle 00bdc8e070 Refactor region type signature conversion to be explicit via patterns.
This cl enforces that the conversion of the type signatures for regions, and thus their entry blocks, is handled via ConversionPatterns. A new hook 'applySignatureConversion' is added to the ConversionPatternRewriter to perform the desired conversion on a region. This also means that the handling of rewriting the signature of a FuncOp is moved to a pattern. A default implementation is provided via 'mlir::populateFuncOpTypeConversionPattern'. This removes the hacky implicit 'dynamically legal' status of FuncOp that was present previously, and leaves it up to the user to decide when/how to convert the signature of a function.

PiperOrigin-RevId: 259161999
2019-07-20 19:06:07 -07:00
Lei Zhang 9291868960 Place generated StandardOps to SPIR-V patterns in anonymous namespace
This avoids polluting the mlir namespace.

PiperOrigin-RevId: 258826497
2019-07-19 11:40:06 -07:00
River Riddle 8b447b6cad NFC: Expose a ConversionPatternRewriter for use with ConversionPatterns.
This specific PatternRewriter will allow for exposing hooks in the future that are only useful for the conversion framework, e.g. type conversions.

PiperOrigin-RevId: 258818122
2019-07-19 11:40:00 -07:00
River Riddle 9e3c2650d2 Refactor the conversion of block argument types in DialectConversion.
This cl begins a large refactoring over how signature types are converted in the DialectConversion infrastructure. The signatures of blocks are now converted on-demand when an operation held by that block is being converted. This allows for handling the case where a region is created as part of a pattern, something that wasn't possible previously.

This cl also generalizes the region signature conversion used by FuncOp to work on any region of any operation. This generalization allows for removing the 'apply*Conversion' functions that were specific to FuncOp/ModuleOp. The implementation currently uses a new hook on TypeConverter, 'convertRegionSignature', but this should ideally be removed in favor of using Patterns. That depends on adding support to the PatternRewriter used by ConversionPattern to allow applying signature conversions to regions, which should be coming in a followup.

PiperOrigin-RevId: 258645733
2019-07-19 11:38:45 -07:00
Nicolas Vasilache 0002e2964d Move affine.for and affine.if to ODS
As the move to ODS is made, body and region names across affine and loop dialects are uniformized.

PiperOrigin-RevId: 258416590
2019-07-16 13:45:47 -07:00
River Riddle 2b9855b5b4 Refactor DialectConversion to support different conversion modes.
Users generally want several different modes of conversion. This cl refactors DialectConversion to provide two:
* Partial (applyPartialConversion)
  - This mode allows for illegal operations to exist in the IR, and does not fail if an operation fails to be legalized.

* Full (applyFullConversion)
  - This mode fails if any operation is not properly legalized to the conversion target. This allows for ensuring that the IR after a conversion only contains operations legal for the target.

PiperOrigin-RevId: 258412243
2019-07-16 13:45:41 -07:00
Lei Zhang d36dd94c75 NFC: Move SPIR-V dialect to Dialect/ subdirectory
PiperOrigin-RevId: 258345603
2019-07-16 13:45:09 -07:00
Nicolas Vasilache e78ea03b24 Replace linalg.for by loop.for
With the introduction of the Loop dialect, uses of the `linalg.for` operation can now be subsumed 1-to-1 by `loop.for`.
This CL performs the replacement and tests are updated accordingly.

PiperOrigin-RevId: 258322565
2019-07-16 13:44:57 -07:00
River Riddle 2087bf6386 Remove lowerAffineConstructs and lowerControlFlow in favor of providing patterns.
These methods don't compose well with the rest of conversion framework, and create artificial breaks in conversion. Replace these methods with two(populateAffineToStdConversionPatterns and populateLoopToStdConversionPatterns respectively) that populate a list of patterns to perform the same behavior.

PiperOrigin-RevId: 258219277
2019-07-16 13:44:45 -07:00
Alex Zinenko ec82e1c907 Decouple LLVM dialect from Standard dialect
Due to the absence of ODS support for enum attributes, the implementation of
the LLVM dialect `icmp` operation was reusing the comparison predicate from the
Standard dialect, creating an avoidable library dependency.  With ODS support
and ICmpPredicate attribute recently introduced, the dependency is no longer
justified.  Update the Standard to LLVM convresion to also convert the
CmpIPredicate into LLVM::ICmpPredicate and remove the unnecessary includes.

Note that the MLIRLLVMIR library did not explicitly depend on MLIRStandardOps,
requiring dependees of MLIRLLVMIR to also depend on MLIRStandardOps, which
should no longer be the case.

PiperOrigin-RevId: 258148456
2019-07-16 13:43:31 -07:00
Nicolas Vasilache cca53e8527 Extract std.for std.if and std.terminator in their own dialect
These ops should not belong to the std dialect.
This CL extracts them in their own dialect and updates the corresponding conversions and tests.

PiperOrigin-RevId: 258123853
2019-07-16 13:43:18 -07:00
River Riddle 8e349a48b6 Remove the 'region' field from OpBuilder.
This field wasn't updated as the insertion point changed, making it potentially dangerous given the multi-level of MLIR(e.g. 'createBlock' would always insert the new block in 'region'). This also allows for building an OpBuilder with just a context.

PiperOrigin-RevId: 257829135
2019-07-12 17:42:41 -07:00
Nicolas Vasilache cab671d166 Lower affine control flow to std control flow to LLVM dialect
This CL splits the lowering of affine to LLVM into 2 parts:
1. affine -> std
2. std -> LLVM

The conversions mostly consists of splitting concerns between the affine and non-affine worlds from existing conversions.
Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process, it can be reintroduced later if needed.

LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build.

PiperOrigin-RevId: 257794436
2019-07-12 08:44:28 -07:00
Alex Zinenko 2178467dca LoopsToGPU: use PassRegistration with constructor
PassRegistration with an optional constructor was introduced after the
LoopsToGPUPass, which resorted to deriving one pass from another as a means of
accepting options supplied as command-line arguments. Use PassRegistration with
constructor instead of defining a derived pass for LoopsToGPU.  Also rename the
pass to better reflect its current nature.

PiperOrigin-RevId: 257786923
2019-07-12 08:44:14 -07:00
River Riddle 9dbef0bf96 Rename FunctionAttr to SymbolRefAttr.
This allows for the attribute to hold symbolic references to other operations than FuncOp. This also allows for removing the dependence on FuncOp from the base Builder.

PiperOrigin-RevId: 257650017
2019-07-12 08:43:42 -07:00
River Riddle 6da343ecfc NFC: Replace Module::getNamedFunction with lookupSymbol<FuncOp>.
This allows for removing the last direct reference to FuncOp from ModuleOp.

PiperOrigin-RevId: 257498296
2019-07-12 08:43:03 -07:00
River Riddle b3e28fca53 NFC: Remove Function::getModule.
There is already a more general 'getParentOfType' method, and 'getModule' is likely to be misused as functions get placed within different regions than ModuleOp.

PiperOrigin-RevId: 257442243
2019-07-12 08:42:21 -07:00
River Riddle fec20e590f NFC: Rename Module to ModuleOp.
Module is a legacy name that only exists as a typedef of ModuleOp.

PiperOrigin-RevId: 257427248
2019-07-10 10:11:21 -07:00
River Riddle 6b6dc59f30 Update ModuleOp::create(...) to take a Location instead of a context.
This allows for giving a Module a more interesting location than 'Unknown'.

PiperOrigin-RevId: 257310117
2019-07-10 10:11:00 -07:00
River Riddle 8c44367891 NFC: Rename Function to FuncOp.
PiperOrigin-RevId: 257293379
2019-07-10 10:10:53 -07:00
Alex Zinenko 80e2871087 Extend AffineToGPU to support Linalg loops
Extend the utility that converts affine loop nests to support other types of
loops by abstracting away common behavior through templates.  This also
slightly simplifies the existing Affine to GPU conversion by always passing in
the loop step as an additional kernel argument even though it is a known
constant.  If it is used, it will be propagated into the loop body by the
existing canonicalization pattern and can be further constant-folded, otherwise
it will be dropped by canonicalization.

This prepares for the common loop abstraction that will be used for converting
to GPU kernels, which is conceptually close to Linalg loops, while maintaining
the existing conversion operational.

PiperOrigin-RevId: 257172216
2019-07-09 05:26:50 -07:00
River Riddle 626b8b6a5d NFC: Remove `Module::getFunctions` in favor of a general `getOps<T>`.
Modules can now contain more than just Functions, this just updates the iteration API to reflect that. The 'begin'/'end' methods have also been updated to iterate over opaque Operations.

PiperOrigin-RevId: 257099084
2019-07-08 18:28:17 -07:00
Lei Zhang 891a7911c2 Add dependencies for standard ops to SPIR-V conversion
PiperOrigin-RevId: 257026374
2019-07-08 12:40:21 -07:00
River Riddle ce502af9cd NFC: Remove the various "::getFunction" methods.
These methods assume that a function is a valid builtin top-level operation, and removing these methods allows for decoupling FuncOp and IR/. Utility "getParentOfType" methods have been added to Operation/OpState to allow for querying the first parent operation of a given type.

PiperOrigin-RevId: 257018913
2019-07-08 12:40:08 -07:00
Stephan Herhut e8b21a75f8 Add an mlir-cuda-runner tool.
This tool allows to execute MLIR IR snippets written in the GPU dialect
on a CUDA capable GPU. For this to work, a working CUDA install is required
and the build has to be configured with MLIR_CUDA_RUNNER_ENABLED set to 1.

PiperOrigin-RevId: 256551415
2019-07-04 07:53:54 -07:00
Stephan Herhut 1bcaa3185d Add missing mlir:: namespace in definition of createConvertToLLVMIRPass.
PiperOrigin-RevId: 256546769
2019-07-04 07:53:31 -07:00
Alex Zinenko 9a1b6fec79 Make ConvertStandardToLLVMPass extendable with other patterns
Extend the LLVM lowering pass to accept callbacks that construct an instance of
(a subclass of) LLVMTypeConverter and populate a list of conversion patterns.
These callbacks will be called when the pass processes a module and their
results will be used to set up the dialect conversion infrastructure.  Clients
can now provide additional conversion patterns to avoid the need of
materializing type conversions between LLVM and other types.

PiperOrigin-RevId: 256532415
2019-07-04 07:53:19 -07:00
Lei Zhang 0782b37936 NFC: Move Standard to SPIR-V conversion to lib/Conversion
PiperOrigin-RevId: 256271759
2019-07-03 14:35:42 -07:00
River Riddle 206e55cc16 NFC: Refactor Module to be value typed.
As with Functions, Module will soon become an operation, which are value-typed. This eases the transition from Module to ModuleOp. A new class, OwningModuleRef is provided to allow for owning a reference to a Module, and will auto-delete the held module on destruction.

PiperOrigin-RevId: 256196193
2019-07-02 16:43:36 -07:00
River Riddle 54cd6a7e97 NFC: Refactor Function to be value typed.
Move the data members out of Function and into a new impl storage class 'FunctionStorage'. This allows for Function to become value typed, which will greatly simplify the transition of Function to FuncOp(given that FuncOp is also value typed).

PiperOrigin-RevId: 255983022
2019-07-01 11:39:00 -07:00
Alex Zinenko d046b2ddec Expose AffineToGPUPass for use with PassManager
Originally, AffineToGPUPass was created and registered in the source file
mainly for testing purposes.  Provide a factory function that constructs
AffineToGPU pass to make it usable in pass pipelines.

PiperOrigin-RevId: 255902831
2019-07-01 09:55:24 -07:00
Stephan Herhut 630119f84f Add a pass that inserts getters for all cubins found via nvvm.cubin
annotations.

Getters are required as there are currently no global constants in MLIR and this
is an easy way to unblock CUDA execution while waiting for those.

PiperOrigin-RevId: 255169002
2019-06-26 05:33:11 -07:00
Stephan Herhut c72c6c3907 Make GPU to CUDA transformations independent of CUDA runtime.
The actual transformation from PTX source to a CUDA binary is now factored out,
enabling compiling and testing the transformations independently of a CUDA
runtime.

MLIR has still to be built with NVPTX target support for the conversions to be
built and tested.

PiperOrigin-RevId: 255167139
2019-06-26 05:16:37 -07:00
River Riddle a4c3a6455c Move the emitError/Warning/Remark utility methods out of MLIRContext and into the mlir namespace.
Now that Locations are attributes, they have direct access to the MLIR context. This allows for simplifying error emission by removing unnecessary context lookups.

PiperOrigin-RevId: 255112791
2019-06-25 21:32:23 -07:00
Alex Zinenko 2628641b23 GPUtoNVVM: adjust integer bitwidth when lowering special register ops
GPU dialect operations (launch and launch_func) use `index` type for thread and
block index values inside the kernel, for compatibility with affine loops.
NVVM dialect operations, following the NVVM intrinsics, use `!llvm.i32` type,
which does not necessarily have the same bit width as the lowered `index` type.
Optionally sign-extend (indices are signed) or truncate the result of the NVVM
dialect operation to the bit width of the lowered `index` type before passing
it to other operations.  This behavior is consistent with `std.index_cast`.  We
cannot use the latter since we are targeting LLVM dialect types directly,
rather than standard integer types.

PiperOrigin-RevId: 254980868
2019-06-25 09:21:26 -07:00
Stephan Herhut 10f320f7c0 Add gpu::GPUDialect::isKernel helper.
Also some mild cleanup of the kernel to cubin conversion pass.

PiperOrigin-RevId: 254959303
2019-06-25 09:20:40 -07:00
Alex Zinenko f35d0c8570 NVVM target: emit nvvm.annotations for kernel functions
PTX backend in LLVM expects additional module-level metadata
`!nvvm.annotations` that lists functions that can be used as GPU kernels.
Generate this metadata based on the `gpu.kernel` attribute attached to
functions.  This attribute is added automatically by the kernel outlining pass
in the GPU dialect lowering flow.

PiperOrigin-RevId: 254957345
2019-06-25 09:19:27 -07:00
River Riddle 9764ae3f24 Refactor the TypeConverter to support more robust type conversions:
* Support for 1->0 type mappings, i.e. when the argument is being removed.
* Reordering types when converting a type signature.
* Adding new inputs when converting a type signature.

This cl also lays down the initial foundation for supporting 1->N type mappings, but full support will come in a followup.

Moving forward, function signature changes will be driven by populating a SignatureConversion instance. This class contains all of the necessary information for adding/removing/remapping function signatures; e.g. addInputs, addResults, remapInputs, etc.

PiperOrigin-RevId: 254064665
2019-06-19 23:08:33 -07:00
Stephan Herhut 9d81081d90 Add a pass that translates GPU.launch_func into a series of runtime calls.
This does not map the calls to the CUDA libary directly but uses a slim wrapper
ABI on top that has more convenient types for code generation and is stable. Such
ABI is expected to be provided by the actual runner.

PiperOrigin-RevId: 253983833
2019-06-19 23:07:43 -07:00
Alex Zinenko 14e2f4a22b Fix GPUToNVVM naming: NNVM should have been NVVM
Rename `createLowerGpuOpsToNNVMOpsPass` to `createLowerGpuOpsToNVVMOpsPass`.

PiperOrigin-RevId: 253801577
2019-06-19 23:06:36 -07:00
Alex Zinenko b9beff0384 Make examples/Linalg3 depend on the new standard to LLVM conversion library.
PiperOrigin-RevId: 253767820
2019-06-19 23:05:57 -07:00
Stephan Herhut e0596a4d63 Use llvm::StringSwitch in lowering of GPU ops to NVVM ops.
PiperOrigin-RevId: 253767688
2019-06-19 23:05:48 -07:00
Stephan Herhut 893374bfa2 Add a pass that translates a CUDA kernel function (tagged with nvvm.kernel) to
a CUBIN blob for execution on CUDA GPUs.

This is a first in a series of patches to build a simple CUDA runner to allow
experimenting with MLIR code on GPUs.

PiperOrigin-RevId: 253758915
2019-06-19 23:05:37 -07:00
Alex Zinenko f218519cc2 Introduce std.index_cast and its lowering+translation to LLVM
Index types integers of platform-specific bit width.  They are used to index
memrefs and as loop induction variables, however they could not be obtained
from an integer until now, making it virtually impossible to express indirect
accesses (given that memrefs of indices are not allowed) or data-dependent
loops.  Introduce `std.index_cast` to transform indices into integers and vice
versa.  The semantics of this cast is to sign-extend when casting to a wider
integer, and to truncate when casting to a narrower integer.  It belongs to
StandardOps because both types it operates on are standard types, and because
its results are likely to be used in std.load and std.store.

Introduce llvm.sext, llvm.zext and llvm.trunc operations to the LLVM dialect.
Provide the conversion of `std.index_cast` to llvm.sext or llvm.trunc,
depending on the actual bitwidth of `index` known during the conversion.

PiperOrigin-RevId: 253624100
2019-06-19 23:04:01 -07:00
Alex Zinenko 4291ae7431 Factor Region::getUsedValuesDefinedAbove into Transforms/RegionUtils
Arguably, this function is only useful for transformations and should not
pollute the main IR.  Also make sure it accepts a the resulting container
by-reference instead of returning it.

PiperOrigin-RevId: 253622981
2019-06-19 23:03:51 -07:00
Stephan Herhut a14eeacf2c Add lowering pass from GPU dialect operations to LLVM/NVVM intrinsics.
PiperOrigin-RevId: 253551452
2019-06-19 23:03:30 -07:00
Alex Zinenko ebea5767fb Start moving conversions to {lib,include/mlir}/Conversion
Conversions from dialect A to dialect B depend on both A and B.  Therefore, it
is reasonable for them to live in a separate library that depends on both
DialectA and DialectB library, and does not forces dependees of DialectA or
DialectB to also link in the conversion.  Create the directory layout for the
conversions and move the Standard to LLVM dialect conversion as the first
example.

PiperOrigin-RevId: 253312252
2019-06-19 23:02:50 -07:00
Alex Zinenko ee6f84aebd Convert a nest affine loops to a GPU kernel
This converts entire loops into threads/blocks.  No check on the size of the
block or grid, or on the validity of parallelization is performed, it is under
the responsibility of the caller to strip-mine the loops and to perform the
dependence analysis before calling the conversion.

PiperOrigin-RevId: 253189268
2019-06-19 23:02:02 -07:00