Commit Graph

2054 Commits

Author SHA1 Message Date
Deven Desai 956a831130 [ROCm] Fix the return type for the device function calls from i32 to i64.
This matches what the runtime library expects.

Closes tensorflow/mlir#171

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/171 from deven-amd:deven-rocdl-device-func-i64 80762629a8c34e844ebdc542b34dd783990db9db
PiperOrigin-RevId: 273640767
2019-10-08 17:41:42 -07:00
Denis Khalikov d21ba951de [spirv] Add a pass to decorate the composite types with layout info.
Add a pass to decorate the composite types used by
composite objects in the StorageBuffer, PhysicalStorageBuffer,
Uniform, and PushConstant storage classes with layout information.

Closes tensorflow/mlir#156

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/156 from denis0x0D:sandbox/layout_info_decoration 7c50840fd38ca169a2da7ce9886b52b50c868b84
PiperOrigin-RevId: 273634140
2019-10-08 16:54:11 -07:00
River Riddle 49b29dd186 Add a PatternRewriter hook for cloning a region into another.
This is similar to the `inlineRegionBefore` hook, except the original blocks are left unchanged. The region to be cloned *must* not have been modified during the conversion process at the point of cloning, i.e. it must belong to an operation that has yet to be converted, or to the operation that is currently being converted.
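A minimal sketch of how a conversion pattern might use this hook (names and the exact overload are illustrative, not taken from this commit):

```cpp
// Clone the single region of `op` to the end of `dest`'s region. Unlike
// inlineRegionBefore, the original blocks of `op` are left in place.
void cloneBody(PatternRewriter &rewriter, Operation *op, Operation *dest) {
  Region &src = op->getRegion(0);
  Region &dst = dest->getRegion(0);
  rewriter.cloneRegionBefore(src, dst, dst.end());
}
```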

PiperOrigin-RevId: 273622533
2019-10-08 15:45:08 -07:00
Uday Bondhugula 6136f33d59 unroll and jam: fix order of jammed bodies
- bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of
  (i, i+1, i+2, i+3), for example, for factor 4 (see the sketch after this
  list)

- clean up hardcoded test cases
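
For illustration, unroll-and-jam by a factor of 4 should order the jammed bodies as follows (C-like sketch, not code from this change):

```cpp
// After the fix, the four jammed copies appear in increasing order of the
// induction value: i, i+1, i+2, i+3.
for (int i = 0; i < N; i += 4) {
  body(i);
  body(i + 1);
  body(i + 2);
  body(i + 3);
}
```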

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#170

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665
PiperOrigin-RevId: 273613131
2019-10-08 15:13:11 -07:00
River Riddle ac91e67375 Add support for walking the uses of a symbol.
MLIR uses symbol references to model references to many global entities, such as functions/variables/etc. Before this change, there was no way to actually reason about the uses of such entities. This change provides a walker for symbol references (via SymbolTable::walkSymbolUses), as well as 'use_empty' support (via SymbolTable::symbol_use_empty). It also resolves some deficiencies in the LangRef definition of SymbolRefAttr, namely the restrictions on where a SymbolRefAttr can be stored (ArrayAttr and DictionaryAttr) and the relationship with operations containing the SymbolTable trait.
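A rough sketch of the two queries (the callback shape and argument order here are assumptions, not the verbatim signatures):

```cpp
// Walk every symbol reference nested under `fromOp` and print its user.
SymbolTable::walkSymbolUses(fromOp, [&](SymbolTable::SymbolUse use) {
  llvm::errs() << "use of " << use.getSymbolRef() << " in: "
               << *use.getUser() << "\n";
  return WalkResult::advance();
});

// Check whether the symbol named `name` has any uses nested under `fromOp`.
bool unused = SymbolTable::symbol_use_empty(name, fromOp);
```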

PiperOrigin-RevId: 273549331
2019-10-08 10:21:59 -07:00
River Riddle 0dd404e4e1 NFC: Remove unused default cl::opt value.
The default value is never used, as the value of the elide option is only consulted when the option has an occurrence.
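For context, the pattern in question looks roughly like this (option name hypothetical): a cl::opt consulted only through getNumOccurrences() never reads its cl::init default.

```cpp
#include "llvm/Support/CommandLine.h"

// No cl::init(...) needed: the value is read only if the flag occurred.
static llvm::cl::opt<unsigned> elideLimit(
    "elide-limit", llvm::cl::desc("Elide attributes larger than this"));

bool shouldElide(unsigned numElements) {
  return elideLimit.getNumOccurrences() > 0 && numElements > elideLimit;
}
```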

PiperOrigin-RevId: 273545143
2019-10-08 10:04:28 -07:00
Alex Zinenko 0cdc53a762 Linalg to LLVM lowering: decrease the reliance on symbol lookup in a module
During the conversion, both the original and the converted function may coexist
in the module and have the same symbol name. There is no guarantee which of the
two will be found by the symbol lookup. Avoid returning the result of the
library function lookup when lowering Linalg to Standard or LLVM. Use the
symbol reference instead. After the conversion completes, only one symbol will
remain and the Ops using SymbolRefAttrs will be referring to the correct one.
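A sketch of the resulting pattern, with illustrative names and assuming a CallOp builder that accepts the callee's symbol name directly:

```cpp
// Refer to the library function purely by name; the SymbolRefAttr is
// resolved after conversion, when only one symbol with this name remains.
rewriter.create<CallOp>(op->getLoc(), "linalg_copy_viewsxf32",
                        resultTypes, operands);
```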

PiperOrigin-RevId: 273510079
2019-10-08 06:55:25 -07:00
Alex Zinenko 11d12670da GPUToCUDA: attach CUBIN to the nested module rather than to the function
Originally, we were attaching attributes containing CUBIN blobs to the kernel
function called by `gpu.launch_func`. This kernel is now contained in a nested
module that is used as a compilation unit. Attach compiled CUBIN blobs to the
module rather than to the function since we were compiling the module. This
also avoids duplication of the attribute on multiple kernels within the same
module.

PiperOrigin-RevId: 273497303
2019-10-08 05:11:26 -07:00
Alex Zinenko 52e082b6ed GPUToCUDA: emit addressof directly instead of wrapping it into a getter function
Originally, the CUBIN getter function was introduced as a mechanism to
circumvent the absence of globals in the LLVM dialect. It would allocate memory
and populate it with the CUBIN data. LLVM dialect now supports globals and they
are already used to store CUBIN data, making the getter function a trivial
address computation of a global. Emit the address computation directly at the
place of `gpu.launch_func` instead of putting it in a function and calling it.
This simplifies the conversion flow and prepares it for using the
DialectConversion infrastructure.
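Sketched with illustrative variable names (modern spelling), the lowering now emits the address computation inline:

```cpp
// At the gpu.launch_func site: take the address of the LLVM global holding
// the CUBIN blob instead of calling a generated getter function.
Value cubinPtr = rewriter.create<LLVM::AddressOfOp>(loc, cubinGlobal);
```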

PiperOrigin-RevId: 273496221
2019-10-08 05:03:42 -07:00
Alex Zinenko 16af5924cb Fuse GenerateCubinAccessors pass into LaunchFuncToCuda
Now that the accessor function is a trivial getter of the global variable, it
makes less sense to have the getter generation as a separate pass. Move the
getter generation into the lowering of `gpu.launch_func` to CUDA calls. This
change is mostly code motion, but the process can be simplified further by
generating the addressof in place instead of using a call. This will be done
in a follow-up.

PiperOrigin-RevId: 273492517
2019-10-08 04:35:33 -07:00
Alex Zinenko 90d65d32d6 Use named modules for gpu.launch_func
The kernel function called by gpu.launch_func is now placed into an isolated
nested module during the outlining stage to simplify separate compilation.
Until recently, modules did not have names and could not be referenced. This
limitation was circumvented by introducing a stub kernel with the same name at
the same nesting level as the module containing the actual kernel. This
relation is only effective in one direction: from actual kernel function to its
launch_func "caller".

Leverage the recently introduced symbol name attributes on modules to refer to
a specific nested module from `gpu.launch_func`. This removes the implicit
connection between the identically named stub and kernel functions. It also
enables `gpu.launch_func` to call different kernels located in the same
module.

PiperOrigin-RevId: 273491891
2019-10-08 04:30:32 -07:00
Jing Pu 780f107a57 Update some uses of the mlir::interleave API to take the container argument directly.
PiperOrigin-RevId: 273446814
2019-10-07 21:53:11 -07:00
River Riddle a8a73f0640 Add a flag to the AsmPrinter for eliding large ElementsAttrs.
Some modules may have extremely large ElementsAttrs, which makes debugging involving IR dumping extremely slow and painful. This change adds a flag that will elide ElementsAttrs with a "large" (as defined by the user) number of elements by printing "..." instead of the element data.

PiperOrigin-RevId: 273413100
2019-10-07 17:19:20 -07:00
Jing Pu 17606a108b Print result types when dumping graphviz.
PiperOrigin-RevId: 273406833
2019-10-07 16:45:53 -07:00
MLIR Team 6b3462a77b Expose `fuseProducerOf` in Linalg/Utils/Utils.h.
PiperOrigin-RevId: 273384063
2019-10-07 15:01:07 -07:00
Mahesh Ravishankar 37e0e8cf16 Do not add spirv::BitcastOp for cast from signed to unsigned type.
Since MLIR integer types don't make a distinction between signed and
unsigned integers, during deserialization of SPIR-V binaries, an OpBitcast
might result in a cast from/to the same type. Do not add a spv.Bitcast
operation to the spv.module in these cases.

PiperOrigin-RevId: 273381887
2019-10-07 14:52:00 -07:00
River Riddle aeada290b8 Add a new class, OpPrintingFlags, to enable programmatic control of Operation::print behavior.
This allows for controlling the behavior of the AsmPrinter programmatically instead of relying exclusively on cl::opt flags. It also allows for more fine-tuned control of printing behavior per call site, instead of being applied globally.
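For example, the programmatic equivalent of the elide flag added earlier might look like this (the limit value is arbitrary):

```cpp
// Print `op`, eliding any ElementsAttr with more than 16 elements.
OpPrintingFlags flags;
flags.elideLargeElementsAttrs(/*largeElementLimit=*/16);
op->print(llvm::outs(), flags);
```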

PiperOrigin-RevId: 273368361
2019-10-07 13:54:49 -07:00
Mahesh Ravishankar 9e9c3a009a Update UndefOp (de)serialization to generate OpUndef at module level.
The SPIR-V spec recommends all OpUndef instructions be generated at
module level. For the SPIR-V dialect it's better for UndefOp to produce
an SSA value for use with other instructions. If UndefOp is to be used
at module level, it cannot produce an SSA value (use of this SSA value
within FuncOp would need implicit capture). To satisfy needs of the
SPIR-V spec while making it simpler to represent UndefOp in the SPIR-V
dialect, the serialization is updated to create OpUndef instruction
at module scope.

PiperOrigin-RevId: 273355526
2019-10-07 12:56:38 -07:00
Lei Zhang ebf584b813 [spirv] Fix function entry block erase after moving to spv.selection
The structured selection/loop's entry block does not have arguments.
If the function's header block is also part of the structured control
flow, we cannot just simply erase it because it may contain arguments
matching the function signature and used by the cloned blocks. Instead,
turn it into a block only containing a spv.Branch op.

Also, we can directly emit instructions for the spv.selection header
block to the block containing the spv.selection op. This eliminates
unnecessary branches in the SPIR-V blob.

Added a test for nested spv.loop.

PiperOrigin-RevId: 273351424
2019-10-07 12:37:13 -07:00
Uday Bondhugula 89e7a76a1c fix simplify-affine-structures bug
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#157

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/157 from bondhugula:quickfix bd1fcd79825fc0bd5b4a3e688153fa0993ab703d
PiperOrigin-RevId: 273316498
2019-10-07 10:04:50 -07:00
Christian Sigg 9f11b0e12f Change Block::getParent() to be a const function.
This is only necessary because ilist_node_with_parent specifically requires a
'getParent() const' method. If/when ilist_node removes this constraint, we
should drop the const to fit the rest of the MLIR const model.

PiperOrigin-RevId: 273316153
2019-10-07 10:03:28 -07:00
Nicolas Vasilache 9f98bcda47 Support AllocOp terminal in Linalg::AliasAnalysis.
Now that linalg.view and strided memrefs are unified, there is no reason to
disallow AllocOp in alias analysis. This CL adds support for AllocOp, which
allows writing shorter tests that do not require explicitly creating a view
for each operation.

PiperOrigin-RevId: 273303060
2019-10-07 09:01:18 -07:00
Jacques Pienaar 27e8efedf8 Add DialectType and generate docs for dialect types
Add a new `typeDescription` field (`description` was already used by the base constraint class) to Type to allow writing longer descriptions about a type being defined. This allows for providing additional information/rationale for a defined type. This currently uses `description` as the heading/name for the type in the generated documentation.

PiperOrigin-RevId: 273299332
2019-10-07 08:41:13 -07:00
MLIR Team da984166df Add OpaqueLoc to MLIR locations.
See RFC: https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/xE2IzfhE3Wg.

An opaque location stores two pointers: one points to a data structure that is external to MLIR, and the other is unique for each type and represents the type id of that data structure. OpaqueLoc also stores an optional location that can be used if the first one is not suitable.
OpaqueLoc is managed similarly to FileLineColLoc. It is passed around by MLIR transformations and can be used in compound locations like CallSiteLoc.
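A sketch of the intended use; `FrontendNode` is a hypothetical external data structure and the factory signature is assumed:

```cpp
// Given FrontendNode *node and a Location fallbackLoc: wrap the external
// node in a location, keeping a fallback for consumers that cannot
// interpret the opaque pointer.
Location loc = OpaqueLoc::get<FrontendNode *>(node, fallbackLoc);
```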

PiperOrigin-RevId: 273266510
2019-10-07 05:05:42 -07:00
Christian Sigg 7c765d97f9 Support reduction of partial warps.
gpu.all_reduce now supports block sizes that are not a multiple of 32.

PiperOrigin-RevId: 273255204
2019-10-07 03:31:00 -07:00
Jacques Pienaar 77672c9777 Enable emitting dialect summary & description during op generation
Sort ops per dialect and emit summary & description (if provided) of each dialect before emitting the ops of the dialect.

PiperOrigin-RevId: 273077138
2019-10-05 12:21:51 -07:00
Geoffrey Martin-Noble 18db4ce493 Allow element type traits to operate on scalars
This allows confirming that a scalar argument has the same element type as a shaped one. It's easy to validate a type is shaped on its own if that's desirable, so this shouldn't make that use case harder. This matches the behavior of other traits that operate on element type (e.g. AllElementTypesMatch). Also this makes the code simpler because now we just use getElementTypeOrSelf.

Verified that all uses in core already check the type is shaped in another way.
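A minimal illustration of the helper this relies on:

```cpp
#include "mlir/IR/TypeUtilities.h"

Type f32 = builder.getF32Type();
Type tensor = RankedTensorType::get({4}, f32);
// Both calls yield f32, so a scalar argument can match a shaped one.
assert(getElementTypeOrSelf(tensor) == getElementTypeOrSelf(f32));
```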

PiperOrigin-RevId: 273068507
2019-10-05 10:06:06 -07:00
Lei Zhang c020480fc6 [spirv] Allow return ops to be in control flow ops
Use `getParentOfType<FuncOp>()` instead of `cast<FuncOp>(getParentOp())`
to avoid a crash when return ops are used inside spv.selection/spv.loop.
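The distinction, sketched:

```cpp
// getParentOp() returns the immediate parent, which inside a
// spv.selection/spv.loop region is not the function; getParentOfType
// keeps walking up until it finds a FuncOp.
auto fn = returnOp->getParentOfType<FuncOp>();
```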

PiperOrigin-RevId: 273006041
2019-10-04 20:08:52 -07:00
Mahesh Ravishankar 3f8bde40cb Add spv.Undef op to support OpUndef instruction in SPIR-V.
Add support for the OpUndef instruction. Also update the dialect
generation script to fix a few bugs in the instruction spec
generation.

PiperOrigin-RevId: 272975685
2019-10-04 16:00:22 -07:00
Mahesh Ravishankar 77a809d7a1 Add some utility builder functions for SPIR-V operations.
Add builder functions for spv._address_of, spv.EntryPoint,
spv.ExecutionMode and spv.Load to make it easier to create these
operations.
Fix a minor bug in the printing of spv.EntryPoint.
Add a utility function to get the attribute name associated with a
decoration.

PiperOrigin-RevId: 272952846
2019-10-04 14:02:48 -07:00
Nicolas Vasilache 754ea72794 Replace constexpr MemRefType::kDynamicStrideOrOffset by a MemRefType::getDynamicStrideOrOffset() method - NFC
This fixes global ODR-use issues, some of which manifest in Parser.cpp.

Fixes tensorflow/mlir#167.

PiperOrigin-RevId: 272886347
2019-10-04 08:58:09 -07:00
Nicolas Vasilache 516f6a3477 Add missing Linalg lowerings to allow roundtrip.mlir to lower to LLVM
Certain lowering patterns were reported as [missing](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/dkdmHa77sSQ).

This CL adds them and allows Linalg/roundtrip.mlir and Linalg/loops.mlir to lower to LLVM directly. Those 2 tests are updated to additionally check that the direct lowering to LLVM does not crash.

The following points, left as TODOs, still need to be addressed for correct end-to-end execution:
1. the lowering for ConvOp needs to pass attributes such as strides and dilations; the external library call needs to support it.
2. the lowering for GenericOp needs to support lowering to loops as a DialectConversion pattern. This is blocked on the DialectConversion infrastructure accepting an OperationFolder.

PiperOrigin-RevId: 272878131
2019-10-04 08:07:54 -07:00
Deven Desai d064469f6f Moving the GPUIndexIntrinsicOpLowering template to a common location
The GPUIndexIntrinsicOpLowering template is currently used by the code in both the GPUToNVVM and GPUToROCDL dirs.
Moving it to a common location to remove code duplication.

Closes tensorflow/mlir#163

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/163 from deven-amd:deven-refactor-gpu-index-ops-lowering b8dc2a5f5353df196039b6ff2ad42106028693ed
PiperOrigin-RevId: 272863297
2019-10-04 06:20:05 -07:00
Christian Sigg 85dcaf19c7 Fix typos, NFC.
PiperOrigin-RevId: 272851237
2019-10-04 04:37:53 -07:00
River Riddle 5830f71a45 Add support for inlining calls with different arg/result types from the callable.
Some dialects have implicit conversions inherent in their modeling, meaning that a call may have a different type than the type that the callable expects. To support this, a hook is added to the dialect interface that allows for materializing conversion operations during inlining when there is a mismatch. A hook is also added to the callable interface to allow for introspecting the expected result types.
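A sketch of such a dialect hook; `MyCastOp` stands in for a dialect-specific conversion operation, and modern spelling is assumed:

```cpp
struct MyDialectInlinerInterface : public DialectInlinerInterface {
  using DialectInlinerInterface::DialectInlinerInterface;

  // Build a conversion when a call operand/result type differs from what
  // the callable expects; returning nullptr signals that no conversion is
  // possible.
  Operation *materializeCallConversion(OpBuilder &builder, Value input,
                                       Type resultType,
                                       Location conversionLoc) const final {
    return builder.create<MyCastOp>(conversionLoc, resultType, input);
  }
};
```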

PiperOrigin-RevId: 272814379
2019-10-03 23:10:51 -07:00
River Riddle a20d96e436 Update the Inliner pass to work on SCCs of the CallGraph.
This allows for the inliner to work on arbitrary call operations. The updated inliner will also work bottom-up through the callgraph enabling support for multiple levels of inlining.

PiperOrigin-RevId: 272813876
2019-10-03 23:05:21 -07:00
Feng Liu 8c95223e3c Add `axis` attribute to the quant.stats op
The length of the first dimension of the axisStats attribute should equal the
slice size of the input argument when split along the axis dimension.

PiperOrigin-RevId: 272798042
2019-10-03 20:29:08 -07:00
MLIR Team 0dfa7fc908 Add fpext and fptrunc to the Standard dialect and include conversion to LLVM
PiperOrigin-RevId: 272768027
2019-10-03 16:37:24 -07:00
Christian Sigg 496f4590a1 Generalize parse/printBinaryOp to parse/printOneResultOp.
PiperOrigin-RevId: 272722539
2019-10-03 13:00:12 -07:00
Nicolas Vasilache 218f0e611a Add syntactic sugar for strided memref parsing.
This CL implements the last remaining bit of the [strided memref proposal](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).

The syntax is a bit more explicit than what was originally proposed and resembles:
  `memref<?x?xf32, offset: 0, strides: [?, 1]>`

Nonnegative strides and offsets are currently supported. Future extensions will include negative strides.

This also gives a concrete example of syntactic sugar for the [[RFC] Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/-wKHANzDNTg).

The underlying implementation still uses AffineMap layout.
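For reference, the strided form encodes the usual linear addressing: for a 2-D memref with offset o and strides [s0, s1], element (i, j) is located at

```latex
\mathrm{addr}(i, j) = \mathrm{base} + o + i \cdot s_0 + j \cdot s_1
```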

PiperOrigin-RevId: 272717437
2019-10-03 12:34:36 -07:00
Alex Zinenko 0b93c092b6 Make Module::getName return Optional<StringRef>
Module names are optional, so it makes more sense to take and return an
optional any time the name is involved. Also update the language reference to
reflect optional module names.
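Call sites now have to handle the absent case, e.g.:

```cpp
// The implicit top-level module has no name, so check the Optional.
if (Optional<StringRef> name = module.getName())
  llvm::outs() << "module @" << *name << "\n";
else
  llvm::outs() << "unnamed module\n";
```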

PiperOrigin-RevId: 272684698
2019-10-03 10:04:48 -07:00
Alex Zinenko 8633b6bc8e Give modules a name
Modules are now Ops and, as such, can be nested. They do not produce an SSA
value, so there is no way to refer to them in the IR. Introduce support
for symbol names attached to the module Op so that it can be referred to using
SymbolRefAttrs. The name is optional, for example the implicit top-level module
does not have a name.

PiperOrigin-RevId: 272671600
2019-10-03 08:56:38 -07:00
Alex Zinenko bd4762502c Add parentheses around boolean operators in assert
This removes a warning and is generally a good practice.
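The pattern, on a made-up condition:

```cpp
// Before: assert(isExternal || hasBody && !isDeclaration);  // -Wparentheses
// After: the precedence of && over || is made explicit.
assert(isExternal || (hasBody && !isDeclaration));
```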

PiperOrigin-RevId: 272613597
2019-10-03 01:47:14 -07:00
Alex Zinenko e0d78eac23 NFC: rename Conversion/ControlFlowToCFG to Conversion/LoopToStandard
This makes the name of the conversion pass more consistent with the naming
scheme, since it actually converts from the Loop dialect to the Standard
dialect rather than working with arbitrary control flow operations.

PiperOrigin-RevId: 272612112
2019-10-03 01:35:03 -07:00
Alex Zinenko 44ef5e5525 Disallow index types in memrefs.
As specified in the MLIR language reference and rationale documents, `memref`
types should not be allowed to have `index` as element types. As observed in
https://groups.google.com/a/tensorflow.org/forum/#!msg/mlir/P49hVWqTMNc/nW89a4i_AgAJ
this restriction was lifted when canonicalization unit tests for affine
operations were introduced, without sufficient motivation to lift the
restriction itself.  The test in question can be trivially rewritten (return
the value from a function instead of storing it to prevent DCE from removing
the producer operation) and the restriction put back in place.

If `memref<...x index>` is relevant for some use cases, the relaxation of the
type system can be implemented separately with appropriate modifications to the
documentation.

PiperOrigin-RevId: 272607043
2019-10-03 00:58:29 -07:00
Nicolas Vasilache 9604bb6269 Extract MemRefType::getStridesAndOffset as a free function and fix dynamic offset determination.
This also adds coverage with a missing test, which uncovered a bug in the conditional for testing whether an offset is dynamic or not.
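Usage of the free-function form, sketched:

```cpp
SmallVector<int64_t, 4> strides;
int64_t offset;
if (failed(getStridesAndOffset(memrefType, strides, offset)))
  return failure(); // not a strided memref
// Dynamic strides and offsets are reported as the sentinel returned by
// MemRefType::getDynamicStrideOrOffset().
```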

PiperOrigin-RevId: 272505798
2019-10-02 13:25:05 -07:00
Lei Zhang f294e0e513 [spirv] Add support for spv.selection
Similar to spv.loop, spv.selection is another op for modelling
SPIR-V structured control flow. It covers both OpBranchConditional
and OpSwitch with OpSelectionMerge.

Instead of having a `spv.SelectionMerge` op to directly model
selection merge instruction for indicating the merge target,
we use regions to delimit the boundary of the selection: the
merge target is the next op following the `spv.selection` op.
This way it's easier to discover all blocks belonging to
the selection and it plays nicer with the MLIR system.

PiperOrigin-RevId: 272475006
2019-10-02 11:01:57 -07:00
Deven Desai e81b3129b4 [ROCm] Adding pass to lower GPU Dialect to ROCDL Dialect.
This is a follow-up to PR tensorflow/mlir#146, which introduced the ROCDL Dialect. This PR introduces a pass to lower the GPU Dialect to the ROCDL Dialect. As with the previous PR, this one builds on the work done by @whchung, and addresses most of the review comments in the original PR.

Closes tensorflow/mlir#154

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/154 from deven-amd:deven-lower-gpu-to-rocdl 809893e08236da5ab6a38e3459692fa04247773d
PiperOrigin-RevId: 272390729
2019-10-02 01:50:30 -07:00
Jacques Pienaar 2b86e27dbd Show type even if elementsattr is elided in graph
The type is quite useful for debugging and shouldn't be too large.

PiperOrigin-RevId: 272390311
2019-10-02 01:46:12 -07:00
Eric Schweitz 9e6dde3977 Add a pair of hooks to DominanceInfo.
This exposes hooks for accessing internal dominance nodes, and updating the internal DFS numbers.

Closes tensorflow/mlir#151

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/151 from schweitzpgi:dominance_hooks 69d14214a423b816cbd59feffcacdd02f3b5f921
PiperOrigin-RevId: 272287352
2019-10-01 14:06:50 -07:00