Commit Graph

60 Commits

Author SHA1 Message Date
Diego Caballero 696f80736b [mlir] Turn flags in ConvertStandardToLLVM into pass flags
Follow-up on D72802. Turn -convert-std-to-llvm-use-alloca and
-convert-std-to-llvm-bare-ptr-memref-call-conv into pass flags
of LLVMLoweringPass.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D73912
2020-02-11 10:28:30 -08:00
Alex Zinenko 5a1778057f [mlir] use unpacked memref descriptors at function boundaries
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.

Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.

Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.

The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.

This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
2020-02-10 15:03:43 +01:00
aartbik e52414b1ae [mlir][VectorOps] Generalized vector.print to i32/i64
Summary:
Lowering to LLVM IR was restricted to float/double.
This CL also adds the integral values.

Reviewers: andydavis1, nicolasvasilache, ftynse

Reviewed By: nicolasvasilache, ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74179
2020-02-07 09:25:30 -08:00
Diego Caballero e5aaf30cf1 [mlir] Introduce bare ptr calling convention for MemRefs in LLVM dialect
Summary:
This patch introduces an alternative calling convention for
MemRef function arguments in LLVM dialect. It converts MemRef
function arguments to LLVM bare pointers to the MemRef element
type instead of creating a MemRef descriptor. Bare pointers are
then promoted to a MemRef descriptors at the beginning of the
function. This calling convention is only enabled with a flag.

Reviewers: ftynse, bondhugula, nicolasvasilache, rriddle, mehdi_amini

Reviewed By: ftynse, rriddle, mehdi_amini

Subscribers: Joonsoo, flaub, merge_guards_bot, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72802
2020-01-31 15:19:38 -08:00
Mehdi Amini 308571074c Mass update the MLIR license header to mention "Part of the LLVM project"
This is an artifact from merging MLIR into LLVM, the file headers are
now aligned with the rest of the project.
2020-01-26 03:58:30 +00:00
River Riddle 4268e4f4b8 [mlir] Change the syntax of AffineMapAttr and IntegerSetAttr to avoid conflicts with function types.
Summary: The current syntax for AffineMapAttr and IntegerSetAttr conflict with function types, making it currently impossible to round-trip function types(and e.g. FuncOp) in the IR. This revision changes the syntax for the attributes by wrapping them in a keyword. AffineMapAttr is wrapped with `affine_map<>` and IntegerSetAttr is wrapped with `affine_set<>`.

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D72429
2020-01-13 13:24:39 -08:00
Kern Handa ea67737b16 [mlir] mlir-cpu-runner test's cblas_interface should export functions on Windows
This change fixes the build on Windows, so that cblas_interface.dll
exports functions correctly and an implib is created and installed
correctly.

Currently, LLVM cannot be consumed on Windows after it has been
installed in a location because cblas_interface.lib is not
created/installed, thus failing the import check in `LLVMExports.cmake`.

Differential Revision: https://reviews.llvm.org/D72384
2020-01-09 22:55:46 +01:00
Fangrui Song 020ca0cf2f [mlir] Fix -Wunneeded-internal-declaration 2019-12-24 10:33:30 -08:00
Mehdi Amini 56222a0694 Adjust License.txt file to use the LLVM license
PiperOrigin-RevId: 286906740
2019-12-23 15:33:37 -08:00
Nicolas Vasilache 50f9be6d2d Add runtime utils support for print_memref_i8
This CL adds print_memref_i8 along with a unit test.

PiperOrigin-RevId: 286299237
2019-12-18 17:32:35 -08:00
Aart Bik a1e84db66e [VectorOps] Replace iostream with stdio in support lib for vector.print
PiperOrigin-RevId: 286252829
2019-12-18 13:24:30 -08:00
Aart Bik d9b500d3bb [VectorOps] Add vector.print definition, with lowering support
Examples:

  vector.print %f : f32
  vector.print %x : vector<4xf32>
  vector.print %y : vector<3x4xf32>
  vector.print %z : vector<2x3x4xf32>

LLVM lowering replaces these with fully unrolled calls
into a small runtime support library that provides some
basic printing operations (single value, opening closing
bracket, comma, newline).

PiperOrigin-RevId: 286230325
2019-12-18 11:31:34 -08:00
Nicolas Vasilache 95b5a4fd67 Move cpu runner utils templates to .h
This allows reusing the implementation in various places by just including and permits more easily writing test functions without explicit template instantiations.

This also modifies UnrankedMemRefType to take a template type parameter since it cannot be type agnostic atm.

PiperOrigin-RevId: 285187711
2019-12-12 07:33:09 -08:00
nmostafa daff60cd68 Add UnrankedMemRef Type
Closes tensorflow/mlir#261

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/261 from nmostafa:nmostafa/unranked 96b6e918f6ed64496f7573b2db33c0b02658ca45
PiperOrigin-RevId: 284037040
2019-12-05 13:13:20 -08:00
Alex Zinenko 993e79e9bd Fix ViewOp to have at most one offset operand
As described in the documentation, ViewOp is expected to take an optional
dynamic offset followed by a list of dynamic sizes. However, the ViewOp parser
did not include a check for the offset being a single value and accepeted a
list of values instead.

Furthermore, several tests have been exercising the wrong syntax of a ViewOp,
passing multiple values to the dyanmic stride list, which was not caught by the
parser. The trailing values could have been erronously interpreted as dynamic
sizes. This is likely due to resyntaxing of the ViewOp, with the previous
syntax taking the list of sizes before the offset. Update the tests to use the
syntax with the offset preceding the sizes.

Worse, the conversion of ViewOp to the LLVM dialect assumed the wrong order of
operands with offset in the trailing position, and erronously relied on the
permissive parsing that interpreted trailing dynamic offset values as leading
dynamic sizes. Fix the lowering to use the correct order of operands.

PiperOrigin-RevId: 283532506
2019-12-03 06:23:04 -08:00
Nicolas Vasilache 1fa8c8070b Implement Linalg to loops lowering as a pattern
This CL rewrites the linalg ops to loops transformations as patterns that can be targeted directly from Tablegen. Reliance on OpFolder is removed and to cope with it we introduce local folding patterns that are applied greedily.

PiperOrigin-RevId: 282765550
2019-11-27 07:32:13 -08:00
Nicolas Vasilache 9059cf392d Automated rollback of commit d60133f89b
PiperOrigin-RevId: 282574110
2019-11-26 08:47:48 -08:00
Christian Sigg d60133f89b Changing directory shortcut for CPU/GPU runner utils.
Moving cuda-runtime-wrappers.so into subdirectory to match libmlir_runner_utils.so.
Provide parent directory when running test and load .so from subdirectory.

PiperOrigin-RevId: 282410749
2019-11-25 12:30:54 -08:00
Nicolas Vasilache 3c055957de Add StridedMemRef<>::operator[] - NFC
This operator is used for internal debugging purposes.

PiperOrigin-RevId: 281544152
2019-11-20 10:17:13 -08:00
Nicolas Vasilache 3732ba4def Fix pretty printer corner case in mlir_runner_utils.cpp.
In the particular case where the size of a memref dimension is 1, double printing would happen because printLast was called unconditionally.
This CL fixes the print and updates an incorrect test that should have caught this in the first place.

PiperOrigin-RevId: 281345142
2019-11-19 11:52:27 -08:00
Alex Zinenko 8961d8e32f Change conversion CLI flag from -lower-to-llvm to -convert-std-to-llvm
The command-line flag name `lower-to-llvm` for the pass performing dialect
conversion from the Standard dialect to the LLVM dialect is misleading and
inconsistent with most of the conversion passses. It leads the user to believe
that there are no restrictions on what can be converted, while in fact only a
subset of the Standard dialect can be converted (with operations from other
dialects converted by separate passes). Use `convert-std-to-llvm` that better
reflects what the pass does and is consistent with most other conversions.

PiperOrigin-RevId: 281238797
2019-11-19 00:34:51 -08:00
Nicolas Vasilache f51a155337 Add support for alignment attribute in std.alloc.
This CL adds an extra pointer to the memref descriptor to allow specifying alignment.

In a previous implementation, we used 2 types: `linalg.buffer` and `view` where the buffer type was the unit of allocation/deallocation/alignment and `view` was the unit of indexing.

After multiple discussions it was decided to use a single type, which conflates both, so the memref descriptor now needs to carry both pointers.

This is consistent with the [RFC-Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!searchin/mlir/std.view%7Csort:date/mlir/-wKHANzDNTg/4K6nUAp8AAAJ).

PiperOrigin-RevId: 279959463
2019-11-12 07:06:54 -08:00
Nicolas Vasilache 72040bf7c8 Update Linalg to use std.view
Now that a view op has graduated to the std dialect, we can update Linalg to use it and remove ops that have become obsolete. As a byproduct, the linalg buffer and associated ops can also disappear.

PiperOrigin-RevId: 279073591
2019-11-07 06:33:10 -08:00
Nicolas Vasilache 9e7e297da3 Lower vector transfer ops to loop.for operations.
This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.

PiperOrigin-RevId: 275543361
2019-10-18 14:10:10 -07:00
Nicolas Vasilache 5b03e692f6 Decouple Linalg promotion from Linalg tiling - NFC
This CL creates a new Linalg promotion pass that operates on SubViewOp and decouples it from Linalg tiling. This is mostly moving code around.

PiperOrigin-RevId: 275329213
2019-10-17 13:41:17 -07:00
Nicolas Vasilache 31c5a41a30 Consistent use of int in mlir_runner_utils.cpp
This should fix the OSS build by only using int in template types.

PiperOrigin-RevId: 274843584
2019-10-15 11:04:45 -07:00
Nicolas Vasilache abf5c60af9 Add conversion for splat of vectors of 2+D
This CL adds a missing lowering for splat of multi-dimensional vectors.
Additional support is also added to the runtime utils library to allow printing memrefs with such vectors.

PiperOrigin-RevId: 274794723
2019-10-15 06:53:08 -07:00
Alex Zinenko 5e7959a353 Use llvm.func to define functions with wrapped LLVM IR function type
This function-like operation allows one to define functions that have wrapped
LLVM IR function type, in particular variadic functions. The operation was
added in parallel to the existing lowering flow, this commit only switches the
flow to use it.

Using a custom function type makes the LLVM IR dialect type system more
consistent and avoids complex conversion rules for functions that previously
had to use the built-in function type instead of a wrapped LLVM IR dialect type
and perform conversions during the analysis.

PiperOrigin-RevId: 273910855
2019-10-10 01:34:06 -07:00
Nicolas Vasilache 171637d4f0 Fix Windows linkage error
This CL fixes bad macro names usage in mlir_runner_utils.h.
The macro mlir_runner_utils_EXPORTS now matches what is defined in CMakeLists.txt.

PiperOrigin-RevId: 273773931
2019-10-09 10:38:31 -07:00
Nicolas Vasilache 3b4f133fb7 Start a minimal mlir_utils runtime library for testing debugging purposes
Now that MLIR has a standardized StridedMemRef descriptor, it becomes very easy to interact with external library functions and build utilities directly in C++.
This CL introduces basic printing support in a libmlir_utils.so.
Unit tests are rewritten using this feature and also to improve coverage.

For now, C mandates that we have a unique function for each MemRef element type and rank.
In a future a simple unranked descriptor can be introduced to only require uniqu'ing by element type.

PiperOrigin-RevId: 273304741
2019-10-07 09:06:55 -07:00
Nicolas Vasilache e36337a998 Unify Linalg types by using strided memrefs
This CL finishes the implementation of the Linalg + Affine type unification of the [strided memref RFC](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
As a consequence, the !linalg.view type, linalg::DimOp, linalg::LoadOp and linalg::StoreOp can now disappear and Linalg can use standard types everywhere.

PiperOrigin-RevId: 272187165
2019-10-01 05:23:21 -07:00
Nicolas Vasilache 6543e99fe5 Fix JitRunner.cpp Error creation pattern and reactivate tests.
linalg_integration_test.mlir and simple.mlir were temporarily disabled due to an OSS-only failure.

The issue is that, once created, an llvm::Error must be explicitly checked before it can be discarded or overwritten.

This CL fixes the issue and reenable the test.

PiperOrigin-RevId: 271589651
2019-09-27 09:56:40 -07:00
Jacques Pienaar 7385d87895 Disable failing tests
PiperOrigin-RevId: 271460509
2019-09-26 16:42:35 -07:00
Alex Zinenko 99be3351b8 Drop support for memrefs from JitRunner
The support for functions taking and returning memrefs of floats was introduced
in the first version of the runner, created before MLIR had reliable lowering
of allocation/deallocation to library calls.  It forcibly runs MLIR
transformation convering affine, loop and standard dialects into the LLVM
dialect, unlike the other runner flows that accept the LLVM dialect directly.
Memref support leads to more complex layering and is generally fragile.  Drop
it in favor of functions returning a scalar, or library-based function calls to
print memrefs and other data structures.

PiperOrigin-RevId: 271330839
2019-09-26 05:42:01 -07:00
Mehdi Amini 765d60fd4d Add missing lowering to CFG in mlir-cpu-runner + related cleanup
- the list of passes run by mlir-cpu-runner included -lower-affine and
  -lower-to-llvm but was missing -lower-to-cfg (because -lower-affine at
  some point used to lower straight to CFG); add -lower-to-cfg in
  between. IR with affine ops can now be run by mlir-cpu-runner.

- update -lower-to-cfg to be consistent with other passes (create*Pass methods
  were changed to return unique ptrs, but -lower-to-cfg appears to have been
  missed).

- mlir-cpu-runner was unable to parse custom form of affine op's - fix
  link options

- drop unnecessary run options from test/mlir-cpu-runner/simple.mlir
  (none of the test cases had loops)

- -convert-to-llvmir was changed to -lower-to-llvm at some point, but the
  create pass method name wasn't updated (this pass converts/lowers to LLVM
  dialect as opposed to LLVM IR). Fix this.

(If we prefer "convert", the cmd-line options could be changed to
"-convert-to-llvm/cfg" then.)

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#115

PiperOrigin-RevId: 266666909
2019-09-01 11:33:22 -07:00
Jacques Pienaar 06e8101034 Add mechanism to dump JIT-compiled objects to files
This commit introduces the bits to be able to dump JIT-compile
objects to external files by passing an object cache to OrcJit.
The new functionality is tested in mlir-cpu-runner under the flag
`dump-object-file`.

Closes tensorflow/mlir#95

PiperOrigin-RevId: 266439265
2019-08-30 13:02:10 -07:00
Nicolas Vasilache 7f42b3d721 Add lowering of linalg.copy to an external C++ library and a test.
This CL extends support for lowering of linalg to external C++ libraries with CopyOp. Currently this can only work when the permutation maps in the copies are identity. Future support for permutations will be added later.

PiperOrigin-RevId: 265093025
2019-08-23 11:09:53 -07:00
Nicolas Vasilache 36f48063dd Add alignment support to linalg.buffer_alloc
This CL adds an integer attribute to linalg.buffer_alloc and lowering to LLVM.
The alignment is constrained to be a positive power of 2.

Lowering to LLVM produces the pattern:
```
%[[alloc:.*]] = llvm.call @malloc(%[[s]]) : (!llvm.i64) -> !llvm<"i8*">
%[[cast:.*]] = llvm.bitcast %[[alloc]] : !llvm<"i8*"> to !llvm.i64
%[[rem:.*]] = llvm.urem %[[cast]], %[[c16]] : !llvm.i64
%[[drem:.*]] = llvm.sub %[[c16]], %[[rem]] : !llvm.i64
%[[off:.*]] = llvm.urem %[[drem]], %[[c16]] : !llvm.i64
llvm.getelementptr %{{.*}}[%[[off]]] : (!llvm<"i8*">, !llvm.i64) -> !llvm<"i8*">
```

where `ptr` is aligned on `align` by computing the address
`ptr + (align - ptr % align) % align`.

To allow dealloc op to still be able to free memory, additional information is needed in
the buffer type. The buffer type is thus extended with an extra i8* for the base allocation address.

PiperOrigin-RevId: 264244455
2019-08-19 14:37:18 -07:00
Nicolas Vasilache 9bf69e6a2e Refactor linalg lowering to LLVM
The linalg.view type used to be lowered to a struct containing a data pointer, offset, sizes/strides information. This was problematic when passing to external functions due to ABI, struct padding and alignment issues.

The linalg.view type is now lowered to LLVMIR as a *pointer* to a struct containing the data pointer, offset and sizes/strides. This simplifies the interfacing with external library functions and makes it trivial to add new functions without creating a shim that would go from a value type struct to a pointer type.

The consequences are that:
1. lowering explicitly uses llvm.alloca in lieu of llvm.undef and performs the proper llvm.load/llvm.store where relevant.
2. the shim creation function `getLLVMLibraryCallDefinition` disappears.
3. views are passed by pointer, scalars are passed by value. In the future, other structs will be passed by pointer (on a per-need basis).

PiperOrigin-RevId: 264183671
2019-08-19 10:21:40 -07:00
Nicolas Vasilache 59b473c231 External library name mangling support for linalg.
This CL introduces the ability to generate the external library name for Linalg operations.
The problem is that neither mlir or C support overloading and we want a simplified form of name mangling that is still reasonable to read.
This CL creates the name of the external call that Linalg expects from the operation name and the type of its arguments.

The interface library names are updated and use new cases are added for FillOp.

PiperOrigin-RevId: 262556833
2019-08-09 07:33:58 -07:00
Nicolas Vasilache 1304331926 Automated rollback of commit 3708f53219
PiperOrigin-RevId: 260136255
2019-07-26 11:05:17 -07:00
Nicolas Vasilache 3708f53219 Add sgemm specializations - NFC
This CL adds a few specializations for sgemm.
A minor change to alpha is made in cblas_interface.cpp to be compatible with actual BLAS calls.

For now this is for internal testing purposes only.

PiperOrigin-RevId: 260129027
2019-07-26 05:41:23 -07:00
Nicolas Vasilache 00b48e1a9f Fix linalg_matmul_impl interfacing with sgemm
This CL provides a fix that makes linal_matmul_impl compliant with the BLAS interface. Before this CL it would compute either C += A * B when called with cblas.cpp:cblas_sgemm implementation and C = A * B with other implementations.

PiperOrigin-RevId: 260117367
2019-07-26 03:34:21 -07:00
Nicolas Vasilache e78ea03b24 Replace linalg.for by loop.for
With the introduction of the Loop dialect, uses of the `linalg.for` operation can now be subsumed 1-to-1 by `loop.for`.
This CL performs the replacement and tests are updated accordingly.

PiperOrigin-RevId: 258322565
2019-07-16 13:44:57 -07:00
Nicolas Vasilache cab671d166 Lower affine control flow to std control flow to LLVM dialect
This CL splits the lowering of affine to LLVM into 2 parts:
1. affine -> std
2. std -> LLVM

The conversions mostly consists of splitting concerns between the affine and non-affine worlds from existing conversions.
Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process, it can be reintroduced later if needed.

LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build.

PiperOrigin-RevId: 257794436
2019-07-12 08:44:28 -07:00
Mahesh Ravishankar 25094e90bd Resolving buffer operand of linalg.view doesnt have the information
about the buffer size. This is needed to resolve the operand
correctly. Add that information to view op
serialization/deserialization

Also modify the parsing of buffer type by splitting at 'x' to
side-step issues with StringRef number parsing.

PiperOrigin-RevId: 256188319
2019-07-02 10:28:59 -07:00
Mahesh Ravishankar 266841751f Add buffer size information to Linalg::BufferType. If the size is
constant then it is represented as <size x elementType>. If the size
is not a compile time constant, then it is represented as
<? x elementType>.

PiperOrigin-RevId: 255619400
2019-06-28 10:10:17 -07:00
Lei Zhang e31c47ee8b Export symbols in cpu runner cblas library
By default MSVC does not export any symbol and does not create a companion
.lib for a .dll. This will cause problems when trying to link against the
library.

PiperOrigin-RevId: 254033454
2019-06-19 23:07:54 -07:00
Mahesh Ravishankar 54b35cec08 Add a definition of the library function to use when Linalg ops are
lowered to LLVM, instead of expecting one to exist in the Module

PiperOrigin-RevId: 253097382
2019-06-19 23:01:12 -07:00
Nicolas Vasilache a8a4d35d3f Add a lowering for Linalg matmul to LLVM
This CL adds a lowering to LLVM for MamulOp and a corresponding integration test.

View descriptor manipulation is moved from MLIR's LLVM dialect to C++ code compiled on the side. To this end a separation is introduced between `cblas.cpp` and `cblas_interface.cpp`, the latter operating on view types whose ABI correspond to the LLVM signature generated by MLIR.

An intermediary step is introduced that allocates a new descriptor on the MLIR side for the purpose of passing it to LLVM. The reason for this extra step is that the ABI for by-value ViewType objects wants aligned descriptors, e.g.:
```
extern "C" void linalg_dot_impl(ViewType<float, 1> X, ViewType<float, 1> Y,
                                BaseViewType<float> Z) {
   ...
}
```
produces LLVM IR with the signature:
```
%struct.ViewType = type { %struct.BaseViewType, [1 x i64], [1 x i64] }
%struct.BaseViewType = type { float*, i64 }

define void @linalg_dot_impl(%struct.ViewType* byval align 8, %struct.ViewType* byval align 8, float*, i64) tensorflow/mlir#0 {
...
}
```

We don't seem to be able to make such aligned  allocations in the MLIR -> LLVM converter atm.
Going through a level of indirection allows the test to pass.
The temporary tradeoff is that the MLIR shims have to be written by hand.
They will disappear in the future.

PiperOrigin-RevId: 252670672
2019-06-19 22:58:46 -07:00