Commit Graph

82 Commits

Author SHA1 Message Date
Christian Sigg bd3a387ee7 Revert [mlir] Link mlir_runner_utils statically into cuda/rocm-runtime-wrappers (cf50f4f764)
There are cmake failures that I do not know how to fix.

Differential Revision: https://reviews.llvm.org/D95162
2021-01-21 22:38:59 +01:00
Christian Sigg cf50f4f764 [mlir] Link mlir_runner_utils statically into cuda/rocm-runtime-wrappers.
The runtime-wrappers depend on LLVMSupport, pulling in static initialization code (e.g. command line arguments). Dynamically loading multiple such libraries results in ODR violoations.

So far this has not been an issue, but in D94421, I would like to load both the async-runtime and the cuda-runtime-wrappers as part of a cuda-runner integration test. When doing this, code that asserts that an option category is only registered once fails (note that I've only experienced this in Google's bazel where the async-runtime depends on LLVMSupport, but a similar issue would happen in cmake if more than one runtime-wrapper starts to depend on LLVMSupport).

The underlying issue is that we have a mix of static and dynamic linking. If all dependencies were loaded as shared objects (i.e. if LLVMSupport was linked dynamically to the runtime wrappers), each dependency would only get loaded once. However, linking dependencies dynamically would require special attention to paths (one could dynamically load the dependencies first given explicit paths). The simpler approach seems to be to link all dependencies statically into a single shared object.

This change basically applies the same logic that we have in the c_runner_utils: we have a shared object target that can be loaded dynamically, and we have a static library target that can be linked to other runtime-wrapper shared object targets.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D94399
2021-01-20 12:10:16 +01:00
Christian Sigg df6cbd37f5 [mlir] Lower gpu.memcpy to GPU runtime calls.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D93204
2020-12-22 22:49:19 +01:00
Christian Sigg 5535696c38 [mlir] Add gpu.allocate, gpu.deallocate ops with LLVM lowering to runtime function calls.
The ops are very similar to the std variants, but support async GPU execution.

gpu.alloc does not currently support an alignment attribute, and the new ops do not have
canonicalizers/folders like their std siblings do.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91698
2020-11-27 09:40:59 +01:00
River Riddle 65fcddff24 [mlir][BuiltinDialect] Resolve comments from D91571
* Move ops to a BuiltinOps.h
* Add file comments
2020-11-19 11:12:49 -08:00
River Riddle 73ca690df8 [mlir][NFC] Remove references to Module.h and Function.h
These includes have been deprecated in favor of BuiltinDialect.h, which contains the definitions of ModuleOp and FuncOp.

Differential Revision: https://reviews.llvm.org/D91572
2020-11-17 00:55:47 -08:00
Christian Sigg 3307a7c046 [mlir][gpu] Add missing initialization of gpu runtime wrappers.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91148
2020-11-11 10:34:21 +01:00
Christian Sigg 97b351a827 [mlir][gpu] Fix leaked stream and module when lowering gpu.launch_func to runtime calls.
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D90370
2020-10-29 08:40:51 +01:00
Eugene Zhulenev a297340d9e [mlir] Fix stack-use-after-scope in cuda/vulkan cpu runners
+fix rocm runner

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D90274
2020-10-27 17:26:08 -07:00
Eugene Zhulenev f6c9f6eccd [mlir] JitRunner: add a config option to register symbols with ExecutionEngine at runtime
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D90264
2020-10-27 15:57:34 -07:00
Mehdi Amini e7021232e6 Remove global dialect registration
This has been deprecated for >1month now and removal was announced in:

https://llvm.discourse.group/t/rfc-revamp-dialect-registration/1559/11

Differential Revision: https://reviews.llvm.org/D86356
2020-10-24 00:35:55 +00:00
Mehdi Amini 6a72635881 Revert "Remove global dialect registration"
This reverts commit b22e2e4c6e.

Investigating broken builds
2020-10-23 21:26:48 +00:00
Mehdi Amini b22e2e4c6e Remove global dialect registration
This has been deprecated for >1month now and removal was announced in:

https://llvm.discourse.group/t/rfc-revamp-dialect-registration/1559/11

Differential Revision: https://reviews.llvm.org/D86356
2020-10-23 20:41:44 +00:00
Benjamin Kramer 97e48aadbd [mlir-cuda-runner] Unbreak the build
CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
  The dependency target "Core" of target "mlir-cuda-runner" does not exist.
Call Stack (most recent call first):
  llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
  mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)

CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
  The dependency target "LINK_COMPONENTS" of target "mlir-cuda-runner" does
  not exist.
Call Stack (most recent call first):
  llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
  mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)

CMake Error at llvm/cmake/modules/AddLLVM.cmake:870 (add_dependencies):
  The dependency target "Support" of target "mlir-cuda-runner" does not
  exist.
Call Stack (most recent call first):
  llvm/cmake/modules/AddLLVM.cmake:1169 (add_llvm_executable)
  mlir/tools/mlir-cuda-runner/CMakeLists.txt:69 (add_llvm_tool)
2020-10-13 22:36:08 +02:00
Christian Sigg 01dc85c173 [mlir][gpu] Adding gpu runtime wrapper functions for async execution.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89037
2020-10-12 14:07:27 +02:00
Serge Guelton d94f70fb98 [mlir] Improve LLVM shlib support
mlir-tblgen was incompatible with libLLVM, due to explicit linkage with
libLLVMSupport etc.
As it cannot link with libLLVM, make sure all lib it uses are not using libLLVM
either.

As a side effect, also remove some explicit references to LLVM libs and use
components instead.

Differential Revision: https://reviews.llvm.org/D88846
2020-10-09 07:17:56 +02:00
Stephen Neuendorffer b0dce6b37f Revert "[RFC] Factor out repetitive cmake patterns for llvm-style projects"
This reverts commit e9b87f43bd.

There are issues with macros generating macros without an obvious simple fix
so I'm going to revert this and try something different.
2020-10-04 15:17:34 -07:00
Stephen Neuendorffer e9b87f43bd [RFC] Factor out repetitive cmake patterns for llvm-style projects
New projects (particularly out of tree) have a tendency to hijack the existing
llvm configuration options and build targets (add_llvm_library,
add_llvm_tool).  This can lead to some confusion.

1) When querying a configuration variable, do we care about how LLVM was
configured, or how these options were configured for the out of tree project?
2) LLVM has lots of defaults, which are easy to miss
(e.g. LLVM_BUILD_TOOLS=ON).  These options all need to be duplicated in the
CMakeLists.txt for the project.

In addition, with LLVM Incubators coming online, we need better ways for these
incubators to do things the "LLVM way" without alot of futzing.  Ideally, this
would happen in a way that eases importing into the LLVM monorepo when
projects mature.

This patch creates some generic infrastructure in llvm/cmake/modules and
refactors MLIR to use this infrastructure.  This should expand to include
add_xxx_library, which is by far the most complicated bit of building a
project correctly, since it has to deal with lots of shared library
configuration bits.  (MLIR currently hijacks the LLVM infrastructure for
building libMLIR.so, so this needs to get refactored anyway.)

Differential Revision: https://reviews.llvm.org/D85140
2020-10-03 17:12:35 -07:00
Christian Sigg 2c48e3629c [MLIR] Adding gpu.host_register op and lower it to a runtime call.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D85631
2020-08-10 22:46:17 +02:00
Christian Sigg 0d4b7adb82 [MLIR] Make gpu.launch_func rewrite pattern part of the LLVM lowering pass.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D85073
2020-08-10 19:28:30 +02:00
Christian Sigg 45676a8936 [MLIR] Change GpuLaunchFuncToGpuRuntimeCallsPass to wrap a RewritePattern with the same functionality.
The RewritePattern will become one of several, and will be part of the LLVM conversion pass (instead of a separate pass following LLVM conversion).

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D84946
2020-08-06 11:55:46 +02:00
Christian Sigg c64c04bbaa Clean up cuda-runtime-wrappers API.
Do not return error code, instead return created resource handles or void. Error reporting is done by the library function.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D84660
2020-07-28 16:34:08 +02:00
Christian Sigg 2dd7a9cc2d [MLIR] NFC: Rename mcuMemHostRegister* to mgpuMemHostRegister* to make it consistent with the other cuda-runner functions and ROCm.
Summary: Rename mcuMemHostRegister* to mgpuMemHostRegister*.

Reviewers: herhut

Reviewed By: herhut

Subscribers: yaxunl, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D84583
2020-07-27 15:48:05 +02:00
Benjamin Kramer b9bb3ad3ed Unbreak the build of mlir-cuda-runner 2020-05-29 12:18:48 +02:00
Wen-Heng (Jack) Chung 061fb8eb2d [mlir][gpu][mlir-cuda-runner] Refactor ConvertKernelFuncToCubin to be generic.
Make ConvertKernelFuncToCubin pass to be generic:

- Rename to ConvertKernelFuncToBlob.
- Allow specifying triple, target chip, target features.
- Initializing LLVM backend is supplied by a callback function.
- Lowering process from MLIR module to LLVM module is via another callback.
- Change mlir-cuda-runner to adopt the revised pass.
- Add new tests for lowering to ROCm HSA code object (HSACO).
- Tests for CUDA and ROCm are kept in separate directories.

Differential Revision: https://reviews.llvm.org/D80142
2020-05-28 09:08:28 -05:00
Christian Sigg 222e0e58a8 [MLIR] Helper class referencing MemRefType to unify runner implementations.
Summary:
Add DynamicMemRefType which can reference one of the statically ranked StridedMemRefType or a UnrankedMemRefType so that runner utils only need to be implemented once.

There is definitely room for more clean up and unification, but I will keep that for follow-ups.

Reviewers: nicolasvasilache

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80513
2020-05-26 16:32:36 +02:00
Wen-Heng (Jack) Chung 2cbbc266ec [mlir][gpu] Refactor ConvertGpuLaunchFuncToCudaCalls pass.
Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.

In this commit:

- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.

Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.

Differential Revision: https://reviews.llvm.org/D80167
2020-05-21 08:53:47 -05:00
Mehdi Amini 5c3ebd7725 Revert "[mlir][gpu] Refactor ConvertGpuLaunchFuncToCudaCalls pass."
This reverts commit cdb6f05e2d.

The build is broken with:

  You have called ADD_LIBRARY for library obj.MLIRGPUtoCUDATransforms without any source files. This typically indicates a problem with your CMakeLists.txt file
2020-05-21 03:44:35 +00:00
Wen-Heng (Jack) Chung cdb6f05e2d [mlir][gpu] Refactor ConvertGpuLaunchFuncToCudaCalls pass.
Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.

In this commit:

- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.

Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.

Differential Revision: https://reviews.llvm.org/D80167
2020-05-20 16:11:48 -05:00
Christian Sigg 62adfed30a Unrank mcuMemHostRegister tensor argument.
Reviewers: herhut

Reviewed By: herhut

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80118
2020-05-19 13:58:54 +02:00
Stephen Neuendorffer ec44e08940 [MLIR] Move JitRunner to live with ExecutionEngine
The JitRunner library is logically very close to the execution engine,
and shares similar dependencies.

find -name "*.cpp" -exec sed -i "s/Support\/JitRunner/ExecutionEngine\/JitRunner/" "{}" \;

Differential Revision: https://reviews.llvm.org/D79899
2020-05-15 14:37:10 -07:00
Christian Sigg b43ae21e60 Fix all-reduce int tests by host-registering memrefs.
Reduce amount of boiler plate to register host memory.

Summary: Fix all-reduce int tests by host-registering memrefs.

Reviewers: herhut

Reviewed By: herhut

Subscribers: clementval, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76563
2020-03-23 11:48:13 +01:00
Stephen Neuendorffer 4594d0e943 [MLIR] Move from add_dependencies() to DEPENDS
add_llvm_library and add_llvm_executable may need to create new targets with
appropriate dependencies.  As a result, it is not sufficient in some
configurations (namely LLVM_BUILD_LLVM_DYLIB=on) to only call
add_dependencies().  Instead, the explicit TableGen dependencies must
be passed to add_llvm_library() or add_llvm_executable() using the DEPENDS
keyword.

Differential Revision: https://reviews.llvm.org/D74930
2020-03-06 13:25:17 -08:00
Stephen Neuendorffer 1c82dd39f9 [MLIR] Ensure that target_link_libraries() always has a keyword.
CMake allows calling target_link_libraries() without a keyword,
but this usage is not preferred when also called with a keyword,
and has surprising behavior.  This patch explicitly specifies a
keyword when using target_link_libraries().

Differential Revision: https://reviews.llvm.org/D75725
2020-03-06 09:14:01 -08:00
Stephen Neuendorffer 798e661567 Revert "[MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries."
This reverts commit 7a6c689771.
This breaks the build with cmake 3.13.4, but succeeds with cmake 3.15.3
2020-02-29 11:52:08 -08:00
Stephen Neuendorffer d675df0379 Revert "[MLIR] Move from add_dependencies() to DEPENDS"
This reverts commit 31e07d716a.
2020-02-29 11:52:08 -08:00
Stephen Neuendorffer 31e07d716a [MLIR] Move from add_dependencies() to DEPENDS
add_llvm_library and add_llvm_executable may need to create new targets with
appropriate dependencies.  As a result, it is not sufficient in some
configurations (namely LLVM_BUILD_LLVM_DYLIB=on) to only call
add_dependencies().  Instead, the explicit TableGen dependencies must
be passed to add_llvm_library() or add_llvm_executable() using the DEPENDS
keyword.

Differential Revision: https://reviews.llvm.org/D74930
2020-02-29 10:47:27 -08:00
Stephen Neuendorffer 7a6c689771 [MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries.
When compiling libLLVM.so, add_llvm_library() manipulates the link libraries
being used.  This means that when using add_llvm_library(), we need to pass
the list of libraries to be linked (using the LINK_LIBS keyword) instead of
using the standard target_link_libraries call.  This is preparation for
properly dealing with creating libMLIR.so as well.

Differential Revision: https://reviews.llvm.org/D74864
2020-02-29 10:47:26 -08:00
Stephen Neuendorffer dc1056a3f1 Revert "[MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries."
This reverts commit 2f265e3528.
2020-02-28 14:13:30 -08:00
Stephen Neuendorffer 67f2a43cf8 Revert "[MLIR] Move from add_dependencies() to DEPENDS"
This reverts commit 8a2b86b2c2.
2020-02-28 12:17:40 -08:00
Stephen Neuendorffer 8a2b86b2c2 [MLIR] Move from add_dependencies() to DEPENDS
add_llvm_library and add_llvm_executable may need to create new targets with
appropriate dependencies.  As a result, it is not sufficient in some
configurations (namely LLVM_BUILD_LLVM_DYLIB=on) to only call
add_dependencies().  Instead, the explicit TableGen dependencies must
be passed to add_llvm_library() or add_llvm_executable() using the DEPENDS
keyword.

Differential Revision: https://reviews.llvm.org/D74930
2020-02-28 11:35:18 -08:00
Stephen Neuendorffer 2f265e3528 [MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries.
When compiling libLLVM.so, add_llvm_library() manipulates the link libraries
being used.  This means that when using add_llvm_library(), we need to pass
the list of libraries to be linked (using the LINK_LIBS keyword) instead of
using the standard target_link_libraries call.  This is preparation for
properly dealing with creating libMLIR.so as well.

Differential Revision: https://reviews.llvm.org/D74864
2020-02-28 11:35:17 -08:00
Stephen Neuendorffer b7d50ba1ee [MLIR] Refactor library initialization of JitRunner.
Previously, lib/Support/JitRunner.cpp was essentially a complete application,
performing all library initialization, along with dealing with command line
arguments and actually running passes.  This differs significantly from
mlir-opt and required a dependency on InitAllDialects.h.  This dependency
is significant, since it requires a dependency on all of the resulting
libraries.

This patch refactors the code so that tools are responsible for library
initialization, including registering all dialects, prior to calling
JitRunnerMain.  This places the concern about what dialect to support
with the end application, enabling more extensibility at the cost of
a small amount of code duplication between tools.  It also fixes
BUILD_SHARED_LIBS=on.

Differential Revision: https://reviews.llvm.org/D75272
2020-02-28 11:35:17 -08:00
Stephen Neuendorffer c07fb9e016 [MLIR] Refactor library handling for conversions.
Collect a list of conversion libraries in cmake, so we don't have to
list these explicitly in most binaries.

Differential Revision: https://reviews.llvm.org/D75222
2020-02-28 11:35:17 -08:00
Stephen Neuendorffer 5869552821 [MLIR] Refactor handling of dialect libraries
Instead of creating extra libraries we don't really need, collect a
list of all dialects and use that instead.

Differential Revision: https://reviews.llvm.org/D75221
2020-02-28 11:35:16 -08:00
Valentin Clement 56aba9699d [MLIR] Fix wrong header for mlir-cuda-runner
Just updated the wrong header probably copied from the mlir-cpu-runner

Differential Revision: https://reviews.llvm.org/D74497
2020-02-12 22:35:46 +01:00
Stephan Herhut 864110b5b4 [MLIR][CUDA] Fix build file for mlir-cuda-runner
Summary:
This was broken recently when moving from dialect registration via
static initializers to explicit intialization.

Differential Revision: https://reviews.llvm.org/D74480
2020-02-12 15:10:51 +01:00
Alex Zinenko 5a1778057f [mlir] use unpacked memref descriptors at function boundaries
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.

Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.

Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.

The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.

This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
2020-02-10 15:03:43 +01:00
River Riddle c33d6970e0 [mlir] Add support for basic location translation to LLVM.
Summary:
This revision adds basic support for emitting line table information when exporting to LLVMIR. We don't yet have a story for supporting all of the LLVM debug metadata, so this revision stubs some features(like subprograms) to enable emitting line tables.

Differential Revision: https://reviews.llvm.org/D73934
2020-02-05 17:41:51 -08:00
Kern Handa b8004b7308 [mlir] Mark the MLIR tools for installation in CMake
This binplaces `mlir-translate`, `mlir-cuda-runner`, and `mlir-cpu-runner` when building the CMake install target.

Differential Revision: https://reviews.llvm.org/D73986
2020-02-05 03:42:57 +00:00