Summary:
Different NVIDIA GPUs support different compute capabilities. To enable the inlining of runtime functions and the best performance on different generations of NVIDIA GPUs, a bc library for each compute capability needs to be compiled. The same compiler build will then be usable in conjunction with multiple generations of NVIDIA GPUs.
To differentiate between versions of the same bc lib, the output file name will contain the compute capability ID.
Depends on D14254
Reviewers: Hahnfeld, hfinkel, carlo.bertolli, caomhin, ABataev, grokos
Reviewed By: Hahnfeld, grokos
Subscribers: guansong, mgorny, openmp-commits
Differential Revision: https://reviews.llvm.org/D41724
llvm-svn: 324904
This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices.
Currently there is a single device RTL written in CUDA meant to CUDA enabled GPUs.
The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target.
Differential revision: https://reviews.llvm.org/D14254
llvm-svn: 323649
This patch enables OMPT by default if version 50 or later is built and the config says, that OMPT will be supported.
Differential Revision: https://reviews.llvm.org/D41508
llvm-svn: 321675
We now have several options that apply for both libraries and they
shouldn't be documented in multiple files. When already merging
the two Build_With_CMake.txt documents, convert them to
reStructuredText which is used for all of LLVM's documentation.
Differential Revision: https://reviews.llvm.org/D40920
llvm-svn: 321481