This ensures that the Darwin driver uses a consistent target triple
representation when the triple is printed out to the user.
This reverts the revert commit ab0df6c034.
Differential Revision: https://reviews.llvm.org/D100807
In D97003, CUDA 9.2 became the minimum requirement for OpenMP offloading on the
NVPTX target. We no longer need macros in the source code to select the right functions
based on the CUDA version, we don't need to compile multiple bitcode libraries for
different CUDA versions for each SM, and we don't need to worry about forward
compatibility with newer CUDA versions.
`-target-feature +ptx61` is used in this patch; it corresponds to the highest
PTX version that CUDA 9.2 can support.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97198
There are two preconditions to reproduce the issue:
1. Use the -save-temps option.
2. Provide the -o option with a name equal to the input file name
without the file extension, e.g. `clang a.c -o a`.
With -o specified, the AssembleJobAction that follows the OffloadWrapperJobAction
produces an object file with the same name as the host object file. Due to this
clash, the object produced for the offload wrapper overwrites the initial host
object file, which results in an lld error. This also fixes the `multiple definition of __dummy.omp_offloading.entry'` issue in D96769.
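A hypothetical reproducer, assuming an NVPTX offloading build (the target triple and flags are illustrative):
```
// clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps a.c -o a
// Before this fix, the offload-wrapper object and the host object ended up
// with the same name ("a.o" here), so the host object was overwritten and
// the final lld link failed.
int main(void) {
  int x = 0;
#pragma omp target map(tofrom : x)
  { x = 1; }
  return x;
}
```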
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97273
In the current implementation of `deviceRTLs`, we're using some functions
that are CUDA version dependent (if CUDA_VERSION < 9 it is one function; otherwise it
is another). As a result, we have to compile one bitcode library for each
CUDA version supported. A worse problem is forward compatibility: if a new CUDA
version is released, we have to update the CMake file as well.
CUDA 9.2 has been available for three years. Instead of using various tricks
to make `deviceRTLs` work with different CUDA versions and still keep forward
compatibility, we can simply drop support for CUDA 9.1 and earlier. This has at
least two benefits:
- We don't need to generate bitcode libraries for each CUDA version;
- The Clang driver doesn't need to search for the bitcode library based on the CUDA version.
We can state that, starting from LLVM 12, OpenMP offloading on the NVPTX target requires
CUDA 9.2+.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D97003
`sm_35` is the minimum requirement for OpenMP offloading on NVPTX devices.
The current driver test case uses `sm_20`. D97003 is going to switch the minimum
CUDA version to 9.2, which only supports `sm_30` and newer. This patch is a step toward
that change.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D97120
Starting with this patch (plus some already landed patches), `deviceRTLs` is treated as a regular OpenMP program with just `declare target` regions. In this way, ideally, `deviceRTLs` can be written directly in OpenMP: no CUDA and no HIP anymore. (Well, AMD is still working on getting it to work; for now AMDGCN still compiles the original way.) However, some target-specific functions are still required, but they're no longer written in a target-specific language. For example, the CUDA parts have all been reworked by replacing CUDA intrinsics and builtins with LLVM/Clang/NVVM intrinsics.
Here is a list of the changes in this patch:
1. For NVPTX, `DEVICE` is defined empty in order to keep the common parts working with AMDGCN. Later, once AMDGCN is also available, we will completely remove `DEVICE` and probably some other macros.
2. Shared variables are implemented with an OpenMP allocator, which is defined in `allocator.h` (a sketch follows this list). Again, this feature is not available on AMDGCN, so two macros are redefined accordingly.
3. The CUDA header `cuda.h` is dropped from the source code. In order to deal with code differences across CUDA versions, we build one bitcode library for each supported CUDA version. For each CUDA version, the highest PTX version it supports is used, just as we currently do for CUDA compilation.
4. Correspondingly, the compiler driver is also updated to support the CUDA version encoded in the name of the bitcode library. The bitcode library for NVPTX is now named `libomptarget-nvptx-cuda_[cuda_version]-sm_[sm_number].bc`, such as `libomptarget-nvptx-cuda_80-sm_20.bc`.
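A minimal sketch of the allocator-based shared variable mentioned in item 2, using the predefined `omp_pteam_mem_alloc` allocator; the names are illustrative and not the exact `deviceRTLs` macros:
```
#include <omp.h>

#pragma omp declare target
// One copy per team, placed in team-shared memory by the allocator.
static int SharedCounter;
#pragma omp allocate(SharedCounter) allocator(omp_pteam_mem_alloc)
#pragma omp end declare target
```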
With this change, there are also multiple features to be expected in the near future:
1. CUDA will be dropped completely when compiling OpenMP. By then, we will also build bitcode libraries for all supported SMs, multiplied by all supported CUDA versions.
2. Atomic operations used in `deviceRTLs` can be replaced by `omp atomic` once that OpenMP 5.1 feature is fully supported. For now, the generated IR is simply wrong.
3. Target-specific parts will be wrapped in `declare variant` with the `isa` selector once it works properly. No target-specific macros will be needed anymore.
4. (Maybe more...)
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D94745
D94700 removed the static library, so we no longer need to pass
`-llibomptarget-nvptx` to `nvlink`. Since the bitcode library is the only device
runtime for now, an error is now raised when it is not found, instead of just
emitting a warning. We also add a new option, `--libomptarget-nvptx-bc-path`, to let
the user choose which bitcode library is used.
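Illustrative usage of the new option; the path, CUDA version, and SM number below are made up:
```
// clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
//       --libomptarget-nvptx-bc-path=/opt/llvm/lib/libomptarget-nvptx-cuda_110-sm_70.bc \
//       saxpy.c -o saxpy
// Without the option, the driver searches the usual library paths and now
// errors out (instead of warning) if no bitcode library is found there.
int main(void) {
  double a = 2.0, x = 3.0, y = 1.0;
#pragma omp target map(tofrom : y)
  { y += a * x; }
  return (int)y;
}
```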
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95161
Summary:
-debug-info-kind=constructor reduces the amount of class debug info that
is emitted; this patch switches to using this as the default.
Constructor homing emits the complete type info for a class only when the
constructor is emitted, so it is expected that there will be some classes that
are not defined in the debug info anymore because they are never constructed,
and we shouldn't need debug info for these classes.
I compared the PDB files for clang, and there are 273 class types that are defined with `=limited`
but not with `=constructor` (out of ~60,000 total class types).
We've looked at a number of the types that are no longer defined with `=constructor`. The vast
majority of cases look like this: class A is used as a parameter in a member function of
some other class B, and B is emitted. But the function that uses class A is never called, and class A
is never constructed, and therefore it isn't emitted in the debug info.
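A small example of that pattern (class names are made up; the cc1 flag spelling in the comment is only for illustration):
```
// Compile with something like:
//   clang++ -g -Xclang -debug-info-kind=constructor -c example.cpp
struct A {
  A();                // user-declared constructor that is never called
  int x;
};

struct B {
  void takesA(A &a);  // A appears only as a parameter type
};

int main() {
  B b;                // B is constructed, so its definition is emitted;
  (void)b;            // A is never constructed, so under constructor homing
  return 0;           // only a forward declaration of A remains in the debug info.
}
```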
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Subscribers: aprantl, cfe-commits, lldb-commits
Tags: #clang, #lldb
Differential Revision: https://reviews.llvm.org/D79147
This patch removes the remaining part of the OpenMP offload linker scripts, which was used for inserting device binaries into the output linked binary. Device binaries are now inserted into the host binary with the help of a wrapper bitcode file that contains the device binaries as data. The wrapper bitcode file is created dynamically by the clang driver with the help of the new clang-offload-wrapper tool, which takes device binaries as input and produces a bitcode file with the required contents. The wrapper bitcode is then compiled to an object, and the resulting object is appended to the host link by the clang driver.
This is the second part of the patch for eliminating the OpenMP linker script (please see https://reviews.llvm.org/D64943).
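A conceptual C++ view of what the wrapper bitcode contains; clang-offload-wrapper emits equivalent LLVM IR directly, and the struct shapes below only approximate libomptarget's omptarget.h, so treat them as an illustration:
```
#include <cstdint>

struct __tgt_offload_entry;          // host/device entry-table element

struct __tgt_device_image {          // one embedded device binary
  void *ImageStart, *ImageEnd;
  __tgt_offload_entry *EntriesBegin, *EntriesEnd;
};

struct __tgt_bin_desc {              // descriptor handed to libomptarget
  int32_t NumDeviceImages;
  __tgt_device_image *DeviceImages;
  __tgt_offload_entry *HostEntriesBegin, *HostEntriesEnd;
};

extern "C" void __tgt_register_lib(__tgt_bin_desc *Desc);

// The device binary is embedded verbatim as data ...
static const unsigned char DeviceBin[] = {0x7f, 'E', 'L', 'F' /* ... */};
// ... and a global constructor in the wrapper object builds a __tgt_bin_desc
// pointing at it and calls __tgt_register_lib(), which is what previously
// required the OpenMP offload linker script.
```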
Differential Revision: https://reviews.llvm.org/D68166
llvm-svn: 374219
Summary:
In this patch we propose a temporary solution for resolving math functions for the NVPTX toolchain, temporary until the OpenMP variant mechanism is supported by Clang.
We intercept the inclusion of the math.h and cmath headers and, if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
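A rough sketch of the interception idea; the guard macros are illustrative and this is not necessarily the exact wrapper header added by the patch:
```
// A math.h shim placed on the resource-directory include path runs before the
// system header when compiling for the OpenMP NVPTX device.
#if defined(_OPENMP) && defined(__NVPTX__)
// ... pull in the CUDA device math declarations here, reusing CUDA's
//     math function resolution mechanism ...
#endif
#include_next <math.h>
```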
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360265
This commit appears to be breaking stage-2 builds on GreenDragon. The
OpenMP wrappers for cmath and math.h are copied into the root of the
resource directory and cause a cyclic dependency in module 'Darwin':
Darwin -> std -> Darwin. This blows up when CMake is testing for modules
support and breaks all stage 2 module builds, including the ThinLTO bot
and all LLDB bots.
CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message):
LLVM_ENABLE_MODULES is not supported by this compiler
llvm-svn: 360192
Summary:
In this patch we propose a temporary solution for resolving math functions for the NVPTX toolchain, temporary until the OpenMP variant mechanism is supported by Clang.
We intercept the inclusion of the math.h and cmath headers and, if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360063
A faster way to reduce the values in teams reductions was found; the
codegen is updated to use this faster algorithm and new runtime functions.
llvm-svn: 354479
Summary:
Added support for the -gline-directives-only option and fixed the debug-info
logic for CUDA devices. If the optimization level is O0, then the
--[no-]cuda-noopt-device-debug options do not affect the debug info level. If
the optimization level is above O0, debug info is requested, and
--no-cuda-noopt-device-debug is used (or --cuda-noopt-device-debug is not
used), then the optimization level for the device code is kept and only debug
directives are emitted.
If the optimization level is above O0, debug info is requested, and the
--cuda-noopt-device-debug option is used, then optimization is disabled
for the device code and the required debug info is emitted.
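The resulting behavior, summarized with illustrative invocations (x.cu stands for any CUDA source):
```
// clang -O0 -g x.cu
//   device gets debug info; --[no-]cuda-noopt-device-debug has no effect
// clang -O2 -g x.cu   (with --no-cuda-noopt-device-debug, or neither option)
//   device code stays optimized; only debug directives are emitted
// clang -O2 -g --cuda-noopt-device-debug x.cu
//   device optimization is disabled; the requested debug info is emitted
```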
Reviewers: tra, echristo
Subscribers: aprantl, guansong, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D51554
llvm-svn: 348930
clang-offload-bundler should not be invoked with the unbundling action
when the input file type does not match the action type. For example,
.so files should be unbundled during the linking phase and should be linked
only with the host code.
llvm-svn: 343335
When looking for the bclib, Clang considered the default library
path first, while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because it allows using a non-default runtime library
during development. So change the search for the bclib to start
looking in the directories given by LIBRARY_PATH.
Additionally, add a new option --libomptarget-nvptx-path= which
is searched first. This will be handy for testing purposes.
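Illustrative usage (the paths are made up): point the driver at a locally built device runtime during development:
```
// clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
//       --libomptarget-nvptx-path=$HOME/llvm-build/lib foo.c -o foo
// The directory given by the option is searched first, then the directories
// from LIBRARY_PATH, and finally the default library path.
```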
Differential Revision: https://reviews.llvm.org/D51686
llvm-svn: 343230
types."
It reverts commit r342991 plus several other commits intended to fix the
tests. Some tests still fail and need to be investigated.
llvm-svn: 343002
clang-offload-bundler should not be invoked with the unbundling action
when the input file type does not match the action type. For example,
.so files should be unbundled during the linking phase and should be linked
only with the host code.
llvm-svn: 342991
Summary:
Some targets, like the NVPTX target, support only the default set of debug
options and do not support additional debug options. The patch introduces a
virtual function, supportsDebugInfoOptions(), that can be overridden by a
toolchain; it checks whether the target supports certain debug options and
emits a warning when an unsupported debug option is found.
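A minimal sketch of the hook, with a simplified signature relative to the real ToolChain interface:
```
namespace llvm { namespace opt { class Arg; } }

class ToolChain {
public:
  virtual ~ToolChain() = default;
  // By default, every debug-info option is considered supported.
  virtual bool supportsDebugInfoOption(const llvm::opt::Arg *) const {
    return true;
  }
};

class NVPTXToolChain : public ToolChain {
public:
  // NVPTX accepts only the default set of debug options; the driver warns
  // about (and ignores) any option this rejects.
  bool supportsDebugInfoOption(const llvm::opt::Arg *A) const override;
};
```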
Reviewers: echristo
Subscribers: aprantl, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D49148
llvm-svn: 338155
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel
Reviewed By: ABataev, grokos
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43197
llvm-svn: 327460
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel
Reviewed By: ABataev, grokos
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43197
llvm-svn: 327438
This was previously done in some places, but not, for example, for
bundling, so single-object compilation with -c failed. In addition,
cubin was used for all file types during unbundling, which is
incorrect for assembly files that are passed to ptxas.
Tighten up the tests so that we can't regress in this area.
Differential Revision: https://reviews.llvm.org/D40250
llvm-svn: 318763
AuxTriple is not set if host and device share a toolchain. Also,
removing an argument modifies the DAL which needs to be returned
for future use.
(Move tests back to offload-openmp.c as they are not related to GPUs.)
Differential Revision: https://reviews.llvm.org/D38258
llvm-svn: 314329
Parsing the argument after -Xopenmp-target allocates memory that needs
to be freed. Associate it with the final DerivedArgList after we know
which one will be used.
Differential Revision: https://reviews.llvm.org/D38257
llvm-svn: 314328
Summary: If we only use the compiler front-end, do not throw an error about the CUDA device library not being found. This allows the front-end to be run on systems where no CUDA installation is found.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra
Reviewed By: tra
Subscribers: hfinkel, tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D37914
llvm-svn: 314217
Summary: Enable the -nocudalib flag for the OpenMP device offloading toolchain as well. Currently it can only be used for the CUDA toolchain.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, hfinkel, tra
Reviewed By: tra
Subscribers: hfinkel, cfe-commits
Differential Revision: https://reviews.llvm.org/D37913
llvm-svn: 314164
Summary: When composing the output file name, the path to the file is being dropped. The full path is required.
Reviewers: Hahnfeld, ABataev, caomhin, carlo.bertolli, hfinkel, tra
Reviewed By: tra
Subscribers: hfinkel, tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D37912
llvm-svn: 314156
Summary: If we only use the compiler front-end, do not throw an error about the CUDA device library not being found. This allows the front-end to be run on systems where no CUDA installation is found.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra
Reviewed By: tra
Subscribers: hfinkel, tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D37914
llvm-svn: 314150
Create a separate test file to contain all tests for OpenMP
offloading to GPUs.
Make libdevice checking more robust by accounting for
the case in which no libdevice is found.
These changes are in connection with diff D29660.
llvm-svn: 310718