llvm-project

Commit Graph

Author	SHA1	Message	Date
AndreyChurbanov	a60bc55c69	[OpenMP] libomp: cleanup parsing of OMP_ALLOCATOR env variable. Differential Revision: https://reviews.llvm.org/D94932	2021-01-19 16:21:22 +03:00
Kelvin Li	9d81073acb	[OpenMP][Docs] Fix typos in FAQ (NFC)	2021-01-18 18:55:58 -05:00
AndreyChurbanov	aa3a59e0c6	[OpenMP][NFC] Fix test The test fails if memkind library is accessible.	2021-01-19 00:05:34 +03:00
Shilei Tian	9bf843bdc8	Revert "[OpenMP] Added the support for hidden helper task in RTL" This reverts commit `ed939f853d`.	2021-01-18 06:57:52 -05:00
Chandler Carruth	f855751c12	Fix openmp CMake build on non-Linux AArch64 systems. This just checks for `/proc/cpuinfo` existing before reading it. Tested on an ARM macOS machine.	2021-01-17 16:18:31 -08:00
Shilei Tian	ed939f853d	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-16 14:13:35 -05:00
Jon Chesterfield	214387c2c6	[libomptarget][nvptx] Reduce calls to cuda header Remove use of clock_t in favour of a builtin. Drop a preprocessor branch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94731	2021-01-15 02:16:33 +00:00
Jon Chesterfield	6e7094c14b	[libomptarget][nvptx][nfc] Move target_impl functions out of header This removes most of the differences between the two target_impl.h. Also change name mangling from C to C++ for __kmpc_impl_*_lock. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D94728	2021-01-15 00:19:48 +00:00
Shilei Tian	547b032ccc	[OpenMP] Remove omptarget-nvptx from deps as it is no longer a valid target `omptarget-nvptx` is still a dependence for `check-libomptarget-nvtpx` although it has been removed by D94573. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94725	2021-01-14 19:16:11 -05:00
Shilei Tian	64e9e9aeee	[OpenMP] Dropped unnecessary define when compiling deviceRTLs for NVPTX The comment said CUDA 9 header files use the `nv_weak` attribute which `clang` is not yet prepared to handle. It's three years ago and now things have changed. Based on my test, removing the definition doesn't have any problem on my machine with CUDA 11.1 installed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94700	2021-01-14 13:55:12 -05:00
Shilei Tian	763c1f9933	[OpenMP] Drop the static library libomptarget-nvptx For NVPTX target, OpenMP provides a static library `libomptarget-nvptx` built by NVCC, and another bitcode `libomptarget-nvptx-sm_{$sm}.bc` generated by Clang. When compiling an OpenMP program, the `.bc` file will be fed to `clang` in the second run on the program that compiles the target part. Then the generated PTX file will be fed to `ptxas` to generate the object file, and finally the driver invokes `nvlink` to generate the binary, where the static library will be appended to `nvlink`. One question is, why do we need two libraries? The only difference is, the static library contains `omp_data.cu` and the bitcode library doesn't. It's unclear why they were implemented in this way, but per D94565, there is no issue if we also include the file into the bitcode library. Therefore, we can safely drop the static library. This patch is about the change in OpenMP. The driver will be updated as well if this patch is accepted. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D94573	2021-01-14 13:34:25 -05:00
Jon Chesterfield	5d165f0b89	[libomptarget][amdgpu] Fix kernel launch tracing to match previous behavior Restore control of kernel launch tracing to be >= 1 as it was before export LIBOMPTARGET_KERNEL_TRACE=1 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94695	2021-01-14 18:13:22 +00:00
Terry Wilmarth	4fe17ada55	[OpenMP] Fix hierarchical barrier Hierarchical barrier is an experimental barrier algorithm that uses aspects of machine hierarchy to define the barrier tree structure. This patch fixes offset calculation in hierarchical barrier. The offset is used to store info on a flag about sleeping threads waiting on a location stored in the flag. This commit also fixes a potential deadlock in hierarchical barrier when using infinite blocktime by adjusting the offset value of leaf kids so that it matches the value of leaf state. It also adds testing of default barriers with infinite blocktime, and also tests hierarchical barrier algorithm with both default and infinite blocktime. Patch by Terry Wilmarth and Nawrin Sultana. Differential Revision: https://reviews.llvm.org/D94241	2021-01-13 10:22:57 -06:00
Joseph Huber	a957634942	[OpenMP] Add documentation for error messages and release notes Add extra information to the runtime page describing the error messages and add information to the release notes for clang 12.0 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94562	2021-01-13 11:00:41 -05:00
Jon Chesterfield	84e0b14a0a	[libomptarget][nvptx] Include omp_data.cu in bitcode deviceRTL Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D94565	2021-01-13 03:51:11 +00:00
Hansang Bae	bba3a82b56	[OpenMP] Use persistent memory for omp_large_cap_mem This change enables volatile use of persistent memory for omp_large_cap_mem* on supported systems. It depends on libmemkind's support for persistent memory, and requirements/details can be found at the following url. https://pmem.io/2020/01/20/memkind-dax-kmem.html Differential Revision: https://reviews.llvm.org/D94353	2021-01-12 20:35:27 -06:00
Hansang Bae	6f0f022038	[OpenMP] Update allocator trait key/value definitions Use new definitions introduced in 5.1 specification. Differential Revision: https://reviews.llvm.org/D94277	2021-01-12 20:09:45 -06:00
Shilei Tian	01f1273fe2	[OpenMP] Fixed a typo in openmp/CMakeLists.txt	2021-01-12 17:00:49 -05:00
Shilei Tian	68ff52ffea	[OpenMP] Fixed the link error that cannot find static data member Constant static data member can be defined in the class without another define after the class in C++17. Although it is C++17, Clang can still handle it even w/o the flag for C++17. Unluckily, GCC cannot handle that. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D94541	2021-01-12 16:48:28 -05:00
Jon Chesterfield	33e2494bea	[libomptarget][amdgpu][nfc] Fix build on centos rtl.cpp replaced 224 with a #define from elf.h, but that doesn't work on a centos 7 build machine with an old elf.h Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D94528	2021-01-12 19:40:03 +00:00
Shilei Tian	bdd1ad5e5c	[OpenMP] Fixed include directories for OpenMP when building OpenMP with LLVM_ENABLE_RUNTIMES Some LLVM headers are generated by CMake. Before the installation, LLVM's headers are distributed everywhere, some of which are in `${LLVM_SRC_ROOT}/llvm/include/llvm`, and some are in `${LLVM_BINARY_ROOT}/include/llvm`. After installation, they're all in `${LLVM_INSTALLATION_ROOT}/include/llvm`. OpenMP now depends on LLVM headers. Some headers depend on headers generated by CMake. When building OpenMP along with LLVM, a.k.a via `LLVM_ENABLE_RUNTIMES`, we need to tell OpenMP where it can find those headers, especially those that have not yet been copied/installed. Reviewed By: jdoerfert, jhuber6 Differential Revision: https://reviews.llvm.org/D94534	2021-01-12 14:32:38 -05:00
Shilei Tian	0871d6d516	[OpenMP] Move memory manager to plugin and make it a common interface The lifetime of `libomptarget` and its opened plugins are not aligned and it's hard for `libomptarget` to determine when the plugins are destroyed. As a result, some issues (see D94256 for details) occur on some platforms. Actually, if we take target memory as target resources, same as other resources, such as CUDA streams, in each plugin, then the memory manager should also be in the plugin. Also considering some platforms may want to opt out of the feature, it makes sense to move the memory manager to plugin, make it a common interface, and let plugin developers determine whether they need it. This is what this patch does. CUDA plugin is taken as example to show how to integrate it. In this way, we can also get a bonus that different thresholds can be set for different platforms. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D94379	2021-01-11 21:33:42 -05:00
Shilei Tian	a81c68ae6b	[OpenMP] Take elf_common.c as an interface library For now `elf_common.c` is taken as a common part included into different plugin implementations directly via `#include "../../common/elf_common.c"`, which is not a best practice. Since it is simple enough such that we don't need to create a real library for it, we just take it as an interface library so that other targets can link it directly. Another advantage of this method is, we don't need to add the folder into header search path which can potentially pollute the search path. VE and AMD platforms have not been tested because I don't have target machines. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94443	2021-01-11 17:34:26 -05:00
Shilei Tian	7be3285248	[OpenMP] Not set OPENMP_STANDALONE_BUILD=ON when building OpenMP along with LLVM For now, `*_STANDALONE_BUILD` is set to ON even if they're built along with LLVM because of issues mentioned in the comments. This can cause some issues. For example, if we build OpenMP along with LLVM, we'd like to copy those OpenMP headers to `<prefix>/lib/clang/<version>/include` such that `clang` can find those headers without using `-I <prefix>/include` because those headers will be copied to `<prefix>/include` if it is built standalone. In this patch, we fixed the dependence issue in OpenMP such that it can be built correctly even with `OPENMP_STANDALONE_BUILD=OFF`. The issue is in the call to `add_lit_testsuite`, where `clang` and `clang-resource-headers` are passed as `DEPENDS`. Since we're building OpenMP along with LLVM, `clang` is set by CMake to be the C/C++ compiler, therefore these two dependences are no longer needed, which caused the dependence issue. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93738	2021-01-10 16:46:19 -05:00
Shilei Tian	175c336a1c	[OpenMP] Remove copy constructor of `RTLInfoTy` Multiple `RTLInfoTy` objects are stored in a list `AllRTLs`. Since `RTLInfoTy` contains a `std::mutex`, it is by default not a copyable object. In order to support `AllRTLs.push_back(...)` which is currently used, a customized copy constructor is provided. Every time we need to add a new data member into `RTLInfoTy`, we should keep in mind not forgetting to add corresponding assignment in the copy constructor. In fact, the only use of the copy constructor is to push the object into the list, we can of course write it in a way that first emplace a new object back, and then use the reference to the last element. In this way we don't need the copy constructor anymore. If the element is invalid, we just need to pop it, and that's what this patch does. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94361	2021-01-09 13:01:01 -05:00
Shilei Tian	676c7cb0c0	[OpenMP] Added the support for cache line size 256 for A64FX Fugaku supercomputer is built with the Fujitsu A64FX microprocessor, whose cache line is 256. In current libomp, we only have cache line size 128 for PPC64 and otherwise 64. This patch added the support of cache line 256 for A64FX. It's worth noting that although A64FX is a variant of AArch64, this property is not shared. As a result, in light of UCX source code (`392443ab92/src/ucs/arch/aarch64/cpu.c (L17)`), we can only determine by checking whether the CPU is FUJITSU A64FX. Reviewed By: jdoerfert, Hahnfeld Differential Revision: https://reviews.llvm.org/D93169	2021-01-09 11:58:47 -05:00
Joseph Huber	2ce16810f2	[OpenMP] Always print error messages in libomptarget CUDA plugin Summary: Currently error messages from the CUDA plugins are only printed to the user if they have debugging enabled. Change this behaviour to always print the messages that result in offloading failure. This improves the error messages by indicating what happened when the error occurs in the plugin library, such as a segmentation fault on the device. Reviewed by: jdoerfert Differential Revision: https://reviews.llvm.org/D94263	2021-01-07 17:47:32 -05:00
Johannes Doerfert	9ae171bcd3	[OpenMP][Docs] Add remarks intro section Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D93735	2021-01-07 14:31:17 -06:00
Joseph Huber	abb174bbc1	[OpenMP] Add example in Libomptarget Information docs Add an example to the OpenMP Documentation on the LIBOMPTARGET_INFO environment variable Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94246	2021-01-07 15:00:51 -05:00
Hansang Bae	fb1c528526	[OpenMP] Use c_int/c_size_t in Fortran target memory routine interface The Fortran interface is now in line with 5.1 specification. Differential Revision: https://reviews.llvm.org/D94042	2021-01-06 16:28:30 -06:00
Shilei Tian	5acdae1f9a	[OpenMP] Fixed an issue that wrong LLVM headers might be included when building libomptarget Wrong LLVM headers might be included if we don't set `include_directories` to a right place. This will cause a compilation error if LLVM is installed in system directories. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93737	2021-01-06 17:07:36 -05:00
Shilei Tian	e2a623094f	[OpenMP] Fixed the test environment when building along with LLVM Currently all built libraries in OpenMP are anywhere if building along with LLVM. It is not an issue if we don't execute any test. However, almost all tests for `libomptarget` fail because in the lit configuration, we only set `<build_dir>/libomptarget` to `LD_LIBRARY_PATH` and `LIBRARY_PATH`. Since those libraries are everywhere, `clang` can no longer find `libomptarget.so` or those deviceRTLs anymore. In this patch, we set a unified path for all built libraries, no matter whether it is built along with LLVM or not. In this way, our lit configuration can work properly. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93736	2021-01-06 17:06:16 -05:00
George Rokos	dec02904d2	[libomptarget] Allow calls to omp_target_memcpy with 0 size. Differential Revision: https://reviews.llvm.org/D94095	2021-01-05 16:03:53 -08:00
Joseph Huber	fe5d51a489	[OpenMP] Add using bit flags to select Libomptarget Information Summary: This patch adds more fine-grained support over which information is output from the libomptarget runtime when run with the environment variable LIBOMPTARGET_INFO set. An extensible set of flags can be used to pick and choose which information the user is interested in. Reviewers: jdoerfert JonChesterfield grokos Differential Revision: https://reviews.llvm.org/D93727	2021-01-04 12:03:15 -05:00
Jon Chesterfield	76bfbb74d3	[libomptarget][amdgpu] Call into deviceRTL instead of ockl Amdgpu codegen presently emits a call into ockl. The same functionality is already present in the deviceRTL. Adds an amdgpu specific entry point to avoid the dependency. This lets simple openmp code (specifically, that which doesn't use libm) run without rocm device libraries installed. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D93356	2021-01-04 16:48:47 +00:00
Hansang Bae	82a29a62ab	[OpenMP] Add definition/interface for target memory routines The change includes new routines introduced in 5.1 and Fortran interface. Differential Revision: https://reviews.llvm.org/D93505	2021-01-04 08:12:57 -06:00
Terry Wilmarth	6b316febb4	[OpenMP] libomp: Handle implicit conversion warnings This patch partially prepares the runtime source code to be built with -Wconversion, which should trigger warnings if any implicit conversions can possibly change a value. For builds done with icc or gcc, all such warnings are handled in this patch. clang gives a much longer list of warnings, particularly for sign conversions, which the other compilers don't report. The -Wconversion flag is commented into cmake files, but I'm not going to turn it on. If someone thinks it is important, and wants to fix all the clang warnings, they are welcome to. Types of changes made here involve either improving the consistency of types used so that no conversion is needed, or else performing careful explicit conversions, when we're sure a problem won't arise. Patch is a combination of changes by Terry Wilmarth and Johnny Peyton. Differential Revision: https://reviews.llvm.org/D92942	2020-12-31 00:39:57 +03:00
Joseph Huber	631501b1f9	[OpenMP] Fixing typo on memory size in Documentation	2020-12-23 11:46:26 -05:00
Joseph Huber	6e60346495	[OpenMP] Fixing Typo in Documentation	2020-12-23 09:17:51 -05:00
Joseph Huber	1c19804ebf	[OpenMP] Add OpenMP Documentation for Libomptarget environment variables Add support to the OpenMP web pages for environment variables supported by Libomptarget and their usage. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93723	2020-12-22 17:41:27 -05:00
Johannes Doerfert	7b0f9dd79a	[OpenMP][Docs] Fix Typo	2020-12-22 13:06:23 -06:00
Shilei Tian	1eb082c2ea	[OpenMP][Docs] Fixed a typo in the doc that can mislead users to a CMake error When setting `LLVM_ENABLE_RUNTIMES`, lower case word should be used; otherwise, it can cause a CMake error that specific path is not found. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D93719	2020-12-22 14:05:58 -05:00
Johannes Doerfert	9cb748724e	[OpenMP][Docs] Add FAQ entry about math and complex on GPUs Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D93718	2020-12-22 13:05:04 -06:00
Shilei Tian	612ddc3117	[OpenMP][Docs] Updated the faq about building an OpenMP offloading capable compiler After some issues about building runtimes along with LLVM were fixed, building an OpenMP offloading capable compiler is pretty simple. This patch updates the FAQ part in the doc. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93671	2020-12-22 13:14:53 -05:00
Johannes Doerfert	994bb6eb7d	[OpenMP][NFC] Provide a new remark and documentation If a GPU function is externally reachable we give up trying to find the (unique) kernel it is called from. This can hinder optimizations. Emit a remark and explain mitigation strategies. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D93439	2020-12-17 14:38:26 -06:00
Hansang Bae	e1fd202489	[OpenMP] Add definitions for 5.1 interop to omp.h	2020-12-17 13:03:59 -06:00
Atmn	907886cc5b	[OpenMP][Libomptarget][NFC] Use CMake Variables This patch adds CMake variables to add subdirectories and include directories for libomptarget and explicitly gives the location of source files. Differential Revision: https://reviews.llvm.org/D93290	2020-12-16 19:05:15 -05:00
Jon Chesterfield	b607837c75	[libomptarget][nfc] Replace static const with enum Semantically identical. Replaces 0xff... with ~0 to spare counting the f. Has the advantage that the compiler doesn't need to prove the 4/8 byte value dead before discarding it, and sidesteps the compilation question associated with what static means for a single source language. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93328	2020-12-16 16:40:37 +00:00
Peyton, Jonathan L	5aafdd7b88	[OpenMP] Introduce new file wrapper class for runtime Introduce new kmp_safe_raii_file_t class with RAII semantics for file open/close. It is essentially a wrapper around the C-style FILE* object. This also unifies the way we error report if a file can't be opened. Differential Revision: https://reviews.llvm.org/D92604	2020-12-15 14:46:30 -06:00
Hansang Bae	171ca93c54	[OpenMP] Initialize runtime in the forked child process This patch enables serial initialization in the forked child process to fix unstable runtime behavior when used with Python-based AI tools. Differential Revision: https://reviews.llvm.org/D93230	2020-12-15 07:29:28 -06:00

1465 Commits