llvm-project

Commit Graph

Author	SHA1	Message	Date
Hansang Bae	ffb21e7f05	[OpenMP] Enable omp_get_num_devices() on Windows This patch enables omp_get_num_devices() and omp_get_initial_device() on Windows by providing an alternative to dlsym on Windows, and proposes to add a new libomptarget entry, __tgt_get_num_devices(). Differential Revision: https://reviews.llvm.org/D96182	2021-02-11 14:53:48 -06:00
Nawrin Sultana	4692bb4a8a	[OpenMP] Add lower and upper bound in num_teams clause This patch adds lower-bound and upper-bound to the num_teams clause according to the OpenMP 5.1 specification. The initial number of teams created is implementation defined, but it will be greater than or equal to lower-bound and less than or equal to upper-bound. If the num_teams clause is not specified, the number of teams created is implementation defined, but it will be greater than or equal to 1. Differential Revision: https://reviews.llvm.org/D95820	2021-02-10 13:58:50 -06:00
Jon Chesterfield	56c446a878	[libomptarget][amdgcn] Tolerate deadstripped device_state variable The device_state variable may have been deadstripped. Similar to device_environment, leave detection of missing but used symbol to loader. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D96330	2021-02-09 16:29:53 +00:00
Jon Chesterfield	4756f76bce	[libomptarget][amdgcn] Tolerate deadstripped env variable Discovered by Pushpinder. If the device_environment variable is unused it can be deadstripped, in which case we should not abort due to it missing. This change is safe in that a missing symbol which is actually used can be reported by both linker and loader, and a missing unused symbol is better deadstripped than left in the image. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D96329	2021-02-09 11:58:37 +00:00
Jon Chesterfield	2fa4186d4e	[libomptarget][amdgcn] Fix language linkage post D95300, drop use of assert	2021-02-08 20:07:51 +00:00
Shilei Tian	b68a6b09e6	[OpenMP][libomptarget] Fixed an issue that device sync is skipped if the kernel doesn't have any argument Currently, if there is no kernel argument, device synchronization will be skipped. This can lead to two issues: 1. If there is any device error, it will not be captured; 2. The target region might end before the kernel is done, which is not spec conformant. The test added in this patch only runs on NVPTX platform, although it will not be executed by Phab at all. It also requires `not` which is not available on most systems. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96067	2021-02-04 20:14:24 -05:00
Shilei Tian	567b3f8841	[OpenMP][deviceRTLs] Drop `assert` in common parts of `deviceRTLs` The header `assert.h` needs to be included in order to use `assert` in the code. When building NVPTX `deviceRTLs` on a CUDA free system, it requires headers from `gcc-multilib`, which some systems don't have. This patch drops the use of `assert` in common parts of `deviceRTLs`. In light of `openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h`, a code block ``` if (!cond) __builtin_trap(); ``` is being used. The builtin will be translated to `call void @llvm.trap()`, and the corresponding PTX is `trap;`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95986	2021-02-04 12:39:43 -05:00
Shilei Tian	0f0ce3c12e	[OpenMP][NVPTX] Take functions in `deviceRTLs` as `convergent` OpenMP device compiler (similar to other SPMD compilers) assumes that functions are convergent by default to avoid invalid transformations, such as the bug (https://bugs.llvm.org/show_bug.cgi?id=49021). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95971	2021-02-03 20:58:12 -05:00
Shilei Tian	3c31b78455	[OpenMP] Fixed an issue that taskwait doesn't work on detachable task D77609 mistakenly changed the behavior of taskwait so that a detachable task is not waited on, as reported in https://lists.llvm.org/pipermail/openmp-dev/2021-February/003836.html. This patch fixes it. Thanks to Raúl for the report. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95798	2021-02-03 13:12:43 -05:00
Peyton, Jonathan L	ffca74b8b8	[OpenMP] Fix sign comparison warnings from GCC New affinity patch introduced legitimate sign-compare warnings that clang doesn't report but GCC-10 does. This removes the warnings by changing two variables types to unsigned. Differential Revision: https://reviews.llvm.org/D95818	2021-02-02 10:52:16 -06:00
Joseph Huber	ed8943c087	[OpenMP][NFC] Adding FAQ Entry for errors with static libraries	2021-02-02 10:50:22 -05:00
Atmn Patel	b545667d0a	[OpenMP][Libomptarget] Remove possible harmful copy constructor call for RTLsTy From https://bugs.llvm.org/show_bug.cgi?id=48973, we know that `std::call_once(PM->RTLs.initFlag, &RTLsTy::LoadRTLs, PM->RTLs)` causes compile time problems in libstdc++v3 5.3.1. This is because there was a defect in the standard regarding the `call_once` (LWG 2442). This was fixed in libstdc++ soon thereafter, but there are likely other standard libraries where this will fail. By matching this function call with the other one, we fix this bug. Differential Revision: https://reviews.llvm.org/D95769	2021-02-01 20:13:03 -05:00
AndreyChurbanov	d7b12004bd	[OpenMP] libomp: implement nteams-var and teams-thread-limit-var ICVs The change includes OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT env variables, omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit routines. Differential Revision: https://reviews.llvm.org/D95003	2021-02-01 22:54:11 +03:00
Shilei Tian	f0129cc35e	[OpenMP] Disable tests if FileCheck is not available in in-tree building FileCheck is required for OpenMP tests. The current detection can fail if building OpenMP in-tree when user sets `LLVM_INSTALL_TOOLCHAIN_ONLY=ON`. As a result, CMake will raise an error and the compilation will be broken. This patch fixed the issue. When `FileCheck` is not a target, tests will just be skipped. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D95689	2021-02-01 13:14:55 -05:00
Joseph Huber	fda4853998	[OpenMP] Fix seg fault in libomptarget when using Info with multiple threads Summary: One option for the LIBOMPTARGET_INFO environment variable is to print the current status of the device's data mappings. These are a shared resource among threads so this needs to be protected when using multiple streams. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95786	2021-02-01 11:21:57 -05:00
xgupta	94fac81fcc	[Branch-Rename] Fix some links According to the [[ https://foundation.llvm.org/docs/branch-rename/ \| status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D95766	2021-02-01 16:43:21 +05:30
Tobias Hieta	c3c02d0d5a	[OpenMP] Fix python3 compatibility in openmp's lit.cfg Differential Revision: https://reviews.llvm.org/D95669	2021-02-01 08:20:26 +01:00
Shilei Tian	26d38f6d20	[OpenMP][NVPTX] Refined CMake logic to choose compute capabilities This patch refines the logic to choose compute capabilities via the environment variable `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES`. It supports the following values (all case insensitive): - "all": Build `deviceRTLs` for all supported compute capabilities; - "auto": Only build for the compute capability auto detected. Note that this requires CUDA. If CUDA is not found, a CMake fatal error will be raised. - "xx,yy" or "xx;yy": Build for compute capabilities `xx` and `yy`. If `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES` is not set, it is equivalent to setting it to `all`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95687	2021-01-30 15:14:48 -05:00
Jonathan Peyton	67773681c0	[OpenMP] Add environment variable to force monotonic dynamic scheduling This patch introduces a new environment variable to force monotonic behavior for users that absolutely need it. This is in anticipation of 5.0 change that uses non-monotonic behavior for dynamic scheduling by default. Fixes for that and the actual switch are coming soon. Differential Revision: https://reviews.llvm.org/D95263	2021-01-29 12:23:27 -06:00
Shilei Tian	7bc31018f7	[OpenMP][NFC] Added release note for new `deviceRTLs` and hidden helper task Added release note for new `deviceRTLs` and hidden helper task for LLVM 12. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95584	2021-01-29 13:13:03 -05:00
AndreyChurbanov	7f5ad0e071	[OpenMP] libomp: fix build by cl with vs2019 Replace VLA with dynamic allocation using alloca(). This fixes https://bugs.llvm.org/show_bug.cgi?id=48919. Differential Revision: https://reviews.llvm.org/D95627	2021-01-29 13:16:41 +03:00
AndreyChurbanov	ac70a53653	[OpenMP] NFC: disabled two flakey tests as the bug in libomp not fixed yet	2021-01-29 00:54:13 +03:00
Shilei Tian	1b19c42302	[OpenMP][deviceRTLs] Separate declaration of target dependent functions from `target_impl.h` This patch created a new header file `target_interface.h` for declarations of all target dependent functions. All future targets can get things work by simply implementing all functions declared in the header and macros/data same as each `target_impl.h`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95300	2021-01-28 08:14:33 -05:00
Shilei Tian	5a64794bba	[OpenMP][NVPTX] Added the missing -O1 when building NVPTX bitcode libraries In the past `-O1` was used when building NVPTX bitcode libraries. After we switched to OpenMP, `-O1` was missing by mistake, leading to a huge performance regression. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95545	2021-01-28 08:13:38 -05:00
Shilei Tian	19248d30e4	[OpenMP][deviceRTLs] Added `[[clang::loader_uninitialized]]` explicitly `[[clang::loader_uninitialized]]` is in macro `SHARED` but it doesn't work for array like `parallelLevel`, so the variable will be zero initialized. There is also a similar issue for `omptarget_nvptx_device_State` which is in global address space. Its c'tor is also generated, which was not in the past when building the `deviceRTLs` with CUDA. In this patch, we added the attribute to the two variables explicitly. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95550	2021-01-28 08:12:49 -05:00
Shilei Tian	c571b16834	[OpenMP] Disabled profiling in `libomp` by default to unblock link errors Link error occurred when time profiling in libomp is enabled by default because `libomp` is assumed to be a C library but the dependence on `libLLVMSupport` for profiling is a C++ library. Currently the issue blocks all OpenMP tests in Phabricator. This patch set a new CMake option `OPENMP_ENABLE_LIBOMP_PROFILING` to enable/disable the feature. By default it is disabled. Note that once time profiling is enabled for `libomp`, it becomes a C++ library. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95585	2021-01-28 07:24:32 -05:00
Vyacheslav Zakharin	0fc90873b2	[libomptarget][NFC] Link plugins with threads support library due to std::call_once usage. Differential Revision: https://reviews.llvm.org/D95572	2021-01-27 19:26:18 -08:00
Atmn Patel	8a77056256	[OpenMP][Libomptarget] Fix conditional in CMake for remote plugin The remote offloading plugin's CMakeLists was trying to build if its flag was enabled even if it didn't find gRPC/protobuf. The conditional was wrong, it's fixed by this. Differential Revision: https://reviews.llvm.org/D95574	2021-01-27 21:28:25 -05:00
Shilei Tian	fb12df4a8e	[OpenMP][NVPTX] Disable building NVPTX deviceRTL by default on a non-CUDA system D95466 dropped CUDA to build NVPTX deviceRTL and enabled it by default. However, the building requires some libraries that are not available on non-CUDA system by default, which could break the compilation. This patch disabled the build by default. It can be enabled with `LIBOMPTARGET_BUILD_NVPTX_BCLIB=ON`. Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D95556	2021-01-27 17:06:14 -05:00
Peyton, Jonathan L	8e67134364	[OpenMP] Fix misleading warning for OMP_PLACES When OMP_PLACES contains an invalid value, the warning informs the user that the fallback is OMP_PLACES=threads, but the actual internal setting is OMP_PLACES=cores and is detected as such with KMP_SETTINGS=1. This patch informs the user that OMP_PLACES=cores is being used instead of OMP_PLACES=threads. Differential Revision: https://reviews.llvm.org/D95170	2021-01-27 14:27:24 -06:00
Peyton, Jonathan L	598c590b3c	[OpenMP] Add cpuid leaf 1f topology discovery This patch adds the new algorithm for topology discovery using cpuid leaf 1f. Only the new die level is detected and integrated into the current affinity mechanisms including KMP_AFFINITY (granularity level and compact/scatter algorithm), OMP_PLACES=dies, and KMP_HW_SUBSET. Differential Revision: https://reviews.llvm.org/D95157	2021-01-27 14:27:23 -06:00
Peyton, Jonathan L	9f87c6b47d	[OpenMP] Fix HWLOC topology detection for 2.0.x HWLOC 2.0 has numa nodes as separate children and are not in the main parent/child topology tree anymore. This change takes this into account. The main topology detection loop in the create_hwloc_map() routine starts at a hardware thread within the initial affinity mask and goes up the topology tree setting the socket/core/thread labels correctly. This change also introduces some of the more generic changes that the future kmp_topology_t structure will take advantage of including a generic ratio & count array (finding all ratios of topology layers like threads/core cores/socket and finding all counts of each topology layer), generic radix1 reduction step, generic uniformity check, and generic printing of topology (en_US.txt) Differential Revision: https://reviews.llvm.org/D95156	2021-01-27 14:27:23 -06:00
Giorgis Georgakoudis	1e59c1a898	[OpenMP][Libomptarget] Fix check-libomptarget The check-libomptarget target fails when building with LLVM_ENABLE_PROJECTS. This is because the test configuration is missing the path to libomp.so and libLLVMSupport.so when time profiling is enabled (both libraries have the same path when building). This patch adds the path to the configuration. Reviewed By: vzakhari Differential Revision: https://reviews.llvm.org/D95376	2021-01-27 06:46:40 -08:00
Giorgis Georgakoudis	bb40e67318	[OpenMP] Fix building using LLVM_ENABLE_RUNTIMES Fix when time profiling is enabled. Related to: D94855 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95398	2021-01-27 06:43:57 -08:00
AndreyChurbanov	498c4b6fc4	[OpenMP] libomp: fix build by clang-cl with vs2019 Problem reported by Joseph Shen <joseph.smeng@gmail.com>. The patch changes *(&<atomic-var>) to (&<atomic-var>)->load(). Differential Revision: https://reviews.llvm.org/D95485	2021-01-27 12:18:15 +03:00
Shilei Tian	e7535f8fed	[OpenMP][NVPTX] Drop dependence on CUDA to build NVPTX `deviceRTLs` With D94745, we no longer use the CUDA SDK to compile `deviceRTLs`. Therefore, much of the CMake code in the project is no longer needed. This patch cleans up unnecessary code and also drops the CUDA requirement for building NVPTX `deviceRTLs`. CUDA detection is still used, however, to determine whether the tests should be included. Auto detection of compute capability is enabled by default and can be disabled by setting the CMake variable `LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY=OFF`. If auto detection is enabled and CUDA is also valid, it will only build the bitcode library for the detected version; otherwise, all supported variants will be generated. One drawback of this patch is that we now generate 96 variants of the bitcode library, and a total of 1485 files have to be built in a clean build on a non-CUDA system. `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=""` can be used to disable building NVPTX `deviceRTLs`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95466	2021-01-26 20:21:36 -05:00
Nawrin Sultana	927af4b3c5	[OpenMP] Modify OMP_ALLOCATOR environment variable This patch sets the def-allocator-var ICV based on the environment variables provided in OMP_ALLOCATOR. Previously, only allowed value for OMP_ALLOCATOR was a predefined memory allocator. OpenMP 5.1 specification allows predefined memory allocator, predefined mem space, or predefined mem space with traits in OMP_ALLOCATOR. If an allocator can not be created using the provided environment variables, the def-allocator-var is set to omp_default_mem_alloc. Differential Revision: https://reviews.llvm.org/D94985	2021-01-26 18:27:39 -06:00
Jon Chesterfield	653655040f	[libomptarget][cuda] Handle missing _v2 symbols gracefully Follow on from D95367. Dlsym the _v2 symbols if present, otherwise use the unsuffixed version. Builds a hashtable for the check, can revise for zero heap allocations later if necessary. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95415	2021-01-27 00:22:29 +00:00
Vyacheslav Zakharin	3caa2d3354	[libomptarget][NFC] Avoid gcc 5/6 issue with lambda captures. Differential Revision: https://reviews.llvm.org/D95486	2021-01-26 16:06:58 -08:00
Vyacheslav Zakharin	5f1d4d4779	[libomptarget][NFC] Use portable printf format specifiers. Differential Revision: https://reviews.llvm.org/D95476	2021-01-26 13:56:25 -08:00
Atmn Patel	810572cc96	[OpenMP][Libomptarget] Fix cmake error on remote plugin Requiring 3.15 causes a build breakage, I'm sure none of the contents actually require 3.15 or above. Differential Revision: https://reviews.llvm.org/D95474	2021-01-26 16:00:40 -05:00
Jon Chesterfield	7baff00eee	[libomptarget][cuda] Gracefully handle missing cuda library If using dynamic cuda, and it failed to load, it is not safe to call cuGetErrorString. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95412	2021-01-26 20:43:07 +00:00
Jon Chesterfield	fdeffd6fb0	[libomptarget][cuda] Only run tests when sure there is cuda available Prior to D95155, building the cuda plugin implied cuda was installed locally. With that change, every machine can build a cuda plugin, but they won't all have cuda and/or an nvptx card installed locally. This change enables the nvptx tests when either: - libcuda is present - the user has forced use of the dlopen stub The default case when there is no cuda detected will no longer attempt to run the tests on nvptx hardware, as was the case before D95155. Reviewed By: jdoerfert, ronlieb Differential Revision: https://reviews.llvm.org/D95467	2021-01-26 20:41:06 +00:00
Atmn Patel	ec8f4a38c8	[OpenMP][Libomptarget] Introduce Remote Offloading Plugin This introduces a remote offloading plugin for libomptarget. This implementation relies on gRPC and protobuf, so this library will only build if both libraries are available on the system. The corresponding server is compiled to `openmp-offloading-server`. This is a large change, but the only way to split this up is into RTL/server but I fear that could introduce an inconsistency amongst them. Ideally, tests for this should be added to the current ones that but that is problematic for at least one reason. Given that libomptarget registers plugin on a first-come-first-serve basis, if we wanted to offload onto a local x86 through a different process, then we'd have to either re-order the plugin list in `rtl.cpp` (which is what I did locally for testing) or find a better solution for runtime plugin registration in libomptarget. Differential Revision: https://reviews.llvm.org/D95314	2021-01-26 15:33:38 -05:00
Atmn	683719bc0c	[OpenMP][Libomptarget] Introduce changes to support remote plugin In order to support remote execution, we need to be able to send the target binary description to the remote host for registration (and consequent deregistration). To support this, I added these two optional new functions to the plugin API: - `__tgt_rtl_register_lib` - `__tgt_rtl_unregister_lib` These functions will be called to properly manage the instance of libomptarget running on the remote host. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D93293	2021-01-26 14:19:27 -05:00
Jon Chesterfield	32cc5564e2	[libomptarget][devicertl][amdgpu] Fix build, variable renaming error	2021-01-26 19:05:21 +00:00
Shilei Tian	7c03f7d7d0	[OpenMP][deviceRTLs] Build the deviceRTLs with OpenMP instead of target dependent language From this patch (plus some landed patches), `deviceRTLs` is taken as a regular OpenMP program with just `declare target` regions. In this way, ideally, `deviceRTLs` can be written in OpenMP directly. No CUDA, no HIP anymore. (Well, AMD is still working on getting it to work. For now AMDGCN still uses the original way to compile.) However, some target specific functions are still required, but they're no longer written in a target specific language. For example, the CUDA parts have all been refined by replacing CUDA intrinsics and builtins with LLVM/Clang/NVVM intrinsics. Here is a list of changes in this patch. 1. For NVPTX, `DEVICE` is defined empty in order to make the common parts still work with AMDGCN. Later once AMDGCN is also available, we will completely remove `DEVICE` or probably some other macros. 2. Shared variable is implemented with OpenMP allocator, which is defined in `allocator.h`. Again, this feature is not available on AMDGCN, so two macros are redefined properly. 3. CUDA header `cuda.h` is dropped in the source code. In order to deal with code difference in various CUDA versions, we build one bitcode library for each supported CUDA version. For each CUDA version, the highest PTX version it supports will be used, just as what we currently use for CUDA compilation. 4. Correspondingly, compiler driver is also updated to support CUDA version encoded in the name of bitcode library. Now the bitcode library for NVPTX is named as `libomptarget-nvptx-cuda_[cuda_version]-sm_[sm_number].bc`, such as `libomptarget-nvptx-cuda_80-sm_20.bc`. With this change, there are also multiple features to be expected in the near future: 1. CUDA will be completely dropped when compiling OpenMP. By that time, we will also build bitcode libraries for all supported SM, multiplied by all supported CUDA versions. 2. Atomic operations used in `deviceRTLs` can be replaced by `omp atomic` if OpenMP 5.1 feature is fully supported. For now, the IR generated is totally wrong. 3. Target specific parts will be wrapped into `declare variant` with `isa` selector if it can work properly. No target specific macro is needed anymore. 4. (Maybe more...) Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94745	2021-01-26 12:28:47 -05:00
George Rokos	94cf89d1c2	[libomptarget][NFC] Fixed obsolete function names in comments	2021-01-26 07:39:42 -08:00
Alexey Bataev	4a63e53373	[LIBOMPTARGET]FIX define declaration, NFC Fixed declaration of define by adding a comma symbol. Required to fix build without profiling.	2021-01-26 07:43:31 -05:00
Johannes Doerfert	8c7fdc4c61	[OpenMP] Add source location information to the libomptarget profile In much of the libomptarget interface we have an ident_t object now, if it is not null we can use it to improve the profile output. For now, we simply use the ident_t "source information string" as generated by the FE. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D95282	2021-01-25 22:43:43 -06:00
Shilei Tian	9d64275ae0	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is as if the initial thread encounters a parallel region. In the wrapped function for this team, the main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. The other worker threads just do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be awakened from time to time to do some work. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-25 22:16:17 -05:00
Jon Chesterfield	357eea6e8b	Revert "[libomptarget][cuda] Gracefully handle missing cuda library" This reverts commit `fafd45c01f`.	2021-01-26 03:14:53 +00:00
Jon Chesterfield	fafd45c01f	[libomptarget][cuda] Gracefully handle missing cuda library If using dynamic cuda, and it failed to load, it is not safe to call cuGetErrorString. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95412	2021-01-26 02:54:00 +00:00
Shilei Tian	3333244d77	[OpenMP][deviceRTLs] Remove omp_is_initial_device `omp_is_initial_device` in device code was implemented as a builtin function in D38968 for a better performance. Therefore there is no chance that this function will be called to `deviceRTLs`. As we're moving to build `deviceRTLs` with OpenMP compiler, this function can lead to a compilation error. This patch just simply removes it. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95397	2021-01-25 18:34:23 -05:00
Shilei Tian	27cc4a8138	[OpenMP][NVPTX] Rewrite CUDA intrinsics with NVVM intrinsics This patch makes prep for dropping CUDA when compiling `deviceRTLs`. CUDA intrinsics are replaced by NVVM intrinsics which refers to code in `__clang_cuda_intrinsics.h`. We don't want to directly include it because in the near future we're going to switch to OpenMP and by then the header cannot be used anymore. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95327	2021-01-25 14:14:30 -05:00
Joseph Huber	93eef7d8e9	[OpenMP][NFC] Fix SourceInfo.h variable names Summary: Fix the names to use Pascal case to comply with the LLVM coding guidelines. `ident_t` is required for compatibility with the rest of libomp.	2021-01-25 12:43:34 -05:00
Jon Chesterfield	95f0d1edaf	[libomptarget] Compile with older cuda, revert D95274 Fixes regression reported in comments of D95274. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95367	2021-01-25 16:12:56 +00:00
Jon Chesterfield	e5e448aafa	[libomptarget][cuda] Fix build, change missed from D95274	2021-01-24 18:30:04 +00:00
Shilei Tian	cfd978d5d3	[OpenMP] Fixed test environment of `check-libomptarget-nvptx` D95161 removed the option `--libomptarget-nvptx-path`, which is used in the tests for `libomptarget-nvptx`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95293	2021-01-24 13:18:33 -05:00
Jon Chesterfield	c3074d48d3	[libomptarget][nvptx] Replace cuda atomic primitives with clang intrinsics Tested by diff of IR generated for target_impl.cu before and after. NFC. Part of removing deviceRTL build time dependency on cuda SDK. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D95294	2021-01-24 10:59:15 +00:00
Jon Chesterfield	dc70c56be5	[libomptarget][amdgpu][nfc] Update comments Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95295	2021-01-23 22:53:58 +00:00
Jon Chesterfield	78b0630b72	[libomptarget][cuda] Call v2 functions explicitly rtl.cpp calls functions like cuMemFree that are replaced by a macro in cuda.h with cuMemFree_v2. This patch changes the source to use the v2 names consistently. See also D95104, D95155 for the idea. Alternatives are to use a mixture, e.g. call the macro names and explicitly dlopen the _v2 names, or to keep the current status where the symbols are replaced by macros in both files. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95274	2021-01-23 20:33:13 +00:00
Hansang Bae	480cbed31e	[OpenMP] Remove unnecessary pointer checks in a few locations Also, return NULL from unsuccessful OMPT function lookup. Differential Revision: https://reviews.llvm.org/D95277	2021-01-22 19:18:50 -06:00
Jon Chesterfield	47e95e87a3	[libomptarget] Build cuda plugin without cuda installed locally Compiles a new file, `plugins/cuda/dynamic_cuda/cuda.cpp`, to an object file that exposes the same symbols that the plugin presently uses from libcuda. The object file contains dlopen of libcuda and cached dlsym calls. Also provides a cuda.h containing the subset that is used. This lets the cmake file choose between the system cuda and a dlopen shim, with no changes to rtl.cpp. The corresponding change to amdgpu is postponed until after a refactor of the plugin to reduce the size of the hsa.h stub required Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95155	2021-01-23 00:15:04 +00:00
Joseph Schuchart	edbcc17b7a	[OpenMP] libomp: properly initialize buckets in __kmp_dephash_extend The buckets are initialized in __kmp_dephash_create but when they are extended the memory is allocated but not NULL'd, potentially leaving some buckets uninitialized after all entries have been copied into the new allocation. This commit makes sure the buckets are properly initialized with NULL before copying the entries. Differential Revision: https://reviews.llvm.org/D95167	2021-01-22 20:29:46 +03:00
Jon Chesterfield	9b19ecb8f1	[libomptarget][devicertl] Drop templated atomic functions The five __kmpc_atomic templates are instantiated a total of seven times. This change replaces the template with explicitly typed functions, which have the same prototype for amdgcn and nvptx, and implements them with the same code presently in use. Rolls in the accepted but not yet landed D95085. The unsigned long long type can be replaced with uint64_t when replacing the cuda function. Until then, clang warns on casting a pointer to one to a pointer to the other. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D95093	2021-01-22 14:48:22 +00:00
Joseph Huber	119a9ea13f	[OpenMP] Fix failing test due to change in offloading flags Summary: Prior to D91261 the information checked the OMP_MAP_TARGET_PARAM flag, change this as it has been removed. The INFO macro was changed to accept a flag as input to make conditionally printing information easier. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95133	2021-01-21 14:09:36 -05:00
Giorgis Georgakoudis	6b7645dd31	[OpenMP] Add time profiling support in libomp Profiling has been recently implemented in libomptarget (D93055). This patch enables time profiling support for libomptarget in libomp, to support profiling of multi-threaded execution of offloaded regions. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94855	2021-01-21 09:15:14 -08:00
Shilei Tian	48c54f0f62	[OpenMP][NVPTX] Added forward declaration for atomic operations Pretty similar to D95058, this patch added forward declaration for CUDA atomic functions. We already have definitions with right mangled names in internal CUDA headers so the forward declaration here can work properly. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D95085	2021-01-21 10:37:16 -05:00
Joseph Huber	e4eaf9d820	[OpenMP] Add support for mapping names in mapper API Summary: The custom mapper API did not previously support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D94806	2021-01-21 09:26:44 -05:00
Shilei Tian	33a5d212c6	[OpenMP][NVPTX] Added forward declaration to pave the way for building deviceRTLs with OpenMP Once we switch to build deviceRTLs with OpenMP, primitives and CUDA intrinsics cannot be used directly anymore because `__device__` is not recognized by the OpenMP compiler. To avoid involving all CUDA internal headers we had in `clang`, we forward declared these functions. Eventually they will be transformed into the right LLVM intrinsics. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95058	2021-01-20 15:56:02 -05:00
Jon Chesterfield	fbc1dcb946	[libomptarget][devicertl][nfc] Simplify target_atomic abstraction Atomic functions were implemented as a shim around cuda's atomics, with amdgcn implementing those symbols as a shim around gcc style intrinsics. This patch folds target_atomic.h into target_impl.h and folds amdgcn. Further work is likely to be useful here, either changing to openmp's atomic interface or instantiating the templates on the few used types in order to move them into a cuda/c++ implementation file. This change is mostly to group the remaining uses of the cuda api under nvptx' target_impl abstraction. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95062	2021-01-20 19:50:50 +00:00
Jon Chesterfield	ea616f9026	[libomptarget][devicertl][nfc] Remove some cuda intrinsics, simplify Replace __popc, __ffs with clang intrinsics. Move kmpc_impl_min to only file that uses it and replace template with explicitly typed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95060	2021-01-20 19:45:05 +00:00
Shilei Tian	fd70f70d1e	[OpenMP][NVPTX] Replaced CUDA builtin vars with LLVM intrinsics Replaced CUDA builtin vars with LLVM intrinsics such that we don't need definitions of those intrinsics. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D95013	2021-01-20 12:02:06 -05:00
Jon Chesterfield	e069662deb	[libomptarget][devicertl] Wrap source in declare target pragmas Factored out of D93135 / D94745. C++ and cuda ignore unknown pragmas so this is a NFC for the current implementation language. Removes noise from patches for building deviceRTL as openmp. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D95048	2021-01-20 15:50:41 +00:00
Hansang Bae	2d911f7c72	[OpenMP] Fix atomic entries for captured logical operation Added missing code for the captured atomic operation. Differential Revision: https://reviews.llvm.org/D94848	2021-01-19 09:59:28 -06:00
AndreyChurbanov	a60bc55c69	[OpenMP] libomp: cleanup parsing of OMP_ALLOCATOR env variable. Differential Revision: https://reviews.llvm.org/D94932	2021-01-19 16:21:22 +03:00
Kelvin Li	9d81073acb	[OpenMP][Docs] Fix typos in FAQ (NFC)	2021-01-18 18:55:58 -05:00
AndreyChurbanov	aa3a59e0c6	[OpenMP][NFC] Fix test The test fails if memkind library is accessible.	2021-01-19 00:05:34 +03:00
Shilei Tian	9bf843bdc8	Revert "[OpenMP] Added the support for hidden helper task in RTL" This reverts commit `ed939f853d`.	2021-01-18 06:57:52 -05:00
Chandler Carruth	f855751c12	Fix openmp CMake build on non-Linux AArch64 systems. This just checks for `/proc/cpuinfo` existing before reading it. Tested on an ARM macOS machine.	2021-01-17 16:18:31 -08:00
Shilei Tian	ed939f853d	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be awakened from time to time to do some work. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-16 14:13:35 -05:00
Jon Chesterfield	214387c2c6	[libomptarget][nvptx] Reduce calls to cuda header Remove use of clock_t in favour of a builtin. Drop a preprocessor branch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94731	2021-01-15 02:16:33 +00:00
Jon Chesterfield	6e7094c14b	[libomptarget][nvptx][nfc] Move target_impl functions out of header This removes most of the differences between the two target_impl.h. Also change name mangling from C to C++ for __kmpc_impl_*_lock. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D94728	2021-01-15 00:19:48 +00:00
Shilei Tian	547b032ccc	[OpenMP] Remove omptarget-nvptx from deps as it is no longer a valid target `omptarget-nvptx` is still a dependence for `check-libomptarget-nvtpx` although it has been removed by D94573. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94725	2021-01-14 19:16:11 -05:00
Shilei Tian	64e9e9aeee	[OpenMP] Dropped unnecessary define when compiling deviceRTLs for NVPTX The comment said CUDA 9 header files use the `nv_weak` attribute which `clang` is not yet prepared to handle. That was three years ago, and things have changed since. Based on my test, removing the definition causes no problems on my machine with CUDA 11.1 installed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94700	2021-01-14 13:55:12 -05:00
Shilei Tian	763c1f9933	[OpenMP] Drop the static library libomptarget-nvptx For NVPTX target, OpenMP provides a static library `libomptarget-nvptx` built by NVCC, and another bitcode `libomptarget-nvptx-sm_{$sm}.bc` generated by Clang. When compiling an OpenMP program, the `.bc` file will be fed to `clang` in the second run on the program that compiles the target part. Then the generated PTX file will be fed to `ptxas` to generate the object file, and finally the driver invokes `nvlink` to generate the binary, where the static library will be appended to `nvlink`. One question is, why do we need two libraries? The only difference is, the static library contains `omp_data.cu` and the bitcode library doesn't. It's unclear why they were implemented in this way, but per D94565, there is no issue if we also include the file into the bitcode library. Therefore, we can safely drop the static library. This patch is about the change in OpenMP. The driver will be updated as well if this patch is accepted. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D94573	2021-01-14 13:34:25 -05:00
Jon Chesterfield	5d165f0b89	[libomptarget][amdgpu] Fix kernel launch tracing to match previous behavior Restore control of kernel launch tracing to be >= 1 as it was before export LIBOMPTARGET_KERNEL_TRACE=1 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94695	2021-01-14 18:13:22 +00:00
Terry Wilmarth	4fe17ada55	[OpenMP] Fix hierarchical barrier Hierarchical barrier is an experimental barrier algorithm that uses aspects of machine hierarchy to define the barrier tree structure. This patch fixes offset calculation in hierarchical barrier. The offset is used to store info on a flag about sleeping threads waiting on a location stored in the flag. This commit also fixes a potential deadlock in hierarchical barrier when using infinite blocktime by adjusting the offset value of leaf kids so that it matches the value of leaf state. It also adds testing of default barriers with infinite blocktime, and also tests hierarchical barrier algorithm with both default and infinite blocktime. Patch by Terry Wilmarth and Nawrin Sultana. Differential Revision: https://reviews.llvm.org/D94241	2021-01-13 10:22:57 -06:00
Joseph Huber	a957634942	[OpenMP] Add documentation for error messages and release notes Add extra information to the runtime page describing the error messages and add information to the release notes for clang 12.0 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94562	2021-01-13 11:00:41 -05:00
Jon Chesterfield	84e0b14a0a	[libomptarget][nvptx] Include omp_data.cu in bitcode deviceRTL Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D94565	2021-01-13 03:51:11 +00:00
Hansang Bae	bba3a82b56	[OpenMP] Use persistent memory for omp_large_cap_mem This change enables volatile use of persistent memory for omp_large_cap_mem* on supported systems. It depends on libmemkind's support for persistent memory, and requirements/details can be found at the following url. https://pmem.io/2020/01/20/memkind-dax-kmem.html Differential Revision: https://reviews.llvm.org/D94353	2021-01-12 20:35:27 -06:00
Hansang Bae	6f0f022038	[OpenMP] Update allocator trait key/value definitions Use new definitions introduced in 5.1 specification. Differential Revision: https://reviews.llvm.org/D94277	2021-01-12 20:09:45 -06:00
Shilei Tian	01f1273fe2	[OpenMP] Fixed a typo in openmp/CMakeLists.txt	2021-01-12 17:00:49 -05:00
Shilei Tian	68ff52ffea	[OpenMP] Fixed the link error that cannot find static data member Constant static data member can be defined in the class without another define after the class in C++17. Although it is C++17, Clang can still handle it even w/o the flag for C++17. Unluckily, GCC cannot handle that. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D94541	2021-01-12 16:48:28 -05:00
Jon Chesterfield	33e2494bea	[libomptarget][amdgpu][nfc] Fix build on centos rtl.cpp replaced 224 with a #define from elf.h, but that doesn't work on a centos 7 build machine with an old elf.h Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D94528	2021-01-12 19:40:03 +00:00
Shilei Tian	bdd1ad5e5c	[OpenMP] Fixed include directories for OpenMP when building OpenMP with LLVM_ENABLE_RUNTIMES Some LLVM headers are generated by CMake. Before the installation, LLVM's headers are distributed everywhere, some of which are in `${LLVM_SRC_ROOT}/llvm/include/llvm`, and some are in `${LLVM_BINARY_ROOT}/include/llvm`. After installation, they're all in `${LLVM_INSTALLATION_ROOT}/include/llvm`. OpenMP now depends on LLVM headers. Some headers depend on headers generated by CMake. When building OpenMP along with LLVM, a.k.a via `LLVM_ENABLE_RUNTIMES`, we need to tell OpenMP where it can find those headers, especially those that have not yet been copied/installed. Reviewed By: jdoerfert, jhuber6 Differential Revision: https://reviews.llvm.org/D94534	2021-01-12 14:32:38 -05:00
Shilei Tian	0871d6d516	[OpenMP] Move memory manager to plugin and make it a common interface The lifetime of `libomptarget` and its opened plugins are not aligned and it's hard for `libomptarget` to determine when the plugins are destroyed. As a result, some issues (see D94256 for details) occur on some platforms. Actually, if we take target memory as target resources, same as other resources, such as CUDA streams, in each plugin, then the memory manager should also be in the plugin. Also considering some platforms may want to opt out of the feature, it makes sense to move the memory manager to plugin, make it a common interface, and let plugin developers determine whether they need it. This is what this patch does. CUDA plugin is taken as example to show how to integrate it. In this way, we can also get a bonus that different thresholds can be set for different platforms. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D94379	2021-01-11 21:33:42 -05:00
Shilei Tian	a81c68ae6b	[OpenMP] Take elf_common.c as an interface library For now `elf_common.c` is taken as a common part included into different plugin implementations directly via `#include "../../common/elf_common.c"`, which is not a best practice. Since it is simple enough such that we don't need to create a real library for it, we just take it as an interface library so that other targets can link it directly. Another advantage of this method is, we don't need to add the folder into header search path which can potentially pollute the search path. VE and AMD platforms have not been tested because I don't have target machines. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94443	2021-01-11 17:34:26 -05:00
Shilei Tian	7be3285248	[OpenMP] Not set OPENMP_STANDALONE_BUILD=ON when building OpenMP along with LLVM For now, `*_STANDALONE_BUILD` is set to ON even if they're built along with LLVM because of issues mentioned in the comments. This can cause some issues. For example, if we build OpenMP along with LLVM, we'd like to copy those OpenMP headers to `<prefix>/lib/clang/<version>/include` such that `clang` can find those headers without using `-I <prefix>/include` because those headers will be copied to `<prefix>/include` if it is built standalone. In this patch, we fixed the dependence issue in OpenMP such that it can be built correctly even with `OPENMP_STANDALONE_BUILD=OFF`. The issue is in the call to `add_lit_testsuite`, where `clang` and `clang-resource-headers` are passed as `DEPENDS`. Since we're building OpenMP along with LLVM, `clang` is set by CMake to be the C/C++ compiler, therefore these two dependences, which caused the dependence issue, are no longer needed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93738	2021-01-10 16:46:19 -05:00
Shilei Tian	175c336a1c	[OpenMP] Remove copy constructor of `RTLInfoTy` Multiple `RTLInfoTy` objects are stored in a list `AllRTLs`. Since `RTLInfoTy` contains a `std::mutex`, it is by default not a copyable object. In order to support `AllRTLs.push_back(...)` which is currently used, a customized copy constructor is provided. Every time we need to add a new data member into `RTLInfoTy`, we should keep in mind not forgetting to add corresponding assignment in the copy constructor. In fact, the only use of the copy constructor is to push the object into the list, we can of course write it in a way that first emplace a new object back, and then use the reference to the last element. In this way we don't need the copy constructor anymore. If the element is invalid, we just need to pop it, and that's what this patch does. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94361	2021-01-09 13:01:01 -05:00
Shilei Tian	676c7cb0c0	[OpenMP] Added the support for cache line size 256 for A64FX Fugaku supercomputer is built with the Fujitsu A64FX microprocessor, whose cache line is 256. In current libomp, we only have cache line size 128 for PPC64 and otherwise 64. This patch added the support of cache line 256 for A64FX. It's worth noting that although A64FX is a variant of AArch64, this property is not shared. As a result, in light of UCX source code (`392443ab92/src/ucs/arch/aarch64/cpu.c (L17)`), we can only determine by checking whether the CPU is FUJITSU A64FX. Reviewed By: jdoerfert, Hahnfeld Differential Revision: https://reviews.llvm.org/D93169	2021-01-09 11:58:47 -05:00
Joseph Huber	2ce16810f2	[OpenMP] Always print error messages in libomptarget CUDA plugin Summary: Currently error messages from the CUDA plugins are only printed to the user if they have debugging enabled. Change this behaviour to always print the messages that result in offloading failure. This improves the error messages by indicating what happened when the error occurs in the plugin library, such as a segmentation fault on the device. Reviewed by: jdoerfert Differential Revision: https://reviews.llvm.org/D94263	2021-01-07 17:47:32 -05:00
Johannes Doerfert	9ae171bcd3	[OpenMP][Docs] Add remarks intro section Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D93735	2021-01-07 14:31:17 -06:00
Joseph Huber	abb174bbc1	[OpenMP] Add example in Libomptarget Information docs Add an example to the OpenMP Documentation on the LIBOMPTARGET_INFO environment variable Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94246	2021-01-07 15:00:51 -05:00
Hansang Bae	fb1c528526	[OpenMP] Use c_int/c_size_t in Fortran target memory routine interface The Fortran interface is now in line with 5.1 specification. Differential Revision: https://reviews.llvm.org/D94042	2021-01-06 16:28:30 -06:00
Shilei Tian	5acdae1f9a	[OpenMP] Fixed an issue that wrong LLVM headers might be included when building libomptarget Wrong LLVM headers might be included if we don't set `include_directories` to a right place. This will cause a compilation error if LLVM is installed in system directories. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93737	2021-01-06 17:07:36 -05:00
Shilei Tian	e2a623094f	[OpenMP] Fixed the test environment when building along with LLVM Currently all built libraries in OpenMP are anywhere if building along with LLVM. It is not an issue if we don't execute any test. However, almost all tests for `libomptarget` fail because in the lit configuration, we only set `<build_dir>/libomptarget` to `LD_LIBRARY_PATH` and `LIBRARY_PATH`. Since those libraries are everywhere, `clang` can no longer find `libomptarget.so` or those deviceRTLs. In this patch, we set a unified path for all built libraries, no matter whether it is built along with LLVM or not. In this way, our lit configuration can work properly. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93736	2021-01-06 17:06:16 -05:00
George Rokos	dec02904d2	[libomptarget] Allow calls to omp_target_memcpy with 0 size. Differential Revision: https://reviews.llvm.org/D94095	2021-01-05 16:03:53 -08:00
Joseph Huber	fe5d51a489	[OpenMP] Add using bit flags to select Libomptarget Information Summary: This patch adds more fine-grained support over which information is output from the libomptarget runtime when run with the environment variable LIBOMPTARGET_INFO set. An extensible set of flags can be used to pick and choose which information the user is interested in. Reviewers: jdoerfert JonChesterfield grokos Differential Revision: https://reviews.llvm.org/D93727	2021-01-04 12:03:15 -05:00
Jon Chesterfield	76bfbb74d3	[libomptarget][amdgpu] Call into deviceRTL instead of ockl Amdgpu codegen presently emits a call into ockl. The same functionality is already present in the deviceRTL. Adds an amdgpu specific entry point to avoid the dependency. This lets simple openmp code (specifically, that which doesn't use libm) run without rocm device libraries installed. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D93356	2021-01-04 16:48:47 +00:00
Hansang Bae	82a29a62ab	[OpenMP] Add definition/interface for target memory routines The change includes new routines introduced in 5.1 and Fortran interface. Differential Revision: https://reviews.llvm.org/D93505	2021-01-04 08:12:57 -06:00
Terry Wilmarth	6b316febb4	[OpenMP] libomp: Handle implicit conversion warnings This patch partially prepares the runtime source code to be built with -Wconversion, which should trigger warnings if any implicit conversions can possibly change a value. For builds done with icc or gcc, all such warnings are handled in this patch. clang gives a much longer list of warnings, particularly for sign conversions, which the other compilers don't report. The -Wconversion flag is commented into cmake files, but I'm not going to turn it on. If someone thinks it is important, and wants to fix all the clang warnings, they are welcome to. Types of changes made here involve either improving the consistency of types used so that no conversion is needed, or else performing careful explicit conversions, when we're sure a problem won't arise. Patch is a combination of changes by Terry Wilmarth and Johnny Peyton. Differential Revision: https://reviews.llvm.org/D92942	2020-12-31 00:39:57 +03:00
Joseph Huber	631501b1f9	[OpenMP] Fixing typo on memory size in Documentation	2020-12-23 11:46:26 -05:00
Joseph Huber	6e60346495	[OpenMP] Fixing Typo in Documentation	2020-12-23 09:17:51 -05:00
Joseph Huber	1c19804ebf	[OpenMP] Add OpenMP Documentation for Libomptarget environment variables Add support to the OpenMP web pages for environment variables supported by Libomptarget and their usage. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93723	2020-12-22 17:41:27 -05:00
Johannes Doerfert	7b0f9dd79a	[OpenMP][Docs] Fix Typo	2020-12-22 13:06:23 -06:00
Shilei Tian	1eb082c2ea	[OpenMP][Docs] Fixed a typo in the doc that can mislead users to a CMake error When setting `LLVM_ENABLE_RUNTIMES`, lower case word should be used; otherwise, it can cause a CMake error that specific path is not found. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D93719	2020-12-22 14:05:58 -05:00
Johannes Doerfert	9cb748724e	[OpenMP][Docs] Add FAQ entry about math and complex on GPUs Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D93718	2020-12-22 13:05:04 -06:00
Shilei Tian	612ddc3117	[OpenMP][Docs] Updated the faq about building an OpenMP offloading capable compiler After some issues about building runtimes along with LLVM were fixed, building an OpenMP offloading capable compiler is pretty simple. This patch updates the FAQ part in the doc. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93671	2020-12-22 13:14:53 -05:00
Johannes Doerfert	994bb6eb7d	[OpenMP][NFC] Provide a new remark and documentation If a GPU function is externally reachable we give up trying to find the (unique) kernel it is called from. This can hinder optimizations. Emit a remark and explain mitigation strategies. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D93439	2020-12-17 14:38:26 -06:00
Hansang Bae	e1fd202489	[OpenMP] Add definitions for 5.1 interop to omp.h	2020-12-17 13:03:59 -06:00
Atmn	907886cc5b	[OpenMP][Libomptarget][NFC] Use CMake Variables This patch adds CMake variables to add subdirectories and include directories for libomptarget and explicitly gives the location of source files. Differential Revision: https://reviews.llvm.org/D93290	2020-12-16 19:05:15 -05:00
Jon Chesterfield	b607837c75	[libomptarget][nfc] Replace static const with enum Semantically identical. Replaces 0xff... with ~0 to spare counting the f. Has the advantage that the compiler doesn't need to prove the 4/8 byte value dead before discarding it, and sidesteps the compilation question associated with what static means for a single source language. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93328	2020-12-16 16:40:37 +00:00
Peyton, Jonathan L	5aafdd7b88	[OpenMP] Introduce new file wrapper class for runtime Introduce new kmp_safe_raii_file_t class with RAII semantics for file open/close. It is essentially a wrapper around the C-style FILE* object. This also unifies the way we error report if a file can't be opened. Differential Revision: https://reviews.llvm.org/D92604	2020-12-15 14:46:30 -06:00
Hansang Bae	171ca93c54	[OpenMP] Initialize runtime in the forked child process This patch enables serial initialization in the forked child process to fix unstable runtime behavior when used with Python-based AI tools. Differential Revision: https://reviews.llvm.org/D93230	2020-12-15 07:29:28 -06:00
Giorgis Georgakoudis	e007b32864	[OpenMP] Add time profiling for libomptarget Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93055	2020-12-11 18:53:37 -08:00
Jon Chesterfield	ce93de3bb2	[libomptarget][nfc] Remove data_sharing type aliasing Libomptarget previously used __kmpc_data_sharing_slot to access values of type __kmpc_data_sharing_{worker,master}_slot_static. This aliasing violation was benign in practice. The master type has since been removed, so a single type can be used instead. This is particularly helpful for the transition to an openmp deviceRTL, as the c++/openmp compiler for amdgcn currently rejects the flexible array member for being an incomplete type. Serves the same purpose as abandoned D86324. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93075	2020-12-11 02:13:34 +00:00
Hansang Bae	c3b5009aa7	[OpenMP] Use RTM lock for OMP lock with synchronization hint This patch introduces a new RTM lock type based on spin lock which is used for OMP lock with speculative hint on supported architecture. Differential Revision: https://reviews.llvm.org/D92615	2020-12-09 19:14:53 -06:00
Nawrin Sultana	540007b427	[OpenMP] Add strict mode in num_tasks and grainsize This patch adds a new API, __kmpc_taskloop_5, to accommodate the strict modifier (introduced in OpenMP 5.1) in the num_tasks and grainsize clauses. Differential Revision: https://reviews.llvm.org/D92352	2020-12-09 16:46:30 -06:00
Peyton, Jonathan L	fe3b244ef7	[OpenMP] Fix norespect affinity bug for Windows KMP_AFFINITY=norespect was triggering an error because the underlying process affinity mask was not updated to include the entire machine. The Windows documentation states that the thread affinities must be subsets of the process affinity. This patch also moves the printing (for KMP_AFFINITY=verbose) of whether the initial mask was respected out of each topology detection function and to one location where the initial affinity mask is read. Differential Revision: https://reviews.llvm.org/D92587	2020-12-09 14:32:48 -06:00
Peyton, Jonathan L	9b7d6a6bff	[OpenMP] Fix too long name for shm segment on macOS Remove the user id component to the shm segment name and just use the pid like before. Differential Revision: https://reviews.llvm.org/D92660	2020-12-09 14:31:15 -06:00
Jon Chesterfield	7c59614394	[libomptarget][amdgpu] clang-format src/rtl.cpp	2020-12-09 19:45:51 +00:00
Jon Chesterfield	c9bc414840	[libomptarget][amdgpu] Let default number of teams equal number of CUs	2020-12-09 19:35:34 +00:00
Jon Chesterfield	e191d31159	[libomptarget][amdgpu] Robust handling of device_environment symbol	2020-12-09 19:21:51 +00:00
Jon Chesterfield	cab9f69235	[libomptarget][amdgpu] Improve diagnostics on arch mismatch	2020-12-09 18:55:53 +00:00
Giorgis Georgakoudis	18dff28958	[OpenMP] Add doxygen generation for the runtime Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D92779	2020-12-08 16:20:45 -08:00
AndreyChurbanov	fff1abc406	[OpenMP] NFC: comment adjusted	2020-12-07 19:50:14 +03:00
AndreyChurbanov	22558c8501	[OpenMP] libomp: Fix possible NULL dereferences Check pointer returned by strchr, as it can be NULL in case of broken format of input string. Introduced new function __kmp_str_loc_numbers for fast parsing of numbers only in the location string. Also made some cleanup of __kmp_str_loc_init declaration and usage: - changed type of init_fname parameter to bool; - changed input from true to false in places where fname is not used. Differential Revision: https://reviews.llvm.org/D90962	2020-12-07 19:09:07 +03:00
Jon Chesterfield	71f4693020	[libomptarget][amdgpu] Add plumbing to call into hostrpc lib, if linked	2020-12-07 15:24:01 +00:00
Jon Chesterfield	e1b8e8a1f4	[libomptarget][amdgpu] Skip device_State allocation when using bss global	2020-12-06 12:13:56 +00:00
Joachim Protze	a148216b31	[OpenMP][OMPT] Fix OMPT return address guard for gomp interface D91692 missed various locations in kmp_gsupport, where the scope for OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT callback is called in some nested function. This patch fixes the scoping issue, so that all OMPT tests pass, when the tests are built with gcc. Differential Revision: https://reviews.llvm.org/D92121	2020-12-05 19:06:28 +01:00
Joachim Protze	d3ec512b1d	[OpenMP][OMPT] Make sure that 0 is never used as ID in tests (NFC)	2020-12-04 18:41:56 +01:00
Jon Chesterfield	f628eef98a	[libomptarget][amdgpu] Fix latent race in load binary	2020-12-04 16:29:09 +00:00
Hansang Bae	c4a22224d9	[OpenMP] Add __kmpc_omp_target_task_alloc to dllexport This patch enables use of the entry on Windows. Differential Revision: https://reviews.llvm.org/D92618	2020-12-04 08:11:14 -06:00
Jon Chesterfield	ae9d96a656	[libomptarget][amdgpu] Address compiler warnings, drive by fixes Initialize some variables, remove unused ones. Changes the debug printing condition to align with the aomp test suite. Differential Revision: https://reviews.llvm.org/D92559	2020-12-03 11:09:12 +00:00
Pushpinder Singh	afc09c6fe4	[libomptarget][AMDGPU] Remove MaxParallelLevel Removes MaxParallelLevel references from rtl.cpp and drops resulting dead code. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D92463	2020-12-03 00:27:03 -05:00
Terry Wilmarth	e0665a9050	[OpenMP] Add support for Intel's umonitor/umwait These changes add support for Intel's umonitor/umwait usage in wait code, for architectures that support those intrinsic functions. Usage of umonitor/umwait is off by default, but can be turned on by setting the KMP_USER_LEVEL_MWAIT environment variable. Differential Revision: https://reviews.llvm.org/D91189	2020-12-01 14:07:46 -06:00
AndreyChurbanov	6bf84871e9	[OpenMP] libomp: add UNLIKELY hints to rarely executed branches Added UNLIKELY hint to one-time or rarely executed branches. This improves performance of the library on some tasking benchmarks. Differential Revision: https://reviews.llvm.org/D92322	2020-12-01 16:53:21 +03:00
Joachim Protze	fd3d1b09c1	[OpenMP][Tests][NFC] Use FileCheck from cmake config	2020-11-30 23:16:56 +01:00
Todd Erdner	9615890db5	[OpenMP] libomp: change shm name to include UID, call unregister_lib on SIGTERM With the change to using shared memory, there were a few problems that need to be fixed. - The previous filename that was used for SHM only used process id. Given that process is usually based on 16bit number, this was causing some conflicts on machines. Thus we add UID to the name to prevent this. - It appears under some conditions (SIGTERM, etc) the shared memory files were not getting cleaned up. Added a call to clean up the shm files under those conditions. For this user needs to set envirable KMP_HANDLE_SIGNALS to true. Patch by Erdner, Todd <todd.erdner@intel.com> Differential Revision: https://reviews.llvm.org/D91869	2020-12-01 00:40:47 +03:00
AndreyChurbanov	f6f28b44ad	[OpenMP] libomp: fix mutexinoutset dependence for proxy tasks Since __kmp_task_finish is not executed for proxy tasks, move mutexinoutset dependency code to __kmp_release_deps which is executed for all task kinds. Differential Revision: https://reviews.llvm.org/D92326	2020-12-01 00:13:31 +03:00
Joachim Protze	723be4042a	[OpenMP][OMPT][NFC] Fix failing test The test would fail for gcc, when built with debug flag.	2020-11-29 19:07:42 +01:00
Joachim Protze	cdf9401df8	[OpenMP][OMPT][NFC] Fix flaky test The test had a chance to finish the first task before the second task is created. In this case, the dependences-pair event would not trigger.	2020-11-29 19:07:41 +01:00
Jon Chesterfield	89a0f48c58	[libomptarget][cuda] Detect missing symbols in plugin at build time Passes -z,defs to the linker. Error on unresolved symbol references. Otherwise, those unresolved symbols present as target code running on the host as the plugin fails to load. This is significantly harder to debug than a link time error. Flag matches that passed by amdgcn and ve plugins. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D92143	2020-11-27 15:39:41 +00:00
Martin Storsjö	6b429668de	[OpenMP][OMPT] Fix building with OMPT disabled after `6d3b81664a`	2020-11-26 10:09:32 +02:00
Johannes Doerfert	227c8ff189	[OpenMP][Docs] Add more content, call coordinates, FAQ entries, links	2020-11-25 11:52:35 -06:00
AndreyChurbanov	9e3e332d27	[OpenMP] libomp: fix non-X86, non-AARCH64 builds Commit https://reviews.llvm.org/rG7b5254223acbf2ef9cd278070c5a84ab278d7e5f broke the build for some architectures, because macro KMP_PREFIX_UNDERSCORE was defined only for x86, x86_64 and aarch64. This patch defines it for other architectures (as a no-op). Differential Revision: https://reviews.llvm.org/D92027	2020-11-25 20:40:23 +03:00
Joachim Protze	6d3b81664a	[OpenMP][OMPT] Introduce a guard to handle OMPT return address This is an alternative approach to address inconsistencies pointed out in: D90078 This patch makes sure that the return address is reset, when leaving the scope. In some cases, I had to move the macro out of an if-statement to have it in the right scope, in some cases I added an additional block to restrict the scope. This patch does not handle inconsistencies, which might occur if the return address is still set when we call into the application. Test case (repeated_calls.c) provided by @hbae Differential Revision: https://reviews.llvm.org/D91692	2020-11-25 18:17:44 +01:00
Isabel Thärigen	b281a05dac	[OpenMP][OMPT] Implement verbose tool loading OpenMP 5.1 introduces the new env variable OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose loading and initialization of OMPT tools. This env variable helps to understand the cause when loading of a tool fails (e.g., undefined symbols or dependency not in LD_LIBRARY_PATH). Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV. Tests for this patch are integrated into the different existing tool loading tests, making these tests more verbose. An Archer specific verbose test is integrated into an existing Archer test. Patch prepared by: Isabel Thärigen Differential Revision: https://reviews.llvm.org/D91464	2020-11-25 18:17:44 +01:00
AndreyChurbanov	7b5254223a	[OpenMP] fix asm code for arm64 (AARCH64) for Darwin/macOS Adjusted external reference for Darwin/AARCH64 link compatibility. Made size directive conditional only if __ELF__ defined. Patch by Michael_Pique <mpique@icloud.com> Differential Revision: https://reviews.llvm.org/D88252	2020-11-24 13:08:24 +03:00
AndreyChurbanov	5644f734d6	Revert "[OpenMP] Add support for Intel's umonitor/umwait" This reverts commit `9cfad5f9c5`.	2020-11-20 12:16:34 +03:00
AndreyChurbanov	9cfad5f9c5	[OpenMP] Add support for Intel's umonitor/umwait Patch by tlwilmar (Terry Wilmarth) Differential Revision: https://reviews.llvm.org/D91189	2020-11-19 22:04:21 +03:00
cchen	7036fe8a0c	[libomptarget] Add support for target update non-contiguous This patch is the runtime support for https://reviews.llvm.org/D84192. In order not to modify the tgt_target_data_update information but still be able to pass the extra information for non-contiguous map item (offset, count, and stride for each dimension), this patch overloads arg when the maptype is set as OMP_TGT_MAPTYPE_DESCRIPTOR. The original arg is for passing the pointer information, however, the overloaded arg is an array of descriptor_dim: ``` struct descriptor_dim { int64_t offset; int64_t count; int64_t stride }; ``` and the array size is the dimension size. In addition, since we have count and stride information in descriptor_dim, we can replace/overload the arg_size parameter by using dimension size. Reviewed By: grokos, tianshilei1992 Differential Revision: https://reviews.llvm.org/D82245	2020-11-19 11:33:27 -06:00
Joseph Huber	da8bec47ab	[OpenMP] Add Location Fields to Libomptarget Runtime for Debugging Summary: Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values. Reviewers: jdoerfert grokos Differential Revision: https://reviews.llvm.org/D87946	2020-11-19 12:01:53 -05:00
Joseph Huber	5378c6a4bf	[OpenMP] Add Support for Mapping Names in Libomptarget RTL Summary: This patch adds basic support for printing the source location and names for the mapped variables. This patch does not support names for custom mappers. This is based on D89802. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D90172	2020-11-18 16:01:59 -05:00
Joseph Huber	97e55cfef5	[OpenMP] Add Passing in Original Declaration Names To Mapper API Summary: This patch adds support for passing in the original declaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information is passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;" Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D89802	2020-11-18 15:28:39 -05:00
Hansang Bae	44a11c342c	[OpenMP] Use explicit type casting in kmp_atomic.cpp Differential Revision: https://reviews.llvm.org/D91105	2020-11-17 14:31:13 -06:00
Nawrin Sultana	5439db05e7	[OpenMP] Add omp_realloc implementation This patch adds omp_realloc function implementation according to OpenMP 5.1 specification. Differential Revision: https://reviews.llvm.org/D90971	2020-11-17 13:43:00 -06:00
Peyton, Jonathan L	8647c669a4	[OpenMP] NFC: remove tabs in message catalog file	2020-11-17 10:15:04 -06:00
Peyton, Jonathan L	0454154efd	[OpenMP][stats] reset serial state when re-entering serial region Differential Revision: https://reviews.llvm.org/D90867	2020-11-17 10:09:56 -06:00
Joachim Protze	fdc9dfc8e4	[OpenMP][Tool] Add Archer option to disable data race analysis for sequential part This introduces the new `ARCHER_OPTIONS` flag `ignore_serial=0|1` to disable analysis and logging of memory accesses in the sequential part of the OpenMP application. In the sequential part of an OpenMP program no data race is possible, unless there is non-OpenMP concurrency (such as pthreads, MPI, ...). For the latter reason, this is not active by default. Besides reducing the runtime overhead for the sequential part of the program, this reduces the memory overhead for sequential initialization. In combination with `flush_shadow=1` this can allow analysis of applications, which run close to the limit of available memory, but only access smaller parts of shared memory during each OpenMP parallel region. A problem for this approach is that Archer only gets active, when the OpenMP runtime gets initialized, which might be after serial initialization of the application. In such case, it helps to call for example `omp_get_max_threads()` at the beginning of main. Differential Revision: https://reviews.llvm.org/D90473	2020-11-16 10:45:21 +01:00
Martin Storsjö	9bcef58b63	[OpenMP] Fix building for windows after adding omp_calloc Differential Revision: https://reviews.llvm.org/D91478	2020-11-15 21:32:38 +02:00
Nawrin Sultana	938f1b8581	[OpenMP] Add omp_calloc implementation This patch adds omp_calloc implementation according to OpenMP 5.1 specification. Differential Revision: https://reviews.llvm.org/D90967	2020-11-13 14:35:46 -06:00
Joachim Protze	96eaacc917	[OpenMP][Tool] Update archer to accept new OpenMP 5.1 enum values OpenMP 5.1 adds an extra enum entry for ompt_scope_t, which makes the related switch statement incomplete. Also adding cases for newly added barrier variants. Differential Revision: https://reviews.llvm.org/D90758	2020-11-13 16:09:05 +01:00
Shilei Tian	24d0ef0f50	[OpenMP] Fixed a bug when displaying affinity Currently the affinity format string has an initial value. When users set the format via OMP_AFFINITY_FORMAT, it will overwrite the format string. However, when copying the format, the trailing null is missing. As a result, if the user format string is shorter than the default value, the remaining part of the default value still takes effect. This bug is not exposed because the test case doesn't check the end of a string. It only checks whether given output "contains" the check string. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D91309	2020-11-12 22:27:32 -05:00
Joseph Huber	292e898c16	[OpenMP] Begin Adding OpenMP Tool to Gather OpenMP Information Summary: This patch begins to add support for a set of scripts that can be used to get information from OpenMP programs to better describe problems and eventually show the data to the user in formatted output. Right now the only support is for formatting the register and memory usage reports from ptxas and nvlink. This is simply done as a wrapper around clang and clang++. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D91085	2020-11-11 20:00:37 -05:00
Joachim Protze	25b3164bfb	[OpenMP][Tools][Tests] Fix ompt multiplex test With `6213ed0` the master callback was renamed to masked. The multiplex tests must check for masked now.	2020-11-12 01:43:49 +01:00
Peyton, Jonathan L	dd8723d348	[OpenMP] Fix shutdown hang/race bug The deadlock/race happens when primary thread gets initz lock and tries to join the worker thread which waits for the same lock in TLS key destructor. The patch removes the lock and the code of setting TLS value which needed the lock. Also removed setting TLS from __kmp_unregister_root_current_thread. Differential Revision: https://reviews.llvm.org/D90647	2020-11-11 13:47:23 -06:00
Joachim Protze	3fa2e19338	[OpenMP][Tool] Fix possible NULL-pointer dereference in test Avoid dereferencing a possibly uninitialized pointer as mentioned in D91280.	2020-11-11 20:13:22 +01:00
Joachim Protze	ce0911b3e9	[OpenMP][Tests] Fix compiler warnings in OpenMP runtime tests This patch allows to pass the OpenMP runtime tests after configuring with `cmake . -DOPENMP_TEST_FLAGS:STRING="-Werror"`. The warnings for OMPT tests are addressed in D90752. Differential Revision: https://reviews.llvm.org/D91280	2020-11-11 20:13:21 +01:00
Joachim Protze	6213ed062b	[OpenMP][OMPT] Update the omp-tools header file to reflect 5.1 changes This doesn't add functionality, but just adds the new types and renames the master callback to masked callback. Differential Revision: https://reviews.llvm.org/D90752	2020-11-11 20:13:21 +01:00
AndreyChurbanov	33da6bd7f5	[OpenMP] Fixes for shared memory cleanup when aborts occur Patch by Erdner, Todd <todd.erdner@intel.com> Differential Revision: https://reviews.llvm.org/D90974	2020-11-11 00:16:23 +03:00
Alexey Bataev	dcde6f17fd	Revert "[libomptarget] Add support for target update non-contiguous" This reverts commit `6847bcec1a`. It breaks the build of libomptarget.	2020-11-10 07:49:00 -08:00
Hansang Bae	ef7738240c	[OpenMP] Remove obsolete Fortran module file Modern Fortran compilers support Fortran 90, so we do not need to use the source code for Fortran compilers that do not support Fortran 90. Differential Revision: https://reviews.llvm.org/D90077	2020-11-09 15:26:38 -06:00
cchen	6847bcec1a	[libomptarget] Add support for target update non-contiguous This patch is the runtime support for https://reviews.llvm.org/D84192. In order not to modify the tgt_target_data_update information but still be able to pass the extra information for non-contiguous map item (offset, count, and stride for each dimension), this patch overloads arg when the maptype is set as OMP_TGT_MAPTYPE_DESCRIPTOR. The original arg is for passing the pointer information, however, the overloaded arg is an array of descriptor_dim: ``` struct descriptor_dim { int64_t offset; int64_t count; int64_t stride }; ``` and the array size is the dimension size. In addition, since we have count and stride information in descriptor_dim, we can replace/overload the arg_size parameter by using dimension size. Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D82245	2020-11-06 20:55:33 -06:00
Nawrin Sultana	082031949c	[OpenMP] Fix potential division by 0 This patch fixes potential division by 0 in case hwloc does not recognize cores (or architecture has no cores). Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D90954	2020-11-06 11:52:19 -06:00
Peyton, Jonathan L	5e34877480	[OpenMP] Add ident_t flags for compiler OpenMP version This patch adds the mask and ident_t function to get the openmp version. It also adds logic to force monotonic:dynamic behavior when OpenMP version less than 5.0. The OpenMP version is stored in the format: major*10+minor e.g., OpenMP 5.0 = 50 Differential Revision: https://reviews.llvm.org/D90632	2020-11-05 11:14:25 -06:00
Joachim Protze	7b0ca32b62	[OpenMP] avoid warning: equality comparison with extraneous parentheses The macros are used in several places with an if(macro) pattern. This results in several warnings about extraneous parentheses in equality comparison. Having the constant at the lhs of the comparison avoids this warning. Differential Revision: https://reviews.llvm.org/D90756	2020-11-05 12:13:08 +01:00
Jon Chesterfield	93cbf622fc	[libomptarget][nfc] Build amdgcn deviceRTL with nogpulib	2020-11-04 11:29:22 +00:00
Shilei Tian	f5eebc25cc	[OpenMP] Fixed an issue in the test case parallel_offloading_map There is a non-conforming use of variable-sized array in the test case `parallel_offloading_map.c`. This patch fixed it. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D90642	2020-11-03 15:59:16 -05:00
Joachim Protze	eaed9e6b56	[OpenMP][Tools] clang-format Archer (NFC)	2020-11-03 16:32:02 +01:00
Joachim Protze	71041a8b6b	[OpenMP][libomptarget][Tests] fix failing test D88149 updated `omp_get_initial_device` behavior to conform with OpenMP 5.1. omp_get_initial_device() == omp_get_num_devices()	2020-11-03 13:15:33 +01:00
Joachim Protze	b0eb19bf8a	[OpenMP][OMPT][NFC] Fix flaky test As reported by @ronlieb, the test shows intermittent fails. The test failed, if the dependent task was already finished, when the depending task was to be created. We have other tests to check for the dependences pair.	2020-11-03 13:15:32 +01:00
Joachim Protze	e99207feb4	[OpenMP][Tool] Handle detached tasks in Archer Since detached tasks are supported by clang and the OpenMP runtime, Archer must expect to receive the corresponding callbacks. This patch adds support to interpret the synchronization semantics of omp_fulfill_event and cleans up the handling of task switches.	2020-11-03 13:15:32 +01:00
Atmn Patel	a95b25b29e	[Libomptarget][NFC] Move global Libomptarget state to a struct Presently, there are a number of global variables in libomptarget (devices, RTLs, tables, mutexes, etc.) that are not placed within a struct. This patch places them into a struct ``PluginManager``. All of the functions that act on this data remain free. Differential Revision: https://reviews.llvm.org/D90519	2020-11-03 00:10:18 -05:00
Johannes Doerfert	30e818db91	[OpenMP][Docs] Structure and content for the OpenMP documentation This adds some initial content as well as structure to the new OpenMP Sphinx documentation hosted at http://openmp.llvm.org/docs/ . The content contains some useful links but most pages are still empty. This uses a "custom" theme which is a copy of the default "agogo" one with minor modifications to get a nicer table of content in the sidebar. This way we can also adjust the theme as we go. Reviewed By: jhuber6, JonChesterfield Differential Revision: https://reviews.llvm.org/D90256	2020-10-30 01:31:48 -05:00
Peyton, Jonathan L	771f0fb92d	[OpenMP] Add NULL check in dispatcher debug output Patch by Nawrin Sultana Differential Revision: https://reviews.llvm.org/D90403	2020-10-29 14:08:03 -05:00
Jon Chesterfield	dee7704829	[AMDGPU] Add __builtin_amdgcn_grid_size Similar to D76772, loads the data from the dispatch pointer. Marked invariant. Patch also updates the openmp devicertl to use this builtin. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D90251	2020-10-29 16:25:13 +00:00
Benjamin Kramer	207cf71fa9	Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API" This reverts commit `d981c7b758` and `a87d7b3d44`. Test fails under msan.	2020-10-28 13:58:14 +01:00

1691 Commits