llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonas Hahnfeld	17aabf83e9	[libomptarget-nvptx] loop: Determine if runtime uninitialized The generic entry points for static loop scheduling previously hardcoded that the runtime was initialized. This can be wrong if the compiler analyzes that the runtime is not needed and calls the init functions accordingly. This didn't affect clang-ykt because they have entry points for different combinations of SPMD x Runtime not needed. I didn't do measurements yet but with inlining we might get away with always calling the generic interface and letting compiler and runtime figure out the rest. In any case, a correct runtime is always better than having functions that may only be called if previous calls passed in a specific set of arguments! Differential Revision: https://reviews.llvm.org/D47131 llvm-svn: 333285	2018-05-25 15:56:48 +00:00
Jonas Hahnfeld	65e0b8784c	[CMake] Unify install path for libraries Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX. Differential Revision: https://reviews.llvm.org/D47130 llvm-svn: 333284	2018-05-25 15:56:41 +00:00
George Rokos	6da6f433a0	[CUDA]Fix dynamic\|guided scheduling. The existing implementation of the dynamic scheduling breaks the contract introduced by the original openmp runtime and, thus, is incorrect. Patch fixes it and introduces correct dynamic scheduling model. Thanks to Alexey Bataev for submitting this patch. Differential Revision: https://reviews.llvm.org/D47333 llvm-svn: 333225	2018-05-24 21:12:41 +00:00
Jonas Hahnfeld	9228f9718c	[libomptarget-nvptx-bc] Pass found CUDA installations We already know where the CUDA SDK is, so there is no point in letting Clang search for it again and possibly finding no or a different installation. --cuda-path is supported since the beginning of CUDA support in Clang, so making this required doesn't impose additional restrictions. Differential Revision: https://reviews.llvm.org/D46930 llvm-svn: 332495	2018-05-16 17:20:27 +00:00
Jonas Hahnfeld	37bbe1a698	[libomptarget-nvptx] Test bitcode compiler flags and enable by default Move all logic related to selecting the bitcode compiler and linker into a new file and dynamically test required compiler flags. This also adds -fcuda-rdc for Clang trunk as previously attempted in D44992 which fixes the build. As a result this change also enables building the library by default if all prerequisites are met. Differential Revision: https://reviews.llvm.org/D46901 llvm-svn: 332494	2018-05-16 17:20:21 +00:00
Gheorghe-Teodor Bercea	787a350021	[OpenMP][libomptarget] Add function for checking SPMD mode Summary: Add function to the NVPTX libomptarget library that will return true if the current target region is being executed in SPMD mode. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D46840 llvm-svn: 332360	2018-05-15 15:16:43 +00:00
Joachim Protze	9be9cf20bf	[OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regions implicit_task_end callbacks in nested parallel regions did not always give the correct thread_num, since the inner parallel region may have already been finalized. Now, the thread_num is stored at the beginning of the implicit task and retrieved at the end, whenever necessary. A testcase was added as well. Differential Revision: https://reviews.llvm.org/D46260 llvm-svn: 331632	2018-05-07 12:42:21 +00:00
Joachim Protze	8fc39f6b19	[OMPT] Add api_calls_misc.c testcase and rename api_calls.c testcase The api_calls_misc.c testcase tests the following api calls: ompt_get_callback() ompt_get_state() ompt_enumerate_states() ompt_enumerate_mutex_impls() These have not been tested previously. The api_calls.c testcase has been renamed to api_calls_places.c because it only tests api calls that are related to places. Differential Revision: https://reviews.llvm.org/D42523 llvm-svn: 331631	2018-05-07 12:42:15 +00:00
Guansong Zhang	e1c7a46d5b	[OpenMP] Use LIBOMPTARGET_DEVICE_RTL_DEBUG env var to control debug messages on the device side Summary: Enable the device side debug messages at compile time, use env var to control at runtime. To achieve this, an environment data block is passed to the device lib when it is loaded. By default, the message is off, to enable it, a user need to set LIBOMPDEVICE_DEBUG=1. Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D46210 llvm-svn: 331550	2018-05-04 19:29:28 +00:00
Jonathan Peyton	d47df260ba	[OpenMP][OMPT] Fix api_calls_from_other_thread.cpp Removed environment setting in RUN: line that was being ignored anyways. Changed a few specific checks to "any number" llvm-svn: 331212	2018-04-30 18:46:31 +00:00
Guansong Zhang	ad6c26516b	[OpenMP] Remove compilation warning when using clang to compile bc files. Summary: Minor printf format correction. NVCC ignore those. Clang will give warning on these if debug is enabled. Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D45528 llvm-svn: 330944	2018-04-26 14:06:53 +00:00
Guansong Zhang	334c379e32	[OpenMP] Make bc file compilation sensitive to LIBOMPTARGET_NVPTX_DEBUG flag Summary: The LIBOMPTARGET_NVPTX_DEBUG flag is inconsistent between using nvcc to generate .a file and clang to generate .bc file. Sync the two setting so we can get debug messages from the bc file path as well. Reviewers: grokos Subscribers: Hahnfeld, openmp-commits, mgorny Tags: #openmp Differential Revision: https://reviews.llvm.org/D45530 llvm-svn: 330477	2018-04-20 20:41:00 +00:00
Heejin Ahn	f78a493528	[OpenMP] Compilation error fix on const char* Summary: This line (`0ed912c7a7/runtime/src/kmp_gsupport.cpp (L1459)`) added in D45327 (rL330282) causes a compilation failure. Reviewers: jlpeyton Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D45786 llvm-svn: 330299	2018-04-18 22:23:31 +00:00
Jonathan Peyton	1482db9e03	[OpenMP] Fix affinity API for KMP_AFFINITY=none\|compact\|scatter Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none\|compact\|scatter. This patch does two things: for KMP_AFFINITY=none, Creates a one entry table for the places, this way, the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API. When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact\|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions Differential Revision: https://reviews.llvm.org/D45527 llvm-svn: 330283	2018-04-18 19:25:48 +00:00
Jonathan Peyton	27a677fc95	Introduce GOMP_taskloop API This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface. Differential Revision: https://reviews.llvm.org/D45327 llvm-svn: 330282	2018-04-18 19:23:54 +00:00
Joachim Protze	3865c69b84	Set the license header for all OMPT files llvm-svn: 329928	2018-04-12 17:23:26 +00:00
Guansong Zhang	f679431f91	[OpenMP] Remove extra warning when we build Summary: This one line change is to remove this warning message "warning: integer conversion resulted in a change of sign" Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D45415 llvm-svn: 329713	2018-04-10 15:28:31 +00:00
Guansong Zhang	f0029a7738	Revert "[OpenMP] enable bc file compilation using the latest clang" This reverts commit 6849e31c36d712d97433bca9af39b7a09c8c1207. llvm-svn: 329576	2018-04-09 14:45:41 +00:00
Guansong Zhang	e47fbc9da8	[OpenMP] enable bc file compilation using the latest clang Summary: adding cuda-rdc flag to allow extern global data Reviewers: grokos Reviewed By: grokos Subscribers: gregrodgers, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D44992 llvm-svn: 329072	2018-04-03 15:01:34 +00:00
Jonathan Peyton	1e6bb8d5de	Minor cleanup in __kmp_atfork_child() This change removes the unnecessary lock operation on __kmp_initz_lock inside the __kmp_atfork_child() function for Linux; the lock variable is initialized in the same function later. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D44949 llvm-svn: 328900	2018-03-30 19:55:11 +00:00
Jonathan Peyton	ea82c769f4	Move blocktime_str variable right before its first use llvm-svn: 328575	2018-03-26 19:20:50 +00:00
Jonathan Peyton	b6b79ac95b	Add summarizeStats.py to tools directory The summarizeStats.py script processes raw data provided by the instrumented (stats-gathering) OpenMP* runtime library. It provides: 1) A radar chart which plots counters as frequency (per GigaTick) of use within the program. The frequencies are plotted as log10, however values less than one are kept as it is and represented in red color. This was done to help visualize the differences better. 2) Pie charts separating total time as compute and non-compute. The compute and non-compute times have their own pie charts showing the constructs that contributed to them. The percentages listed are with respect to the total time. 3) '.csv' file with percentage of time spent within the different constructs. The script can be used as: $ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv Patch by Taru Doodi Differential Revision: https://reviews.llvm.org/D41838 llvm-svn: 328568	2018-03-26 18:44:48 +00:00
Andrey Churbanov	2d91a8a3ba	Fixed __kmpc_get_target_offload() to call library initialization. Differential Revision: https://reviews.llvm.org/D44793 llvm-svn: 328228	2018-03-22 18:51:51 +00:00
Gheorghe-Teodor Bercea	4bc36a06e2	[OpenMP][libomptarget] Initialize global memory stack only once. Summary: The global stack initialization function may be called multiple times. The initialization of the shared memory slots should only happen when the function is called for the first time for a given warp master thread. Reviewers: grokos, carlo.bertolli, ABataev, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44754 llvm-svn: 328148	2018-03-21 21:02:55 +00:00
Gheorghe-Teodor Bercea	b4332ca3da	[OpenMP][libomptarget] Fix master warp check Summary: The check for the master warp must take into consideration the actual number of warps: the master warp is equal to the last active warp not necessarily WARPSIZE - 1. Reviewers: grokos, carlo.bertolli, ABataev, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44537 llvm-svn: 328146	2018-03-21 20:51:16 +00:00
Gheorghe-Teodor Bercea	c8d395a168	[OpenMP][libomptarget] Enable globalization for workers Summary: This patch allows worker to have a global memory stack managed by the runtime. This patch is needed for completeness and consistency with the globalization policy: if a worker-side variable escapes the current context it then needs to be globalized. Until now, only the master thread was allowed to have such a stack. These global values can now potentially be shared amongst workers if the semantics of the OpenMP program require it. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44487 llvm-svn: 328144	2018-03-21 20:34:19 +00:00
Jonathan Peyton	78f977fcd1	Read OMP_TARGET_OFFLOAD and provide API to access ICV Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD, if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function __kmpc_get_target_offload is provided. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D44577 llvm-svn: 328046	2018-03-20 21:18:17 +00:00
Andrey Churbanov	3336aa0d07	Fix for Fix for https://bugs.llvm.org/show_bug.cgi?id=36705 . Differential Revision: https://reviews.llvm.org/D44637 llvm-svn: 327875	2018-03-19 18:05:15 +00:00
George Rokos	6b9bb5e1c2	Bugfix, extern declarations for libomp functions are `extern "C"` declarations llvm-svn: 327763	2018-03-17 02:07:42 +00:00
George Rokos	2878c3957b	Moved extern declarations to private header file, they are only used from within libomptarget, they don't need to be in omptarget.h. llvm-svn: 327740	2018-03-16 20:40:09 +00:00
Gheorghe-Teodor Bercea	876c1ed2e5	[OpenMP][libomptarget] Enable usage of shared memory slots Summary: Allow the runtime to use the existing shared memory statically allocated slots. When a variable is globalized, the underlying memory can be either shared or global memory (both have block-wide visibility). In this case, we allow that the storage to use a limited amount of shared memory that has been statically allocated already. Only if shared memory doesn't prove to be enough do we then invoke malloc() to create a new global memory slot. Reviewers: ABataev, carlo.bertolli, grokos, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44486 llvm-svn: 327639	2018-03-15 16:05:34 +00:00
Gheorghe-Teodor Bercea	f3de222b0d	[OpenMP][libomptarget] Enable multiple frames per global memory slot Summary: To save on calls to malloc, this patch enables the re-use of pre-allocated global memory slots. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44470 llvm-svn: 327637	2018-03-15 15:56:04 +00:00
George Rokos	59be4b434f	[libomptarget][nvptx] Bug fix: Correctly identify the warp master active thread. llvm-svn: 327556	2018-03-14 19:11:36 +00:00
Gheorghe-Teodor Bercea	49b62649cf	[OpenMP][libomptarget] Add global memory data sharing support for master-worker sharing. Summary: This patch adds support for the sharing of variables from the master thread of a team to the worker threads of the team. The runtime uses a stack structure implemented as a doubly-linked list of slots with each slot having the exact same size as the size requested. This implementation leverages existing data structures. The runtime functions are added as separate functions to avoid interfering with the current interface. Limitations to be addressed in future patches: - This current patch only employs global memory. In a future patch we will enable to usage for shared memory as an optimization. - Allow the allocation of several requested sizes in the same slot. Reviewers: ABataev, grokos, caomhin, carlo.bertolli Reviewed By: grokos Subscribers: Hahnfeld, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44260 llvm-svn: 327440	2018-03-13 19:44:53 +00:00
Sylvestre Ledru	1c861c582f	fix a typo on the website llvm-svn: 327237	2018-03-11 10:53:40 +00:00
Gheorghe-Teodor Bercea	d5e5992f9a	[OpenMP][libomptarget] Fix union. Summary: To make the two parts of the union have the same size, the size of vect needs to be increased by 16 bits. Reviewers: grokos, carlo.bertolli, caomhin, ABataev Reviewed By: grokos, ABataev Subscribers: fedor.sergeev, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44254 llvm-svn: 327040	2018-03-08 18:44:02 +00:00
Gheorghe-Teodor Bercea	7a5fa21ae2	[OpenMP] Remove implicit data sharing using device shared memory from libomptarget Summary: This patch reverts the changes to libomptarget that were coupled with the changes to Clang code gen for data sharing using shared memory. A similar patch exists for Clang: D43625 Shared memory is meant to be used as an optimization on top of a more general scheme. So far we didn't have a global memory implementation ready so shared memory was a solution which applied to the current level of OpenMP complexity supported by trunk on GPU devices (due to the missing NVPTX backend patch this functionality has never been exercised). Now that we have a global memory solution this patch is "in the way" and needs to be removed (for now). This patch (or an equivalent version of it) will be put out for review once the global memory scheme is in place. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: Hahnfeld, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D43626 llvm-svn: 326950	2018-03-07 22:10:10 +00:00
Andrey Churbanov	9e9333aa8a	Improve OpenMP threadprivate implementation. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D41914 llvm-svn: 326733	2018-03-05 18:42:01 +00:00
Andrey Churbanov	75bc70fb56	Fixed build of the OpenMP stubs library. Differential Revision: https://reviews.llvm.org/D44019 llvm-svn: 326728	2018-03-05 18:01:47 +00:00
Jonas Hahnfeld	b0f051ae63	[OMPT] Fix interoperability test with GCC We have to ensure that the runtime is initialized _before_ waiting for the two started threads to guarantee that the master threads post their ompt_event_thread_begin before the worker threads. This is not guaranteed in the parallel region where one worker thread could start before the other master thread has invoked the callback. The problem did not happen with Clang becauses the generated code calls __kmpc_global_thread_num() and cashes its result for functions that contain OpenMP pragmas. Differential Revision: https://reviews.llvm.org/D43882 llvm-svn: 326435	2018-03-01 14:03:18 +00:00
Joachim Protze	f5aebc27ad	[OMPT] Fix task-type test with GCC This is similar to D43882. The runtime needs to be initialized before calling print_ids(0) http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/60 Differential Revision: https://reviews.llvm.org/D43897 llvm-svn: 326428	2018-03-01 11:26:15 +00:00
Joachim Protze	aa2022e74f	[OMPT] Fix ompt_get_task_info() and add tests for it The thread_num parameter of ompt_get_task_info() was not being used previously, but need to be set. The print_task_type() function (form the task-types.c testcase) was merged into the print_ids() function (in callback.h). Testing of ompt_get_task_info() was added to the task-types.c testcase. It was not tested extensively previously. Differential Revision: https://reviews.llvm.org/D42472 llvm-svn: 326338	2018-02-28 17:36:18 +00:00
Joachim Protze	4df80bda40	[OMPT] Fix inconsistent testcases The main change of this patch is to insert {{.*}} in current_address=[[RETURN_ADDRESS_END]]. This is needed to match any of the alternatively printed addresses. Additionally, clang-format is applied to the two tests. Differential Revision: https://reviews.llvm.org/D43115 llvm-svn: 326312	2018-02-28 09:28:51 +00:00
Jonas Hahnfeld	82768d0ba1	[OMPT] Fix parallel_data in implicit barrier-end This is required to be NULL for implicit barriers at the end of a parallel region. Noticed in review of D43191. Differential Revision: https://reviews.llvm.org/D43308 llvm-svn: 325922	2018-02-23 16:46:25 +00:00
Jonas Hahnfeld	5e44069857	[OMPT] Fix test tasks/serialized.c with optimization The compiler inlines the user code in the task. Check for that case at runtime by comparing the frame addresses and print the expected exit address. Also showcase how I think the OMPT tests could be reformatted to match LLVM's code style. In my opinion it would be great to that kind of change to all tests that need to be touched for whatever reason... Differential Revision: https://reviews.llvm.org/D43191 llvm-svn: 325921	2018-02-23 16:46:11 +00:00
Joachim Protze	b0e4f87fb0	[OMPT] Omissionin in OMPT Formatting Applying clang-format to the /runtime/src/ folder Differential Revision: https://reviews.llvm.org/D42169 llvm-svn: 325424	2018-02-17 09:54:10 +00:00
Joachim Protze	33db70d2d7	[OMPT] Add interoperability testcase Test whether OMPT-callbacks for two threads that initiate a parallel region are correct. Differential Revision: https://reviews.llvm.org/D41942 llvm-svn: 325423	2018-02-17 09:40:08 +00:00
Joachim Protze	76899b84fe	[OMPT] Update api_calls testcase Only use ompt_ functions when testing OMPT in api_calls testcase. Add size parameter to print_list. Fix small bug in implementation of ompt_get_partition_place_nums(): return correct length. Differential Revision: https://reviews.llvm.org/D42162 llvm-svn: 325422	2018-02-17 09:40:02 +00:00
Jonas Hahnfeld	6f9e25d382	[CMake] Add -fno-experimental-isel for testing GlobalISel doesn't yet implement blockaddress and falls back to SelectionDAG. This results in additional branch instruction to the next basic block which breaks the OMPT tests. Disable GlobalISel for now when compiling the tests because fixing them is not easily possible. See http://llvm.org/PR36313 for full discussion history. Differential Revision: https://reviews.llvm.org/D43195 llvm-svn: 325218	2018-02-15 08:10:22 +00:00
Jonas Hahnfeld	cc6d29d72c	[OMPT][test] Correct warning about added wrapper functions This affects all outlined functions, not just tasks! Only show warning when using Clang 5.0 or later. Differential Revision: https://reviews.llvm.org/D43190 llvm-svn: 325131	2018-02-14 15:15:24 +00:00
Gheorghe-Teodor Bercea	d5ae4e6501	[OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining Summary: Different NVIDIA GPUs support different compute capabilities. To enable the inlining of runtime functions and the best performance on different generations of NVIDIA GPUs, a bc library for each compute capability needs to be compiled. The same compiler build will then be usable in conjunction with multiple generations of NVIDIA GPUs. To differentiate between versions of the same bc lib, the output file name will contain the compute capability ID. Depends on D14254 Reviewers: Hahnfeld, hfinkel, carlo.bertolli, caomhin, ABataev, grokos Reviewed By: Hahnfeld, grokos Subscribers: guansong, mgorny, openmp-commits Differential Revision: https://reviews.llvm.org/D41724 llvm-svn: 324904	2018-02-12 16:45:20 +00:00
Jonas Hahnfeld	3cfaf3dd0d	[libomptarget] Fix detection of CUDA stubs library CUDA_LIBRARIES contains additional linker arguments since CMake 3.3 which breakes the current way of finding the stubs library. llvm-svn: 324879	2018-02-12 11:01:56 +00:00
Joachim Protze	cfc98c2493	[OMPT] Add tool_available_search testcase Tests the search for tools as defined in the spec. The OMP_TOOL_LIBRARIES environment variable contains paths to the following files(in that order) -to a nonexisting file -to a shared library that does not have a ompt_start_tool function -to a shared library that has an ompt_start_tool implementation returning NULL -to a shared library that has an ompt_start_tool implementation returning a pointer to a valid instance of ompt_start_tool_result_t The expected result is that the last tool gets active and can print in the thread-begin callback. Differential Revision: https://reviews.llvm.org/D42166 llvm-svn: 324588	2018-02-08 10:04:33 +00:00
Joachim Protze	9440c0ee3c	[OMPT] Add tool_not_available testcase Add a testcase that checks wheter the runtime can handle an ompt_start_tool method that returns NULL indicating that no tool shall be loaded. All tool_available testcases need a separate folder to avoid file conflicts for the generated tools. Differential Revision: https://reviews.llvm.org/D41904 llvm-svn: 324587	2018-02-08 10:04:28 +00:00
Gheorghe-Teodor Bercea	aaeab8d4ef	[OpenMP][libomptarget] Add data sharing support in libomptarget Summary: This patch extends the libomptarget functionality in patch D14254 with support for the data sharing scheme for supporting implicitly shared variables. The runtime therefore maintains a list of references to shared variables. Reviewers: carlo.bertolli, ABataev, Hahnfeld, grokos, caomhin, hfinkel Reviewed By: Hahnfeld, grokos Subscribers: guansong, llvm-commits, openmp-commits Differential Revision: https://reviews.llvm.org/D41485 llvm-svn: 324495	2018-02-07 18:21:55 +00:00
Joachim Protze	2a20299f91	[OMPT] Fix tool initialization returning 0 If tool initialization returns 0, OMPT should not be active. The current implementation provided some callback invocations in this case. Differential Revision: https://reviews.llvm.org/D42709 llvm-svn: 324320	2018-02-06 08:41:27 +00:00
Carlo Bertolli	57e9f44a8c	[OpenMP-RT] Fix debug string for NVPTX runtime library https://reviews.llvm.org/D42757 The method ThreadsInTeam is used to determine the number of threads to be used in a parallel region under SPMD mode (see line 127 of supporti.h in libomptarget/deviceRTLs/nvptx/src/). This patch fixes the corresponding debug print upon initialization of the kernel in SPMD mode. llvm-svn: 323978	2018-02-01 16:12:16 +00:00
Jonas Hahnfeld	a349d4820c	[libomptarget] Check for library with CUDA Driver API That's what we really need to link the CUDA plugin against, not the CUDA runtime API in CUDA_LIBRARIES! While the latter comes with the CUDA SDK, the Driver API is installed with the kernel driver and there is at most one per system. As fallback we can use the stubs library distributed with the CUDA SDK for linking. Differential Revision: https://reviews.llvm.org/D42643 llvm-svn: 323787	2018-01-30 16:49:13 +00:00
Jonas Hahnfeld	c189523529	[libomptarget] Only use CUDA Driver API Use equivalents for the last calls to the Runtime API. Remove stray assert in case of an error found during review, we should only return OFFLOAD_FAIL. Differential Revision: https://reviews.llvm.org/D42686 llvm-svn: 323786	2018-01-30 16:49:06 +00:00
George Rokos	0dd6ed74fd	[OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs. This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices. Currently there is a single device RTL written in CUDA meant to CUDA enabled GPUs. The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target. Differential revision: https://reviews.llvm.org/D14254 llvm-svn: 323649	2018-01-29 13:59:35 +00:00
Jonas Hahnfeld	723560d123	[OMPT] Use fuzzy return addresses in lock testcases Use fuzzy return addresses in lock testcases so that these testcases can also be run using the Intel Compiler. Patch by Simon Convent! Differential Revision: https://reviews.llvm.org/D41896 llvm-svn: 323529	2018-01-26 14:19:02 +00:00
Jonas Hahnfeld	e57620308e	Fix name of 'macOS' and add asteriks to brands, NFC. llvm-svn: 323180	2018-01-23 07:54:10 +00:00
Dimitry Andric	9f49676a8a	Sprinkle a few <cstdlib> includes, for libomptarget sources using malloc, free, alloca and getenv. NFCI. llvm-svn: 322869	2018-01-18 18:24:22 +00:00
Jonas Hahnfeld	e5499111b9	Add missing headers for Debug builds llvm-svn: 322830	2018-01-18 10:58:43 +00:00
Joachim Protze	e6269e3509	Partial revert of [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl The previous commit did not revert all replaced ompt_mutex_impl_unknown. llvm-svn: 322631	2018-01-17 11:13:11 +00:00
Joachim Protze	0c9516b36c	[OMPT] Add Workaround for Intel Compiler Bug Add Workaround for Intel Compiler Bug with Case#: 03138964 A critical region within a nested task causes a segfault in icc 14-18: int main() { #pragma omp parallel num_threads(2) #pragma omp master #pragma omp task #pragma omp task #pragma omp critical printf("test\n"); } When the critical region is in a separate function, the segault does not occur. So we add noinline to make sure that the function call stays there. Differential Revision: https://reviews.llvm.org/D41182 llvm-svn: 322622	2018-01-17 10:06:06 +00:00
Joachim Protze	1b2bd2680b	[OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl The defintion is not part of the spec and thus should not have the prefix "ompt_" but rather a prefix that indicates that this is implementation specific. Differential Revision: https://reviews.llvm.org/D41166 llvm-svn: 322621	2018-01-17 10:06:01 +00:00
Joachim Protze	1dc2afdcaf	[OMPT] Return appropiate values for ompt runtime entry points for non-OpenMP threads When the current thread is not an (initialized) OpenMP thread, the runtime entry points return values that correspond to "not available" or similar Differential Revision: https://reviews.llvm.org/D41167 llvm-svn: 322620	2018-01-17 10:05:55 +00:00
Andrey Churbanov	5388acd3de	Fixed libomp static build broken by the commit rL322202. Patch by simone <simone@cs.utah.edu>. Differential Revision: https://reviews.llvm.org/D41945 llvm-svn: 322282	2018-01-11 15:09:49 +00:00
Jonathan Peyton	79390ad709	Force HWLOC topology method for NUMA-specific topology If user requested affinity with granularity=tile we need to either use HWLOC or ignore the request. The change allows user to not specify KMP_TOPOLOGY_METHOD=hwloc and choose it automatically instead. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D40905 llvm-svn: 322205	2018-01-10 18:31:49 +00:00
Jonathan Peyton	1800ecec70	Simplify __kmp_expand_threads This change simplifies __kmp_expand_threads to take a single argument. Previously, it allowed two arguments and had logic to decide on different potential expansion sizes. However, no calls to __kmp_expand_threads in the runtime make use of this extra logic. Thus the extra argument and logic is removed here. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D41836 llvm-svn: 322204	2018-01-10 18:27:01 +00:00
Jonathan Peyton	bff8ded906	Minor code cleanup Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D41831 llvm-svn: 322203	2018-01-10 18:24:09 +00:00
Jonathan Peyton	eaa9e40c9a	Improve stability of the runtime in parent/child processes This change improves stability of the runtime when the application forks child processes. Acquiring/releasing __kmp_initz_lock and __kmp_forkjoin_lock in the atfork handlers insures that the actual fork does not occur while those two locks are held, and __kmp_itt_reset() reverts the itt's global state to the initial state which also initializes the mutex stored in the global state. Some missing initialization code was also inserted in the child's atfork handler. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D41462 llvm-svn: 322202	2018-01-10 18:21:48 +00:00
Joachim Protze	1014a6b6c6	Missed to add new test case in previous commit llvm-svn: 322179	2018-01-10 12:52:34 +00:00
Joachim Protze	14b512e20c	[OMPT] Fix ompt_task_data handling in implicit barriers Changes to task_data in barrier-begin were not visible at barrier-end Differential Revision: https://reviews.llvm.org/D41176 llvm-svn: 322178	2018-01-10 12:51:27 +00:00
Jonas Hahnfeld	f34d65a164	[OMPT] Fix cast and printf of wait_id in lock test This didn't work on 32 bit platforms. Differential Revision: https://reviews.llvm.org/D41853 llvm-svn: 322160	2018-01-10 08:10:23 +00:00
Paul Osmialowski	6db41e608f	Fix type mismatch in omp_control_tool() implementation that makes it run incorrectly on 32-bit machines. Differential Revision: https://reviews.llvm.org/D41854 llvm-svn: 322068	2018-01-09 10:54:06 +00:00
Jonas Hahnfeld	3ffca790f6	Correct types of pointers to doacross_num_done This field is defined as kmp_int32, so we should use neither pointers to kmp_int64 nor 64 bit atomic instructions. (Found while testing on a Raspberry Pi, 32 bit ARM) Differential Revision: https://reviews.llvm.org/D41656 llvm-svn: 321964	2018-01-07 16:54:36 +00:00
Jonathan Peyton	97f4320086	Fix some comments and formatting in kmp_dispatch.cpp llvm-svn: 321831	2018-01-04 23:05:26 +00:00
Jonathan Peyton	8c432f2d5e	Fix trademarks found by scanner llvm-svn: 321827	2018-01-04 22:56:47 +00:00
Joachim Protze	e5e4afd6db	[OMPT] Build runtime with OMPT support by default This patch enables OMPT by default if version 50 or later is built and the config says, that OMPT will be supported. Differential Revision: https://reviews.llvm.org/D41508 llvm-svn: 321675	2018-01-02 21:09:00 +00:00
Jonas Hahnfeld	2e809acd0b	Unify build documentation and convert to reStructuredText We now have several options that apply for both libraries and they shouldn't be documented in multiple files. When already merging the two Build_With_CMake.txt documents, convert them to reStructuredText which is used for all of LLVM's documentation. Differential Revision: https://reviews.llvm.org/D40920 llvm-svn: 321481	2017-12-27 09:15:10 +00:00
Joachim Protze	265fb584a5	[OMPT] Set and reset frame address when creating a task with dependences As for normal task creation, the task frame addresses need to be stored for the encountering task. Differential Revision: https://reviews.llvm.org/D41165 llvm-svn: 321421	2017-12-24 07:30:23 +00:00
Paul Osmialowski	6b8141acdd	[OMPT] Add missing initialization in nested_lwt.c test case Without this initialization this test case tend to fail. Differential Revision: https://reviews.llvm.org/D41542 llvm-svn: 321379	2017-12-22 19:24:06 +00:00
Joachim Protze	9c9b61df7e	[OMPT] Fix failing test cases for gcc on Ubuntu The compiler warns that _BSD_SOURCE is deprecated and _DEFAULT_SOURCE should be used instead. We keep _BSD_SOURCE for older compilers, that don't know about _DEFAULT_SOURCE. The linker drops the tool when linking, since there is no visible need for the library. So we need to tell the linker, that the tool should be linked anyway. Differential Revision: https://reviews.llvm.org/D41499 llvm-svn: 321362	2017-12-22 16:40:32 +00:00
Joachim Protze	25aa3ec1c5	Remove unused positional argument for printf The format string for hints only prints the second argument (string) and drops the first argument (hint id). Depending on how you read the POSIX text for printf, this could be valid. But for practical reason, i.e., unpacking the va_list passed to printf based on the formating information, it makes sense to fix the implementation and not pass the id for hint. Failing testcases were: misc_bugs/teams-reduction.c ompt/parallel/not_enough_threads.c Differential Revision: https://reviews.llvm.org/D41504 llvm-svn: 321361	2017-12-22 16:40:26 +00:00
Joachim Protze	e8d84a67c2	Add missing test case from D41171 commit llvm-svn: 321270	2017-12-21 14:36:36 +00:00
Joachim Protze	f375f4b49a	[OMPT] Add missing ompt_get_num_procs function This function is defined in OpenMP-TR6 section 4.1.5.1.6 The functions was not implemented yet. Since ompt-functions can only be called after the runtime was initialized and has loaded a tool, it can assume the runtime to be initialized. In contrast to omp_get_num_procs which needs to check whether the runtime is initialized. Differential Revision: https://reviews.llvm.org/D40949 llvm-svn: 321269	2017-12-21 14:36:30 +00:00
Joachim Protze	f8d22f9db8	[OMPT] Fix return address handling in a few GOMP interface methods This revision fixes failing testcases with parallel for loops and the gomp interface. The return address needs to be stored at entry to runtime. The storage is cleared on usage, so we need to update the storage before calling again internal functions, that will trigger event callbacks. Differential Revision: https://reviews.llvm.org/D41181 llvm-svn: 321265	2017-12-21 13:55:39 +00:00
Joachim Protze	4fe83593eb	[OMPT] Handle null pointer in set_callback to improve performance We use the bitmap ompt_enabled thoughout the runtime, to avoid loading the vector of callback functions when testing if specific code should be executed. Before invoking an event callback function, the pointer is tested for NULL. This revision resets the corresponding bit in ompt_enabled to 0 if NULL is passed in set_callback. Differential Revision: https://reviews.llvm.org/D41171 llvm-svn: 321264	2017-12-21 13:55:34 +00:00
Joachim Protze	0e2a2571ca	[OMPT] Use frames at different level when using clang version 5 or higher with debug flag Clang 5 or higher adds an intermediate function call in certain cases when compiling with debug flag. This revision updates the testcases to work correctly. Differential Revision: https://reviews.llvm.org/D40595 llvm-svn: 321263	2017-12-21 13:55:29 +00:00
Joachim Protze	633bc4ca99	[OMPT] Add annotations to testcases that are expected to fail when using certain compilers Reasons for expected failures are mainly bugs when using lables in OpenMP regions or missing support of some OpenMP features. For some worksharing clauses, support to distinguish the kind of workshare was added just recently. If an issue was fixed in a minor release version of a compiler, we flag the test as unsupported for this compiler version to avoid false positives. Same for fixes that where backported to older compiler versions. Differential Revision: https://reviews.llvm.org/D40384 llvm-svn: 321262	2017-12-21 13:55:16 +00:00
Paul Osmialowski	17fb580c12	[AArch64] add required arch specific code for running OMPT test cases Differential Revision: https://reviews.llvm.org/D41482 llvm-svn: 321258	2017-12-21 12:33:31 +00:00
Dimitry Andric	e4f5d01033	Fix more inconsistent line endings. NFC. llvm-svn: 321016	2017-12-18 19:46:56 +00:00
Paul Osmialowski	7634f7093a	[AArch64] fix an issue with older /proc/cpuinfo layout There are two /proc/cpuinfo layots in use for AArch64: old and new. The old one has all 'processor : n' lines in one section, hence checking for duplications does not make sense. Differential Revision: https://reviews.llvm.org/D41000 llvm-svn: 320593	2017-12-13 16:12:24 +00:00
Jonas Hahnfeld	2fcce313ad	[CMake] Remove legacy LIBOMP_LIT_ARGS The bots have been updated, this option isn't needed anymore. llvm-svn: 320153	2017-12-08 15:07:08 +00:00
Jonas Hahnfeld	e628ab4c65	Use hyperbarrier by default on all architectures All architectures except x86_64 used the linear barrier implementation by default which doesn't give good performance for a larger number of threads. Improvements for PARALLEL overhead (EPCC) with this patch on a Power8 system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores) 20 threads: 4.55us -> 3.49us 40 threads: 8.84us -> 4.06us 80 threads: 19.18us -> 4.74us 160 threads: 54.22us -> 6.73us Differential Revision: https://reviews.llvm.org/D40358 llvm-svn: 320152	2017-12-08 15:07:07 +00:00
Jonas Hahnfeld	ce528acf0d	Fix thread affinity on non-x86 Linux To make thread affinity work according to the OpenMP spec, the runtime needs information about the hardware topology. On Linux the default way is to parse /proc/cpuinfo which contains this information for x86 machines but (at least) not for AArch64 and Power architectures. Fortunately, there is a different code path which is able to get that data from sysfs. The needed patch has landed in 2006 for Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had a kernel version derived from 2.6.18, and we are now at RHEL 7!). Differential Revision: https://reviews.llvm.org/D40357 llvm-svn: 320151	2017-12-08 15:07:05 +00:00
Jonas Hahnfeld	86c307821c	Add missing memory barrier for queuing locks Otherwise I see hangs in the omp_single_copyprivate test when compiling in release mode. With the debug assertions, I get a failure `head > 0 && tail > 0`. Differential Revision: https://reviews.llvm.org/D40722 llvm-svn: 320150	2017-12-08 15:07:02 +00:00
Jonas Hahnfeld	a7c4f3202b	[libomptarget] Split implementation of interface functions This last of four patches adds a new file for the interface functions that Clang uses during code generation. The only change except simply moving the current code is renaming the function CheckDeviceAndCtors() and using the correct type for 64bit device ids. Differential Revision: https://reviews.llvm.org/D40801 llvm-svn: 319972	2017-12-06 21:59:15 +00:00

1 2 3 4 5 ...

800 Commits