llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonathan Peyton	dc73f512ae	Fix const cast problem introduced in r336563 336563 eliminated CCAST() macros, which caused build failures llvm-svn: 336586	2018-07-09 19:09:31 +00:00
Jonathan Peyton	61d44f188a	[OpenMP] Fix a few formatting issues llvm-svn: 336575	2018-07-09 18:09:25 +00:00
Jonathan Peyton	f639936748	[OpenMP] Introduce hierarchical scheduling This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default. To enable it, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based on the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems". Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iteration space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer. [ Global iteration space (0-999) ] (use static) [ L1 | L1 | L1 | L1 ] (use dynamic,1) [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ] In the example shown above, there are 8 threads and 4 L1 caches being targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA. OMP_SCHEDULE can now read a hierarchical schedule with this syntax: OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK' And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before. I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code. Differential Revision: https://reviews.llvm.org/D47962 llvm-svn: 336571	2018-07-09 17:51:13 +00:00
Jonathan Peyton	39ada85446	[OpenMP] Restructure loop code for hierarchical scheduling This patch reorganizes the loop scheduling code in order to allow hierarchical scheduling to use it more effectively. In particular, the goal of this patch is to separate the algorithmic parts of the scheduling from the thread logistics code. Moves declarations & structures to kmp_dispatch.h for easier access in other files. Extracts the algorithmic part of __kmp_dispatch_init() and __kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and __kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in __kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the hierarchical scheduler needs to access the scheduling logic without the bookkeeping logic. To prepare for a new pointer in dispatch_private_info_t, a new flags variable is created which stores the ordered and nomerge flags instead of them being in two separate variables. This will keep the dispatch_private_info_t structure the same size. Differential Revision: https://reviews.llvm.org/D47961 llvm-svn: 336568	2018-07-09 17:45:33 +00:00
Jonathan Peyton	37e2ef5434	[OpenMP] Use C++11 Atomics - barrier, tasking, and lock code These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes. Most variables that need synchronization operation were converted to generic atomic variables (std::atomic<T>). Variables that are updated with combined CAS are packed into a single atomic variable, and partial read/write is done through unpacking/packing Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D47903 llvm-svn: 336563	2018-07-09 17:36:22 +00:00
Kelvin Li	b1711b28f7	Define the __STDC_FORMAT_MACROS to avoid test failure on some platforms. ompt/misc/api_calls_from_other_thread.cpp ompt/misc/interoperability.cpp Differential Revision: https://reviews.llvm.org/D48984 llvm-svn: 336438	2018-07-06 14:15:59 +00:00
Joachim Protze	b41c61eed4	Dropped non-supported "--no-as-needed" flag from OMPT tests for macOS The flag "--no-as-needed" is not recognized by the linker on macOS, making the following tests fail: ompt/loadtool/tool_available/tool_available.c ompt/loadtool/tool_not_available/tool_not_available.c This patch removes this flag for macOS and adds it only for Linux and Windows. I tested it on Ubuntu 16.04 and macOS HighSierra, with Clang/LLVM 6.0.1 and OpenMP trunk. This solution was also discussed in the OpenMP-dev mailing list. Patch provided by Simone Atzeni Differential Revision: https://reviews.llvm.org/D48888 llvm-svn: 336327	2018-07-05 09:14:06 +00:00
Joachim Protze	00505b85a3	[OMPT] Add synchronization to threads_nested.c testcase The testcase potentially fails when a thread is reused. The added synchronization makes sure this does not happen. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48932 llvm-svn: 336326	2018-07-05 09:14:01 +00:00
Joachim Protze	04a00fc18c	[OMPT] Use alloca() to force availability of frame pointer When compiling with icc, there is a problem with reenter frame addresses in parallel_begin callbacks in the interoperability.c testcase. (The address is not available, thus NULL.) Using alloca() forces availability of the frame pointer. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48282 llvm-svn: 336088	2018-07-02 09:13:38 +00:00
Joachim Protze	e2eec57a4f	[OMPT] Add tests for runtime entry points from non-OpenMP threads Several runtime entry points have not been tested from non-OpenMP threads. This adds tests to an existing testcase. While at it, the testcase was reformatted Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48124 llvm-svn: 336087	2018-07-02 09:13:34 +00:00
Joachim Protze	28d2d708d4	[OMPT] Add testcases for thread_begin and thread_end callbacks Especially the thread_end callback has not been tested before. This adds a testcase for nested and non-nested threads. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D47824 llvm-svn: 336086	2018-07-02 09:13:30 +00:00
Joachim Protze	4a73ae167e	[OMPT] Provide the right thread_num for ancestor levels The current implementation always provides the thread-num for the current parallel region. This patch fixes the behavior for ancestor levels >0. Differential Revision: https://reviews.llvm.org/D46533 llvm-svn: 336085	2018-07-02 09:13:24 +00:00
Andrey Churbanov	a7fa3f009a	minor: fixed typo in debug print llvm-svn: 335138	2018-06-20 15:54:11 +00:00
Jonathan Peyton	e92ae43be8	[OpenMP] Fix formatting issues in kmp_stats.h llvm-svn: 334335	2018-06-08 22:27:53 +00:00
Joachim Protze	406361330b	[OMPT] Rename ompt_wait_id to omp_wait_id Rename ompt_wait_id to omp_wait_id, as defined in the spec. Differential Revision: https://reviews.llvm.org/D46530 llvm-svn: 333368	2018-05-28 08:16:08 +00:00
Joachim Protze	c5836064bb	[OMPT] Rename ompt_frame_t to omp_frame_t Rename ompt_frame_t to omp_frame_t, as defined in the spec. Differential Revision: https://reviews.llvm.org/D43568 llvm-svn: 333367	2018-05-28 08:14:58 +00:00
Jonas Hahnfeld	3c6595d65d	[OMPT] Fix test parallel/not_enough_threads.c Upcoming changes to FileCheck will modify CHECK-DAG to not match overlapping regions of the input. This test was found to be affected because it expects to find four threads to invoke events of type ompt_event_implicit_task_begin. It turns out this is wrong because OMP_THREAD_LIMIT is set to 2, so there are only two threads. The rest of the test got it right so it went unnoticed until now. (Rewrite test and apply clang-format to it as discussed in the past.) Differential Revision: https://reviews.llvm.org/D47119 llvm-svn: 333361	2018-05-27 17:07:38 +00:00
Jonas Hahnfeld	65e0b8784c	[CMake] Unify install path for libraries Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX. Differential Revision: https://reviews.llvm.org/D47130 llvm-svn: 333284	2018-05-25 15:56:41 +00:00
Joachim Protze	9be9cf20bf	[OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regions implicit_task_end callbacks in nested parallel regions did not always give the correct thread_num, since the inner parallel region may have already been finalized. Now, the thread_num is stored at the beginning of the implicit task and retrieved at the end, whenever necessary. A testcase was added as well. Differential Revision: https://reviews.llvm.org/D46260 llvm-svn: 331632	2018-05-07 12:42:21 +00:00
Joachim Protze	8fc39f6b19	[OMPT] Add api_calls_misc.c testcase and rename api_calls.c testcase The api_calls_misc.c testcase tests the following api calls: ompt_get_callback() ompt_get_state() ompt_enumerate_states() ompt_enumerate_mutex_impls() These have not been tested previously. The api_calls.c testcase has been renamed to api_calls_places.c because it only tests api calls that are related to places. Differential Revision: https://reviews.llvm.org/D42523 llvm-svn: 331631	2018-05-07 12:42:15 +00:00
Jonathan Peyton	d47df260ba	[OpenMP][OMPT] Fix api_calls_from_other_thread.cpp Removed environment setting in RUN: line that was being ignored anyways. Changed a few specific checks to "any number" llvm-svn: 331212	2018-04-30 18:46:31 +00:00
Heejin Ahn	f78a493528	[OpenMP] Compilation error fix on const char* Summary: This line (`0ed912c7a7/runtime/src/kmp_gsupport.cpp (L1459)`) added in D45327 (rL330282) causes a compilation failure. Reviewers: jlpeyton Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D45786 llvm-svn: 330299	2018-04-18 22:23:31 +00:00
Jonathan Peyton	1482db9e03	[OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatter Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none|compact|scatter. This patch does two things: For KMP_AFFINITY=none, it creates a one-entry table for the places, so that the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API. When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions. Differential Revision: https://reviews.llvm.org/D45527 llvm-svn: 330283	2018-04-18 19:25:48 +00:00
Jonathan Peyton	27a677fc95	Introduce GOMP_taskloop API This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface. Differential Revision: https://reviews.llvm.org/D45327 llvm-svn: 330282	2018-04-18 19:23:54 +00:00
Joachim Protze	3865c69b84	Set the license header for all OMPT files llvm-svn: 329928	2018-04-12 17:23:26 +00:00
Jonathan Peyton	1e6bb8d5de	Minor cleanup in __kmp_atfork_child() This change removes the unnecessary lock operation on __kmp_initz_lock inside the __kmp_atfork_child() function for Linux; the lock variable is initialized in the same function later. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D44949 llvm-svn: 328900	2018-03-30 19:55:11 +00:00
Jonathan Peyton	ea82c769f4	Move blocktime_str variable right before its first use llvm-svn: 328575	2018-03-26 19:20:50 +00:00
Jonathan Peyton	b6b79ac95b	Add summarizeStats.py to tools directory The summarizeStats.py script processes raw data provided by the instrumented (stats-gathering) OpenMP* runtime library. It provides: 1) A radar chart which plots counters as frequency (per GigaTick) of use within the program. The frequencies are plotted as log10, however values less than one are kept as it is and represented in red color. This was done to help visualize the differences better. 2) Pie charts separating total time as compute and non-compute. The compute and non-compute times have their own pie charts showing the constructs that contributed to them. The percentages listed are with respect to the total time. 3) '.csv' file with percentage of time spent within the different constructs. The script can be used as: $ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv Patch by Taru Doodi Differential Revision: https://reviews.llvm.org/D41838 llvm-svn: 328568	2018-03-26 18:44:48 +00:00
Andrey Churbanov	2d91a8a3ba	Fixed __kmpc_get_target_offload() to call library initialization. Differential Revision: https://reviews.llvm.org/D44793 llvm-svn: 328228	2018-03-22 18:51:51 +00:00
Jonathan Peyton	78f977fcd1	Read OMP_TARGET_OFFLOAD and provide API to access ICV Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD, if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function __kmpc_get_target_offload is provided. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D44577 llvm-svn: 328046	2018-03-20 21:18:17 +00:00
Andrey Churbanov	3336aa0d07	Fix for https://bugs.llvm.org/show_bug.cgi?id=36705 . Differential Revision: https://reviews.llvm.org/D44637 llvm-svn: 327875	2018-03-19 18:05:15 +00:00
Andrey Churbanov	9e9333aa8a	Improve OpenMP threadprivate implementation. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D41914 llvm-svn: 326733	2018-03-05 18:42:01 +00:00
Andrey Churbanov	75bc70fb56	Fixed build of the OpenMP stubs library. Differential Revision: https://reviews.llvm.org/D44019 llvm-svn: 326728	2018-03-05 18:01:47 +00:00
Jonas Hahnfeld	b0f051ae63	[OMPT] Fix interoperability test with GCC We have to ensure that the runtime is initialized _before_ waiting for the two started threads to guarantee that the master threads post their ompt_event_thread_begin before the worker threads. This is not guaranteed in the parallel region where one worker thread could start before the other master thread has invoked the callback. The problem did not happen with Clang because the generated code calls __kmpc_global_thread_num() and caches its result for functions that contain OpenMP pragmas. Differential Revision: https://reviews.llvm.org/D43882 llvm-svn: 326435	2018-03-01 14:03:18 +00:00
Joachim Protze	f5aebc27ad	[OMPT] Fix task-type test with GCC This is similar to D43882. The runtime needs to be initialized before calling print_ids(0) http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/60 Differential Revision: https://reviews.llvm.org/D43897 llvm-svn: 326428	2018-03-01 11:26:15 +00:00
Joachim Protze	aa2022e74f	[OMPT] Fix ompt_get_task_info() and add tests for it The thread_num parameter of ompt_get_task_info() was not being used previously, but needs to be set. The print_task_type() function (from the task-types.c testcase) was merged into the print_ids() function (in callback.h). Testing of ompt_get_task_info() was added to the task-types.c testcase. It was not tested extensively previously. Differential Revision: https://reviews.llvm.org/D42472 llvm-svn: 326338	2018-02-28 17:36:18 +00:00
Joachim Protze	4df80bda40	[OMPT] Fix inconsistent testcases The main change of this patch is to insert {{.*}} in current_address=[[RETURN_ADDRESS_END]]. This is needed to match any of the alternatively printed addresses. Additionally, clang-format is applied to the two tests. Differential Revision: https://reviews.llvm.org/D43115 llvm-svn: 326312	2018-02-28 09:28:51 +00:00
Jonas Hahnfeld	82768d0ba1	[OMPT] Fix parallel_data in implicit barrier-end This is required to be NULL for implicit barriers at the end of a parallel region. Noticed in review of D43191. Differential Revision: https://reviews.llvm.org/D43308 llvm-svn: 325922	2018-02-23 16:46:25 +00:00
Jonas Hahnfeld	5e44069857	[OMPT] Fix test tasks/serialized.c with optimization The compiler inlines the user code in the task. Check for that case at runtime by comparing the frame addresses and print the expected exit address. Also showcase how I think the OMPT tests could be reformatted to match LLVM's code style. In my opinion it would be great to do that kind of change to all tests that need to be touched for whatever reason... Differential Revision: https://reviews.llvm.org/D43191 llvm-svn: 325921	2018-02-23 16:46:11 +00:00
Joachim Protze	b0e4f87fb0	[OMPT] Omission in OMPT Formatting Applying clang-format to the /runtime/src/ folder Differential Revision: https://reviews.llvm.org/D42169 llvm-svn: 325424	2018-02-17 09:54:10 +00:00
Joachim Protze	33db70d2d7	[OMPT] Add interoperability testcase Test whether OMPT-callbacks for two threads that initiate a parallel region are correct. Differential Revision: https://reviews.llvm.org/D41942 llvm-svn: 325423	2018-02-17 09:40:08 +00:00
Joachim Protze	76899b84fe	[OMPT] Update api_calls testcase Only use ompt_ functions when testing OMPT in api_calls testcase. Add size parameter to print_list. Fix small bug in implementation of ompt_get_partition_place_nums(): return correct length. Differential Revision: https://reviews.llvm.org/D42162 llvm-svn: 325422	2018-02-17 09:40:02 +00:00
Jonas Hahnfeld	cc6d29d72c	[OMPT][test] Correct warning about added wrapper functions This affects all outlined functions, not just tasks! Only show warning when using Clang 5.0 or later. Differential Revision: https://reviews.llvm.org/D43190 llvm-svn: 325131	2018-02-14 15:15:24 +00:00
Joachim Protze	cfc98c2493	[OMPT] Add tool_available_search testcase Tests the search for tools as defined in the spec. The OMP_TOOL_LIBRARIES environment variable contains paths to the following files (in that order): - to a nonexistent file - to a shared library that does not have an ompt_start_tool function - to a shared library that has an ompt_start_tool implementation returning NULL - to a shared library that has an ompt_start_tool implementation returning a pointer to a valid instance of ompt_start_tool_result_t The expected result is that the last tool becomes active and can print in the thread-begin callback. Differential Revision: https://reviews.llvm.org/D42166 llvm-svn: 324588	2018-02-08 10:04:33 +00:00
Joachim Protze	9440c0ee3c	[OMPT] Add tool_not_available testcase Add a testcase that checks whether the runtime can handle an ompt_start_tool method that returns NULL, indicating that no tool shall be loaded. All tool_available testcases need a separate folder to avoid file conflicts for the generated tools. Differential Revision: https://reviews.llvm.org/D41904 llvm-svn: 324587	2018-02-08 10:04:28 +00:00
Joachim Protze	2a20299f91	[OMPT] Fix tool initialization returning 0 If tool initialization returns 0, OMPT should not be active. The current implementation provided some callback invocations in this case. Differential Revision: https://reviews.llvm.org/D42709 llvm-svn: 324320	2018-02-06 08:41:27 +00:00
Jonas Hahnfeld	723560d123	[OMPT] Use fuzzy return addresses in lock testcases Use fuzzy return addresses in lock testcases so that these testcases can also be run using the Intel Compiler. Patch by Simon Convent! Differential Revision: https://reviews.llvm.org/D41896 llvm-svn: 323529	2018-01-26 14:19:02 +00:00
Joachim Protze	e6269e3509	Partial revert of [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl The previous commit did not revert all replaced ompt_mutex_impl_unknown. llvm-svn: 322631	2018-01-17 11:13:11 +00:00
Joachim Protze	0c9516b36c	[OMPT] Add Workaround for Intel Compiler Bug Add Workaround for Intel Compiler Bug with Case#: 03138964 A critical region within a nested task causes a segfault in icc 14-18: int main() { #pragma omp parallel num_threads(2) #pragma omp master #pragma omp task #pragma omp task #pragma omp critical printf("test\n"); } When the critical region is in a separate function, the segfault does not occur. So we add noinline to make sure that the function call stays there. Differential Revision: https://reviews.llvm.org/D41182 llvm-svn: 322622	2018-01-17 10:06:06 +00:00
Joachim Protze	1b2bd2680b	[OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl The definition is not part of the spec and thus should not have the prefix "ompt_" but rather a prefix that indicates that this is implementation specific. Differential Revision: https://reviews.llvm.org/D41166 llvm-svn: 322621	2018-01-17 10:06:01 +00:00

675 Commits