llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonathan Peyton	20e13d4a38	Fix Hwloc API Incompatibility Older Hwloc libraries (< 1.10.0) don't offer the HWLOC_OBJ_NUMANODE nor HWLOC_OBJ_PACKAGE types. Instead they are named HWLOC_OBJ_NODE and HWLOC_OBJ_SOCKET. This patch just defines the newer names based on the older names when using an older Hwloc. Differential Revision: https://reviews.llvm.org/D32496 llvm-svn: 301349	2017-04-25 19:04:07 +00:00
George Rokos	c13df8e5e0	[OpenMP] Optimized default kernel launch parameters in CUDA plugin Differential Revision: https://reviews.llvm.org/D32321 llvm-svn: 301321	2017-04-25 16:34:13 +00:00
George Rokos	4800fc4363	[OpenMP] Add missing parenthesis which triggers a compile error Differential Revision: https://reviews.llvm.org/D32490 llvm-svn: 301318	2017-04-25 15:55:39 +00:00
George Rokos	d57681b703	[OpenMP] libomptarget: Set ref count for global objects to positive infinity Differential Revision: https://reviews.llvm.org/D32326 llvm-svn: 301076	2017-04-22 11:45:03 +00:00
George Rokos	f9cb9c18a0	[OpenMP] libomptarget: Remove obsolete negative device IDs -2/-3 Differential Revision: https://reviews.llvm.org/D32325 llvm-svn: 301075	2017-04-22 11:21:54 +00:00
George Rokos	6c79cc2198	[OpenMP] Run libomptarget regression tests using all available system threads. Differential Revision: https://reviews.llvm.org/D32327 llvm-svn: 301074	2017-04-22 11:20:20 +00:00
Andrey Churbanov	44fea6b864	Fix crash in invoking microtask on ios arm64. Patch by Ni Hui. Differential Revision: https://reviews.llvm.org/D31923 llvm-svn: 300448	2017-04-17 11:58:20 +00:00
Andrey Churbanov	4a9a89241b	KMP_HW_SUBSET extended with NUMA support when HWLOC enabled Differential Revision: https://reviews.llvm.org/D31600 llvm-svn: 300220	2017-04-13 17:15:07 +00:00
Olga Malysheva	80af9c081a	Test cancellation_for_sections.c expectedly fails on GCC llvm-svn: 299437	2017-04-04 14:39:52 +00:00
Olga Malysheva	dbdcfa127f	Reset cancellation status for 'parallel', 'sections' and 'for' constructs. Without this fix cancellation status for parallel, sections and for persists across construct boundaries. Differential Revision: https://reviews.llvm.org/D31419 llvm-svn: 299434	2017-04-04 13:56:50 +00:00
Olga Malysheva	b7784ebdf7	Test check-in, comment changed llvm-svn: 299428	2017-04-04 12:56:55 +00:00
Andrey Churbanov	31d39bfc5f	Fix for bug https://llvm.org/bugs/show_bug.cgi?id=32456 ITT Notify disabled for static build of OpenMP RTL. Differential Revision: https://reviews.llvm.org/D31466 llvm-svn: 299230	2017-03-31 16:20:07 +00:00
Andrey Churbanov	cece72aa04	Fix for bug https://llvm.org/bugs/show_bug.cgi?id=30889 Condition adjusted for Debug assertion. Differential Revision: https://reviews.llvm.org/D29638 llvm-svn: 298915	2017-03-28 13:35:42 +00:00
Paul Osmialowski	0788515cb1	GOMP compatibility: add missing OpenMP4.0 task deps handling code Differential Revision: https://reviews.llvm.org/D31071 llvm-svn: 298605	2017-03-23 15:03:17 +00:00
George Rokos	01954092d0	[OpenMP] CUDA plugin: More descriptive error messages Differential Revision: https://reviews.llvm.org/D31206 llvm-svn: 298527	2017-03-22 17:36:22 +00:00
George Rokos	ba7380bf6c	[OpenMP] Allow multiple weak symbols to be loaded from the fat binary For compatibility with Fortran. Differential Revision: https://reviews.llvm.org/D31205 llvm-svn: 298516	2017-03-22 16:43:40 +00:00
George Rokos	f3fe2dd235	[OpenMP] CUDA plugin: add include directory for libelf Allow the user to manually specify where libelf is installed. Differential Revision: https://reviews.llvm.org/D31207 llvm-svn: 298515	2017-03-22 16:41:46 +00:00
George Rokos	00749e0fd4	[OpenMP] libomptarget: Disable on MacOS X Disable compilation of libomptarget on MacOS X. Differential Revision: https://reviews.llvm.org/D31055 llvm-svn: 298411	2017-03-21 18:19:09 +00:00
Andrey Churbanov	435b419d26	Fixed intermittent hang on tests with "target teams if(0)" construct with no parallel inside. Differential Revision: https://reviews.llvm.org/D29597 llvm-svn: 298373	2017-03-21 13:48:52 +00:00
Andrey Churbanov	3b939d070c	Stride in distribute parallel for loops with no chunk size. Patch by George Rokos. Differential Revision: https://reviews.llvm.org/D24486 llvm-svn: 298362	2017-03-21 12:17:22 +00:00
Jonathan Peyton	35d75aeda2	Minor improvement of KMP_YIELD_NOW() macro. This change slightly improves performance of KMP_YIELD_NOW() macro, by using _rdtsc() intrinsic function if possible. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D31008 llvm-svn: 298314	2017-03-20 22:11:31 +00:00
Jonathan Peyton	16fd8fec76	Fix incorrect initial value of __kmp_affinity_type. Affinity initialization code expects __kmp_affinity_type has the value affinity_default by default, but the cleanup code does not properly set the value back to affinity_default. This may introduce some issues when multiple roots are trying to initialize/uninitialize the runtime successively. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D31012 llvm-svn: 298313	2017-03-20 22:04:02 +00:00
Andrey Churbanov	a193ae19e1	Create a git ignore file for openmp runtime. Patch by Guansong Zhang. Differential Revision: https://reviews.llvm.org/D30784 llvm-svn: 297562	2017-03-11 13:05:08 +00:00
Jonathan Peyton	de8d65914b	Fix assertion failure when 'proclist' is used without 'explicit' in KMP_AFFINITY This change fixes an assertion failure in the case KMP_AFFINITY is set with 'proclist' specified but without 'explicit' e.g., KMP_AFFINITY=verbose,proclist=[0-31] Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D30404 llvm-svn: 297480	2017-03-10 17:22:47 +00:00
Dan Albert	1dc735bf64	Fix GNU strerror_r check for Android. Summary: Bionic didn't get a GNU style strerror_r until Android M. Until then we unconditionally exposed the POSIX one. Expand the check to account for this. Reviewers: pirama, AndreyChurbanov, jlpeyton Reviewed By: jlpeyton Subscribers: openmp-commits, srhines Differential Revision: https://reviews.llvm.org/D30056 llvm-svn: 297235	2017-03-07 22:18:05 +00:00
Jonathan Peyton	e844a54a85	OpenMP version 5.0 added Add build option LIBOMP_OMP_VERSION=50, 5.0 headers, and add the year/month associated with OpenMP 5.0 in relevant source locations. Also, remove the deprecated LIBOMP_OMP_VERSION=41 option. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D30450 llvm-svn: 297083	2017-03-06 22:07:40 +00:00
Jonathan Peyton	41d3800d71	Mixed type atomic routines added to Windows DLL Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D30408 llvm-svn: 297082	2017-03-06 21:46:36 +00:00
Paul Osmialowski	1e254c5295	Add AArch64 support. This adds AArch64 support to recently added part of the runtime responsible for offloading to target. This piece of code allows offloading-to-self on AArch64 machines. Differential Revision: https://reviews.llvm.org/D30644 llvm-svn: 297070	2017-03-06 21:00:07 +00:00
Jonathan Peyton	928b8ea203	Removing couple unnecessary architecture guards. This section of code (__kmp_test_then_* functions) is guarded by (KMP_ARCH_X86 || KMP_ARCH_X86_64) so it does not make sense to have other architecture guards inside this section. Non-x86 architectures always use intrinsics (__sync_*) llvm-svn: 296525	2017-02-28 21:43:28 +00:00
Michal Gorny	018d13597a	[test] Try to link -latomic to provide atomics when available When using -rtlib=libgcc, the fallback implementation of __atomic_* builtins is provided via libatomic (included in GCC). However, neither GCC itself nor clang link libatomic implicitly, and it seems that GCC upstream expects projects to link it explicitly as necessary. Since compiler-rt provides __atomic_* builtins directly in the main library, check if they are provided by the default libraries first. If they are not, check if -latomic is available to provide them and add explicit -latomic for tests in this case. This fixes unresolved __atomic_load() references when running openmp tests on i386 with libgcc backend. Differential Revision: https://reviews.llvm.org/D30083 llvm-svn: 296183	2017-02-24 22:15:24 +00:00
George Rokos	63efdd9e1e	[OpenMP] Missing virtual destructor in KMPAffinity Added virtual destructor in a class containing virtual functions. Differential Revision: https://reviews.llvm.org/D30271 llvm-svn: 295896	2017-02-22 22:50:28 +00:00
Jonathan Peyton	12ecbb35eb	[stats] add stats-gathering for static_steal scheduling method Add counter to count number of static_steal for loops Add counter for number of chunks executed per static_steal for loop Add counter for number of chunks stolen per static_steal for loop llvm-svn: 295461	2017-02-17 17:06:16 +00:00
Andrey Churbanov	72ba210916	Run-time library part of OpenMP 5.0 task reduction implementation. Added test kmp_task_reduction_nest.cpp which has an example of possible compiler codegen. Differential Revision: https://reviews.llvm.org/D29600 llvm-svn: 295343	2017-02-16 17:49:49 +00:00
Andrey Churbanov	ad3f63986d	Added an option to bind initial thread at the start of application via setting environment variable KMP_INITIAL_THREAD_BIND=1. Differential Revision: https://reviews.llvm.org/D29665 llvm-svn: 295339	2017-02-16 17:08:40 +00:00
George Rokos	15a6e7daab	[OpenMP] libomptarget: Protect parent struct from being deallocated Fixed bug due to which a parent struct was deallocated when one of the struct's pointers was being unmapped. Differential Revision: https://reviews.llvm.org/D29914 llvm-svn: 295231	2017-02-15 20:45:37 +00:00
Jonathan Peyton	581fdbaad4	Enable yield cycle on Linux This change allows the runtime to turn __kmp_yield() on/off repeatedly on Linux. This feature was removed when disabling monitor thread, but there are applications that perform better with this feature on. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D29227 llvm-svn: 295203	2017-02-15 17:19:21 +00:00
Jonas Hahnfeld	35801a2470	[OpenMP] New Tsan annotations to remove false positive on reduction and barriers Added new ThreadSanitizer annotations to remove false positives with OpenMP reduction. Cleaned up Tsan annotations header file from unused annotations. Patch by Simone Atzeni! Differential Revision: https://reviews.llvm.org/D29202 llvm-svn: 295158	2017-02-15 08:14:22 +00:00
Hans Wennborg	1155ad1bc8	libomptarget: Disable on Win32 It's not supported, and currently breaks the weekly LLVM snapshot builds. Differential Revision: https://reviews.llvm.org/D29801 llvm-svn: 294758	2017-02-10 17:13:28 +00:00
Jonas Hahnfeld	a620a0745f	[libomptarget] Align test code with runtime/ This change allows setting LIBOMPTARGET_LLVM_LIT_EXECUTABLE and LIBOMPTARGET_FILECHECK_EXECUTABLE as full path. It also honors OPENMP_LLVM_TOOLS_DIR which is meant as a common configuration for both libomp and libomptarget. Maybe this should be done in a common CMake module, but I'm no expert here. Differential Revision: https://reviews.llvm.org/D29172 llvm-svn: 294284	2017-02-07 06:58:15 +00:00
Andrey Churbanov	581490e713	Fix a race in shutdown when tasking is used. Patch by Terry Wilmarth. Differential Revision: https://reviews.llvm.org/D28377 llvm-svn: 294214	2017-02-06 18:53:32 +00:00
George Rokos	efd412f8fe	[OpenMP] Redefined macro warning in libomptarget Fixed compilation warning in libomptarget. Differential Revision: https://reviews.llvm.org/D29353 llvm-svn: 293747	2017-02-01 08:33:38 +00:00
George Rokos	3de4cd1281	[OpenMP] Initial implementation of OpenMP offloading library - libomptarget plugins. This is the patch upstreaming the plugins part of libomptarget (CUDA, generic-elf-64). Differential Revision: https://reviews.llvm.org/D14253 llvm-svn: 293724	2017-02-01 00:14:41 +00:00
Jonas Hahnfeld	479088eefa	Correct wrong comment in bug_nested_proxy_task.c The nested proxy task does not have dependencies. llvm-svn: 293472	2017-01-30 09:51:02 +00:00
Jonas Hahnfeld	49739f0e32	[libomptarget] Fix Debug build with glibc < 2.18 glibc < 2.18 is C99 compliant and only provides the format macros in C++ if __STDC_FORMAT_MACROS is defined. This change fixes the debug build for GCC 4.8, GCC 6.2 and Clang 3.9.1 that were previously broken on my machine. It shows no regression for libc++ >= 4.0.0 which has a fix since September: http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20160926/171659.html llvm-svn: 293468	2017-01-30 08:11:20 +00:00
Jonathan Peyton	12313d44cf	Cleanup: put i_maxmin members and ___kmp_size_type into traits_t Put the duplicated i_maxmin into traits_t by adding new members max_value and min_value. Put ___kmp_size_type into traits_t by adding member type_size. Differential Revision: https://reviews.llvm.org/D28847 llvm-svn: 293316	2017-01-27 18:09:22 +00:00
Jonathan Peyton	3061e3e454	Printing OS thread id, when KMP_AFFINITY is set. Patch by Vishakha Agrawal Differential Revision: https://reviews.llvm.org/D28873 llvm-svn: 293315	2017-01-27 18:04:33 +00:00
Jonathan Peyton	2208a85101	Fix performance issue incurred by removing monitor thread. When the monitor thread is used, most threads in the team directly go to sleep if the copy of bt_intervals/bt_set is not available in the cache, and this happens at least once per thread in the wait function, making the overall performance slightly better. This change tries to mimic this behavior by using the bt_intervals cache, which simply keeps the blocktime interval in terms of the platform-dependent ticks or nanoseconds. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D28906 llvm-svn: 293312	2017-01-27 17:54:31 +00:00
Jonas Hahnfeld	cfe5ef589a	[libomptarget] Fix compilation with libc++ iterator is only guaranteed to be default-constructible, without any argument. Differential Revision: https://reviews.llvm.org/D29171 llvm-svn: 293277	2017-01-27 11:03:33 +00:00
George Rokos	2467df6e4f	[OpenMP] Initial implementation of OpenMP offloading library - libomptarget. This is the patch upstreaming the device-agnostic part of libomptarget. Differential Revision: https://reviews.llvm.org/D14031 llvm-svn: 293094	2017-01-25 21:27:24 +00:00
Jonathan Peyton	3692fcf665	Use C++11 static_assert() for build asserts. llvm-svn: 292350	2017-01-18 07:49:30 +00:00
Jonathan Peyton	7f976d556a	Fix memory error in case of reinit using kmp_set_defaults() for lock code. The lock tables were being reallocated if kmp_set_defaults() was called. In the env_init code it says that the user should be able to switch between different KMP_CONSISTENCY_CHECK values which is what this change enables. llvm-svn: 292349	2017-01-18 07:02:21 +00:00
Jonathan Peyton	d0365a228c	Fix small memory leak regarding __kmp_nested_proc_bind There is no corresponding free() for this expandable array. The logic is added in __kmp_cleanup() next to the freeing of __kmp_nested_nth. llvm-svn: 292348	2017-01-18 06:40:19 +00:00
Jonas Hahnfeld	c9a8a6c030	kmp_affinity: Fix check if specific bit is set Clang 4.0 trunk warns: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses] This points to a potential bug if the code really wants to check if the single bit is not set: If for example (buf.edx >> 9) = 2 (has any bit set except the least significant one), 'logical not' will return 0 which stays 0 after the 'bitwise and'. To do this correctly we first need to evaluate the 'bitwise and'. In that case it returns 2 & 1 = 0 which after the 'logical not' evaluates to 1. Differential Revision: https://reviews.llvm.org/D28599 llvm-svn: 291764	2017-01-12 11:39:04 +00:00
Jonas Hahnfeld	49152b3f06	[CMake] Make openmp build under runtimes/ runtimes/CMakeLists.txt in LLVM passes OPENMP_STANDALONE_BUILD. Differential Revision: https://reviews.llvm.org/D28280 llvm-svn: 290978	2017-01-04 18:11:37 +00:00
Andrey Churbanov	76d4285460	Fix for the __kmpc_global_num_threads function to return the value of the __kmp_all_nth global var. Patch by Yonghong Yan. Differential Revision: https://reviews.llvm.org/D27975 llvm-svn: 290272	2016-12-21 21:20:20 +00:00
Oren Ben Simhon	c11addb506	Reverting last change. llvm-svn: 290245	2016-12-21 09:04:08 +00:00
Oren Ben Simhon	016f2af3c7	[X86] Vectorcall Calling Convention - Adding CodeGen Complete Support Fixing build issues. llvm-svn: 290242	2016-12-21 08:58:19 +00:00
Jonathan Peyton	de4749b748	Follow up to r289732: Update comments in source files to reference .cpp files Patch by Hansang Bae llvm-svn: 289739	2016-12-14 23:01:24 +00:00
Jonathan Peyton	7cc577a4ef	Change source files from .c to .cpp Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D26688 llvm-svn: 289732	2016-12-14 22:39:11 +00:00
Andrey Churbanov	5dee8c43da	Cleanup: debug print fixed and moved inside critical section. Patch by Victor Campos. Differential Revision: https://reviews.llvm.org/D27647 llvm-svn: 289640	2016-12-14 08:29:00 +00:00
Sylvestre Ledru	cd9d374337	Support of mips & mips64 for openmprtl Summary: Implemented by Dejan Latinovic See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790735 for more information Reviewers: AndreyChurbanov, jlpeyton Subscribers: openmp-commits, mgorny Differential Revision: https://reviews.llvm.org/D26576 llvm-svn: 289032	2016-12-08 09:22:24 +00:00
Andrey Churbanov	e0a2c3e99a	fixed type in Windows-specific code llvm-svn: 288368	2016-12-01 16:08:52 +00:00
Jonathan Peyton	a88e8358af	Fixed typo in kmp_process_deps trace output Patch by Victor Campos Differential Revision: https://reviews.llvm.org/D27172 llvm-svn: 288056	2016-11-28 20:10:32 +00:00
Andrey Churbanov	bcadbd6302	Cleanup: memory leaks on warnings printing fixed; some memory freeing cleaned; poor indents and one typo fixed. Patch by Victor Campos. Differential Revision: https://reviews.llvm.org/D26786 llvm-svn: 288054	2016-11-28 19:23:09 +00:00
Jonathan Peyton	96fe1aa380	Set task->td_dephash to NULL after free llvm-svn: 287552	2016-11-21 16:24:59 +00:00
Jonathan Peyton	7ca7ef0478	Fix for D25504 - segfault because of double free()-ing in shutdown code. Paul Osmialowski pointed out a double free bug in shutdown code. This patch moves the freeing of the implicit task to above the freeing of all fast memory to prevent the double-free issue. Differential Revision: https://reviews.llvm.org/D26860 llvm-svn: 287551	2016-11-21 16:18:57 +00:00
Jonathan Peyton	5375fe820c	Update stats-gathering code Have developer timers use partitioning scheme which also required that some redundant developer timers be removed in favor of the already existing normal timers. Move per thread stats initialization to just after global thread id assignment which is as early as possible. Also put all global stats initialization code in __kmp_stats_init() and all global stats destruction code in __kmp_stats_fini(). Differential Revision: https://reviews.llvm.org/D26361 llvm-svn: 286892	2016-11-14 21:13:44 +00:00
Jonathan Peyton	1cdd87adfd	Introduce dynamic affinity dispatch capabilities This set of changes enables the affinity interface (Either the preexisting native operating system or HWLOC) to be dynamically set at runtime initialization. The point of this change is that we were seeing performance degradations when using HWLOC. This allows the user to use the old affinity mechanisms which on large machines (>64 cores) makes a large difference in initialization time. These changes mostly move affinity code under a small class hierarchy: KMPAffinity class Mask {} KMPNativeAffinity : public KMPAffinity class Mask : public KMPAffinity::Mask KMPHwlocAffinity class Mask : public KMPAffinity::Mask Since all interface functions (for both affinity and the mask implementation) are virtual, the implementation can be chosen at runtime initialization. Differential Revision: https://reviews.llvm.org/D26356 llvm-svn: 286890	2016-11-14 21:08:35 +00:00
Andrey Churbanov	1fbb482928	Added check for malloc return. Patch by Victor Campos. Differential Revision: https://reviews.llvm.org/D26318 llvm-svn: 286441	2016-11-10 09:08:03 +00:00
Jonas Hahnfeld	50fed0475f	[OpenMP] Enable ThreadSanitizer to check OpenMP programs This patch allows ThreadSanitizer (Tsan) to verify OpenMP programs. It means that no false positive will be reported by Tsan when verifying OpenMP programs. This patch introduces annotations within the OpenMP runtime module to provide information about thread synchronization to the Tsan runtime. In order to enable the Tsan support when building the runtime, you must enable the TSAN_SUPPORT option with the following CMake option: -DLIBOMP_TSAN_SUPPORT=TRUE The annotations will be enabled in the main shared library (same mechanism of OMPT). Patch by Simone Atzeni and Joachim Protze! Differential Revision: https://reviews.llvm.org/D13072 llvm-svn: 286115	2016-11-07 15:58:36 +00:00
Andrey Churbanov	4d49312cad	fixed typo in comment llvm-svn: 285947	2016-11-03 17:48:46 +00:00
Andrey Churbanov	753fa0468c	Change task stealing to always get task from head of victim's deque. Differential Revision: https://reviews.llvm.org/D26187 llvm-svn: 285833	2016-11-02 16:45:25 +00:00
Andrey Churbanov	51107e0abc	Fixed problem introduced by part of https://reviews.llvm.org/D21196 . Check Task Scheduling Constraint (TSC) on stealing of untied task. This is needed because the untied task can produce tied children which can break TSC if untied is not a descendant of current task. This can cause live lock on complex tasking tests (e.g. kastors/strassen-task-dep). Differential Revision: https://reviews.llvm.org/D26182 llvm-svn: 285703	2016-11-01 16:19:04 +00:00
Andrey Churbanov	dd313b0673	Add more conditions to check whether task waiting is necessary in kmp_omp_taskwait. Differential Revision: https://reviews.llvm.org/D26058 Patch by Victor Campos llvm-svn: 285678	2016-11-01 08:33:36 +00:00
Andrey Churbanov	df0d75edf6	Fixed a memory leak related to task dependencies. Differential Revision: http://reviews.llvm.org/D25504 Patch by Alex Duran. llvm-svn: 285283	2016-10-27 11:43:07 +00:00
Jonathan Peyton	3c4050d698	Fixing typos in __kmp_release_deps trace outputs Patch by Victor Campos Differential Revision: https://reviews.llvm.org/D25972 llvm-svn: 285244	2016-10-26 21:46:43 +00:00
Jonathan Peyton	762bc46224	Use getpagesize() instead of PAGE_SIZE macro when KMP_OS_LINUX is true Patch by Victor Campos Differential Revision: https://reviews.llvm.org/D26001 llvm-svn: 285243	2016-10-26 21:42:48 +00:00
Andrey Churbanov	2e68768d1e	Fixed memory leak mistakenly introduced by https://reviews.llvm.org/D23115 Differential Revision: http://reviews.llvm.org/D25510 llvm-svn: 284747	2016-10-20 17:14:17 +00:00
Samuel Antao	335151914a	[OpenMP] Fix issue with directives used in a macro. Summary: If directives are used in a macro, clang complains with: ``` src/projects/openmp/runtime/src/kmp_runtime.c:7486:2: error: embedding a directive within macro arguments has undefined behavior [-Werror,-Wembedded-directive] #if KMP_USE_MONITOR ``` This patch fixes two occurrences of the issue in `kmp_runtime.cpp`. Reviewers: tlwilmar, jlpeyton, AndreyChurbanov, Hahnfeld Subscribers: Hahnfeld, openmp-commits Differential Revision: https://reviews.llvm.org/D25823 llvm-svn: 284728	2016-10-20 13:20:17 +00:00
Jonathan Peyton	0ac7b75f7b	Fix OpenMP 4.0 library build Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D25505 llvm-svn: 284499	2016-10-18 17:39:06 +00:00
Michal Gorny	efc536ee9d	Fix a compile error on musl-libc due to strerror_r() prototype Function strerror_r() has different signatures in different implementations of libc: glibc's version returns a char*, while BSDs and musl return a int. libomp unconditionally assumes glibc on Linux and thus fails to compile against musl-libc. This patch addresses this issue. Differential Revision: https://reviews.llvm.org/D25071 llvm-svn: 284492	2016-10-18 16:38:44 +00:00
Jonathan Peyton	55466e9106	Mixed type atomic routines added for capture and update/capture reverse. New mixed type atomic routines added for regular capture operations as well as reverse update/capture operations. LHS - all integer and float types (no complex so far), RHS - float16. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D25275 llvm-svn: 284489	2016-10-18 16:20:55 +00:00
Jonathan Peyton	e1c7c13c3d	Code cleanup for the runtime without monitor thread This change removes/disables unnecessary code when monitor thread is not used. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D25102 llvm-svn: 283577	2016-10-07 18:12:19 +00:00
Jonathan Peyton	a1234cf280	Enable omp_get_schedule() to return static steal type. As the code is now, calling omp_get_schedule() when OMP_SCHEDULE=static_steal will cause an assert. llvm-svn: 283576	2016-10-07 18:01:35 +00:00
Paul Osmialowski	7a9c29e4b8	[cmake] Fix for a bug https://llvm.org/bugs/show_bug.cgi?id=30489 "Cannot build with -DLIBOMP_FORTRAN_MODULES=True" Differential Revision: https://reviews.llvm.org/D24959 llvm-svn: 282965	2016-09-30 22:05:45 +00:00
Jonathan Peyton	66e212ce2b	Insert missing checks for KMP_AFFINITY_CAPABLE() in affinity API. If affinity is not capable, then these API functions will perform the stubs version. llvm-svn: 282947	2016-09-30 20:56:44 +00:00
Michal Gorny	3ccf825e22	[test] Support 'lit' executable name Support finding lit as plain 'lit', which is the name used by setup.py in LLVM's utils/lit. Differential Revision: https://reviews.llvm.org/D25072 llvm-svn: 282876	2016-09-30 16:56:16 +00:00
Jonathan Peyton	74f3ffce24	Fix incorrect OpenMP version in Fortran module. Add check for "45" version to use "201511" string for OpenMP 4.5, otherwise "200505" is used in Fortran module. Also, fix kmp_openmp_version variable (used for the debugger, e.g.) and kmp_version_omp_api that is used in KMP_VERSION=1 output. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D24761 llvm-svn: 282868	2016-09-30 15:50:14 +00:00
Jonathan Peyton	be31337e9d	Mixed type atomic routines for unsigned integers. New routines should be used for atomics like "<int>OP=<float>" when <int> is unsigned. Using functions __kmpc_atomic_fixed<bits>_<op>_fp produces incorrect results. Differential Revision: https://reviews.llvm.org/D24756 llvm-svn: 282509	2016-09-27 17:38:48 +00:00
Jonathan Peyton	b66d1aab25	Disable monitor thread creation by default. This change set disables creation of the monitor thread by default. The global counter maintained by the monitor thread was replaced by logic that uses system time directly, and cyclic yielding on Linux target was also removed since there was no clear benefit of using it. Turning on KMP_USE_MONITOR variable (=1) enables creation of monitor thread again if it is really necessary for some reasons. Differential Revision: https://reviews.llvm.org/D24739 llvm-svn: 282507	2016-09-27 17:11:17 +00:00
Michal Gorny	cd2bfb1e7c	Fix respecting LIBOMP_LLVM_LIT_EXECUTABLE as full path Fix lit search to correctly respect LIBOMP_LLVM_LIT_EXECUTABLE as full program path. The variable passed to find_program() is created by CMake as a cache variable, and therefore can be directly overridden by the user. Since this was the design of LIBOMP_LLVM_LIT_EXECUTABLE (as can be deduced from the error messages) and there is no other use of LIT_EXECUTABLE, remove the redundant variable and pass LIBOMP_LLVM_LIT_EXECUTABLE directly to find_program(). Furthermore, the previous code did not work since the HINTS argument specifies more search directories rather than expected full path. Quoting the CMake documentation: > 3. Search the paths specified by the HINTS option. These should be > paths computed by system introspection, such as a hint provided by > the location of another item already found. Hard-coded guesses should > be specified with the PATHS option. Differential Revision: https://reviews.llvm.org/D24710 llvm-svn: 281887	2016-09-19 06:55:56 +00:00
Michal Gorny	23132ebb0e	[cmake] Make libgomp & libiomp5 alias install optional Introduce a new LIBOMP_INSTALL_ALIASES cache variable that can be used to disable creating libgomp and libiomp5 aliases on 'make install'. Those aliases are undesired e.g. on Gentoo systems where libomp is used purely by clang. Differential Revision: https://reviews.llvm.org/D24563 llvm-svn: 281512	2016-09-14 17:46:27 +00:00
Jonas Hahnfeld	848d690697	[OMPT] fix task frame information for gomp interface Previous differentials D23305-D23310 changed task frame information management only for the kmp interface, but not for the whole gomp interface. This broke some testcases when building with gcc. This patch fixes the broken task frame information for the gomp interface. Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D24502 llvm-svn: 281468	2016-09-14 13:59:39 +00:00
Jonas Hahnfeld	dd9a05d5d8	[OMPT] save exit address to lwt if available In case, the current team is a serialized team (lwt), the frame information should be written to this data structure. Before, nested serialized teams would overwrite the same task information. Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23310 llvm-svn: 281467	2016-09-14 13:59:31 +00:00
Jonas Hahnfeld	28ea24bba7	[OMPT] fix __ompt_get_teaminfo to consult lwt entries of parent teams The comment already states that this function should work similarly to __ompt_get_taskinfo. The function only looked for lwt entries of the current team, but not when unrolling the parents. This fix aligns the implementation to __ompt_get_taskinfo. The new test case creates a single threaded team (->lwt) and then a nested active team. Before the innermost print_id(1) would deliver a different team than the outer print_id(0). Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23309 llvm-svn: 281466	2016-09-14 13:59:24 +00:00
Jonas Hahnfeld	8a27064e05	[OMPT] Reset task exit frame when execution is finished The exit address is set when execution of a task is started and should be reset as soon as the execution is finished. Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painful, so reset just after the invocation. The testcase shows the effect of this patch: Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task. This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task. Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23307 llvm-svn: 281465	2016-09-14 13:59:19 +00:00
Jonas Hahnfeld	fd0614d830	[OMPT] Align implementation of reenter frame address to latest (frozen) version of OMPT spec The latest OMPT spec changed the semantic of a task's reenter frame to be the application frame that will be entered when the runtime frame drops. Before it was the last frame in the runtime. This doesn't work for some gcc execution paths or even clang generated code for : Since there is no runtime frame between the executed task and the encountering task. The test case compares exit and reenter addresses against addresses captured in application code Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23305 llvm-svn: 281464	2016-09-14 13:59:13 +00:00
Jonas Hahnfeld	464cdca9d3	[OMPT] extend ompt tests by checks for frame pointers OMPT tests can check for right frame information of tasks: * parent_task_frame was directly printed as a pointer, but actually points to a struct ompt_frame {void, void} * NULL is printed in the beginning of execution and loaded to FileChecker variable [[NULL]] * implicit tasks now also print their frame information * macro to print frame address from application * print task info for barrier begin Patch by Joachim Protze! Differential Revision: https://reviews.llvm.org/D23304 llvm-svn: 281463	2016-09-14 13:59:05 +00:00
Jonathan Peyton	7c465a5f41	Fix bitmask upper bounds check Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we use the get_max_proc() function which can vary based on the operating system. For example on Windows with multiple processor groups, it might be the case that the highest bit possible in the bitmask is not equal to the number of hardware threads on the machine but something higher than that. Differential Revision: https://reviews.llvm.org/D24206 llvm-svn: 281245	2016-09-12 19:02:53 +00:00
George Rokos	118de30b44	[OPENMP] ppc64le recognized as big-endian There is a bug in CMakeLists which causes powerpc64le systems to be recognized as big-endian. This patch fixes the issue. Differential Revision: https://reviews.llvm.org/D23626 llvm-svn: 281068	2016-09-09 18:04:23 +00:00
George Rokos	28f31b405e	[OPENMP] Implementation of omp_get_default_device and omp_set_default_device Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device. Also, added support for the environment variable OMP_DEFAULT_DEVICE. Differential Revision: https://reviews.llvm.org/D23587 llvm-svn: 281065	2016-09-09 17:55:26 +00:00
Jonathan Peyton	e6abe52905	Move function into cpp file under KMP_AFFINITY_SUPPORTED guard. When affinity isn't supported, __kmp_affinity_compact doesn't exist. The problem is that in kmp_affinity.h there is a function which uses it without the proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on it, but I think it is cleaner to have it under the proper guard. Since the function is only used in the kmp_affinity.cpp file and there aren't any plans to have it elsewhere. I have moved it there. llvm-svn: 280542	2016-09-02 20:54:58 +00:00
Jonathan Peyton	9e69696f5a	Decouple the kmp_affin_mask_t type from determining if affinity is capable the __kmp_affinity_determine_capable() functions are highly operating system specific. This change has the functions use the type they expect explicitly. llvm-svn: 280538	2016-09-02 20:35:47 +00:00
Jonathan Peyton	788c5d65e8	Replace a bad instance of __kmp_free() with KMP_CPU_FREE_ARRAY() macro. llvm-svn: 280530	2016-09-02 19:37:12 +00:00
Jonathan Peyton	5c32d5ef0d	Use 'critical' reduction method when 'atomic' is not available but requested. In case atomic reduction method is not available (the compiler can't generate it) the assertion failure occurred if KMP_FORCE_REDUCTION=atomic was specified. This change replaces the assertion with a warning and sets the reduction method to the default one - 'critical'. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D23990 llvm-svn: 280519	2016-09-02 18:29:45 +00:00
Jonathan Peyton	0af717970c	Appease older gcc compilers for the many-microtask-args.c test Older gcc compilers error out with the C99 syntax of: for (int i =...) so this change just moves the int i; declaration up above. llvm-svn: 280138	2016-08-30 19:28:58 +00:00
Andrey Churbanov	b35be69ff5	cleanup: fixed names of dummy arguments of Fortran interfaces declarations, no functional changes done llvm-svn: 278951	2016-08-17 18:18:21 +00:00
Andrey Churbanov	d6e1d7e521	Fixes for hierarchical barrier (possible hang if team size changed). Differential Revision: http://reviews.llvm.org/D23175 llvm-svn: 278332	2016-08-11 13:04:00 +00:00
Dimitry Andric	70ba8c506c	Fix linking of omp_foreign_thread_team_reuse test on FreeBSD Summary: On FreeBSD, linking the misc_bugs/omp_foreign_thread_team_reuse.c test case fails with: /usr/local/bin/ld: /tmp/omp_foreign_thread_team_reuse-c5e71b.o: undefined reference to symbol 'pthread_create@@FBSD_1.0' This is because the program is linked without `-lpthread`. Since the %libomp-compile-and-run macro does not allow that option to be added to the compile command line, split it up and add the required `-lpthread` between %libomp-compile and %libomp-run. Reviewers: jlpeyton, hfinkel, Hahnfeld Subscribers: Hahnfeld, emaste, openmp-commits Differential Revision: https://reviews.llvm.org/D23084 llvm-svn: 278036	2016-08-08 18:34:05 +00:00
Jonas Hahnfeld	ad0c42e3a9	kmp_gsupport: Fix library initialization with taskgroup Differential Revision: https://reviews.llvm.org/D23259 llvm-svn: 278003	2016-08-08 13:23:08 +00:00
Jonas Hahnfeld	ca32babfa7	Mark tests with task dependencies as unsupported with GCC llvm-svn: 277996	2016-08-08 11:52:49 +00:00
Jonas Hahnfeld	bedc371c9d	Do not block on explicit task depending on proxy task Consider the following code: int dep; #pragma omp target nowait depend(out: dep) { sleep(1); } #pragma omp task depend(in: dep) { printf("Task with dependency\n"); } printf("Doing some work...\n"); In its current state the runtime will block on the second task and not continue execution. Differential Revision: https://reviews.llvm.org/D23116 llvm-svn: 277992	2016-08-08 10:08:14 +00:00
Jonas Hahnfeld	69f8511f8f	__kmp_free_task: Fix for serial explicit tasks producing proxy tasks Consider the following code which may be executed by a serial team: int dep; #pragma omp target nowait depend(out: dep) { sleep(1); } #pragma omp task depend(in: dep) { #pragma omp target nowait { sleep(1); } } Here the explicit task may not be freed until the nested proxy task has finished. The current code hasn't considered this and called __kmp_free_task anyway which triggered an assert because of remaining incomplete children: KMP_DEBUG_ASSERT( TCR_4(taskdata->td_incomplete_child_tasks) == 0 ); Differential Revision: https://reviews.llvm.org/D23115 llvm-svn: 277991	2016-08-08 10:08:07 +00:00
Andrey Churbanov	5bf494e73d	Fixed x2APIC discovery for 256-processor architectures. Mask for value read from ebx register returned by CPUID expanded to 0xFFFF. Differential Revision: https://reviews.llvm.org/D23203 llvm-svn: 277825	2016-08-05 15:59:11 +00:00
Jonas Hahnfeld	d1f4b8f6e8	Add test case for nested creation of tasks For discussion in D23115 llvm-svn: 277730	2016-08-04 14:55:56 +00:00
Jonas Hahnfeld	20236611d4	kmp_taskdeps.cpp: Fix debugging output node->dn.task is only filled after the dependencies are already processed. This currently leads to unhelpful output from KA_TRACE or even a crash if one enables KMP_SUPPORT_GRAPH_OUTPUT. llvm-svn: 277717	2016-08-04 11:03:47 +00:00
Pirama Arumuga Nainar	0554d25eb3	Disable KMP_CANCEL_THREADS on Android Summary: Android does not have pthread_cancel. Disable KMP_CANCEL_THREADS if __ANDROID__ is defined. Subscribers: tberghammer, srhines, openmp-commits, danalbert Differential Revision: https://reviews.llvm.org/D23029 llvm-svn: 277618	2016-08-03 18:08:57 +00:00
Paul Osmialowski	ecbe2ea002	Make balanced affinity work on AArch64. This patch enables balanced affinity on machines that do not have hardware threads and have cores clustered into packages. In fact, the balancing algorithm could be generalized for any arrangement with at least two levels of hierarchy (depth > 1). Differential Revision: https://reviews.llvm.org/D22365 llvm-svn: 277212	2016-07-29 20:55:03 +00:00
Samuel Antao	71fef77dcb	Replace enum types in variadic functions by built-in types. Summary: When compiling the runtime library with clang we get warnings like: ``` error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs] va_start( args, id ); ^ note: parameter of type 'kmp_i18n_id_t' (aka 'kmp_i18n_id') is declared here kmp_i18n_id_t id, ``` My understanding is that the va_start macro only gets the promoted type so it won't know what was the exact type of the argument, which can potentially not work for some targets given that the implementation of the calling convention could not be done properly. This patch fixes that by using a built-in type in the function signature. Reviewers: tlwilmar, jlpeyton, AndreyChurbanov Subscribers: arpith-jacob, carlo.bertolli, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D22427 llvm-svn: 276428	2016-07-22 16:05:35 +00:00
Andrey Churbanov	429dbc2ad2	http://reviews.llvm.org/D22134 : Implementation of OpenMP 4.5 nonmonotonic schedule modifier llvm-svn: 275052	2016-07-11 10:44:57 +00:00
Jonathan Peyton	4d3c21307c	Improving EPCC performance when linking with hwloc When linking with libhwloc, the ORDERED EPCC test slows down on big machines (> 48 cores). Performance analysis showed that a cache thrash was occurring and this padding helps alleviate the problem. Also, inside the main spin-wait loop in kmp_wait_release.h, we can eliminate the references to the global shared variables by instead creating a local variable, oversubscribed and instead checking that. Differential Revision: http://reviews.llvm.org/D22093 llvm-svn: 274894	2016-07-08 17:43:21 +00:00
Andrey Churbanov	50ecf5de01	D22138: Added more Intel compiler versions as allowed build compilers llvm-svn: 274854	2016-07-08 15:23:35 +00:00
Andrey Churbanov	2eca95c9a9	D22137: Memory leak fixed by adding missed cleanup of single level array of hot teams info llvm-svn: 274851	2016-07-08 14:53:24 +00:00
Andrey Churbanov	cb28d6e3a0	D22136: Memory leaks fixed by adding missed __kmp_free() calls llvm-svn: 274850	2016-07-08 14:40:20 +00:00
Andrey Churbanov	42211eb125	D22135: formatting change llvm-svn: 274849	2016-07-08 14:35:41 +00:00
Jonathan Peyton	741b70926f	Fix the nowait tests for omp for and omp single These tests are now modeled after the sections nowait test where threads wait to be released in the first construct (either for or single) and the last thread skips the last for/single construct and releases those threads. If the test fails, then it hangs because an unnecessary barrier is executed in between the constructs. llvm-svn: 274641	2016-07-06 17:26:12 +00:00
Jonas Hahnfeld	170fcc8772	__kmp_partition_places: Update assertion for new parameter update_master_only If update_master_only is set the place list is not completely traversed and therefore this assertion failed. Make it only trigger if update_master_only is false. (was introduced by D20539) Differential Revision: http://reviews.llvm.org/D21925 llvm-svn: 274482	2016-07-04 05:58:10 +00:00
Jonathan Peyton	6b560f0dd9	Fix checks on schedule struct This change fixes an error in comparing the existing schedule on the team to the new schedule, in the chunk field. Also added additional checks and used KMP_CHECK_UPDATE where appropriate. Patch by Terry Wilmarth. Differential Revision: http://reviews.llvm.org/D21897 llvm-svn: 274371	2016-07-01 17:54:32 +00:00
Jonathan Peyton	c1666960f9	Improve performance of #pragma omp single EPCC Performance of single is considerably worse than plain barrier. Adding a read-only check to the code before the atomic compare-and-store helps considerably. Patch by Terry Wilmarth. Differential Revision: http://reviews.llvm.org/D21893 llvm-svn: 274369	2016-07-01 17:37:49 +00:00
Jonathan Peyton	fdcca8cd55	Fix omp_sections_nowait.c test to address Bugzilla Bug 28336 This rewrite of the omp_sections_nowait.c test file causes it to hang if the nowait is not respected. If the nowait isn't respected, the lone thread which can escape the first sections construct will just sleep at a barrier which shouldn't exist. All reliance on timers is taken out. For good measure, the test makes sure that all eight sections are executed as well. The test should take no longer than a few seconds on any modern machine. Differential Revision: http://reviews.llvm.org/D21842 llvm-svn: 274151	2016-06-29 19:46:52 +00:00
Jonathan Peyton	ac7ba406ed	Fix bugs in TAS and futex lock * Incorrect lock value written in __kmp_test_futex_lock * Incorrect lock value check in tas/futex lock with USE_LOCK_PROFILE on Patch by Hansang Bae llvm-svn: 274053	2016-06-28 19:37:24 +00:00
Jonathan Peyton	cceebeef17	Revert r273898's UNICODE quick fix in favor of CMake's remove_definitions() UNICODE and _UNICODE definitions were added in the LLVM CMake build system. While on Unices, the UNICODE/_UNICODE macros don't cause problems, on Windows only ittnotify_static.c should be compiled using -DUNICODE. We are still looking at a proper fix, but this change sets the build back to exactly what it was doing before. Also, a comment and TODO were added in the src/CMakeLists.txt file to help explain. llvm-svn: 274052	2016-06-28 19:25:13 +00:00
Hans Wennborg	8065c51875	Fix the Windows build after r273599 That patch made all LLVM projects build with -DUNICODE. However, this doesn't work for the OpenMP runtime. But just overriding the flag with -UUNICODE breaks compiling ittnotify_static.c, which for some reason needs to be compiled with -DUNICODE. Note that compiling ittnotify.h with -DUNICODE does not work though. This seems like a mess. This commit fixes it for now, but it would be great if someone who works on the OpenMP runtime could fix it properly. llvm-svn: 273898	2016-06-27 18:03:45 +00:00
Jonathan Peyton	e119e8e5b5	Remove redundant %libomp-compile step from test/lock/omp_lock.c llvm-svn: 273576	2016-06-23 16:18:59 +00:00
Jonathan Peyton	eeec4c8364	Fix bug in futex fast path inside kmp_csupport.c llvm-svn: 273439	2016-06-22 16:36:07 +00:00
Jonathan Peyton	9d2412c9e5	Apply the KMP_USE_FUTEX feature macro everywhere llvm-svn: 273438	2016-06-22 16:35:12 +00:00
Jonathan Peyton	d4f397741b	Add debug trace messages for taskloop llvm-svn: 273299	2016-06-21 19:18:13 +00:00
Jonathan Peyton	c76f9f0df8	Bug fix for hang when tasks used in nested parallel Bug fix for hang when omp task and nested parallelism used together. Still some problem remains with task state saving/restoring, but user's case works fine now. All tasking unit tests passed as well. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21558 llvm-svn: 273297	2016-06-21 19:12:07 +00:00
Jonathan Peyton	ff5ca8b4cf	Performance improvement: accessing thread struct as opposed to team struct Replaced readings of nproc from team structure with ones from thread structure to improve performance. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21559 llvm-svn: 273293	2016-06-21 18:30:15 +00:00
Jonathan Peyton	8c61c597be	Addition of debugger comments and whitespace The removal of legacy code to support long-deprecated debugger support library resulted in some whitespace changes. Comments from that legacy code were made public as they may be useful for other debuggers. Patch by Olga Malysheva. Differential Revision: http://reviews.llvm.org/D21391 llvm-svn: 273282	2016-06-21 15:59:34 +00:00
Jonathan Peyton	fd7cc42fed	Improvements to process affinity mask setting A couple improvements: 1) Add ability to limit fullMask size when KMP_HW_SUBSET limits resources. 2) Make KMP_HW_SUBSET work for affinity_none, and only limit fullMask in this case. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21528 llvm-svn: 273278	2016-06-21 15:54:38 +00:00
Jonathan Peyton	5a276c45c2	Bug fix for segfault in stubs library There was a segfault in the stubs library in posix_memalign because of a bad parameter. The fix is to send address of the pointer as a parameter. Also added check of result of posix_memalign. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21529 llvm-svn: 273276	2016-06-21 15:39:08 +00:00
Jonathan Peyton	98b76f6f87	[STATS] Adding process id to output filename This change appends the process id to the KMP_STATS_FILE (if specified) which enables MPI processes to output their stats to separate files. Differential Revision: http://reviews.llvm.org/D21386 llvm-svn: 273273	2016-06-21 15:20:33 +00:00
Jonathan Peyton	ea26f3f82a	Fix typos in Fortran headers Fix typos in Fortran headers to match spec. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21531 llvm-svn: 273272	2016-06-21 15:16:51 +00:00
Jonathan Peyton	bf35771bcc	Change hwloc discovery algorithm to print topology only for accessible resources Change hwloc discovery algorithm to print topology for only accessible resources, and report uniformity correspondingly, similar to what other topology discovery algorithms do. Fixes minor inconsistency in total topology reported and resources used for threads binding in case hwloc used. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21389 llvm-svn: 272952	2016-06-16 20:31:19 +00:00
Jonathan Peyton	0f3c2b921d	Teach OpenMP Library to use Hwloc on Windows This patch allows a user to enable Hwloc on windows. There are three main changes in here: 1.kmp.h - Move definitions/declarations out of KMP_OS_WINDOWS guard (our windows implementation of affinity) because they need to be defined when KMP_USE_HWLOC is on as well. 2.teach __kmp_set_system_affinity, __kmp_get_system_affinity, __kmp_get_proc_group, and __kmp_affinity_bind_thread how to use hwloc. 3.teach CMake how to include hwloc when building Windows Another minor change in here is to make sure that anything under KMP_USE_HWLOC is also guarded by KMP_AFFINITY_SUPPORTED as well. This is to prevent Mac builds from requiring anything from Hwloc. Differential Revision: http://reviews.llvm.org/D21441 llvm-svn: 272951	2016-06-16 20:23:11 +00:00
Jonathan Peyton	c505ab6733	Fix for crash in task dependencies With single thread using __kmpc_omp_wait_deps segfaults in OpenMP runtime. Offloading with depend also encounters this problem when we generate kmpc_omp_wait_deps instead of kmpc_omp_task_with_deps. Patch by Alex Duran Differential Revision: http://reviews.llvm.org/D21384 llvm-svn: 272949	2016-06-16 20:18:31 +00:00
Jonathan Peyton	72a8498e08	Fixed missing memory cleanup in __kmp_affinity_create_hwloc_map() Cleanup: fixed missing memory cleanup in couple of corner cases. Fixes possible memory leak in some corner cases Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21355 llvm-svn: 272946	2016-06-16 20:14:54 +00:00
Jonathan Peyton	4ba3b0cda9	Reduce perf impact of redundant ittnotify calls Improved performance of ittnotify calls by request from ittnotify owner: calls to __itt_string_handle_create made unique (it was called multiple times). Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21353 llvm-svn: 272945	2016-06-16 20:11:51 +00:00
Jonathan Peyton	b9d28fbeb3	Deprecate KMP_PLACE_THREADS and rename as KMP_HW_SUBSET Deprecate KMP_PLACE_THREADS and rename it to KMP_HW_SUBSET due to confusion about its purpose and function among users. KMP_HW_SUBSET is an environment variable which allows users to easily pick a subset of the hardware topology to use. e.g., KMP_HW_SUBSET=30c,2t means use 30 cores, 2 threads per core. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21340 llvm-svn: 272937	2016-06-16 18:53:48 +00:00
Jonathan Peyton	7cf08d4299	Bug fix: crash if teams executed on host Added argv array check/allocation for parallel directly nested inside the teams construct, as new coming Fortran codegen passes parameters directly into kmpc_fork_call missing same parameters in kmpc_fork_teams (earlier codegen passed to parallel the subset of parameter passed to teams, and thus no check/allocation needed). Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21336 llvm-svn: 272935	2016-06-16 18:47:38 +00:00
Jonathan Peyton	614bb6618e	Fix large overhead with itt notifications on region/barrier name composing Currently, there is a big overhead in reporting of loop metadata through ittnotify. The pair of functions: __kmp_str_loc_init/__kmp_str_loc_free are replaced with strchr/atoi calls. Thus, a lot of time consuming actions are skipped - many memory allocations/deallocations, heavy string duplication, etc. The loop metadata only needs line and column info from the source string, so no allocations and string splitting actually needed. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21309 llvm-svn: 272698	2016-06-14 19:27:22 +00:00
Jonathan Peyton	e85ba3f58f	Remove unused wait/release code. Cleanup - unused code removal. TODO: consider to remove (replace with flag class methods) also kmp_wait_64 and kmp_release_64 routines. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21332 llvm-svn: 272697	2016-06-14 19:15:40 +00:00
Jonathan Peyton	957a151fd1	Whitespace cleanup of dllexports Differential Revision: http://reviews.llvm.org/D21331 llvm-svn: 272691	2016-06-14 18:47:47 +00:00
Jonathan Peyton	df6818bea4	Renaming change: 41 -> 45 and 4.1 -> 4.5 OpenMP 4.1 is now OpenMP 4.5. Any mention of 41 or 4.1 is replaced with 45 or 4.5. Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that 41 is deprecated and to use 45 instead. llvm-svn: 272687	2016-06-14 17:57:47 +00:00
Jonathan Peyton	e1890e12f0	Bug fix for Bugzilla bug 26602: Remove function bodies with KMP_ASSERT(0) Fix for bugzilla https://llvm.org/bugs/show_bug.cgi?id=26602. Removed functions body consisted of the only KMP_ASSERT(0) statement. Thus possible runtime crash converted to compile-time error, which looks preferable (faster possible error detection). TODO: consider C++11 static assert as an alternative, that could make the diagnostics better. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21304 llvm-svn: 272590	2016-06-13 21:33:30 +00:00
Jonathan Peyton	c5304aa3c4	Affinity mask processing improvements Remove static specifier from var fullMask and remove kmp_get_fullMask() routine. When iterating through procs in a mask, always check if proc is in fullMask (this check was missing in a few places). Patch by Brian Bliss. Differential Revision: http://reviews.llvm.org/D21300 llvm-svn: 272589	2016-06-13 21:28:03 +00:00
Jonathan Peyton	8cb45c838f	Exclude untied tasks from task stealing constraint If either current_task or new_task is untied then skip task scheduling constraint checks, because untied tasks are not affected by the task scheduling constraints. Differential Revision: http://reviews.llvm.org/D21196 llvm-svn: 272570	2016-06-13 17:51:59 +00:00
Jonathan Peyton	93495de265	Fix crash when libomp loaded/unloaded multiple times The problem scenario is the following: A dynamic library, libfoo.so, depends on libomp.so (it creates parallel region and calls some omp functions). An application has a loop where it dynamically loads libfoo.so, calls the function from it, unloads libfoo.so. After several loop iterations application crashes with the message about lack of resources OMP: Error #34: System unable to allocate necessary resources for OMP thread: The problem is that pthread_kill() was not followed by pthread_join() in case of terminated thread. This patch fixes this problem for both worker and monitor threads. Differential Revision: http://reviews.llvm.org/D21200 llvm-svn: 272567	2016-06-13 17:36:40 +00:00
Jonathan Peyton	202a24dd9b	Hwloc refactoring patch These changes remove the hwloc_topology_ignore_type function which doesn't exist in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc has the cache levels stripped out and then assumes the final stripped topology follows the typical three-level topology: packages -> cores -> HW threads. But the code is doing unclean manipulations to determine at what level those resources are located and also assumes too much about what hwloc is detecting (there could be intermediate levels in between socket and core for instance). This new way of extracting the topology doesn't strip out any hardware objects that hwloc detects. It does not assume the three level topology, and instead searches for the relevant three levels within the topology for each bit of information using hwloc interface functions. i.e., the three level topology subset that our affinity code is interested in is extracted from the hwloc topology tree directly. For example, the new __kmp_hwloc_get_nobjs_under_obj function gives the user the number of cores under a socket reliably without worrying if there are unexpected objects between the socket object and core object in the hwloc topology structure. Also, now that all topology information is kept, there are also possibilities of using the caches/numa nodes to determine more sophisticated affinity settings in the future. There is also some cleanup code added for the destruction of the __kmp_hwloc_topology object. Differential Revision: http://reviews.llvm.org/D21195 llvm-svn: 272565	2016-06-13 17:30:08 +00:00
Jonathan Peyton	34c72c4773	Fix bitmask complement operation The bitmask complement operation doesn't consider the max proc id which means something like !{0} will be translated to {1,2,3,4,...,600,601,...,1023} on a Linux system even though there aren't 600 processors on said system. This change has the complement bitmask and-ed with the fullmask so that it will only contain valid processors. Differential Revision: http://reviews.llvm.org/D21245 llvm-svn: 272561	2016-06-13 17:01:26 +00:00
Jonathan Peyton	5a299da55d	[STATS] Add stats gathering for taskloop construct llvm-svn: 272560	2016-06-13 16:56:41 +00:00
Jonathan Peyton	b6f0f521f5	Fix spelling in comment llvm-svn: 272291	2016-06-09 18:51:17 +00:00
Jonathan Peyton	61fdddfd64	Revert accidental commit to lit.cfg llvm-svn: 272287	2016-06-09 18:29:36 +00:00
Jonathan Peyton	c4c722ac0d	Refactor __kmp_execute_tasks_template function Refactored __kmp_execute_tasks_template to shorten and remove code redundancy. The original code for __kmp_execute_tasks_template was very redundant with large sections of repeated code that needed to be kept consistent, and goto statements that made the control flow difficult to discern. This refactoring removes all gotos and redundancy. Patch by Terry Wilmarth Differential Revision: http://reviews.llvm.org/D20879 llvm-svn: 272286	2016-06-09 18:27:03 +00:00
Hans Wennborg	5b89fbc822	kmp_lock.h: Fix VS2013 build after r271324 MSVC doesn't allow std::atomic<>s in a union since they don't have trivial copy constructor. Replacing them with e.g. std::atomic_int works, but that breaks the GCC build on Linux, because then calls to e.g. std::atomic_load_explicit fail, as they expect a real std::atomic<> pointer. Fixing this with an #ifdef to unbreak the build for now. llvm-svn: 272271	2016-06-09 15:54:43 +00:00
Paul Osmialowski	9cc353e2b3	Fine tuning of TC* macros - small followup As I replaced no-op TCR_4 with actual code, compiler complained while building debug build. This patch moves 'cast to int' to the correct place. Extension to Differential Revision: http://reviews.llvm.org/D19880 llvm-svn: 271377	2016-06-01 09:59:26 +00:00
Paul Osmialowski	f7cc6affdb	Use C++11 atomics for ticket locks implementation This patch replaces use of compiler builtin atomics with C++11 atomics for ticket locks implementation. Ticket locks are used in critical places of the runtime, e.g. in the tasking mechanism. The main reason this change was introduced is the problem with work stealing function on ARM architecture which suffered from nasty race condition. It turned out that the root cause of the problem lies in the way ticket locks are implemented. Changing compiler builtins into C++11 atomics solves the problem. Two assertions were added into kmp_tasking.c which are useful for detecting early symptoms of something wrong going on with work stealing, which were among the possible outcomes of the race condition. Differential Revision: http://reviews.llvm.org/D19878 llvm-svn: 271324	2016-05-31 20:20:32 +00:00
Jonathan Peyton	ef7347994e	Addition of OpenMP 4.5 feature: schedule(simd:static) This patch implements the new kmp_sch_static_balanced_chunked schedule kind that the compiler will generate when it encounters schedule(simd: static). It just adds the new constant and the new switch case __kmp_for_static_init. Patch by Alex Duran. Differential Revision: http://reviews.llvm.org/D20699 llvm-svn: 271320	2016-05-31 19:12:18 +00:00
Jonathan Peyton	f4f969569d	Avoid deadlock with COI When an asynchronous offload task is completed, COI calls the runtime to queue a "destructor task". When the task deques are full, a deadlock situation arises where the OpenMP threads are inside but cannot progress because the COI thread is stuck inside the runtime trying to find a slot in a deque. This patch implements the solution where the task deques are doubled in size when a task is being queued from a COI thread. Differential Revision: http://reviews.llvm.org/D20733 llvm-svn: 271319	2016-05-31 19:07:00 +00:00
Jonathan Peyton	067325f935	Offer API for setting number of loop dispatch buffers The problem is the lack of dispatch buffers when thousands of loops with nowait, about 10 iterations each, are executed by hundreds of threads. We only have built-in 7 dispatch buffers, but there is a need in dozens or hundreds of buffers. The problem can be fixed by setting KMP_MAX_DISP_BUF to bigger value. In order to give users same possibility I changed build-time control into run-time one, adding API just in case. This change adds an environment variable KMP_DISP_NUM_BUFFERS and a new API function kmp_set_disp_num_buffers(int num_buffers). The KMP_DISP_NUM_BUFFERS envirable works only before serial initialization, because during the serial initialization we already allocate buffers for the hot team, so it is too late to change the number of buffers later (or we need to reallocate buffers for all teams which sounds too complicated). The kmp_set_defaults() routine does not work for this envirable, because it calls serial initialization before reading the parameter string. So a new routine, kmp_set_disp_num_buffers(), is created so that it can set our internal global variable before the library initialization. If both the envirable and API used the envirable wins. Differential Revision: http://reviews.llvm.org/D20697 llvm-svn: 271318	2016-05-31 19:01:15 +00:00
Hal Finkel	49bee007d0	Fix storing the frame pointer for OMP-T during ppc64 microtask dispatch Thanks to John Mellor-Crummey for reporting the omission. llvm-svn: 271035	2016-05-27 19:04:05 +00:00
Jonathan Peyton	50eae7f8b2	Add missing OpenMP 4.5 device entries to stubs library. llvm-svn: 271006	2016-05-27 15:51:14 +00:00
Jonathan Peyton	7ba9baef6d	Fix for OMP_PROC_BIND=spread strategy The OMP_PROC_BIND=spread strategy fails to assign the master thread the correct place partition after the first parallel region. Other threads in the hot team will remember their place_partition, but the master's place partition is restored to what it was before entering the parallel region. So when the hot team is used for subsequent parallel regions, the master has lost this info. This fix calls __kmp_partition_places to update only the master thread's place partition in the spread case when there are no other changes to the hot team. Patch by Terry Wilmarth Differential Revision: http://reviews.llvm.org/D20539 llvm-svn: 270890	2016-05-26 19:09:46 +00:00
Jonathan Peyton	7abf9d5927	Make LIBOMP_USE_ITT_NOTIFY a setting that can be enabled or disabled On Blue Gene/Q, having LIBOMP_USE_ITT_NOTIFY support compiled into a statically-linked binary causes a failure at runtime because dlopen fails. This patch changes LIBOMP_USE_ITT_NOTIFY to a cacheable configuration setting that can be disabled. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D20517 llvm-svn: 270884	2016-05-26 18:19:10 +00:00
Hal Finkel	0a665a83da	Add a test case for microtask dispatch with many arguments This is a cleaned-up version of the test case posted in the D19879 review. llvm-svn: 270867	2016-05-26 16:34:05 +00:00
Hal Finkel	91e19a3de4	Add an assembly __kmp_invoke_microtask for ppc64[le] Clang no longer restricts itself to generating microtasks with a small number of arguments, and so an assembly implementation is required to prevent hitting the parameter limit present in the C implementation. This adds an implementation for ppc64[le]. llvm-svn: 270821	2016-05-26 04:48:14 +00:00
Andrey Churbanov	2fd1654278	D20525: Use more general function for getting gtid which may be faster than specific one. llvm-svn: 270694	2016-05-25 12:53:17 +00:00
Jonathan Peyton	b044e4fa31	Fork performance improvements Most of this is modifications to check for differences before updating data fields in team struct. There is also some rearrangement of the team struct. Patch by Diego Caballero Differential Revision: http://reviews.llvm.org/D20487 llvm-svn: 270468	2016-05-23 18:01:19 +00:00
Jonathan Peyton	1ab887d403	Allow unit testing on Windows These changes allow testing on Windows using clang.exe. There are two main changes: 1. Only link to -lm when it actually exists on the system 2. Create basic versions of pthread_create() and pthread_join() for windows. They are not POSIX compliant by any stretch but will allow any existing and future tests to use pthread_create() and pthread_join() for testing interactions of libomp with os threads. Differential Revision: http://reviews.llvm.org/D20391 llvm-svn: 270464	2016-05-23 17:50:32 +00:00
Jonathan Peyton	b2b6d4e2e1	Changed parameter names in Fortran modules to correspond with OpenMP 4.5 specification llvm-svn: 270447	2016-05-23 16:24:39 +00:00
Jonathan Peyton	611184919f	Remove trailing whitespace in src/ directory This patch doesn't affect D19878's context. So D19878 still cleanly applies. llvm-svn: 270252	2016-05-20 19:03:38 +00:00
Jonathan Peyton	aa7d2d781b	Remove unnecessary unistd.h header from tests. llvm-svn: 269987	2016-05-18 21:36:34 +00:00
Jonathan Peyton	096ccdd389	Remove trailing whitespace in files in doc/ directory llvm-svn: 269842	2016-05-17 21:12:48 +00:00
Jonathan Peyton	3731076997	Remove trailing whitespace from tests llvm-svn: 269841	2016-05-17 21:08:52 +00:00
Jonathan Peyton	0c3a85a327	Remove trailing whitespace in files in tools/ directory llvm-svn: 269837	2016-05-17 20:54:10 +00:00
Jonathan Peyton	975dabc96e	Remove trailing whitespace in CMake files llvm-svn: 269836	2016-05-17 20:51:24 +00:00
Jonathan Peyton	924a6627ea	Remove trailing whitespace in READMEs, CREDITS.txt and index.html llvm-svn: 269835	2016-05-17 20:48:42 +00:00
Jonathan Peyton	18b61707e8	Update copyright year in LICENSE.txt llvm-svn: 269826	2016-05-17 20:11:26 +00:00
Jonathan Peyton	0e8f053023	[OpenMP Testing] Have lit.py be a valid lit executable Users can use either llvm-lit (generated during llvm build) or lit.py which exists in llvm/utils/lit. llvm-svn: 269774	2016-05-17 15:12:11 +00:00
Paul Osmialowski	fb043fdfff	Clean all the mess around KMP_USE_FUTEX and kmp_lock.h The KMP_USE_FUTEX preprocessor definition defined in kmp_lock.h is used inconsistently throughout the LLVM libomp code. * some .c files that use this define do not include the kmp_lock.h file, so in effect the guarded parts of the code are never compiled * some places in the code use architecture-dependent preprocessor logic expressions which effectively disable use of Futex for the AArch64 architecture; all these places should use '#if KMP_USE_FUTEX' instead to avoid any further confusion * some places use KMP_HAS_FUTEX, which is defined nowhere; KMP_USE_FUTEX should be used instead Differential Revision: http://reviews.llvm.org/D19629 llvm-svn: 269642	2016-05-16 09:44:11 +00:00
Paul Osmialowski	97ae10c67c	NFC fix indent (relates to my previous commit) llvm-svn: 269443	2016-05-13 17:45:49 +00:00
Paul Osmialowski	7e5e8684fb	Solve 'Too many args to microtask' problem This patch solves the 'Too many args to microtask' problem which occurs while executing the lulesh2.0.3 benchmark on AArch64. To solve this I had to write an AArch64 assembly version of the __kmp_invoke_microtask() function, similar to the x86 and x86_64 implementations. Differential Revision: http://reviews.llvm.org/D19879 llvm-svn: 269399	2016-05-13 08:26:42 +00:00
Jonathan Peyton	f83ae31caf	Adding new kmp_aligned_malloc() entry point This change adds a new entry point, kmp_aligned_malloc(size_t size, size_t alignment), an entry point corresponding to kmp_malloc() but with the capability to return aligned memory as well. Other allocator routines have been adjusted so that kmp_free() can be used for freeing memory blocks allocated by any kmp_*alloc() routine, including the new kmp_aligned_malloc() routine. Differential Revision: http://reviews.llvm.org/D19814 llvm-svn: 269365	2016-05-12 22:00:37 +00:00
Jonathan Peyton	2b749b33cc	Fix team reuse with foreign threads After hot teams were enabled by default, the library started using levels kept in the team structure. The levels are broken when a foreign thread exits and puts its team into the pool, which is then re-used by another foreign thread. The broken behavior observed is that when printing the levels for each new team, one gets 1, 2, 1, 2, 1, 2, etc. This makes the library believe that every other team is nested, which is incorrect. What is wanted is for the levels to be 1, 1, 1, etc. Differential Revision: http://reviews.llvm.org/D19980 llvm-svn: 269363	2016-05-12 21:54:30 +00:00
Paul Osmialowski	562a3c2b66	New hwloc API compatibility Differential Revision: http://reviews.llvm.org/D19628 llvm-svn: 269284	2016-05-12 11:46:40 +00:00
Hal Finkel	55acbf8877	Restore NULL flag check in __kmp_null_resume_wrapper This reverts a presumably-unintentional change in: r268640 - [STATS] Use partitioned timer scheme and fixes segfaults in an x86_64 debug build of the runtime library. llvm-svn: 269259	2016-05-12 00:54:08 +00:00
Paul Osmialowski	52bef53f86	Fine tuning of TC* macros This patch introduces the following: * TCI_* and TCD_* macros for incrementation and decrementation * Fix for invalid use of TCR_8 in one expression Differential Revision: http://reviews.llvm.org/D19880 llvm-svn: 268826	2016-05-07 00:00:00 +00:00
Jonathan Peyton	11dc82fa83	[STATS] Use partitioned timer scheme This change replaces the current timers with ones that partition time properly. The current timers are nested, so that if a new timer, B, starts when the current timer, A, is already timing, A's time will include B's. To eliminate this problem, the partitioned timers are designed to stop the current timer (A), let the new timer run (B), and when the new timer is finished, restart the previously running timer (A). With this partitioning of time, a thread's timers all sum up to the OMP_worker_thread_life time and can now easily show the percentage of time a thread is spending in different parts of the runtime or user code. There is also a new state variable associated with each thread which tells where it is executing a task. This corresponds with the timers: OMP_task_, e.g., if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct. The changes are mostly changing the MACROs to use the new PARTITIONED_ macros, the new partitionedTimers class and its methods, and new state logic. Differential Revision: http://reviews.llvm.org/D19229 llvm-svn: 268640	2016-05-05 16:15:57 +00:00
Paul Osmialowski	fedce46bbd	NFC remove unneeded spaces (test commit) llvm-svn: 268462	2016-05-03 23:10:20 +00:00
Jonathan Peyton	8407f5b3bd	Remove architecture dependent Hwloc DEBUG section This debug section's functionality can be replicated using the environment variable KMP_TOPOLOGY_METHOD with different values and KMP_AFFINITY=verbose llvm-svn: 267472	2016-04-25 21:11:26 +00:00
Jonathan Peyton	1d5487c5d0	Fix buffer problem with printing long Hwloc affinity mask This change has the hwloc_bitmap_list_snprintf() function use the entire buffer to print the mask. There is no need to shorten the buffer length by 7. It only needs to be shortened by one byte. llvm-svn: 267470	2016-04-25 21:08:31 +00:00
Jonathan Peyton	b1467d1ef0	ARM Limited license agreement from the copyright/patent holder I have prepared some patches for LLVM OpenMP runtime, mostly addressing ARMv8 support. Before I upstream them, I must address legal issues that arose around my planned contribution. I was advised that before I send any substantial commit, I need to make sure that the LICENSE.txt file in the project's repository contains a statement submitted by ARM, similar to the one provided by Intel (see "a license agreement from the copyright/patent holders"). This is the same situation as with the top-level LLVM project: ARM has provided the same statement in http://llvm.org/svn/llvm-project/llvm/trunk/lib/Target/ARM/LICENSE.TXT file. Patch by Paul Osmialowski Differential Revision: http://reviews.llvm.org/D19319 llvm-svn: 267446	2016-04-25 19:12:20 +00:00
Jonathan Peyton	a1202bf594	[ITTNOTIFY] Remove serialized parallel regions from frame notification llvm-svn: 266760	2016-04-19 16:55:17 +00:00
Jonathan Peyton	5235a1b603	Fix trip count calculation for parallel loops in runtime The trip count calculation was incorrect for loops with large bounds. For example, for(int i=-2,000,000,000; i < 2,000,000,000; i+=50000000), the trip count calculation had overflow (trying to calculate 2,000,000,000 + 2,000,000,000 with signed integers) and wasn't giving the right value. This patch fixes this error in the runtime by using unsigned integers instead. There is still a bug in the clang compiler component because it warns that there is overflow in the test case file when there isn't. This error isn't there for the Intel Compiler. So for now, the test case is designated as XFAIL. Differential Revision: http://reviews.llvm.org/D19078 llvm-svn: 266677	2016-04-18 21:38:29 +00:00
Jonathan Peyton	e6643daa18	Runtime support for untied tasks Introduced a counter of parts of an untied task submitted for execution. The counter controls whether all parts of the task are already finished. The compiler should generate re-submission of partially executed untied task by itself before exiting of each task part except for the lexical last part. Differential Revision: http://reviews.llvm.org/D19026 llvm-svn: 266675	2016-04-18 21:35:14 +00:00
Jonathan Peyton	f252010f69	Fix for pthread_setspecific (TLS and shutdown) problem Some codes that use TLS fail intermittently because one thread tries to write TLS values after the TLS key has been destroyed by another thread. This happens when one thread executes library shutdown (and destroys TLS keys), while another thread starts to execute the TLS key destructor routine. Before this change, the kmp_init_runtime flag was checked before calling pthread_* TLS functions, but this flag is set to FALSE later than the destruction of the TLS keys, which leads to failure. The fix is to check kmp_init_gtid instead, as this flag is unset before the destruction of TLS keys. Differential Revision: http://reviews.llvm.org/D19022 llvm-svn: 266674	2016-04-18 21:33:01 +00:00
Jonathan Peyton	e2289a427d	[STATS] Remove timePair class and unused functions llvm-svn: 266634	2016-04-18 17:27:30 +00:00
Jonathan Peyton	53eca5216e	[STATS] print Total_* stats on their own line llvm-svn: 266633	2016-04-18 17:24:20 +00:00
Jonathan Peyton	99ef4d0433	[ITTNOTIFY] Correct barrier imbalance time in case of tasks ittnotify fix for barrier imbalance time in case tasks exist. In the current implementation, task execution time is included into aggregated time on a barrier. This fix calculates task execution time and corrects the arrive time by subtracting the task execution time. Since __kmp_invoke_task() can not only be called on a barrier, the field th.th_bar_arrive_time is used to check if the function was called at the barrier (th.th_bar_arrive_time != 0). So for this check, th_bar_arrive_time is set to zero right after the value is used on the barrier. Differential Revision: http://reviews.llvm.org/D19030 llvm-svn: 266332	2016-04-14 16:06:49 +00:00
Jonathan Peyton	377aa40d84	Exponential back off logic for test-and-set lock This change adds back off logic in the test and set lock for better contended lock performance. It uses a simple truncated binary exponential back off function. The default back off parameters are tuned for x86. The main back off logic has a two loop structure where each is controlled by a user-level parameter: max_backoff - limits the outer loop number of iterations. This parameter should be a power of 2. min_ticks - the inner spin wait loop number of "ticks" which is system dependent and should be tuned for your system if you so choose. The "ticks" on x86 correspond to the time stamp counter, but on other architectures ticks is a timestamp derived from gettimeofday(). The user can modify these via the environment variable: KMP_SPIN_BACKOFF_PARAMS=max_backoff[,min_ticks] Currently, since the default user lock is a queuing lock, one would have to also specify KMP_LOCK_KIND=tas to use the test-and-set locks. Differential Revision: http://reviews.llvm.org/D19020 llvm-svn: 266329	2016-04-14 16:00:37 +00:00
Jonathan Peyton	2e379fc767	Add declarations of OpenMP 4.5 target/offload routines to headers All these routines are implemented in the offload library. llvm-svn: 266120	2016-04-12 20:37:18 +00:00
Jonathan Peyton	072772bf05	[STATS] Remove trailing whitespace in stats source files llvm-svn: 265437	2016-04-05 18:48:48 +00:00
Jonathan Peyton	50e8f18b52	OMP_WAIT_POLICY changes This change has OMP_WAIT_POLICY=active to mean that threads will busy-wait in spin loops and virtually never go to sleep. OMP_WAIT_POLICY=passive now means that threads will immediately go to sleep inside a spin loop. KMP_BLOCKTIME was the previous mechanism to specify this behavior via KMP_BLOCKTIME=0 or KMP_BLOCKTIME=infinite, but the standard OpenMP environment variable should also be able to specify this behavior. Differential Revision: http://reviews.llvm.org/D18577 llvm-svn: 265339	2016-04-04 19:38:32 +00:00
Jonathan Peyton	1d46d979a9	Fix bug when KMP_USE_ADAPTIVE_LOCKS is 0 #endif was one line too low. If KMP_USE_ADAPTIVE_LOCKS is 0, then queuing locks would incorrectly use drdpa lock mechanism. This is a fix for https://llvm.org/bugs/show_bug.cgi?id=26649 llvm-svn: 264934	2016-03-30 21:50:59 +00:00
Jonathan Peyton	4cfe93c599	Fix comment in kmp_wait_release.h Removed reference to "ref ct" in a comment, as ref_ct no longer exists. Also moved the comment to where the task_team is about to be tested if NULL. llvm-svn: 264786	2016-03-29 21:08:29 +00:00
Jonathan Peyton	ee2f96c79b	Fix incorrect indention in kmp_alloc.c llvm-svn: 264777	2016-03-29 20:10:00 +00:00
Jonathan Peyton	a58563d8c9	Remove dead KMP_USE_POOLED_ALLOC code llvm-svn: 264776	2016-03-29 20:05:27 +00:00
Jonathan Peyton	316af8de48	[STATS] Missing check for MIC in config-ix.cmake llvm-svn: 264616	2016-03-28 18:53:10 +00:00
Hal Finkel	01bb2406a3	Fixing the non-x86 build by removing dependence on kmp_cpuid_t The problem is that the definition of kmp_cpuinfo_t contains: char name [3*sizeof (kmp_cpuid_t)]; // CPUID(0x80000002,0x80000003,0x80000004) and kmp_cpuid_t is only defined when compiling for x86. Differential Revision: http://reviews.llvm.org/D18245 llvm-svn: 264535	2016-03-27 13:24:09 +00:00
Jonas Hahnfeld	e46a494a50	[OMPT] Fix parallel_id and task_id in loop_end with schedule static For serialized parallel regions, wrong ids were reported. Now the same code is used as in kmp_dispatch.cpp which emits the correct ids. Differential Revision: http://reviews.llvm.org/D18348 llvm-svn: 264266	2016-03-24 12:52:20 +00:00
Jonas Hahnfeld	801fe9bbe2	[OMPT] Test ids reported by ompt_get_{parallel,task}_id llvm-svn: 264265	2016-03-24 12:52:11 +00:00
Jonas Hahnfeld	1c1c71776a	[OMPT] Fix duplicate implicit_task_end events for master thread with GCC For non-serialized parallel regions the master thread issued two callbacks: The first one in kmp_gsupport.c and the second in __kmp_join_call. Therefore only trigger the callback in kmp_gsupport.c for serialized parallel regions. Differential Revision: http://reviews.llvm.org/D16716 llvm-svn: 264264	2016-03-24 12:52:04 +00:00
Jonathan Peyton	b7d30cbc7e	Fix Visual Studio builds Have Visual Studio use MemoryBarrier() instead of _mm_mfence() and remove __declspec align attribute from function parameters in kmp_atomic.h llvm-svn: 264166	2016-03-23 16:27:25 +00:00
Jonas Hahnfeld	b1cad2954b	[OMPT] Make tests require OMPT_BLAME ompt_event_barrier_{begin,end} are optional blame events. In total it doesn't make any sense to test partially built OMPT support. llvm-svn: 264031	2016-03-22 08:23:24 +00:00
Jonas Hahnfeld	c804301113	[OMPT] Create infrastructure and add first tests for OMPT Some basic checks next to the implementation should further lower the possibility of introducing regressions. (Note that this would have caught the ordering issue fixed in rL258866 and pointed to rL263940.) The tests are implementation dependent in one point because they assume that thread ids are assigned in ascending order. This is not defined by the standard but currently ensured in libomp. We have to think about another way of ordering the threads should this ever be subject to change... Note that this isn't aiming at replacing the implementation independent test-suite at https://github.com/OpenMPToolsInterface/ompt-test-suite! Differential Revision: http://reviews.llvm.org/D16715 llvm-svn: 264027	2016-03-22 07:22:49 +00:00
Jonathan Peyton	93a879ce78	[STATS] Add OMP_critical and OMP_critical_wait timers OMP_critical - time spent in critical section OMP_critical_wait - time spent waiting to enter a critical section llvm-svn: 263967	2016-03-21 18:32:26 +00:00
Jonathan Peyton	97cbb42d90	[STATS] separate noTotal bit flag from onlyInMaster and noUnits This change logically separates the stats_flags_e::noTotal bit flag from the stats_flags_e::onlyInMaster and stats_flags_e::noUnits bit flags. If no TOTAL_foo output is wanted for a particular statistic, the flag must be explicitly included in that statistic's flags. Differential Revision: http://reviews.llvm.org/D18198 llvm-svn: 263954	2016-03-21 17:26:23 +00:00
Jonas Hahnfeld	6c250b714c	[OMPT] Fix wrong parent_task_id in serialized parallel_begin with GCC Without this patch a simple '#pragma omp parallel num_threads(1)' leads to ompt_event_parallel_begin: parent_task_id=3, [...], parallel_id=2, [...] ompt_event_parallel_end: parallel_id=2, task_id=4, [...] Differential Revision: http://reviews.llvm.org/D16714 llvm-svn: 263940	2016-03-21 12:37:52 +00:00
Jonathan Peyton	b5969ca42d	Update www/index.html to reflect current status of OpenMP project llvm-svn: 263788	2016-03-18 14:50:01 +00:00
Jonathan Peyton	8a46c067ed	[CMake] Fix Windows build problem for CMake versions < 3.3 Building libomp using CMake versions < 3.3 caused a link time error. These errors occurred because when assembling z_Windows_NT-586_asm.asm, the definitions: OMPT_SUPPORT, _M_AMD64\|_M_IA32 weren't defined on the command line. To fix the problem, the COMPILE_FLAGS property for the assembly file is appended to instead of the COMPILE_DEFINITIONS property being set. For whatever reason, the COMPILE_DEFINITIONS property doesn't pick up the definitions for assembly files for the older CMake versions. llvm-svn: 263651	2016-03-16 18:44:18 +00:00
Jonathan Peyton	4240055ac8	Fix spelling error in comment llvm-svn: 263586	2016-03-15 20:59:10 +00:00
Jonathan Peyton	20c1e4e69d	[STATS] Print "Unknown" for frequency if it wasn't able to be parsed llvm-svn: 263583	2016-03-15 20:55:32 +00:00
Jonathan Peyton	226dcd3243	[STATS] Fix comments in kmp_stats.h llvm-svn: 263582	2016-03-15 20:49:01 +00:00
Jonathan Peyton	6e98d7988b	[STATS] Add header information to stats print out This change adds a header to the printout of the statistics which includes the time, machine name, and processor info if available. This change also includes some cosmetic changes like using enum casting for timer and counter iteration. Differential Revision: http://reviews.llvm.org/D18153 llvm-svn: 263580	2016-03-15 20:28:47 +00:00
Samuel Antao	11e4c539f4	Initialize two variables in kmp_tasking. Summary: Two uninitialized local variables are causing clang to produce warnings: ``` ./src/projects/openmp/runtime/src/kmp_tasking.c:3019:5: error: variable 'num_tasks' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized] default: ^~~~~~~ ./src/projects/openmp/runtime/src/kmp_tasking.c:3027:21: note: uninitialized use occurs here for( i = 0; i < num_tasks; ++i ) { ^~~~~~~~~ ./src/projects/openmp/runtime/src/kmp_tasking.c:2968:28: note: initialize the variable 'num_tasks' to silence this warning kmp_uint64 i, num_tasks, extras; ^ = 0 ./src/projects/openmp/runtime/src/kmp_tasking.c:3019:5: error: variable 'extras' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized] default: ^~~~~~~ ./src/projects/openmp/runtime/src/kmp_tasking.c:3022:52: note: uninitialized use occurs here KMP_DEBUG_ASSERT(tc == num_tasks * grainsize + extras); ^~~~~~ ./src/projects/openmp/runtime/src/kmp_debug.h:62:60: note: expanded from macro 'KMP_DEBUG_ASSERT' #define KMP_DEBUG_ASSERT( cond ) KMP_ASSERT( cond ) ^ ./src/projects/openmp/runtime/src/kmp_debug.h:60:51: note: expanded from macro 'KMP_ASSERT' #define KMP_ASSERT( cond ) ( (cond) ? 0 : __kmp_debug_assert( #cond, __FILE__, __LINE__ ) ) ^ ./src/projects/openmp/runtime/src/kmp_tasking.c:2968:36: note: initialize the variable 'extras' to silence this warning kmp_uint64 i, num_tasks, extras; ^ = 0 2 errors generated. ``` This patch initializes these two variables. Reviewers: tlwilmar, jlpeyton Subscribers: tlwilmar, openmp-commits Differential Revision: http://reviews.llvm.org/D17909 llvm-svn: 263316	2016-03-12 00:55:17 +00:00
Jonathan Peyton	495e153ff9	[STATS] change TASK_execution name to OMP_task llvm-svn: 263291	2016-03-11 20:23:05 +00:00
Jonathan Peyton	e2554af857	[STATS] Add a total statistics count This change removes synthesized stats and instead has all timers print out a total which is the aggregate statistics across threads. This is displayed as "Total_foo" at the end of program. The stats_flags_e::synthesized flag is removed and the printStats() function is split into two separate functions: printTimerStats() which can display the aggregate total and printCounterStats(). Differential Revision: http://reviews.llvm.org/D17869 llvm-svn: 263290	2016-03-11 20:20:49 +00:00
Jonathan Peyton	c1a7c97c1b	[STATS] fix output formatting when sample count is 0 Force 0.0 to be displayed for all statistics which have sample count equal to 0 llvm-svn: 262658	2016-03-03 21:24:13 +00:00
Jonathan Peyton	30138256fa	[STATS] fix master and single timers Only the thread which executes the single/master section will update its statistics. llvm-svn: 262656	2016-03-03 21:21:05 +00:00
Jonathan Peyton	283a215c7a	Add new OpenMP 4.5 taskloop construct feature From the standard: The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed. This initial implementation uses a simple linear task distribution algorithm. Later we can add other algorithms to speed up generation of a huge number of tasks (e.g., tree-like task generation should be faster). This needs to be put into the OpenMP runtime library in order for the compiler team to develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17404 llvm-svn: 262535	2016-03-02 22:47:51 +00:00
Jonathan Peyton	a0d7a2cd3f	Forgot to add test files for doacross and task priority. llvm-svn: 262533	2016-03-02 22:43:14 +00:00
Jonathan Peyton	71909c57ca	Add new OpenMP 4.5 doacross loop nest feature From the standard: A doacross loop nest is a loop nest that has cross-iteration dependence. An iteration is dependent on one or more lexicographically earlier iterations. The ordered clause parameter on a loop directive identifies the loop(s) associated with the doacross loop nest. The init/fini routines allocate/free doacross buffer(s) for each loop for each thread. The wait routine waits for a flag designated by the dependence vector. The post routine sets the flag designated by current iteration vector. We use a similar technique of shared buffer indices that covers up to 7 nowait loops executed simultaneously by different threads (number 7 has no real meaning, just heuristic value). Also, the size of structures are kept intact via reducing dummy arrays. This needs to be put into the OpenMP runtime library in order for the compiler team to develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17399 llvm-svn: 262532	2016-03-02 22:42:06 +00:00
Jonathan Peyton	2f7c077b5a	Add new OpenMP 4.5 affinity API This change introduces the new OpenMP 4.5 affinity api surrounding OpenMP Places. There are six new entry points: Typically called in serial region: * omp_get_num_places - returns the number of places available to the execution environment in the place list. * omp_get_place_num_procs - returns the number of processors available to the execution environment in the specified place. * omp_get_place_proc_ids - returns the numerical identifiers of the processors available to the execution environment in the specified place. Typically called inside parallel region: * omp_get_place_num - returns the place number of the place to which the encountering thread is bound. * omp_get_partition_num_places - returns the number of places in the place partition of the innermost implicit task. * omp_get_partition_place_nums - returns the list of place numbers corresponding to the places in the place-var ICV of the innermost implicit task. Differential Revision: http://reviews.llvm.org/D17417 llvm-svn: 261915	2016-02-25 18:49:52 +00:00
Jonathan Peyton	2851072d69	Add initial support for OpenMP 4.5 task priority feature The maximum task priority value is read from the environment variable OMP_MAX_TASK_PRIORITY. But as of now, nothing is done with it. We just handle the environment variable and add the new api: omp_get_max_task_priority() which returns that value or zero if it is not set. Differential Revision: http://reviews.llvm.org/D17411 llvm-svn: 261908	2016-02-25 18:04:09 +00:00
Jonathan Peyton	ea0fe1dfeb	Add new OpenMP 4.5 schedule clause modifiers (monotonic/non-monotonic) feature The monotonic/non-monotonic flags are sent to the runtime via the sched_type by setting the 30th (non-monotonic) or 29th (monotonic) bit in the sched_type. Macros are added to probe if monotonic or non-monotonic is specified (SCHEDULE_HAS_[NON]MONOTONIC & SCHEDULE_HAS_NO_MODIFIERS) and also to get the base sched_type (SCHEDULE_WITHOUT_MODIFIERS) Currently, nothing is done with the modifiers. Also, this patch adds some comments on the use of the enumerations in at least one place where it is subtle. Differential Revision: http://reviews.llvm.org/D17406 llvm-svn: 261906	2016-02-25 17:55:50 +00:00
Jonathan Peyton	95c95c350e	Remove unnecessary semicolons after braces llvm-svn: 261249	2016-02-18 19:38:25 +00:00
Jonas Hahnfeld	867aa20b1e	[OMPT] Frame information for openmp taskwait For pragma omp taskwait the runtime is called from the task context. Therefore, the reentry frame information should be updated. The information should be available for both taskwait event calls; therefore, set before the first event and reset after the last event. Patch by Joachim Protze Differential Revision: http://reviews.llvm.org/D17145 llvm-svn: 260674	2016-02-12 12:19:59 +00:00
Jonathan Peyton	134f90d59f	Fix incorrect task_team in __kmp_give_task When a target task finishes and it tries to access the th_task_team from the threads in the team where it was created, th_task_team can be NULL or point to a different place when that thread started a nested region that is still running. Finding the exact task_team that the threads were using is difficult as it would require to unwind the task_state_memo_stack. So a new field was added in the taskdata structure to point to the active task_team when the task was created. llvm-svn: 260615	2016-02-11 23:07:30 +00:00
Jonathan Peyton	ff684e4b9e	Fix a couple of typos in comments llvm-svn: 260613	2016-02-11 22:58:29 +00:00

741 Commits