llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonathan Peyton	fd7cc42fed	Improvements to process affinity mask setting A couple improvements: 1) Add ability to limit fullMask size when KMP_HW_SUBSET limits resources. 2) Make KMP_HW_SUBSET work for affinity_none, and only limit fullMask in this case. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21528 llvm-svn: 273278	2016-06-21 15:54:38 +00:00
Jonathan Peyton	5a276c45c2	Bug fix for segfault in stubs library There was a segfault in the stubs library in posix_memalign because of a bad parameter. The fix is to send address of the pointer as a parameter. Also added check of result of posix_memalign. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21529 llvm-svn: 273276	2016-06-21 15:39:08 +00:00
Jonathan Peyton	98b76f6f87	[STATS] Adding process id to output filename This change appends the process id to the KMP_STATS_FILE (if specified) which enables MPI processes to output their stats to separate files. Differential Revision: http://reviews.llvm.org/D21386 llvm-svn: 273273	2016-06-21 15:20:33 +00:00
Jonathan Peyton	ea26f3f82a	Fix typos in Fortran headers Fix typos in Fortran headers to match spec. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21531 llvm-svn: 273272	2016-06-21 15:16:51 +00:00
Jonathan Peyton	bf35771bcc	Change hwloc discovery algorithm to print topology only for accessible resources Change hwloc discovery algorithm to print topology for only accessible resources, and report uniformity correspondingly, similar to what other topology discovery algorithms do. Fixes minor inconsistency in total topology reported and resources used for threads binding in case hwloc used. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21389 llvm-svn: 272952	2016-06-16 20:31:19 +00:00
Jonathan Peyton	0f3c2b921d	Teach OpenMP Library to use Hwloc on Windows This patch allows a user to enable Hwloc on Windows. There are three main changes: 1. kmp.h - Move definitions/declarations out of the KMP_OS_WINDOWS guard (our Windows implementation of affinity) because they need to be defined when KMP_USE_HWLOC is on as well. 2. Teach __kmp_set_system_affinity, __kmp_get_system_affinity, __kmp_get_proc_group, and __kmp_affinity_bind_thread how to use hwloc. 3. Teach CMake how to include hwloc when building on Windows. Another minor change is to make sure that anything under KMP_USE_HWLOC is also guarded by KMP_AFFINITY_SUPPORTED. This prevents Mac builds from requiring anything from Hwloc. Differential Revision: http://reviews.llvm.org/D21441 llvm-svn: 272951	2016-06-16 20:23:11 +00:00
Jonathan Peyton	c505ab6733	Fix for crash in task dependencies With single thread using __kmpc_omp_wait_deps segfaults in OpenMP runtime. Offloading with depend also encounters this problem when we generate kmpc_omp_wait_deps instead of kmpc_omp_task_with_deps. Patch by Alex Duran Differential Revision: http://reviews.llvm.org/D21384 llvm-svn: 272949	2016-06-16 20:18:31 +00:00
Jonathan Peyton	72a8498e08	Fixed missing memory cleanup in __kmp_affinity_create_hwloc_map() Cleanup: fixed missing memory cleanup in couple of corner cases. Fixes possible memory leak in some corner cases Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21355 llvm-svn: 272946	2016-06-16 20:14:54 +00:00
Jonathan Peyton	4ba3b0cda9	Reduce perf impact of redundant ittnotify calls Improved performance of ittnotify calls by request from ittnotify owner: calls to __itt_string_handle_create made unique (it was called multiple times). Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21353 llvm-svn: 272945	2016-06-16 20:11:51 +00:00
Jonathan Peyton	b9d28fbeb3	Deprecate KMP_PLACE_THREADS and rename as KMP_HW_SUBSET Deprecate KMP_PLACE_THREADS and rename it to KMP_HW_SUBSET due to confusion about its purpose and function among users. KMP_HW_SUBSET is an environment variable which allows users to easily pick a subset of the hardware topology to use. e.g., KMP_HW_SUBSET=30c,2t means use 30 cores, 2 threads per core. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21340 llvm-svn: 272937	2016-06-16 18:53:48 +00:00
Jonathan Peyton	7cf08d4299	Bug fix: crash if teams executed on host Added argv array check/allocation for parallel directly nested inside the teams construct, as new coming Fortran codegen passes parameters directly into kmpc_fork_call missing same parameters in kmpc_fork_teams (earlier codegen passed to parallel the subset of parameter passed to teams, and thus no check/allocation needed). Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21336 llvm-svn: 272935	2016-06-16 18:47:38 +00:00
Jonathan Peyton	614bb6618e	Fix large overhead with itt notifications on region/barrier name composing Currently, there is a big overhead in reporting of loop metadata through ittnotify. The pair of functions: __kmp_str_loc_init/__kmp_str_loc_free are replaced with strchr/atoi calls. Thus, a lot of time consuming actions are skipped - many memory allocations/deallocations, heavy string duplication, etc. The loop metadata only needs line and column info from the source string, so no allocations and string splitting actually needed. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21309 llvm-svn: 272698	2016-06-14 19:27:22 +00:00
Jonathan Peyton	e85ba3f58f	Remove unused wait/release code. Cleanup - unused code removal. TODO: consider also removing the kmp_wait_64 and kmp_release_64 routines (replacing them with flag class methods). Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21332 llvm-svn: 272697	2016-06-14 19:15:40 +00:00
Jonathan Peyton	957a151fd1	Whitespace cleanup of dllexports Differential Revision: http://reviews.llvm.org/D21331 llvm-svn: 272691	2016-06-14 18:47:47 +00:00
Jonathan Peyton	df6818bea4	Renaming change: 41 -> 45 and 4.1 -> 4.5 OpenMP 4.1 is now OpenMP 4.5. Any mention of 41 or 4.1 is replaced with 45 or 4.5. Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that 41 is deprecated and to use 45 instead. llvm-svn: 272687	2016-06-14 17:57:47 +00:00
Jonathan Peyton	e1890e12f0	Bug fix for Bugzilla bug 26602: Remove function bodies with KMP_ASSERT(0) Fix for bugzilla https://llvm.org/bugs/show_bug.cgi?id=26602. Removed function bodies that consisted only of a KMP_ASSERT(0) statement. A possible runtime crash is thus converted to a compile-time error, which is preferable (earlier error detection). TODO: consider C++11 static assert as an alternative, which could make the diagnostics better. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21304 llvm-svn: 272590	2016-06-13 21:33:30 +00:00
Jonathan Peyton	c5304aa3c4	Affinity mask processing improvements Remove static specifier from var fullMask and remove kmp_get_fullMask() routine. When iterating through procs in a mask, always check if proc is in fullMask (this check was missing in a few places). Patch by Brian Bliss. Differential Revision: http://reviews.llvm.org/D21300 llvm-svn: 272589	2016-06-13 21:28:03 +00:00
Jonathan Peyton	8cb45c838f	Exclude untied tasks from task stealing constraint If either current_task or new_task is untied then skip task scheduling constraint checks, because untied tasks are not affected by the task scheduling constraints. Differential Revision: http://reviews.llvm.org/D21196 llvm-svn: 272570	2016-06-13 17:51:59 +00:00
Jonathan Peyton	93495de265	Fix crash when libomp loaded/unloaded multiple times The problem scenario is the following: A dynamic library, libfoo.so, depends on libomp.so (it creates parallel region and calls some omp functions). An application has a loop where it dynamically loads libfoo.so, calls the function from it, unloads libfoo.so. After several loop iterations application crashes with the message about lack of resources OMP: Error #34: System unable to allocate necessary resources for OMP thread: The problem is that pthread_kill() was not followed by pthread_join() in case of terminated thread. This patch fixes this problem for both worker and monitor threads. Differential Revision: http://reviews.llvm.org/D21200 llvm-svn: 272567	2016-06-13 17:36:40 +00:00
Jonathan Peyton	202a24dd9b	Hwloc refactoring patch These changes remove the hwloc_topology_ignore_type function which doesn't exist in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc has the cache levels stripped out and then assumes the final stripped topology follows the typical three-level topology: packages -> cores -> HW threads. But the code is doing unclean manipulations to determine at what level those resources are located and also assumes too much about what hwloc is detecting (there could be intermediate levels in between socket and core for instance). This new way of extracting the topology doesn't strip out any hardware objects that hwloc detects. It does not assume the three level topology, and instead searches for the relevant three levels within the topology for each bit of information using hwloc interface functions. i.e., the three level topology subset that our affinity code is interested in is extracted from the hwloc topology tree directly. For example, the new __kmp_hwloc_get_nobjs_under_obj function gives the user the number of cores under a socket reliably without worrying if there are unexpected objects between the socket object and core object in the hwloc topology structure. Also, now that all topology information is kept, there are also possibilities of using the caches/numa nodes to determine more sophisticated affinity settings in the future. There is also some cleanup code added for the destruction of the __kmp_hwloc_topology object. Differential Revision: http://reviews.llvm.org/D21195 llvm-svn: 272565	2016-06-13 17:30:08 +00:00
Jonathan Peyton	34c72c4773	Fix bitmask complement operation The bitmask complement operation doesn't consider the max proc id which means something like !{0} will be translated to {1,2,3,4,...,600,601,...,1023} on a Linux system even though there aren't 600 processors on said system. This change has the complement bitmask and-ed with the fullmask so that it will only contain valid processors. Differential Revision: http://reviews.llvm.org/D21245 llvm-svn: 272561	2016-06-13 17:01:26 +00:00
Jonathan Peyton	5a299da55d	[STATS] Add stats gathering for taskloop construct llvm-svn: 272560	2016-06-13 16:56:41 +00:00
Jonathan Peyton	b6f0f521f5	Fix spelling in comment llvm-svn: 272291	2016-06-09 18:51:17 +00:00
Jonathan Peyton	61fdddfd64	Revert accidental commit to lit.cfg llvm-svn: 272287	2016-06-09 18:29:36 +00:00
Jonathan Peyton	c4c722ac0d	Refactor __kmp_execute_tasks_template function Refactored __kmp_execute_tasks_template to shorten and remove code redundancy. The original code for __kmp_execute_tasks_template was very redundant with large sections of repeated code that needed to be kept consistent, and goto statements that made the control flow difficult to discern. This refactoring removes all gotos and redundancy. Patch by Terry Wilmarth Differential Revision: http://reviews.llvm.org/D20879 llvm-svn: 272286	2016-06-09 18:27:03 +00:00
Hans Wennborg	5b89fbc822	kmp_lock.h: Fix VS2013 build after r271324 MSVC doesn't allow std::atomic<>s in a union since they don't have trivial copy constructor. Replacing them with e.g. std::atomic_int works, but that breaks the GCC build on Linux, because then calls to e.g. std::atomic_load_explicit fail, as they expect a real std::atomic<> pointer. Fixing this with an #ifdef to unbreak the build for now. llvm-svn: 272271	2016-06-09 15:54:43 +00:00
Paul Osmialowski	9cc353e2b3	Fine tuning of TC* macros - small followup As I replaced no-op TCR_4 with actual code, compiler complained while building debug build. This patch moves 'cast to int' to the correct place. Extension to Differential Revision: http://reviews.llvm.org/D19880 llvm-svn: 271377	2016-06-01 09:59:26 +00:00
Paul Osmialowski	f7cc6affdb	Use C++11 atomics for ticket locks implementation This patch replaces use of compiler builtin atomics with C++11 atomics for ticket locks implementation. Ticket locks are used in critical places of the runtime, e.g. in the tasking mechanism. The main reason this change was introduced is the problem with work stealing function on ARM architecture which suffered from nasty race condition. It turned out that the root cause of the problem lies in the way ticket locks are implemented. Changing compiler builtins into C++11 atomics solves the problem. Two assertions were added into kmp_tasking.c which are useful for detecting early symptoms of something wrong going on with work stealing, which were among the possible outcomes of the race condition. Differential Revision: http://reviews.llvm.org/D19878 llvm-svn: 271324	2016-05-31 20:20:32 +00:00
Jonathan Peyton	ef7347994e	Addition of OpenMP 4.5 feature: schedule(simd:static) This patch implements the new kmp_sch_static_balanced_chunked schedule kind that the compiler will generate when it encounters schedule(simd: static). It just adds the new constant and the new switch case __kmp_for_static_init. Patch by Alex Duran. Differential Revision: http://reviews.llvm.org/D20699 llvm-svn: 271320	2016-05-31 19:12:18 +00:00
Jonathan Peyton	f4f969569d	Avoid deadlock with COI When an asynchronous offload task is completed, COI calls the runtime to queue a "destructor task". When the task deques are full, a deadlock arises where the OpenMP threads are inside but cannot progress because the COI thread is stuck inside the runtime trying to find a slot in a deque. This patch implements the solution where the task deques are doubled in size when a task is being queued from a COI thread. Differential Revision: http://reviews.llvm.org/D20733 llvm-svn: 271319	2016-05-31 19:07:00 +00:00
Jonathan Peyton	067325f935	Offer API for setting number of loop dispatch buffers The problem is the lack of dispatch buffers when thousands of loops with nowait, about 10 iterations each, are executed by hundreds of threads. We only have built-in 7 dispatch buffers, but there is a need in dozens or hundreds of buffers. The problem can be fixed by setting KMP_MAX_DISP_BUF to bigger value. In order to give users same possibility I changed build-time control into run-time one, adding API just in case. This change adds an environment variable KMP_DISP_NUM_BUFFERS and a new API function kmp_set_disp_num_buffers(int num_buffers). The KMP_DISP_NUM_BUFFERS envirable works only before serial initialization, because during the serial initialization we already allocate buffers for the hot team, so it is too late to change the number of buffers later (or we need to reallocate buffers for all teams which sounds too complicated). The kmp_set_defaults() routine does not work for this envirable, because it calls serial initialization before reading the parameter string. So a new routine, kmp_set_disp_num_buffers(), is created so that it can set our internal global variable before the library initialization. If both the envirable and API used the envirable wins. Differential Revision: http://reviews.llvm.org/D20697 llvm-svn: 271318	2016-05-31 19:01:15 +00:00
Hal Finkel	49bee007d0	Fix storing the frame pointer for OMP-T during ppc64 microtask dispatch Thanks to John Mellor-Crummey for reporting the omission. llvm-svn: 271035	2016-05-27 19:04:05 +00:00
Jonathan Peyton	50eae7f8b2	Add missing OpenMP 4.5 device entries to stubs library. llvm-svn: 271006	2016-05-27 15:51:14 +00:00
Jonathan Peyton	7ba9baef6d	Fix for OMP_PROC_BIND=spread strategy The OMP_PROC_BIND=spread strategy fails to assign the master thread the correct place partition after the first parallel region. Other threads in the hot team will remember their place_partition, but the master's place partition is restored to what it was before entering the parallel region. So when the hot team is used for subsequent parallel regions, the master has lost this info. This fix calls __kmp_partition_places to update only the master thread's place partition in the spread case when there are no other changes to the hot team. Patch by Terry Wilmarth Differential Revision: http://reviews.llvm.org/D20539 llvm-svn: 270890	2016-05-26 19:09:46 +00:00
Jonathan Peyton	7abf9d5927	Make LIBOMP_USE_ITT_NOTIFY a setting that can be enabled or disabled On Blue Gene/Q, having LIBOMP_USE_ITT_NOTIFY support compiled into a statically-linked binary causes a failure at runtime because dlopen fails. This patch changes LIBOMP_USE_ITT_NOTIFY to a cacheable configuration setting that can be disabled. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D20517 llvm-svn: 270884	2016-05-26 18:19:10 +00:00
Hal Finkel	0a665a83da	Add a test case for microtask dispatch with many arguments This is a cleaned-up version of the test case posted in the D19879 review. llvm-svn: 270867	2016-05-26 16:34:05 +00:00
Hal Finkel	91e19a3de4	Add an assembly __kmp_invoke_microtask for ppc64[le] Clang no longer restricts itself to generating microtasks with a small number of arguments, and so an assembly implementation is required to prevent hitting the parameter limit present in the C implementation. This adds an implementation for ppc64[le]. llvm-svn: 270821	2016-05-26 04:48:14 +00:00
Andrey Churbanov	2fd1654278	D20525: Use more general function for getting gtid which may be faster than specific one. llvm-svn: 270694	2016-05-25 12:53:17 +00:00
Jonathan Peyton	b044e4fa31	Fork performance improvements Most of this is modifications to check for differences before updating data fields in team struct. There is also some rearrangement of the team struct. Patch by Diego Caballero Differential Revision: http://reviews.llvm.org/D20487 llvm-svn: 270468	2016-05-23 18:01:19 +00:00
Jonathan Peyton	1ab887d403	Allow unit testing on Windows These changes allow testing on Windows using clang.exe. There are two main changes: 1. Only link to -lm when it actually exists on the system 2. Create basic versions of pthread_create() and pthread_join() for windows. They are not POSIX compliant by any stretch but will allow any existing and future tests to use pthread_create() and pthread_join() for testing interactions of libomp with os threads. Differential Revision: http://reviews.llvm.org/D20391 llvm-svn: 270464	2016-05-23 17:50:32 +00:00
Jonathan Peyton	b2b6d4e2e1	Changed parameter names in Fortran modules to correspond with OpenMP 4.5 specification llvm-svn: 270447	2016-05-23 16:24:39 +00:00
Jonathan Peyton	611184919f	Remove trailing whitespace in src/ directory This patch doesn't affect D19878's context. So D19878 still cleanly applies. llvm-svn: 270252	2016-05-20 19:03:38 +00:00
Jonathan Peyton	aa7d2d781b	Remove unnecessary unistd.h header from tests. llvm-svn: 269987	2016-05-18 21:36:34 +00:00
Jonathan Peyton	096ccdd389	Remove trailing whitespace in files in doc/ directory llvm-svn: 269842	2016-05-17 21:12:48 +00:00
Jonathan Peyton	3731076997	Remove trailing whitespace from tests llvm-svn: 269841	2016-05-17 21:08:52 +00:00
Jonathan Peyton	0c3a85a327	Remove trailing whitespace in files in tools/ directory llvm-svn: 269837	2016-05-17 20:54:10 +00:00
Jonathan Peyton	975dabc96e	Remove trailing whitespace in CMake files llvm-svn: 269836	2016-05-17 20:51:24 +00:00
Jonathan Peyton	924a6627ea	Remove trailing whitespace in READMEs, CREDITS.txt and index.html llvm-svn: 269835	2016-05-17 20:48:42 +00:00
Jonathan Peyton	18b61707e8	Update copyright year in LICENSE.txt llvm-svn: 269826	2016-05-17 20:11:26 +00:00
Jonathan Peyton	0e8f053023	[OpenMP Testing] Have lit.py be a valid lit executable Users can use either llvm-lit (generated during llvm build) or lit.py which exists in llvm/utils/lit. llvm-svn: 269774	2016-05-17 15:12:11 +00:00
Paul Osmialowski	fb043fdfff	Clean all the mess around KMP_USE_FUTEX and kmp_lock.h The KMP_USE_FUTEX preprocessor definition from kmp_lock.h is used inconsistently throughout the LLVM libomp code: * some .c files that use this define do not include kmp_lock.h, so the guarded parts of the code are never compiled; * some places use architecture-dependent preprocessor expressions which effectively disable the use of futex on AArch64; all these places should use '#if KMP_USE_FUTEX' instead to avoid further confusion; * some places use KMP_HAS_FUTEX, which is defined nowhere; KMP_USE_FUTEX should be used instead. Differential Revision: http://reviews.llvm.org/D19629 llvm-svn: 269642	2016-05-16 09:44:11 +00:00
Paul Osmialowski	97ae10c67c	NFC fix indent (relates to my previous commit) llvm-svn: 269443	2016-05-13 17:45:49 +00:00
Paul Osmialowski	7e5e8684fb	Solve 'Too many args to microtask' problem This patch solves the 'Too many args to microtask' problem, which occurs while executing the lulesh2.0.3 benchmark on AArch64. To solve this I had to write an AArch64 assembly version of the __kmp_invoke_microtask() function, similar to the x86 and x86_64 implementations. Differential Revision: http://reviews.llvm.org/D19879 llvm-svn: 269399	2016-05-13 08:26:42 +00:00
Jonathan Peyton	f83ae31caf	Adding new kmp_aligned_malloc() entry point This change adds a new entry point, kmp_aligned_malloc(size_t size, size_t alignment), an entry point corresponding to kmp_malloc() but with the capability to return aligned memory as well. Other allocator routines have been adjusted so that kmp_free() can be used for freeing memory blocks allocated by any kmp_*alloc() routine, including the new kmp_aligned_malloc() routine. Differential Revision: http://reviews.llvm.org/D19814 llvm-svn: 269365	2016-05-12 22:00:37 +00:00
Jonathan Peyton	2b749b33cc	Fix team reuse with foreign threads After hot teams were enabled by default, the library started using levels kept in the team structure. The levels are broken in case foreign thread exits and puts its team into the pool which is then re-used by another foreign thread. The broken behavior observed is when printing the levels for each new team, one gets 1, 2, 1, 2, 1, 2, etc. This makes the library believe that every other team is nested which is incorrect. What is wanted is for the levels to be 1, 1, 1, etc. Differential Revision: http://reviews.llvm.org/D19980 llvm-svn: 269363	2016-05-12 21:54:30 +00:00
Paul Osmialowski	562a3c2b66	New hwloc API compatibility Differential Revision: http://reviews.llvm.org/D19628 llvm-svn: 269284	2016-05-12 11:46:40 +00:00
Hal Finkel	55acbf8877	Restore NULL flag check in __kmp_null_resume_wrapper This reverts a presumably unintentional change in: r268640 - [STATS] Use partitioned timer scheme and fixes segfaults in an x86_64 debug build of the runtime library. llvm-svn: 269259	2016-05-12 00:54:08 +00:00
Paul Osmialowski	52bef53f86	Fine tuning of TC* macros This patch introduces following: * TCI_* and TCD_* macros for incrementation and decrementation * Fix for invalid use of TCR_8 in one expression Differential Revision: http://reviews.llvm.org/D19880 llvm-svn: 268826	2016-05-07 00:00:00 +00:00
Jonathan Peyton	11dc82fa83	[STATS] Use partitioned timer scheme This change replaces the current timers with ones that partition time properly. The current timers are nested, so that if a new timer, B, starts when the current timer, A, is already timing, A's time will include B's. To eliminate this problem, the partitioned timers are designed to stop the current timer (A), let the new timer run (B), and when the new timer is finished, restart the previously running timer (A). With this partitioning of time, a thread's timers all sum up to the OMP_worker_thread_life time and can now easily show the percentage of time a thread is spending in different parts of the runtime or user code. There is also a new state variable associated with each thread which tells where it is executing a task. This corresponds with the timers: OMP_task_, e.g., if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct. The changes mostly consist of switching the macros to the new PARTITIONED_ macros, the new partitionedTimers class and its methods, and new state logic. Differential Revision: http://reviews.llvm.org/D19229 llvm-svn: 268640	2016-05-05 16:15:57 +00:00
Paul Osmialowski	fedce46bbd	NFC remove unneeded spaces (test commit) llvm-svn: 268462	2016-05-03 23:10:20 +00:00
Jonathan Peyton	8407f5b3bd	Remove architecture dependent Hwloc DEBUG section This debug section's functionality can be replicated using the environment variable KMP_TOPOLOGY_METHOD with different values and KMP_AFFINITY=verbose. llvm-svn: 267472	2016-04-25 21:11:26 +00:00
Jonathan Peyton	1d5487c5d0	Fix buffer problem with printing long Hwloc affinity mask This change has the hwloc_bitmap_list_snprintf() function use the entire buffer to print the mask. There is no need to shorten the buffer length by 7. It only needs to be shortened by one byte. llvm-svn: 267470	2016-04-25 21:08:31 +00:00
Jonathan Peyton	b1467d1ef0	ARM Limited license agreement from the copyright/patent holder I have prepared some patches for LLVM OpenMP runtime, mostly addressing ARMv8 support. Before I upstream them, I must address legal issues that arose around my planned contribution. I was advised that before I send any substantial commit, I need to make sure that LICENSE.txt file in the projects repository contains a statement submitted by ARM, similar to the one provided by Intel (see "a license agreement from the copyright/patent holders"). This is the same situation as with top-level LLVM project: ARM has provided the same statement in http://llvm.org/svn/llvm-project/llvm/trunk/lib/Target/ARM/LICENSE.TXT file. Patch by Paul Osmialowski Differential Revision: http://reviews.llvm.org/D19319 llvm-svn: 267446	2016-04-25 19:12:20 +00:00
Jonathan Peyton	a1202bf594	[ITTNOTIFY] Remove serialized parallel regions from frame notification llvm-svn: 266760	2016-04-19 16:55:17 +00:00
Jonathan Peyton	5235a1b603	Fix trip count calculation for parallel loops in runtime The trip count calculation was incorrect for loops with large bounds. For example, for(int i=-2,000,000,000; i < 2,000,000,000; i+=50000000), the trip count calculation had overflow (trying to calculate 2,000,000,000 + 2,000,000,000 with signed integers) and wasn't giving the right value. This patch fixes this error in the runtime by using unsigned integers instead. There is still a bug in the clang compiler component because it warns that there is overflow in the test case file when there isn't. This error isn't there for the Intel Compiler. So for now, the test case is designated as XFAIL. Differential Revision: http://reviews.llvm.org/D19078 llvm-svn: 266677	2016-04-18 21:38:29 +00:00
Jonathan Peyton	e6643daa18	Runtime support for untied tasks Introduced a counter of parts of an untied task submitted for execution. The counter controls whether all parts of the task are already finished. The compiler should generate re-submission of partially executed untied task by itself before exiting of each task part except for the lexical last part. Differential Revision: http://reviews.llvm.org/D19026 llvm-svn: 266675	2016-04-18 21:35:14 +00:00
Jonathan Peyton	f252010f69	Fix for pthread_setspecific (TLS and shutdown) problem Some codes that use TLS fail intermittently because one thread tries to write TLS values after the TLS key has been destroyed by another thread. This happens when one thread executes library shutdown (and destroys TLS keys), while another thread starts to execute the TLS key destructor routine. Before this change, the kmp_init_runtime flag was checked before calling pthread_* TLS functions, but this flag is set to FALSE later than the destruction of the TLS keys, which leads to failure. The fix is to check kmp_init_gtid instead, as this flag is unset before the destruction of TLS keys. Differential Revision: http://reviews.llvm.org/D19022 llvm-svn: 266674	2016-04-18 21:33:01 +00:00
Jonathan Peyton	e2289a427d	[STATS] Remove timePair class and unused functions llvm-svn: 266634	2016-04-18 17:27:30 +00:00
Jonathan Peyton	53eca5216e	[STATS] print Total_* stats on their own line llvm-svn: 266633	2016-04-18 17:24:20 +00:00
Jonathan Peyton	99ef4d0433	[ITTNOTIFY] Correct barrier imbalance time in case of tasks ittnotify fix for barrier imbalance time in case tasks exist. In the current implementation, task execution time is included into aggregated time on a barrier. This fix calculates task execution time and corrects the arrive time by subtracting the task execution time. Since __kmp_invoke_task() can not only be called on a barrier, the field th.th_bar_arrive_time is used to check if the function was called at the barrier (th.th_bar_arrive_time != 0). So for this check, th_bar_arrive_time is set to zero right after the value is used on the barrier. Differential Revision: http://reviews.llvm.org/D19030 llvm-svn: 266332	2016-04-14 16:06:49 +00:00
Jonathan Peyton	377aa40d84	Exponential back off logic for test-and-set lock This change adds back off logic in the test and set lock for better contended lock performance. It uses a simple truncated binary exponential back off function. The default back off parameters are tuned for x86. The main back off logic has a two loop structure where each is controlled by a user-level parameter: max_backoff - limits the outer loop number of iterations. This parameter should be a power of 2. min_ticks - the inner spin wait loop number of "ticks" which is system dependent and should be tuned for your system if you so choose. The "ticks" on x86 correspond to the time stamp counter, but on other architectures ticks is a timestamp derived from gettimeofday(). The user can modify these via the environment variable: KMP_SPIN_BACKOFF_PARAMS=max_backoff[,min_ticks] Currently, since the default user lock is a queuing lock, one would have to also specify KMP_LOCK_KIND=tas to use the test-and-set locks. Differential Revision: http://reviews.llvm.org/D19020 llvm-svn: 266329	2016-04-14 16:00:37 +00:00
Jonathan Peyton	2e379fc767	Add declarations of OpenMP 4.5 target/offload routines to headers All these routines are implemented in the offload library. llvm-svn: 266120	2016-04-12 20:37:18 +00:00
Jonathan Peyton	072772bf05	[STATS] Remove trailing whitespace in stats source files llvm-svn: 265437	2016-04-05 18:48:48 +00:00
Jonathan Peyton	50e8f18b52	OMP_WAIT_POLICY changes This change has OMP_WAIT_POLICY=active to mean that threads will busy-wait in spin loops and virtually never go to sleep. OMP_WAIT_POLICY=passive now means that threads will immediately go to sleep inside a spin loop. KMP_BLOCKTIME was the previous mechanism to specify this behavior via KMP_BLOCKTIME=0 or KMP_BLOCKTIME=infinite, but the standard OpenMP environment variable should also be able to specify this behavior. Differential Revision: http://reviews.llvm.org/D18577 llvm-svn: 265339	2016-04-04 19:38:32 +00:00
Jonathan Peyton	1d46d979a9	Fix bug when KMP_USE_ADAPTIVE_LOCKS is 0 #endif was one line too low. If KMP_USE_ADAPTIVE_LOCKS is 0, then queuing locks would incorrectly use drdpa lock mechanism. This is a fix for https://llvm.org/bugs/show_bug.cgi?id=26649 llvm-svn: 264934	2016-03-30 21:50:59 +00:00
Jonathan Peyton	4cfe93c599	Fix comment in kmp_wait_release.h Removed reference to "ref ct" in a comment, as ref_ct no longer exists. Also moved the comment to where the task_team is about to be tested if NULL. llvm-svn: 264786	2016-03-29 21:08:29 +00:00
Jonathan Peyton	ee2f96c79b	Fix incorrect indentation in kmp_alloc.c llvm-svn: 264777	2016-03-29 20:10:00 +00:00
Jonathan Peyton	a58563d8c9	Remove dead KMP_USE_POOLED_ALLOC code llvm-svn: 264776	2016-03-29 20:05:27 +00:00
Jonathan Peyton	316af8de48	[STATS] Missing check for MIC in config-ix.cmake llvm-svn: 264616	2016-03-28 18:53:10 +00:00
Hal Finkel	01bb2406a3	Fixing the non-x86 build by removing dependence on kmp_cpuid_t The problem is that the definition of kmp_cpuinfo_t contains: char name [3*sizeof (kmp_cpuid_t)]; // CPUID(0x80000002,0x80000003,0x80000004) and kmp_cpuid_t is only defined when compiling for x86. Differential Revision: http://reviews.llvm.org/D18245 llvm-svn: 264535	2016-03-27 13:24:09 +00:00
Jonas Hahnfeld	e46a494a50	[OMPT] Fix parallel_id and task_id in loop_end with schedule static For serialized parallel regions, wrong ids were reported. Now the same code is used as in kmp_dispatch.cpp which emits the correct ids. Differential Revision: http://reviews.llvm.org/D18348 llvm-svn: 264266	2016-03-24 12:52:20 +00:00
Jonas Hahnfeld	801fe9bbe2	[OMPT] Test ids reported by ompt_get_{parallel,task}_id llvm-svn: 264265	2016-03-24 12:52:11 +00:00
Jonas Hahnfeld	1c1c71776a	[OMPT] Fix duplicate implicit_task_end events for master thread with GCC For non-serialized parallel regions the master thread issued two callbacks: The first one in kmp_gsupport.c and the second in __kmp_join_call. Therefore only trigger the callback in kmp_gsupport.c for serialized parallel regions. Differential Revision: http://reviews.llvm.org/D16716 llvm-svn: 264264	2016-03-24 12:52:04 +00:00
Jonathan Peyton	b7d30cbc7e	Fix Visual Studio builds Have Visual Studio use MemoryBarrier() instead of _mm_mfence() and remove __declspec align attribute from function parameters in kmp_atomic.h llvm-svn: 264166	2016-03-23 16:27:25 +00:00
Jonas Hahnfeld	b1cad2954b	[OMPT] Make tests require OMPT_BLAME ompt_event_barrier_{begin,end} are optional blame events. In total it doesn't make any sense to test partially built OMPT support. llvm-svn: 264031	2016-03-22 08:23:24 +00:00
Jonas Hahnfeld	c804301113	[OMPT] Create infrastructure and add first tests for OMPT Some basic checks next to the implementation should further lower the possibility to introduce regressions. (Note that this would have caught the ordering issue fixed in rL258866 and pointed to rL263940.) The tests are implementation dependent in one point because they assume that thread ids are assigned in ascending order. This is not defined by the standard but currently ensured in libomp. We have to think about another way of ordering the threads should this ever be subject to change... Note that this isn't aiming at replacing the implementation independent test-suite at https://github.com/OpenMPToolsInterface/ompt-test-suite! Differential Revision: http://reviews.llvm.org/D16715 llvm-svn: 264027	2016-03-22 07:22:49 +00:00
Jonathan Peyton	93a879ce78	[STATS] Add OMP_critical and OMP_critical_wait timers OMP_critical - time spent in critical section OMP_critical_wait - time spent waiting to enter a critical section llvm-svn: 263967	2016-03-21 18:32:26 +00:00
Jonathan Peyton	97cbb42d90	[STATS] separate noTotal bit flag from onlyInMaster and noUnits This change logically separates the stats_flags_e::noTotal bit flag from the stats_flags_e::onlyInMaster and stats_flags_e::noUnits bit flags. If no TOTAL_foo output is wanted for a particular statistic, the flag must be explicitly included in that statistic's flags. Differential Revision: http://reviews.llvm.org/D18198 llvm-svn: 263954	2016-03-21 17:26:23 +00:00
Jonas Hahnfeld	6c250b714c	[OMPT] Fix wrong parent_task_id in serialized parallel_begin with GCC Without this patch a simple '#pragma omp parallel num_threads(1)' leads to ompt_event_parallel_begin: parent_task_id=3, [...], parallel_id=2, [...] ompt_event_parallel_end: parallel_id=2, task_id=4, [...] Differential Revision: http://reviews.llvm.org/D16714 llvm-svn: 263940	2016-03-21 12:37:52 +00:00
Jonathan Peyton	b5969ca42d	Update www/index.html to reflect current status of OpenMP project llvm-svn: 263788	2016-03-18 14:50:01 +00:00
Jonathan Peyton	8a46c067ed	[CMake] Fix Windows build problem for CMake versions < 3.3 Building libomp using CMake versions < 3.3 caused a link time error. These errors occurred because when assembling z_Windows_NT-586_asm.asm, the definitions: OMPT_SUPPORT, _M_AMD64|_M_IA32 weren't defined on the command line. To fix the problem, the COMPILE_FLAGS property for the assembly file is appended to instead of the COMPILE_DEFINITIONS property being set. For whatever reason, the COMPILE_DEFINITIONS property doesn't pick up the definitions for assembly files for the older CMake versions. llvm-svn: 263651	2016-03-16 18:44:18 +00:00
Jonathan Peyton	4240055ac8	Fix spelling error in comment llvm-svn: 263586	2016-03-15 20:59:10 +00:00
Jonathan Peyton	20c1e4e69d	[STATS] Print "Unknown" for frequency if it wasn't able to be parsed llvm-svn: 263583	2016-03-15 20:55:32 +00:00
Jonathan Peyton	226dcd3243	[STATS] Fix comments in kmp_stats.h llvm-svn: 263582	2016-03-15 20:49:01 +00:00
Jonathan Peyton	6e98d7988b	[STATS] Add header information to stats print out This change adds a header to the printout of the statistics which includes the time, machine name, and processor info if available. This change also includes some cosmetic changes like using enum casting for timer and counter iteration. Differential Revision: http://reviews.llvm.org/D18153 llvm-svn: 263580	2016-03-15 20:28:47 +00:00
Samuel Antao	11e4c539f4	Initialize two variables in kmp_tasking. Summary: Two uninitialized local variables are causing clang to produce warnings: ``` ./src/projects/openmp/runtime/src/kmp_tasking.c:3019:5: error: variable 'num_tasks' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized] default: ^~~~~~~ ./src/projects/openmp/runtime/src/kmp_tasking.c:3027:21: note: uninitialized use occurs here for( i = 0; i < num_tasks; ++i ) { ^~~~~~~~~ ./src/projects/openmp/runtime/src/kmp_tasking.c:2968:28: note: initialize the variable 'num_tasks' to silence this warning kmp_uint64 i, num_tasks, extras; ^ = 0 ./src/projects/openmp/runtime/src/kmp_tasking.c:3019:5: error: variable 'extras' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized] default: ^~~~~~~ ./src/projects/openmp/runtime/src/kmp_tasking.c:3022:52: note: uninitialized use occurs here KMP_DEBUG_ASSERT(tc == num_tasks * grainsize + extras); ^~~~~~ ./src/projects/openmp/runtime/src/kmp_debug.h:62:60: note: expanded from macro 'KMP_DEBUG_ASSERT' #define KMP_DEBUG_ASSERT( cond ) KMP_ASSERT( cond ) ^ ./src/projects/openmp/runtime/src/kmp_debug.h:60:51: note: expanded from macro 'KMP_ASSERT' #define KMP_ASSERT( cond ) ( (cond) ? 0 : __kmp_debug_assert( #cond, __FILE__, __LINE__ ) ) ^ ./src/projects/openmp/runtime/src/kmp_tasking.c:2968:36: note: initialize the variable 'extras' to silence this warning kmp_uint64 i, num_tasks, extras; ^ = 0 2 errors generated. ``` This patch initializes these two variables. Reviewers: tlwilmar, jlpeyton Subscribers: tlwilmar, openmp-commits Differential Revision: http://reviews.llvm.org/D17909 llvm-svn: 263316	2016-03-12 00:55:17 +00:00
Jonathan Peyton	495e153ff9	[STATS] change TASK_execution name to OMP_task llvm-svn: 263291	2016-03-11 20:23:05 +00:00
Jonathan Peyton	e2554af857	[STATS] Add a total statistics count This change removes synthesized stats and instead has all timers print out a total which is the aggregate statistics across threads. This is displayed as "Total_foo" at the end of program. The stats_flags_e::synthesized flag is removed and the printStats() function is split into two separate functions: printTimerStats() which can display the aggregate total and printCounterStats(). Differential Revision: http://reviews.llvm.org/D17869 llvm-svn: 263290	2016-03-11 20:20:49 +00:00
Jonathan Peyton	c1a7c97c1b	[STATS] fix output formatting when sample count is 0 Force 0.0 to be displayed for all statistics which have sample count equal to 0 llvm-svn: 262658	2016-03-03 21:24:13 +00:00
Jonathan Peyton	30138256fa	[STATS] fix master and single timers Only the thread which executes the single/master section will update its statistics. llvm-svn: 262656	2016-03-03 21:21:05 +00:00
Jonathan Peyton	283a215c7a	Add new OpenMP 4.5 taskloop construct feature From the standard: The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed. This initial implementation uses a simple linear tasks distribution algorithm. Later we can add other algorithms to speedup generation of huge number of tasks (i.e., tree-like tasks generation should be faster). This needs to be put into the OpenMP runtime library in order for the compiler team to develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17404 llvm-svn: 262535	2016-03-02 22:47:51 +00:00
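The "simple linear tasks distribution" mentioned above can be pictured with a small sketch. It is not the runtime's code: it only assumes the invariant `tc == num_tasks * grainsize + extras` that appears in the kmp_tasking.c assertion quoted elsewhere in this log, and it further assumes that the leftover `extras` iterations go one each to the first tasks. The names `sketch_chunk` and `sketch_taskloop_chunk` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper type: the half-open iteration range
   [lower, lower + size) handed to one task. */
typedef struct {
    uint64_t lower;
    uint64_t size;
} sketch_chunk;

/* Split `tc` iterations linearly over `num_tasks` tasks (num_tasks > 0).
   Assumption for illustration only: the first `extras` tasks each take
   one additional iteration, so that
   tc == num_tasks * grainsize + extras holds. */
static sketch_chunk sketch_taskloop_chunk(uint64_t tc, uint64_t num_tasks,
                                          uint64_t task)
{
    uint64_t grainsize = tc / num_tasks;
    uint64_t extras = tc % num_tasks;
    sketch_chunk c;

    c.size = grainsize + (task < extras ? 1u : 0u);
    /* Every earlier task contributed grainsize iterations, plus one more
       for each earlier task that received an extra iteration. */
    c.lower = task * grainsize + (task < extras ? task : extras);
    return c;
}
```

With 10 iterations and 4 tasks this yields chunk sizes 3, 3, 2, 2 that start at iterations 0, 3, 6 and 8.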
Jonathan Peyton	a0d7a2cd3f	Forgot to add test files for doacross and task priority. llvm-svn: 262533	2016-03-02 22:43:14 +00:00
Jonathan Peyton	71909c57ca	Add new OpenMP 4.5 doacross loop nest feature From the standard: A doacross loop nest is a loop nest that has cross-iteration dependence. An iteration is dependent on one or more lexicographically earlier iterations. The ordered clause parameter on a loop directive identifies the loop(s) associated with the doacross loop nest. The init/fini routines allocate/free doacross buffer(s) for each loop for each thread. The wait routine waits for a flag designated by the dependence vector. The post routine sets the flag designated by current iteration vector. We use a similar technique of shared buffer indices that covers up to 7 nowait loops executed simultaneously by different threads (number 7 has no real meaning, just heuristic value). Also, the size of structures are kept intact via reducing dummy arrays. This needs to be put into the OpenMP runtime library in order for the compiler team to develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17399 llvm-svn: 262532	2016-03-02 22:42:06 +00:00
Jonathan Peyton	2f7c077b5a	Add new OpenMP 4.5 affinity API This change introduces the new OpenMP 4.5 affinity api surrounding OpenMP Places. There are six new entry points: Typically called in serial region: * omp_get_num_places - returns the number of places available to the execution environment in the place list. * omp_get_place_num_procs - returns the number of processors available to the execution environment in the specified place. * omp_get_place_proc_ids - returns the numerical identifiers of the processors available to the execution environment in the specified place. Typically called inside parallel region: * omp_get_place_num - returns the place number of the place to which the encountering thread is bound. * omp_get_partition_num_places - returns the number of places in the place partition of the innermost implicit task. * omp_get_partition_place_nums - returns the list of place numbers corresponding to the places in the place-var ICV of the innermost implicit task. Differential Revision: http://reviews.llvm.org/D17417 llvm-svn: 261915	2016-02-25 18:49:52 +00:00
Jonathan Peyton	2851072d69	Add initial support for OpenMP 4.5 task priority feature The maximum task priority value is read from the environment variable OMP_MAX_TASK_PRIORITY. But as of now, nothing is done with it. We just handle the environment variable and add the new api: omp_get_max_task_priority() which returns that value or zero if it is not set. Differential Revision: http://reviews.llvm.org/D17411 llvm-svn: 261908	2016-02-25 18:04:09 +00:00
Jonathan Peyton	ea0fe1dfeb	Add new OpenMP 4.5 schedule clause modifiers (monotonic/non-monotonic) feature The monotonic/non-monotonic flags are sent to the runtime via the sched_type by setting the 30th (non-monotonic) or 29th (monotonic) bit in the sched_type. Macros are added to probe if monotonic or non-monotonic is specified (SCHEDULE_HAS_[NON]MONOTONIC & SCHEDULE_HAS_NO_MODIFIERS) and also to get the base sched_type (SCHEDULE_WITHOUT_MODIFIERS) Currently, nothing is done with the modifiers. Also, this patch adds some comments on the use of the enumerations in at least one place where it is subtle. Differential Revision: http://reviews.llvm.org/D17406 llvm-svn: 261906	2016-02-25 17:55:50 +00:00
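A minimal sketch of how modifier bits can ride along in a schedule value, in the spirit of the macros named above. The `SKETCH_*` names are hypothetical stand-ins, not the runtime's definitions, and the sketch assumes that "29th" and "30th bit" mean `1 << 29` and `1 << 30`.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the modifier bits described in the commit
   message; assumes bit 29 = monotonic and bit 30 = non-monotonic. */
#define SKETCH_MONOTONIC_BIT    (UINT32_C(1) << 29)
#define SKETCH_NONMONOTONIC_BIT (UINT32_C(1) << 30)
#define SKETCH_MODIFIER_MASK    (SKETCH_MONOTONIC_BIT | SKETCH_NONMONOTONIC_BIT)

/* Probe whether the monotonic modifier is set. */
static int sketch_has_monotonic(uint32_t sched)
{
    return (sched & SKETCH_MONOTONIC_BIT) != 0;
}

/* Probe whether the non-monotonic modifier is set. */
static int sketch_has_nonmonotonic(uint32_t sched)
{
    return (sched & SKETCH_NONMONOTONIC_BIT) != 0;
}

/* True when neither modifier bit is present. */
static int sketch_has_no_modifiers(uint32_t sched)
{
    return (sched & SKETCH_MODIFIER_MASK) == 0;
}

/* Strip both modifier bits to recover the base schedule value. */
static uint32_t sketch_without_modifiers(uint32_t sched)
{
    return sched & ~SKETCH_MODIFIER_MASK;
}
```

Keeping the modifiers in otherwise unused high bits lets one integer carry both the base schedule kind and its modifiers, at the cost of having to mask before comparing against a base schedule value.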
Jonathan Peyton	95c95c350e	Remove unnecessary semicolons after braces llvm-svn: 261249	2016-02-18 19:38:25 +00:00
Jonas Hahnfeld	867aa20b1e	[OMPT] Frame information for openmp taskwait For pragma omp taskwait the runtime is called from the task context. Therefore, the reentry frame information should be updated. The information should be available for both taskwait event calls; therefore, set before the first event and reset after the last event. Patch by Joachim Protze Differential Revision: http://reviews.llvm.org/D17145 llvm-svn: 260674	2016-02-12 12:19:59 +00:00
Jonathan Peyton	134f90d59f	Fix incorrect task_team in __kmp_give_task When a target task finishes and it tries to access the th_task_team from the threads in the team where it was created, th_task_team can be NULL or point to a different place when that thread started a nested region that is still running. Finding the exact task_team that the threads were using is difficult as it would require to unwind the task_state_memo_stack. So a new field was added in the taskdata structure to point to the active task_team when the task was created. llvm-svn: 260615	2016-02-11 23:07:30 +00:00
Jonathan Peyton	ff684e4b9e	Fix a couple of typos in comments llvm-svn: 260613	2016-02-11 22:58:29 +00:00
Jonathan Peyton	d3f2b94d97	Proxy task fix: task_state stack push condition on fork The problem is that the master's thread state was not saved before entering a parallel region so it does not remember tasks when it returns. llvm-svn: 260306	2016-02-09 22:32:41 +00:00
Jonathan Peyton	89d9b333b0	Have Mac builds use @rpath when supported in CMake The -install_name linker flag will use "@rpath/" when supported in CMake which is the recommended usage for dynamic libraries on Mac OSX. llvm-svn: 260300	2016-02-09 22:15:30 +00:00
Jonas Hahnfeld	9dffeff894	[GCC] GOMP_task: Change argument type of if_cond from int to bool (libgomp has bool as well) This was causing a test failure in omp_test_if.c when building with GCC in Debug mode. I have verified that GCC versions 4.9.2 and 5.3.0 now work and compile-tested this change with clang 3.7.1 and Intel Compiler 16.0. Differential Revision: http://reviews.llvm.org/D16921 llvm-svn: 260204	2016-02-09 07:07:30 +00:00
Jonas Hahnfeld	66594990b1	[CMake] Introduce OPENMP_LLVM_TOOLS_DIR This will be used in a later patch to find additional LLVM tools for tests and enables reusability for libomptarget that is currently under review. Differential Revision: http://reviews.llvm.org/D16713 llvm-svn: 259876	2016-02-05 07:00:13 +00:00
Jonathan Peyton	fd74f90072	Add LIBOMP_ENABLE_SHARED option for CMake When building executables for Cray supercomputers, statically-linked executables are preferred. This patch makes it possible to build the OpenMP runtime as an archive for building statically-linked executables. The patch adds the flag LIBOMP_ENABLE_SHARED, which defaults to true. When true, a build of the OpenMP runtime yields dynamic libraries. When false, a build of the OpenMP runtime yields static libraries. There is no setting that allows both kinds of libraries to be built. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D16525 llvm-svn: 259817	2016-02-04 19:29:35 +00:00
Jonathan Peyton	7d45451a0d	Fix task dependency performance problem In: http://lists.llvm.org/pipermail/openmp-dev/2015-August/000858.html, a performance issue was found with libomp's task dependencies. The task dependencies hash table has an issue with collisions. The current table size is a power of two. This combined with the current hash function causes a large number of collisions to occur. Also, the current size (64) is too small for larger applications so the table size is increased. This patch creates a two level hash table approach for task dependencies. The implicit task is considered the "master" or "top-level" task which has a large static sized hash table (997), and nested tasks will have smaller hash tables (97). Prime numbers were chosen to help reduce collisions. Differential Revision: http://reviews.llvm.org/D16640 llvm-svn: 259113	2016-01-28 23:10:44 +00:00
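The effect of a power-of-two table size on regularly spaced addresses can be shown with a small sketch. The hash here (`sketch_bucket`) is a deliberately simple, hypothetical one and is not claimed to be libomp's hash function; only the table sizes 64, 97 and 997 come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash: drop the two low alignment bits, then reduce
   modulo the table size. Not the runtime's actual hash function. */
static size_t sketch_bucket(uintptr_t addr, size_t table_size)
{
    return (size_t)((addr >> 2) % table_size);
}

/* Count how many distinct buckets are hit by `count` addresses spaced
   `stride` bytes apart, starting at `base`. Supports table sizes up to
   1024 and returns 0 for unsupported sizes. */
static size_t sketch_buckets_used(uintptr_t base, size_t stride,
                                  size_t count, size_t table_size)
{
    unsigned char seen[1024] = {0};
    size_t used = 0;
    size_t i;

    if (table_size == 0 || table_size > sizeof seen)
        return 0;
    for (i = 0; i < count; ++i) {
        size_t b = sketch_bucket(base + i * stride, table_size);
        if (!seen[b]) {
            seen[b] = 1;
            ++used;
        }
    }
    return used;
}
```

Under these assumptions, 64 addresses spaced 256 bytes apart all fall into a single bucket of a 64-entry table, while the same addresses spread over 64 distinct buckets when the table size is 97 or 997, because the stride shares no factor with a prime table size.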
Jonas Hahnfeld	39b6862482	[OMPT] Add support for ompt_event_task_dependences and ompt_event_task_dependence_pair The attached patch adds support for ompt_event_task_dependences and ompt_event_task_dependence_pair events from the OMPT specification [1]. These events only apply to OpenMP 4.0 and 4.1 (aka 4.5) because task dependencies were introduced in 4.0. With respect to the changes: ompt_event_task_dependences According to the specification, this event is raised after the task has been created, therefore this event needs to be raised after ompt_event_task_begin (in __kmp_task_start). However, the dependencies are known at __kmpc_omp_task_with_deps which occurs before __kmp_task_start. My modifications extend the ompt_task_info_t struct in order to store the dependencies of the task when __kmpc_omp_task_with_deps occurs and then they are emitted in __kmp_task_start just after raising the ompt_event_task_begin. The deps field is allocated and valid until the event is raised and it is freed and set to null afterwards. ompt_event_task_dependence_pair The processing of the dependences (i.e. checking whether a dependence is already satisfied) is done within __kmp_process_deps. That function checks every dependence and calls the __kmp_track_dependence routine which gives some support for graphical output. I used that routine to emit the dependence pair but I also needed to know the sink_task. Despite the fact that the code within KMP_SUPPORT_GRAPH_OUTPUT refers to task_sink it may be null because sink->dn.task (there's a comment regarding this) and in fact it does not point to a proper pointer value because the value is set in node->dn.task = task; after the __kmp_process_deps calls in __kmp_check_deps. I have extended the __kmp_process_deps and __kmp_track_dependence parameter list to receive the sink_task.
[1] https://github.com/OpenMPToolsInterface/OMPT-Technical-Report/blob/target/ompt-tr.pdf Patch by Harald Servat Differential Revision: http://reviews.llvm.org/D14746 llvm-svn: 259038	2016-01-28 10:39:52 +00:00
Jonas Hahnfeld	dbf627dbd4	[OMPT] Avoid SEGV when a worker thread needs its parallel id behind the barrier When the code behind the barrier is executed, the master thread may have already resumed execution. That's why we cannot safely assume that *pteam is not yet freed. This has been introduced by r258866. llvm-svn: 259037	2016-01-28 10:39:45 +00:00
Jonas Hahnfeld	bba248c368	[OMPT] Workaround clang failing with 'declare target' Current clang trunk reports _OPENMP to be 201307 = OpenMP 4.0. It doesn't recognize '#pragma omp declare target' though (patch still pending) and therefore fails compilation. Differential Revision: http://reviews.llvm.org/D16631 llvm-svn: 259026	2016-01-28 07:14:44 +00:00
Jonathan Peyton	727ba6e843	Restore th_current_task first as suggested by John Mellor-Crummey If an asynchronous inquiry peers into the runtime system it doesn't see the freed task as the current task. llvm-svn: 258990	2016-01-27 21:20:26 +00:00
Jonathan Peyton	749b4d51ed	Formatting fixes Removing extraneous { } bracket sections. Unindenting blocks of code as a result. Also removing empty #ifdef KMP_STUB llvm-svn: 258986	2016-01-27 21:02:04 +00:00
Jonathan Peyton	bf0cc3a241	Fixing comments. Removing references to non-existent functions, fixing typos. llvm-svn: 258985	2016-01-27 20:57:32 +00:00
Jonathan Peyton	bf89c491c5	Removing extra empty lines llvm-svn: 258984	2016-01-27 20:44:49 +00:00
Jonas Hahnfeld	1473d5b546	Change whitespace to test commit access llvm-svn: 258910	2016-01-27 07:24:03 +00:00
Jonathan Peyton	b4c73d8d8a	[OMPT]: Fix the order of implicit_task_end_events For implicit barriers in simple parallel for loops, the order of the OMPT events was wrong. The barrier_{begin,end} events came after the implicit_task_end event for the implicit barrier at the end of the parallel region. This is wrong because the implicit task executes the barrier before ending. This patch fixes the order of the event: It will be triggered now just before __kmp_pop_current_task_from_thread() is called. Patch by Tim Cramer Differential Revision: http://reviews.llvm.org/D16347 llvm-svn: 258866	2016-01-26 21:45:21 +00:00
Jonathan Peyton	4c91ad1be7	Bypass Perl modules in build system This change fixes the bug: https://llvm.org/bugs/show_bug.cgi?id=25975 by bypassing the perl module files which try to deduce system information. These perl modules files don't offer useful information and are from the original build system. They can be removed after this change. llvm-svn: 258843	2016-01-26 19:44:31 +00:00
Ismail Donmez	c9655d9bd5	Fix compilations with msvc's /Zc:strictStrings llvm-svn: 258797	2016-01-26 08:24:57 +00:00
Andrey Churbanov	24d4eba0f9	omp_barrier.c test fixed to run reliably and faster on any number of processors llvm-svn: 258695	2016-01-25 16:52:10 +00:00
Jonathan Peyton	3bd88d4c15	Add missing cleanup code for cached indirect lock pool. This change fixes one issue reported at https://llvm.org/bugs/show_bug.cgi?id=26184 There was missing cleanup code for the cached indirect lock pool. The change will fix the reported case where it tries to initialize a lock after runtime cleanup/reinitialization, but it is still possible that the user program runs into another problem because most test programs have a call to __kmpc_set_lock after cleanup/reinitialization without calling __kmpc_init_lock causing a crash/hang. llvm-svn: 258528	2016-01-22 19:16:14 +00:00
Hans Wennborg	464307ffe7	lit.cfg: Pass -isysroot to the SDK on Darwin Newly-built Clangs don't automatically find the SDK, and newer versions of Mac OS X don't provide it under /usr/include etc. llvm-svn: 258169	2016-01-19 19:26:43 +00:00
Hans Wennborg	59162da0eb	Don't use __DATE__ or __TIME__; it breaks release builds (PR26145) The release builds are configured to be reproducible, so that the binaries compare equal between bootstrap iterations. The OpenMP run-time build was failing like this: runtime/src/kmp_version.c:108:79: error: expansion of date or time macro is not reproducible [-Werror,-Wdate-time] char const __kmp_version_build_time[] = KMP_VERSION_PREFIX "build time: " __DATE__ " " __TIME__; Figuring as the build currently doesn't set LIBOMP_DATE, it's probably OK to skip setting the build time here too. llvm-svn: 257833	2016-01-14 23:18:20 +00:00
Jonathan Peyton	3076fa4c35	New API for restoring current thread's affinity to init affinity of application This new API, int kmp_set_thread_affinity_mask_initial(), is available for use by other parallel runtime libraries inside a possibly OpenMP-registered thread. This entry point restores the current thread's affinity mask to the affinity mask of the application when it first began. If -1 is returned it can be assumed that either the thread hasn't called affinity initialization or that the thread isn't registered with the OpenMP library. If 0 is returned, then the call was successful. Any return value greater than zero indicates an error occurred when setting affinity. Differential Revision: http://reviews.llvm.org/D15867 llvm-svn: 257489	2016-01-12 17:21:55 +00:00
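The return-value convention described above can be summarized in a small caller-side sketch. The enum and the function `sketch_classify_affinity_rc` are hypothetical names introduced for illustration; the sketch does not call the runtime and only classifies a return code the way the commit message describes it. Values below -1 are not described in the message, so treating them as errors is an assumption.

```c
/* Hypothetical caller-side classification of the return value of
   kmp_set_thread_affinity_mask_initial(), per the commit message. */
typedef enum {
    SKETCH_AFFINITY_RESTORED,  /* 0: the initial mask was restored */
    SKETCH_AFFINITY_NOT_READY, /* -1: affinity not initialized, or the
                                  thread is not registered with OpenMP */
    SKETCH_AFFINITY_ERROR      /* > 0: setting affinity failed */
} sketch_affinity_status;

static sketch_affinity_status sketch_classify_affinity_rc(int rc)
{
    if (rc == 0)
        return SKETCH_AFFINITY_RESTORED;
    if (rc == -1)
        return SKETCH_AFFINITY_NOT_READY;
    /* Assumption: any other value, including the undocumented range
       below -1, is reported as an error. */
    return SKETCH_AFFINITY_ERROR;
}
```

A caller from another parallel runtime could use such a classification to decide whether to fall back to its own affinity handling when the thread is not yet known to the OpenMP library.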
Jonathan Peyton	f6498629db	Remove double negative in if() logic. Change (__kmp_mic_type != non_mic) to (__kmp_mic_type == mic2) llvm-svn: 257380	2016-01-11 20:37:39 +00:00
Jonathan Peyton	1a78c6322c	Put function names on their own line. llvm-svn: 257378	2016-01-11 20:28:55 +00:00
Jonathan Peyton	32a1ea1b7e	Removed unused __kmp_*_i8 functions. llvm-svn: 256790	2016-01-04 23:20:26 +00:00
Jonathan Peyton	703d4042ad	Fix for barrier problem: applications with many parallel regions (2^30) hang The barrier states type doesn't need to be explicitly set. llvm-svn: 256778	2016-01-04 20:51:48 +00:00
Andrey Churbanov	4b939405c5	test omp_threadprivate_for.c fixed llvm-svn: 256473	2015-12-27 18:14:40 +00:00
Jonathan Peyton	2c295c4e53	Fix build error: OMPT_SUPPORT=true was not tested after hinted lock changes Recent changes to support dynamic locks didn't consider the code compiled when OMPT_SUPPORT=true. As a result, the OMPT support was broken by recent changes to nested locks to support dynamic locks. For OMPT to work with dynamic locks, they need to provide a return code indicating whether a nested lock acquisition was the first or not. This patch moves the OMPT support for nested locks into the #else case when DYNAMIC locks were not used. New support is needed for dynamic locks. This patch fixes the build and leaves a placeholder where the missing OMPT callbacks can be added by either the author of the OMPT support for locks, or the dynamic locking support. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D15656 llvm-svn: 256314	2015-12-23 02:34:03 +00:00
Jonathan Peyton	4fee5f6416	Prevent monitor thread creation when KMP_BLOCKTIME="infinite" When the user sets the environment variable KMP_BLOCKTIME to "infinite" (the time one busy-waits at barriers, etc.), the monitor thread is not useful and can be ignored. This change prevents the creation of the monitor thread when the user sets KMP_BLOCKTIME to "infinite". Differential Revision: http://reviews.llvm.org/D15628 llvm-svn: 256061	2015-12-18 23:20:36 +00:00
Jonathan Peyton	6cb33c60bd	Remove some extra spaces llvm-svn: 256060	2015-12-18 23:15:35 +00:00
Jonathan Peyton	b9e8326088	[STATS] Have CMake do real check for stats functionality This change allows clang to build the stats library for every architecture which supports __builtin_readcyclecounter(). CMake also checks for all necessary features for stats and will error out if the platform does not support it. Patch by Hal Finkel and Johnny Peyton llvm-svn: 256002	2015-12-18 16:19:35 +00:00
Jonathan Peyton	8b524597ef	[STATS] Properly guard the tick_time() function and its uses llvm-svn: 255910	2015-12-17 17:27:51 +00:00
Jonathan Peyton	f741312c6f	[STATS] replace __cpuid() intrinsic with already existing __kmp_x86_cpuid() function llvm-svn: 255907	2015-12-17 16:58:26 +00:00
Jonathan Peyton	ad57992887	[STATS] Fix stats lock problem to be compatible with new hinted lock code llvm-svn: 255901	2015-12-17 16:19:05 +00:00
Jonathan Peyton	4b1aad37d8	[STATS] Add libm.so to lib dependencies for stats library llvm-svn: 255900	2015-12-17 16:15:39 +00:00
Jonathan Peyton	67390c6cd3	Fix broken visual studio builds by disabling KMP_USE_TSX. Visual studio can't handle the asm extension in the KMP_USE_TSX code sections. llvm-svn: 255514	2015-12-14 17:39:30 +00:00
Jonathan Peyton	b87b58131a	Hinted lock (OpenMP 4.5 feature) Updates/Fixes Part 3 This change set includes all changes to make the code conform to the OMP 4.5 specification: * Removed hint / hinted_init definitions from include/40 files * Hint values are powers of 2 to enable composition (4.5 spec) * Hinted lock initialization functions were renamed (4.5 spec) kmp_init_lock_hinted -> omp_init_lock_with_hint kmp_init_nest_lock_hinted -> omp_init_nest_lock_with_hint * __kmpc_critical_section_with_hint was added to support a critical section with a hint (4.5 spec) * __kmp_map_hint_to_lock was added to convert a hint (possibly a composite) to an internal lock type * kmpc_init_lock_with_hint and kmpc_init_nest_lock_with_hint were added as internal entries for the hinted lock initializers. The previous internal functions (__kmp_init) were moved to kmp_csupport.c and reused in multiple places Added the two init functions to dllexports * KMP_USE_DYNAMIC_LOCK is turned on if OMP_41_ENABLED is turned on Differential Revision: http://reviews.llvm.org/D15205 llvm-svn: 255376	2015-12-11 22:04:05 +00:00
Jonathan Peyton	dae13d81b4	Hinted lock (OpenMP 4.5 feature) Updates/Fixes Part 2 * Added a new user TSX lock implementation, RTM. This implementation is a light-weight version of the adaptive lock implementation, omitting the back-off logic for deciding when to speculate (or not). The fall-back lock is still the queuing lock. * Changed indirect lock table management. The data for indirect lock management was encapsulated in the "kmp_indirect_lock_table_t" type. Also, the lock table dimension was changed to 2D (was linear), and each entry is a kmp_indirect_lock_t object now (was a pointer to an object). * Some clean up in the critical section code * Removed the limits of the tuning parameters read from KMP_ADAPTIVE_LOCK_PROPS * KMP_USE_DYNAMIC_LOCK=1 also turns on these two switches: KMP_USE_TSX, KMP_USE_ADAPTIVE_LOCKS Differential Revision: http://reviews.llvm.org/D15204 llvm-svn: 255375	2015-12-11 21:57:06 +00:00
Jonathan Peyton	a03533d35f	Hinted lock (OpenMP 4.5 feature) Updates/Fixes There are going to be two more patches which bring this feature up to date and in line with OpenMP 4.5. * Renamed jump tables for the lock functions (and some clean up). * Renamed some macros to be in KMP_ namespace. * Return type of unset functions changed from void to int. * Enabled use of _xbegin() et al. intrinsics for accessing TSX instructions. Differential Revision: http://reviews.llvm.org/D15199 llvm-svn: 255373	2015-12-11 21:49:08 +00:00
Jonathan Peyton	f2d119ff8e	Replace DYNA_* names with KMP_* names llvm-svn: 254637	2015-12-03 19:37:20 +00:00
Jonathan Peyton	1be692ecdb	Fix honoring of OMP_THREAD_LIMIT in the teams construct Fix for crash in the teams construct in case user sets OMP_THREAD_LIMIT to a number less than the number of processors. Now the number of threads will be silently reduced if the user didn't specify teams parameters or with a warning if the user specified teams parameters conflicting with OMP_THREAD_LIMIT. Differential Revision: http://reviews.llvm.org/D14732 llvm-svn: 254322	2015-11-30 20:14:05 +00:00
Jonathan Peyton	e1dad19aac	Fix crash when __kmp_task_team_setup called for single threaded team The task_team pointer is dereferenced unconditionally which causes a SEGFAULT when it is NULL (e.g. for serialized parallel, that can happen for "teams" construct or for "target nowait"). The solution is to skip second task team setup for single thread team. Differential Revision: http://reviews.llvm.org/D14729 llvm-svn: 254321	2015-11-30 20:05:13 +00:00
Jonathan Peyton	01dcf36bd5	Adding Hwloc library option for affinity mechanism These changes allow libhwloc to be used as the topology discovery/affinity mechanism for libomp. It is supported on Unices. The code additions: * Canonicalize KMP_CPU_* interface macros so bitmask operations are implementation independent and work with both hwloc bitmaps and libomp bitmaps. So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and the like. These are all in kmp.h and appropriately placed. * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc interface to create a libomp address2os object which the rest of libomp knows how to handle already. * To build, use -DLIBOMP_USE_HWLOC=on and -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake can't find the library or hwloc.h, then it will tell you and exit. Differential Revision: http://reviews.llvm.org/D13991 llvm-svn: 254320	2015-11-30 20:02:59 +00:00
Jonathan Peyton	55c447f70f	Add newlines to debug TRACE messages in kmp_taskdeps.cpp llvm-svn: 253265	2015-11-16 22:53:38 +00:00
Jonathan Peyton	baaccfab38	Add missing KMP_NESTED_HOT_TEAMS guards llvm-svn: 253264	2015-11-16 22:48:41 +00:00
Alexey Bataev	ffca01ce9f	[OPENMP] Fixed tests for gcc build. llvm-svn: 253200	2015-11-16 11:35:57 +00:00
Jonathan Peyton	90862c40ad	Add debug trace message for hierarchical barrier Trace when thread is waiting at join phase for oncore children. llvm-svn: 252954	2015-11-12 21:40:39 +00:00
Jonathan Peyton	d6c8de1ef2	Remove outdated comment llvm-svn: 252953	2015-11-12 21:34:29 +00:00
Jonathan Peyton	00afbd01ad	Fix for ittnotify loop reporting Fix ittnotify loop metadata reporting for schedule(runtime) and chunked schedule set via OMP_SCHEDULE. The bug was that chunk=1 was always reported. llvm-svn: 252952	2015-11-12 21:26:22 +00:00
Jonathan Peyton	adee8c5a18	[OMPT] Add ompt_event_task_switch event into OMPT/OpenMP The patch adds support for ompt_event_task_switch into LLVM/OpenMP. Note that the patch has also updated the signature of ompt_event_task_switch to ompt_task_pair_callback_t (rather than the previous ompt_task_switch_callback_t). Patch by Harald Servat Differential Revision: http://reviews.llvm.org/D14566 llvm-svn: 252761	2015-11-11 17:49:50 +00:00
Jonathan Peyton	9b54b41f7b	[OMPT] Remove unnecessary header in ompt-general.c Patch by Harald Servat Differential Revision: http://reviews.llvm.org/D14565 llvm-svn: 252756	2015-11-11 17:30:26 +00:00
Jonathan Peyton	3f5dfc2562	Fixes to wait-loop code 1) Add get_ptr_type() method to all wait flag types. 2) Flag in sleep_loc may change type by the time the resume is called from __kmp_null_resume_wrapper. We use get_ptr_type to obtain the real type and compare it to the casted object received. If they don't match, we know the flag has changed (already resumed and replaced by another flag). If they match, it doesn't hurt to go ahead and resume it. Differential Revision: http://reviews.llvm.org/D14458 llvm-svn: 252487	2015-11-09 16:31:51 +00:00
Jonathan Peyton	b0b83c8b0c	Fixes and improvements to tasking in barriers 1) When the number of threads in a team increases, new threads need to have all their barrier struct fields initialized. We were missing the parent_bar and team fields. 2) For non-forkjoin barriers, we now do the __kmp_task_team_setup before the gather. The setup now sets up the task_team that all the threads will switch to after the barrier, but it needs to be done before other threads do the switch. 3) Remove an unneeded assignment of tt_found_tasks in task team free function. Differential Revision: http://reviews.llvm.org/D14456 llvm-svn: 252486	2015-11-09 16:28:32 +00:00
Jonathan Peyton	7dee82e729	Improvements to machine_hierarchy code for re-sizing These changes include: 1) Machine hierarchy now uses the base_num_threads field to indicate the maximum number of threads the current hierarchy can handle without a resize. 2) In __kmp_get_hierarchy, we need to get depth after any potential resize is done. 3) Cleanup of hierarchy resize code to support 1 above. Differential Revision: http://reviews.llvm.org/D14455 llvm-svn: 252475	2015-11-09 16:24:53 +00:00
Jonathan Peyton	960ea2f677	[OMPT] Add OMPT events for the OpenMP taskwait construct. llvm-svn: 252472	2015-11-09 15:57:04 +00:00
Jonathan Peyton	70bda912fb	Fix for zero chunk size Setting dynamic schedule with chunk size 0 via omp_set_schedule(dynamic,0) and then using "schedule (runtime)" causes infinite loop because for the chunked dynamic schedule we didn't correct zero chunk to the default (1). llvm-svn: 252338	2015-11-06 20:32:44 +00:00
Jonathan Peyton	95246e7def	Improve OMPT initialization code Use of #ifdef OMPT_DEBUG was causing messages to be generated under normal operation when the OpenMP library was compiled with KMP_DEBUG enabled. Elsewhere, KMP_DEBUG evaluates assertions, but never produces messages during normal operation. To avoid this inconsistency, set OMPT_DEBUG using a cmake variable LIBOMP_OMPT_DEBUG. While I was editing the associated ompt-specific.h and ompt-general.c files, make the spacing and comments consistent. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D14355 llvm-svn: 252173	2015-11-05 16:54:55 +00:00
Jonathan Peyton	dd23974a5d	Remove incorrect debug assert. In __kmp_free_team(), the team's number of processors can be == 1. llvm-svn: 252086	2015-11-04 22:31:57 +00:00
Jonathan Peyton	4505bf68b0	Remove some empty lines. llvm-svn: 252084	2015-11-04 22:06:07 +00:00
Jonathan Peyton	54127981be	Refactor of task_team code. This is a refactoring of the task_team code that more elegantly handles the two task_team case. Two task_teams per team are kept in use for the lifetime of the team. Thus no reference counting is needed. Differential Revision: http://reviews.llvm.org/D13993 llvm-svn: 252082	2015-11-04 21:37:48 +00:00
Alexey Bataev	b0eae8d6f4	[OPENMP] Add dependency to clang/clang-headers etc. for in-tree build of libomp. Add additional dependency to clang/clang-headers/FileCheck to avoid possible troubles with in-tree build/test of libomp + allow parallel testing of libomp. Also includes bugfixes for tests + improvements to avoid possible race conditions. Differential Revision: http://reviews.llvm.org/D14055 llvm-svn: 251797	2015-11-02 13:43:32 +00:00
Jonathan Peyton	57d171c9a6	[OMPT] Adding missing free() calls to ompt_tool_windows() function. llvm-svn: 251719	2015-10-30 20:24:25 +00:00
Jonathan Peyton	69e596a5e7	[OMPT] Windows Support for OMPT The problem is that the ompt_tool() function (which must be implemented by a performance tool) should be defined in the RTL as well to cover the case when the tool is not present in the address space of the process. This functionality is accomplished with weak symbols in Unices. Unfortunately, Windows does not support weak symbols. The solution in these changes is to grab the list of all modules loaded by the process and then search for symbol "ompt_tool()" within them. The function ompt_tool_windows() performs the search of the ompt_tool symbol. If ompt_tool is found, then its return value is used to initialize the tool. If ompt_tool is not found, then ompt_tool_windows() returns NULL and OMPT is thus disabled. While doing these changes, the OMPT_SUPPORT detection in CMake was changed to test for the required features for OMPT_SUPPORT, namely: builtin_frame_address() existence, weak attribute existence and psapi.dll existence. For LIBOMP_HAVE_OMPT_SUPPORT to be true, it must be that the builtin_frame_address() intrinsic exists AND one of: either weak attributes exist or psapi.dll exists. Also, since Process Status API is used I had to add new dependency -- psapi.dll to the library dependency micro test. Differential Revision: http://reviews.llvm.org/D14027 llvm-svn: 251654	2015-10-29 20:56:24 +00:00
Jonathan Peyton	0dd75fdfa9	Removed zeroing th.th_task_state for master thread at start of nested parallel. The th.th_task_state for the master thread at the start of a nested parallel should not be zeroed in __kmp_allocate_team() because it is later put in the stack of states in __kmp_fork_call() for further re-use after exiting the nested region. It is zeroed after being put in the stack. Differential Revision: http://reviews.llvm.org/D13702 llvm-svn: 250847	2015-10-20 19:21:04 +00:00
Jonathan Peyton	55f027b1d4	Removed '@' from delimiters, added it as offset designator. Moved '@' from delimiters to offset designators for the KMP_PLACE_THREADS environment variable. Only one of: postfix "o" or prefix @, should be used in the value of KMP_PLACE_THREADS. For example, '2s@2,4c@2,1t'. This is also the format of KMP_SETTINGS=1 output now (removed "o" from there). e.g., 2s,2o,4c,2o,1t. Differential Revision: http://reviews.llvm.org/D13701 llvm-svn: 250846	2015-10-20 19:15:48 +00:00
Jonathan Peyton	6778c73243	Fix OMP_PLACES negation operator parsing (!place) Just moved the *scan++ line up before the recursive call. Otherwise, infinite recursion occurs and leads to a segmentation fault. llvm-svn: 250729	2015-10-19 19:43:01 +00:00
Jonathan Peyton	45ca5dada1	Clean-up cancellation state flag between parallel regions Without this fix, cancellation requests in one parallel region cause cancellation of the second region even though the second one was not intended to be cancelled. llvm-svn: 250727	2015-10-19 19:33:38 +00:00
Dimitry Andric	9b8c353c90	On FreeBSD, PTHREAD_THREADS_MAX does not fit into an int, leading to warnings similar to the following: runtime/src/kmp_global.c:117:35: warning: implicit conversion from 'unsigned long' to 'int' changes value from 18446744073709551615 to -1 [-Wconstant-conversion] int __kmp_sys_max_nth = KMP_MAX_NTH; ~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ runtime/src/kmp.h:849:34: note: expanded from macro 'KMP_MAX_NTH' # define KMP_MAX_NTH PTHREAD_THREADS_MAX ^~~~~~~~~~~~~~~~~~~ Clamp KMP_MAX_NTH to INT_MAX to avoid these warnings. Also use INT_MAX whenever PTHREAD_THREADS_MAX is not defined at all. Differential Revision: http://reviews.llvm.org/D13827 llvm-svn: 250708	2015-10-19 17:32:04 +00:00
Jonathan Peyton	0e6d457797	[OMPT] Add OMPT events for API locking This fix implements the following OMPT events for the API locking routines: * ompt_event_acquired_lock * ompt_event_acquired_nest_lock_first * ompt_event_acquired_nest_lock_next * ompt_event_init_lock * ompt_event_init_nest_lock * ompt_event_destroy_lock * ompt_event_destroy_nest_lock For the acquired events the depth of the locks is required, so a return value was added similar to the return values we already have for the release lock routines. Patch by Tim Cramer Differential Revision: http://reviews.llvm.org/D13689 llvm-svn: 250526	2015-10-16 16:52:58 +00:00
Jonathan Peyton	33d1d283f6	Detect final task in GOMP interface. llvm-svn: 250198	2015-10-13 18:36:22 +00:00
Jonathan Peyton	71797c043f	[OPENMP][TESTSUITE] Undefined variable in test omp_task_final.c Patch by Alexey Bataev Differential Revision: http://reviews.llvm.org/D13661 llvm-svn: 250066	2015-10-12 17:01:05 +00:00
Jonathan Peyton	f0344bb02b	[OMPT] Reduce overhead of OMPT * Avoid computing state needed only by OMPT unless the ompt_enabled flag is set. * Properly handle a corner case in OMPT where team == NULL. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D13502 llvm-svn: 249857	2015-10-09 17:42:52 +00:00
Jonathan Peyton	b401db6d73	[OMPT] Initialize task fields only if needed Because __kmp_task_init_ompt is called for every initial task in each thread and always generated task ids, this was a big performance issue on bigger systems even without any tool attached. After changing the initialization interface to ompt_tool, we can now rely on already knowing whether a tool is attached and OMPT is enabled at this point. Patch by Jonas Hahnfeld Differential Revision: http://reviews.llvm.org/D13494 llvm-svn: 249855	2015-10-09 17:38:05 +00:00
Jonathan Peyton	1bd61b423e	Formatting/Whitespace/Comment changes associated with wait/release improvements. llvm-svn: 249725	2015-10-08 19:44:16 +00:00
Jonathan Peyton	e03b62f3bc	Debug trace and assert statement changes for wait/release improvements. These changes improve/update the trace messages and debug asserts related to the previous wait/release checkin. llvm-svn: 249717	2015-10-08 18:49:40 +00:00
Jonathan Peyton	a0e159f7aa	OpenMP Wait/release improvements. These changes improve the wait/release mechanism for threads spinning in barriers that are handling tasks while spinning by providing feedback to the barriers about any task stealing that occurs. Differential Revision: http://reviews.llvm.org/D13353 llvm-svn: 249711	2015-10-08 18:23:38 +00:00
Jonathan Peyton	dd4aa9b6b5	Added sockets to the syntax of KMP_PLACE_THREADS environment variable. Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable. Some limitations: * The number of sockets and then optional offset should be specified first (before other parameters). * The letter designation is mandatory for sockets and then for other parameters. * If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped. * If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively. * The number of cores per socket cannot be specified before sockets or after threads per core. * The number of threads per core can be specified before or after core-offset (old syntax required it to be before core-offset); * Parameters delimiter can be: empty, comma, lower-case x; * Spaces are allowed around numbers, around letters, around delimiter. Approximate shorthand specification: KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]" Differential Revision: http://reviews.llvm.org/D13175 llvm-svn: 249708	2015-10-08 17:55:54 +00:00
Jonathan Peyton	7edeef1bbf	Fix memory corruption in Windows debug library This patch adjusts the buffer size when reducing the buffer used for printing. This solves the memory corruption in Windows debug library, and potential memory corruption in other builds. llvm-svn: 248588	2015-09-25 17:23:17 +00:00
Jonathan Peyton	f209cdfade	[OpenMP Testsuite] Change omp_get_wtime.c timer resolution to 3 percent llvm-svn: 248501	2015-09-24 15:10:57 +00:00
Jonathan Peyton	5a60bc5743	[OpenMP Testsuite] Mac rpath specified when compiling tests llvm-svn: 248500	2015-09-24 15:09:51 +00:00
Jonathan Peyton	3a91ada1e2	Fix stats build problem. This change removes the KMP_STATS_ENABLED macro inside kmp_stats.cpp since it is only compiled anyways when LIBOMP_STATS=on. Also, include kmp_config.h in kmp_stats.h to ensure KMP_STATS_ENABLED is defined. llvm-svn: 248494	2015-09-24 14:47:51 +00:00
Jonathan Peyton	1acc2dbf6e	Update Reference.pdf files. This updates the Reference.pdf files to say LLVM OpenMP Runtime Library and also updates the build documentation to show how to build with CMake. llvm-svn: 248407	2015-09-23 18:09:47 +00:00
Jonathan Peyton	614c7ef81c	OpenMP Initial testsuite change to purely llvm-lit based testing This change introduces a check-libomp target which is based upon llvm's lit test infrastructure. Each test (generated from the University of Houston's OpenMP testsuite) is compiled and then run. For each test, an exit status of 0 indicates success and non-zero indicates failure. This way, FileCheck is not needed. I've added a bit of logic to generate symlinks (libiomp5 and libgomp) in the build tree so that gcc can be tested as well. When building out-of-tree builds, the user will have to provide llvm-lit either by specifying -DLIBOMP_LLVM_LIT_EXECUTABLE or having llvm-lit in their PATH. Differential Revision: http://reviews.llvm.org/D11821 llvm-svn: 248211	2015-09-21 20:41:31 +00:00
Joerg Sonnenberger	7649cd4389	Use sysconf for the number of cores on FreeBSD too. llvm-svn: 248209	2015-09-21 20:29:12 +00:00
Joerg Sonnenberger	8abf7c87cd	Complex division requires libm on NetBSD, add it. llvm-svn: 248207	2015-09-21 20:21:02 +00:00
Joerg Sonnenberger	1564f3c4ec	Add basic NetBSD support. llvm-svn: 248204	2015-09-21 20:02:45 +00:00
Joerg Sonnenberger	40252cecb0	Teach the Perl modules about NetBSD. llvm-svn: 248203	2015-09-21 19:42:05 +00:00
Joerg Sonnenberger	f16f649e0d	libomp on NetBSD needs libc, libpthread and libm. llvm-svn: 248200	2015-09-21 19:40:59 +00:00
Joerg Sonnenberger	64be2d271d	Assume that all Unix-like systems will want to handle signals and simplify conditional. llvm-svn: 248199	2015-09-21 19:38:56 +00:00
Joerg Sonnenberger	d742184e0b	Darwin is the exception when it comes to accessing environ, all other Unix-like systems can follow the same code path. llvm-svn: 248198	2015-09-21 19:37:05 +00:00

551 Commits