llvm-project

Author	SHA1	Message	Date
Joachim Protze	6b840ccea9	[OpenMP] Remove compiler warning about unused value The compiler warns about an unused variable/statement: runtime/src/kmp_affinity.cpp:4958:18: warning: statement has no effect [-Wunused-value] KA_TRACE(1000, ; { ^ runtime/src/kmp_debug.h:84:24: note: in definition of macro 'KA_TRACE' __kmp_debug_printf x; \ ^ Instead of the unused reference to this function, this patch now calls the function with an empty string. The call to this function should have no effect. Patch provided by joachim.protze Reviewers: jlpeyton, hbae, AndreyChurbanov Reviewed By: AndreyChurbanov Tags: #openmp, #ompt Differential Revision: https://reviews.llvm.org/D56775 llvm-svn: 351323	2019-01-16 11:35:11 +00:00
Joachim Protze	c3716617df	Fix compiler error in r351311 llvm-svn: 351315	2019-01-16 09:39:42 +00:00
Joachim Protze	582b183dda	[OMPT] Make sure that OMPT is enabled when accessing internals of the runtime Make sure that OMPT is enabled in runtime entry points that access internals of the runtime. Else, return an appropriate value indicating an error or that the data is not available. Patch provided by @sconvent Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze Reviewed By: joachim.protze Tags: #openmp, #ompt Differential Revision: https://reviews.llvm.org/D47717 llvm-svn: 351311	2019-01-16 08:58:17 +00:00
Jonathan Peyton	9355d0dc13	[OpenMP] Fix for nested proc_bind affinity bug Using proc_bind clause on a nested #pragma omp parallel region with KMP_AFFINITY set causes an assertion error. This assertion occurs because the place-partition-var is not properly initialized in the nested master threads. Trying to get an intuitive result with KMP_AFFINITY + proc_bind is difficult because of how the KMP_AFFINITY gtid-to-place mapping occurs. This patch creates an initial place list no matter what affinity mechanism is used. For KMP_AFFINITY, the place-partition-var is initialized to all the places. Differential Revision: https://reviews.llvm.org/D55795 llvm-svn: 351227	2019-01-15 19:39:32 +00:00
Jonathan Peyton	fce3972553	[OpenMP] Add lock function definitions to fix Bug 40042 This change fixes the sanity issue reported in Bug 40042. Lock function definitions for the three lock kinds were added to disambiguate calls to the lock functions done directly and indirectly. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40042 Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D56103 llvm-svn: 351224	2019-01-15 19:14:00 +00:00
Jonathan Peyton	1c268554ba	[OpenMP][Cmake] Allowed OpenMP testing detect test compiler with same generator Fix ninja build detect test compiler failed under windows. Patch by Peiyuan Song Differential Revision: https://reviews.llvm.org/D53479 llvm-svn: 351223	2019-01-15 19:08:26 +00:00
Jonathan Peyton	dc375486b0	[OpenMP] Fix performance regression in SPEC kdtree test Make __ompt_implicit_task_end a static function and remove the inline part. Remove pId variable that is unused. This fixes small regression in SPEC kdtree benchmark. Also reformat some of __ompt_implicit_task_end. Differential Revision: https://reviews.llvm.org/D55788 llvm-svn: 351221	2019-01-15 18:57:24 +00:00
Joachim Protze	2b46d30fc7	[OMPT] Second chunk of final OMPT 5.0 interface updates The omp-tools.h file is generated from the OpenMP spec to ensure that the interface is implemented as specified. The other changes are necessary to update the interface implementation to the final version as published in 5.0. The omp-tools.h header was previously called ompt.h, currently a copy under this name is installed for legacy tools. Patch partially prepared by @sconvent Reviewers: AndreyChurbanov, hbae, Hahnfeld Reviewed By: hbae Tags: #openmp, #ompt Differential Revision: https://reviews.llvm.org/D55579 llvm-svn: 351197	2019-01-15 15:36:53 +00:00
Hans Wennborg	eb60fbfdb4	Update year in license files In last year's update (D48219) it was suggested that the release manager might want to do this, so here we go. llvm-svn: 351194	2019-01-15 15:10:32 +00:00
Roman Lebedev	06e3950561	[OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612) Summary: Two things: 1. Those two variables had the wrong signedness, which was resulting in "sign mismatch in comparison" warning. 2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off, thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`. Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly. It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block. I think this is the only source file with this issue, everything else seems to `#include` either `kmp.h` or `kmp_config.h`. The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake. I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 \| PR38612 ]]. Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld Reviewed By: jlpeyton Subscribers: guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D55783 llvm-svn: 351019	2019-01-13 12:54:34 +00:00
Gheorghe-Teodor Bercea	1653633a1c	[OpenMP][libomptarget] Use shared memory variable for tracking parallel level Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D55773 llvm-svn: 350747	2019-01-09 18:30:14 +00:00
Andrey Churbanov	b7a8ab3417	Doc: fixed description of a parameter of the __kmpc_taskloop Patch by sergi.mateo.bellido@gmail.com Differential Revision: https://reviews.llvm.org/D56432 llvm-svn: 350713	2019-01-09 13:06:23 +00:00
Alexey Bataev	26e6c86b79	[OPENMP][NVPTX]Fix dynamic scheduling. Summary: Previous implementation may cause the runtime crash when the number of teams is > 1024. Patch fixes this problem + reduces number of the atomic operations by 32 times. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56332 llvm-svn: 350524	2019-01-07 14:25:25 +00:00
Alexey Bataev	6b3153ada0	[OPENMP][NVPTX]General formatting/code improvement, NFC. Summary: Formatting. Reviewers: gtbercea, grokos, kkwli0 Subscribers: guansong, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56290 llvm-svn: 350431	2019-01-04 20:16:54 +00:00
Alexey Bataev	dcf2edcdf5	[OPENMP][NVPTX]Improve performance + reduce number of used registers. Summary: Reduced number of the used register + improved performance propagating the information about current execution/data sharing mode directly from the compiler, where it is possible. In some cases, it requires new/reworked interfaces of the runtime external functions. Old functions are marked as deprecated. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56278 llvm-svn: 350405	2019-01-04 17:09:12 +00:00
Joel E. Denny	f17f7a5d4d	[OpenMP] Fix nvidia-cuda-toolkit detection on Debian/Ubuntu The OpenMP runtime's cmake scripts do not correctly locate the libdevice that the Debian/Ubuntu package nvidia-cuda-toolkit currently includes, at least on my Ubuntu 18.04.1 installation. This patch fixes that for me. This problem was discussed at length in D55269. D40453 added a similar adjustment in clang, but reviewers of D55269 concluded that, for the OpenMP runtime, the right place to address this problem is in cmake's CUDA support. However, it was also suggested we could add a workaround to OpenMP's cmake scripts now. This patch contains such a workaround, which I've tried to design so that it will have no harmful effect if cmake improves in the future. nvidia-cuda-toolkit also needs improvements because its intended monolithic CUDA tree shim, /usr/lib/cuda, has many empty directories, such as bin. I reported that at: <https://bugs.launchpad.net/ubuntu/+source/nvidia-cuda-toolkit/+bug/1808999> Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D55588 llvm-svn: 350377	2019-01-04 02:07:13 +00:00
Jonathan Peyton	76f3980a20	[OpenMP] Add omp_get_device_num() and update several other device API functions Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Currently, we are leaving it to the compiler to handle this properly if it is called inside target. Also, did some cleanup and updating of duplicate device API functions (in both libomp and libomptarget) to make them into weak functions that check for the symbol from libomptarget, and will call the version in libomptarget if it is present. If any additional device API functions are implemented also in libomptarget in the future, we should add the dlsym calls to the host functions. Also, if the omp_target_* functions are to be implemented for the host (this has been requested), they should attempt to call the libomptarget versions as well. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55578 llvm-svn: 350352	2019-01-03 21:14:19 +00:00
Alexey Bataev	3c74be8049	[OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC. Summary: One of the LLVM optimizations, split critical edges, also clones tail instructions. This is a dangerous operation for __syncthreads() functions and this transformation leads to undefined behavior or incorrect results. Patch fixes this problem by replacing __syncthreads() function with the assembler instruction, whose cost is too high and which cannot be copied. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56274 llvm-svn: 350333	2019-01-03 17:43:46 +00:00
Vyacheslav Zakharin	e889ac7e6b	[libomptarget] Added install component for libomptarget Differential Revision: https://reviews.llvm.org/D56108 llvm-svn: 350254	2019-01-02 19:39:49 +00:00
Alexey Bataev	d1cd005ec5	[OPENMP][NVPTX]Added/fixed debugging messages, NFC. Summary: Added or fixed new/old debugging messages for the better diagnostics. Reviewers: gtbercea, kkwli0, grokos Reviewed By: grokos Subscribers: caomhin, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D56102 llvm-svn: 350137	2018-12-28 21:36:09 +00:00
Alexey Bataev	28eccf5ba0	[OPENMP][NVPTX]Fixed initialization of the data-sharing interface. Summary: Avoid using of the atomic loop to wait for the completion of the data-sharing interface initialization, use __shfl_sync instead for the communication within the warp to signal other threads in the warp about completion of the initialization. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D56100 llvm-svn: 350129	2018-12-28 17:31:06 +00:00
Alexey Bataev	1708858dbd	[OPENMP][NVPTX]Outline assert into noinline function, NFC. Summary: At high optimization level asserts lead to some unexpected results because of auto-inserted unreachable instructions. This outlining prevents some of such dangerous optimizations and leads to better stability. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D56101 llvm-svn: 350128	2018-12-28 17:29:47 +00:00
Michal Gorny	a70184ba92	[runtime] [test] Fix using %python path Fix the newly-added tests to use %python substitution in order to use the correct path to Python interpreter. Otherwise, they fail on NetBSD where there is no 'python', just 'pythonX.Y'. Differential Revision: https://reviews.llvm.org/D56048 llvm-svn: 350001	2018-12-22 10:51:53 +00:00
Stefan Pintilie	4230f91aa2	[Tests] [OpenMP] XFAIL also for ppc64le. Two tests were XFAILed for powerpc64le in r349512. They should have also been XFAILed for ppc64le. llvm-svn: 349521	2018-12-18 19:05:07 +00:00
Stefan Pintilie	ea79468b41	XFAIL Pair of OpenMP Tests for PowerPC LE Linux XFAIL two tests that fail on PowerPC LE Linux due to the change of default from PIC to no-PIC on that platform. A Bug has been opened for this: https://bugs.llvm.org/show_bug.cgi?id=40082 The tests are: runtime/test/ompt/misc/control_tool.c runtime/test/ompt/synchronization/taskwait.c llvm-svn: 349512	2018-12-18 17:39:22 +00:00
Joachim Protze	cf80e72e30	[Tests] fix non-determinism failure in testcase llvm-svn: 349460	2018-12-18 08:57:23 +00:00
Joachim Protze	0e0d6cdd58	[OMPT] First chunk of final OMPT 5.0 interface updates This patch updates the implementation of the ompt_frame_t, ompt_wait_id_t and ompt_state_t. The final version of the OpenMP 5.0 spec added the "t" for these types. Furthermore the structure for ompt_frame_t changed and allows to specify that the reenter frame belongs to the runtime. Patch partially prepared by Simon Convent Reviewers: hbae llvm-svn: 349458	2018-12-18 08:52:30 +00:00
Joachim Protze	1f7d4aca8d	[OMPT] Add testcase for thread_num provided by implicit task events llvm-svn: 349457	2018-12-18 08:52:12 +00:00
Jonathan Peyton	fca3ac543e	[OpenMP] version the affinity format tests and fix one test llvm-svn: 349412	2018-12-17 22:53:47 +00:00
Jonathan Peyton	5640556b55	[OpenMP] Add affinity format tests llvm-svn: 349411	2018-12-17 22:33:21 +00:00
Roman Lebedev	781a0896b0	[OpenMP] Fixes for LIBOMP_OMP_VERSION=45/40 Summary: I have discovered this because i wanted to experiment with building static libomp (with openmp-4.0 support only) for debugging purposes. There are three kinds of problems here: 1. `__kmp_compare_and_store_acq()` simply does not exist. It was added in D47903 by @jlpeyton. I'm guessing `__kmp_atomic_compare_store_acq()` was meant. 2. In `__kmp_is_ticket_lock_initialized()`, `lck->lk.initialized` is `std::atomic<bool>`, while `lck` is `kmp_ticket_lock_t *`. Naturally, they can't be equality-compared. Either, it should return the value read from `lck->lk.initialized`, or do what `__kmp_is_queuing_lock_initialized()` does, compare the passed pointer with the field in the struct pointed by the pointer. I think the latter is correct-er choice here. 3. Tests were not versioned. They assume that `LIBOMP_OMP_VERSION` is at the latest version. This does not touch LIBOMP_OMP_VERSION=30. That is still broken. Reviewers: jlpeyton, Hahnfeld, AndreyChurbanov Reviewed By: AndreyChurbanov Subscribers: guansong, jfb, openmp-commits, jlpeyton Tags: #openmp Differential Revision: https://reviews.llvm.org/D55496 llvm-svn: 349260	2018-12-15 09:23:39 +00:00
Jonathan Peyton	bdb0a2ffaa	[OpenMP] Fix transient divide by zero bug in 32-bit code The value returned by __kmp_now_nsec() can overflow 32-bit values causing incorrect values to be returned. The overflow can end up causing a divide by zero error because in __kmp_initialize_system_tick(), the value (__kmp_now_nsec() - nsec) can end up being much larger than the numerator: 1e6 * (delay + (now - goal)) during a pathological timing where the current time calculated is much larger than nsec. When this happens, the value of __kmp_ticks_per_msec is set to zero which is then used as the denominator in the KMP_NOW_MSEC() macro leading to the divide by zero error. Differential Revision: https://reviews.llvm.org/D55300 llvm-svn: 349090	2018-12-13 23:18:55 +00:00
Jonathan Peyton	6d88e049dc	[OpenMP] Implement OpenMP 5.0 affinity format functionality This patch adds the affinity format functionality introduced in OpenMP 5.0. This patch adds: Two new environment variables: OMP_DISPLAY_AFFINITY=TRUE\|FALSE OMP_AFFINITY_FORMAT=<string> and Four new API: 1) omp_set_affinity_format() 2) omp_get_affinity_format() 3) omp_display_affinity() 4) omp_capture_affinity() The affinity format functionality has two ICV's associated with it: affinity-display-var (bool) and affinity-format-var (string). The affinity-display-var enables/disables the functionality through the envirable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted string with the special field types beginning with a '%' character similar to printf For example, the affinity-format-var could be: "OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}" The affinity-format-var is displayed by every thread implicitly at the beginning of a parallel region when any thread's affinity has changed (including a brand new thread being spawned), or explicitly using the omp_display_affinity() API. The omp_capture_affinity() function can capture the affinity-format-var in a char buffer. And omp_set\|get_affinity_format() allow the user to set\|get the affinity-format-var explicitly at runtime. omp_capture_affinity() and omp_get_affinity_format() both return the number of characters needed to hold the entire string it tried to make (not including NULL character). If not enough buffer space is available, both these functions truncate their output. Differential Revision: https://reviews.llvm.org/D55148 llvm-svn: 349089	2018-12-13 23:14:24 +00:00
Andrey Churbanov	74f98554f9	Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39970 Broken tests fixed Differential Revision: https://reviews.llvm.org/D55598 llvm-svn: 349017	2018-12-13 10:04:10 +00:00
Michal Gorny	8876dac50a	[runtime] Disable KMP_HAVE_QUAD on NetBSD gcc Disable KMP_HAVE_QUAD when building via gcc on NetBSD system, as the build fails due to unimplemented builtins: .../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_mul': .../kmp_atomic.cpp:1332: undefined reference to `__multc3' .../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_div': .../kmp_atomic.cpp:1334: undefined reference to `__divtc3' ... Differential Revision: https://reviews.llvm.org/D55478 llvm-svn: 348886	2018-12-11 19:02:14 +00:00
Michal Gorny	70cdd83cd6	[runtime] Use getloadavg() on NetBSD as well Switch NetBSD from reading /proc (which is broken) to getloadavg() (which is already used by Darwin). NetBSD discourages using procfs in favor of system API calls. Differential Revision: https://reviews.llvm.org/D55486 llvm-svn: 348885	2018-12-11 19:02:09 +00:00
Kamil Rytarowski	316f423876	Implement __kmp_is_address_mapped() for NetBSD Summary: Use the sysctl(3) function to check whether an address is mapped into the address space. Reviewers: mgorny, joerg, #openmp Reviewed By: mgorny Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D55549 llvm-svn: 348874	2018-12-11 18:35:07 +00:00
Kamil Rytarowski	98bdf1f21d	Implement __kmp_gettid() for NetBSD Summary: _lwp_self() returns current Thread Id in a numeric version on NetBSD. Reviewers: joerg, mgorny, #openmp Reviewed By: mgorny Subscribers: llvm-commits, openmp-commits, #openmp Tags: #openmp Differential Revision: https://reviews.llvm.org/D55497 llvm-svn: 348873	2018-12-11 18:34:33 +00:00
Michal Gorny	276df88154	[test] [runtime] Permit omp_get_wtick() to return 0.01 Increase the range for omp_get_wtick() test to allow for 0.01 (from <0.01). This is needed for NetBSD where it returns exactly that value due to CLOCKS_PER_SEC being 100. This should not cause a significant difference from e.g. FreeBSD where it is 128, and especially from Linux where CLOCKS_PER_SEC is apparently meaningless and sysconf(_SC_CLK_TCK) gives 100 as well. Differential Revision: https://reviews.llvm.org/D55493 llvm-svn: 348857	2018-12-11 15:39:34 +00:00
Michal Gorny	3815b9f5f9	[test] [runtime] Do not include alloca.h on NetBSD On NetBSD, alloca() is in stdlib.h and there is no alloca.h. Adjust the includes appropriately. Differential Revision: https://reviews.llvm.org/D55487 llvm-svn: 348856	2018-12-11 15:39:30 +00:00
Michal Gorny	7bbc1a782f	[runtime] [test] Use more portable short options to sort(1) Pass `-n -s` instead of `--numeric --stable` to sort(1), as long options are not supported by NetBSD sort implementation. `-n` is defined by POSIX, so it should be fully portable. `-s` is used consistently at least in GNU sort and FreeBSD sort, and I honestly doubt it would cause issues with any other implementation supporting `--stable`. Differential Revision: https://reviews.llvm.org/D55479 llvm-svn: 348855	2018-12-11 15:39:26 +00:00
Michal Gorny	e9d4267277	[cmake] Use -std=gnu++11 to fix alloca() on NetBSD Prefer using '-std=gnu++11' over '-std=c++11' when available, as NetBSD exposes the correct alloca() implementation only with gnu* C/C++ standards. Differential Revision: https://reviews.llvm.org/D55477 llvm-svn: 348854	2018-12-11 15:39:22 +00:00
Jonathan Peyton	17e53b9299	[OpenMP] Fix a few build issues Fix two build issues: 1) Recent commit 348756 accidentally included Unix clang compilers to use immintrin.h when only clang-cl should be using it leading to the following error: openmp-llvm/runtime/src/kmp_lock.cpp:2035:25: error: always_ inline function '_xbegin' requires target feature 'rtm', but would be inlined into function '__kmp_test_adaptive_lock_only' that is compiled without support for 'rtm' kmp_uint32 status = _xbegin(); This patch changes the guard to use immintrin.h to only use clang-cl instead of all clang 2) gcc-8 gives a warning about multiline comment in kmp_runtime.cpp: This patch just changes it to a two line comment openmp-llvm/runtime/src/kmp_runtime.cpp:7697:8: warning: multi-line comment [-Wcomment] #endif // KMP_OS_LINUX \|\| KMP_OS_DRAGONFLY \|\| KMP_OS_FREEBSD \|\| KMP_OS_NETBSD \ llvm-svn: 348783	2018-12-10 18:26:50 +00:00
Alexey Bataev	9056f1116d	[OPENMP][NVPTX]Revert __kmpc_shuffle_int64 to its original form. Summary: Use the original shuffle implementation for __kmpc_shuffle_int64 since default implementation uses the same implementation. Reviewers: gtbercea Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55514 llvm-svn: 348772	2018-12-10 16:50:36 +00:00
Alexey Bataev	cc6cf64c38	[OPENMP][NVPTX]Enable fast shuffles on 64bit values only if CUDA >= 9. Summary: Shuffle on 64bit data is allowed only for CUDA >= 9.0. Also, fixed the constant for the mask, need one extra L in the end. Reviewers: gtbercea, kkwli0 Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55440 llvm-svn: 348758	2018-12-10 14:29:05 +00:00
Andrey Churbanov	f700e9ed8c	Support clang compiling under windows-gnu and windows-msvc Patch by Peiyuan Song <squallatf@gmail.com> Differential Revision: https://reviews.llvm.org/D53422 llvm-svn: 348756	2018-12-10 13:45:00 +00:00
Kamil Rytarowski	7e1ea993e0	Add OpenBSD support to OpenMP Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp Tags: #openmp Differential Revision: https://reviews.llvm.org/D34280 llvm-svn: 348726	2018-12-09 16:46:48 +00:00
Kamil Rytarowski	a56ac949ec	Add DragonFlyBSD support to OpenMP Summary: Additions mostly follow FreeBSD and NetBSD and are not intrusive. There is similar patch for OpenBSD: https://reviews.llvm.org/D34280 The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port. Simple OpenMP programs compile and work as expected: $ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include $ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini Differential Revision: https://reviews.llvm.org/D35129 llvm-svn: 348725	2018-12-09 16:40:33 +00:00
Alexey Bataev	8acafff404	[OPENMP][NVPTX]Save registers for optimized builds with enabled logging. Summary: Introduced special noinline function log that allows to save some registers for optimized builds but with enabled logging. Also, it increases the stability of the optimized builds with inlined runtime. Reviewers: gtbercea, kkwli0 Reviewed By: gtbercea Subscribers: caomhin, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D55436 llvm-svn: 348606	2018-12-07 16:08:29 +00:00
Alexey Bataev	653e8ba79a	[OPENMP][NVPTX]Correct type casting for printf args + simplified shfl64 function. Summary: Explicitly casted printf's args to the required types + simplified shfl64 function. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55379 llvm-svn: 348521	2018-12-06 19:45:48 +00:00
Alexey Bataev	5442f3e549	[OPENMP][NVPTX]Fix __kmpc_flush to flush the memory per system, not per block. Summary: According to the standard, after memory flushing the changes in the memory must be visible to all the threads in all teams. Patch fixes this. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55370 llvm-svn: 348491	2018-12-06 15:27:58 +00:00
Gheorghe-Teodor Bercea	10b2e60b7e	[OpenMP][libomptarget] Flush intermediate values during team reduction Summary: Ensure intermediate values of a team reduction are flushed to memory. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D55219 llvm-svn: 348148	2018-12-03 15:21:49 +00:00
Alexey Bataev	0f221f53d8	[OPENMP][NVPTX]Make runtime compatible with the original runtime. Summary: Reworked runtime to make it compatible with the requirements of the original runtime library. Also, simplified some code to reduce number of function calls. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55130 llvm-svn: 348003	2018-11-30 16:52:38 +00:00
Jonathan Peyton	bfe427bf41	Revert r347799: Add omp_get_device_num() and update other device API There is a conflict between libomptarget and libomp concerning some of the standard OpenMP device API which needs further investigation. llvm-svn: 347932	2018-11-29 23:56:14 +00:00
Jonathan Peyton	b04f7d681a	[OpenMP] Add stubs for Task affinity API This patch adds __kmpc_omp_reg_task_with_affinity to register affinity information for tasks. For now, the affinity information is not used, and the function always succeeds. This also adds the kmp_task_affinity_info_t structure to store the task affinity information. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55026 llvm-svn: 347907	2018-11-29 20:04:29 +00:00
Jonathan Peyton	1742eced55	[OpenMP] Rename ompt_mutex_impl_unknown to ompt_mutex_impl_none This change renames ompt_mutex_impl_unknown to ompt_mutex_impl_none, following the name change in the specification. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D54347 llvm-svn: 347802	2018-11-28 20:19:53 +00:00
Jonathan Peyton	1ce776ea2f	[OpenMP] Minor cleanup of debug code * Fix calculation of string length. * Remove NULL-check of pointer which has been dereferenced. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D54948 llvm-svn: 347801	2018-11-28 20:18:06 +00:00
Jonathan Peyton	f4c0720ad0	[OpenMP] Fixed possible array out of bound access There is low probability that array th_hot_teams can be accessed out of bound (when many nested levels are requested to keep hot teams via KMP_HOT_TEAMS_MAX_LEVEL). The patch adds the check of index that fixes the problem. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D54950 llvm-svn: 347800	2018-11-28 20:15:11 +00:00
Jonathan Peyton	a17318b89b	[OpenMP] Add omp_get_device_num() and update several other device API functions Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Also, did some cleanup and updating of device API functions to make them into weak functions that should be replaced with libomptarget functions when libomptarget is present. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D54342 llvm-svn: 347799	2018-11-28 20:10:26 +00:00
Gheorghe-Teodor Bercea	31c1589ab0	[OpenMP][libomptarget] Add new version of SPMD deinit kernel function with argument Summary: To enable the compiler to optimize parts of the function that are not needed when runtime can be omitted, a new version of the SPMD deinit kernel function is needed. This function takes the runtime required flag as an argument. Reviewers: ABataev, kkwli0, caomhin Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D54969 llvm-svn: 347714	2018-11-27 21:23:40 +00:00
Alexey Bataev	d4de439cf4	[OPENMP][NVPTX]Basic support for reductions across the teams. Summary: Added functions __kmpc_nvptx_teams_reduce_nowait_simple and __kmpc_nvptx_teams_end_reduce_nowait_simple to implement basic support for reductions across the teams. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D54967 llvm-svn: 347710	2018-11-27 21:06:09 +00:00
Gheorghe-Teodor Bercea	ad8632a9ba	[OpenMP][libomptarget] Refactor SPMD and runtime requirement checking Summary: Refactor the checking for SPMD mode and whether the runtime is initialized or not. This uses constant flags which enables the runtime to optimize out unused sections of code that depend on these flags. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D54960 llvm-svn: 347698	2018-11-27 19:45:10 +00:00
Alexey Bataev	8ab0924ab4	[OPENMP][NVPTX]Improved lock/critical constructs. Summary: Improved support for critical constructs + omp_..._lock... constructs. Reviewers: gtbercea, kkwli0, caomhin Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D54766 llvm-svn: 347342	2018-11-20 20:19:36 +00:00
Andrey Churbanov	82318c6f14	Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39137 . Do not write to internal structure if it keeps same value. Differential Revision: https://reviews.llvm.org/D54305 llvm-svn: 346862	2018-11-14 13:49:41 +00:00
Alexey Bataev	15ab891e68	[OPENMP]Make lambda mapping follow reqs for PTR_AND_OBJ mapping. Summary: The base pointer for the lambda mapping must point to the lambda capture placement and pointer must point to the captured variable itself. Patch fixes this problem. Reviewers: gtbercea Subscribers: guansong, openmp-commits, kkwli0, caomhin Differential Revision: https://reviews.llvm.org/D54260 llvm-svn: 346407	2018-11-08 15:47:30 +00:00
Andrey Churbanov	855d09855d	Add Hurd support. Patch by samuel.thibault@ens-lyon.org Differential Revision: https://reviews.llvm.org/D54079 llvm-svn: 346310	2018-11-07 12:27:38 +00:00
Andrey Churbanov	c334434550	Implementation of OpenMP 5.0 mutexinoutset task dependency type. Differential Revision: https://reviews.llvm.org/D53380 llvm-svn: 346307	2018-11-07 12:19:57 +00:00
Alexey Bataev	9476ca7db9	[OPENMP][OFFLOADING]Change the lambda capturing flags. Summary: The previously used combination `PTR_AND_OBJ \| PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ \| LITERAL`. Reviewers: gtbercea Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D54035 llvm-svn: 345981	2018-11-02 15:24:47 +00:00
Alexey Bataev	463e9f3224	[OPENMP][NVPTX]Fixed/improved support for globalization in team contexts. Summary: Current globalization scheme works correctly only for SPMD+lightweight runtime mode and does not work for full runtime. Patch improves support for the globalization scheme + reduces global memory consumption in lightweight runtime mode. Patch adds runtime functions to work with the statically allocated global memory. It allows to improve performance and memory consumption. This global memory must be allocated by the compiler. Reviewers: grokos, kkwli0, gtbercea, caomhin Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D53943 llvm-svn: 345976	2018-11-02 14:43:23 +00:00
Gheorghe-Teodor Bercea	b10bacf122	[OpenMP][libomptarget] Add runtime function for pushing coalesced global records Summary: In the case of coalesced global records, we need to push the exact data size passed in. This patch fixes this by outlining the common functionality of the previous push function and by adding a separate entry point for coalesced pushes. The pop function remains unchanged. Reviewers: ABataev, grokos, caomhin Reviewed By: ABataev, grokos Subscribers: jholewinski, cfe-commits, Hahnfeld, guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D53141 llvm-svn: 345867	2018-11-01 18:08:12 +00:00
Alexey Bataev	e5369885dd	[LIBOMPTARGET] Add support for mapping of lambda captures. Summary: Added support for correct mapping of variables captured by reference in lambdas. That kind of mapping may appear only in target-executable regions and must follow the original lambda or another lambda capture for the same lambda. The expected data: base address - the address of the lambda, begin pointer - pointer to the address of the lambda capture, size - size of the captured variable. When OMP_TGT_MAPTYPE_PTR_AND_OBJ mapping type is seen in target-executable region, the target address of the last processed item is taken as the address of the original lambda `tgt_lambda_ptr`. Then, the pointer to capture on the device is calculated like `tgt_lambda_ptr + (host_begin_pointer - host_begin_base)` and the target-based address of the original variable (which host address is `(void*)begin_pointer`) is written to that pointer. Reviewers: kkwli0, gtbercea, grokos Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D51107 llvm-svn: 345608	2018-10-30 15:42:12 +00:00
Andrey Churbanov	6ca3609418	remove duplicate omp_control_tool export to fix windows build Patch by squallatf@gmail.com Differential Revision: https://reviews.llvm.org/D53480 llvm-svn: 345255	2018-10-25 11:04:01 +00:00
Jonathan Peyton	8b3842fc99	[OpenMP] Convert KMP_DYNAMIC_LIB to a 0 or 1 guard everywhere llvm-svn: 343869	2018-10-05 17:59:39 +00:00
Jonathan Peyton	f194033316	[OpenMP] Fix KMP_DYNAMIC_LIB to be dependent on LIBOMP_ENABLE_SHARED The KMP_DYNAMIC_LIB guard was hard set to 1. This patch has the guard depend on CMake variable LIBOMP_ENABLE_SHARED. llvm-svn: 343866	2018-10-05 17:47:58 +00:00
Jonathan Peyton	3574f28709	[OpenMP][OMPT] Fix unsafe initialization of ompt_data_t objects Initializing an ompt_data_t object using the pointer union member is potentially unsafe in 32-bit programs. This change fixes the issue by using the constant, ompt_data_none. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D52046 llvm-svn: 343785	2018-10-04 14:57:04 +00:00
Jonathan Peyton	8bb8a92de9	[OpenMP] Shutdown library on Windows if possible for better OMPT behavior On Windows, child workers are terminated by the parent during the normal program exit process (ExitProcess()) and they are not able to finish generating their OpenMP events. We can force manual library shut down in __kmpc_end() to fix this at least for the cases where __kmpc_end() is properly inserted. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D52628 llvm-svn: 343619	2018-10-02 19:15:04 +00:00
Jonas Hahnfeld	a762bfc03a	[libomptarget-nvptx] Enable asserts in bclib If the user requested LIBOMPTARGET_NVPTX_DEBUG, include asserts in the bitcode library. Everything else will have very unpleasant effects because asserts will appear when falling back to the static library libomptarget-nvptx.a. Differential Revision: https://reviews.llvm.org/D52701 llvm-svn: 343477	2018-10-01 14:16:55 +00:00
Jonas Hahnfeld	a1100e6b9a	[libomptarget-nvptx] reduction: Determine if runtime uninitialized Pass in the correct value of isRuntimeUninitialized() which solves parallel reductions as reported on the mailing list. For reference: r333285 did the same for loop scheduling. Differential Revision: https://reviews.llvm.org/D52725 llvm-svn: 343476	2018-10-01 14:14:26 +00:00
Andrey Churbanov	df60b37226	Fixed workaround made in https://reviews.llvm.org/D51694 . Patch suggested by Kelvin Li: removed optional "kind=" part of kind-selector for variables with long names and kind names. Differential Revision: https://reviews.llvm.org/D52712 llvm-svn: 343475	2018-10-01 14:08:50 +00:00
Jonas Hahnfeld	1bf767fb8e	[libomptarget-nvptx] Align data sharing stack NVPTX requires addresses of pointer locations to be 8-byte aligned or there will be an exception during runtime. This could happen without this patch as shown in the added test: getId() requires 4 bytes of stack and putValueInParallel() uses 16 bytes to store the addresses of the captured variables. Differential Revision: https://reviews.llvm.org/D52655 llvm-svn: 343402	2018-09-30 09:23:21 +00:00
Jonas Hahnfeld	067235f227	[libomptarget-nvptx] Fix ancestor_thread_num and team_size (non-SPMD) According to OpenMP 4.5, p250:12-14: If the requested nest level is outside the range of 0 and the nest level of the current thread, as returned by the omp_get_level routine, the routine returns -1. The SPMD code path will need a similar fix. Differential Revision: https://reviews.llvm.org/D51787 llvm-svn: 343401	2018-09-30 09:23:14 +00:00
Jonas Hahnfeld	fb1b80191e	[libomptarget-nvptx] Add tests for nested parallelism Clang trunk will serialize nested parallel regions. Check that this is correctly reflected in various API methods. Differential Revision: https://reviews.llvm.org/D51786 llvm-svn: 343382	2018-09-29 16:02:32 +00:00
Jonas Hahnfeld	c89a14f5d2	[libomptarget-nvptx] Ignore calls to dynamic API There is no support and according to the OpenMP 4.5, p238:7-9: For implementations that do not support dynamic adjustment of the number of threads this routine has no effect: the value of dyn-var remains false. Add a test that cancellation and nested parallelism aren't supported either. Differential Revision: https://reviews.llvm.org/D51785 llvm-svn: 343381	2018-09-29 16:02:25 +00:00
Jonas Hahnfeld	a743c04412	[libomptarget-nvptx] Fix number of threads in parallel If there is no num_threads() clause we must consider the nthreads-var ICV. Its value is set by omp_set_num_threads() and can be queried using omp_get_max_threads(). The rewritten code now closely resembles the algorithm given in the OpenMP standard. Differential Revision: https://reviews.llvm.org/D51783 llvm-svn: 343380	2018-09-29 16:02:17 +00:00
Alexey Bataev	418af6f6cf	[OPENMP] Add the test to check that the libomptarget does not cause infinite loop on removing non-mapped pointer-with-object. Added test to check that libomptarget does not cause infinite loop when trying to unmap the pointer-with-object data that was not previously mapped. llvm-svn: 343344	2018-09-28 17:13:11 +00:00
Jonas Hahnfeld	122dbb5dce	[libomptarget-nvptx] Add testing infrastructure This patch also introduces testing for libomptarget-nvptx which has been missing until now. I propose to add tests for all bugs that are fixed in the future. The target check-libomptarget-nvptx is not run by default because - we can't determine if there is a GPU plugged into the system. - it will require the latest Clang compiler. Keeping compatibility with older releases would prevent testing newer code generation developed in trunk. Differential Revision: https://reviews.llvm.org/D51687 llvm-svn: 343324	2018-09-28 15:05:43 +00:00
Jonathan Peyton	83e360a427	[OpenMP] Add missing __kmpc_critical_with_hint to dllexports This patch puts the __kmpc_critical_with_hint function in dllexports and also replaces some OMP_45_ENABLED to OMP_50_ENABLED Differential Revision: https://reviews.llvm.org/D52380 llvm-svn: 343143	2018-09-26 20:47:25 +00:00
Jonathan Peyton	e525f0d4e2	[OpenMP] Fix balanced affinity so thread's private affinity mask is updated Balanced affinity only updated the thread's affinity with the operating system. This change also has the thread's private mask reflect that change as well so that any API that probes the thread's affinity mask will report the correct mask value. Differential Revision: https://reviews.llvm.org/D52379 llvm-svn: 343142	2018-09-26 20:43:23 +00:00
Jonathan Peyton	985f152f25	[OpenMP] Update ittnotify sources This patch updates the ittnotify sources to the latest corresponding with Intel(R) VTune(TM) Amplifier 2018 Differential Revision: https://reviews.llvm.org/D52378 llvm-svn: 343139	2018-09-26 20:30:00 +00:00
Jonathan Peyton	cf27e31bdd	[OpenMP] Fix performance issue from 376.kdtree This change improves the performance of 376.kdtree by giving the compiler an opportunity to do inlining and other optimizations for the call path, __kmpc_omp_task_complete_if0()->__kmp_task_finish(), which is one of the hot paths in the program; some functions in kmp_taskdeps.cpp were moved to the new header file, kmp_taskdeps.h to achieve this. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D51889 llvm-svn: 343138	2018-09-26 20:24:39 +00:00
Jonathan Peyton	60eec6fecb	[OpenMP][OMPT] A few improvements This change includes miscellaneous improvements as follows: 1) Added ompt_get_proc_id() implementation for Windows 2) Added parser and print tool for omp-tool-var, just in case it needs to be printed (OMP_DISPLAY_ENV) 3) omp_control_tool is exported on Windows Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D50538 llvm-svn: 343137	2018-09-26 20:19:44 +00:00
Gheorghe-Teodor Bercea	f7256a593f	[OpenMP][libomptarget] Set the frame pointer then test empty slot condition Summary: NFC - just fixing a bug: the empty slot test was before the re-setting of the Stack pointer. Reviewers: ABataev, caomhin, Hahnfeld Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D52122 llvm-svn: 343006	2018-09-25 18:48:14 +00:00
Gheorghe-Teodor Bercea	9bc3bfffb4	[OpenMP][libomptarget] Simplify warp master selection for data sharing Summary: There is currently no supported situation where the warp master is not the first thread in the warp. This also avoids the device execution from hanging on Volta GPUs when ballot_sync is called by a number of threads that is less than the size of a warp. Reviewers: ABataev, caomhin, grokos Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D50188 llvm-svn: 342972	2018-09-25 13:23:32 +00:00
Alexey Bataev	022bf16b41	[OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime. Summary: We need the support for per-team shared variables to support codegen for lastprivates/reductions. Patch adds this support by using shared memory if the total size of the reductions/lastprivates is <= 128 bytes, then pre-allocated buffer in global memory if size is <= 4K bytes, or uses malloc/free, otherwise. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51875 llvm-svn: 342737	2018-09-21 14:11:41 +00:00
Alexey Bataev	06b6e0f406	[OPENMP]Increment iterator when the loop is continued. Summary: Missed operation of the incrementing iterator when required just to continue execution. Reviewers: kkwli0, gtbercea, grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51937 llvm-svn: 341964	2018-09-11 17:16:26 +00:00
Joachim Protze	489cdb783a	[OMPT] Update types according to TR7 Some types and callback signatures have changed from TR6 to TR7. Major changes (only adding signatures and stubs): (-remove idle callback) done by D48362 -add reduction and dispatch callback -add get_task_memory and finalize_tool runtime entry points -ompt_invoker_t becomes ompt_parallel_flag_t -more types of sync_regions Patch provided by Simon Convent Reviewers: hbae, protze.joachim Differential Revision: https://reviews.llvm.org/D50774 llvm-svn: 341834	2018-09-10 14:34:54 +00:00
Jonas Hahnfeld	dc79c7187c	[libomptarget-nvptx] Remove last mentions of __kmpc_print_* Their implementation was removed during review, delete their prototype declarations. llvm-svn: 341748	2018-09-08 12:10:19 +00:00
Jonathan Peyton	08f0180ba9	[OpenMP] Update copyright to 2018 Better late than never llvm-svn: 341703	2018-09-07 20:33:35 +00:00
Jonathan Peyton	a2f6eff488	[OpenMP] Change hint parameter type for critical to uint32_t Add atomic hint flags to the enum. The hint parameter type was changed to uint32_t in __kmpc_critical_with_hint() Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D51235 llvm-svn: 341694	2018-09-07 18:46:40 +00:00
Jonathan Peyton	2ff302d5d7	[OpenMP] Synchronization hint constants added to headers ident flags reserved for atomic hints. This patch adds omp_sync_hint_t to omp.h and omp_sync_hint_kind to omp_lib.h. For better maintainability the list of macros for ident flags was replaced with an enum. The new KMP_IDENT_ATOMIC_HINT_MASK was added to the enum to support possible future atomic hints. Also fix omp_lib.h.var to be under 72 chars again after 5.0 OpenMP Memory commit Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D51233 llvm-svn: 341693	2018-09-07 18:45:13 +00:00
Jonathan Peyton	92ca61884b	[OpenMP] Initial implementation of OMP 5.0 Memory Management routines Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries, and OMP_ALLOCATOR environment variable. Added support for HBW memory on Linux if libmemkind.so library is accessible (dynamic library only, no support for static libraries). Only used stable API (hbwmalloc) of the memkind library though we may consider using experimental API in future. The ICV def-allocator-var is implemented per implicit task similar to place-partition-var. In the absence of a requested allocator, the runtime uses the default allocator. Predefined allocators (the only ones currently available) are made similar for C and Fortran, - pointers (long integers) with values 1 to 8. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D51232 llvm-svn: 341687	2018-09-07 18:25:49 +00:00
Andrey Churbanov	d946778b9f	Fix for https://bugs.llvm.org/show_bug.cgi?id=38839 : Changed style of declarations to be less than 72 char each. Differential Revision: https://reviews.llvm.org/D51694 llvm-svn: 341653	2018-09-07 12:22:04 +00:00
Jonas Hahnfeld	21e3ee0afe	[libomptarget] Remove two unneeded includes, NFCI. Follow-up to r340542 and r340767. llvm-svn: 341563	2018-09-06 17:00:57 +00:00
Jonas Hahnfeld	f27dcf01d2	[libomptarget][test] Announce compiler features This is a follow-up to r341371: The new test for PR38704 doesn't work with Clang 6.0. It uses an UNSUPPORTED: clang-6, but that hasn't worked because the compiler features weren't known to lit. llvm-svn: 341448	2018-09-05 07:26:00 +00:00
Sergey Dmitriev	b4dc69ff80	[libomptarget] Remove `Devices` from `RTLInfoTy` This patch removes unused field `Devices` from `RTLInfoTy`. Differential Revision: https://reviews.llvm.org/D51653 llvm-svn: 341399	2018-09-04 20:23:09 +00:00
Jonas Hahnfeld	bb51d39871	[libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI. cuDeviceGetProperties has apparently been deprecated since CUDA 5.0. Nvidia started using annotations only in CUDA 9.2, so nobody noticed nor cared before. The new function returns the same values, tested with a P100. Differential Revision: https://reviews.llvm.org/D51624 llvm-svn: 341372	2018-09-04 15:13:28 +00:00
Jonas Hahnfeld	f7f86971e6	[libomptarget] PR38704: Fix erase of ShadowPtrMap erase() invalidates the iterator and returns a new one pointing to the following element. The code now follows the example at https://en.cppreference.com/w/cpp/container/map/erase. (The added testcase crashes without this patch.) Reported by David Binderman (https://llvm.org/PR38704)! Differential Revision: https://reviews.llvm.org/D51623 llvm-svn: 341371	2018-09-04 15:13:23 +00:00
Jonas Hahnfeld	82d20201d0	[libomptarget][NVPTX] Drop dead code and data structures, NFCI. * cg and HasCancel in WorkDescr were never read and can be removed. * This eliminates the last use of priv in ThreadPrivateContext. * CounterGroup is unused afterwards. * Remove duplicate external declares in omptarget-nvptx.cu that are already in the header omptarget-nvptx.h. Differential Revision: https://reviews.llvm.org/D51622 llvm-svn: 341370	2018-09-04 15:13:17 +00:00
Jonas Hahnfeld	96c13488ab	[libomptarget][NVPTX] Fix __kmpc_spmd_kernel_deinit If the runtime is uninitialized the master thread must Enqueue the state object, and ALL threads must return immediately. Found post-commit of https://reviews.llvm.org/D51222. llvm-svn: 341328	2018-09-03 17:24:23 +00:00
Alexey Bataev	39a4724095	[OPENMP][NVPTX] Replace assert() by ASSERT0() macro, NFC. Required to fix the buildbots. llvm-svn: 340956	2018-08-29 19:22:06 +00:00
Alexey Bataev	b7a5d38cf5	[OPENMP][NVPTX] Lightweight runtime support for SPMD mode. Summary: Implemented simple and lightweight runtime support for SPMD mode-based constructs. It adds support for L2 sequential parallelism without full runtime support. Also, patch fixes some use cases for uninitialized\|lightweight runtime. Reviewers: grokos, kkwli0, Hahnfeld, gtbercea Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51222 llvm-svn: 340944	2018-08-29 17:35:09 +00:00
Gheorghe-Teodor Bercea	15f5407d92	[OpenMP][Fix] Conditional compilation leaves variables unused Summary: Prevent variables from being left unused by conditional compilation. Reviewers: ABataev, grokos, Hahnfeld, caomhin, protze.joachim Reviewed By: Hahnfeld Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51303 llvm-svn: 340771	2018-08-27 19:54:26 +00:00
Alexandre Eichenberger	e9b7d8dcd6	[OpenMP][libomptarget] rework of fatal error reporting Summary: Removed the function that used a lock and varargs Used the same mechanism as for debug messages Reviewers: ABataev, gtbercea, grokos, Hahnfeld Reviewed By: gtbercea, Hahnfeld Subscribers: mikerice, ABataev, RaviNarayanaswamy, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51226 llvm-svn: 340767	2018-08-27 18:20:15 +00:00
Gheorghe-Teodor Bercea	353adf437d	[OpenMP][Fix] Ensure comparison between unsigned values. Summary: Ensure the values being compared are both unsigned. Reviewers: ABataev, Hahnfeld, caomhin, grokos, AndreyChurbanov Reviewed By: AndreyChurbanov Subscribers: AndreyChurbanov, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D51301 llvm-svn: 340745	2018-08-27 14:52:20 +00:00
Jonathan Peyton	2a966e84ce	[OpenMP] Remove deprecated/obsolete MIC attributes from headers llvm-svn: 340656	2018-08-24 21:34:10 +00:00
Jonathan Peyton	2c3e5d82b4	[OpenMP] Fixed affinity verbose double printing for balanced type. llvm-svn: 340647	2018-08-24 20:35:42 +00:00
Jonathan Peyton	a4a9c48c78	[OpenMP] Fix tasking bug for decreasing hot team nthreads The __kmp_execute_tasks_template() function reads the task_team and current_task from the thread structure. There appears to be a pathological timing where the number of threads in the hot team decreases and so a thread is put in the pool via __kmp_free_thread(). It could be the case that: 1) A thread reads th_task_team into task_team local variables and is then interrupted by the OS 2) Master frees the thread and sets current task and task team to NULL 3) The thread reads current_task as NULL When this happens, current_task is dereferenced and a segfault occurs. This patch just checks for current_task to not be NULL as well. Differential Revision: https://reviews.llvm.org/D50651 llvm-svn: 340632	2018-08-24 18:07:35 +00:00
Jonathan Peyton	ca10a76f08	[OpenMP] Add check for hot_teams array If hot teams are not being used, this code could seg fault without the added check, and does so when composability is used in conjunction with nesting. The fix prevents the segfault. Differential Revision: https://reviews.llvm.org/D50649 llvm-svn: 340629	2018-08-24 18:05:00 +00:00
Jonathan Peyton	b1b221c82c	[OpenMP] Fix incorrect barrier imbalance reporting in ITTNOTIFY Exclude nested explicit tasks from timing, only outer level explicit task counted and its time added to barrier arrive time for the thread. Differential Revision: https://reviews.llvm.org/D50584 llvm-svn: 340628	2018-08-24 18:03:27 +00:00
Alexandre Eichenberger	1b4a666ba5	[OpenMP][libomptarget] Bringing up to spec with respect to OMP_TARGET_OFFLOAD env var Summary: Right now, only the OMP_TARGET_OFFLOAD=DISABLED was implemented. Added support for the other MANDATORY and DEFAULT values. Reviewers: gtbercea, ABataev, grokos, caomhin, Hahnfeld Reviewed By: Hahnfeld Subscribers: protze.joachim, gtbercea, AlexEichenberger, RaviNarayanaswamy, Hahnfeld, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D50522 llvm-svn: 340542	2018-08-23 16:22:42 +00:00
Joachim Protze	e1a04b4659	[OMPT] Remove OMPT idle callback The idle callback was removed from the spec as of TR7. This removes it from the implementation. Patch provided by Simon Convent Reviewers: hbae, protze.joachim Differential Revision: https://reviews.llvm.org/D48362 llvm-svn: 339771	2018-08-15 13:54:28 +00:00
Jonathan Peyton	a3f6d4c5b8	[OMPT] Make omp_control_tool() compliant when called from Fortran programs This change fixes an incorrect behavior of the omp_control_tool function when called from Fortran applications. A tool callback function for this event is supposed to get NULL for the third argument according to the specification, but the current implementation just passes a garbage value. A possible fix is to use the OPTIONAL attribute for the third argument. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D50565 llvm-svn: 339585	2018-08-13 17:26:18 +00:00
Jonathan Peyton	baad3f6016	[OpenMP] Cleanup code This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch. Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT() Differential Revision: https://reviews.llvm.org/D49105 llvm-svn: 339393	2018-08-09 22:04:30 +00:00
Jonathan Peyton	821649229e	[OpenMP] Fix doacross testing for gcc This patch adds a test using the doacross clauses in OpenMP and removes gcc from testing kmp_doacross_check.c which is only testing the kmp rather than the gomp interface. Differential Revision: https://reviews.llvm.org/D50014 llvm-svn: 338757	2018-08-02 19:13:07 +00:00
Jonas Hahnfeld	ef8f737288	[OMPT] Disable by default on Windows This is broken per PR36561 and PR36574, so disable it for now until somebody interested can take a look. OMPT can still be activated manually by passing -DLIBOMP_OMPT_SUPPORT=ON during configuration. Differential Revision: https://reviews.llvm.org/D50086 llvm-svn: 338721	2018-08-02 14:34:08 +00:00
Jonas Hahnfeld	5b57eb4b09	[tests] Add annotations for taskloop features Only supported since GCC 6 and Intel 17.0. However GCC 6.3.0 is crashing on two of the tests, so disable them as well... Differential Revision: https://reviews.llvm.org/D50085 llvm-svn: 338720	2018-08-02 14:34:03 +00:00
Joachim Protze	935399d254	[OMPT,tests] Fix taskloop testcase scheduling effects The taskloop testcase had scheduling effects. Tasks of the taskloop would sometimes be scheduled before all tasks were created. The testing is now split into two phases. First, the task creation on the master is tested, then the scheduling events of the tasks are tested. Thus, the order of creation and scheduling events is irrelevant. Patch by Simon Convent Reviewed by: protze.joachim, Hahnfeld Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D50140 llvm-svn: 338580	2018-08-01 16:15:18 +00:00
Jonas Hahnfeld	51fc3cc628	[test] Convert test for PR36720 to c89 GCC 4.8.5 defaults to this old C standard. I think we should make the tests pass a newer -std=c99\|c11 but that's too intrusive for now... Differential Revision: https://reviews.llvm.org/D50084 llvm-svn: 338490	2018-08-01 06:26:55 +00:00
Jonathan Peyton	28226e7d64	[OpenMP] Fix tasking + parallel bug From the bug report, the runtime needs to initialize the nproc variables (inside middle init) for each root when the task is encountered, otherwise, a segfault can occur. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36720 Differential Revision: https://reviews.llvm.org/D49996 llvm-svn: 338313	2018-07-30 21:47:56 +00:00
Gheorghe-Teodor Bercea	f729df821a	[OpenMP] Fix new task creation Summary: When OMPT is not supported the __kmp_omp_task() function is passed the parameters in the wrong order. This is a fix related to patch D47709. Reviewers: Hahnfeld, sconvent, caomhin, jlpeyton Reviewed By: Hahnfeld Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D50001 llvm-svn: 338295	2018-07-30 19:51:51 +00:00
Jonas Hahnfeld	f985f98128	[CMake] Disable -Wstringop-overflow GCC 8 produces false-positives with this: In file included from <openmp>/src/runtime/src/kmp_os.h:950, from <openmp>/src/runtime/src/kmp.h:78, from <openmp>/src/runtime/src/kmp_environment.cpp:54: <openmp>/src/runtime/src/kmp_environment.cpp: In function ‘char* __kmp_env_get(const char*)’: <openmp>/src/runtime/src/kmp_safe_c_api.h:52:50: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound depends on the length of the source argument [-Wstringop-overflow=] #define KMP_STRNCPY_S(dst, bsz, src, cnt) strncpy(dst, src, cnt) ~~~~~~~^~~~~~~~~~~~~~~ <openmp>/src/runtime/src/kmp_environment.cpp:97:5: note: in expansion of macro ‘KMP_STRNCPY_S’ KMP_STRNCPY_S(result, len, value, len); ^~~~~~~~~~~~~ <openmp>/src/runtime/src/kmp_environment.cpp:92:28: note: length computed here size_t len = KMP_STRLEN(value) + 1; This is stupid because result is allocated with KMP_INTERNAL_MALLOC(len), so the arguments are correct. Differential Revision: https://reviews.llvm.org/D49904 llvm-svn: 338283	2018-07-30 18:16:22 +00:00
Jonathan Peyton	284fab195a	[OpenMP] Add GOMP version symbols for OMP_4.5 API This patch adds the appropriate version symbols to the relevant API functions Differential Revision: https://reviews.llvm.org/D49859 llvm-svn: 338281	2018-07-30 17:50:35 +00:00
Jonathan Peyton	369d72db11	[OpenMP] Implement GOMP doacross compatibility This change introduces GOMP doacross compatibility. There are 12 new interface functions, 6 for long type and 6 for unsigned long long type: GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_doacross_[schedule]_start where schedule can be static, dynamic, guided, or runtime. These functions just translate the parameters if necessary and send them to the corresponding kmp function. E.g., GOMP_doacross_post() -> __kmpc_doacross_post() For the GOMP_doacross_post function, there is template specialization to account for when long is a four byte vs an eight byte type. If it is a four byte type, then a temporary array has to be created to convert the four byte integers into eight byte integers, which is then sent into __kmpc_doacross_post(). Because GOMP_doacross_wait uses varargs, it always needs a temporary array and does not need template specialization. Differential Revision: https://reviews.llvm.org/D49857 llvm-svn: 338280	2018-07-30 17:48:33 +00:00
Jonathan Peyton	8692e142b3	[OpenMP] Fix build errors when building with KMP_DEBUG_ADAPTIVE_LOCKS=1 This change fixes build errors when building a runtime with adaptive lock stats enabled. Most of the errors were due to the recent changes in the runtime, but it seems that we have not tried to build this debug runtime on Windows for a long time. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49823 llvm-svn: 338277	2018-07-30 17:45:23 +00:00
Jonathan Peyton	f0682ac498	[OpenMP][Stats] Cleanup stats gathering code 1) Remove unnecessary data from list node structure 2) Remove timerPair in favor of pushing/popping explicitTimers. This way, nested timers will work properly. 3) Fix #pragma omp critical timers 4) Add histogram capability 5) Add KMP_STATS_FILE formatting capability 6) Have time partitioned into serial & parallel by introducing partitionedTimers::exchange(). This also counts the number of serial regions in the executable. 7) Fix up the timers around OMP loops so that scheduling overhead and work are both counted correctly. 8) Fix up the iterations statistics so they count the number of iterations the thread receives at each loop scheduling event 9) Change timers so there is only one RDTSC read per event change 10) Fix up the outdated comments for the timers Differential Revision: https://reviews.llvm.org/D49699 llvm-svn: 338276	2018-07-30 17:41:08 +00:00
Joachim Protze	cdaefac5bd	[OMPT] Fix OMPT callbacks for the taskloop construct and add testcase Fix the order of callbacks related to the taskloop construct. Add the iteration_count to work callbacks (according to the spec). Use __kmpc_omp_task() instead of __kmp_omp_task() to include OMPT callbacks. Add a testcase. Patch by Simon Convent Reviewed by: protze.joachim, hbae Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D47709 llvm-svn: 338146	2018-07-27 18:13:24 +00:00
Joachim Protze	86ed6aa668	[OMPT] Adapt OMPT callbacks for tasks to handle untied tasks correctly The ompt/tasks/task_types.c testcase did not test untied tasks properly. Now, frame addresses are tested and two scheduling points are added at which the task can switch to another thread. Due to scheduling effects, the frame address could be NULL. This needed a restructure of the way OMPT callbacks are called. __ompt_task_finish() now has an extra parameter, whether a task is completed. Its invocation has been moved into __kmp_task_finish(). Thus, the order of the writes to the frame addresses is not subject to scheduling effects anymore. Patch by Simon Convent Reviewed by: protze.joachim, hbae Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D49181 llvm-svn: 338145	2018-07-27 18:13:20 +00:00
Joachim Protze	f203109edb	[OMPT] Print two more addresses in print_fuzzy_address_block() The two more outputs are needed to match the return addresses when using the Intel Compiler, as it generates more instructions between the fuzzy-printing of the address and the runtime call. Patch by Simon Convent Reviewed By: protze.joachim, hbae Differential Revision: https://reviews.llvm.org/D49373 llvm-svn: 338144	2018-07-27 18:13:15 +00:00
Jonas Hahnfeld	3a0e9b37f3	PR30734: Remove __kmp_ft_page_allocate() This function was not enabled by default and not exported when manually tweaking the build flags. Additionally it was hard to use since there is no corresponding __kmp_ft_page_free(). The code itself is questionable because the returned memory address is padded by an extra pointer which stores the unpadded start of the allocated region (this would need to be freed). Differential Revision: https://reviews.llvm.org/D49802 llvm-svn: 338052	2018-07-26 18:15:02 +00:00
Jonas Hahnfeld	6fbbf27d98	[test] Remove XFAIL of omp_for_bigbounds.c for Intel Compiler The initial commit said that the test passes with Intel Compiler, so change XFAIL to only list clang and gcc. Differential Revision: https://reviews.llvm.org/D49801 llvm-svn: 338051	2018-07-26 18:14:57 +00:00
Jonas Hahnfeld	ba5ec9c684	[OMPT] Fix typo in test parallel/nested_thread_num.c This caused test failures with GCC since its initial commit in r336085 (https://reviews.llvm.org/D46533). llvm-svn: 337911	2018-07-25 12:34:31 +00:00
Alexey Bataev	37d4156b11	[OPENMP, NVPTX] Fixed synchronization construct + code cleanup. Summary: 1. Fixed internal problem in `__kmpc_barrier` function: SPMD mode synchronization function should be called only in L1 parallel level. 2. Removed some extra code for synchronization inside of the code, used `__kmpc_barrier` instead. 3. Some code cleanup. Reviewers: gtbercea, grokos Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D49564 llvm-svn: 337691	2018-07-23 13:52:12 +00:00
Jonathan Peyton	a764af68be	Block library shutdown until unreaped threads finish spin-waiting This change fixes possibly invalid access to the internal data structure during library shutdown. In a heavily oversubscribed situation, the library shutdown sequence can reach the point where resources are deallocated while there still exist threads in their final spinning loop. The added loop in __kmp_internal_end() checks if there are such busy-waiting threads and blocks the shutdown sequence if that is the case. Two versions of kmp_wait_template() are now used to minimize performance impact. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D49452 llvm-svn: 337486	2018-07-19 19:17:00 +00:00
George Rokos	a0da24683b	[OpenMP][libomptarget] New map interface: remove translation code and ensure proper alignment of struct members This patch removes the translation code since this functionality is now implemented in the compiler. target_data_begin and target_data_end are also patched to handle some special cases that used to be handled by the obsolete translation function, namely ensure proper alignment of struct members when we have partially mapped structs. Mapping a struct from a higher address (i.e. not from its beginning) can result in distortion of the alignment for some of its member fields. Padding restores the original (proper) alignment. Differential revision: https://reviews.llvm.org/D44186 llvm-svn: 337455	2018-07-19 13:41:03 +00:00
Joachim Protze	bb869f42b7	[libomptarget] Also support several images for elf In revision r336569 (D49036) libomptarget support for multiple nvidia images has been fixed in case a target region resides inside one or multiple libraries and in the compiled application. But the issues is still present for elf images. This fix will also support multiple images for elf. Patch by Jannis Klinkenberg Reviewers: protze.joachim, ABataev, grokos Reviewed By: protze.joachim, ABataev, grokos Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D49418 llvm-svn: 337355	2018-07-18 07:23:46 +00:00
Azharuddin Mohammed	6712b8675b	[cmake] Fix libomptarget/test/CMakeLists.txt Summary: Should be variable name instead of variable reference. If the variable is somehow unset, it messes up the if condition expression and causes a CMake error. Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld Reviewed By: Hahnfeld Subscribers: mgorny, llvm-commits, openmp-commits Differential Revision: https://reviews.llvm.org/D47221 llvm-svn: 337133	2018-07-15 17:29:43 +00:00
Gheorghe-Teodor Bercea	9e94326185	[OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode Summary: This patch fixes the data sharing infrastructure to work for the SPMD and non-SPMD cases. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: ABataev, grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D49204 llvm-svn: 337013	2018-07-13 16:14:22 +00:00
Alexey Bataev	c2c0138a04	[OPENMP, NVPTX] Fix loop boundaries calculation for dynamic loops. Summary: Patch fixes the next problems. 1. Removes unused functions from omptarget_nvptx_ThreadPrivateContext class + simplified data members. 2. Fixed calculation of loop boundaries for dynamic loops with static scheduling. 3. Introduced saving/restoring of the dynamic loop boundaries to support several nested parallel dynamic loops. Reviewers: grokos Subscribers: guansong, kkwli0, openmp-commits Differential Revision: https://reviews.llvm.org/D49241 llvm-svn: 336915	2018-07-12 15:18:28 +00:00
Jonathan Peyton	dc73f512ae	Fix const cast problem introduced in r336563 336563 eliminated CCAST() macros caused build failures llvm-svn: 336586	2018-07-09 19:09:31 +00:00
Jonathan Peyton	61d44f188a	[OpenMP] Fix a few formatting issues llvm-svn: 336575	2018-07-09 18:09:25 +00:00
Jonathan Peyton	f639936748	[OpenMP] Introduce hierarchical scheduling This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default. To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based off of the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems" Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iterations space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer. [ Global iteration space (0-999) ] (use static) [ L1 | L1 | L1 | L1 ] (use dynamic,1) [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ] In the example shown above, there are 8 threads and 4 L1 caches being targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA OMP_SCHEDULE can now read a hierarchical schedule with this syntax: OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code. Differential Revision: https://reviews.llvm.org/D47962 llvm-svn: 336571	2018-07-09 17:51:13 +00:00
Alexey Bataev	2622e9e5b3	[OPENMP, NVPTX] Support several images in the executable. Summary: Currently Cuda plugin supports loading of the single image, though we may have the executable with the several images, if it has target regions inside of the dynamically loaded library. Patch allows to load multiple images. Reviewers: grokos Subscribers: guansong, openmp-commits, kkwli0 Differential Revision: https://reviews.llvm.org/D49036 llvm-svn: 336569	2018-07-09 17:46:55 +00:00
Jonathan Peyton	39ada85446	[OpenMP] Restructure loop code for hierarchical scheduling This patch reorganizes the loop scheduling code in order to allow hierarchical scheduling to use it more effectively. In particular, the goal of this patch is to separate the algorithmic parts of the scheduling from the thread logistics code. Moves declarations & structures to kmp_dispatch.h for easier access in other files. Extracts the algorithmic part of __kmp_dispatch_init() and __kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and __kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in __kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the hierarchical scheduler needs to access the scheduling logic without the bookkeeping logic. To prepare for new pointer in dispatch_private_info_t, a new flags variable is created which stores the ordered and nomerge flags instead of them being in two separate variables. This will keep the dispatch_private_info_t structure the same size. Differential Revision: https://reviews.llvm.org/D47961 llvm-svn: 336568	2018-07-09 17:45:33 +00:00
Jonathan Peyton	37e2ef5434	[OpenMP] Use C++11 Atomics - barrier, tasking, and lock code These are preliminary changes that attempt to use C++11 Atomics in the runtime. We are expecting better portability with this change across architectures/OSes. Here is the summary of the changes. Most variables that need synchronization operation were converted to generic atomic variables (std::atomic<T>). Variables that are updated with combined CAS are packed into a single atomic variable, and partial read/write is done through unpacking/packing Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D47903 llvm-svn: 336563	2018-07-09 17:36:22 +00:00
Kelvin Li	b1711b28f7	Define the __STDC_FORMAT_MACROS to avoid test failure on some platforms. ompt/misc/api_calls_from_other_thread.cpp ompt/misc/interoperability.cpp Differential Revision: https://reviews.llvm.org/D48984 llvm-svn: 336438	2018-07-06 14:15:59 +00:00
Joachim Protze	b41c61eed4	Dropped non-supported "--no-as-needed" flag from OMPT tests for macOS The flag "--no-as-needed" is not recognized by the linker on macOS making the following tests fail: ompt/loadtool/tool_available/tool_available.c ompt/loadtool/tool_not_available/tool_not_available.c This patch removes this flag for macOS and adds it only for Linux and Windows. I tested it on Ubuntu 16.04 and macOS HighSierra, with Clang/LLVM 6.0.1 and OpenMP trunk. This solution was also discussed in the OpenMP-dev mailing list. Patch provided by Simone Atzeni Differential Revision: https://reviews.llvm.org/D48888 llvm-svn: 336327	2018-07-05 09:14:06 +00:00
Joachim Protze	00505b85a3	[OMPT] Add synchronization to threads_nested.c testcase The testcase potentially fails when a thread is reused. The added synchronization makes sure this does not happen. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48932 llvm-svn: 336326	2018-07-05 09:14:01 +00:00
Joachim Protze	04a00fc18c	[OMPT] Use alloca() to force availability of frame pointer When compiling with icc, there is a problem with reenter frame addresses in parallel_begin callbacks in the interoperability.c testcase. (The address is not available. thus NULL) Using alloca() forces availability of the frame pointer. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48282 llvm-svn: 336088	2018-07-02 09:13:38 +00:00
Joachim Protze	e2eec57a4f	[OMPT] Add tests for runtime entry points from non-OpenMP threads Several runtime entry points have not been tested from non-OpenMP threads. This adds tests to an existing testcase. While at it, the testcase was reformatted Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D48124 llvm-svn: 336087	2018-07-02 09:13:34 +00:00
Joachim Protze	28d2d708d4	[OMPT] Add testcases for thread_begin and thread_end callbacks Especially the thread_end callback has not been tested before. This adds a testcase for nested and non-nested threads. Patch provided by Simon Convent Differential Revision: https://reviews.llvm.org/D47824 llvm-svn: 336086	2018-07-02 09:13:30 +00:00
Joachim Protze	4a73ae167e	[OMPT] Provide the right thread_num for ancestor levels The current implementation always provides the thread-num for the current parallel region. This patch fixes the behavior for ancestor levels >0. Differential Revision: https://reviews.llvm.org/D46533 llvm-svn: 336085	2018-07-02 09:13:24 +00:00
Alexey Bataev	3994bafbc7	[OPENMP, NVPTX] Sync threads before start ordered loops. Summary: Threads must be synchronized before starting ordered construct. Reviewers: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D48732 llvm-svn: 335987	2018-06-29 16:16:00 +00:00
Alexey Bataev	0ac29350b5	[OPENMP, NVPTX] Fixes for NVPTX RTL Summary: Patch fixes several problems in the implementation of NVPTX RTL. 1. Detection of the last iteration for loops with static scheduling, no chunks. 2. Fixes reductions for the serialized parallel constructs. 3. Fixes handling of the barriers. Reviewers: grokos Reviewed By: grokos Subscribers: Hahnfeld, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D48480 llvm-svn: 335469	2018-06-25 13:43:35 +00:00
Andrey Churbanov	a7fa3f009a	minor: fixed typo in debug print llvm-svn: 335138	2018-06-20 15:54:11 +00:00
Jonas Hahnfeld	d03cbf2cfe	Remove liboffload from repository See the mailing list for the proposal and discussion: http://lists.llvm.org/pipermail/openmp-dev/2018-June/002041.html llvm-svn: 335069	2018-06-19 19:08:17 +00:00
Guansong Zhang	f9e56e5982	[OpenMP] [CUDA] Expose teamid to the debug path Summary: Small bug fix for debug build. A previous fix causing trouble for debug build. Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D48286 llvm-svn: 335046	2018-06-19 14:05:38 +00:00
Jonathan Peyton	e92ae43be8	[OpenMP] Fix formatting issues in kmp_stats.h llvm-svn: 334335	2018-06-08 22:27:53 +00:00
Joachim Protze	406361330b	[OMPT] Rename ompt_wait_id to omp_wait_id Rename ompt_wait_id to omp_wait_id, as defined in the spec. Differential Revision: https://reviews.llvm.org/D46530 llvm-svn: 333368	2018-05-28 08:16:08 +00:00
Joachim Protze	c5836064bb	[OMPT] Rename ompt_frame_t to omp_frame_t Rename ompt_frame_t to omp_frame_t, as defined in the spec. Differential Revision: https://reviews.llvm.org/D43568 llvm-svn: 333367	2018-05-28 08:14:58 +00:00
Jonas Hahnfeld	3c6595d65d	[OMPT] Fix test parallel/not_enough_threads.c Upcoming changes to FileCheck will modify CHECK-DAG to not match overlapping regions of the input. This test was found to be affected because it expects to find four threads to invoke events of type ompt_event_implicit_task_begin. It turns out this is wrong because OMP_THREAD_LIMIT is set to 2, so there are only two threads. The rest of the test got it right so it went unnoticed until now. (Rewrite test and apply clang-format to it as discussed in the past.) Differential Revision: https://reviews.llvm.org/D47119 llvm-svn: 333361	2018-05-27 17:07:38 +00:00
Jonas Hahnfeld	17aabf83e9	[libomptarget-nvptx] loop: Determine if runtime uninitialized The generic entry points for static loop scheduling previously hardcoded that the runtime was initialized. This can be wrong if the compiler analyzes that the runtime is not needed and calls the init functions accordingly. This didn't affect clang-ykt because they have entry points for different combinations of SPMD x Runtime not needed. I didn't do measurements yet but with inlining we might get away with always calling the generic interface and letting compiler and runtime figure out the rest. In any case, a correct runtime is always better than having functions that may only be called if previous calls passed in a specific set of arguments! Differential Revision: https://reviews.llvm.org/D47131 llvm-svn: 333285	2018-05-25 15:56:48 +00:00
Jonas Hahnfeld	65e0b8784c	[CMake] Unify install path for libraries Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX. Differential Revision: https://reviews.llvm.org/D47130 llvm-svn: 333284	2018-05-25 15:56:41 +00:00
George Rokos	6da6f433a0	[CUDA]Fix dynamic|guided scheduling. The existing implementation of the dynamic scheduling breaks the contract introduced by the original openmp runtime and, thus, is incorrect. Patch fixes it and introduces correct dynamic scheduling model. Thanks to Alexey Bataev for submitting this patch. Differential Revision: https://reviews.llvm.org/D47333 llvm-svn: 333225	2018-05-24 21:12:41 +00:00
Jonas Hahnfeld	9228f9718c	[libomptarget-nvptx-bc] Pass found CUDA installations We already know where the CUDA SDK is, so there is no point in letting Clang search for it again and possibly finding no or a different installation. --cuda-path is supported since the beginning of CUDA support in Clang, so making this required doesn't impose additional restrictions. Differential Revision: https://reviews.llvm.org/D46930 llvm-svn: 332495	2018-05-16 17:20:27 +00:00
Jonas Hahnfeld	37bbe1a698	[libomptarget-nvptx] Test bitcode compiler flags and enable by default Move all logic related to selecting the bitcode compiler and linker into a new file and dynamically test required compiler flags. This also adds -fcuda-rdc for Clang trunk as previously attempted in D44992 which fixes the build. As a result this change also enables building the library by default if all prerequisites are met. Differential Revision: https://reviews.llvm.org/D46901 llvm-svn: 332494	2018-05-16 17:20:21 +00:00
Gheorghe-Teodor Bercea	787a350021	[OpenMP][libomptarget] Add function for checking SPMD mode Summary: Add function to the NVPTX libomptarget library that will return true if the current target region is being executed in SPMD mode. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D46840 llvm-svn: 332360	2018-05-15 15:16:43 +00:00
Joachim Protze	9be9cf20bf	[OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regions implicit_task_end callbacks in nested parallel regions did not always give the correct thread_num, since the inner parallel region may have already been finalized. Now, the thread_num is stored at the beginning of the implicit task and retrieved at the end, whenever necessary. A testcase was added as well. Differential Revision: https://reviews.llvm.org/D46260 llvm-svn: 331632	2018-05-07 12:42:21 +00:00
Joachim Protze	8fc39f6b19	[OMPT] Add api_calls_misc.c testcase and rename api_calls.c testcase The api_calls_misc.c testcase tests the following api calls: ompt_get_callback() ompt_get_state() ompt_enumerate_states() ompt_enumerate_mutex_impls() These have not been tested previously. The api_calls.c testcase has been renamed to api_calls_places.c because it only tests api calls that are related to places. Differential Revision: https://reviews.llvm.org/D42523 llvm-svn: 331631	2018-05-07 12:42:15 +00:00
Guansong Zhang	e1c7a46d5b	[OpenMP] Use LIBOMPTARGET_DEVICE_RTL_DEBUG env var to control debug messages on the device side Summary: Enable the device side debug messages at compile time, use env var to control at runtime. To achieve this, an environment data block is passed to the device lib when it is loaded. By default, the message is off, to enable it, a user need to set LIBOMPDEVICE_DEBUG=1. Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D46210 llvm-svn: 331550	2018-05-04 19:29:28 +00:00
Jonathan Peyton	d47df260ba	[OpenMP][OMPT] Fix api_calls_from_other_thread.cpp Removed environment setting in RUN: line that was being ignored anyways. Changed a few specific checks to "any number" llvm-svn: 331212	2018-04-30 18:46:31 +00:00
Guansong Zhang	ad6c26516b	[OpenMP] Remove compilation warning when using clang to compile bc files. Summary: Minor printf format correction. NVCC ignore those. Clang will give warning on these if debug is enabled. Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D45528 llvm-svn: 330944	2018-04-26 14:06:53 +00:00
Guansong Zhang	334c379e32	[OpenMP] Make bc file compilation sensitive to LIBOMPTARGET_NVPTX_DEBUG flag Summary: The LIBOMPTARGET_NVPTX_DEBUG flag is inconsistent between using nvcc to generate .a file and clang to generate .bc file. Sync the two setting so we can get debug messages from the bc file path as well. Reviewers: grokos Subscribers: Hahnfeld, openmp-commits, mgorny Tags: #openmp Differential Revision: https://reviews.llvm.org/D45530 llvm-svn: 330477	2018-04-20 20:41:00 +00:00
Heejin Ahn	f78a493528	[OpenMP] Compilation error fix on const char* Summary: This line (`0ed912c7a7/runtime/src/kmp_gsupport.cpp (L1459)`) added in D45327 (rL330282) causes a compilation failure. Reviewers: jlpeyton Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D45786 llvm-svn: 330299	2018-04-18 22:23:31 +00:00
Jonathan Peyton	1482db9e03	[OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatter Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none|compact|scatter. This patch does two things: for KMP_AFFINITY=none, Creates a one entry table for the places, this way, the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API. When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions Differential Revision: https://reviews.llvm.org/D45527 llvm-svn: 330283	2018-04-18 19:25:48 +00:00
Jonathan Peyton	27a677fc95	Introduce GOMP_taskloop API This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface. Differential Revision: https://reviews.llvm.org/D45327 llvm-svn: 330282	2018-04-18 19:23:54 +00:00
Joachim Protze	3865c69b84	Set the license header for all OMPT files llvm-svn: 329928	2018-04-12 17:23:26 +00:00
Guansong Zhang	f679431f91	[OpenMP] Remove extra warning when we build Summary: This one line change is to remove this warning message "warning: integer conversion resulted in a change of sign" Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D45415 llvm-svn: 329713	2018-04-10 15:28:31 +00:00
Guansong Zhang	f0029a7738	Revert "[OpenMP] enable bc file compilation using the latest clang" This reverts commit 6849e31c36d712d97433bca9af39b7a09c8c1207. llvm-svn: 329576	2018-04-09 14:45:41 +00:00
Guansong Zhang	e47fbc9da8	[OpenMP] enable bc file compilation using the latest clang Summary: adding cuda-rdc flag to allow extern global data Reviewers: grokos Reviewed By: grokos Subscribers: gregrodgers, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D44992 llvm-svn: 329072	2018-04-03 15:01:34 +00:00
Jonathan Peyton	1e6bb8d5de	Minor cleanup in __kmp_atfork_child() This change removes the unnecessary lock operation on __kmp_initz_lock inside the __kmp_atfork_child() function for Linux; the lock variable is initialized in the same function later. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D44949 llvm-svn: 328900	2018-03-30 19:55:11 +00:00
Jonathan Peyton	ea82c769f4	Move blocktime_str variable right before its first use llvm-svn: 328575	2018-03-26 19:20:50 +00:00
Jonathan Peyton	b6b79ac95b	Add summarizeStats.py to tools directory The summarizeStats.py script processes raw data provided by the instrumented (stats-gathering) OpenMP* runtime library. It provides: 1) A radar chart which plots counters as frequency (per GigaTick) of use within the program. The frequencies are plotted as log10, however values less than one are kept as it is and represented in red color. This was done to help visualize the differences better. 2) Pie charts separating total time as compute and non-compute. The compute and non-compute times have their own pie charts showing the constructs that contributed to them. The percentages listed are with respect to the total time. 3) '.csv' file with percentage of time spent within the different constructs. The script can be used as: $ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv Patch by Taru Doodi Differential Revision: https://reviews.llvm.org/D41838 llvm-svn: 328568	2018-03-26 18:44:48 +00:00
Andrey Churbanov	2d91a8a3ba	Fixed __kmpc_get_target_offload() to call library initialization. Differential Revision: https://reviews.llvm.org/D44793 llvm-svn: 328228	2018-03-22 18:51:51 +00:00
Gheorghe-Teodor Bercea	4bc36a06e2	[OpenMP][libomptarget] Initialize global memory stack only once. Summary: The global stack initialization function may be called multiple times. The initialization of the shared memory slots should only happen when the function is called for the first time for a given warp master thread. Reviewers: grokos, carlo.bertolli, ABataev, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44754 llvm-svn: 328148	2018-03-21 21:02:55 +00:00
Gheorghe-Teodor Bercea	b4332ca3da	[OpenMP][libomptarget] Fix master warp check Summary: The check for the master warp must take into consideration the actual number of warps: the master warp is equal to the last active warp not necessarily WARPSIZE - 1. Reviewers: grokos, carlo.bertolli, ABataev, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44537 llvm-svn: 328146	2018-03-21 20:51:16 +00:00
Gheorghe-Teodor Bercea	c8d395a168	[OpenMP][libomptarget] Enable globalization for workers Summary: This patch allows worker to have a global memory stack managed by the runtime. This patch is needed for completeness and consistency with the globalization policy: if a worker-side variable escapes the current context it then needs to be globalized. Until now, only the master thread was allowed to have such a stack. These global values can now potentially be shared amongst workers if the semantics of the OpenMP program require it. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44487 llvm-svn: 328144	2018-03-21 20:34:19 +00:00
Jonathan Peyton	78f977fcd1	Read OMP_TARGET_OFFLOAD and provide API to access ICV Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD, if available, otherwise defaulting to DEFAULT. Valid values for the ICV are specified as enum values {0,1,2} for disabled, default, and mandatory. An internal API access function __kmpc_get_target_offload is provided. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D44577 llvm-svn: 328046	2018-03-20 21:18:17 +00:00
Andrey Churbanov	3336aa0d07	Fix for Fix for https://bugs.llvm.org/show_bug.cgi?id=36705 . Differential Revision: https://reviews.llvm.org/D44637 llvm-svn: 327875	2018-03-19 18:05:15 +00:00
George Rokos	6b9bb5e1c2	Bugfix, extern declarations for libomp functions are `extern "C"` declarations llvm-svn: 327763	2018-03-17 02:07:42 +00:00
George Rokos	2878c3957b	Moved extern declarations to private header file, they are only used from within libomptarget, they don't need to be in omptarget.h. llvm-svn: 327740	2018-03-16 20:40:09 +00:00

1070 Commits