llvm-project

Commit Graph

Author	SHA1	Message	Date
AndreyChurbanov	9ce2e5e700	Revert "[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type" This reverts commit `a1f550e052`. Revert in order to fix backwards compatibility breakage caused by type size change for task dependence flag.	2021-06-09 17:38:38 +03:00
Vignesh Balasubramanian	f61602b0d3	[OpenMP][OMPD] Implementation of OMPD debugging library - libompd. This is the first of seven patches that implements OMPD, a debugging interface to support debugging of OpenMP programs. It contains support code required in "openmp/runtime" for OMPD implementation. Reviewed By: @hbae Differential Revision: https://reviews.llvm.org/D100181	2021-06-08 16:44:22 +05:30
Peyton, Jonathan L	d70e1f1276	[OpenMP][runtime] add .clang-tidy file Use same checks as compiler-rt which removes checks for readability-* and llvm-header style. Differential Revision: https://reviews.llvm.org/D103711	2021-06-07 13:56:39 -05:00
AndreyChurbanov	a1f550e052	[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type Refactored code of dependence processing and added new inoutset dependence type. Compiler can set dependence flag to 0x8 when call __kmpc_omp_task_with_deps. Size of type of the dependence flag changed from 1 to 4 bytes in clang. All dependence flags library gets so far and corresponding dependence types: 1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET. Differential Revision: https://reviews.llvm.org/D97085	2021-06-07 21:42:51 +03:00
Bryan Chan	54f059c900	[OpenMP] Check loc for NULL before dereferencing it The ident_t * argument in __kmp_get_monotonicity was being used without a customary NULL check, causing the function to crash in a Debug build. Release builds were not affected thanks to dead store elimination.	2021-06-07 10:45:48 -04:00
Terry Wilmarth	8ec9aa236e	[OpenMP] Add experimental nesting mode feature Nesting mode is a new experimental feature in the OpenMP runtime. It allows a user to set up nesting for an application in a way that corresponds to the hardware topology levels on the machine an application is being run on. For example, if a machine has 2 sockets, each with 12 cores, then use of nesting mode could set up an outer level of nesting that uses 2 threads per parallel region, and an inner level of nesting that uses 12 threads per parallel region. Nesting mode is controlled with the KMP_NESTING_MODE environment variable as follows: 1) KMP_NESTING_MODE = 0: Nesting mode is off (default); max-active-levels-var is set to 1 (the default -- nesting is off, nested parallel regions are serialized). 2) KMP_NESTING_MODE = 1: Nesting mode is on, and a number of threads will be assigned for each level discovered in the machine topology; max-active-levels-var is set to the number of levels discovered. 3) KMP_NESTING_MODE = n, n>1: [Note: this option is experimental and may change or be removed in the future.] Nesting mode is on, and a number of threads will be assigned for each topology level discovered on the machine, up to k<=n levels (since there may be fewer than n levels discovered in the topology), and beyond the kth level, nested parallel regions will be serialized; NOTE: max-active-levels-var is 1 (the default -- nesting is off, and nested parallel regions are serialized until the user changes max-active-levels-var. If the user sets OMP_NUM_THREADS or OMP_MAX_ACTIVE_LEVELS, they will override KMP_NESTING_MODE settings for the associated environment variables. The detected topology may be limited by an affinity mask setting on the initial thread, or if the user sets KMP_HW_SUBSET. See also: KMP_HOT_TEAMS_MAX_LEVEL for controlling use of hot teams for nested parallel regions. Note that this feature only sets numbers of threads used at nesting levels. The user should make use of OMP_PLACES and OMP_PROC_BIND or KMP_AFFINITY for affinitizing those threads, if desired. Differential Revision: https://reviews.llvm.org/D102188	2021-06-04 16:01:11 -05:00
Peyton, Jonathan L	56dd158c32	[OpenMP] fix spelling error in message-converter.pl	2021-06-04 11:20:32 -05:00
Peyton, Jonathan L	f7655f3df3	[OpenMP] Fix improper printf format specifier	2021-06-02 11:04:48 -05:00
Hansang Bae	7ba4e96ede	[OpenMP] Use new task type/flag for taskwait depend events. Differential Revision: https://reviews.llvm.org/D103464	2021-06-02 10:16:38 -05:00
Peyton, Jonathan L	2020c981fa	[OpenMP] Add L2-Tile equivalence for KNL When on KNL and L2 or Tile layer is detected, manually add the corresponding layer which is equivalent. Differential Revision: https://reviews.llvm.org/D102865	2021-06-01 14:17:13 -05:00
Hansang Bae	cf5c94ef08	[OpenMP] Define named constants for interop's foreign runtime ID Also added missing Fortran definitions for interop support. Differential Revision: https://reviews.llvm.org/D102883	2021-06-01 13:06:59 -05:00
Hansang Bae	95cefacfe1	[OpenMP] Fix crashing critical section with hint clause Runtime was using the default lock type without using the hint. Differential Revision: https://reviews.llvm.org/D102955	2021-05-24 17:25:01 -05:00
AndreyChurbanov	aa6e7e8da8	[OpenMP] libomp: move warnings to after library initialization Warnings on deprecated api cannot be suppressed if the library is not initialized. With this change it is possible to set KMP_WARNINGS=false to suppress the warnings. Differential Revision: https://reviews.llvm.org/D102676	2021-05-21 23:47:23 +03:00
Shilei Tian	af6511d730	[OpenMP] Fixed Bug 49356 Bug 49356 (https://bugs.llvm.org/show_bug.cgi?id=49356) reports crash in the test case `tasking/bug_taskwait_detach.cpp`, which is caused by the wrong function declaration. `gtid` in `__kmpc_omp_task` should be `kmp_int32`. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D102584	2021-05-17 12:14:54 -04:00
Christopher Pulido	4fb0aaf033	[OpenMP] Changes to enable MSVC ARM64 build of libomp This is the first in a series of changes to the OpenMP runtime that have been done internally by Microsoft. This patch makes the necessary changes to enable libomp.dll to build with the MSVC compiler targeting ARM64. Differential Revision: https://reviews.llvm.org/D101173	2021-05-11 23:03:12 +03:00
Peyton, Jonathan L	c765d140fe	[OpenMP] Fix hidden helper + affinity When KMP_AFFINITY is set, each worker thread's gtid value is used as an index into the place list to determine the thread's placement. With hidden helpers enabled, this gtid value is shifted down leading to unexpected shifted thread placement. This patch restores the previous behavior by adjusting the mask index to take the number of hidden helper threads into account. Hidden helper threads are given the full initial mask and do not participate in any of the other affinity mechanisms (place partitioning, balanced affinity). Their affinity is only printed for debug builds. Differential Revision: https://reviews.llvm.org/D101882	2021-05-11 08:54:22 -05:00
Peyton, Jonathan L	9982f33e2c	[OpenMP] Refactor/Rework topology discovery code This patch does the following: 1) Introduce kmp_topology_t as the runtime-friendly structure (the corresponding global variable is __kmp_topology) to determine the exact machine topology which can vary widely among current and future architectures. The current design is not easy to expand beyond the assumed three layer topology: sockets, cores, and threads so a rework capable of using the existing KMP_AFFINITY mechanisms is required. This new topology structure has: * The depth and types of the topology * Ratio count for each consecutive level (e.g., number of cores per socket, number of threads per core) * Absolute count for each level (e.g., 2 sockets, 16 cores, 32 threads) * Equivalent topology layer map (e.g., Numa domain is equivalent to socket, L1/L2 cache equivalent to core) * Whether it is uniform or not The hardware threads are represented with the kmp_hw_thread_t structure. This structure contains the ids (e.g., socket 0, core 1, thread 0) and other information grabbed from the previous Address structure. The kmp_topology_t structure contains an array of these. 2) Generalize the KMP_HW_SUBSET envirable for the new kmp_topology_t structure. The algorithm doesn't assume any order with tiles,numa domains,sockets,cores,threads. Instead it just parses the envirable, makes sure it is consistent with the detected topology (including taking into account equivalent layers) and then trims away the unneeded subset of hardware threads. To enable this, a new kmp_hw_subset_t structure is introduced which contains a vector of items (hardware type, number user wants, offset). Any keyword within __kmp_hw_get_keyword() can be used as a name and can be shortened as well. e.g., KMP_HW_SUBSET=1s,2numa,4tile,2c,3t can be used on the KNL SNC-4 machine. 3) Simplify topology detection functions so they only do the singular task of detecting the machine's topology. Printing, and all canonicalizing functionality is now done afterwards. So many lines of duplicated code are eliminated. 4) Add new ll_caches and numa_domains to OMP_PLACES, and consequently, KMP_AFFINITY's granularity setting. All the names within __kmp_hw_get_keyword() are available for use in OMP_PLACES or KMP_AFFINITY's granularity setting. 5) Simplify and future-proof code where explicit lists of allowed affinity settings keywords inside if() conditions. 6) Add x86 CPUID leaf 4 cache detection to existing x2apic id method so equivalent caches could be detected (in particular for the ll_caches place). Differential Revision: https://reviews.llvm.org/D100997	2021-05-03 18:00:24 -05:00
Martin Storsjö	01d27fc408	[OpenMP] Fix warnings due to redundant semicolons. NFC.	2021-05-02 21:51:06 +03:00
Kevin Athey	bc9120047b	Correct tiny misspelling (readlef -> readelf). Getting my feet wet here as a new committer. Correct misspelling in check-depends.pl. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D101552	2021-04-30 17:20:35 -07:00
Peyton, Jonathan L	4457565757	[OpenMP] Implement GOMP task reductions Implement the remaining GOMP_* functions to support task reductions in taskgroup, parallel, loop, and taskloop constructs. The unused mem argument to many of the work-sharing constructs has to do with the scan() directive/ inscan() modifier. If mem is set, each function will call KMP_FATAL() and tell the user scan/inscan is unsupported. The GOMP reduction implementation is kept separate from our implementation because of how GOMP presents reduction data and computes the reductions. GOMP expects the privatized copies to be present even after a #pragma omp parallel reduction(task:...) region has ended so the data is stored inside GOMP's uintptr_t* data pseudo-structure. This style is tightly coupled with GCC compiler codegen. There also isn't any init(), combiner(), fini() functions in GOMP's codegen so the two implementations were to disparate to try to wrap GOMP's around our own. Differential Revision: https://reviews.llvm.org/D98806	2021-04-16 16:36:31 -05:00
Peyton, Jonathan L	5ebbb366c4	[OpenMP] Allow affinity to re-detect for child processes Current atfork() handler for child processes does not reset the affinity masks array which prevents users from setting their own affinity in child processes. Differential Revision: https://reviews.llvm.org/D99218	2021-04-16 16:34:02 -05:00
Hansang Bae	9b98497b44	[OpenMP] Add omp_target_is_accessible() to header files -- Added omp_target_is_accessible to the header files -- Added missing const qualifier to device memory routines Differential Revision: https://reviews.llvm.org/D100420	2021-04-16 07:54:15 -05:00
Hansang Bae	77dc7b4653	[OpenMP] Fix printing routine for OMP_TOOL_VERBOSE_INIT Also fixed typo in the verbose message. Differential Revision: https://reviews.llvm.org/D100414	2021-04-14 07:55:26 -05:00
Hansang Bae	3da61ddae7	[OpenMP] Define omp_is_initial_device() variants in omp.h omp_is_initial_device() is marked as a built-in function in the current compiler, and user code guarded by this call may be optimized away, resulting in undesired behavior in some cases. This patch provides a possible fix for such cases by defining the routine as a variant function and removing it from builtin list. Differential Revision: https://reviews.llvm.org/D99447	2021-04-06 16:58:01 -05:00
Peyton, Jonathan L	2aebb7cb3c	[OpenMP] Fix incorrect KMP_STRLEN() macro The second argument to the strnlen_s(str, size) function should be sizeof(str) when str is a true array of characters with known size (instead of just a char*). Use type traits to determine if first parameter is a character array and use the correct size based on that trait. Differential Revision: https://reviews.llvm.org/D98209	2021-04-05 09:03:09 -05:00
Hansang Bae	467f39249d	[OpenMP] Misc. changes that add or remove pointer/bound checks -- Added or moved checks to appropriate places. -- Removed ineffective null check where the pointer is already being dereferenced around the code. -- Initialized variables that can be used without definitions. -- Added call to dlclose/FreeLibrary in OMPT tool activation. -- Added a new build compiler definition. Differential Revision: https://reviews.llvm.org/D98584	2021-03-23 18:55:08 -05:00
Shilei Tian	2df65f87c1	[OpenMP] Fixed a crash in hidden helper thread It is reported that after enabling hidden helper thread, the program can hit the assertion `new_gtid < __kmp_threads_capacity` sometimes. The root cause is explained as follows. Let's say the default `__kmp_threads_capacity` is `N`. If hidden helper thread is enabled, `__kmp_threads_capacity` will be offset to `N+8` by default. If the number of threads we need exceeds `N+8`, e.g. via `num_threads` clause, we need to expand `__kmp_threads`. In `__kmp_expand_threads`, the expansion starts from `__kmp_threads_capacity`, and repeatedly doubling it until the new capacity meets the requirement. Let's assume the new requirement is `Y`. If `Y` happens to meet the constraint `(N+8)2^X=Y` where `X` is the number of iterations, the new capacity is not enough because we have 8 slots for hidden helper threads. Here is an example. ``` #include <vector> int main(int argc, char argv[]) { constexpr const size_t N = 1344; std::vector<int> data(N); #pragma omp parallel for for (unsigned i = 0; i < N; ++i) { data[i] = i; } #pragma omp parallel for num_threads(N) for (unsigned i = 0; i < N; ++i) { data[i] += i; } return 0; } ``` My CPU is 20C40T, then `__kmp_threads_capacity` is 160. After offset, `__kmp_threads_capacity` becomes 168. `1344 = (160+8)*2^3`, then the assertions hit. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D98838	2021-03-18 18:25:36 -04:00
Hansang Bae	a6f9cb6adc	[OpenMP] Add runtime interface for OpenMP 5.1 error directive The proposed new interface is for supporting `at(execution)` clause in the error directive. Differential Revision: https://reviews.llvm.org/D98448	2021-03-16 08:55:25 -05:00
Peyton, Jonathan L	7085f04573	[OpenMP] Remove unused cpu_stackoffset member	2021-03-15 16:52:04 -05:00
AndreyChurbanov	aaf16b80dd	[OpenMP] libomp: eliminate pause from atomic CAS loops For clang this change is NFC cleanup, because clang never calls atomic functions from runtime library. Basically, pause is good in spin-loops waiting for something. Atomic CAS loops do not wait for anything, each CAS failure means some other thread progressed. Performance experiments show that the pause only causes unnecessary slowdown on CPUs with slow pause instruction, no difference on CPUs with fast pause instruction, removal of the pause gives lesser binary size which is good. Differential Revision: https://reviews.llvm.org/D97079	2021-03-09 18:30:08 +03:00
AndreyChurbanov	e4492b6f31	[OpenMP] NFC: temporarily disable assertion until the bug with dependences is fixed	2021-03-08 22:18:30 +03:00
Peyton, Jonathan L	e2738b3758	[OpenMP] Fix potential integer overflow in dynamic schedule code Restrict the chunk_size * chunk_num to only occur for valid chunk_nums and reimplement calculating the limit to avoid overflow. Differential Revision: https://reviews.llvm.org/D96747	2021-03-08 09:43:05 -06:00
tlwilmar	97d000cfc6	Added API for "masked" construct via two entrypoints: __kmpc_masked, and __kmpc_end_masked. The "master" construct is deprecated. Changed proc-bind keyword from "master" to "primary". Use of both master construct and master as proc-bind keyword is still allowed, but deprecated. Remove references to "master" in comments and strings, and replace with "primary" or "primary thread". Function names and variables were not touched, nor were references to deprecated master construct. These can be updated over time. No new code should refer to master.	2021-03-05 09:29:57 -06:00
Hansang Bae	b6c2f538b2	[OpenMP] Add allocator support for target memory This is a preview of allocator support for target memory that depends on the offload runtime API which allocates memory as described below. llvm_omp_target_alloc_host(size_t size, int device_num); -- Returns non-migratable memory owned by host. -- Memory is accessible by host and device(s). llvm_omp_target_alloc_shared(size_t size, int device_num); -- Returns migratable memory owned by host and device. -- Memory is accessible by host and device. llvm_omp_target_alloc_device(size_t size, int device_num); -- Returns memory owned by device. -- Memory is only accessible by device. New memory space and predefined allocator names are -- llvm_omp_target_host_mem_space -- llvm_omp_target_shared_mem_space -- llvm_omp_target_device_mem_space -- llvm_omp_target_host_mem_alloc -- llvm_omp_target_shared_mem_alloc -- llvm_omp_target_device_mem_alloc Differential Revision: https://reviews.llvm.org/D96669	2021-03-02 16:45:12 -06:00
Peyton, Jonathan L	e83380fccc	[OpenMP] Fix clang-cl build error regarding TSX intrinsics Fix for https://bugs.llvm.org/show_bug.cgi?id=49339 The CMake check for the RTM intrinsics needs the -mrtm flag to be set during the test. This way clang-cl correctly detects it has the _xbegin() intrinsic. Otherwise, the CMake check fails. Differential Revision: https://reviews.llvm.org/D97413	2021-03-02 07:47:42 -06:00
AndreyChurbanov	1df6e58e55	[OpenMP] libomp minor cleanup Cleanup changes: - check value read from file; - remove dead code; - make unsigned variable to read hexadecimal number to; - add debug assertion to check ref count. Differential Revision: https://reviews.llvm.org/D96893	2021-02-26 00:44:51 +03:00
AndreyChurbanov	4932101177	[OpenMP] libomp: fix ittnotify stack stitching for teams construct Stitching id could be overridden causing reference of destroyed object when number of teams is 1. The patch separates stitching id store location for teams and parallel nested in teams. Differential Revision: https://reviews.llvm.org/D96562	2021-02-26 00:23:24 +03:00
Peyton, Jonathan L	d12ae7db99	[OpenMP] Fix accidental addition of use omp_lib_kinds Fortran header accidentally had use omp_lib_kinds added inside a subroutine and function. This patch removes the lines.	2021-02-25 12:49:56 -06:00
Harmen Stoppels	a54f160b3a	Prefer /usr/bin/env xxx over /usr/bin/xxx where xxx = perl, python, awk Allow users to use a non-system version of perl, python and awk, which is useful in certain package managers. Reviewed By: JDevlieghere, MaskRay Differential Revision: https://reviews.llvm.org/D95119	2021-02-25 11:32:27 +01:00
Shilei Tian	e5da63d5a9	[OpenMP] Fixed a crash when offloading to x86_64 with target nowait PR#49334 reports a crash when offloading to x86_64 with `target nowait`, which is caused by referencing a nullptr. The root cause of the issue is, when pushing a hidden helper task in `__kmp_push_task`, it also maps the gtid to its shadow gtid, which is wrong. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D97329	2021-02-24 12:37:30 -05:00
Joachim Protze	35ab6d6390	[OpenMP][Tests][NFC] rename macro to avoid naming clash When including <ostream>, the register_callback macro of the OMPT callback.h clashes with a function defined in ostream. This patch renames the macro and includes ompt into the macro name.	2021-02-24 18:03:54 +01:00
Peyton, Jonathan L	56223b1e91	[OpenMP] Help static loop code avoid over/underflow This code alleviates some pathological loop parameters (lower, upper, stride) within calculations involved in the static loop code. It bounds the chunk size to the trip count if it is greater than the trip count and also minimizes problematic code for when trip count < nth. Differential Revision: https://reviews.llvm.org/D96426	2021-02-22 13:22:01 -06:00
Peyton, Jonathan L	1b968467c0	[OpenMP] Remove shutdown attempt on Windows process detach Only attempt shutdown if lpReserved is NULL. The Windows documentation states: When handling DLL_PROCESS_DETACH, a DLL should free resources such as heap memory only if the DLL is being unloaded dynamically (the lpReserved parameter is NULL). If the process is terminating (the lpReserved parameter is non-NULL), all threads in the process except the current thread either have exited already or have been explicitly terminated by a call to the ExitProcess function, which might leave some process resources such as heaps in an inconsistent state. In this case, it is not safe for the DLL to clean up the resources. Instead, the DLL should allow the operating system to reclaim the memory. Differential Revision: https://reviews.llvm.org/D96750	2021-02-22 13:15:06 -06:00
Peyton, Jonathan L	8c73be9d86	[OpenMP] Limit number of dispatch buffers This patch limits the number of dispatch buffers (used for loop worksharing construct) to between 1 and 4096. Differential Revision: https://reviews.llvm.org/D96749	2021-02-22 13:14:28 -06:00
Peyton, Jonathan L	55dff8b2e4	[OpenMP] Update HWLOC code for die level detection Differential Revision: https://reviews.llvm.org/D96748	2021-02-22 13:05:55 -06:00
AndreyChurbanov	1611e5473c	[OpenMP] libomp: cleanup some resource leaks Close mutexattr and condattr local objects to eliminate resource leaks. Differential Revision: https://reviews.llvm.org/D96892	2021-02-20 23:27:37 +03:00
Shilei Tian	309b00a42e	[OpenMP][NFC] clang-format the whole openmp project Same script as D95318. Test files are excluded. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D97088	2021-02-20 12:46:32 -05:00
AndreyChurbanov	dab5d6c2eb	[OpenMP] fix race condition in test	2021-02-18 02:27:49 +03:00
AndreyChurbanov	cf1ddae7e3	[OpenMP][NFC] replaced 'dependencies' with 'dependences' in comments and debug prints	2021-02-18 00:38:18 +03:00
AndreyChurbanov	5631842d18	[OpenMP] NFC: fix test removing the target construct	2021-02-13 04:49:52 +03:00

1 2 3 4 5 ...

1078 Commits