This patch adds the `llvm_omp_target_dynamic_shared_alloc` function to
the `omp.h` header file so users can access it by default. Also changed
the name to keep it consistent with the other target allocators. Added
some documentation so users know how to use it. Didn't add the interface
for Fortran since there's no way to test it right now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D123246
The target allocators have been supported for NVPTX offloading for
awhile. The tests should use the allocators instead of calling the
functions manually. Also the comments indicating these being a preview
should be removed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D123242
This change adds support for ompt_callback_dispatch with the new
dispatch chunk type introduced in 5.2. Definitions of the new
ompt_work_loop types were also added in the header file.
Differential Revision: https://reviews.llvm.org/D122107
Register constraint switched to "=q" which means very specifically (from
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
> Any register accessible as rl. In 32-bit mode, a, b, c, and d; in 64-bit
mode, any integer register.
Older gcc versions (8.x and below) were trying to use esi or edi for the
8 bit flag variable, but it wound up displaying this error in the end:
kmp_lock.cpp: In function ‘void __kmp_spin_backoff(kmp_backoff_t*)’:
kmp_lock.cpp:2684:1: error: unsupported size for integer register
Hence the correct restriction is "=q" instead of "=r".
Fixes: https://github.com/llvm/llvm-project/issues/53309
Differential Revision: https://reviews.llvm.org/D120519
Before this patch task priorities were ignored, that was a valid implementation
as the task priority is a hint according to OpenMP specification.
Implemented shared list of sorted (high -> low) task deques one per task
priority value. Tasks execution changed to first check if priority tasks ready
for execution exist, and these tasks executed before others;
otherwise usual tasks execution mechanics work.
Differential Revision: https://reviews.llvm.org/D119676
Essentially removed the "use omp_lib_kinds" statement and replaced it
with import to maintain consistency (and avoid compilation error
in case the omp_lib_kinds.mod file is not accessible) in header file.
The import is required to access entities in host scoping unit.
Differential Revision: https://reviews.llvm.org/D120707
Introduce KMP_COMPILER_ICX macro to represent compilation with oneAPI
compiler.
Fixup flag detection and compiler ID detection in CMake. Older CMake's
detect IntelLLVM as Clang.
Fix compiler warnings.
Fixup many of the tests to have non-empty parallel regions as they are
elided by oneAPI compiler.
The __kmp_hidden_helper_threads_num set to N+1 if user requested N threads.
Thus number of worker hidden helper threads corresponds to user request,
main thread of helper team excluded as it does not participate in actual work.
This also fixes divide-by-0 issue in the code.
Fixes#48656
Differential Revision: https://reviews.llvm.org/D119586
As @mstorsjo wrote in https://reviews.llvm.org/D117945#inline-1132920 :
> This change seems to have broken one aspect: When doing `ninja
install` I now get a warning saying `Error copying file "libomp.dll" to
"libiomp5md.dll".`, and `libiomp5md.dll` isn't installed.
>
> I believe the reason is that the inline cmake snippet is written to
`runtime/src/cmake_install.cmake` and then executed on install, but on
install, `${CMAKE_INSTALL_BINDIR}` isn't set (as `GNUInstallDirs` isn't
included there). Should this maybe expand `${CMAKE_INSTALL_BINDIR}`
right here instead of deferring it to the install cmake, or what's the
right course of action?
I agree that is the right course of action. We also agreed to restore the `CMAKE_INSTALL_PREFIX` that was there before, too.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D118528
This patch fixes the link error on Windows caused by `interop`
functions.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D118524
This patch fixes the issue that numbers assigned to `interop` functions were already taken in `openmp/runtime/src/dllexports`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118523
This implements the runtime portion of the interop directive.
It expects the frontend and IRBuilder portions to be in place
for proper execution. It currently works only for GPUs
and has several TODOs that should be addressed going forward.
Reviewed By: RaviNarayanaswamy
Differential Revision: https://reviews.llvm.org/D106674
This patch allows Openmp runtime atomic functions operating on x87 high-precision
to be present only in Openmp runtime for x86 architectures
The functions affected are:
__kmpc_atomic_10
__kmpc_atomic_20
__kmpc_atomic_cmplx10_add
__kmpc_atomic_cmplx10_div
__kmpc_atomic_cmplx10_mul
__kmpc_atomic_cmplx10_sub
__kmpc_atomic_float10_add
__kmpc_atomic_float10_div
__kmpc_atomic_float10_mul
__kmpc_atomic_float10_sub
__kmpc_atomic_float10_add_fp
__kmpc_atomic_float10_div_fp
__kmpc_atomic_float10_mul_fp
__kmpc_atomic_float10_sub_fp
__kmpc_atomic_float10_max
__kmpc_atomic_float10_min
Differential Revision: https://reviews.llvm.org/D117473
Add use of TPAUSE (from WAITPKG) to the runtime for Intel hardware,
with an envirable to turn it on in a particular C-state. Always uses
TPAUSE if it is selected and enabled by Intel hardware and presence of
WAITPKG, and if not, falls back to old way of checking
__kmp_use_yield, etc.
Differential Revision: https://reviews.llvm.org/D115758
This is the original patch in my GNUInstallDirs series, now last to merge as the final piece!
It arose as a new draft of D28234. I initially did the unorthodox thing of pushing to that when I wasn't the original author, but since I ended up
- Using `GNUInstallDirs`, rather than mimicking it, as the original author was hesitant to do but others requested.
- Converting all the packages, not just LLVM, effecting many more projects than LLVM itself.
I figured it was time to make a new revision.
I have used this patch series (and many back-ports) as the basis of https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS), which was merged last spring (2021). It looked like people were generally on board in D28234, but I make note of this here in case extra motivation is useful.
---
As pointed out in the original issue, a central tension is that LLVM already has some partial support for these sorts of things. Variables like `COMPILER_RT_INSTALL_PATH` have already been dealt with. Variables like `LLVM_LIBDIR_SUFFIX` however, will require further work, so that we may use `CMAKE_INSTALL_LIBDIR`.
These remaining items will be addressed in further patches. What is here is now rote and so we should get it out of the way before dealing more intricately with the remainder.
Reviewed By: #libunwind, #libc, #libc_abi, compnerd
Differential Revision: https://reviews.llvm.org/D99484
This is the original patch in my GNUInstallDirs series, now last to merge as the final piece!
It arose as a new draft of D28234. I initially did the unorthodox thing of pushing to that when I wasn't the original author, but since I ended up
- Using `GNUInstallDirs`, rather than mimicking it, as the original author was hesitant to do but others requested.
- Converting all the packages, not just LLVM, effecting many more projects than LLVM itself.
I figured it was time to make a new revision.
I have used this patch series (and many back-ports) as the basis of https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS), which was merged last spring (2021). It looked like people were generally on board in D28234, but I make note of this here in case extra motivation is useful.
---
As pointed out in the original issue, a central tension is that LLVM already has some partial support for these sorts of things. Variables like `COMPILER_RT_INSTALL_PATH` have already been dealt with. Variables like `LLVM_LIBDIR_SUFFIX` however, will require further work, so that we may use `CMAKE_INSTALL_LIBDIR`.
These remaining items will be addressed in further patches. What is here is now rote and so we should get it out of the way before dealing more intricately with the remainder.
Reviewed By: #libunwind, #libc, #libc_abi, compnerd
Differential Revision: https://reviews.llvm.org/D99484
In most cases, hidden helper task behave similar as detached tasks. That means,
for example, if we have to wait for detached tasks, we have to do the same thing
for hidden helper tasks as well. This patch adds the missing condition for hidden
helper task accordingly along with detached task.
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D107316
This patch allows the user to request all resources of a particular
layer (or core-attribute). The syntax of KMP_HW_SUBSET is modified
so the number of units requested is optional or can be replaced with an
'*' character.
e.g., KMP_HW_SUBSET=c:intel_atom@3 will use all the cores after offset 3
e.g., KMP_HW_SUBSET=*c:intel_core will use all the big cores
e.g., KMP_HW_SUBSET=*s,*c,1t will use all the sockets, all cores per
each socket and 1 thread per core.
Differential Revision: https://reviews.llvm.org/D115826
Just defensive CMake-ing. I pulled this from D115544 and D99484 which
are blocked on some lldb CI failures I don't yet understand. Hoping to land
something smaller in the meantime.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D115566
This reverts commit 492de35df4.
I tried to apply John's changes in 8d897ec915 that were expected to
fix his patch but that didn't work unfortunately.
Reverting this again to fix the macOS bots and leave him more time to
investigate the issue.
This reverts commit 797b50d4be.
See the original D99484. @mib who noticed the original problem could not longer
reproduce it, after I tried and also failed. We are threfore hoping it went
away on its own!
Reviewed By: mib
Differential Revision: https://reviews.llvm.org/D115544
Allow filtering of resources based on core attributes. There are two new
attributes added:
1) Core Type (intel_atom, intel_core)
2) Core Efficiency (integer) where the higher the efficiency, the more
performant the core
On hybrid architectures , e.g., Alder Lake, users can specify
KMP_HW_SUBSET=4c:intel_atom,4c:intel_core to select the first four Atom
and first four Big cores. The can also use the efficiency syntax. e.g.,
KMP_HW_SUBSET=2c:eff0,2c:eff1
Differential Revision: https://reviews.llvm.org/D114901
Regardless that specification requires thread_limit to be positive,
it is better to warn user instead of crash in case the value is negative.
Differential Revision: https://reviews.llvm.org/D115340
When OpenMP is compiled as a part runtimes for multiple targets, openmp
is compiled under build/runtimes/runtimes-arch-unknown-linux-gnu-bins
directory. Old implementation treats this directory name as errors.
This patch adds a guard like "[Uu]known[^-]".
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D114346
Eachempati.
This patch adds clang (parsing, sema, serialization, codegen) support for the 'depend' clause on the 'taskwait' directive.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D113540
Teach the HWLOC topology method how to detect Atom and Core
types so hybrid CPUs are properly detected and represented when using
the HWLOC topology method.
Differential Revision: https://reviews.llvm.org/D112270
The current implementation of Windows Processor Groups has
a separate topology method to handle them. This patch deprecates
that specific method and uses the regular CPUID topology
method by default and inserts the Windows Processor Group objects
in the topology manually.
Notes:
* The preference for processor groups is lowered to a value less than
socket so that the user will see sockets in the KMP_AFFINITY=verbose
output instead of processor groups when sockets=processor groups.
* The topology's capacity is modified to handle additional topology layers
without the need for reallocation.
* If a user asks for a granularity setting that is "above" the processor
group layer, then the granularity is adjusted "down" to the processor
group since this is the coarsest layer available for threads.
Differential Revision: https://reviews.llvm.org/D112273
If some CPUs are offline, then make sure they are not included in the
fullMask even if norespect is given to KMP_AFFINITY.
Differential Revision: https://reviews.llvm.org/D112274
Remove restriction forcing users to specify the KMP_HW_SUBSET value in
topology order. This patch sorts the user KMP_HW_SUBSET value before
trying to apply it. For example: 1s,4c,2t is equivalent to 2t,1s,4c
Differential Revision: https://reviews.llvm.org/D112027
Have standalone builds define uppercase_CMAKE_BUILD_TYPE and use it.
llvm/CMakeLists.txt defines uppercase_CMAKE_BUILD_TYPE for regular LLVM
builds with OpenMP enabled.
Differential Revision: https://reviews.llvm.org/D112951
[NFC] As part of using inclusive language within the llvm project,
this patch replaces master with main when referring to `.chm` files.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D113299