Support, Getting Involved, and FAQ
==================================

Please do not hesitate to reach out to us via openmp-dev@lists.llvm.org or join
one of our :ref:`regular calls <calls>`. Some common questions are answered in
the :ref:`faq`.

.. _calls:

Calls
-----

OpenMP in LLVM Technical Call
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-   Development updates on OpenMP (and OpenACC) in the LLVM Project, including Clang, optimization, and runtime work.
-   Join `OpenMP in LLVM Technical Call <https://bluejeans.com/544112769//webrtc>`__.
-   Time: Weekly call every Wednesday at 7:00 AM Pacific time.
-   Meeting minutes are `here <https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit>`__.
-   Status tracking `page <https://openmp.llvm.org/docs>`__.


OpenMP in Flang Technical Call
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-   Development updates on OpenMP and OpenACC in the Flang Project.
-   Join `OpenMP in Flang Technical Call <https://bit.ly/39eQW3o>`__.
-   Time: Weekly call every Thursday at 8:00 AM Pacific time.
-   Meeting minutes are `here <https://docs.google.com/document/d/1yA-MeJf6RYY-ZXpdol0t7YoDoqtwAyBhFLr5thu5pFI>`__.
-   Status tracking `page <https://docs.google.com/spreadsheets/d/1FvHPuSkGbl4mQZRAwCIndvQx9dQboffiD-xD0oqxgU0/edit#gid=0>`__.


.. _faq:

FAQ
---

.. note::
   The FAQ is a work in progress and most of the expected content is not
   yet available. While you can expect changes, we always welcome feedback and
   additions. Please contact us, e.g., through ``openmp-dev@lists.llvm.org``.


Q: How to contribute a patch to the webpage or any other part?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

All patches go through the regular `LLVM review process
<https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.


.. _build_offload_capable_compiler:

Q: How to build an OpenMP GPU offload capable compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To build an *effective* OpenMP offload capable compiler, only one extra CMake
option, `LLVM_ENABLE_RUNTIMES="openmp"`, is needed when building LLVM (generic
information about building LLVM is available `here <https://llvm.org/docs/GettingStarted.html>`__).
Make sure all backends that are targeted by OpenMP are enabled. By default,
Clang will be built with all backends enabled.

For NVIDIA offload, please see :ref:`build_nvidia_offload_capable_compiler`.
For AMDGPU offload, please see :ref:`build_amdgpu_offload_capable_compiler`.

.. note::
  The compiler that generates the offload code should be the same (version) as
  the compiler that builds the OpenMP device runtimes. The OpenMP host runtime
  can be built by a different compiler.

.. _advanced_builds: https://llvm.org/docs/AdvancedBuilds.html

.. _build_nvidia_offload_capable_compiler:

Q: How to build an OpenMP NVIDIA offload capable compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The CUDA SDK is required on the machine that will execute the OpenMP application.

If your build machine is not the target machine or automatic detection of the
available GPUs failed, you should also set:

- `CLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_XX` where `XX` is the architecture of your GPU, e.g., 80.
- `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=YY` where `YY` is the numeric compute capability of your GPU, e.g., 75.


.. _build_amdgpu_offload_capable_compiler:

Q: How to build an OpenMP AMDGPU offload capable compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A subset of the `ROCm <https://github.com/radeonopencompute>`__ toolchain is
required to build the LLVM toolchain and to execute the OpenMP application.
Either install ROCm somewhere that CMake's ``find_package`` can locate it, or
build the required subcomponents roct and rocr from source.

The two components used are ROCT-Thunk-Interface (roct) and ROCR-Runtime
(rocr). Roct is the userspace part of the Linux driver. It calls into the
driver which ships with the Linux kernel. It is an implementation detail of
rocr from OpenMP's perspective. Rocr is an implementation of `HSA <http://www.hsafoundation.com>`__.

.. code-block:: shell

   SOURCE_DIR=same-as-llvm-source # e.g. the checkout of llvm-project, next to openmp
   BUILD_DIR=somewhere
   INSTALL_PREFIX=same-as-llvm-install

   cd $SOURCE_DIR
   git clone git@github.com:RadeonOpenCompute/ROCT-Thunk-Interface.git -b roc-4.1.x --single-branch
   git clone git@github.com:RadeonOpenCompute/ROCR-Runtime.git -b rocm-4.1.x --single-branch

   cd $BUILD_DIR && mkdir roct && cd roct
   cmake $SOURCE_DIR/ROCT-Thunk-Interface/ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
   make && make install

   cd $BUILD_DIR && mkdir rocr && cd rocr
   cmake $SOURCE_DIR/ROCR-Runtime/src -DIMAGE_SUPPORT=OFF -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON
   make && make install

``IMAGE_SUPPORT`` requires building rocr with Clang and is not used by OpenMP.

Provided cmake's find_package can find the ROCR-Runtime package, LLVM will
build a tool `bin/amdgpu-arch` which will print a string like 'gfx906' when
run if it recognises a GPU on the local system. LLVM will also build a shared
library, libomptarget.rtl.amdgpu.so, which is linked against rocr.

With those libraries installed, and LLVM then built and installed, try:

.. code-block:: shell

   clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa example.c -o example && ./example
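
A minimal ``example.c`` to try with the command above could look like the
following sketch. It is not part of the LLVM sources; the function name
``vector_add_check`` and the array size are chosen here for illustration only.

.. code-block:: c

   #include <stdio.h>

   #define N 1024

   /* Add two vectors inside a target region and return the number of
      mismatching elements (0 means success). */
   int vector_add_check(void) {
     static double a[N], b[N], c[N];

     for (int i = 0; i < N; ++i) {
       a[i] = (double)i;
       b[i] = 2.0 * (double)i;
       c[i] = 0.0;
     }

     /* Offload the loop: a and b are copied to the device, c is copied back. */
   #pragma omp target teams distribute parallel for map(to: a, b) map(from: c)
     for (int i = 0; i < N; ++i)
       c[i] = a[i] + b[i];

     int errors = 0;
     for (int i = 0; i < N; ++i)
       if (c[i] != 3.0 * (double)i)
         ++errors;
     return errors;
   }

   int main(void) {
     int errors = vector_add_check();
     /* printf is only called on the host, outside the target region. */
     printf("%s (%d mismatches)\n", errors == 0 ? "PASS" : "FAIL", errors);
     return errors != 0;
   }

If no usable device is found, the target region may silently run on the host
instead. Setting the standard OpenMP environment variable
``OMP_TARGET_OFFLOAD=MANDATORY`` should turn that fallback into an error.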

Q: What are the known limitations of OpenMP AMDGPU offload?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``LD_LIBRARY_PATH`` is presently required to find the OpenMP libraries.

There is no libc. That is, ``malloc`` and ``printf`` do not exist. There is
also no libm, so functions like ``cos(double)`` will not work in target regions.

Cards from the gfx10 line, 'navi', that use wave32 are not yet implemented.

Some versions of the driver for the Radeon VII (gfx906) will error unless the
environment variable ``HSA_IGNORE_SRAMECC_MISREPORT`` is set to ``1``, e.g.,
with ``export HSA_IGNORE_SRAMECC_MISREPORT=1``.

AMDGPU offload is a recent addition to LLVM, and the implementation differs
from the one that has been shipping in ROCm and AOMP for some time. Early
adopters will encounter bugs.

Q: Does OpenMP offloading support work in pre-packaged LLVM releases?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For now, the answer is most likely *no*. Please see :ref:`build_offload_capable_compiler`.

Q: Does OpenMP offloading support work in packages distributed as part of my OS?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For now, the answer is most likely *no*. Please see :ref:`build_offload_capable_compiler`.


.. _math_and_complex_in_target_regions:

Q: Does Clang support `<math.h>` and `<complex.h>` operations in OpenMP target on GPUs?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Yes, LLVM/Clang allows math functions and complex arithmetic inside of OpenMP target regions
that are compiled for GPUs.

Clang provides a set of wrapper headers that are found first when `math.h` and
`complex.h`, for C, `cmath` and `complex`, for C++, or similar headers are
included by the application. These wrappers will eventually include the system
version of the corresponding header file after setting up a target device
specific environment. Including the system header is important
because system headers differ based on the architecture and operating system and may
contain preprocessor, variable, and function definitions that need to be
available in the target region regardless of the targeted device architecture.
However, various functions may require specialized device versions, e.g.,
`sin`, and others are only available on certain devices, e.g., `__umul64hi`. To
provide "native" support for math and complex on the respective architecture,
Clang will wrap the "native" math functions, e.g., as provided by the device
vendor, in an OpenMP begin/end declare variant. These functions will then be
picked up instead of the host versions while host-only variables and function
definitions are still available. Complex arithmetic and functions are supported
through a similar mechanism. It is worth noting that this support requires
`extensions to the OpenMP begin/end declare variant context selector
<https://clang.llvm.org/docs/AttributeReference.html#pragma-omp-declare-variant>`__
that are exposed through LLVM/Clang to the user as well.
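
The selection mechanism can be illustrated with the standard OpenMP 5.0
``declare variant`` directive. The sketch below is not the code found in
Clang's wrapper headers, which use the begin/end form and vendor-provided
functions; ``scale`` and ``scale_gpu`` are hypothetical names used only for
this example.

.. code-block:: c

   #include <stdio.h>

   /* Hypothetical device-specific implementation; a real one might call a
      vendor-provided function instead. */
   double scale_gpu(double x) { return 2.0 * x; }

   /* Base function. Calls made in a GPU device context are redirected to
      scale_gpu(); all other calls use this definition. */
   #pragma omp declare variant(scale_gpu) match(device = {kind(gpu)})
   double scale(double x) { return 2.0 * x; }

   int main(void) {
     double y = 0.0;

     /* y is copied to the device and back. If the region runs on the host,
        the base version of scale() is called. */
   #pragma omp target map(tofrom: y)
     y = scale(21.0);

     printf("%.1f\n", y); /* prints 42.0 with either version */
     return 0;
   }

Both versions compute the same value here so the output does not depend on
where the region runs; in the wrapper headers the variants differ in which
underlying implementation they call.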

Q: What is a way to debug errors from mapping memory to a target device?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An experimental way to debug these errors is to use :ref:`remote process 
offloading <remote_offloading_plugin>`.
By using ``libomptarget.rtl.rpc.so`` and ``openmp-offloading-server``, it is
possible to explicitly perform memory transfers between processes on the host
CPU and run sanitizers while doing so in order to catch these errors.

Q: Why does my application say "Named symbol not found" and abort when I run it?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is most likely caused by trying to use OpenMP offloading with static
libraries. Static libraries do not contain any device code, so when the runtime
attempts to execute the target region, it will not be found and you will get
an error like this:

.. code-block:: text

   CUDA error: Loading '__omp_offloading_fd02_3231c15__Z3foov_l2' Failed
   CUDA error: named symbol not found
   Libomptarget error: Unable to generate entries table for device id 0.

Currently, the only solution is to change how the application is built and avoid
the use of static libraries.

Q: Can I use dynamically linked libraries with OpenMP offloading?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Dynamically linked libraries can only be used if there is no device code split
between the library and application. Anything declared on the device inside the
shared library will not be visible to the application when it's linked.
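
One arrangement that avoids such a split is to keep both the device function
and the target region that calls it inside the library, and to export only a
host entry point. The following single-file sketch marks which part would live
where; ``square`` and ``square_on_device`` are hypothetical names, and the
layout is an assumption for illustration rather than a documented recipe.

.. code-block:: c

   #include <stdio.h>

   /* ---- would live in the shared library ---- */

   /* Device-callable helper, used only by target regions in the same image. */
   #pragma omp declare target
   static int square(int x) { return x * x; }
   #pragma omp end declare target

   /* Host entry point exported by the library; the target region stays
      inside the library next to the device function it uses. */
   int square_on_device(int x) {
     int result = 0;
   #pragma omp target map(to: x) map(from: result)
     result = square(x);
     return result;
   }

   /* ---- would live in the application ---- */

   int main(void) {
     /* The application only calls the host entry point; it contains no
        target region that refers to square(). */
     printf("%d\n", square_on_device(7)); /* prints 49 */
     return 0;
   }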