Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
hipRTC compiles HIP device code at run time. Since the system may not
have development tools installed, when a HIP program is compiled through
hipRTC, there is no standard C or C++ header available. As such, the HIP
headers should not depend on standard C or C++ headers when used
with hipRTC. Basically when hipRTC is used, HIP headers only provides
definitions of HIP device API functions. This is in line with what nvRTC does.
This patch adds support of hipRTC to HIP headers in clang. Basically hipRTC
defines a macro __HIPCC_RTC__ when compile HIP code at run time. When
this macro is defined, HIP headers do not include standard C/C++ headers.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D100652
The __ocml_*_rte_f32 and __ocml_*_rte_f64 functions are not
available if OCML_BASIC_ROUNDED_OPERATIONS is not defined.
Reviewed By: b-sumner, yaxunl
Fixes: SWDEV-257235
Differential Revision: https://reviews.llvm.org/D89966
Separate __clang_hip_math.h header into __clang_hip_cmath.h
and __clang_hip_math.h. Improve the math function definition,
and add missing definitions or declarations. Add missing
overloads.
Reviewed By: tra, JonChesterfield
Differential Review: https://reviews.llvm.org/D88837
Automation to detect compiler features, such as CMake's target_compile_features,
would attempt to detect compiler features by explicitly using langugage flags.
This change ensures that the HIP headers would still work with C++98.
Patch by Siu Chi Chan
Differential Revision: https://reviews.llvm.org/D85471
Change-Id: I304e964b18a525b0fde55efd841da74b6c4dc8ed
Summary:
Make hip math headers easier to use from C
Motivation is a step towards using the hip math headers to implement math.h
for openmp, which needs to work with C as well as C++. NFC for C++ code.
Reviewers: yaxunl, jdoerfert
Reviewed By: yaxunl
Subscribers: sstefan1, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D84476
To support std::complex and some other standard C/C++ functions in HIP device code,
they need to be forced to be __host__ __device__ functions by pragmas. This is done
by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang.
For these standard C++ wapper headers to work properly, specific include path order
has to be enforced:
clang C++ wrapper include path
standard C++ include path
clang include path
Also, these C++ wrapper headers require device version of some standard C/C++ functions
must be declared before including them. This needs to be done by including a default
header which declares or defines these device functions. The default header is always
included before any other headers are included by users.
This patch adds the the default header and include path for HIP.
Differential Revision: https://reviews.llvm.org/D81176