The patch implements header-only support for testure lookups.
The patch has been tested on a source file with all possible combinations of
argument types supported by CUDA headers, compiled and verified that the
generated instructions and their parameters match the code generated by NVCC.
Unfortunately, compiling texture code requires CUDA headers and can't be tested
in clang itself. The test will need to be added to the test-suite later.
While generated code compiles and seems to match NVCC, I do not have any code
that uses textures that I could test correctness of the implementation. Hence
the experimental status.
Differential Revision: https://reviews.llvm.org/D110089
Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
This adds a very basic test in `cuda_with_openmp.cu` that just checks whether the CUDA & OpenMP integrated headers do compile, when a CUDA file is compiled with OpenMP (CPU) enabled.
Thus this basically adds the missing test for https://reviews.llvm.org/D90415.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105322
We do provide `operator delete(void*)` in `<new>` but it should be
available by default. This is mostly boilerplate to test it and the
unconditional include of `<new>` in the header we always in include
on the device.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D100620
The last (big) missing piece to get "math" working in OpenMP target
regions (that I know of) was complex math functions, e.g.,
`std::sin(std::complex<double>)`. With this patch we overload the system
template functions for these operations with versions that have been
distilled from `libcxx/include/complex`. We use the same
`omp begin/end declare variant`
mechanism we use for other math functions before, except that we this
time overload templates (via D85735).
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D85777
`std::isnan` and friends can be found in two variants in the wild, one
returns `bool`, as the standard defines it, one returns `int`, as the C
macros do. So far we kinda hoped the system versions of these functions
will work for people, e.g. they are definitions that can be compiled for
the target. We know that is not the case always so we leverage the
`disable_implicit_base` OpenMP context extension to specialize both
versions of these functions without causing an invalid redeclaration.
Reviewed By: JonChesterfield, tra
Differential Revision: https://reviews.llvm.org/D85879
This simply follows the scheme we have for other wrappers. It resolves
the current link problem, e.g., `__muldc3 not found`, when std::complex
operations are used on a device.
This will not allow complex make math function calls to work properly,
e.g., sin, but that is more complex (pan intended) anyway.
Reviewed By: tra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D80897
If we are in C++ mode and include <math.h> (not <cmath>) first, we still
need to make sure <cmath> is read first. The problem otherwise is that
we haven't seen the declarations of the math.h functions when the system
math.h includes our cmath overlay. However, our cmath overlay, or better
the underlying overlay, e.g. CUDA, uses the math.h functions. Since we
haven't declared them yet we get errors. CUDA avoids this by eagerly
declaring all math functions (in the __device__ space) but we cannot do
this. Instead we break the dependence by forcing cmath to go first.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D77774
For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions,
we include the appropriate definitions inside of an `omp begin/end declare variant match(device={arch(nvptx)})` scope.
This way, the vendor specific math functions will become specialized versions of the system math functions.
When a system math function is called and specialized version is available the selection logic introduced in D75779
instead call the specialized version. In contrast to the code path we used so far, the system header is actually included.
This means functions without specialized versions are available and so are macro definitions.
This should address PR42061, PR42798, and PR42799.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D75788
Summary:
This is a fix for the reported bug:
[[ https://bugs.llvm.org/show_bug.cgi?id=41861 | 41861 ]]
abs functions need to be moved under the c++ macro to avoid conflicts with included headers.
Reviewers: tra, jdoerfert, hfinkel, ABataev, caomhin
Reviewed By: jdoerfert
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61959
llvm-svn: 360809
Summary: In OpenMP device offloading we must ensure that unde C++ 17, the inclusion of cstdlib will works correctly.
Reviewers: ABataev, tra, jdoerfert, hfinkel, caomhin
Reviewed By: jdoerfert
Subscribers: Hahnfeld, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61949
llvm-svn: 360804
Summary: This patches fixes an issue in which the __clang_cuda_cmath.h header is being included even when cmath or math.h headers are not included.
Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra
Reviewed By: tra
Subscribers: tra, mgorny, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61765
llvm-svn: 360626
This reverts r360271 (git commit a0933bd8ec)
There are concerns on the review that this breaks EFI builds and that
the transitive includes (sal.h) are actually heavy enough that we might
care.
llvm-svn: 360291
compatibility. This allows some applications developed with MSVC to
compile with clang without any extra changes.
Fixes: llvm.org/PR40789
Differential Revision: https://reviews.llvm.org/D61646
llvm-svn: 360271
Summary:
In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.
We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360265
This commit appears to be breaking stage-2 builds on GreenDragon. The
OpenMP wrappers for cmath and math.h are copied into the root of the
resource directory and cause a cyclic dependency in module 'Darwin':
Darwin -> std -> Darwin. This blows up when CMake is testing for modules
support and breaks all stage 2 module builds, including the ThinLTO bot
and all LLDB bots.
CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message):
LLVM_ENABLE_MODULES is not supported by this compiler
llvm-svn: 360192
Summary:
In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.
We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360063
Reapply r289181 but rename the include guard to avoid
conflict with the one from Darwin.
Allow darwin to provide additional definitions and implementation
specifc values for tgmath.h on Apple platforms.
rdar://problem/19019845
llvm-svn: 298013
Reverts r289181: it's currently breaking modules using simd.h in
10.12 SDK.
This reverts commit 6e73e3464e96a4e00492c24aa790d36e1adb5702.
llvm-svn: 289487
Allow darwin to provide additional definitions and implementation
specifc values for tgmath.h on Apple platforms.
rdar://problem/19019845
llvm-svn: 289181
xmmintrin.h includes emmintrin.h and vice versa if SSE2 is enabled. We break
this cycle for a modules build, and instead make the xmmintrin.h module
re-export the immintrin.h module. Also included is a fix for an assert in the
serialization code if a module exports another module that was declared later
in the same module map.
llvm-svn: 237321
This also adds a definition for uint64_t, which was causing build failures
on some platforms. (I'm actually surprised this didn't happen on more
builders, but maybe the search paths are different.)
llvm-svn: 164706