llvm-project

Commit Graph

Author	SHA1	Message	Date
Johannes Doerfert	b5667d00e0	[OpenMP][CUDA] Fix std::complex in GPU regions. The old way worked to some degree for C++ mode but in C mode we actually tried to introduce variants of macros (e.g., isinf). To make both modes work reliably we get rid of those extra variants and directly use NVIDIA intrinsics in the complex implementation. While this has to be revisited as we add other GPU targets which want to reuse the code, it should be fine for now. Reviewed By: tra, JonChesterfield, yaxunl Differential Revision: https://reviews.llvm.org/D83591	2020-07-11 00:40:05 -05:00
Johannes Doerfert	e3e47e8035	[OpenMP] Make complex soft-float functions on the GPU weak definitions. To avoid linkage errors we have to ensure the linkage allows multiple definitions of these compiler-inserted functions. Since they are on the cold path of complex computations, we want to avoid `inline`. Instead, we opt for `weak` and `noinline` for now.	2020-07-09 01:06:55 -05:00
Johannes Doerfert	d999cbc988	[OpenMP] Initial support for std::complex in target regions. This simply follows the scheme we have for other wrappers. It resolves the current link problem, e.g., `__muldc3 not found`, when std::complex operations are used on a device. This will not allow complex math function calls to work properly, e.g., sin, but that is more complex (pun intended) anyway. Reviewed By: tra, JonChesterfield Differential Revision: https://reviews.llvm.org/D80897	2020-07-08 17:33:59 -05:00
Johannes Doerfert	f85ae058f5	[OpenMP] Provide math functions in OpenMP device code via OpenMP variants. For OpenMP target regions to piggyback on the CUDA/AMDGPU/... implementation of math functions, we include the appropriate definitions inside of an `omp begin/end declare variant match(device={arch(nvptx)})` scope. This way, the vendor-specific math functions will become specialized versions of the system math functions. When a system math function is called and a specialized version is available, the selection logic introduced in D75779 calls the specialized version instead. In contrast to the code path we used so far, the system header is actually included. This means functions without specialized versions are available and so are macro definitions. This should address PR42061, PR42798, and PR42799. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D75788	2020-04-07 23:33:24 -05:00

4 Commits
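
The variant scheme described in the f85ae058f5 message can be sketched roughly as follows. This is a minimal illustration, not the contents of the actual wrapper headers: `my_abs` and `check_on_host` are hypothetical names made up for this sketch, the selector is copied from the commit message, and the preprocessor guard is an assumption added so that compilers without the directive still build the file. `__nv_fabs` is assumed to be supplied by NVIDIA's libdevice when compiling for the device.

```c
#include <assert.h>

/* Base ("system") version, visible on every target.
   my_abs is a hypothetical name used only in this sketch. */
double my_abs(double x) { return x < 0.0 ? -x : x; }

/* Assumption: guard the directive so a compiler without OpenMP enabled
   (or without begin/end declare variant) does not see a redefinition. */
#if defined(_OPENMP) && defined(__clang__)
/* Vendor intrinsic, assumed to be provided by libdevice on the device. */
double __nv_fabs(double);

#pragma omp begin declare variant match(device = {arch(nvptx)})
/* Specialized version: same name and signature as the base function.
   It is only a candidate when the device architecture matches. */
double my_abs(double x) { return __nv_fabs(x); }
#pragma omp end declare variant
#endif

/* On the host the selector does not match, so the base version is called. */
void check_on_host(void) { assert(my_abs(-3.0) == 3.0); }
```

Because the base definition stays visible, a function without a specialized version still resolves to the system one, which is the behavior the commit message describes.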