llvm-project

Author	SHA1	Message	Date
Paul Robinson	d2c0eefd5c	[X86] Remove const from some intrinsics that shouldn't have them llvm-svn: 366699	2019-07-22 16:14:09 +00:00
Sven van Haastregt	e9e59ad79f	[OpenCL] Define CLK_NULL_EVENT without cast Defining CLK_NULL_EVENT with a `(void*)` cast has the (unintended?) side-effect that the address space will be fixed (as generic in OpenCL 2.0 mode). The consequence is that any target specific address space for the clk_event_t type will not be applied. It is not clear why the void pointer cast was needed in the first place, and it seems we can do without it. Differential Revision: https://reviews.llvm.org/D63876 llvm-svn: 366546	2019-07-19 09:11:48 +00:00
Qiu Chaofan	03aaef8e72	[PowerPC][Clang] Remove use of malloc in mm_malloc Remove dependency of malloc in implementation of mm_malloc function in PowerPC intrinsics and alignment assumption on glibc. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D64850 llvm-svn: 366406	2019-07-18 06:20:12 +00:00
Momchil Velikov	0e2b74a2b0	Revert [AArch64] Add support for Transactional Memory Extension (TME) This reverts r366322 (git commit `4b8da3a503`) llvm-svn: 366355	2019-07-17 17:43:32 +00:00
Momchil Velikov	4b8da3a503	[AArch64] Add support for Transactional Memory Extension (TME) TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322	2019-07-17 13:23:27 +00:00
Kyrylo Tkachov	eb72138340	[AArch64] Implement __jcvt intrinsic from Armv8.3-A The jcvt intrinsic defined in ACLE [1] is available when ARM_FEATURE_JCVT is defined. This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function. The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used. I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example). make check-all didn't show any new failures. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics Differential Revision: https://reviews.llvm.org/D64495 llvm-svn: 366197	2019-07-16 09:27:39 +00:00
Ulrich Weigand	b98bf60ef7	[SystemZ] Add support for new cpu architecture - arch13 This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10303. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365933	2019-07-12 18:14:51 +00:00
Craig Topper	caf6b71ab2	[X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669	2019-07-10 17:11:29 +00:00
Sven van Haastregt	b502a44110	[OpenCL] Restore ATOMIC_VAR_INIT We accidentally lost the ATOMIC_VAR_INIT and ATOMIC_FLAG_INIT macros in r363794. Also put the `memory_order` typedef back inside a `>= CL2.0` guard. llvm-svn: 364174	2019-06-24 10:06:40 +00:00
Sven van Haastregt	853dfab799	[OpenCL] Remove more duplicates from opencl-c.h Identified the duplicate declarations using sort lib/Headers/opencl-c.h | uniq -c | grep ' 2' llvm-svn: 364173	2019-06-24 10:06:34 +00:00
Anastasia Stulova	999f676d75	[OpenCL][PR41963] Add generic addr space to old atomics in C++ mode Add overloads with generic address space pointer to old atomics. This is currently only added for C++ compilation mode. Differential Revision: https://reviews.llvm.org/D62335 llvm-svn: 364071	2019-06-21 16:19:16 +00:00
Sven van Haastregt	772a7a7680	[OpenCL] Remove duplicate read_image declarations Patch by Pierre Gondois. llvm-svn: 364020	2019-06-21 10:26:10 +00:00
Craig Topper	6d9fb68c53	[X86] Make _mm_mask_cvtps_ph, _mm_maskz_cvtps_ph, _mm256_mask_cvtps_ph, and _mm256_maskz_cvtps_ph aliases for their corresponding cvt_roundps_ph intrinsic. These intrinsics should always take an immediate for the rounding mode. The base instruction comes from before EVEX embedded rounding. The user should always provide the immediate rather than us assuming CUR_DIRECTION. Make the 512-bit versions also explicit aliases instead of copy pasting the code. llvm-svn: 363961	2019-06-20 18:24:29 +00:00
Xing Xue	ab4bcd844a	AIX system headers need stdint.h and inttypes.h to be re-enterable Summary: AIX system headers need stdint.h and inttypes.h to be re-enterable when macro _STD_TYPES_T is defined so that limit macro definitions such as UINT32_MAX can be found. This patch attempts to allow that on AIX. Reviewers: hubert.reinterpretcast, jasonliu, mclow.lists, EricWF Reviewed by: hubert.reinterpretcast, mclow.lists Subscribers: jfb, jsji, christof, cfe-commits, libcxx-commits, llvm-commits Tags: #LLVM, #clang, #libc++ Differential Revision: https://reviews.llvm.org/D59253 llvm-svn: 363939	2019-06-20 15:36:32 +00:00
Craig Topper	24151619a0	[X86] Correct the __min_vector_width__ attribute on a few intrinsics. llvm-svn: 363890	2019-06-19 23:27:04 +00:00
Sven van Haastregt	af1c230e70	[OpenCL] Split type and macro definitions into opencl-c-base.h Using the -fdeclare-opencl-builtins option will require a way to predefine types and macros such as `int4`, `CLK_GLOBAL_MEM_FENCE`, etc. Move these out of opencl-c.h into opencl-c-base.h such that the latter can be shared by -fdeclare-opencl-builtins and -finclude-default-header. This changes the behaviour of -finclude-default-header when -fdeclare-opencl-builtins is specified: instead of including the full header, it will include the header with only the base definitions. Differential revision: https://reviews.llvm.org/D63256 llvm-svn: 363794	2019-06-19 12:48:22 +00:00
Zi Xuan Wu	cc12f68fff	[PowerPC] [Clang] Port SSE2 intrinsics to PowerPC Port emmintrin.h which includes Intel SSE2 intrinsics implementation to PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. It's a follow-up patch of D62121. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Differential Revision: https://reviews.llvm.org/D62569 llvm-svn: 363122	2019-06-12 05:25:40 +00:00
Pengfei Wang	244062eece	[X86] Enable intrinsics that convert float and bf16 data to each other Scalar version : _mm_cvtsbh_ss , _mm_cvtness_sbh Vector version: _mm512_cvtpbh_ps , _mm256_cvtpbh_ps _mm512_maskz_cvtpbh_ps , _mm256_maskz_cvtpbh_ps _mm512_mask_cvtpbh_ps , _mm256_mask_cvtpbh_ps Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62363 llvm-svn: 363018	2019-06-11 01:17:28 +00:00
Pengfei Wang	3a29f7c99c	[X86] Add ENQCMD instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62282 llvm-svn: 362685	2019-06-06 08:28:42 +00:00
Andrew Savonichev	9ed325e463	[OpenCL] Undefine cl_intel_planar_yuv extension Summary: Remove unnecessary definition (otherwise the extension will be defined where it's not supposed to be defined). Consider the code: #pragma OPENCL EXTENSION cl_intel_planar_yuv : begin // some declarations #pragma OPENCL EXTENSION cl_intel_planar_yuv : end is enough for extension to become known for clang. Patch by: Dmitry Sidorov <dmitry.sidorov@intel.com> Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Tags: #clang Differential Revision: https://reviews.llvm.org/D58666 llvm-svn: 362398	2019-06-03 13:02:43 +00:00
Pengfei Wang	cc3629d545	[X86] Add VP2INTERSECT instructions Support intel AVX512 VP2INTERSECT instructions in clang Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D62367 llvm-svn: 362196	2019-05-31 06:09:35 +00:00
Zi Xuan Wu	fc3ed1ec50	re-commit r361928: [PowerPC] [Clang] Port SSE intrinsics to PowerPC Port xmmintrin.h which includes Intel SSE intrinsics implementation to PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D62121 llvm-svn: 362190	2019-05-31 04:42:13 +00:00
Zi Xuan Wu	48061cd999	revert rC361928: [PowerPC] [Clang] Port SSE intrinsics to PowerPC Because test fails in other targets rather than PowerPC llvm-svn: 361930	2019-05-29 07:09:54 +00:00
Zi Xuan Wu	b3bcbb5b66	[PowerPC] [Clang] Port SSE intrinsics to PowerPC Port xmmintrin.h which includes Intel SSE intrinsics implementation to PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D62121 llvm-svn: 361928	2019-05-29 05:17:03 +00:00
Kevin Petit	aa7754cc90	[OpenCL] Add support for the cl_arm_integer_dot_product extensions The specification is available in the Khronos OpenCL registry: https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt Signed-off-by: Kevin Petit <kevin.petit@arm.com> llvm-svn: 361641	2019-05-24 14:53:52 +00:00
Craig Topper	3d7ecc4618	[X86] Remove semicolons at the end of intrinsics implemented as macros so they can be used as arguments to other intrinsics. Also fix one intrinsic that was using variable names without underscores. Fixes PR41932 llvm-svn: 361109	2019-05-19 01:01:52 +00:00
Gheorghe-Teodor Bercea	144291e14c	[OpenMP][bugfix] Add missing math functions variants for log and abs. Summary: When including the random header in C++, some of the math functions it relies on are not present in the CUDA headers. We include these variants in this case. Reviewers: jdoerfert, hfinkel, tra, caomhin Reviewed By: tra Subscribers: efriedma, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D62046 llvm-svn: 361066	2019-05-17 19:15:53 +00:00
Craig Topper	20040db9a6	[X86] Stop implicitly enabling avx512vl when avx512bf16 is enabled. Previously we were doing this so that the 256 bit selectw builtin could be used in the implementation of the 512->256 bit conversion intrinsic. After this commit we now use a masked convert builtin that will emit the intrinsic call and the 256-bit select from custom code in CGBuiltin. Then the header only needs to call that one intrinsic. llvm-svn: 360924	2019-05-16 18:28:17 +00:00
Craig Topper	58964566e0	[X86] Update doxygen comments for AVX512BF16 to not refer to masks as 'immediates'. Refer to parameter names instead of 'src', 'src1', 'src2'. NFC llvm-svn: 360918	2019-05-16 17:34:35 +00:00
Gheorghe-Teodor Bercea	9392bd6987	[OpenMP][Bugfix] Move double and float versions of abs under c++ macro Summary: This is a fix for the reported bug: [[ https://bugs.llvm.org/show_bug.cgi?id=41861 | 41861 ]] abs functions need to be moved under the c++ macro to avoid conflicts with included headers. Reviewers: tra, jdoerfert, hfinkel, ABataev, caomhin Reviewed By: jdoerfert Subscribers: guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61959 llvm-svn: 360809	2019-05-15 20:28:23 +00:00
Gheorghe-Teodor Bercea	7641f310d7	[OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions Summary: In OpenMP device offloading we must ensure that under C++ 17, the inclusion of cstdlib works correctly. Reviewers: ABataev, tra, jdoerfert, hfinkel, caomhin Reviewed By: jdoerfert Subscribers: Hahnfeld, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61949 llvm-svn: 360804	2019-05-15 20:18:21 +00:00
Volodymyr Sapsai	51e79f0634	[X86] Make `x86intrin.h`, `immintrin.h` includable with `-fno-gnu-inline-asm`. Currently `immintrin.h` includes `pconfigintrin.h` and `sgxintrin.h` which contain inline assembly. It causes failures when building with the flag `-fno-gnu-inline-asm`. Fix by excluding functions with inline assembly when this extension is disabled. So far there was no need to support `_pconfig_u32`, `_enclu_u32`, `_encls_u32`, `_enclv_u32` on platforms that require `-fno-gnu-inline-asm`. But if developers start using these functions, they'll have compile-time undeclared identifier errors, which is preferable to runtime errors. rdar://problem/49540880 Reviewers: craig.topper, GBuella, rnk, echristo Reviewed By: rnk Subscribers: jkorous, dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D61621 llvm-svn: 360630	2019-05-13 22:40:11 +00:00
Gheorghe-Teodor Bercea	946957189d	[OpenMP][Clang][BugFix] Split declares and math functions inclusion. Summary: This patch fixes an issue in which the __clang_cuda_cmath.h header is being included even when cmath or math.h headers are not included. Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra Reviewed By: tra Subscribers: tra, mgorny, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61765 llvm-svn: 360626	2019-05-13 22:11:44 +00:00
Reid Kleckner	55fab1ff48	Revert Include corecrt.h in stddef.h and vcruntime.h in stdarg.h to improve MS compatibility. This reverts r360271 (git commit `a0933bd8ec`) There are concerns on the review that this breaks EFI builds and that the transitive includes (sal.h) are actually heavy enough that we might care. llvm-svn: 360291	2019-05-08 22:01:20 +00:00
Mike Rice	a0933bd8ec	Include corecrt.h in stddef.h and vcruntime.h in stdarg.h to improve MS compatibility. This allows some applications developed with MSVC to compile with clang without any extra changes. Fixes: llvm.org/PR40789 Differential Revision: https://reviews.llvm.org/D61646 llvm-svn: 360271	2019-05-08 17:15:21 +00:00
Gheorghe-Teodor Bercea	e62c693c8e	[OpenMP][Clang] Support for target math functions Summary: In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang. We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism. Authors: @gtbercea @jdoerfert Reviewers: hfinkel, caomhin, ABataev, tra Reviewed By: hfinkel, ABataev, tra Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert Tags: #clang Differential Revision: https://reviews.llvm.org/D61399 llvm-svn: 360265	2019-05-08 15:52:33 +00:00
Jonas Devlieghere	fe608c938c	Revert "[OpenMP][Clang] Support for target math functions" This commit appears to be breaking stage-2 builds on GreenDragon. The OpenMP wrappers for cmath and math.h are copied into the root of the resource directory and cause a cyclic dependency in module 'Darwin': Darwin -> std -> Darwin. This blows up when CMake is testing for modules support and breaks all stage 2 module builds, including the ThinLTO bot and all LLDB bots. CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message): LLVM_ENABLE_MODULES is not supported by this compiler llvm-svn: 360192	2019-05-07 21:08:15 +00:00
Gheorghe-Teodor Bercea	1e28a668bc	[OpenMP][Clang] Support for target math functions Summary: In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang. We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism. Authors: @gtbercea @jdoerfert Reviewers: hfinkel, caomhin, ABataev, tra Reviewed By: hfinkel, ABataev, tra Subscribers: mgorny, guansong, cfe-commits, jdoerfert Tags: #clang Differential Revision: https://reviews.llvm.org/D61399 llvm-svn: 360063	2019-05-06 18:19:15 +00:00
Fangrui Song	041c377a59	[X86] Move files to correct directories after D60552 llvm-svn: 360022	2019-05-06 09:24:36 +00:00
Luo, Yuanke	844f662932	Enable intrinsics of AVX512_BF16, which are supported for BFLOAT16 in Cooper Lake Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Patch by LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon Reviewed By: craig.topper Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60552 llvm-svn: 360018	2019-05-06 08:25:11 +00:00
Tom Stellard	dbe1c4aa6f	lib/Header: Fix Visual Studio builds try #2 Summary: This is a follow up to r355253 and a better fix than the first attempt which was r359257. We can't install anything from ${CMAKE_CFG_INTDIR}, because this value is only defined at build time, but we still must make sure to copy the headers into ${CMAKE_CFG_INTDIR}/lib/clang/$VERSION/include, because the lit tests look for headers there. So for this fix we revert to the old behavior of copying the headers to ${CMAKE_CFG_INTDIR}/lib/clang/$VERSION/include during the build and then installing them from the source tree. Reviewers: smeenai, vzakhari, phosek Reviewed By: smeenai, vzakhari Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61220 llvm-svn: 359654	2019-05-01 06:18:03 +00:00
Javed Absar	18b0c40bc5	[AArch64] Add support for MTE intrinsics This provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined. Each intrinsic is described in detail in the ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed By: Tim Nortover, David Spickett Differential Revision: https://reviews.llvm.org/D60485 llvm-svn: 359348	2019-04-26 21:08:11 +00:00
Tom Stellard	0184819e81	Revert lib/Header: Fix Visual Studio builds This reverts r359257 (git commit `00d9789509`) This broke check-clang. llvm-svn: 359258	2019-04-26 01:43:59 +00:00
Tom Stellard	00d9789509	lib/Header: Fix Visual Studio builds Summary: This is a follow up to r355253, which inadvertently broke Visual Studio builds by trying to copy files from CMAKE_CFG_INTDIR. See https://reviews.llvm.org/D58537#inline-532492 Reviewers: smeenai, vzakhari, phosek Reviewed By: smeenai Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61054 llvm-svn: 359257	2019-04-26 01:18:59 +00:00
Jinsong Ji	12450d51a2	[PowerPC][NFC]Update licence to Apache 2 llvm-svn: 359164	2019-04-25 02:40:06 +00:00
Qiu Chaofan	19828e399b	[PowerPC] [Clang] Port MMX intrinsics and basic test cases to Power Port mmintrin.h which includes x86 MMX intrinsics implementation to PowerPC platform (using Altivec). To make the include process correct, PowerPC's toolchain class is overridden to insert new headers directory (named ppc_wrappers) into the path. Basic test cases for several intrinsic functions are added. The header is mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D59924 llvm-svn: 358949	2019-04-23 05:50:24 +00:00
Evgeny Mankov	88aa3d7237	[CUDA][Windows] Restrict long double device functions declarations to Windows As agreed in D60220, make long double declarations unobservable on non-windows platforms. [Testing] {Windows 10, Ubuntu 16.04.5}/{Visual C++ 2017 15.9.11 & 2019 16.0.1, gcc+ 5.4.0}/CUDA {8.0, 9.0, 9.1, 9.2, 10.0, 10.1} Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D60818 llvm-svn: 358654	2019-04-18 10:08:55 +00:00
Craig Topper	8e364c680f	[X86] Restore the pavg intrinsics. The pattern we replaced these with may be too hard to match as demonstrated by PR41496 and PR41316. This patch restores the intrinsics and then we can start focusing on the optimizing the intrinsics. I've mostly reverted the original patch that removed them. Though I modified the avx512 intrinsics to not have masking built in. Differential Revision: https://reviews.llvm.org/D60674 llvm-svn: 358427	2019-04-15 17:17:35 +00:00
Chandler Carruth	4cf5743b77	Move the builtin headers to use the new license file header. Summary: These all had somewhat custom file headers with different text from the ones I searched for previously, and so I missed them. Thanks to Hal and Kristina and others who prompted me to fix this, and sorry it took so long. Reviewers: hfinkel Subscribers: mcrosier, javed.absar, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60406 llvm-svn: 357941	2019-04-08 20:51:30 +00:00
Evgeny Mankov	66a8b07cd9	[CUDA][Windows] Last fix for the clang Bug 38811 "Clang fails to compile with CUDA-9.x on Windows" (https://bugs.llvm.org/show_bug.cgi?id=38811 ). [IMPORTANT] With that last fix, CUDA has just started being compiled by clang on Windows after nearly a year and two clang's major releases (7 and 8). As long as the last LLVM release, in which clang was compiling CUDA on Windows successfully, was 6.0.1, this fix and two previous have to be included into upcoming 7.1.0 and 8.0.1 releases. [How to repro] clang++.exe -x cuda "c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\0_Simple\simplePrintf\simplePrintf.cu" -I"c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\common\inc" --cuda-gpu-arch=sm_50 --cuda-path="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0" -L"c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64" -lcudart.lib -v [Output] In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.9\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:390:11: error: no matching function for call to '__isinfl' return (__isinfl(a) != 0); ^~~~~~~~ C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2662:14: note: candidate function not viable: call to __host__ function from __device__ function __func__(int __isinfl(long double a)) ^ In file included from <built-in>:1: In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.9\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:438:11: error: no matching function for call to '__isnanl' return (__isnanl(a) != 0); ^~~~~~~~ C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2672:14: note: candidate function not viable: call to __host__ function from __device__ function __func__(int __isnanl(long double a)) ^ In file included from <built-in>:1: In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.9\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:486:11: error: no matching function for call to '__finitel' return (__finitel(a) != 0); ^~~~~~~~~ C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2652:14: note: candidate function not viable: call to __host__ function from __device__ function __func__(int __finitel(long double a)) ^ 3 errors generated when compiling for sm_50. [Solution] Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side. [Testing] {Windows 10, Ubuntu 16.04.5}/{Visual C++ 2017 15.9.9, gcc+ 5.4.0}/CUDA {8.0, 9.0, 9.1, 9.2, 10.0, 10.1} Reviewed by: Artem Belevich Differential Revision: http://reviews.llvm.org/D60220 llvm-svn: 357779	2019-04-05 16:51:10 +00:00

1559 Commits