llvm-project

Commit Graph

Author	SHA1	Message	Date
Lei Huang	8b3d944a97	[PowerPC] Disable vector types when not supported by subtarget features Update clang to treat vector unsigned long long and friends as invalid for AltiVec without VSX. Reported in: https://bugs.llvm.org/show_bug.cgi?id=47782 Reviewed By: nemanjai, amyk Differential Revision: https://reviews.llvm.org/D109178	2021-10-04 14:16:47 -05:00
Nemanja Ivanovic	369d785574	[PowerPC] Optimal sequence for doubleword vec_all_{eq|ne} on Power7 These builtins produce inefficient code for CPU's prior to Power8 due to vcmpequd being unavailable. The predicate forms can actually leverage the available vcmpequw along with xxlxor to produce a better sequence.	2021-10-01 08:27:15 -05:00
Nemanja Ivanovic	fad14a17a4	[PowerPC] Truncate element index for vec_insert in altivec.h When a user specifies an out-of-range index for vec_insert, we just produce IR that has undefined behaviour even though the documentation states that modulo arithmetic is used. This patch just truncates the value to a valid index.	2021-09-30 05:58:22 -05:00
Nemanja Ivanovic	09b67aa1c3	[PowerPC] Implement builtin for vbpermd The instruction has similar semantics to vbpermq but for doublewords. It was added in Power9 and the ABI documents the builtin. Differential revision: https://reviews.llvm.org/D107899	2021-09-29 06:34:31 -05:00
Wang, Pengfei	7d6889964a	[X86][FP16] Add more builtins to avoid multi evaluation problems & add 2 missed intrinsics Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110336	2021-09-27 09:27:04 +08:00
Quinn Pham	f9912fe4ea	[PowerPC] Add range checks for P10 Vector Builtins This patch adds range checking for some Power10 altivec builtins and changes the signature of a builtin to match documentation. For `vec_cntm`, range checking is done via SemaChecking. For `vec_splati_ins`, the second argument is masked to extract the 0th bit so that we always receive either a `0` or a `1`. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D109710	2021-09-23 11:05:49 -05:00
Wang, Pengfei	ebec077e07	[X86][FP16] Change the order of the operands in complex FMA intrinsics to allow swap between the mul operands. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D109658	2021-09-23 11:02:48 +08:00
Albion Fung	b93359ea3f	[PowerPC] Support for vector bool int128 on vector comparison builtins This patch implements support for the type vector bool int128 for arguments on vector comparison builtins listed below, which would otherwise crash due to ambiguity. The following builtins are added: vec_all_eq (vector bool __int128, vector bool __int128) vec_all_ne (vector bool __int128, vector bool __int128) vec_any_eq (vector bool __int128, vector bool __int128) vec_any_ne (vector bool __int128, vector bool __int128) vec_cmpne(vector bool __int128 a, vector bool __int128 b) vec_cmpeq(vector bool __int128 a, vector bool __int128 b) Differential revision: https://reviews.llvm.org/D110084	2021-09-21 16:29:37 -05:00
Justas Janickas	228dd20c3f	[OpenCL] Supports atomics in C++ for OpenCL 2021 Atomics in C++ for OpenCL 2021 are now handled the same way as in OpenCL C 3.0. This is a header-only change. Differential Revision: https://reviews.llvm.org/D109424	2021-09-20 16:24:30 +01:00
serge-sans-paille	9aeecdfa8e	Check supported architectures in sseXYZ/avxXYZ headers It doesn't make sense to include those headers on the wrong architecture, provide an explicit error message in that case. Fix https://bugs.llvm.org/show_bug.cgi?id=48915 Differential Revision: https://reviews.llvm.org/D109686	2021-09-14 09:57:54 +02:00
Sven van Haastregt	d353d1c501	[OpenCL] Support cl_ext_float_atomics See https://github.com/KhronosGroup/OpenCL-Docs/pull/552 for initial specification. Patch by Haonan Yang. Differential Revision: https://reviews.llvm.org/D106343	2021-09-13 12:12:40 +01:00
Xiang1 Zhang	c81d6ab875	[X86] Adjust Keylocker handle mem size Reviewed By: Topper Craig Differential Revision: https://reviews.llvm.org/D109488	2021-09-13 18:03:27 +08:00
Xiang1 Zhang	bdce8d40c6	Revert "[X86] Adjust Keylocker handle mem size" This reverts commit `3731de6b7f`.	2021-09-13 18:00:46 +08:00
Xiang1 Zhang	3731de6b7f	[X86] Adjust Keylocker handle mem size Reviewed By: Topper Craig Differential Revision: https://reviews.llvm.org/D109354	2021-09-13 17:59:33 +08:00
Wang, Pengfei	2aaa6466fe	[X86] Support *_set1_pch(Float16 _Complex h) Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D109487	2021-09-11 17:47:31 +08:00
Joseph Huber	f28e710db7	[OpenMP] Make CUDA math library functions SPMD amenable This patch adds the SPMD amenable assumption to the CUDA math library defintions in Clang. Previously these functions would block SPMD execution on the device because they're intrinsic calls into the library and can't be calculated. These functions don't have side-effects so they are safe to execute in SPMD mode. Depends on D105937 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108958	2021-09-10 14:52:45 -04:00
Simon Pilgrim	ea685e1028	[X86][AVX] Update _mm256_loadu2_m128* intrinsics to use _mm256_set_m128* (PR51796) As reported on PR51796, the _mm256_loadu2_m128i in particular was inserting bitcasts and shuffles with different types making it trickier for some combines, and prevented the value tracker from identifying the shuffle sequences as a single insert_subvector style concat_vectors pattern. This patch instead concatenate the 128-bit unaligned loads with _mm256_set_m128*, which was written to avoid the unnecessary bitcasts and only emits a single shuffle. Differential Revision: https://reviews.llvm.org/D109497	2021-09-09 19:15:48 +01:00
Simon Pilgrim	55d9396278	[X86] Move _mm256_set_m128* intrinsics before _mm256_loadu2_m128* intrinsics. NFC. This is necessary for PR51796 where we'll update _mm256_loadu2_m128* to use _mm256_set_m128*	2021-09-09 11:23:50 +01:00
Pushpinder Singh	12dcbf913c	[AMDGPU][OpenMP] Use complex definitions from complex_cmath.h Following nvptx approach, this patch uses complex function definitions from complex_cmath.h. With this patch, ovo passes 23/34 complex mathematical test cases. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D109344	2021-09-09 10:55:17 +05:30
Tianqing Wang	12fa608af4	[X86] Add CRC32 feature. `d8faf03807` implemented general-regs-only for X86 by disabling all features with vector instructions. But the CRC32 instruction in SSE4.2 ISA, which uses only GPRs, also becomes unavailable. This patch adds a CRC32 feature for this instruction and allows it to be used with general-regs-only. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D105462	2021-09-06 17:24:30 +08:00
Stuart Brady	32955be6bf	[OpenCL] Remove decls for scalar vloada_half and vstorea_half* fns These functions are not part of the OpenCL C specification. See https://github.com/KhronosGroup/OpenCL-Docs/issues/648 for a clarification regarding the vloada_half declarations. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D108761	2021-09-02 22:08:09 +01:00
Nico Weber	e5438f3868	clang/win: Add __readfsdword to intrin.h When using __readfsdword(), clang used to warn that one has to include <intrin.h> -- no matter if that was already included or not. Now it only warns if it's not yet included. To verify that this was the only intrin with this problem, I ran: $ for f in $(grep intrin.h clang/include/clang/Basic/BuiltinsX86* | egrep -o '\([^,]+,' | egrep -o '[^(,]*'); do if ! grep -q $f clang/lib/Headers/intrin.h; then echo $f; fi; done This printed 9 more functions, but those are all in emmintrin.h, xsaveintrin.h (which are included by intrin.h based on /arch: flags). So this is indeed the only built-in that was missing in intrin.h. Fixes PR51188. Differential Revision: https://reviews.llvm.org/D109085	2021-09-02 12:22:07 -04:00
Justas Janickas	fb321c2ea2	[OpenCL] Define OpenCL 3.0 optional core features in C++ for OpenCL 2021 Modifies OpenCL 3.0 optional core feature macro definitions so that they are set analogously in C++ for OpenCL 2021. This change aims to achieve compatibility between C++ for OpenCL 2021 and OpenCL 3.0. Differential Revision: https://reviews.llvm.org/D108704	2021-09-01 10:15:17 +01:00
Victor Huang	2e5c17d19e	[PowerPC][NFC] Rename P10 builtins vec_clrl, vec_clrr to vec_clr_first and vec_clr_last This patch renames the vector clear left/right builtins vec_clrl, vec_clrr to vec_clr_first and vec_clr_last to avoid the ambiguities when dealing with endianness. Reviewed By: amyk, lei Differential revision: https://reviews.llvm.org/D108702	2021-08-30 09:52:15 -05:00
Wang, Pengfei	ab40dbfe03	[X86] AVX512FP16 instructions enabling 6/6 Enable FP16 complex FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105269	2021-08-30 13:08:45 +08:00
Xiang1 Zhang	80f7ce8993	[X86] Support __SSC_MARK(const int id) Differential Revision: https://reviews.llvm.org/D108682	2021-08-30 09:55:35 +08:00
Xiang1 Zhang	4c29dc18cf	Revert "[X86] Support __SSC_MARK(const int id)" This reverts commit `78fbde5779`.	2021-08-30 09:50:26 +08:00
Xiang1 Zhang	78fbde5779	[X86] Support __SSC_MARK(const int id) Differential Revision: https://reviews.llvm.org/D108682	2021-08-30 09:21:22 +08:00
Xiang1 Zhang	fd88fac6ca	Revert "[X86] Support __SSC_MARK(const int id)" This reverts commit `83e82ff767`.	2021-08-30 09:18:27 +08:00
Xiang1 Zhang	83e82ff767	[X86] Support __SSC_MARK(const int id) Differential Revision: https://reviews.llvm.org/D108682	2021-08-30 08:51:20 +08:00
Pushpinder Singh	07e85823aa	[OpenMP][AMDGCN] Enable complex functions This patch enables basic complex functionality using the ocml builtins. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108552	2021-08-24 12:40:41 +05:30
Wang, Pengfei	c728bd5bba	[X86] AVX512FP16 instructions enabling 5/6 Enable FP16 FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105268	2021-08-24 09:07:19 +08:00
Wang, Pengfei	b088536ce9	[X86] AVX512FP16 instructions enabling 4/6 Enable FP16 unary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105267	2021-08-22 08:59:35 +08:00
Craig Topper	5cf5df8014	[X86] Add missing __inline__ to functions in amxintrin.h	2021-08-20 09:35:02 -07:00
Thomas Lively	88962cea46	[WebAssembly] Restore builtins and intrinsics for pmin/pmax Partially reverts `85157c0079`, which had removed these builtins and intrinsics in favor of normal codegen patterns. It turns out that it is possible for the patterns to be split over multiple basic blocks, however, which means that DAG ISel is not able to select them to the pmin/pmax instructions. To make sure the SIMD intrinsics generate the correct instructions in these cases, reintroduce the clang builtins and corresponding LLVM intrinsics, but also keep the normal pattern matching as well. Differential Revision: https://reviews.llvm.org/D108387	2021-08-20 09:21:31 -07:00
Thomas Lively	64a9957bf7	[WebAssembly] Make shift values unsigned in wasm_simd128.h On some platforms, negative shift values mean to shift in the opposite direction, but this is not true with WebAssembly. To avoid confusion, make the shift values in the shift intrinsics unsigned. Differential Revision: https://reviews.llvm.org/D108415	2021-08-20 09:10:37 -07:00
Thomas Lively	2456e11614	[WebAssembly] Add SIMD intrinsics using unsigned integers For each SIMD intrinsic function that takes or returns a scalar signed integer value, ensure there is a corresponding intrinsic that returns or an unsigned value. This is a convenience for users who use -Wsign-conversion so they don't have to insert explicit casts, especially when the intrinsic arguments are integer literals that fit into the unsigned integer type but not the signed type. Differential Revision: https://reviews.llvm.org/D108412	2021-08-20 08:56:51 -07:00
Thomas Lively	fd3bd63df2	[WebAssembly] Make bitmask instructions return unsigned ints Since they are bitmasks, it will be more common for them to be used and potentially extended to 64-bit integers as unsigned values rather than signed values. Differential Revision: https://reviews.llvm.org/D108401	2021-08-19 16:23:47 -07:00
Martin Storsjö	cc3affd8b0	[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64 The code is based on the same __mulh and __umulh intrinsics for x86. This should fix PR51128. Differential Revision: https://reviews.llvm.org/D106721	2021-08-19 11:29:55 +03:00
Jon Chesterfield	dbd7bad9ad	[openmp] Annotate tmp variables with omp_thread_mem_alloc Fixes miscompile of calls into ocml. Bug 51445. The stack variable `double __tmp` is moved to dynamically allocated shared memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable is passed to a function that is explicitly annotated address_space(5) then allocating the variable off-stack leads to a miscompile in the back end, which cannot decide to move the variable back to the stack from shared. This could be fixed by removing the AS(5) annotation from the math library or by explicitly marking the variables as thread_mem_alloc. The cast to AS(5) is still a no-op once IR is reached. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D107971	2021-08-19 02:22:11 +01:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Craig Topper	705b1191aa	[X86] Add parentheses around casts in X86 intrinsic headers. Fixes PR51324.	2021-08-14 18:14:44 -07:00
Wang, Pengfei	f1de9d6dae	[X86] AVX512FP16 instructions enabling 2/6 Enable FP16 binary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105264	2021-08-15 08:56:33 +08:00
Craig Topper	d2cb189184	[X86] Use a do {} while (0) in the _MM_EXTRACT_FLOAT implementation. Previously we just used {}, but that doesn't work in situations like this. if (1) _MM_EXTRACT_FLOAT(d, x, n); else ... The semicolon would terminate the if.	2021-08-14 16:41:55 -07:00
Craig Topper	73c4c32767	[X86] Use __builtin_bit_cast _mm_extract_ps instead of type punning through a union. NFC	2021-08-14 16:35:55 -07:00
Craig Topper	4190d99dfc	[X86] Add parentheses around casts in some of the X86 intrinsic headers. This covers the SSE and AVX/AVX2 headers. AVX512 has a lot more macros due to rounding mode. Fixes part of PR51324. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D107843	2021-08-13 09:36:16 -07:00
Jon Chesterfield	6a8e5120ab	Revert "[openmp] Annotate tmp variables with omp_thread_mem_alloc" This reverts commit `b6113548c9`.	2021-08-12 17:44:36 +01:00
Jon Chesterfield	b6113548c9	[openmp] Annotate tmp variables with omp_thread_mem_alloc Fixes miscompile of calls into ocml. Bug 51445. The stack variable `double __tmp` is moved to dynamically allocated shared memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable is passed to a function that is explicitly annotated address_space(5) then allocating the variable off-stack leads to a miscompile in the back end, which cannot decide to move the variable back to the stack from shared. This could be fixed by removing the AS(5) annotation from the math library or by explicitly marking the variables as thread_mem_alloc. The cast to AS(5) is still a no-op once IR is reached. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D107971	2021-08-12 17:30:22 +01:00
Freddy Ye	6c1468854d	[X86] Reverse _set_ph and _setr_ph 's set order. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D107946	2021-08-12 16:27:04 +08:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Dave Airlie	1854db74c5	opencl-c.h: add 3.0 optional extension support for a few more bits These 3 are fairly simple, pipes, workgroups and subgroups. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D105858	2021-08-07 09:25:00 +10:00
Justas Janickas	a5a2f05dcc	[C++4OpenCL] Introduces __remove_address_space utility This change provides a way to conveniently declare types that have address space qualifiers removed. Since OpenCL adds address spaces implicitly even when they are not specified in source, it is useful to allow deriving address space unqualified types. Fixes llvm.org/PR45326 Differential Revision: https://reviews.llvm.org/D106785	2021-08-06 10:40:22 +01:00
Jon Chesterfield	509854b69c	[clang] Replace asm with __asm__ in cuda header Asm is a gnu extension for C, so at present -fopenmp -std=c99 and similar fail to compile on nvptx, bug 51344 Changing to `__asm__` or `__asm` works for openmp, all three appear to work for cuda. Suggesting `__asm__` here as `__asm` is used by MSVC with different syntax, so this should make for better error diagnostics if the header is passed to a compiler other than clang. Reviewed By: tra, emankov Differential Revision: https://reviews.llvm.org/D107492	2021-08-05 18:46:57 +01:00
Dave Airlie	14cb67862a	[OpenCL] allow generic address and non-generic defs for CL3.0 This allows both sets of definitions to exist on CL 3.0 Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D107318	2021-08-05 07:32:45 +10:00
Pushpinder Singh	f3eb5f900d	[AMDGPU][OpenMP] Wrap amdgcn declare variant inside ifdef This fixes the issue https://bugs.llvm.org/show_bug.cgi?id=51337 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D107468	2021-08-04 15:24:46 +00:00
Pushpinder Singh	713a5d12cd	[OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next patch. Reviewed By: JonChesterfield, jdoerfert, scchan Differential Revision: https://reviews.llvm.org/D104904	2021-08-02 14:38:52 +00:00
Hans Wennborg	12dc13b73c	prfchwintrin.h: Make _m_prefetchw take a pointer to volatile (PR49124) For some reason, Microsoft declares _m_prefetch to take a const void, but _m_prefetchw to take a /volatile/ const void. Do the same for compatibility. Differential revision: https://reviews.llvm.org/D106790	2021-08-02 15:16:04 +02:00
Jon Chesterfield	7f97ddaf8a	Revert "[OpenMP][AMDGCN] Initial math headers support" Broke nvptx compilation on files including <complex> This reverts commit `12da97ea10`.	2021-07-30 22:07:00 +01:00
Nemanja Ivanovic	9019b55b60	[PowerPC] Fix byte ordering of ld/st with length on BE The builtins vec_xl_len_r and vec_xst_len_r actually use the wrong side of the vector on big endian Power9 systems. We never spotted this before because there was no such thing as a big endian distro that supported Power9. Now we have AIX and the elements are in the wrong part of the vector. This just fixes it so the elements are loaded to and stored from the right side of the vector.	2021-07-30 14:37:24 -05:00
Pushpinder Singh	12da97ea10	[OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next patch. Reviewed By: JonChesterfield, jdoerfert, scchan Differential Revision: https://reviews.llvm.org/D104904	2021-07-30 14:52:41 +00:00
Dave Airlie	3c7d2f1b67	[OpenCL] opencl-c.h: add CL 3.0 non-generic address space atomics CL 2.0 introduced atomics and generic address space so there were only one set of APIs for doing atomics, however since CL 3.0 makes generic address space optional, there has to be new sets of atomic interfaces to handle that cases. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D106778	2021-07-30 14:46:47 +10:00
Thomas Lively	33786576fd	[WebAssembly] Codegen for extmul SIMD instructions Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions with normal codegen patterns. Differential Revision: https://reviews.llvm.org/D106724	2021-07-27 08:41:30 -07:00
Anastasia Stulova	e5f47eedeb	[OpenCL] NULL redefined as nullptr in C++ mode. Redefines NULL as nullptr instead of ((void*)0) in C++ for OpenCL. Such internal representation of NULL provides compatibility with C++11 and later language standards. Patch by Topotuna (Justas Janickas)! Differential Revision: https://reviews.llvm.org/D105987	2021-07-27 16:33:50 +01:00
Nemanja Ivanovic	1c50a5da36	[PowerPC] Implement partial vector ld/st builtins for XL compatibility XL provides functions __vec_ldrmb/__vec_strmb for loading/storing a sequence of 1 to 16 bytes in big endian order, right justified in the vector register (regardless of target endianness). This is equivalent to vec_xl_len_r/vec_xst_len_r which are only available on Power9. This patch simply uses the Power9 functions when compiled for Power9, but provides a more general implementation for Power8. Differential revision: https://reviews.llvm.org/D106757	2021-07-26 13:19:52 -05:00
Qiu Chaofan	240dde9482	[PowerPC] Change altivec indexed load/store builtins argument type This patch changes the index argument of lvxl?/lve[bhw]x and stvxl?/stve[bhw]x builtins from int to long. Because on 64-bit subtargets, an extra extsw will always been generated, which is incorrect. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D106530	2021-07-27 00:26:50 +08:00
Ulrich Weigand	8cd8120a7b	[SystemZ] Add support for new cpu architecture - arch14 This patch adds support for the next-generation arch14 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch14 as host processor. - Assembler/disassembler support for new instructions. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10304. Note: No currently available Z system supports the arch14 architecture. Once new systems become available, the official system name will be added as supported -march name.	2021-07-26 16:57:28 +02:00
Dave Airlie	9451403c5f	[OPENCL] opencl-c.h: add initial CL 3.0 conditionals for atomic operations. This adds the optional wrappers around things, however this isn't sufficient yet for CL 3.0 without generic address space, I've got one more additional patch to add all those APIs, but this is an easier to review precursor. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D106111	2021-07-26 11:06:33 +10:00
Thomas Lively	85157c0079	[WebAssembly] Codegen for pmin and pmax Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax} with standard codegen patterns. Since wasm_simd128.h uses an integer vector as the standard single vector type, the IR for the pmin and pmax intrinsic functions contains bitcasts that would not be there otherwise. Add extra codegen patterns that can still select the pmin and pmax instructions in the presence of these bitcasts. Differential Revision: https://reviews.llvm.org/D106612	2021-07-23 14:49:21 -07:00
Anastasia Stulova	5c63bf3abd	[OpenCL] Add NULL to standards prior to v2.0. NULL was undefined in OpenCL prior to version 2.0. However, the language specification states that "macro names defined by the C99 specification but not currently supported by OpenCL are reserved for future use". Therefore, application developers cannot redefine NULL. The change is supposed to resolve inconsistency between language versions. Currently there is no apparent reason why NULL should be kept undefined. Patch by Topotuna (Justas Janickas)! Differential Revision: https://reviews.llvm.org/D105988	2021-07-23 11:54:36 +01:00
Sven van Haastregt	989bedec7a	[OpenCL] Add cl_khr_integer_dot_product Add the builtins defined by Section 42 "Integer dot product" in the OpenCL Extension Specification. Differential Revision: https://reviews.llvm.org/D106434	2021-07-23 10:10:16 +01:00
namazso	91bc85b1eb	[MS] Preserve base register %esi around movs[bwl] fix for behavior reported in https://bugs.llvm.org/show_bug.cgi?id=51100 workaround for root cause https://bugs.llvm.org/show_bug.cgi?id=16830 similar to https://reviews.llvm.org/D101338 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106210	2021-07-23 16:28:32 +08:00
Aaron En Ye Shi	9ce931bd71	[HIP] Fix no matching constructor for init of shared_ptr and malloc Allow standard header versions of malloc and free to be defined before introducing the device versions. Fixes: SWDEV-295901 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D106463	2021-07-22 14:32:41 +00:00
Thomas Lively	db7efcab7d	[WebAssembly] Remove clang builtins for extract_lane and replace_lane These builtins were added to capture the fact that the underlying Wasm instructions return i32s and implicitly sign or zero extend the extracted lanes in the case of the i8x16 and i16x8 variants. But we do sufficient optimizations during code gen that these low-level details do not need to be exposed to users. This commit replaces the use of the builtins in wasm_simd128.h with normal target-independent vector code. As a result, we can switch the relevant intrinsics to use functions rather than macros and can use more user-friendly return types rather than trying to precisely expose the underlying Wasm types. Note, however, that the generated LLVM IR is no different after this change. Differential Revision: https://reviews.llvm.org/D106500	2021-07-21 16:11:00 -07:00
Yaxun (Sam) Liu	db5f100fe4	[HIP] Remove workaround in __clang_hip_runtime_wrapper.h Remove the workaround for -fopenmp in __clang_hip_runtime_wrapper.h since it causes device functions in HIP wrapper headers disabled when compiling HIP program with -fopenmp. Reviewed by: Aaron Enye Shi, Jon Chesterfield Differential Revision: https://reviews.llvm.org/D106070	2021-07-21 15:16:28 -04:00
Jon Chesterfield	d71062fbda	Revert "[OpenMP][AMDGCN] Initial math headers support" This reverts commit `968899ad9c`.	2021-07-21 17:35:40 +01:00
Pushpinder Singh	968899ad9c	[OpenMP][AMDGCN] Initial math headers support With this patch, OpenMP on AMDGCN will use the math functions provided by ROCm ocml library. Linking device code to the ocml will be done in the next patch. Reviewed By: JonChesterfield, jdoerfert, scchan Differential Revision: https://reviews.llvm.org/D104904	2021-07-21 16:15:39 +01:00
Sven van Haastregt	724f0e2abb	[OpenCL] Add cl_khr_extended_bit_ops Add the builtins defined by Section 40 "Extended Bit Operations" in the OpenCL Extension Specification. Differential Revision: https://reviews.llvm.org/D106267	2021-07-21 10:01:19 +01:00
Jon Chesterfield	3e649f8ef1	[openmp][nfc] Simplify macros guarding math complex headers The `__CUDA__` macro is already defined for openmp/nvptx and is not used by `__clang_cuda_complex_builtins.h`, so dropping that macro slightly simplifies nvptx and avoids defining it on amdgcn (where it is likely to be harmful). Also dropped a cplusplus test from a C++ header as compilation will have failed on cmath earlier if it was included from C. Reviewed By: jdoerfert, fodinabor Differential Revision: https://reviews.llvm.org/D105221	2021-07-18 23:30:35 +01:00
Stefan Pintilie	0bf4b81d57	[Clang] Add an empty builtins.h file. On Power PC some legacy compilers included a number of builtins in a builtins.h header file. While this header file is not required to hold builtins for clang some legacy code does try to include this file and so this patch provides an empty version of that file. Differential Revision: https://reviews.llvm.org/D106065	2021-07-16 12:50:04 -05:00
Dave Airlie	de79ba9f9a	[OpenCL] opencl-c.h: CL3.0 generic address space This is one of the easier pieces of adding CL3.0 support. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D105526	2021-07-15 10:51:04 +10:00
Dave Airlie	090f007e34	[OpenCL][NFC] opencl-c.h: reorder atomic operations This just reorders the atomics, it doesn't change anything except their layout in the header. This is a prep patch for adding some conditionals around these for CL3.0 but that patch is much easier to review if all the atomic operations are grouped together like this. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D105601	2021-07-15 10:48:44 +10:00
Thomas Lively	4a4229f70f	[WebAssembly] Codegen for v128.storeX_lane instructions Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435. Differential Revision: https://reviews.llvm.org/D106019	2021-07-14 16:15:25 -07:00
Thomas Lively	970e090010	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Thomas Lively	cbabfc63b1	[WebAssembly] Custom combines for f32x4.demote_zero_f64x2 Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern. Differential Revision: https://reviews.llvm.org/D105755	2021-07-12 10:32:18 -07:00
Bardia Mahjour	2071ce9d45	[Altivec] Use signed comparison for vec_all_* and vec_any_* interfaces We are currently being inconsistent in using signed vs unsigned comparisons for vec_all_* and vec_any_* interfaces that use vector bool types. For example we use signed comparison for vec_all_ge(vector signed char, vector bool char) but unsigned comparison for when the arguments are swapped. GCC and XL use signed comparison instead. This patch makes clang consistent with itself and with XL and GCC. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D105666	2021-07-12 11:41:16 -04:00
Nemanja Ivanovic	84e429693f	[PowerPC] Fix rounding mode for vec_round in altivec.h The function is supposed to be the equivalent of rint() (as in round to nearest, ties to even) rather than round() (round to nearest, ties away from zero). In fact, the instruction we emit without VSX is vrfin which is correct. However, with VSX we emit xvrspi which is the equivalent of round() and therefore incorrect. Since there is no equivalent VSX instruction, simply use vrfin regardless of availability of VSX.	2021-07-12 06:11:27 -05:00
Nemanja Ivanovic	41ce5ec5f6	[PowerPC] Remove unnecessary 64-bit guards from altivec.h A number of functions in the header have guards for 64-bit only that were presumably added as some of the functions in the blocks use vector __int128 which is only available in 64-bit mode. A more appropriate guard (__SIZEOF_INT128__) has been added for those functions since, making the 64-bit guards redundant. This patch removes those guards as they inadvertently guard code that uses vector long long which does not actually require 64-bit mode.	2021-07-12 04:59:00 -05:00
Thomas Lively	e5220104d0	[WebAssembly] Custom combines for f64x2.promote_low_f32x4 Replace the clang builtin function and LLVM intrinsic previously used to select the f64x2.promote_low_f32x4 instruction with custom combines from standard SelectionDAG nodes. Implement the new combines to share code with the similar combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232. Differential Revision: https://reviews.llvm.org/D105675	2021-07-09 18:59:29 -07:00
Aaron En Ye Shi	ccb10266f5	[HIP] Move std headers after device malloc/free Set the device malloc and free functions as weak, and move the std headers after device malloc/free to avoid issues with std malloc/free. Fixes: SWDEV-293590 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D105707	2021-07-09 21:20:16 +00:00
Joachim Meyer	5d689cf2a6	[NFC][CUDA] Fix order of round(f) definition in __clang_cuda_math.h for non-LP64. This broke ARM builds e.g.: https://lab.llvm.org/buildbot/#/builders/187/builds/212	2021-07-02 21:55:48 +02:00
Brian Cain	28b01c59c9	[hexagon] Add {hvx,}hexagon_{protos,circ_brev...} Add definitions for Hexagon, Hexagon circular/bit-reverse and HVX intrinsics.	2021-06-30 22:58:56 -05:00
Xiang1 Zhang	6d234a6908	[X86] Zero some outputs of Keylocker intrinsics in error case Reviewed By: WangPengfei Differential Revision: https://reviews.llvm.org/D104766	2021-06-29 13:35:40 +08:00
Nemanja Ivanovic	ef906573a1	[PowerPC] Fix vec_add for 64-bit on pre-Power7 subtargets The shift of the carry was actually incorrect.	2021-06-24 18:42:44 -05:00
Ethan Stewart	5dfdc1812d	[OpenMP][AMDGCN] Apply fix for isnan, isinf and isfinite for amdgcn. This fixes issues with various return types(bool/int) and was already in place for nvptx headers, adjusted to work for amdgcn. This does not affect hip as the change is guarded with OPENMP_AMDGCN. Similar to D85879. Reviewed By: jdoerfert, JonChesterfield, yaxunl Differential Revision: https://reviews.llvm.org/D104677	2021-06-23 15:26:09 +01:00
Yaxun (Sam) Liu	186f2ac612	[HIP] Add support functions for C++ polymorphic types Add runtime functions to detect invalid calls to pure or deleted virtual functions. Patch by: Siu Chi Chan Reviewed by: Yaxun Liu Differential Revision: https://reviews.llvm.org/D104392	2021-06-21 11:41:07 -04:00
Bing1 Yu	56d5c46b49	[X86] Support __tile_stream_loadd intrinsic for new AMX interface Adding support for __tile_stream_loadd intrinsic. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D103784	2021-06-11 17:28:43 +08:00
Sven van Haastregt	d54e7b731e	[OpenCL] Add memory_scope_all_devices Add the `memory_scope_all_devices` enum value, which is restricted to OpenCL 3.0 or newer and the `__opencl_c_atomic_scope_all_devices` feature. Also guard `memory_scope_all_svm_devices` accordingly, which is already available in OpenCL 2.0. The `__opencl_c_atomic_scope_all_devices` feature is header-only, so set its define to 1 in `opencl-c-base.h`. This is done unconditionally at the moment, as the mechanism for disabling header-only options hasn't been decided yet. This patch only adds a negative test for now. Ideally adding a CL3.0 run line to atomic-ops.cl should suffice as a positive test, but we cannot do that yet until (at least) generic address spaces and program scope variables are supported in OpenCL 3.0 mode. Differential Revision: https://reviews.llvm.org/D103241	2021-06-08 11:51:12 +01:00
Stuart Brady	9b14670f3c	[OpenCL] Add const attribute to ctz() builtins Reviewed By: svenvh Differential Revision: https://reviews.llvm.org/D97725	2021-06-07 11:41:52 +01:00
Stuart Brady	86c24493ea	[OpenCL][NFC] Test commit: tidy up whitespace in comment	2021-06-04 14:44:12 +01:00
Qiu Chaofan	c0b3071833	[PowerPC] Fix x86 vector intrinsics wrapper compilation under C++ Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D103386	2021-06-01 01:19:12 +08:00
Artem Belevich	9a75c06cd9	[CUDA] Work around compatibility issue with libstdc++ 11.1.0 libstdc++ redeclares __failed_assertion multiple times and that results in the function declared with conflicting set of attributes when we include <complex> with __host__ __device__ attributes force-applied to all functions. In order to work around the issue, we rename __failed_assertion within the region with forced attributes. See https://bugs.llvm.org/show_bug.cgi?id=50383 for the details. Differential Revision: https://reviews.llvm.org/D102936	2021-05-24 11:07:09 -07:00
Yaxun (Sam) Liu	91dfd68e90	[NFC][HIP] fix comments in __clang_hip_cmath.h	2021-05-21 17:44:18 -04:00
Nemanja Ivanovic	7cd2833311	[PowerPC] Add vec_vupkhpx and vec_vupklpx for XL compatibility These are old names for these functions that XL still supports.	2021-05-14 08:02:00 -05:00
Aaron En Ye Shi	a249ffa421	[HIP] Clean up llvm intrinsics using __asm Instead of using inline asm, use clang builtins for llvm intrinsics. Differential Revision: https://reviews.llvm.org/D102427	2021-05-13 18:55:51 +00:00
Nemanja Ivanovic	39e4676ca7	[PowerPC] Provide doubleword vector predicate form comparisons on Power7 There are two reasons this shouldn't be restricted to Power8 and up: 1. For XL compatibility 2. Because clang will expand comparison operators to these intrinsics* *Without this patch, the following causes a selection error: int test(vector signed long a, vector signed long b) { return a < b; } This patch provides the handling for the intrinsics in the back end and removes the Power8 guards from the predicate functions (vec_{all\|any}_{eq\|ne\|gt\|ge\|lt\|le}).	2021-05-13 04:56:56 -05:00
Anastasia Stulova	58d18dde5c	[OpenCL] Remove pragma requirement from Arm dot extension. This removed the pointless need for extension pragma since it doesn't disable anything properly and it doesn't need to enable anything that is not possible to disable. The change doesn't break existing kernels since it allows to compile more cases i.e. without pragma statements but the pragma continues to be accepted. Differential Revision: https://reviews.llvm.org/D100985	2021-05-12 16:25:33 +01:00
Thomas Lively	1e9c39a3f9	[WebAssembly] Use functions instead of macros for const SIMD intrinsics To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018	2021-05-07 11:50:19 -07:00
Thomas Lively	b198b9b897	[WebAssembly] Fix argument types in SIMD narrowing intrinsics The builtins were updated to take signed parameters in `627a526955`, but the intrinsics that use those builtins were not updated as well. The intrinsic test did not catch this sign mismatch because it is only reported as an error under -fno-lax-vector-conversions. This commit fixes the type mismatch and adds -fno-lax-vector-conversions to the test to catch similar problems in the future. Differential Revision: https://reviews.llvm.org/D101979	2021-05-06 10:07:45 -07:00
Nemanja Ivanovic	1faf3b195e	[PowerPC] Re-commit `ed87f512bb` This was reverted in `3761b9a234` just as I was about to commit the fix. This patch includes the necessary fix.	2021-05-06 09:50:12 -05:00
Nico Weber	3761b9a234	Revert "[PowerPC] Provide some P8-specific altivec overloads for P7" This reverts commit `ed87f512bb`. Breaks check-clang, see e.g. https://lab.llvm.org/buildbot/#/builders/139/builds/3818	2021-05-06 10:01:16 -04:00
Nemanja Ivanovic	ed87f512bb	[PowerPC] Provide some P8-specific altivec overloads for P7 This adds additional support for XL compatibility. There are a number of functions in altivec.h that produce a single instruction (or a very short sequence) for Power8 but can be done on Power7 without scalarization. XL provides these implementations. This patch adds the following overloads for doubleword vectors: vec_add vec_cmpeq vec_cmpgt vec_cmpge vec_cmplt vec_cmple vec_sl vec_sr vec_sra	2021-05-06 08:37:36 -05:00
Johannes Doerfert	5d8d994dfb	[OpenMP] Make sure classes work on the device as they do on the host We do provide `operator delete(void*)` in `<new>` but it should be available by default. This is mostly boilerplate to test it and the unconditional include of `<new>` in the header we always include on the device. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100620	2021-05-06 02:10:30 -05:00
Thomas Lively	81fce29d6e	[WebAssembly] Add SIMD const_splat intrinsics These intrinsics do not correspond to their own underlying instruction, but are a convenience for the common case of materializing a constant vector that has the same value in each lane. Differential Revision: https://reviews.llvm.org/D101885	2021-05-05 13:46:45 -07:00
Thomas Lively	602f318cfd	[WebAssembly] Fix constness of pointer params to load intrinsics Update the SIMD builtin load functions to take pointers to const data and update the intrinsics themselves to not cast away constness. Differential Revision: https://reviews.llvm.org/D101884	2021-05-05 13:16:56 -07:00
Nemanja Ivanovic	bfd60b36f8	[PowerPC] Add floating point overloads for vec_sldw These are added for compatibility with XLC.	2021-04-30 20:29:03 -05:00
Nemanja Ivanovic	c3da07d216	[PowerPC] Provide fastmath sqrt and div functions in altivec.h This adds the long overdue implementations of these functions that have been part of the ABI document and are now part of the "Power Vector Intrinsic Programming Reference" (PVIPR). The approach is to add new builtins and to emit code with the fast flag regardless of whether fastmath was specified on the command line. Differential revision: https://reviews.llvm.org/D101209	2021-04-30 19:17:48 -05:00
Anastasia Stulova	3ec82e5195	[OpenCL] Prevent adding vendor extensions for all targets Removed extension begin/end pragma as it has no effect and it is added unconditionally for all targets. Differential Revision: https://reviews.llvm.org/D92244	2021-04-30 14:42:51 +01:00
Wang, Pengfei	e0c7db7d8c	[MS] Preserve base register %rbx around cpuid This patch copies implementation from cpuid.h, which preserve base register %rbx around cpuid. It fixes PR50133. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101338	2021-04-30 10:16:25 +08:00
Luo, Yuanke	d6c6db2fea	[X86][AMX] Add description for AMX new interface. Differential Revision: https://reviews.llvm.org/D101059	2021-04-27 16:05:11 +08:00
Thomas Lively	502f54049d	[WebAssembly] Finalize wasm_simd128.h intrinsics Adds new intrinsics for instructions that are in the final SIMD spec but did not previously have intrinsics. Also updates the names of existing intrinsics to reflect the final names of the underlying instructions in the spec. Keeps the old names as deprecated functions to ease the transition to the new names. Differential Revision: https://reviews.llvm.org/D101112	2021-04-23 13:37:27 -07:00
Nemanja Ivanovic	19b29b1ed1	[PowerPC] Provide XL-compatible builtins in altivec.h There are some interfaces in altivec.h that are not compatible between Clang and XL (although Clang is compatible with GCC). Currently, we have found 3 but there may be others. Clang/GCC signatures: vector double vec_ctf(vector signed long long) vector double vec_ctf(vector unsigned long long) vector signed long long vec_cts(vector double) vector unsigned long long vec_ctu(vector double) XL signatures: vector float vec_ctf(vector signed long long) vector float vec_ctf(vector unsigned long long) vector signed int vec_cts(vector double) vector unsigned int vec_ctu(vector double) This patch provides the XL behaviour under the __XL_COMPAT_ALTIVEC__ macro for users that rely on XL behaviour. Differential revision: https://reviews.llvm.org/D101130	2021-04-23 15:13:46 -05:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Wang, Pengfei	e8bce83996	[X86] Enable compilation of user interrupt handlers. Add __uintr_frame structure and use UIRET instruction for functions with x86 interrupt calling convention when UINTR is present. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D99708	2021-04-23 11:43:57 +08:00
Yaxun (Sam) Liu	8baba6890d	[HIP] Support overloaded math functions for hipRTC Remove the dependence on standard C++ header for overloaded math functions in HIP header since standard C++ header is not available for hipRTC. Reviewed by: Artem Belevich, Justin Lebar Differential Revision: https://reviews.llvm.org/D100794	2021-04-22 19:06:51 -04:00
Nemanja Ivanovic	7a5641d651	[PowerPC] Add missing casts for vec_xlds and vec_load_splats The previous commits just missed some pointer casts and ended up producing warnings.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	1cc1d9db28	[PowerPC] Add vec_vclz as an alias for vec_cntlz in altivec.h Another addition for compatibility with XLC. The functions have the same overloads so just add it as a preprocessor define.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	e43963db24	[PowerPC] Add vec_load_splats to altivec.h Add these overloads for compatibility with XLC. This is a word load-and-splat.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	a0e6189712	[PowerPC] Add vec_xlds to altivec.h Add these overloads for compatibility with XLC. This is a doubleword load-and-splat.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	a1d325af67	[PowerPC] Add vec_roundz as alias for vec_trunc in altivec.h Add the overloads for compatibility with XLC.	2021-04-22 10:31:00 -05:00
Nemanja Ivanovic	1550c47c18	[PowerPC] Add vec_roundp as alias for vec_ceil Add the overloads for compatibility with XLC.	2021-04-22 10:30:59 -05:00
Nemanja Ivanovic	51692c6c63	[PowerPC] Add missing VSX guard for vec_roundm with vector double The guard was missed in the previous commit.	2021-04-22 10:30:59 -05:00
Nemanja Ivanovic	3a46667059	[PowerPC] Add vec_roundm as alias for vec_floor in altivec.h Add the overloads for compatibility with XLC.	2021-04-22 10:30:59 -05:00
Nemanja Ivanovic	3bcd0ece43	[PowerPC] Add vec_roundc as alias for vec_rint in altivec.h For compatibility with XLC, add these overloads.	2021-04-22 05:31:38 -05:00
Liu, Chen3	72e4bf12ee	[X86] Support some missing intrinsics Support for _mm512_i32logather_pd, _mm512_mask_i32logather_pd, _mm512_i32logather_epi64, _mm512_mask_i32logather_epi64, _mm512_i32loscatter_pd, _mm512_mask_i32loscatter_pd, _mm512_i32loscatter_epi64, _mm512_mask_i32loscatter_epi64. Differential Revision: https://reviews.llvm.org/D100368	2021-04-21 10:50:37 +08:00
Yaxun (Sam) Liu	d8805574c1	[CUDA][HIP] Allow non-ODR use of host var in device Reviewed by: Artem Belevich, Richard Smith Differential Revision: https://reviews.llvm.org/D98193	2021-04-19 14:45:24 -04:00
Yaxun (Sam) Liu	6823af0ca8	[HIP] Support hipRTC in header hipRTC compiles HIP device code at run time. Since the system may not have development tools installed, when a HIP program is compiled through hipRTC, there is no standard C or C++ header available. As such, the HIP headers should not depend on standard C or C++ headers when used with hipRTC. Basically when hipRTC is used, HIP headers only provide definitions of HIP device API functions. This is in line with what nvRTC does. This patch adds support of hipRTC to HIP headers in clang. Basically hipRTC defines a macro __HIPCC_RTC__ when compiling HIP code at run time. When this macro is defined, HIP headers do not include standard C/C++ headers. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D100652	2021-04-17 11:34:52 -04:00
Sven van Haastregt	35bc7569f8	[OpenCL] Add as_size/ptrdiff/intptr/uintptr_t operators size_t and friends are built-in scalar data types and s6.4.4.2 of the OpenCL C Specification says the as_type() operator must be available for these data types. Differential Revision: https://reviews.llvm.org/D98959	2021-04-07 10:16:41 +01:00
Yaxun (Sam) Liu	85ff35a952	[HIP] remove overloaded abs in header This function seems to be introduced by accident by `aa2b593f14` Such overloaded abs function did not exist before the refactoring, and does not exist in https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_cmath.h Conceptually it also does not make sense, since it adds something like double abs(int x) { return ::abs((double)x); } It caused regressions in CuPy. Reviewed by: Aaron Enye Shi, Artem Belevich Differential Revision: https://reviews.llvm.org/D99738	2021-04-01 12:23:29 -04:00
Sven van Haastregt	b5995fced4	[OpenCL] Limit popcount to OpenCL 1.2 and above s6.15.3 of the OpenCL C Specification v3.0.6 states that OpenCL 1.2 or newer is required.	2021-03-31 09:54:18 +01:00
Craig Topper	3fb40ce167	[X86] Don't define vpclmulqdq or vaes intrinsics in the headers unless avx512fintrin.h has been included. The intrinsics won't compile unless avx512fintrin.h has declared the 512 bit types.	2021-03-28 11:26:30 -07:00
Zakk Chen	821547cabb	[RISCV][Clang] Update new overloading rules for RVV intrinsics. RVV intrinsics have a new overloading rule; please see `82aac7dad4` Changed: 1. Rename `generic` to `overloaded` because the new rule does not use C11 generic. 2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations support the overloading API. 3. Add more overloaded tests because the overloading rule changed. Differential Revision: https://reviews.llvm.org/D99189	2021-03-28 09:04:35 -07:00
Nemanja Ivanovic	06411edb9f	[PowerPC][NFC] Provide legacy names for VSX loads and stores Before we unified the names of the builtins across all the compilers, there were a number of synonyms between them. There is code out there that uses XL naming for some of these loads and stores. This just adds those names.	2021-03-25 06:32:40 -05:00
Nemanja Ivanovic	4020932706	[PowerPC] Make altivec.h work with AIX which has no __int128 There are a number of functions in altivec.h that use vector __int128 which isn't supported on AIX. Those functions need to be guarded for targets that don't support the type. Furthermore, the functions that produce quadword instructions without using the type need a builtin. This patch adds the macro guards to altivec.h using the __SIZEOF_INT128__ which is only defined on targets that support the __int128 type.	2021-03-24 00:35:51 -05:00
Nemanja Ivanovic	4146864735	[PowerPC][NFC] Use valid type for offset in altivec.h We currently use signed long long instead of ptrdiff_t for offsets in altivec.h. This has never really presented a problem because all platforms where we use these are 64-bit. However, now that we have 32-bit targets, we need to use a meaningful type.	2021-03-23 08:45:37 -05:00
Nemanja Ivanovic	2f782a796a	[PowerPC] Add more missing overloads to altivec.h Add overloads that perform subtraction on v1i128 that take and produce vector unsigned char to avoid needing to use __int128. The overloads are suffixed with _u128 and are needed for targets where __int128 isn't supported (AIX).	2021-03-23 05:52:36 -05:00
Sven van Haastregt	1c6521a0dd	[OpenCL] Remove mixed signedness atomic_fetch_ from opencl-c.h The OpenCL C specification v3.0.6 s6.15.12.7.5 mentions: For atomic_fetch and modify functions with key = or, xor, and, min and max on atomic type atomic_intptr_t, M is intptr_t, and on atomic type atomic_uintptr_t, M is uintptr_t. Remove the atomic_fetch_* overloads from opencl-c.h that mix intptr_t and uintptr_t in the same declaration. Differential Revision: https://reviews.llvm.org/D98418	2021-03-23 10:20:13 +00:00
Nemanja Ivanovic	54e4654f04	[PowerPC] Add more missing overloads to altivec.h Add overloads that perform addition on v1i128 that take and produce vector unsigned char to avoid needing to use __int128. The overloads are suffixed with _u128 and are needed for targets where __int128 isn't supported (AIX).	2021-03-23 05:09:19 -05:00
Nemanja Ivanovic	10cc5bcd86	[PowerPC] Add more missing overloads to altivec.h Add vec_permi as a synonym for vec_xxpermdi (but only for doubleword vectors).	2021-03-22 23:09:41 -05:00
Nemanja Ivanovic	b5e96e0ad6	[PowerPC] Add more missing overloads to altivec.h Add vec_gbb as a synonym for vec_vgbbd but for doubleword vectors.	2021-03-22 22:25:28 -05:00
Nemanja Ivanovic	d8e574c8e6	[PowerPC] Add more missing overloads to altivec.h Add vec_cvf as a synonym for vec_doublee/vec_floate.	2021-03-22 22:08:43 -05:00

2012 Commits