llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Liao	6fe902daf9	[cuda] Add address space predicate functions. - Add the missing NVVM predicate builtins on address space checking - Redefine them as pure functions so that they could be used in __builtin_assume. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D112053	2021-10-19 16:20:14 -04:00
Artem Belevich	f526ee5b85	[CUDA] Provide address space conversion builtins. CUDA-11 headers rely on these NVCC builtins. Despite having `__nv` prefix, those are not provided by libdevice. Differential Revision: https://reviews.llvm.org/D111665	2021-10-12 14:56:39 -07:00
Sven van Haastregt	544d89e847	[OpenCL] Add atomic_half type builtins Add atomic_half types and builtins operating on the types from the cl_ext_float_atomics extension. Patch by Haonan Yang. Differential Revision: https://reviews.llvm.org/D109740	2021-10-12 10:45:30 +01:00
Qiu Chaofan	2fc0d439a4	[Clang] [PowerPC] Fix header include typo in smmintrin.h The SSE4 header (smmintrin.h) should include SSSE3 (tmmintrin.h) instead of SSE2 (emmintrin.h). Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D111482	2021-10-11 10:44:08 +08:00
Amy Kwan	03bfddae50	[NFC] Update vec_extract builtin signatures to take signed int. This patch updates the vec_extract builtins to take a signed int as the second parameter, as defined by the Power Vector Intrinsics Programming Reference. This patch is NFC and all existing tests pass. Differential Revision: https://reviews.llvm.org/D110935	2021-10-08 15:09:53 -05:00
Artem Belevich	29e00b29f7	[CUDA] Make sure <string.h> is included with original __THROW defined. Otherwise we may end up with an inconsistent redeclarations of the standard library functions if _FORTIFY_SOURCE is in effect. https://bugs.llvm.org/show_bug.cgi?id=47869 Differential Revision: https://reviews.llvm.org/D110781	2021-10-07 11:43:56 -07:00
Amy Kwan	74b1ac7155	[NFC] Update return type of vec_popcnt to vector unsigned. This patch updates the vec_popcnt builtins to return vector unsigned, as defined by the Power Vector Intrinsics Programming Reference. This patch is NFC and all existing tests pass. Differential Revision: https://reviews.llvm.org/D110934	2021-10-07 11:33:19 -05:00
Artem Belevich	6707a7d7e9	[CUDA] remove unneeded includes from CUDA-related headers. This should fix bot failures on PPC and windows.	2021-10-06 17:20:21 -07:00
Artem Belevich	ccfb0555f7	[CUDA] Implement experimental support for texture lookups. The patch implements header-only support for texture lookups. The patch has been tested on a source file with all possible combinations of argument types supported by CUDA headers, compiled and verified that the generated instructions and their parameters match the code generated by NVCC. Unfortunately, compiling texture code requires CUDA headers and can't be tested in clang itself. The test will need to be added to the test-suite later. While generated code compiles and seems to match NVCC, I do not have any code that uses textures that I could test correctness of the implementation. Hence the experimental status. Differential Revision: https://reviews.llvm.org/D110089	2021-10-06 15:15:53 -07:00
Nico Weber	f9457f1f88	[clang] Don't mark _ReadBarrier, _ReadWriteBarrier, _WriteBarrier deprecated It's true that docs.microsoft.com says: """The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier compiler intrinsics and the MemoryBarrier macro are all deprecated and should not be used. For inter-thread communication, use mechanisms such as atomic_thread_fence and std::atomic<T>, which are defined in the C++ Standard Library. For hardware access, use the /volatile:iso compiler option together with the volatile keyword.""" And these attributes have been here since these builtins were added in r192860. However: - cl.exe does not warn on them even with /Wall - none of the replacements are useful for C code - we don't add __attribute__((__deprecated__())) to any other declarations in intrin.h - intrin0.h in the MSVC headers declares _ReadWriteBarrier() (but without the deprecation attribute), so you get inconsistent deprecation warnings depending on if you include intrin.h or intrin0.h The motivation is that compiling sqlite.h with clang-cl produces a deprecation warning with clang-cl for _ReadWriteBarrier(), but not with cl.exe. Differential Revision: https://reviews.llvm.org/D111232	2021-10-06 10:50:02 -04:00
Albion Fung	13d3cd37e2	[PowerPC] Implement vector float and vector double version for vec_orc builtin The builtin for vec_orc has support for the following two signatures, but currently the compiler marks it ambiguous: vector float vec_orc(vector float, vector float) vector double vec_orc(vector double, vector double) This patch implements these two builtins. Differential revision: https://reviews.llvm.org/D110858	2021-10-06 02:47:42 -05:00
Lei Huang	8b3d944a97	[PowerPC] Disable vector types when not supported by subtarget features Update clang to treat vector unsigned long long and friends as invalid for AltiVec without VSX. Reported in: https://bugs.llvm.org/show_bug.cgi?id=47782 Reviewed By: nemanjai, amyk Differential Revision: https://reviews.llvm.org/D109178	2021-10-04 14:16:47 -05:00
Nemanja Ivanovic	369d785574	[PowerPC] Optimal sequence for doubleword vec_all_{eq|ne} on Power7 These builtins produce inefficient code for CPU's prior to Power8 due to vcmpequd being unavailable. The predicate forms can actually leverage the available vcmpequw along with xxlxor to produce a better sequence.	2021-10-01 08:27:15 -05:00
Nemanja Ivanovic	fad14a17a4	[PowerPC] Truncate element index for vec_insert in altivec.h When a user specifies an out-of-range index for vec_insert, we just produce IR that has undefined behaviour even though the documentation states that modulo arithmetic is used. This patch just truncates the value to a valid index.	2021-09-30 05:58:22 -05:00
Nemanja Ivanovic	09b67aa1c3	[PowerPC] Implement builtin for vbpermd The instruction has similar semantics to vbpermq but for doublewords. It was added in Power9 and the ABI documents the builtin. Differential revision: https://reviews.llvm.org/D107899	2021-09-29 06:34:31 -05:00
Wang, Pengfei	7d6889964a	[X86][FP16] Add more builtins to avoid multi evaluation problems & add 2 missed intrinsics Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110336	2021-09-27 09:27:04 +08:00
Quinn Pham	f9912fe4ea	[PowerPC] Add range checks for P10 Vector Builtins This patch adds range checking for some Power10 altivec builtins and changes the signature of a builtin to match documentation. For `vec_cntm`, range checking is done via SemaChecking. For `vec_splati_ins`, the second argument is masked to extract the 0th bit so that we always receive either a `0` or a `1`. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D109710	2021-09-23 11:05:49 -05:00
Wang, Pengfei	ebec077e07	[X86][FP16] Change the order of the operands in complex FMA intrinsics to allow swap between the mul operands. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D109658	2021-09-23 11:02:48 +08:00
Albion Fung	b93359ea3f	[PowerPC] Support for vector bool int128 on vector comparison builtins This patch implements support for the type vector bool int128 for arguments on vector comparison builtins listed below, which would otherwise crash due to ambiguity. The following builtins are added: vec_all_eq (vector bool __int128, vector bool __int128) vec_all_ne (vector bool __int128, vector bool __int128) vec_any_eq (vector bool __int128, vector bool __int128) vec_any_ne (vector bool __int128, vector bool __int128) vec_cmpne(vector bool __int128 a, vector bool __int128 b) vec_cmpeq(vector bool __int128 a, vector bool __int128 b) Differential revision: https://reviews.llvm.org/D110084	2021-09-21 16:29:37 -05:00
Justas Janickas	228dd20c3f	[OpenCL] Supports atomics in C++ for OpenCL 2021 Atomics in C++ for OpenCL 2021 are now handled the same way as in OpenCL C 3.0. This is a header-only change. Differential Revision: https://reviews.llvm.org/D109424	2021-09-20 16:24:30 +01:00
serge-sans-paille	9aeecdfa8e	Check supported architectures in sseXYZ/avxXYZ headers It doesn't make sense to include those headers on the wrong architecture, provide an explicit error message in that case. Fix https://bugs.llvm.org/show_bug.cgi?id=48915 Differential Revision: https://reviews.llvm.org/D109686	2021-09-14 09:57:54 +02:00
Sven van Haastregt	d353d1c501	[OpenCL] Support cl_ext_float_atomics See https://github.com/KhronosGroup/OpenCL-Docs/pull/552 for initial specification. Patch by Haonan Yang. Differential Revision: https://reviews.llvm.org/D106343	2021-09-13 12:12:40 +01:00
Xiang1 Zhang	c81d6ab875	[X86] Adjust Keylocker handle mem size Reviewed By: Topper Craig Differential Revision: https://reviews.llvm.org/D109488	2021-09-13 18:03:27 +08:00
Xiang1 Zhang	bdce8d40c6	Revert "[X86] Adjust Keylocker handle mem size" This reverts commit `3731de6b7f`.	2021-09-13 18:00:46 +08:00
Xiang1 Zhang	3731de6b7f	[X86] Adjust Keylocker handle mem size Reviewed By: Topper Craig Differential Revision: https://reviews.llvm.org/D109354	2021-09-13 17:59:33 +08:00
Wang, Pengfei	2aaa6466fe	[X86] Support *_set1_pch(Float16 _Complex h) Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D109487	2021-09-11 17:47:31 +08:00
Joseph Huber	f28e710db7	[OpenMP] Make CUDA math library functions SPMD amenable This patch adds the SPMD amenable assumption to the CUDA math library definitions in Clang. Previously these functions would block SPMD execution on the device because they're intrinsic calls into the library and can't be calculated. These functions don't have side-effects so they are safe to execute in SPMD mode. Depends on D105937 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108958	2021-09-10 14:52:45 -04:00
Simon Pilgrim	ea685e1028	[X86][AVX] Update _mm256_loadu2_m128* intrinsics to use _mm256_set_m128* (PR51796) As reported on PR51796, the _mm256_loadu2_m128i in particular was inserting bitcasts and shuffles with different types making it trickier for some combines, and prevented the value tracker from identifying the shuffle sequences as a single insert_subvector style concat_vectors pattern. This patch instead concatenate the 128-bit unaligned loads with _mm256_set_m128*, which was written to avoid the unnecessary bitcasts and only emits a single shuffle. Differential Revision: https://reviews.llvm.org/D109497	2021-09-09 19:15:48 +01:00
Simon Pilgrim	55d9396278	[X86] Move _mm256_set_m128* intrinsics before _mm256_loadu2_m128* intrinsics. NFC. This is necessary for PR51796 where we'll update _mm256_loadu2_m128* to use _mm256_set_m128*	2021-09-09 11:23:50 +01:00
Pushpinder Singh	12dcbf913c	[AMDGPU][OpenMP] Use complex definitions from complex_cmath.h Following nvptx approach, this patch uses complex function definitions from complex_cmath.h. With this patch, ovo passes 23/34 complex mathematical test cases. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D109344	2021-09-09 10:55:17 +05:30
Tianqing Wang	12fa608af4	[X86] Add CRC32 feature. `d8faf03807` implemented general-regs-only for X86 by disabling all features with vector instructions. But the CRC32 instruction in SSE4.2 ISA, which uses only GPRs, also becomes unavailable. This patch adds a CRC32 feature for this instruction and allows it to be used with general-regs-only. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D105462	2021-09-06 17:24:30 +08:00
Stuart Brady	32955be6bf	[OpenCL] Remove decls for scalar vloada_half and vstorea_half* fns These functions are not part of the OpenCL C specification. See https://github.com/KhronosGroup/OpenCL-Docs/issues/648 for a clarification regarding the vloada_half declarations. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D108761	2021-09-02 22:08:09 +01:00
Nico Weber	e5438f3868	clang/win: Add __readfsdword to intrin.h When using __readfsdword(), clang used to warn that one has to include <intrin.h> -- no matter if that was already included or not. Now it only warns if it's not yet included. To verify that this was the only intrin with this problem, I ran: $ for f in $(grep intrin.h clang/include/clang/Basic/BuiltinsX86* | egrep -o '\([^,]+,' | egrep -o '[^(,]*'); do if ! grep -q $f clang/lib/Headers/intrin.h; then echo $f; fi; done This printed 9 more functions, but those are all in emmintrin.h, xsaveintrin.h (which are included by intrin.h based on /arch: flags). So this is indeed the only built-in that was missing in intrin.h. Fixes PR51188. Differential Revision: https://reviews.llvm.org/D109085	2021-09-02 12:22:07 -04:00
Justas Janickas	fb321c2ea2	[OpenCL] Define OpenCL 3.0 optional core features in C++ for OpenCL 2021 Modifies OpenCL 3.0 optional core feature macro definitions so that they are set analogously in C++ for OpenCL 2021. This change aims to achieve compatibility between C++ for OpenCL 2021 and OpenCL 3.0. Differential Revision: https://reviews.llvm.org/D108704	2021-09-01 10:15:17 +01:00
Victor Huang	2e5c17d19e	[PowerPC][NFC] Rename P10 builtins vec_clrl, vec_clrr to vec_clr_first and vec_clr_last This patch renames the vector clear left/right builtins vec_clrl, vec_clrr to vec_clr_first and vec_clr_last to avoid the ambiguities when dealing with endianness. Reviewed By: amyk, lei Differential revision: https://reviews.llvm.org/D108702	2021-08-30 09:52:15 -05:00
Wang, Pengfei	ab40dbfe03	[X86] AVX512FP16 instructions enabling 6/6 Enable FP16 complex FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105269	2021-08-30 13:08:45 +08:00
Xiang1 Zhang	80f7ce8993	[X86] Support __SSC_MARK(const int id) Differential Revision: https://reviews.llvm.org/D108682	2021-08-30 09:55:35 +08:00
Xiang1 Zhang	4c29dc18cf	Revert "[X86] Support __SSC_MARK(const int id)" This reverts commit `78fbde5779`.	2021-08-30 09:50:26 +08:00
Xiang1 Zhang	78fbde5779	[X86] Support __SSC_MARK(const int id) Differential Revision: https://reviews.llvm.org/D108682	2021-08-30 09:21:22 +08:00
Xiang1 Zhang	fd88fac6ca	Revert "[X86] Support __SSC_MARK(const int id)" This reverts commit `83e82ff767`.	2021-08-30 09:18:27 +08:00
Xiang1 Zhang	83e82ff767	[X86] Support __SSC_MARK(const int id) Differential Revision: https://reviews.llvm.org/D108682	2021-08-30 08:51:20 +08:00
Pushpinder Singh	07e85823aa	[OpenMP][AMDGCN] Enable complex functions This patch enables basic complex functionality using the ocml builtins. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108552	2021-08-24 12:40:41 +05:30
Wang, Pengfei	c728bd5bba	[X86] AVX512FP16 instructions enabling 5/6 Enable FP16 FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105268	2021-08-24 09:07:19 +08:00
Wang, Pengfei	b088536ce9	[X86] AVX512FP16 instructions enabling 4/6 Enable FP16 unary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105267	2021-08-22 08:59:35 +08:00
Craig Topper	5cf5df8014	[X86] Add missing __inline__ to functions in amxintrin.h	2021-08-20 09:35:02 -07:00
Thomas Lively	88962cea46	[WebAssembly] Restore builtins and intrinsics for pmin/pmax Partially reverts `85157c0079`, which had removed these builtins and intrinsics in favor of normal codegen patterns. It turns out that it is possible for the patterns to be split over multiple basic blocks, however, which means that DAG ISel is not able to select them to the pmin/pmax instructions. To make sure the SIMD intrinsics generate the correct instructions in these cases, reintroduce the clang builtins and corresponding LLVM intrinsics, but also keep the normal pattern matching as well. Differential Revision: https://reviews.llvm.org/D108387	2021-08-20 09:21:31 -07:00
Thomas Lively	64a9957bf7	[WebAssembly] Make shift values unsigned in wasm_simd128.h On some platforms, negative shift values mean to shift in the opposite direction, but this is not true with WebAssembly. To avoid confusion, make the shift values in the shift intrinsics unsigned. Differential Revision: https://reviews.llvm.org/D108415	2021-08-20 09:10:37 -07:00
Thomas Lively	2456e11614	[WebAssembly] Add SIMD intrinsics using unsigned integers For each SIMD intrinsic function that takes or returns a scalar signed integer value, ensure there is a corresponding intrinsic that returns or an unsigned value. This is a convenience for users who use -Wsign-conversion so they don't have to insert explicit casts, especially when the intrinsic arguments are integer literals that fit into the unsigned integer type but not the signed type. Differential Revision: https://reviews.llvm.org/D108412	2021-08-20 08:56:51 -07:00
Thomas Lively	fd3bd63df2	[WebAssembly] Make bitmask instructions return unsigned ints Since they are bitmasks, it will be more common for them to be used and potentially extended to 64-bit integers as unsigned values rather than signed values. Differential Revision: https://reviews.llvm.org/D108401	2021-08-19 16:23:47 -07:00
Martin Storsjö	cc3affd8b0	[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64 The code is based on the same __mulh and __umulh intrinsics for x86. This should fix PR51128. Differential Revision: https://reviews.llvm.org/D106721	2021-08-19 11:29:55 +03:00

1923 Commits