llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Zuckerman	25eb420233	[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min) intrinsics to Clang. Committed after LGTM and check-all. Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs. This class of vector operation forms the basis of many scientific computations. In vector-reduction arithmetic, the evaluation of f is independent of the order of the input elements of V. Reviewers: 1. craig.topper 2. igorb Differential Revision: https://reviews.llvm.org/D25988 llvm-svn: 285493	2016-10-29 10:29:20 +00:00
Nemanja Ivanovic	931bc548e6	[PPC] add float and double overloads for vec_orc and vec_nand in altivec.h This patch corresponds to review https://reviews.llvm.org/D25950. Committing on behalf of Sean Fertile. llvm-svn: 285439	2016-10-28 20:04:53 +00:00
Nemanja Ivanovic	4f69f924df	Implement vector count leading/trailing bytes with zero lsb and vector parity builtins - clang portion This patch corresponds to review: https://reviews.llvm.org/D26002 Committing on behalf of Zaara Syeda. llvm-svn: 285436	2016-10-28 19:49:03 +00:00
Michael Zuckerman	edd99eb07a	1. Fixing small types issue (PD|PS) (reduce). 2. Cosmetic changes llvm-svn: 285405	2016-10-28 15:16:03 +00:00
Anastasia Stulova	7c30533362	[OpenCL] Diagnose variadic arguments OpenCL disallows using variadic arguments (s6.9.e and s6.12.5 OpenCL v2.0) apart from some exceptions: - printf - enqueue_kernel This change adds error diagnostic for variadic functions but accepts printf and any compiler internal function (which should cover __enqueue_kernel_XXX cases). It also unifies diagnostic with block prototype and adds missing uncaught cases for blocks. llvm-svn: 285395	2016-10-28 12:59:39 +00:00
Nemanja Ivanovic	09dd423a7d	[PPC] add vector byte reverse functions to altivec.h This patch corresponds to review https://reviews.llvm.org/D25915. Committing on behalf of Sean Fertile. llvm-svn: 285268	2016-10-27 06:23:57 +00:00
Justin Lebar	ebeeab87a1	[CUDA] Move device placement new definitions into a wrapper header. Previously, these were always included -- after this change, you have to #include <new>, which is consistent with how things ought to work. llvm-svn: 285251	2016-10-26 22:13:26 +00:00
Justin Lebar	6f5ec7ee88	[CUDA] Switch cuda_wrappers/complex to use a proper include guard instead of #pragma once. This is consistent with the rest of our internal headers. llvm-svn: 285250	2016-10-26 22:13:20 +00:00
Nemanja Ivanovic	3de0a385c9	[PowerPC] Implement vector_insert_exp builtins - clang portion This patch corresponds to review https://reviews.llvm.org/D25956. Committing on behalf of Zaara Syeda. llvm-svn: 285229	2016-10-26 19:27:11 +00:00
Nemanja Ivanovic	85a28dcc5d	[PPC] Implement vector reverse elements builtins (vec_reve) This patch corresponds to review https://reviews.llvm.org/D25906. Committing on behalf of Tony Jiang. llvm-svn: 285218	2016-10-26 18:25:45 +00:00
Craig Topper	f202365910	[AVX-512] Fix the operand order for all calls to __builtin_ia32_vfmaddss3_mask. Summary: The preserved input should be the first argument and the vector inputs should be in the same order as the intrinsics it is used to implement. Reviewers: igorb, delena Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25902 llvm-svn: 285175	2016-10-26 05:35:38 +00:00
Yaxun Liu	a49bd14843	[OpenCL] Add missing atom_xor for 64 bit to opencl-c.h Differential Revision: https://reviews.llvm.org/D25954 llvm-svn: 285125	2016-10-25 21:37:05 +00:00
Michael Zuckerman	facb37cabf	[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (Operators: +,*,&&,||) intrinsics to Clang Committed after LGTM and check-all Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs. This class of vector operation forms the basis of many scientific computations. In vector-reduction arithmetic, the evaluation of f is independent of the order of the input elements of V. Used bisection method. At each step, we partition the vector from the previous step in half, and the operation is performed on its two halves. This takes log2(n) steps where n is the number of elements in the vector. Reviewers: 1. igorb 2. craig.topper Differential Revision: https://reviews.llvm.org/D25527 llvm-svn: 285054	2016-10-25 07:56:04 +00:00
Michael Zuckerman	33bd5b235b	revert r284963 because new test file is failing in some OS. test/CodeGen/avx512-reduceIntrin.c llvm-svn: 284967	2016-10-24 11:30:23 +00:00
Michael Zuckerman	98cb041891	[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (Operators: +,*,&&,||) intrinsics to Clang Committed after LGTM and check-all Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs. This class of vector operation forms the basis of many scientific computations. In vector-reduction arithmetic, the evaluation of f is independent of the order of the input elements of V. Used bisection method. At each step, we partition the vector from the previous step in half, and the operation is performed on its two halves. This takes log2(n) steps where n is the number of elements in the vector. Differential Revision: https://reviews.llvm.org/D25527 llvm-svn: 284963	2016-10-24 10:53:20 +00:00
Craig Topper	eee7c0520c	[AVX-512] Replace masked 128/256-bit byte, word, and dword min/max builtins with selects and the older unmasked builtins. llvm-svn: 284954	2016-10-23 23:57:30 +00:00
Craig Topper	0c5da26572	[AVX-512] Replace 512-bit pmovzx/sx builtins with native IR. llvm-svn: 284936	2016-10-23 07:35:47 +00:00
Craig Topper	4ef879ac2c	[AVX-512] Remove masked 128/256-bit packss/packus builtins and replace with selects and the older unmasked builtins. llvm-svn: 284935	2016-10-23 07:35:39 +00:00
Ekaterina Romanova	06477bf035	Add more doxygen comments to emmintrin.h's intrinsics. With this patch, all intrinsics in this file (with the exception of a handful of recently added ones) will be documented. I will send out a patch for 4 missing intrinsics later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao. llvm-svn: 284934	2016-10-23 07:30:50 +00:00
Craig Topper	4d63dfc286	[AVX-512] Remove masked 128/256-bit pavg builtins and replace with select and older unmasked builtins. llvm-svn: 284929	2016-10-22 21:24:56 +00:00
Craig Topper	622c63614d	[AVX-512] Replace masked 128/256-bit saturating add/sub builtins with select and older unmasked builtins. llvm-svn: 284928	2016-10-22 21:24:52 +00:00
Craig Topper	11dda92405	[AVX-512] Replace masked 128/256-bit vpmovzx/vpmovsx builtins with native IR. llvm-svn: 284927	2016-10-22 21:24:48 +00:00
Craig Topper	eb1c0afa90	[AVX-512] Remove masked 128/256-bit pshufb builtins. Replace with a select and the older unmasked builtins. llvm-svn: 284925	2016-10-22 21:24:42 +00:00
Craig Topper	78a9c40326	[AVX-512] Remove builtins for 128/256-bit pabsb/pabsw. We can use a select and the older non-masked versions instead. llvm-svn: 284924	2016-10-22 21:24:38 +00:00
Craig Topper	c2c7e42bfe	[AVX-512] Add typecasts to alignr intrinsics that were modified in r284920. llvm-svn: 284923	2016-10-22 21:24:34 +00:00
Craig Topper	f6373bc6fd	[AVX-512] Remove masked 128/256-bit palignr builtins. We can just use a select in the header file with the older unmasked versions instead. llvm-svn: 284920	2016-10-22 18:32:33 +00:00
Ekaterina Romanova	493091fdef	Add more doxygen comments to emmintrin.h's intrinsics. With this patch, 75% of the intrinsics in this file will be documented now. The patches for the rest of the intrinsics in this file will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao. llvm-svn: 284754	2016-10-20 17:59:15 +00:00
Albert Gutowski	1deab38717	Implement __stosb intrinsic as a volatile memset Summary: We need `__stosb` to be an intrinsic, because SecureZeroMemory function uses it without including intrin.h. Implementing it as a volatile memset is not consistent with MSDN specification, but it gives us target-independent IR while keeping the most important properties of `__stosb`. Reviewers: rnk, hans, thakis, majnemer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25334 llvm-svn: 284253	2016-10-14 17:33:05 +00:00
Albert Gutowski	5e08df0266	Add 64-bit MS _Interlocked functions as builtins again Summary: Previously global 64-bit versions of _Interlocked functions broke buildbots on i386, so now I'm adding them as builtins for x86-64 and ARM only (should they be also on AArch64? I had problems with testing it for AArch64, so I left it) Reviewers: hans, majnemer, mstorsjo, rnk Subscribers: cfe-commits, aemerson Differential Revision: https://reviews.llvm.org/D25576 llvm-svn: 284172	2016-10-13 22:35:07 +00:00
Albert Gutowski	397d81bb9a	Implement MS _ReturnAddress and _AddressOfReturnAddress intrinsics Reviewers: rnk, thakis, majnemer, hans Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25540 llvm-svn: 284131	2016-10-13 16:03:42 +00:00
Yunzhong Gao	d9fa56a4fb	[NFC] Fixing the description for _mm_store_ps and _mm_store_ps1. It seems that the doxygen description of these two intrinsics were swapped by mistake. llvm-svn: 284080	2016-10-12 23:27:27 +00:00
Albert Gutowski	2a0621e58a	Implement MS _BitScan intrinsics Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing separate function for each builtin. Reviewers: hans, thakis, rnk, majnemer Subscribers: RKSimon, cfe-commits, aemerson Differential Revision: https://reviews.llvm.org/D25264 llvm-svn: 284060	2016-10-12 22:01:05 +00:00
Yunzhong Gao	c37e2231ad	[NFC] Trial change to remove a redundant blank line. llvm-svn: 284033	2016-10-12 19:33:33 +00:00
Justin Lebar	49ec14692a	[CUDA] Re-land support for <complex> (r283683 and r283680). These were reverted in r283753 and r283747. The first patch added a header to the root 'Headers' install directory, instead of into 'Headers/cuda_wrappers'. This was fixed in the second patch, but by then the damage was done: The bad header stayed in the 'Headers' directory, continuing to break the build. We reverted both patches in an attempt to fix things, but that still didn't get rid of the header, so the Windows bootstrap build remained broken. It's probably worth fixing up our cmake logic to remove things from the install dirs, but in the meantime, re-land these patches, since we believe they no longer have this bug. llvm-svn: 283907	2016-10-11 17:36:03 +00:00
Albert Gutowski	fcea61c563	Implement MS read/write barriers and __faststorefence intrinsic Reviewers: hans, rnk, majnemer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25442 llvm-svn: 283793	2016-10-10 19:40:51 +00:00
Albert Gutowski	7216f17653	Implement __emul, __emulu, _mul128 and _umul128 MS intrinsics Reviewers: rnk, thakis, majnemer, hans Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25353 llvm-svn: 283785	2016-10-10 18:09:27 +00:00
Nico Weber	21b9c7a6dc	Revert r283683 because r283680 got reverted. llvm-svn: 283753	2016-10-10 14:20:35 +00:00
Nico Weber	67dd74ef89	Revert r283680. Breaks bootstrap builds on (at least) Windows: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\lib\Support\Allocator.cpp:14: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/Allocator.h:24: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/ADT/SmallVector.h:20: In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/MathExtras.h:19: D:\buildslave\clang-x64-ninja-win7\stage1.install\bin\..\lib\clang\4.0.0\include\algorithm(63,8) : error: unknown type name '__device__' inline __device__ const __T & llvm-svn: 283747	2016-10-10 14:10:00 +00:00
Justin Lebar	3b593f56fc	[CUDA] Don't install cuda_wrappers/{algorithm,complex} into the main include dir. This is obviously wrong -- if we do this, then all compiles will pick up these wrappers, which is not what we want. llvm-svn: 283683	2016-10-09 00:27:39 +00:00
Justin Lebar	d3c5d2a4de	[CUDA] Support <complex> and std::min/max on the device. Summary: We do this by wrapping <complex> and <algorithm>. Tests are in the test-suite. Reviewers: tra Subscribers: jhen, beanz, cfe-commits, mgorny Differential Revision: https://reviews.llvm.org/D24979 llvm-svn: 283680	2016-10-08 22:16:12 +00:00
Justin Lebar	2dfbe9a3b4	[CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h. Summary: This matches the idiom we use for our other CUDA wrapper headers. Reviewers: tra Subscribers: beanz, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D24978 llvm-svn: 283679	2016-10-08 22:16:08 +00:00
Justin Lebar	e9eb792a0f	[CUDA] Declare our __device__ math functions in the same inline namespace as our standard library. Summary: Currently we declare our inline __device__ math functions in namespace std. But libstdc++ and libc++ declare these functions in an inline namespace inside namespace std. We need to match this because, in a later patch, we want to get e.g. <complex> to use our device overloads, and it only will if those overloads are in the right inline namespace. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: https://reviews.llvm.org/D24977 llvm-svn: 283678	2016-10-08 22:16:03 +00:00
Michael Zuckerman	9e43ccfe68	[Clang][AVX512][BuiltIn] Adding missing intrinsics move_{sd|ss} to clang Differential Revision: http://reviews.llvm.org/D21021 llvm-svn: 283314	2016-10-05 12:56:06 +00:00
Albert Gutowski	f3a0bce155	Separate builtins for x86-64 and i386; implement __mulh and __umulh Summary: We need x86-64-specific builtins if we want to implement some of the MS intrinsics - winnt.h contains definitions of some functions for i386, but not for x86-64 (for example _InterlockedOr64), which means that we cannot treat them as builtins for both i386 and x86-64, because then we have definitions of builtin functions in winnt.h on i386. Reviewers: thakis, majnemer, hans, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24598 llvm-svn: 283264	2016-10-04 22:29:49 +00:00
Craig Topper	c4a8228bcc	[AVX-512] Use native IR for masked 512-bit add/sub/mul/div ps/pd intrinsics when rounding mode isn't used. llvm-svn: 283073	2016-10-02 17:43:00 +00:00
Artem Belevich	d4d9dc8252	[CUDA] Added support for CUDA-8 Differential Revision: https://reviews.llvm.org/D24946 llvm-svn: 282610	2016-09-28 17:47:40 +00:00
Martin Storsjo	963f75efc2	[Headers] Replace stray indentation with tabs with spaces. NFC. This matches the rest of the surrounding file. llvm-svn: 282569	2016-09-28 09:34:51 +00:00
Ayman Musa	17a2819b05	Update to commit r282488, fix the buildbot failure. llvm-svn: 282492	2016-09-27 15:37:31 +00:00
Ayman Musa	2e250e8845	[avx512] Add aliases to some missing avx512 intrinsics. Differential Revision: https://reviews.llvm.org/D24961 llvm-svn: 282488	2016-09-27 14:06:32 +00:00
Nemanja Ivanovic	10e2b5dcaa	[Power9] Builtins for ELF v.2 ABI conformance - front end portion This patch corresponds to review: https://reviews.llvm.org/D24397 It adds the __POWER9_VECTOR__ macro and the -mpower9-vector option along with a number of altivec.h functions (refer to the code review for a list). llvm-svn: 282481	2016-09-27 10:45:22 +00:00
Saleem Abdulrasool	eae64f8a62	headers: add missing Windows ARM Interlocked intrinsics On ARM, there are multiple versions of each of the intrinsics, with acquire/relaxed/release barrier semantics. The newly added ones are provided as inline functions here instead of builtins, since they should only be available on certain archs (arm/aarch64). This is necessary in order to compile C++ code for ARM in MSVC mode. Patch by Martin Storsjö! llvm-svn: 282447	2016-09-26 22:12:43 +00:00
Simon Dardis	3d9c763816	[mips] MSA intrinsics header file This patch adds the msa.h header file containing the shorter names for the MSA intrinsics, e.g. msa_sll_b for builtin_msa_sll_b. Reviewers: vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24674 llvm-svn: 281975	2016-09-20 15:07:36 +00:00
Justin Lebar	e3612a039f	[CUDA] Make __clang_cuda_cmath.h compatible with libc++. Summary: We need to add a bunch more "using"s, which weren't necessary with libstdc++. Once this is in I can check in a test to the test-suite. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24588 llvm-svn: 281544	2016-09-14 21:50:14 +00:00
Albert Gutowski	727ab8a803	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: alexshap, cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281540	2016-09-14 21:19:43 +00:00
Albert Gutowski	fc19fa3721	Temporary fix for MS _Interlocked intrinsics llvm-svn: 281401	2016-09-13 21:51:37 +00:00
Albert Gutowski	9918cb6573	Reverse commit 281375 (breaks building Chromium) llvm-svn: 281399	2016-09-13 21:24:51 +00:00
Albert Gutowski	ce7a9a47b2	Add bunch of _Interlocked builtins Reviewers: compnerd, thakis, Prazek, majnemer, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24153 llvm-svn: 281378	2016-09-13 19:43:33 +00:00
Albert Gutowski	ae3fb3113f	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281375	2016-09-13 19:26:42 +00:00
Albert Gutowski	b6a11acb53	Implement MS _rot intrinsics Reviewers: thakis, Prazek, compnerd, rnk Subscribers: majnemer, cfe-commits Differential Revision: https://reviews.llvm.org/D24311 llvm-svn: 280997	2016-09-08 22:32:19 +00:00
Reid Kleckner	5de2bcdcf6	Add MS __nop intrinsic to intrin.h Summary: There was no definition for __nop function - added inline assembly. Patch by Albert Gutowski! Reviewers: rnk, thakis Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24286 llvm-svn: 280826	2016-09-07 16:55:12 +00:00
Craig Topper	2dfab63bb3	[AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div builtins and replace with native operations. We can't do the 512-bit ones because they take a rounding mode argument that we can't represent. llvm-svn: 280635	2016-09-04 18:30:17 +00:00
Elad Cohen	fb6358d2b5	[Modules] Add 'freestanding' to the 'requires-declaration' feature-list. This adds support for modules that require (non-)freestanding environment, such as the compiler builtin mm_malloc submodule. Differential Revision: https://reviews.llvm.org/D23871 llvm-svn: 280613	2016-09-04 06:00:42 +00:00
Joerg Sonnenberger	b50b2fac9f	Trailing dot that shouldn't have been committed. llvm-svn: 280609	2016-09-04 00:51:02 +00:00
Joerg Sonnenberger	82216f0faa	PR 27200: Fix names of the atomic lock-free macros. llvm-svn: 280607	2016-09-04 00:44:10 +00:00
Craig Topper	f43e4a1728	[AVX-512] Remove masked integer mullo builtins and replace with native IR. llvm-svn: 280597	2016-09-03 19:19:49 +00:00
Craig Topper	0e18976b8d	[AVX-512] Remove masked integer add/sub builtins and replace with native IR. llvm-svn: 280596	2016-09-03 18:29:35 +00:00
Craig Topper	a815f488d5	[AVX-512] Implement masked floating point logical operations with native IR and remove the builtins. llvm-svn: 280197	2016-08-31 05:38:58 +00:00
Craig Topper	d0681d528d	[X86] Use v2i64 vectors to implement _mm_and/andn/or/xor_pd. These will be reused when removing some builtins from avx512vldqintrin.h and this will make the tests for that change show a better number of vector elements. llvm-svn: 280196	2016-08-31 05:38:55 +00:00
Bruno Cardoso Lopes	6736e199c7	[Modules] Add 'gnuinlineasm' to the 'requires-declaration' feature-list. This adds support for modules that require (no-)gnu-inline-asm environment, such as the compiler builtin cpuid submodule. This is the gnu-inline-asm variant of https://reviews.llvm.org/D23871 Differential Revision: https://reviews.llvm.org/D23905 rdar://problem/26931199 llvm-svn: 280159	2016-08-30 21:25:42 +00:00
Alexey Bader	b5d90e57dc	[OpenCL] Make is_valid_event, create_user_event overloadable. Summary: Make is_valid_event and create_user_event overloadable like other built-ins. Patch by Evgeniy Tyurin. Reviewers: bader, yaxunl Subscribers: Anastasia, cfe-commits Differential Revision: https://reviews.llvm.org/D23914 llvm-svn: 280097	2016-08-30 14:42:54 +00:00
Asaf Badouh	356bb76809	[X86][AVX512F] minor fix of the parameter names add "__" prefix Bug 28842 https://llvm.org/bugs/show_bug.cgi?id=29040 Differential Revision: https://reviews.llvm.org/D23753 llvm-svn: 279392	2016-08-21 07:56:47 +00:00
Justin Lebar	cb20a09f54	[CUDA] Improve handling of math functions. Summary: A bunch of related changes here to our CUDA math headers. - The second arg to nexttoward is a double (well, technically, long double, but we don't have that), not a float. - Add a forward-declare of llround(float), which is defined in the CUDA headers. We need this for the same reason we need most of the other forward-declares: To prevent a constexpr function in our standard library from becoming host+device. - Add nexttowardf implementation. - Pull "foobarf" functions defined by the CUDA headers in the global namespace into namespace std. This lets you do e.g. std::sinf. - Add overloads for math functions accepting integer types. This lets you do e.g. std::sin(0) without having an ambiguity between the overload that takes a float and the one that takes a double. With these changes, we pass testcases derived from libc++ for cmath and math.h. We can check these testcases in to the test-suite once support for CUDA lands there. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23627 llvm-svn: 279140	2016-08-18 20:43:13 +00:00
Yaxun Liu	3317446301	[OpenCL] AMDGPU: Add extensions cl_amd_media_ops and cl_amd_media_ops2 Differential Revision: https://reviews.llvm.org/D23322 llvm-svn: 278851	2016-08-16 20:49:49 +00:00
Reid Kleckner	66e7717b46	Revert "[X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms" This reverts commit r278783. It breaks usage of _xgetbv on Windows. llvm-svn: 278814	2016-08-16 16:04:14 +00:00
Marina Yatsina	197b65f833	[X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms commit on behalf of guyblank Differential Revision: https://reviews.llvm.org/D21959 llvm-svn: 278783	2016-08-16 08:13:36 +00:00
Lama Saba	5d01f224cf	[X86][AVX512] lower __mm512_andnot_ps/__mm512_andnot_pd to IR Differential revision: https://reviews.llvm.org/D23262 llvm-svn: 278209	2016-08-10 10:34:45 +00:00
Justin Lebar	2ef3dabd45	[CUDA] Add __device__ overloads for placement new and delete. Summary: Previously these sort of worked because they didn't end up resulting in calls at the ptx layer. But I'm adding stricter checks that break placement new without these changes. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23239 llvm-svn: 278194	2016-08-10 01:09:14 +00:00
Asaf Badouh	2f344b788c	[AVX512] integer comparisons enumeration. fix Bug 28842 https://llvm.org/bugs/show_bug.cgi?id=28842 Differential Revision: https://reviews.llvm.org/D22212 llvm-svn: 277955	2016-08-07 10:43:04 +00:00
Saleem Abdulrasool	afdef205d8	Headers: Add ARM support to intrin.h for MSVC compatibility This fixes compiling with headers from the Windows SDK for ARM, where the YieldProcessor function (in winnt.h) refers to _ARM_BARRIER_ISHST. The actual MSVC armintr.h contains a lot more definitions, but this is enough to build code that uses the Windows SDK but doesn't use ARM intrinsics directly. An alternative would be to just keep the addition to intrin.h (to include armintr.h), but not actually ship armintr.h, instead having clang's intrin.h include armintr.h from MSVC's include directory. (That one works fine with clang, at least for building code that uses the Windows SDK.) Patch by Martin Storsjö! llvm-svn: 277928	2016-08-06 17:58:24 +00:00
Yaxun Liu	c489e39eca	[OpenCL] Remove extra native_ functions from opencl-c.h There should be no native_ builtin functions with double type arguments. Patch by Aaron En Ye Shi. Differential Revision : https://reviews.llvm.org/D23071 llvm-svn: 277754	2016-08-04 19:30:54 +00:00
Dimitry Andric	f8099f256d	Add more gcc compatibility names to clang's cpuid.h Summary: Some cpuid bit defines are named slightly differently from how gcc's cpuid.h calls them. Define a few more compatibility names to appease software built for gcc: * `bit_PCLMUL` alias of `bit_PCLMULQDQ` * `bit_SSE4_1` alias of `bit_SSE41` * `bit_SSE4_2` alias of `bit_SSE42` * `bit_AES` alias of `bit_AESNI` * `bit_CMPXCHG8B` alias of `bit_CX8` While here, add the missing 29th bit, `bit_F16C` (which is how gcc calls this bit). Reviewers: joerg, rsmith Subscribers: bruno, cfe-commits Differential Revision: https://reviews.llvm.org/D22010 llvm-svn: 277307	2016-07-31 20:23:23 +00:00
Eric Christopher	b638558e12	Remove unused variable. Fixes PR28761. llvm-svn: 277221	2016-07-29 22:11:11 +00:00
Yaxun Liu	c944e65a24	[OpenCL] Added CLK_ABGR definition for get_image_channel_order return value Added CLK_ABGR definition for get_image_channel_order return value inside opencl-c.h file. Patch by Aaron En Ye Shi. Differential Revision: https://reviews.llvm.org/D22767 llvm-svn: 277179	2016-07-29 17:50:10 +00:00
Craig Topper	351ed42795	[X86] Block pbroadcastq instructions on 32-bit targets instead of pbroadcastb. Thanks to Simon Pilgrim for catching the mistake. llvm-svn: 276564	2016-07-24 14:58:06 +00:00
Ekaterina Romanova	a84c24f39c	Add doxygen comments to emmintrin.h's intrinsics. Only around 50% of the intrinsics in this file are documented now. The patches for the rest of the intrinsics in this file will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Paul Robinson. llvm-svn: 276499	2016-07-22 23:49:37 +00:00
Craig Topper	45db56c375	[X86] Add missing __x86_64__ qualifiers on a bunch of intrinsics that assume 64-bit GPRs are available. Usages of these intrinsics in a 32-bit build results in assertions in the backend. llvm-svn: 276249	2016-07-21 07:38:39 +00:00
Simon Pilgrim	e3b9ee0645	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. Differential Revision: https://reviews.llvm.org/D22105 llvm-svn: 276102	2016-07-20 10:18:01 +00:00
Asaf Badouh	a0b6f8fb56	[X86][AVX512F] minor fix of the parameter names add "__" prefix llvm-svn: 275384	2016-07-14 08:40:30 +00:00
Michael Zuckerman	3378653f8d	[Clang][AVX512] Making cosmetic changes llvm-svn: 275169	2016-07-12 12:42:27 +00:00
Craig Topper	4d61a3c2d8	[AVX512] Replace masked AND/OR/XOR intrinsics with native code and remove the builtins. llvm-svn: 275049	2016-07-11 06:14:18 +00:00
Craig Topper	6e76fb61a7	[X86] Use __builtin_shufflevector for 512-bit shufps intrinsics. llvm-svn: 275012	2016-07-10 05:57:21 +00:00
Craig Topper	95b61b0544	[X86] Use __builtin_ia32_vec_ext_v4hi and __builtin_ia32_vec_set_v4hi to implement pextrw/pinsertw MMX intrinsics instead of trying to use native IR. Without this we end up generating code that doesn't use mmx registers and probably doesn't work well with other mmx intrinsics. llvm-svn: 274968	2016-07-09 05:30:41 +00:00
Justin Bogner	2d5de7e568	NVPTX: Use the nvvm builtins to read SRegs rather than the legacy ptx ones The ptx spellings were removed from LLVM in r274769. llvm-svn: 274770	2016-07-07 16:41:08 +00:00
Justin Bogner	2f8de9fb4f	NVPTX: Rename __builtin_ptx_shfl -> __nvvm_shfl To match "NVPTX: Make the llvm.nvvm.shfl intrinsics and builtin names consistent" in LLVM. llvm-svn: 274663	2016-07-06 19:52:32 +00:00
Michael Zuckerman	b920665493	[Clang][Feature] Adding CLFLUSHOPT feature and intrinsic to clang Differential Revision: http://reviews.llvm.org/D21792 llvm-svn: 274559	2016-07-05 15:56:03 +00:00
Simon Pilgrim	f5a8837e1b	[X86][AVX512] Converted the VBROADCAST intrinsics to generic IR llvm-svn: 274544	2016-07-05 12:59:33 +00:00
Asaf Badouh	136332888a	[X86][AVX512F] add float/double abs intrinsics add abs intrinsics that use native LLVM-IR. change _mm512_mask[z]_and_epi{32|64} to use select intrinsic Differential Revision: http://reviews.llvm.org/D21973 llvm-svn: 274542	2016-07-05 12:24:14 +00:00
Asaf Badouh	f9cdb8de7a	[AVX512] minor fix in sqrt{ss|sd} intrinsics arguments Differential Revision: http://reviews.llvm.org/D21988 llvm-svn: 274541	2016-07-05 11:36:21 +00:00
Anastasia Stulova	db7a31cce7	[OpenCL] An implementation of device side enqueue (DSE) from OpenCL v2.0 s6.13.17. - Added new Builtins: enqueue_kernel, get_kernel_work_group_size and get_kernel_preferred_work_group_size_multiple. These Builtins use custom check to diagnose parameters of the passed Blocks i. e. variable number of 'local void*' type params, and check different overloads specified in Table 6.31 of OpenCL v2.0. - IR is generated as an internal library call for each OpenCL Builtin, reusing ObjC Block implementation. Review: http://reviews.llvm.org/D20249 llvm-svn: 274540	2016-07-05 11:31:24 +00:00
Michael Zuckerman	a72b49efe4	Intrinsic _mm256_permutexvar_epi64 doesn't accept three parameters, as specified below. I deleted the extra mask parameter. __m256i _mm256_permutexvar_epi64 (__m256i idx, __m256i a) #include "immintrin.h" Instruction: vpermq CPUID Flags: AVX512VL + AVX512F Description: Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst. Operation: FOR j := 0 to 3 i := j*64 id := idx[i+1:i]*64 dst[i+63:i] := a[id+63:id] ENDFOR dst[MAX:256] := 0 (From: Intel intrinsics guide) llvm-svn: 274539	2016-07-05 11:30:31 +00:00
Michael Zuckerman	7dac6fbdf8	[Clang][BuiltIn][AVX512] adding _mm{|256|512}_mask_cvt{s|us|}epi16_storeu_epi8 intrinsics Differential Revision: http://reviews.llvm.org/D21729 llvm-svn: 274532	2016-07-05 08:08:01 +00:00
Craig Topper	2a383c9273	[X86] Use undefined instead of setzero in shufflevector based intrinsics when the second source is unused. Rewrite immediate extractions in shuffle intrinsics to be in ((c >> x) & y) form instead of ((c & z) >> x). This way only x varies between each use instead of having to vary x and z. llvm-svn: 274525	2016-07-04 22:18:01 +00:00
Simon Pilgrim	427154db2a	[X86][AVX512] Converted the VSHUFPD intrinsics to generic IR llvm-svn: 274523	2016-07-04 21:30:47 +00:00
Simon Pilgrim	30db811526	[X86][AVX512] Converted the VPERMPD/VPERMQ intrinsics to generic IR llvm-svn: 274502	2016-07-04 13:34:44 +00:00
Simon Pilgrim	17388f2569	[X86][AVX512] Converted the VPERMILPD/VPERMILPS intrinsics to generic IR llvm-svn: 274492	2016-07-04 11:06:15 +00:00
Simon Pilgrim	275d721485	[X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR llvm companion patch imminent llvm-svn: 274442	2016-07-02 17:16:25 +00:00
Craig Topper	b3a4477b13	[X86] Replace 128-bit and 256 masked vpermilps/vpermilpd builtins with native IR. llvm-svn: 274425	2016-07-02 05:36:43 +00:00
Michael Zuckerman	3f316abdce	[Clang][Intrinsics][AVX512][BuiltIn] adding intrinsics for vrangesd instruction set Differential Revision: http://reviews.llvm.org/D21734 llvm-svn: 274218	2016-06-30 08:05:46 +00:00
Alexey Bader	e5b3aebfb5	[OpenCL] Add attribute 'pure' to read_image built-in functions to enable optimizations. Reviewers: Anastasia, yaxunl Subscribers: pekka.jaaskelainen, pxli168, cfe-commits Differential Revision: http://reviews.llvm.org/D21795 llvm-svn: 274122	2016-06-29 12:30:26 +00:00
David Majnemer	2916a612cd	[intrin.h] Certain _Interlocked intrinsics return the old value This fixes PR28326. llvm-svn: 273986	2016-06-28 02:54:43 +00:00
Asaf Badouh	57819aa185	[X86] add _mm_loadu_si64 Differential Revision: http://reviews.llvm.org/D21504 llvm-svn: 273812	2016-06-26 13:51:54 +00:00
Craig Topper	50e3dfe9d0	[X86] Fix pslldq/psrldq intrinsics to not fail compilation with immediates larger than 16. This was accidentally broken in r272246. llvm-svn: 273775	2016-06-25 07:31:14 +00:00
Craig Topper	79f53ca0b5	[AVX512] Replace masked unpack builtins with shufflevector and selects. llvm-svn: 273533	2016-06-23 06:36:42 +00:00
Michael Zuckerman	716859aa64	[Clang][bmi][intrinsics] Adding _mm_tzcnt_64 _mm_tzcnt_32 intrinsics to clang. Differential Revision: http://reviews.llvm.org/D21373 llvm-svn: 273401	2016-06-22 12:32:43 +00:00
Craig Topper	9ce3ddf2e6	[AVX512] Use a __v8hi vector inside of _mm_setzero_hi to match its name. Probably no real functional change. llvm-svn: 273389	2016-06-22 06:36:23 +00:00
Craig Topper	08181f795f	[AVX512] Fix _mm_setzero_di to not require avx512vl since it's used by avx512dqintrin.h. Also update the avx512dq test to not enable the avx512vl feature so we can ensure correct dependencies. llvm-svn: 273388	2016-06-22 06:36:21 +00:00
Craig Topper	c89dda5938	[AVX512] Add missing typecasts to intrinsics. llvm-svn: 273386	2016-06-22 06:36:16 +00:00
Craig Topper	879b0978f4	[AVX512] Move the 128-bit and 256-bit lzcnt intrinsics to avx512vlcdintrin.h where they belong. llvm-svn: 273249	2016-06-21 06:53:58 +00:00
Yaxun Liu	143f083e4b	[OpenCL] Include opencl-c.h by default as a clang module Include opencl-c.h by default as a module to utilize the automatic AST caching mechanism of clang modules. Add an option -finclude-default-header to enable default header for OpenCL, which is off by default. Differential Revision: http://reviews.llvm.org/D20444 llvm-svn: 273191	2016-06-20 19:26:00 +00:00
Zvi Rackover	453d734201	[X86] _MM_ALIGN16 attribute support for non-windows targets Summary: This patch adds support for the _MM_ALIGN16 attribute on non-windows targets. This aligns Clang with ICC which supports the attribute on all targets. Fixes PR28056 Reviewers: aaboud, echristo, cfe-commits, mkuper Subscribers: zvi, mehdi_amini Projects: #clang-c Differential Revision: http://reviews.llvm.org/D21173 llvm-svn: 273095	2016-06-18 20:01:07 +00:00
Saleem Abdulrasool	5065d8cfc9	Headers: wordsmith error message Use the marketing name for the MSVC release as pointed out by Nico Weber! llvm-svn: 272979	2016-06-17 00:27:02 +00:00
Saleem Abdulrasool	13f3baf572	Headers: tweak for MSVC[<1800] Earlier versions of MSVC did not include inttypes.h. Ensure that we don't try to include_next on those releases. llvm-svn: 272741	2016-06-15 00:28:15 +00:00
Hans Wennborg	f8b91f8336	s/Intrin.h/intrin.h/, trying to fix the build after r272701 llvm-svn: 272702	2016-06-14 20:14:24 +00:00
Nico Weber	73384a8f76	Rename Intrin.h to intrin.h, that's how all the documentation calls it. llvm-svn: 272701	2016-06-14 19:54:40 +00:00
Michael Zuckerman	c49f6ce3e1	[Clang][avx512][Intrinsics] adding prefetch gather intrinsics Differential Revision: http://reviews.llvm.org/D21322 llvm-svn: 272667	2016-06-14 13:45:17 +00:00
Michael Zuckerman	223676d2cc	[Clang][AVX512][intrinsics] Adding missing intrinsics div_pd and div_ps Differential Revision: http://reviews.llvm.org/D20626 llvm-svn: 272658	2016-06-14 12:38:58 +00:00
David Majnemer	d423574fde	[immintrin] Reimplement _bit_scan_{forward,reverse} There is no need to use a target-specific intrinsic to implement _bit_scan_forward or _bit_scan_reverse, reimplementing them using generic intrinsics makes it more likely that the middle end will understand what's going on. llvm-svn: 272564	2016-06-13 17:26:16 +00:00
Asaf Badouh	880f0c252b	[X86][AVX512F] bugfix - sqrtps should get __mask16 as mask parameter CR: Michael Zuckerman llvm-svn: 272549	2016-06-13 15:15:57 +00:00
Simon Pilgrim	beca5f295c	[Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores. The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load Differential Revision: http://reviews.llvm.org/D21272 llvm-svn: 272540	2016-06-13 09:57:52 +00:00
Craig Topper	fc07498e4a	[AVX512] Masked pcmpeqd, pcmpeqq, pcmpgtd, and pcmpgtq don't require avx512bw, just avx512vl. llvm-svn: 272532	2016-06-13 04:15:11 +00:00
Craig Topper	7cc9263ec2	[AVX512] Implement masked and 512-bit pshufd intrinsics directly with __builtin_shufflevector and __builtin_ia32_select. llvm-svn: 272467	2016-06-11 12:50:19 +00:00
Craig Topper	26d5b87316	[X86] Add explicit typecasts to some intrinsics. llvm-svn: 272466	2016-06-11 12:50:12 +00:00
Craig Topper	68738332b8	[AVX512] Implement 512-bit and masked shufflelo and shufflehi intrinsics directly with __builtin_shufflevector and __builtin_ia32_select. Also improve the formatting of the AVX2 version. llvm-svn: 272452	2016-06-11 03:31:13 +00:00
Craig Topper	d4273a425e	[AVX512] Add _mm512_bsrli_epi128 and _mm512_bslli_epi128 intrinsics. llvm-svn: 272451	2016-06-11 03:31:07 +00:00
Ekaterina Romanova	71a68c928a	Add doxygen comments to mmintrin.h's intrinsics. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 272350	2016-06-10 00:10:40 +00:00
Justin Lebar	4fb5711751	[CUDA] Implement __shfl* intrinsics in clang headers. Summary: Clang changes to make use of the LLVM intrinsics added in D21160. Reviewers: tra Subscribers: jholewinski, cfe-commits Differential Revision: http://reviews.llvm.org/D21162 llvm-svn: 272299	2016-06-09 20:04:57 +00:00
Craig Topper	2769bb5753	[X86] Handle AVX2 pslldqi and psrldqi intrinsics shufflevector creation directly in the header file instead of in CGBuiltin.cpp. Simplify the sse2 equivalents as well. llvm-svn: 272246	2016-06-09 05:15:12 +00:00
Craig Topper	3a0c7260f4	[X86] Add void to the argument list of intrinsics that don't take arguments, since an empty argument list means something else in C. llvm-svn: 272244	2016-06-09 05:14:28 +00:00
Igor Breger	aadb876200	[AVX512] Emit select instruction instead of using x86 specific intrinsics. This will allow us to remove the x86 intrinsics from the backend. Differential Revision: http://reviews.llvm.org/D21060 llvm-svn: 272141	2016-06-08 13:59:20 +00:00
Michael Zuckerman	c4ae8537cf	[Clang][AVX512][BUILTIN]Adding intrinsics for range_round_{sd\|ss} Differential Revision: http://reviews.llvm.org/D21002 llvm-svn: 272123	2016-06-08 08:19:27 +00:00
Ekaterina Romanova	50e94a3b34	Add doxygen comments to xmmintrin.h's intrinsics. Only half of the intrinsics in this file are documented here. The patch for the other half will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 272121	2016-06-08 07:34:31 +00:00
Craig Topper	f3efec65bb	[AVX512] Reformat macro intrinsics, ensure arguments have proper typecasts, ensure result is typecasted back to the generic types. llvm-svn: 272119	2016-06-08 06:08:07 +00:00
Craig Topper	605894985f	[X86] Put parentheses around macro arguments in intrinsics. llvm-svn: 272118	2016-06-08 06:08:04 +00:00
Michael Zuckerman	96d0399658	[clang][AVX512][Intrinsics] Adding intrinsics reduce_[round]_{ss\|sd} to clang Differential Revision: http://reviews.llvm.org/D21014 llvm-svn: 272012	2016-06-07 14:00:20 +00:00
Michael Zuckerman	1a7889f203	Fixing problem with rsqrt28_sd maskz_rsqrt28_sd mapped to mask_rsqrt28_sd and not to the maskz. llvm-svn: 271836	2016-06-05 15:57:49 +00:00
Michael Zuckerman	95721ac863	[Clang][AVX512]Adding set4 intrinsics Differential Revision: http://reviews.llvm.org/D20866 llvm-svn: 271835	2016-06-05 15:43:30 +00:00
Michael Zuckerman	f36f6eb036	[Clang][AVX512][Intrinsics] Adding two definitions _mm512_setzero and _mm512_setzero_epi32 Differential Revision: http://reviews.llvm.org/D20871 llvm-svn: 271832	2016-06-05 15:12:52 +00:00
Craig Topper	6a77b62640	[X86] Use unsigned types for vector arithmetic in intrinsics to avoid undefined behavior for signed integer overflow. This is really only needed for addition, subtraction, and multiplication, but I did the bitwise ops too for overall consistency. Clang currently doesn't set NSW for signed vector operations so the undefined behavior shouldn't happen today. llvm-svn: 271778	2016-06-04 05:43:41 +00:00
Craig Topper	406d5cdf7c	[AVX512] Remove space in -1 constants. NFC llvm-svn: 271777	2016-06-04 05:43:37 +00:00
Asaf Badouh	89f657611c	[X86][AVX512] add intrinsics of Scalar FP to integer Differential Revision: http://reviews.llvm.org/D20861 llvm-svn: 271499	2016-06-02 08:11:35 +00:00

1181 Commits