Summary: We need x86-64-specific builtins if we want to implement some of the MS intrinsics. winnt.h contains definitions of some of these functions for i386 but not for x86-64 (for example _InterlockedOr64), so we cannot treat them as builtins for both i386 and x86-64: doing so would leave us with definitions of builtin functions in winnt.h on i386.
Reviewers: thakis, majnemer, hans, rnk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24598
llvm-svn: 283264
This patch corresponds to review:
https://reviews.llvm.org/D24397
It adds the __POWER9_VECTOR__ macro and the -mpower9-vector option along with
a number of altivec.h functions (refer to the code review for a list).
llvm-svn: 282481
On ARM, there are multiple versions of each of the intrinsics, with
acquire/relaxed/release barrier semantics.
The newly added ones are provided as inline functions here instead of builtins,
since they should only be available on certain archs (arm/aarch64).
This is necessary in order to compile C++ code for ARM in MSVC mode.
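For illustration, the newly added ARM variants follow roughly this pattern (a sketch, not the exact header text; the attribute spelling and the specific function shown are assumptions):
static __inline__ long __attribute__((__always_inline__, __nodebug__))
_InterlockedExchangeAdd_acq(long volatile *_Addend, long _Value) {
  /* acquire-ordered read-modify-write that returns the old value */
  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_ACQUIRE);
}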
Patch by Martin Storsjö!
llvm-svn: 282447
This patch adds the msa.h header file containing the shorter names for the
MSA intrinsics, e.g. __msa_sll_b for __builtin_msa_sll_b.
Reviewers: vkalintiris, zoran.jovanovic
Differential Revision: https://reviews.llvm.org/D24674
llvm-svn: 281975
Summary:
We need to add a bunch more "using"s, which weren't necessary with
libstdc++.
Once this is in I can check in a test to the test-suite.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24588
llvm-svn: 281544
Summary: There was no definition of the __nop function - added an inline
assembly implementation.
Patch by Albert Gutowski!
Reviewers: rnk, thakis
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24286
llvm-svn: 280826
This adds support for modules that require a (non-)freestanding
environment, such as the compiler builtin mm_malloc submodule.
Differential Revision: https://reviews.llvm.org/D23871
llvm-svn: 280613
These will be reused when removing some builtins from avx512vldqintrin.h and this will make the tests for that change show a better number of vector elements.
llvm-svn: 280196
This adds support for modules that require a (no-)gnu-inline-asm
environment, such as the compiler builtin cpuid submodule.
This is the gnu-inline-asm variant of https://reviews.llvm.org/D23871
Differential Revision: https://reviews.llvm.org/D23905
rdar://problem/26931199
llvm-svn: 280159
Summary: Make is_valid_event and create_user_event overloadable like other built-ins.
Patch by Evgeniy Tyurin.
Reviewers: bader, yaxunl
Subscribers: Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D23914
llvm-svn: 280097
Summary:
A bunch of related changes here to our CUDA math headers.
- The second arg to nexttoward is a double (well, technically, long
double, but we don't have that), not a float.
- Add a forward-declare of llround(float), which is defined in the CUDA
headers. We need this for the same reason we need most of the other
forward-declares: To prevent a constexpr function in our standard
library from becoming host+device.
- Add nexttowardf implementation.
- Pull "foobarf" functions defined by the CUDA headers in the global
namespace into namespace std. This lets you do e.g. std::sinf.
- Add overloads for math functions accepting integer types. This lets
you do e.g. std::sin(0) without having an ambiguity between the
overload that takes a float and the one that takes a double.
With these changes, we pass testcases derived from libc++ for cmath and
math.h. We can check these testcases in to the test-suite once support
for CUDA lands there.
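A rough sketch of the last two bullets (hypothetical and simplified; the real header uses its own helper machinery):
namespace std {
using ::sinf;                              // expose the CUDA-provided sinf as std::sinf
__device__ inline double sin(int __x) {    // integer overload: avoids float/double ambiguity
  return ::sin(static_cast<double>(__x));
}
}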
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23627
llvm-svn: 279140
Summary:
Previously these sort of worked because they didn't end up resulting in
calls at the ptx layer. But I'm adding stricter checks that break
placement new without these changes.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23239
llvm-svn: 278194
This fixes compiling with headers from the Windows SDK for ARM, where the
YieldProcessor function (in winnt.h) refers to _ARM_BARRIER_ISHST.
The actual MSVC armintr.h contains a lot more definitions, but this is enough to
build code that uses the Windows SDK but doesn't use ARM intrinsics directly.
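The kind of definition needed looks roughly like this (a sketch; the value is assumed from the DMB barrier-option encoding and not verified against MSVC's armintr.h):
#define _ARM_BARRIER_ISHST 0xA   /* inner-shareable, store-only barrier option */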
An alternative would be to just keep the addition to intrin.h (to include
armintr.h), but not actually ship armintr.h, instead having clang's intrin.h
include armintr.h from MSVC's include directory. (That one works fine with
clang, at least for building code that uses the Windows SDK.)
Patch by Martin Storsjö!
llvm-svn: 277928
There should be no native_ builtin functions with double type arguments.
Patch by Aaron En Ye Shi.
Differential Revision : https://reviews.llvm.org/D23071
llvm-svn: 277754
Summary:
Some cpuid bit defines are named slightly differently from how gcc's
cpuid.h calls them.
Define a few more compatibility names to appease software built for gcc:
* `bit_PCLMUL` alias of `bit_PCLMULQDQ`
* `bit_SSE4_1` alias of `bit_SSE41`
* `bit_SSE4_2` alias of `bit_SSE42`
* `bit_AES` alias of `bit_AESNI`
* `bit_CMPXCHG8B` alias of `bit_CX8`
While here, add the missing 29th bit, `bit_F16C` (which is how gcc
calls this bit).
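A sketch of the added defines, following the existing bit_* style in cpuid.h (exact layout assumed):
#define bit_PCLMUL    bit_PCLMULQDQ
#define bit_SSE4_1    bit_SSE41
#define bit_SSE4_2    bit_SSE42
#define bit_AES       bit_AESNI
#define bit_CMPXCHG8B bit_CX8
#define bit_F16C      0x20000000   /* CPUID.1:ECX bit 29, half-precision convert */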
Reviewers: joerg, rsmith
Subscribers: bruno, cfe-commits
Differential Revision: https://reviews.llvm.org/D22010
llvm-svn: 277307
Added CLK_ABGR definition for get_image_channel_order return value inside opencl-c.h file.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22767
llvm-svn: 277179
Only around 50% of the intrinsics in this file are documented now. The patches for the rest of the intrinsics in this file will be sent out later.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Paul Robinson.
llvm-svn: 276499
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding (which converts them to either zero or UNDEF). This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
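To illustrate the hardware behaviour being preserved (example code, not part of the patch):
__m128  nan_vec = _mm_set1_ps(__builtin_nanf(""));
__m128i r = _mm_cvttps_epi32(nan_vec);   /* every lane becomes 0x80000000, the "integer indefinite" */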
Differential Revision: https://reviews.llvm.org/D22105
llvm-svn: 276102
add abs intrinsics that use native LLVM-IR.
change _mm512_mask[z]_and_epi{32|64} to use select intrinsic
Differential Revision: http://reviews.llvm.org/D21973
llvm-svn: 274542
- Added new Builtins: enqueue_kernel, get_kernel_work_group_size
and get_kernel_preferred_work_group_size_multiple.
These Builtins use a custom check to diagnose the parameters of the passed Blocks,
i.e. a variable number of 'local void*' type params, and to check the different
overloads specified in Table 6.31 of OpenCL v2.0.
- IR is generated as an internal library call for each OpenCL Builtin,
reusing ObjC Block implementation.
Review: http://reviews.llvm.org/D20249
llvm-svn: 274540
I deleted the extra mask parameter.
__m256i _mm256_permutexvar_epi64 (__m256i idx, __m256i a)
#include "immintrin.h"
Instruction: vpermq
CPUID Flags: AVX512VL + AVX512F
Description
Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.
Operation
FOR j := 0 to 3
i := j*64
id := idx[i+1:i]*64
dst[i+63:i] := a[id+63:id]
ENDFOR
dst[MAX:256] := 0
(From: Intel intrinsics guide)
llvm-svn: 274539
Include opencl-c.h by default as a module to utilize the automatic AST caching mechanism of clang modules.
Add an option -finclude-default-header to enable default header for OpenCL, which is off by default.
Differential Revision: http://reviews.llvm.org/D20444
llvm-svn: 273191
Summary:
This patch adds support for the _MM_ALIGN16 attribute on non-windows targets. This aligns Clang with ICC which supports the attribute on all targets.
Fixes PR28056
Reviewers: aaboud, echristo, cfe-commits, mkuper
Subscribers: zvi, mehdi_amini
Projects: #clang-c
Differential Revision: http://reviews.llvm.org/D21173
llvm-svn: 273095
There is no need to use a target-specific intrinsic to implement
_bit_scan_forward or _bit_scan_reverse; reimplementing them using
generic intrinsics makes it more likely that the middle end will
understand what's going on.
llvm-svn: 272564
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp
The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load
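The resulting header pattern looks roughly like this (a sketch; the attribute macro and vector typedef names are assumptions):
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_stream_ps(float *__p, __m128 __a)
{
  /* naturally aligned nontemporal store expressed without an x86-specific builtin */
  __builtin_nontemporal_store((__v4sf)__a, (__v4sf *)__p);
}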
Differential Revision: http://reviews.llvm.org/D21272
llvm-svn: 272540
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 272350
Summary: Clang changes to make use of the LLVM intrinsics added in D21160.
Reviewers: tra
Subscribers: jholewinski, cfe-commits
Differential Revision: http://reviews.llvm.org/D21162
llvm-svn: 272299
Only half of the intrinsics in this file are documented here. The patch for the other half will be sent out later.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 272121
This is really only needed for addition, subtraction, and multiplication, but I did the bitwise ops too for overall consistency. Clang currently doesn't set NSW for signed vector operations so the undefined behavior shouldn't happen today.
llvm-svn: 271778
The 'cvtt' truncation (round to zero) conversions can be safely represented as generic __builtin_convertvector (fptosi) calls instead of x86 intrinsics. We already do this (implicitly) for the scalar equivalents.
Note: I looked at updating _mm_cvttpd_epi32 as well but this still requires a lot more backend work to correctly lower (both for debug and optimized builds).
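For example, the packed f32->i32 truncation becomes roughly (a sketch; typedef names assumed):
static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cvttps_epi32(__m128 __a)
{
  /* truncating (round-to-zero) conversion expressed as generic IR */
  return (__m128i)__builtin_convertvector((__v4sf)__a, __v4si);
}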
Differential Revision: http://reviews.llvm.org/D20859
llvm-svn: 271436
According to the gcc headers, intel intrinsics docs and msdn codegen the _mm_store1_pd (and its _mm_store_pd1 equivalent) should use an aligned pointer - the clang headers are the only implementation I can find that assume non-aligned stores (by storing with _mm_storeu_pd).
Additionally, according to the intel intrinsics docs and msdn codegen the _mm_store1_ps (_mm_store_ps1) requires a similarly aligned pointer.
This patch raises the alignment requirements to match the other implementations by calling _mm_store_ps/_mm_store_pd instead.
I've also added the missing _mm_store_pd1 intrinsic (which maps to _mm_store1_pd like _mm_store_ps1 does to _mm_store1_ps).
As a followup I'll update the llvm fast-isel tests to match this codegen.
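The updated implementation has roughly this shape (a sketch; the attribute macro is an assumption):
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_store1_pd(double *__dp, __m128d __a)
{
  /* broadcast the low element, then use the 16-byte-aligned store */
  __a = __builtin_shufflevector((__v2df)__a, (__v2df)__a, 0, 0);
  _mm_store_pd(__dp, __a);
}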
Differential Revision: http://reviews.llvm.org/D20617
llvm-svn: 271218
Summary: The order is [x, y, z, w], not [w, x, y, z].
Subscribers: cfe-commits, tra
Differential Revision: http://reviews.llvm.org/D20794
llvm-svn: 271215
OpenCL has a large number of "builtin" functions ("builtin" in the sense of the OpenCL spec) which are defined in header files. To compile OpenCL kernels using these builtin functions, a header file is needed.
This header file is based on the Khronos implementation (https://github.com/KhronosGroup/SPIR/blob/spirv-1.0/lib/Headers/opencl.h) with heavy refactoring.
Re-commit after fixing failures on ppc64/systemz etc.
Differential Revision: http://reviews.llvm.org/D18369
llvm-svn: 271197
As discussed on http://reviews.llvm.org/D20684, move the unsigned integer vector types used for zero extension to make them available for general use.
llvm-svn: 271187
OpenCL has a large number of "builtin" functions ("builtin" in the sense of the OpenCL spec) which are defined in header files. To compile OpenCL kernels using these builtin functions, a header file is needed.
This header file is based on the Khronos implementation (https://github.com/KhronosGroup/SPIR/blob/spirv-1.0/lib/Headers/opencl.h) with heavy refactoring.
Differential Revision: http://reviews.llvm.org/D18369
llvm-svn: 271136
The VPMOVSX and (V)PMOVZX sign/zero extension intrinsics can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics.
This patch removes the clang builtins and their use in the sse2/avx headers - a companion patch will remove/auto-upgrade the llvm intrinsics.
Note: We already did this for SSE41 PMOVSX sometime ago.
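As an example of the resulting pattern (a sketch; the unsigned vector typedefs are assumptions):
static __inline__ __m256i __attribute__((__always_inline__, __nodebug__))
_mm256_cvtepu8_epi16(__m128i __V)
{
  /* zero-extend sixteen u8 lanes to u16 with a generic vector conversion */
  return (__m256i)__builtin_convertvector((__v16qu)__V, __v16hi);
}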
Differential Revision: http://reviews.llvm.org/D20684
llvm-svn: 271106
macros rather than functions.
Unfortunately, I couldn't come up with a simple testcase that didn't need
code generation to verify what was going on.
llvm-svn: 270625
Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.
This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work.
Differential Revision: http://reviews.llvm.org/D20528
llvm-svn: 270499
Ensure _mm256_extract_epi8 and _mm256_extract_epi16 zero extend their i8/i16 result to i32. This matches _mm_extract_epi8 and _mm_extract_epi16.
Fix for PR27594
Differential Revision: http://reviews.llvm.org/D20468
llvm-svn: 270330
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
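Roughly, the inline-asm definition is replaced by something like this (hypothetical sketch; the builtin name is an assumption):
__device__ inline float __ldg(const float *ptr) {
  /* load through the read-only (non-coherent) cache; enables [addr+imm] addressing */
  return __nvvm_ldg_f(ptr);
}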
Reviewers: tra, rsmith
Subscribers: jhen, cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D19990
llvm-svn: 270150
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that the MWAITX instruction also
terminates when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction
execution and enter an implementation-dependent optimized state until
occurrence of a class of events.
The opcode of the MONITORX instruction is "0F 01 FA"; the opcode of the MWAITX instruction is
"0F 01 FB". This opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
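The user-facing intrinsics added for this look roughly like (a sketch; the attribute macro is an assumption):
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_monitorx(void *__p, unsigned __extensions, unsigned __hints)
{
  __builtin_ia32_monitorx(__p, __extensions, __hints);
}

static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_mwaitx(unsigned __extensions, unsigned __hints, unsigned __clock)
{
  __builtin_ia32_mwaitx(__extensions, __hints, __clock);
}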
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper
Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits
Differential Revision: http://reviews.llvm.org/D19796
llvm-svn: 269907
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269746
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269745
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269744
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269743
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269742
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269741
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269740
This is a mostly mechanical change accomplished with a script. I tried to split out any changes to the typecasts that already existed into separate commits.
llvm-svn: 269739
Added doxygen comments to avxintrin.h's intrinsics. As of now, only around 50% of the intrinsics in this file are documented here. The patches for the other half will be sent out later.
Updated bmiintrin.h to fix an incorrect section name.
Updated f16cintrin.h to fix incorrect parameter names.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 269718
Visual Studio's C++ standard library headers include intrin.h, so the intrinsic
headers get included a lot more often in Microsoft mode than elsewhere. The
AVX512 intrinsics are a lot of code (0.7 MB, causing 30% compile time overhead
for small programs including e.g. <string> and 6% compile time overhead for
larger projects like e.g. v8). Since multiversioning can't be relied on in
Microsoft mode (cl.exe doesn't support it), having faster compiles seems like
the much better tradeoff until we have a better intrinsic story going forward
(which we'll need for e.g. PR19898).
Actually using intrinsics on Windows already requires the right /arch:
settings, so this patch should have no big behavior change.
See also thread "The intrinsics headers (especially avx512) are too big. What
to do about it?" on cfe-dev.
http://reviews.llvm.org/D20291
llvm-svn: 269675
(1) Removed \code.. \endcode tags around the instruction name. This matches the doxygen format for all other intrinsics.
(2) Improved the formatting of the comments (to fit into 80 columns more compactly).
llvm-svn: 267676
Since r265060 LLVM infers correct __nvvm_reflect attributes, so
explicit declaration of __nvvm_reflect() is no longer needed.
Differential Revision: http://reviews.llvm.org/D19074
llvm-svn: 267062
xmmintrin.h a bit more directed. If for whatever reason modules are enabled but
we textually include one of these headers, don't deploy the special case for
modules. To make this work cleanly, extend __building_module to be defined
even when modules are disabled.
llvm-svn: 266945
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Paul Robinson.
llvm-svn: 265844
Summary:
See comments in patch; we were assuming that some stdlib math functions
would be defined in namespace std, when in fact the spec says they
should be defined in the global namespace. libstdc++4.9 became more
conforming and broke us.
This new implementation seems to cover the known knowns.
Reviewers: rsmith
Subscribers: cfe-commits, tra
Differential Revision: http://reviews.llvm.org/D18882
llvm-svn: 265751
The module.modulemap file in the lib/Headers directory was missing the LLVM
copyright notice. This patch adds the copyright notice just like the rest of
the files in this directory.
Differential Revision: http://reviews.llvm.org/D18709
llvm-svn: 265325
Summary:
This is necessary for a future patch which will make all constexpr
functions implicitly host+device. cmath may declare constexpr
functions, but these we do *not* want to be host+device. The forward
declares added in this patch prevent this (because the rule will be,
constexpr functions become implicitly host+device unless they're
preceded by a decl with __device__).
Reviewers: tra
Subscribers: cfe-commits, rnk, rsmith
Differential Revision: http://reviews.llvm.org/D18539
llvm-svn: 264963
Summary:
We decided this makes life too difficult for code authors. For example,
people may want to detect NVCC and disable variadic templates, which
NVCC does not support, but which we do.
Since people are going to have to change compiler flags *anyway* in
order to compile with clang, if they really want the old behavior, they
can pass -D__NVCC__.
Tested with tensorflow and thrust, no apparent problems.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18417
llvm-svn: 264205
Only around 25% of the intrinsics in this file are documented here. The patches for the rest will be sent out later.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.
llvm-svn: 263175
Only half of the intrinsics in this file are documented here. The patch for the other half will be sent out later.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.
llvm-svn: 263098
Includes a new built-in, conversion of the built-in to a target-independent intrinsic,
and an update to the header file. Tests are also updated. There is a second part in
the backend for which I will post a separate code-review. BACKEND PART SHOULD BE
COMMITTED FIRST.
Phabricator: http://reviews.llvm.org/D17816
llvm-svn: 263051
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.
llvm-svn: 262895
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.
llvm-svn: 262565
The doxygen comments are automatically generated based on Sony's intrinsics documentation.
Differential Revision: http://reviews.llvm.org/D17550
llvm-svn: 262385
Issue: https://llvm.org/bugs/show_bug.cgi?id=26720
Fix a compile error when building ffmpeg for PowerPC64LE, caused by some
vec_vsx_ld/vec_vsx_st intrinsics not being supported by current clang.
New added intrinsics:
(vector) {signed|unsigned} {short|char} vec_vsx_ld: (total: 8)
bool vec_vsx_ld: (total: 1)
(vector) {signed|unsigned} {short|char} vec_vsx_st: (total: 8)
bool vec_vsx_st: (total: 1)
Total: 18 intrinsics
Phabricator: http://reviews.llvm.org/D17637
llvm-svn: 262359
Adds a number of constants, defined in the ARM EHABI spec, to the Clang
lib/Headers/unwind.h header. This is prerequisite for landing
http://reviews.llvm.org/D15781, as previously discussed there.
Patch by Timon Van Overveldt.
llvm-svn: 262178
Summary:
This lets you write, e.g.
uint3 a = threadIdx;
uint3 b = blockIdx;
dim3 c = gridDim;
dim3 d = blockDim;
which is legal in nvcc, but was not legal in clang.
The fact that e.g. the type of threadIdx is not actually uint3 is still
observable, but now you have to try to observe it.
Reviewers: tra
Subscribers: echristo, cfe-commits
Differential Revision: http://reviews.llvm.org/D17561
llvm-svn: 261777
Summary:
curand.h includes curand_mtgp32_kernel.h. In host mode, this header
redefines threadIdx and blockDim, giving them their "proper" types of
uint3 and dim3, respectively.
clang has its own plan for these variables -- their types are magic
builtin classes. So these redefinitions are incompatible.
As a hack, we force-include the offending CUDA header and use #defines
to get the right types for threadIdx and blockDim.
Reviewers: tra
Subscribers: echristo, cfe-commits
Differential Revision: http://reviews.llvm.org/D17562
llvm-svn: 261776
Fixing a problem with the lib/include/avx512vlintrin.h file.
Adding one more underscore to the prefix: _extension__ -> __extension__.
Differential Revision: http://reviews.llvm.org/D16985
llvm-svn: 261518
The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D16562
llvm-svn: 259275
CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In a few
cases we use clang builtins.
Tested out-of tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing
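The pattern used is roughly (hypothetical sketch, simplified from the real header):
#define __DEVICE__ static __device__ __inline__
__DEVICE__ float sqrt(float __x) { return ::sqrtf(__x); }   // forwards to CUDA's sqrtf
__DEVICE__ float fabs(float __x) { return ::fabsf(__x); }   // forwards to CUDA's fabsf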
Differential Revision: http://reviews.llvm.org/D16593
llvm-svn: 258880
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
"This is the way [autoconf] ends
Not with a bang but a whimper."
-T.S. Eliot
Reviewers: chandlerc, grosbach, bob.wilson, echristo
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D16472
llvm-svn: 258862
The Intel manual documents both an unsigned form (_mm_popcnt_u32)
and a signed form (_popcnt32) of the intrinsic. Add the missing signed form.
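The added form is roughly (a sketch; the attribute macro is an assumption):
static __inline__ int __attribute__((__always_inline__, __nodebug__))
_popcnt32(int __A)
{
  return __builtin_popcount(__A);
}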
Differential Revision: http://reviews.llvm.org/D15568
llvm-svn: 256121
* Pull in host-only implementations of a few CUDA-specific math functions.
* #include <cmath> early to prevent its inclusion from CUDA headers after
they've messed with the __THROW macro.
llvm-svn: 255933
Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to compiler which leads to
compiler including real cuda_runtime.h from there instead
of the wrapper we need.
Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.
Differential Revision: http://reviews.llvm.org/D15534
llvm-svn: 255802
This more closely matches their locations as described by Intel
documentation, and lets us remove a pair of redundant typedefs.
Differential Revision: http://reviews.llvm.org/D15127
llvm-svn: 254528
Use undefined instead of setzero as the pass-through input since it's going to be fully overwritten. Use cmpeq of two zero vectors to produce the all-1s vector. Casting -1 to a double and vectorizing causes a constant load of a -1.0 floating point value.
llvm-svn: 254389
Header files that come with CUDA are assuming split host/device
compilation and are not usable by clang out of the box.
With a bit of preprocessor magic it's possible to twist them
into something clang can use.
This wrapper always includes CUDA headers exactly the same way during
host and device compilation passes and produces identical preprocessed
content during host and device side compilation for sm_35 GPUs. Device
compilation passes for older GPUs will see a smaller subset of device
functions supported by the particular GPU.
The wrapper assumes specific contents of CUDA header files and works
only with CUDA 7.0 and 7.5.
Differential Revision: http://reviews.llvm.org/D13171
llvm-svn: 253388
The tzcnt intrinsics are used on non-BMI targets by code (e.g. ffmpeg)
that uses it as a potentially faster BSF.
The TZCNT instruction is special in that it's encoded in a
backward-compatible way and behaves as BSF on non-BMI targets.
Differential Revision: http://reviews.llvm.org/D14748
llvm-svn: 253358
These two intrinsics are defined in arm_acle.h.
__rev16l needs to rotate by 16 bits, but it was actually rotating by 2 bits.
For AArch64, where long is 64 bits, this would still be wrong.
__rev16ll was incorrect: it reversed the bytes in each 32-bit word, rather than
each 16-bit halfword. The correct implementation is to apply __rev16 to the top
and bottom words of the 64-bit value.
For AArch32 targets, these get compiled down to the hardware rev16 instruction
at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
32-bit rev16 instructions, because there is not currently a pattern for the
64-bit rev16 instruction.
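The corrected implementations have roughly this shape (a sketch; __rev and __ror are the other arm_acle.h helpers, and the exact code may differ):
static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
__rev16(uint32_t __t) {
  /* swap the bytes within each 16-bit halfword */
  return __ror(__rev(__t), 16);
}

static __inline__ uint64_t __attribute__((__always_inline__, __nodebug__))
__rev16ll(uint64_t __t) {
  /* apply __rev16 to the top and bottom 32-bit words independently */
  return (((uint64_t)__rev16((uint32_t)(__t >> 32))) << 32) | (uint64_t)__rev16((uint32_t)__t);
}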
Differential Revision: http://reviews.llvm.org/D14609
llvm-svn: 253211
only one of a group of possibilities.
This changes the syntax in the builtin files to represent:
',' as the and operator
'|' as the or operator
The former syntax matches how the backend tablegen files represent
multiple subtarget features being required.
Updated the builtin and intrinsic headers accordingly for the new
syntax.
llvm-svn: 251388
According to the Intel documentation, the mask operand of the maskload and
maskstore intrinsics is always a vector of packed integer/long integer values.
This patch introduces the following two changes:
1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h.
2. It changes BuiltinsX86.def to match the correct gcc definitions for avx
maskload/store (see D13861 for more details).
Differential Revision: http://reviews.llvm.org/D13861
llvm-svn: 250816
test that our intrinsics behave the same under -fsigned-char and
-funsigned-char.
This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit
elements, and has for a long time. This is fixed in the same way as
SSE4 handles the case.
The other ISA extensions currently work correctly because they use
specific instruction intrinsics. As soon as they are rewritten in terms
of generic IR, they will need to add these special casts. I've added the
necessary testing to catch this however, so we shouldn't have to chase
it down again.
I considered changing the core typedef to be signed, but that seems like
a bad idea. Notably, it would be an ABI break if anyone is reaching into
the innards of the intrinsic headers and passing __v16qi on an API
boundary. I can't be completely confident that this wouldn't happen due
to a macro expanding in a lambda, etc., so it seems much better to leave
it alone. It also matches GCC's behavior exactly.
A fun side note is that for both GCC and Clang, -funsigned-char really
does change the semantics of __v16qi. To observe this, consider:
% cat x.cc
#include <smmintrin.h>
#include <iostream>
int main() {
__v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
__v16qi b = _mm_set1_epi8(-1);
std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n';
}
% clang++ -o x x.cc && ./x
-1, 1
% clang++ -funsigned-char -o x x.cc && ./x
0, 1
However, while this may be surprising, both Clang and GCC agree.
Differential Revision: http://reviews.llvm.org/D13324
llvm-svn: 249097
recently when we started using direct conversion to model sign
extension. The __v16qi type we use for SSE v16i8 vectors is defined in
terms of 'char' which may or may not be signed! This causes us to
generate pmovsx and pmovzx depending on the setting of -funsigned-char.
This patch just forms an explicitly signed type and uses that to
formulate the sign extension. While this gets the correct behavior
(which we now verify with the enhanced test) this is just the tip of the
iceberg. Now that I know what to look for, I have found errors of this
sort *throughout* our vector code. Fortunately, this is the only
specific place where I know of users actively having their code
miscompiled by Clang due to this, so I'm keeping the fix for those users
minimal and targeted.
I'll be sending a proper email for discussion of how to fix these
systematically, what the implications are, and just how widely broken
this is... From what I can tell, we have never shipped a correct set of
builtin headers for x86 when users rely on -funsigned-char. Oops.
llvm-svn: 248980
This patch corresponds to review:
http://reviews.llvm.org/D13190
Implemented the following interfaces to conform to ELF V2 ABI version 1.1.
vector signed __int128 vec_adde (vector signed __int128, vector signed __int128, vector signed __int128);
vector unsigned __int128 vec_adde (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128);
vector signed __int128 vec_addec (vector signed __int128, vector signed __int128, vector signed __int128);
vector unsigned __int128 vec_addec (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128);
vector signed int vec_addc(vector signed int __a, vector signed int __b);
vector bool char vec_cmpge (vector signed char __a, vector signed char __b);
vector bool char vec_cmpge (vector unsigned char __a, vector unsigned char __b);
vector bool short vec_cmpge (vector signed short __a, vector signed short __b);
vector bool short vec_cmpge (vector unsigned short __a, vector unsigned short __b);
vector bool int vec_cmpge (vector signed int __a, vector signed int __b);
vector bool int vec_cmpge (vector unsigned int __a, vector unsigned int __b);
vector bool char vec_cmple (vector signed char __a, vector signed char __b);
vector bool char vec_cmple (vector unsigned char __a, vector unsigned char __b);
vector bool short vec_cmple (vector signed short __a, vector signed short __b);
vector bool short vec_cmple (vector unsigned short __a, vector unsigned short __b);
vector bool int vec_cmple (vector signed int __a, vector signed int __b);
vector bool int vec_cmple (vector unsigned int __a, vector unsigned int __b);
vector double vec_double (vector signed long long __a);
vector double vec_double (vector unsigned long long __a);
vector bool char vec_eqv(vector bool char __a, vector bool char __b);
vector bool short vec_eqv(vector bool short __a, vector bool short __b);
vector bool int vec_eqv(vector bool int __a, vector bool int __b);
vector bool long long vec_eqv(vector bool long long __a, vector bool long long __b);
vector signed short vec_madd(vector signed short __a, vector signed short __b, vector signed short __c);
vector signed short vec_madd(vector signed short __a, vector unsigned short __b, vector unsigned short __c);
vector signed short vec_madd(vector unsigned short __a, vector signed short __b, vector signed short __c);
vector unsigned short vec_madd(vector unsigned short __a, vector unsigned short __b, vector unsigned short __c);
vector bool long long vec_mergeh(vector bool long long __a, vector bool long long __b);
vector bool long long vec_mergel(vector bool long long __a, vector bool long long __b);
vector bool char vec_nand(vector bool char __a, vector bool char __b);
vector bool short vec_nand(vector bool short __a, vector bool short __b);
vector bool int vec_nand(vector bool int __a, vector bool int __b);
vector bool long long vec_nand(vector bool long long __a, vector bool long long __b);
vector bool char vec_orc(vector bool char __a, vector bool char __b);
vector bool short vec_orc(vector bool short __a, vector bool short __b);
vector bool int vec_orc(vector bool int __a, vector bool int __b);
vector bool long long vec_orc(vector bool long long __a, vector bool long long __b);
vector signed long long vec_sub(vector signed long long __a, vector signed long long __b);
vector signed long long vec_sub(vector bool long long __a, vector signed long long __b);
vector signed long long vec_sub(vector signed long long __a, vector bool long long __b);
vector unsigned long long vec_sub(vector unsigned long long __a, vector unsigned long long __b);
vector unsigned long long vec_sub(vector bool long long __a, vector unsigned long long __b);
vector unsigned long long vec_sub(vector unsigned long long __a, vector bool long long __b);
vector float vec_sub(vector float __a, vector float __b);
unsigned char vec_extract(vector bool char __a, int __b);
signed short vec_extract(vector signed short __a, int __b);
unsigned short vec_extract(vector bool short __a, int __b);
signed int vec_extract(vector signed int __a, int __b);
unsigned int vec_extract(vector bool int __a, int __b);
signed long long vec_extract(vector signed long long __a, int __b);
unsigned long long vec_extract(vector unsigned long long __a, int __b);
unsigned long long vec_extract(vector bool long long __a, int __b);
double vec_extract(vector double __a, int __b);
vector bool char vec_insert(unsigned char __a, vector bool char __b, int __c);
vector signed short vec_insert(signed short __a, vector signed short __b, int __c);
vector bool short vec_insert(unsigned short __a, vector bool short __b, int __c);
vector signed int vec_insert(signed int __a, vector signed int __b, int __c);
vector bool int vec_insert(unsigned int __a, vector bool int __b, int __c);
vector signed long long vec_insert(signed long long __a, vector signed long long __b, int __c);
vector unsigned long long vec_insert(unsigned long long __a, vector unsigned long long __b, int __c);
vector bool long long vec_insert(unsigned long long __a, vector bool long long __b, int __c);
vector double vec_insert(double __a, vector double __b, int __c);
vector signed long long vec_splats(signed long long __a);
vector unsigned long long vec_splats(unsigned long long __a);
vector signed __int128 vec_splats(signed __int128 __a);
vector unsigned __int128 vec_splats(unsigned __int128 __a);
vector double vec_splats(double __a);
int vec_all_eq(vector double __a, vector double __b);
int vec_all_ge(vector double __a, vector double __b);
int vec_all_gt(vector double __a, vector double __b);
int vec_all_le(vector double __a, vector double __b);
int vec_all_lt(vector double __a, vector double __b);
int vec_all_nan(vector double __a);
int vec_all_ne(vector double __a, vector double __b);
int vec_all_nge(vector double __a, vector double __b);
int vec_all_ngt(vector double __a, vector double __b);
int vec_any_eq(vector double __a, vector double __b);
int vec_any_ge(vector double __a, vector double __b);
int vec_any_gt(vector double __a, vector double __b);
int vec_any_le(vector double __a, vector double __b);
int vec_any_lt(vector double __a, vector double __b);
int vec_any_ne(vector double __a, vector double __b);
vector unsigned char vec_sbox_be (vector unsigned char);
vector unsigned char vec_cipher_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_cipherlast_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_ncipher_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_ncipherlast_be (vector unsigned char, vector unsigned char);
vector unsigned int vec_shasigma_be (vector unsigned int, const int, const int);
vector unsigned long long vec_shasigma_be (vector unsigned long long, const int, const int);
vector unsigned short vec_pmsum_be (vector unsigned char, vector unsigned char);
vector unsigned int vec_pmsum_be (vector unsigned short, vector unsigned short);
vector unsigned long long vec_pmsum_be (vector unsigned int, vector unsigned int);
vector unsigned __int128 vec_pmsum_be (vector unsigned long long, vector unsigned long long);
vector unsigned char vec_gb (vector unsigned char);
vector unsigned long long vec_bperm (vector unsigned __int128 __a, vector unsigned char __b);
Removed the following interfaces either because their signatures have changed
in version 1.1 of the ABI or because they were implemented for ELF V2 ABI but
have actually been deprecated in version 1.1.
vector signed char vec_eqv(vector bool char __a, vector signed char __b);
vector signed char vec_eqv(vector signed char __a, vector bool char __b);
vector unsigned char vec_eqv(vector bool char __a, vector unsigned char __b);
vector unsigned char vec_eqv(vector unsigned char __a, vector bool char __b);
vector signed short vec_eqv(vector bool short __a, vector signed short __b);
vector signed short vec_eqv(vector signed short __a, vector bool short __b);
vector unsigned short vec_eqv(vector bool short __a, vector unsigned short __b);
vector unsigned short vec_eqv(vector unsigned short __a, vector bool short __b);
vector signed int vec_eqv(vector bool int __a, vector signed int __b);
vector signed int vec_eqv(vector signed int __a, vector bool int __b);
vector unsigned int vec_eqv(vector bool int __a, vector unsigned int __b);
vector unsigned int vec_eqv(vector unsigned int __a, vector bool int __b);
vector signed long long vec_eqv(vector bool long long __a, vector signed long long __b);
vector signed long long vec_eqv(vector signed long long __a, vector bool long long __b);
vector unsigned long long vec_eqv(vector bool long long __a, vector unsigned long long __b);
vector unsigned long long vec_eqv(vector unsigned long long __a, vector bool long long __b);
vector float vec_eqv(vector bool int __a, vector float __b);
vector float vec_eqv(vector float __a, vector bool int __b);
vector double vec_eqv(vector bool long long __a, vector double __b);
vector double vec_eqv(vector double __a, vector bool long long __b);
vector unsigned short vec_nand(vector bool short __a, vector unsigned short __b);
llvm-svn: 248813
Before, clang's internal assembler would reject the inline asm in clang's
Intrin.h. To make sure this doesn't happen for other Intrin.h functions using
__asm__ blocks, add 32-bit and 64-bit codegen tests for Intrin.h.
Sadly, these tests discovered that __readcr3 and __writecr3 have bad
implementations in 64-bit builds. This will have to be fixed in a follow-up.
llvm-svn: 248234
128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds.
This patch removes the builtins and reimplements the _mm_cvtepi*_epi* intrinsics using __builtin_shufflevector (to extract the bottommost subvector) and __builtin_convertvector (to actually perform the sign extension).
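The resulting pattern looks roughly like this (a sketch; __v16qs is assumed to be the explicitly signed char vector typedef):
static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cvtepi8_epi16(__m128i __V)
{
  /* take the low eight 8-bit elements, then sign-extend each to 16 bits */
  return (__m128i)__builtin_convertvector(
      __builtin_shufflevector((__v16qs)__V, (__v16qs)__V, 0, 1, 2, 3, 4, 5, 6, 7), __v8hi);
}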
Differential Revision: http://reviews.llvm.org/D12835
llvm-svn: 248092
convert i64 to FP and vice versa
reduceps & reducepd
rangeps & rangepd
all in their 512bit versions
Differential Revision: http://reviews.llvm.org/D11716
llvm-svn: 247881
As discussed in PR23648 - the intrinsics _m_from_int, _m_to_int and _m_prefetch are defined in mmintrin.h and prfchwintrin.h so we don't need to in Intrin.h
Added tests for _m_from_int and _m_to_int
D11338 already added a test for _m_prefetch
Differential Revision: http://reviews.llvm.org/D12272
llvm-svn: 245975
_rotl, _rotwl and _lrotl (and their right-rotate counterparts) are official x86
intrinsics, and should be supported regardless of environment. This is in contrast
to _rotl8, _rotl16, and _rotl64 which are MS-specific.
Note that the MS documentation for _lrotl is different from the Intel
documentation. Intel explicitly documents it as a 64-bit rotate, while for MS,
since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is implied.
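For illustration, a rotate of this family can be written as (hypothetical sketch, not the exact header code):
static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__))
_rotl(unsigned int __x, int __shift)
{
  __shift &= 0x1f;
  return __shift ? (__x << __shift) | (__x >> (32 - __shift)) : __x;
}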
Differential Revision: http://reviews.llvm.org/D12271
llvm-svn: 245923
This lets us optimize them better. We agreed to remove the intrinsics,
instead of combining them later, as, at -O0, we generate the expected
instructions. Plus, it's a nice cleanup.
Differential Revision: http://reviews.llvm.org/D10556
llvm-svn: 245605
This patch adds support for the System Z vector built-in functions.
The API-defined header file has the name vecintrin.h.
The user-level functions are defined in the same style as the clang
version of altivec.h, making heavy use of the __overloadable__ and
__always_inline__ attributes. Where possible the functions expand to
generic operations rather than specific built-in functions, in the hope
that that form can be optimised better.
Where a built-in routine is specified to require an immediate integer
argument, the __enable_if__ attribute is used to verify the argument is
in fact constant and in the appropriate range.
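The __enable_if__ pattern referred to is, schematically (a hypothetical example, not actual vecintrin.h content):
static inline __attribute__((__always_inline__, __overloadable__))
int sample_builtin(int __value, int __index)
  __attribute__((__enable_if__(__index >= 0 && __index <= 3,
                               "index must be a constant integer in the range 0..3")));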
Based on a patch by Richard Sandiford.
llvm-svn: 243643
The 3DNOW/PRFCHW cpu targets define both the PREFETCHW (set cache line modified) and PREFETCH (set cache line exclusive) instructions but only the _m_prefetchw (PREFETCHW) intrinsic is included in the header. This patch adds the missing _m_prefetch intrinsic.
I'm basing this off AMD documentation - the Intel docs on the support for PREFETCHW aren't clear on whether Silvermont/Broadwell properly support PREFETCH, but given that the intrinsic implementation is a default __builtin_prefetch call, it is safe either way.
Fix for PR23648
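The added intrinsic is roughly (a sketch; the attribute macro is an assumption):
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_m_prefetch(void *__P)
{
  __builtin_prefetch(__P, 0 /* read */, 3 /* _MM_HINT_T0 */);
}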
Differential Revision: http://reviews.llvm.org/D11338
llvm-svn: 243305
_ReadBarrier, _WriteBarrier, and _ReadWriteBarrier are essentially
memory barriers of one form or another. Model these as
atomic_signal_fence(ATOMIC_SEQ_CST).
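That is, each of these ends up as roughly (sketch):
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_ReadWriteBarrier(void)
{
  /* compiler-only barrier: no fence instruction is emitted */
  __atomic_signal_fence(__ATOMIC_SEQ_CST);
}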
__faststorefence is a curious intrinsic. Its single purpose seems to be as
an alternative to mfence when that instruction is slow. However, mfence
is not always slow and is, in general, preferable to a 'lock or'
sequence on certain CPUs. Give the compiler freedom to select the best
sequence to get a fence.
llvm-svn: 242378
The vec_sld interface provides access to the vsldoi instruction.
Unlike most of the vec_* interfaces, we do not attempt to change the
generated code for vec_sld based on the endian mode. It is too
difficult to correctly infer the desired semantics because of
different element types, and the corrected instruction sequence is
expensive, involving loading a permute control vector and performing a
generalized permute.
For GCC, this was implemented as "Don't touch the vec_sld"
implementation. When it came time for the LLVM implementation, I did
the same thing. However, this was hasty and incorrect. In LLVM's
version of altivec.h, vec_sld was previously defined in terms of the
vec_perm interface. Because vec_perm semantics are adjusted for
little endian, this means that leaving vec_sld untouched causes it to
generate something different for LE than for BE. Not good.
This patch adjusts the form of vec_perm that is used for vec_sld and
vec_vsldoi, effectively undoing the modifications so that the same
vsldoi instruction will be generated for both BE and LE.
There is an accompanying back-end patch to take care of some small
ripple effects caused by these changes.
llvm-svn: 242297
This patch corresponds to review:
http://reviews.llvm.org/D11184
A number of new interfaces for altivec.h (as mandated by the ABI):
vector float vec_cpsgn(vector float, vector float)
vector double vec_cpsgn(vector double, vector double)
vector double vec_or(vector bool long long, vector double)
vector double vec_or(vector double, vector bool long long)
vector double vec_re(vector double)
vector signed char vec_cntlz(vector signed char)
vector unsigned char vec_cntlz(vector unsigned char)
vector short vec_cntlz(vector short)
vector unsigned short vec_cntlz(vector unsigned short)
vector int vec_cntlz(vector int)
vector unsigned int vec_cntlz(vector unsigned int)
vector signed long long vec_cntlz(vector signed long long)
vector unsigned long long vec_cntlz(vector unsigned long long)
vector signed char vec_nand(vector bool signed char, vector signed char)
vector signed char vec_nand(vector signed char, vector bool signed char)
vector signed char vec_nand(vector signed char, vector signed char)
vector unsigned char vec_nand(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector unsigned char)
vector short vec_nand(vector bool short, vector short)
vector short vec_nand(vector short, vector bool short)
vector short vec_nand(vector short, vector short)
vector unsigned short vec_nand(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector unsigned short)
vector int vec_nand(vector bool int, vector int)
vector int vec_nand(vector int, vector bool int)
vector int vec_nand(vector int, vector int)
vector unsigned int vec_nand(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector unsigned int)
vector signed long long vec_nand(vector bool long long, vector signed long long)
vector signed long long vec_nand(vector signed long long, vector bool long long)
vector signed long long vec_nand(vector signed long long, vector signed long long)
vector unsigned long long vec_nand(vector bool long long, vector unsigned long long)
vector unsigned long long vec_nand(vector unsigned long long, vector bool long long)
vector unsigned long long vec_nand(vector unsigned long long, vector unsigned long long)
vector signed char vec_orc(vector bool signed char, vector signed char)
vector signed char vec_orc(vector signed char, vector bool signed char)
vector signed char vec_orc(vector signed char, vector signed char)
vector unsigned char vec_orc(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector unsigned char)
vector short vec_orc(vector bool short, vector short)
vector short vec_orc(vector short, vector bool short)
vector short vec_orc(vector short, vector short)
vector unsigned short vec_orc(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector unsigned short)
vector int vec_orc(vector bool int, vector int)
vector int vec_orc(vector int, vector bool int)
vector int vec_orc(vector int, vector int)
vector unsigned int vec_orc(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector unsigned int)
vector signed long long vec_orc(vector bool long long, vector signed long long)
vector signed long long vec_orc(vector signed long long, vector bool long long)
vector signed long long vec_orc(vector signed long long, vector signed long long)
vector unsigned long long vec_orc(vector bool long long, vector unsigned long long)
vector unsigned long long vec_orc(vector unsigned long long, vector bool long long)
vector unsigned long long vec_orc(vector unsigned long long, vector unsigned long long)
vector signed char vec_div(vector signed char, vector signed char)
vector unsigned char vec_div(vector unsigned char, vector unsigned char)
vector signed short vec_div(vector signed short, vector signed short)
vector unsigned short vec_div(vector unsigned short, vector unsigned short)
vector signed int vec_div(vector signed int, vector signed int)
vector unsigned int vec_div(vector unsigned int, vector unsigned int)
vector signed long long vec_div(vector signed long long, vector signed long long)
vector unsigned long long vec_div(vector unsigned long long, vector unsigned long long)
vector unsigned char vec_mul(vector unsigned char, vector unsigned char)
vector unsigned int vec_mul(vector unsigned int, vector unsigned int)
vector unsigned long long vec_mul(vector unsigned long long, vector unsigned long long)
vector unsigned short vec_mul(vector unsigned short, vector unsigned short)
vector signed char vec_mul(vector signed char, vector signed char)
vector signed int vec_mul(vector signed int, vector signed int)
vector signed long long vec_mul(vector signed long long, vector signed long long)
vector signed short vec_mul(vector signed short, vector signed short)
vector signed long long vec_mergeh(vector signed long long, vector signed long long)
vector signed long long vec_mergeh(vector signed long long, vector bool long long)
vector signed long long vec_mergeh(vector bool long long, vector signed long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergeh(vector bool long long, vector unsigned long long)
vector double vec_mergeh(vector double, vector double)
vector double vec_mergeh(vector double, vector bool long long)
vector double vec_mergeh(vector bool long long, vector double)
vector signed long long vec_mergel(vector signed long long, vector signed long long)
vector signed long long vec_mergel(vector signed long long, vector bool long long)
vector signed long long vec_mergel(vector bool long long, vector signed long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergel(vector bool long long, vector unsigned long long)
vector double vec_mergel(vector double, vector double)
vector double vec_mergel(vector double, vector bool long long)
vector double vec_mergel(vector bool long long, vector double)
vector signed int vec_pack(vector signed long long, vector signed long long)
vector unsigned int vec_pack(vector unsigned long long, vector unsigned long long)
vector bool int vec_pack(vector bool long long, vector bool long long)
llvm-svn: 242171
add 2 bits to ObjCOrBuiltinID (changed from 11 bits to 13 bits), see discussion in
Add support for new intrinsics that are already covered by the BE.
All the intrinsics are covered by tests
Differential Revision: http://reviews.llvm.org/D10893
llvm-svn: 242144
Three things:
- The atomic intrinsics mandate memory barriers; let's start emitting
some.
- We don't need to manually create RMW operations, we can just do
__atomic_fetch_foo instead of performing __atomic_foo_fetch and
undoing foo.
- Don't use inline assembly, we don't need it for these intrinsics.
This fixes PR24101.
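For example, an RMW intrinsic now reduces to (a sketch; the attribute spelling is an assumption):
static __inline__ long __attribute__((__always_inline__, __nodebug__))
_InterlockedExchangeAdd(long volatile *_Addend, long _Value)
{
  /* fetch-then-add returns the old value directly: no undo step, no inline asm */
  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_SEQ_CST);
}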
llvm-svn: 242009
This patch corresponds to review:
http://reviews.llvm.org/D10972
Fix for the handling of dependent features that are enabled by default
on some CPU's (such as -mvsx, -mpower8-vector).
Also provides a number of new interfaces or fixes existing ones in
altivec.h.
Changed signatures to conform to ABI:
vector short vec_perm(vector signed short, vector signed short, vector unsigned char)
vector int vec_perm(vector signed int, vector signed int, vector unsigned char)
vector long long vec_perm(vector signed long long, vector signed long long, vector unsigned char)
vector signed char vec_sld(vector signed char, vector signed char, const int)
vector unsigned char vec_sld(vector unsigned char, vector unsigned char, const int)
vector bool char vec_sld(vector bool char, vector bool char, const int)
vector unsigned short vec_sld(vector unsigned short, vector unsigned short, const int)
vector signed short vec_sld(vector signed short, vector signed short, const int)
vector signed int vec_sld(vector signed int, vector signed int, const int)
vector unsigned int vec_sld(vector unsigned int, vector unsigned int, const int)
vector float vec_sld(vector float, vector float, const int)
vector signed char vec_splat(vector signed char, const int)
vector unsigned char vec_splat(vector unsigned char, const int)
vector bool char vec_splat(vector bool char, const int)
vector signed short vec_splat(vector signed short, const int)
vector unsigned short vec_splat(vector unsigned short, const int)
vector bool short vec_splat(vector bool short, const int)
vector pixel vec_splat(vector pixel, const int)
vector signed int vec_splat(vector signed int, const int)
vector unsigned int vec_splat(vector unsigned int, const int)
vector bool int vec_splat(vector bool int, const int)
vector float vec_splat(vector float, const int)
Added a VSX path to:
vector float vec_round(vector float)
Added interfaces:
vector signed char vec_eqv(vector signed char, vector signed char)
vector signed char vec_eqv(vector bool char, vector signed char)
vector signed char vec_eqv(vector signed char, vector bool char)
vector unsigned char vec_eqv(vector unsigned char, vector unsigned char)
vector unsigned char vec_eqv(vector bool char, vector unsigned char)
vector unsigned char vec_eqv(vector unsigned char, vector bool char)
vector signed short vec_eqv(vector signed short, vector signed short)
vector signed short vec_eqv(vector bool short, vector signed short)
vector signed short vec_eqv(vector signed short, vector bool short)
vector unsigned short vec_eqv(vector unsigned short, vector unsigned short)
vector unsigned short vec_eqv(vector bool short, vector unsigned short)
vector unsigned short vec_eqv(vector unsigned short, vector bool short)
vector signed int vec_eqv(vector signed int, vector signed int)
vector signed int vec_eqv(vector bool int, vector signed int)
vector signed int vec_eqv(vector signed int, vector bool int)
vector unsigned int vec_eqv(vector unsigned int, vector unsigned int)
vector unsigned int vec_eqv(vector bool int, vector unsigned int)
vector unsigned int vec_eqv(vector unsigned int, vector bool int)
vector signed long long vec_eqv(vector signed long long, vector signed long long)
vector signed long long vec_eqv(vector bool long long, vector signed long long)
vector signed long long vec_eqv(vector signed long long, vector bool long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector bool long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector bool long long)
vector float vec_eqv(vector float, vector float)
vector float vec_eqv(vector bool int, vector float)
vector float vec_eqv(vector float, vector bool int)
vector double vec_eqv(vector double, vector double)
vector double vec_eqv(vector bool long long, vector double)
vector double vec_eqv(vector double, vector bool long long)
vector bool long long vec_perm(vector bool long long, vector bool long long, vector unsigned char)
vector double vec_round(vector double)
vector double vec_splat(vector double, const int)
vector bool long long vec_splat(vector bool long long, const int)
vector signed long long vec_splat(vector signed long long, const int)
vector unsigned long long vec_splat(vector unsigned long long, const int)
vector bool int vec_sld(vector bool int, vector bool int, const int)
vector bool short vec_sld(vector bool short, vector bool short, const int)
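A minimal usage sketch of two of the interfaces listed above, assuming a POWER8-capable PowerPC target (e.g. -mcpu=power8) so that altivec.h exposes them:

#include <altivec.h>

vector unsigned int xnor_lanes(vector unsigned int a, vector unsigned int b) {
  /* vec_eqv computes the bitwise equivalence (XNOR) of its operands. */
  return vec_eqv(a, b);
}

vector float broadcast_lane0(vector float v) {
  /* vec_splat replicates the selected element across all lanes; the index
     must be a compile-time constant. */
  return vec_splat(v, 0);
}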
llvm-svn: 241904
instructions introduced in POWER8.
These are the Clang-related changes for http://reviews.llvm.org/D10704
All builtins are added in altivec.h and guarded with the POWER8_VECTOR macro.
Phabricator review: http://reviews.llvm.org/D10736
llvm-svn: 241293
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)
These were previously declared in Intrin.h for MSVC compatibility, but now
that we have them implemented, these declarations can be removed.
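A small usage sketch, assuming an x86 target with the fxsr feature enabled (e.g. -mfxsr); the buffer name is illustrative:

#include <immintrin.h>
#include <stdint.h>

/* The FXSAVE area is 512 bytes and must be 16-byte aligned. */
static _Alignas(16) uint8_t fxsave_area[512];

void checkpoint_fpu_state(void) {
  _fxsave(fxsave_area);   /* store the x87/MMX/SSE state */
}

void restore_fpu_state(void) {
  _fxrstor(fxsave_area);  /* reload the previously saved state */
}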
llvm-svn: 241053
This patch corresponds to review:
http://reviews.llvm.org/D10637
This is the first round of additions of missing builtins listed in the ABI document. More will come (this builds on what seurer already added). This patch adds:
vector signed long long vec_abs(vector signed long long)
vector double vec_abs(vector double)
vector signed long long vec_add(vector signed long long, vector signed long long)
vector unsigned long long vec_add(vector unsigned long long, vector unsigned long long)
vector double vec_add(vector double, vector double)
vector double vec_and(vector bool long long, vector double)
vector double vec_and(vector double, vector bool long long)
vector double vec_and(vector double, vector double)
vector signed long long vec_and(vector signed long long, vector signed long long)
vector double vec_andc(vector bool long long, vector double)
vector double vec_andc(vector double, vector bool long long)
vector double vec_andc(vector double, vector double)
vector signed long long vec_andc(vector signed long long, vector signed long long)
vector double vec_ceil(vector double)
vector bool long long vec_cmpeq(vector double, vector double)
vector bool long long vec_cmpge(vector double, vector double)
vector bool long long vec_cmpge(vector signed long long, vector signed long long)
vector bool long long vec_cmpge(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmpgt(vector double, vector double)
vector bool long long vec_cmple(vector double, vector double)
vector bool long long vec_cmple(vector signed long long, vector signed long long)
vector bool long long vec_cmple(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmplt(vector double, vector double)
vector bool long long vec_cmplt(vector signed long long, vector signed long long)
vector bool long long vec_cmplt(vector unsigned long long, vector unsigned long long)
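A minimal usage sketch combining two of the interfaces above, assuming a VSX-enabled PowerPC target (e.g. -mvsx):

#include <altivec.h>

vector double add_then_ceil(vector double a, vector double b) {
  /* vec_add on vector double and vec_ceil(vector double) are among the
     interfaces added by this patch. */
  return vec_ceil(vec_add(a, b));
}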
llvm-svn: 240821
Before MSVS 2015, the MSVS headers disagree with each other about int32_t, PRIx32, and so on.
Provide a wrapper header to fix this, so that -Wformat can still be used.
Fixes PR23412.
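A small illustration of the kind of code this keeps -Wformat-clean on MSVC:

#include <inttypes.h>
#include <stdio.h>

void print_id(uint32_t id) {
  /* With the wrapper header, PRIx32 expands to a conversion specifier that
     actually matches uint32_t, so -Wformat does not warn. */
  printf("id=%" PRIx32 "\n", id);
}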
llvm-svn: 240741
Ever since the target attributes change, we don't need to guard these
headers with `requires`. Actually it's a bit worse than unnecessary: if we
do guard them, they are included textually under the covers, causing
declarations to appear in submodules they aren't supposed to be in.
llvm-svn: 240720
This involved removing the conditional inclusions and replacing them
with target attributes matching the original conditional inclusion
and checks. The testcase update removes the macro checks for each
file and replaces them with usage of the __target__ attribute, e.g.:
int __attribute__((__target__("sse3"))) foo(int a) {
  _mm_mwait(0, 0);
  return 4;
}
This usage does require that the enclosing function have the requisite
__target__ attribute for inlining and code generation, and the same applies
to any macro intrinsic uses in the enclosing function. There's no change
for existing uses of the intrinsic headers.
llvm-svn: 239883
in section 10.1, __arm_{w,r}sr{,p,64}.
This includes arm_acle.h definitions with builtins and codegen to support
them; the intrinsics are implemented by generating read_register/write_register
calls, which are lowered appropriately in the backend based on the register
string provided. Sema checking is also implemented to diagnose invalid parameters.
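A minimal sketch of the 32-bit ARM form, using the coprocessor-encoded register string described in the ACLE; the particular encoding below is only illustrative:

#include <arm_acle.h>
#include <stdint.h>

uint32_t read_thread_id_register(void) {
  /* __arm_rsr expands to __builtin_arm_rsr and is lowered to a
     read_register call; the backend emits the MRC described by the
     "cp<n>:<op1>:c<CRn>:c<CRm>:<op2>" string. */
  return __arm_rsr("cp15:0:c13:c0:3");
}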
Differential Revision: http://reviews.llvm.org/D9697
llvm-svn: 239737
This patch corresponds to review:
http://reviews.llvm.org/D10095
This is for just two instructions and related builtins:
vbpermq
vgbbd
llvm-svn: 239506
We would crash in the DeclPrinter when trying to pretty-print the
static_assert message, because C++1z-style assertions don't have one.
This fixes PR23756.
llvm-svn: 239170
in POWER8.
These are the Clang-related changes for http://reviews.llvm.org/D9081
vadduqm
vaddeuqm
vaddcuq
vaddecuq
vsubuqm
vsubeuqm
vsubcuq
vsubecuq
All builtins are added in altivec.h, and guarded with the POWER8_VECTOR and
powerpc64 macros.
http://reviews.llvm.org/D9903
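A minimal usage sketch, assuming a 64-bit PowerPC target with POWER8 vector support (e.g. -mcpu=power8), where altivec.h exposes the quadword overloads:

#include <altivec.h>

vector unsigned __int128 add128(vector unsigned __int128 a,
                                vector unsigned __int128 b) {
  /* The 128-bit overload of vec_add lowers to vadduqm, the quadword
     modulo add listed above. */
  return vec_add(a, b);
}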
llvm-svn: 238145
This patch adds support for the following new instructions in the
Power ISA 2.07:
vpksdss
vpksdus
vpkudus
vpkudum
vupkhsw
vupklsw
These instructions are available through the vec_packs, vec_packsu,
vec_unpackh, and vec_unpackl built-in interfaces. These are
lane-sensitive instructions, so the built-ins have different
implementations for big- and little-endian, and the instructions must
be marked as killing the vector swap optimization for now.
The first three instructions perform saturating pack operations. The
fourth performs a modulo pack operation, which means it can be
represented with a vector shuffle, and conversely the appropriate
vector shuffles may cause this instruction to be generated. The other
instructions are only generated via built-in support for now.
I noticed during patch preparation that the macro __VSX__ was not
previously predefined when the power8-vector or direct-move features
were requested. This is an error, and I've corrected it here as
well.
Appropriate tests have been added.
There is a companion patch to llvm for the rest of this support.
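A minimal usage sketch of the saturating doubleword pack described above, assuming -mcpu=power8:

#include <altivec.h>

vector signed int pack_saturating(vector signed long long a,
                                  vector signed long long b) {
  /* vec_packs on doubleword inputs lowers to vpksdss; because the
     instruction is lane-sensitive, the header uses a different expansion
     on little-endian to preserve element order. */
  return vec_packs(a, b);
}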
llvm-svn: 237500
xmmintrin.h includes emmintrin.h and vice versa if SSE2 is enabled. We break
this cycle for a modules build, and instead make the xmmintrin.h module
re-export the immintrin.h module. Also included is a fix for an assertion
failure in the serialization code when a module exports another module that
was declared later in the same module map.
llvm-svn: 237321