llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	a2137d6e09	[X86] Add -flax-vector-conversions=none to all of the x86 vector intrinsic header tests.	2020-01-23 20:43:50 -08:00
Richard Smith	9864269a0d	Fix reliance on lax vector conversions in tests for x86 intrinsics. llvm-svn: 372062	2019-09-17 03:56:28 +00:00
JF Bastien	dbc0a5df8d	Allow prefetching from non-zero address spaces Summary: This is useful for targets which have prefetch instructions for non-default address spaces. <rdar://problem/42662136> Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65254 llvm-svn: 367032	2019-07-25 16:11:57 +00:00
Craig Topper	caf6b71ab2	[X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669	2019-07-10 17:11:29 +00:00
Craig Topper	f9cb127ca9	[X86] Add guards to some of the x86 intrinsic tests to skip 64-bit mode only intrinsics when compiled for 32-bit mode. All the command lines are for 64-bit mode, but sometimes I compile the tests in 32-bit mode to see what assembly we get and we need to skip these to do that. llvm-svn: 365668	2019-07-10 17:11:23 +00:00
Reid Kleckner	79d7f4114d	[X86] Use __m128_u for _mm_loadu_ps after r353555 Add secondary triple to existing SSE test for it. I audited other uses of __attribute__((__packed__)) in the intrinsic headers, and this seemed to be the only missing one. llvm-svn: 353878	2019-02-12 21:04:21 +00:00
Craig Topper	638426fc36	[X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that are suitable for use in scalar mask intrinsics. This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the "(__U & 1)" and ternary operator used in some of the intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics. This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling. I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle. llvm-svn: 336622	2018-07-10 00:37:25 +00:00
Tomasz Krupa	f1792bb3d6	[X86] Lowering sqrt intrinsics to native IR Reviewers: craig.topper, spatel, RKSimon, igorb, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, cfe-commits Differential Revision: https://reviews.llvm.org/D41168 llvm-svn: 334850	2018-06-15 18:05:59 +00:00
Craig Topper	c5ec55e921	[X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss. We don't need the insertion back into the original vector at the end. The builtin already understands that. This is different than _mm_sqrt_sd which takes two arguments and we do need to insert. llvm-svn: 333572	2018-05-30 18:27:07 +00:00
Gabor Buella	5219ed89be	[X86] NFC Include immintrin.h in CodeGen tests Following r333110: "Move all Intel defined intrinsic includes into immintrin.h" llvm-svn: 333160	2018-05-24 07:09:08 +00:00
Sanjay Patel	e795daa55e	[x86] these aren't the undefs you're looking for (PR32176) x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. This is not the same as LLVM's undef value which can take on multiple bit patterns. There are better solutions / follow-ups to this discussed here: https://bugs.llvm.org/show_bug.cgi?id=32176 ...but this should prevent miscompiles with a one-line code change. Differential Revision: https://reviews.llvm.org/D30834 llvm-svn: 297588	2017-03-12 19:15:10 +00:00
Elad Cohen	b107a22afb	[X86] Remove the mm_malloc.h include guard hack from the X86 builtins tests The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted environment since it expects stdlib.h to be available - which is not the case in these internal clang codegen tests). This patch removes this hack and instead passes -ffreestanding to clang cc1. Differential Revision: https://reviews.llvm.org/D24825 llvm-svn: 282581	2016-09-28 11:59:09 +00:00
Eric Christopher	abb2b54ad3	After PR28761 use -Wall with -Werror in builtins tests to identify possible problems in headers. llvm-svn: 277696	2016-08-04 06:02:50 +00:00
Simon Pilgrim	e3b9ee0645	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. Differential Revision: https://reviews.llvm.org/D22105 llvm-svn: 276102	2016-07-20 10:18:01 +00:00
Sanjay Patel	280cfd1a69	[x86] translate SSE packed FP comparison builtins to IR As noted in the code comment, a potential follow-on would be to remove the builtins themselves. Other than ord/unord, this already works as expected. Eg: typedef float v4sf __attribute__((__vector_size__(16))); v4sf fcmpgt(v4sf a, v4sf b) { return a > b; } Differential Revision: http://reviews.llvm.org/D21268 llvm-svn: 272840	2016-06-15 21:20:04 +00:00
Simon Pilgrim	0e90936fea	[X86] Ensure load/store tests unaligned pointers really are align 1 llvm-svn: 271227	2016-05-30 19:20:55 +00:00
Simon Pilgrim	43439bd33d	[X86][SSE] Added missing tests (merge failure) Differential Revision: http://reviews.llvm.org/D20617 llvm-svn: 271219	2016-05-30 17:58:38 +00:00
Craig Topper	09175dab31	[X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code that will compile to a native unaligned store. Remove the builtins since they are no longer used. Intrinsics will be removed from llvm in a future commit. llvm-svn: 271214	2016-05-30 17:10:30 +00:00
Simon Pilgrim	7b365bce6f	[X86][SSE] Updated _mm_store_ps1 test to match _mm_store1_ps llvm-svn: 270679	2016-05-25 09:20:08 +00:00
Craig Topper	f70a61ff3f	[X86] Update test cases to make sure storeu builtins use the storeu intrinsics. We were previously matching on other stores in the IR from this being an -O0 test. We should probably look into making the storeu builtins just emit a normal store with an alignment of 1. llvm-svn: 270664	2016-05-25 05:26:23 +00:00
Simon Pilgrim	9b3729b043	[X86][SSE] Sync with llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll sse-builtins.c now just covers SSE1 intrinsics llvm-svn: 270083	2016-05-19 17:11:31 +00:00
Simon Pilgrim	2d1decf7cb	[X86][SSE] Tidied up MMX/SSE/SSE2 builtin tests to the correct test file llvm-svn: 269852	2016-05-17 22:03:31 +00:00
Craig Topper	8ca5373c72	[X86] Fix a few intrinsic tests to use the return type that matches the intrinsic they're testing. llvm-svn: 269735	2016-05-17 03:42:37 +00:00
Craig Topper	fb79b5f273	[X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause. llvm-svn: 252712	2015-11-11 08:13:33 +00:00
Craig Topper	a5455524c2	[X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple intrinsics that were supposed to operate on MMX registers. Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there. llvm-svn: 252711	2015-11-11 08:00:41 +00:00
Simon Pilgrim	0d9d748bf1	[X86][SSSE3] Added SSSE3 IR + assembly codegen builtin tests Transferred SSSE3 instructions from sse-builtins.c llvm-svn: 246948	2015-09-06 17:06:22 +00:00
Simon Pilgrim	ff88a0da31	[X86][SSE3] Added SSE41 IR + assembly codegen builtin tests Transferred SSE41 instructions from sse-builtins.c llvm-svn: 246947	2015-09-06 16:38:17 +00:00
Simon Pilgrim	5aba9925c0	[X86][SSE] Add _mm_undefined_* intrinsics Added missing SSE/AVX 'undefined' intrinsics (PR24040): _mm_undefined_pd, _mm_undefined_ps + _mm_undefined_si128 _mm256_undefined_pd, _mm256_undefined_ps + _mm256_undefined_si256 _mm512_undefined, _mm512_undefined_ps, _mm512_undefined_pd + _mm512_undefined_epi32 Added builtin intrinsics: __builtin_ia32_undef128, __builtin_ia32_undef256 + __builtin_ia32_undef512 Differential Revision: http://reviews.llvm.org/D12052 llvm-svn: 246083	2015-08-26 21:17:12 +00:00
Simon Pilgrim	503976ad9a	Added missing tests for SSE41 pmovsx/pmovzx extension intrinsics llvm-svn: 245815	2015-08-23 16:19:38 +00:00
David Blaikie	a953f2825b	Update Clang tests to handle explicitly typed load changes in LLVM. llvm-svn: 230795	2015-02-27 21:19:58 +00:00
Manuel Klimek	fa27b8861b	Make tests independent of llvm variable naming. llvm-svn: 229484	2015-02-17 09:49:31 +00:00
Craig Topper	96f9a573b5	[X86] Convert palignr builtin handling to use shuffle form of right shift instead of intrinsics. This should allow the intrinsics to be removed from the backend. llvm-svn: 229474	2015-02-17 07:18:01 +00:00
Craig Topper	480e2b6e43	[X86] Merge the 2 separate builtin handlers for PALIGNR into a single one that handles both. llvm-svn: 229469	2015-02-17 06:37:58 +00:00
Craig Topper	4fb4581716	[X86] Fix test cases that I foolishly copied and modified from another file that had optimizations on. This caused the check patterns to not quite match. llvm-svn: 229073	2015-02-13 06:27:39 +00:00
Craig Topper	a462482d98	[X86] Add _mm_bslli_si128 and _mm_bsrli_si128 as aliases of _mm_slli_si128 and _mm_srli_si128. This matches Intel documentation and gcc. llvm-svn: 229066	2015-02-13 06:04:45 +00:00
Craig Topper	2094d8fe88	[x86] Add the (v)cmpps/pd/ss/sd builtins to match gcc. Use them in the sse intrinsic files. These still lower to the same intrinsics as before. This is preparation for bounds checking the immediate on the avx version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller size immediate it's not possible to bounds check when using a shared builtin. Rather than creating a clang specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc. llvm-svn: 224879	2014-12-27 06:59:57 +00:00
Evgeniy Stepanov	2be29929be	Fix line numbers for code inlined from __nodebug__ functions. Instructions from __nodebug__ functions don't have file:line information even when inlined into no-nodebug functions. As a result, intrinsics (SSE and other) from <*intrin.h> clang headers _never_ have file:line information. With this change, an instruction without !dbg metadata gets one from the call instruction when inlined. Fixes PR19001. llvm-svn: 210459	2014-06-09 09:09:19 +00:00
Filipe Cabecinhas	5d289b48b1	Patched clang to emit x86 blends as shufflevectors. Summary: Most of the clang header patch by Simon Pilgrim @ SCEE. Also fixed (or added) clang tests for these intrinsics. LLVM tests to make sure we get the blend instruction out of these shufflevectors are at http://reviews.llvm.org/D3600 Reviewers: eli.friedman, craig.topper, rafael Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D3601 llvm-svn: 208664	2014-05-13 02:37:02 +00:00
Manman Ren	c94122e05b	Intrinsics: fix extract & insert when index is out of bound. Now, all extract & insert intrinsics should have the correct and operation to ignore higher bits. rdar://15250497 llvm-svn: 193267	2013-10-23 20:33:14 +00:00
Manman Ren	be38b9e15f	_mm_extract_epi16: use "& 7" when index is out of bound. This is in line with implementation of _mm_extract_pi16. rdar://15250497 llvm-svn: 193187	2013-10-22 19:24:42 +00:00
Eli Friedman	f9d8c6cebb	Add _mm_stream_si64 intrinsic. While I'm here, also fix the alignment computation for the whole family of intrinsics. PR17298. llvm-svn: 191243	2013-09-23 23:38:39 +00:00
Stephen Lin	4362261b00	CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail. llvm-svn: 188447	2013-08-15 06:47:53 +00:00
Manman Ren	5750c1c07e	X86 SSE Intrinsics: update header for sqrt_ss, rsqrt_ss and rcp_ss. These intrinsics pass through the upper FP values from the input. rdar://12558838 llvm-svn: 166743	2012-10-26 00:25:10 +00:00
Chad Rosier	87622b8b84	Get rid of storelv4si builtin as it can be expressed directly. This is general goodness because it provides opportunities to clean up things. For example, uint64_t t1(__m128i vA) { uint64_t Alo; _mm_storel_epi64((__m128i*)&Alo, vA); return Alo; } was generating movq %xmm0, -8(%rbp) movq -8(%rbp), %rax and now generates movd %xmm0, %rax rdar://11282581 llvm-svn: 155924	2012-05-01 18:11:51 +00:00
Craig Topper	74c17c65e4	Correctly check argument types for some vector macros in smmintrin.h. Put parentheses around uses of vector macro arguments. llvm-svn: 153732	2012-03-30 07:01:17 +00:00
Craig Topper	97f042f2d6	Add _mm_minpos_epu16 to smmintrin.h. Fixes PR12399. llvm-svn: 153726	2012-03-30 05:41:28 +00:00
NAKAMURA Takumi	3f62af76b9	test/CodeGen/sse-builtins.c: Make this host-independent to unbreak posix-unlike hosts. Without -ffreestanding, clang tries to seek /usr/include/stdlib.h in host filesystem, even on Windows hosts. llvm-svn: 139899	2011-09-16 03:55:36 +00:00
Eli Friedman	9bb51adcce	Tweak *mmintrin.h so that they don't make any bad assumptions about alignment (which probably has little effect in practice, but better to get it right). Make the load in _mm_loadh_pi and _mm_loadl_pi a single LLVM IR instruction to make optimizing easier for CodeGen. rdar://10054986 llvm-svn: 139874	2011-09-15 23:15:27 +00:00

48 Commits