llvm-project

Commit Graph

Author	SHA1	Message	Date
Ekaterina Romanova	1d4a0f270c	[DOXYGEN] Minor improvements in doxygen comments. Separated very long brief sections into two sections. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 303031	2017-05-15 03:25:04 +00:00
Ekaterina Romanova	0a40d67b20	[DOXYGEN] Minor improvements in doxygen comments. - To be consistent with the rest of the intrinsics headers, I removed the tags <i> .. </i> for marking instruction names in italics in smmintrin.h. - Formatting changes to fit into 80 characters. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 300578	2017-04-18 19:44:07 +00:00
Simon Pilgrim	9f6e79c5e4	[X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (clang) MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics. LLVM companion patch: D31767. Differential Revision: https://reviews.llvm.org/D31766 llvm-svn: 300326	2017-04-14 15:05:57 +00:00
Ekaterina Romanova	6a5702a093	[DOXYGEN] Improvements to smmintrin.h and emmintrin.h intrinsics. I made some small changes in smmintrin.h and emmintrin.h intrinsics. - changed some regular comments '//' into doxygen-style comments '///' where necessary - removed some trailing spaces in doxygen comments. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 298371	2017-03-21 13:34:06 +00:00
Ekaterina Romanova	ff266f5236	Added doxygen comments to smmintrin.h's intrinsics. Note: The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 295404	2017-02-17 02:49:50 +00:00
Craig Topper	6a77b62640	[X86] Use unsigned types for vector arithmetic in intrinsics to avoid undefined behavior for signed integer overflow. This is really only needed for addition, subtraction, and multiplication, but I did the bitwise ops too for overall consistency. Clang currently doesn't set NSW for signed vector operations so the undefined behavior shouldn't happen today. llvm-svn: 271778	2016-06-04 05:43:41 +00:00
Simon Pilgrim	6d1a0c4c75	[X86][SSE] Make unsigned integer vector types generally available As discussed on http://reviews.llvm.org/D20684, move the unsigned integer vector types used for zero extension to make them available for general use. llvm-svn: 271187	2016-05-29 18:49:08 +00:00
Simon Pilgrim	91b77ceaed	[X86][SSE] Replace VPMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (clang) The VPMOVSX and (V)PMOVZX sign/zero extension intrinsics can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics. This patch removes the clang builtins and their use in the sse2/avx headers - a companion patch will remove/auto-upgrade the llvm intrinsics. Note: We already did this for SSE41 PMOVSX sometime ago. Differential Revision: http://reviews.llvm.org/D20684 llvm-svn: 271106	2016-05-28 08:12:45 +00:00
Ahmed Bougacha	5aa0ab3869	[Headers] Remove redundant typedef. NFC. llvm-svn: 271022	2016-05-27 17:57:23 +00:00
Craig Topper	cd45b1a7c7	[X86] Add a few missing typecasts to intrinsics. Found by playing with -fno-lax-vector-conversions on the builtin tests. llvm-svn: 269734	2016-05-17 03:42:31 +00:00
Craig Topper	d619eaaae4	[X86] Add missing typecasts in intrinsic macros. This should make them more robust against inputs that aren't already the right type. llvm-svn: 252700	2015-11-11 03:47:10 +00:00
Craig Topper	7148166785	[X86] Remove temporary variables from macros in x86 intrinsic headers. Prevents duplicate names appearing from multiple macro expansions. NFC llvm-svn: 252586	2015-11-10 05:08:05 +00:00
Ahmed Bougacha	7dfaaf3891	[Headers][X86] Fix stream_load (movntdqa) to accept const. Per Intel intrinsics guide: - _mm256_stream_load_si256 takes `__m256i const *' - _mm_stream_load_si128 takes `__m128i *', for no good reason. Let's accept const for both. llvm-svn: 249213	2015-10-02 23:29:26 +00:00
Chandler Carruth	cbe6411401	Fix the SSE4 byte sign extension in a cleaner way, and more thoroughly test that our intrinsics behave the same under -fsigned-char and -funsigned-char. This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit elements, and has for a long time. This is fixed in the same way as SSE4 handles the case. The other ISA extensions currently work correctly because they use specific instruction intrinsics. As soon as they are rewritten in terms of generic IR, they will need to add these special casts. I've added the necessary testing to catch this however, so we shouldn't have to chase it down again. I considered changing the core typedef to be signed, but that seems like a bad idea. Notably, it would be an ABI break if anyone is reaching into the innards of the intrinsic headers and passing __v16qi on an API boundary. I can't be completely confident that this wouldn't happen due to a macro expanding in a lambda, etc., so it seems much better to leave it alone. It also matches GCC's behavior exactly. A fun side note is that for both GCC and Clang, -funsigned-char really does change the semantics of __v16qi. To observe this, consider: % cat x.cc #include <smmintrin.h> #include <iostream> int main() { __v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; __v16qi b = _mm_set1_epi8(-1); std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n'; } % clang++ -o x x.cc && ./x -1, 1 % clang++ -funsigned-char -o x x.cc && ./x 0, 1 However, while this may be surprising, both Clang and GCC agree. Differential Revision: http://reviews.llvm.org/D13324 llvm-svn: 249097	2015-10-01 23:40:12 +00:00
Chandler Carruth	9143378db0	Patch over a really horrible bug in our vector builtins that showed up recently when we started using direct conversion to model sign extension. The __v16qi type we use for SSE v16i8 vectors is defined in terms of 'char' which may or may not be signed! This causes us to generate pmovsx and pmovzx depending on the setting of -funsigned-char. This patch just forms an explicitly signed type and uses that to formulate the sign extension. While this gets the correct behavior (which we now verify with the enhanced test) this is just the tip of the iceberg. Now that I know what to look for, I have found errors of this sort throughout our vector code. Fortunately, this is the only specific place where I know of users actively having their code miscompiled by Clang due to this, so I'm keeping the fix for those users minimal and targeted. I'll be sending a proper email for discussion of how to fix these systematically, what the implications are, and just how widely broken this is... From what I can tell, we have never shipped a correct set of builtin headers for x86 when users rely on -funsigned-char. Oops. llvm-svn: 248980	2015-10-01 02:21:34 +00:00
Simon Pilgrim	12919f7e49	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR 128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds. This patch removes the builtins and reimplements the _mm_cvtepi*_epi* intrinsics using __builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension). Differential Revision: http://reviews.llvm.org/D12835 llvm-svn: 248092	2015-09-19 15:12:38 +00:00
Sean Silva	e4c3760a9f	Clean up trailing whitespace in the builtin headers llvm-svn: 247498	2015-09-12 02:55:19 +00:00
Michael Kuperstein	e45af54cdb	[X86] Rename DEFAULT_FN_ATTR macro to __DEFAULT_FN_ATTR llvm-svn: 241065	2015-06-30 13:36:19 +00:00
Eric Christopher	9fc7fb274e	Update the intel intrinsic headers to use the target attribute support. This involved removing the conditional inclusion and replacing them with target attributes matching the original conditional inclusion and checks. The testcase update removes the macro checks for each file and replaces them with usage of the __target__ attribute, e.g.: int __attribute__((__target__(("sse3")))) foo(int a) { _mm_mwait(0, 0); return 4; } This usage does require the enclosing function have the requisite __target__ attribute for inlining and code generation - also for any macro intrinsic uses in the enclosing function. There's no change for existing uses of the intrinsic headers. llvm-svn: 239883	2015-06-17 07:09:32 +00:00
Eric Christopher	4d185168e9	Use a define for per-file function attributes for the Intel intrinsic headers. This is a precursor to changing them to use the new target attribute code. llvm-svn: 239882	2015-06-17 07:09:20 +00:00
Filipe Cabecinhas	5d289b48b1	Patched clang to emit x86 blends as shufflevectors. Summary: Most of the clang header patch by Simon Pilgrim @ SCEE. Also fixed (or added) clang tests for these intrinsics. LLVM tests to make sure we get the blend instruction out of these shufflevectors are at http://reviews.llvm.org/D3600 Reviewers: eli.friedman, craig.topper, rafael Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D3601 llvm-svn: 208664	2014-05-13 02:37:02 +00:00
Manman Ren	c94122e05b	Intrinsics: fix extract & insert when the index is out of bounds. Now, all extract & insert intrinsics should have the correct AND operation to ignore higher bits. rdar://15250497 llvm-svn: 193267	2013-10-23 20:33:14 +00:00
Eli Friedman	9b04f41899	Fix return type of _mm_extract_epi8 etc. PR17300. llvm-svn: 191120	2013-09-21 00:05:25 +00:00
David Blaikie	3302f2bd46	PR14964: intrinsic headers using non-reserved identifiers Several of the intrinsic headers were using plain non-reserved identifiers. C++11 17.6.4.3.2 [global.names] p1 reserves names containing a double underscore or beginning with an underscore followed by an uppercase letter for any use. I think I got them all, but open to being corrected. For the most part I didn't bother updating function-like macro parameter names because I don't believe they're subject to any such collision - though some function-like macros already follow this convention (I didn't update them in part because the churn was more significant as several function-like macros use the double underscore prefixed version of the same name as a parameter in their implementation) llvm-svn: 172666	2013-01-16 23:08:36 +00:00
Craig Topper	74c17c65e4	Correctly check argument types for some vector macros in smmintrin.h. Put parentheses around uses of vector macro arguments. llvm-svn: 153732	2012-03-30 07:01:17 +00:00
Craig Topper	97f042f2d6	Add _mm_minpos_epu16 to smmintrin.h. Fixes PR12399. llvm-svn: 153726	2012-03-30 05:41:28 +00:00
Craig Topper	1de8348db7	Add popcnt feature flag to match gcc. This flag is implied when sse42 is enabled, but can be disabled separately. Move popcnt intrinsics to popcntintrin.h to match gcc. llvm-svn: 147340	2011-12-29 16:10:46 +00:00
Craig Topper	a89747dd1e	Add AVX2 intrinsics for pavg, pblend, and pcmp instructions. Also remove unneeded builtins for SSE pcmp. Change SSE pcmpeqq and pcmpgtq to not use builtins and just use vector == and >. llvm-svn: 146969	2011-12-20 09:55:26 +00:00
Bob Wilson	16c4195548	Fix obvious error in _mm_test_all_zeros. PR11565. Patch by Mathias Gaunard! llvm-svn: 146565	2011-12-14 17:17:16 +00:00
Chandler Carruth	222c66db38	Fix a blatant typo or cut/paste-o reported by users of this header. llvm-svn: 146251	2011-12-09 09:23:55 +00:00
Eli Friedman	f16beb3942	Fix some additional x86 intrinsics to use "I" (ICE) markings. Fix *mmintrin.h to take them into account. <rdar://problem/10341145> llvm-svn: 144246	2011-11-10 00:11:13 +00:00
Eli Friedman	9586cdb01e	Misc fixes to pcmp*stri. llvm-svn: 144073	2011-11-08 04:13:51 +00:00
Eric Christopher	2a9898f0a2	Move some type defines from smmintrin.h to emmintrin.h to match where gcc defines them. llvm-svn: 112146	2010-08-26 02:09:25 +00:00
Chris Lattner	9052c35479	fix some vector extractions to return properly zero extended values (instead of sign extending) to match ICC. GCC is changing this in a series of their own PRs (e.g. 41323). llvm-svn: 111637	2010-08-20 16:08:33 +00:00
Daniel Dunbar	f5e075d392	Headers: Fix quoting of macro arguments in a couple more places. llvm-svn: 105331	2010-06-02 16:35:01 +00:00
Eric Christopher	bd9a3aecd6	This is just a simple v4si * v4si, make it so. llvm-svn: 99587	2010-03-26 00:51:28 +00:00
Anders Carlsson	91e18c93c4	Make the license header in smmintrin.h match the other SSE headers. llvm-svn: 99384	2010-03-24 05:31:31 +00:00
Chris Lattner	7eac805bb0	fix PR6658: inline isn't a keyword in C89 mode, use __inline__ instead. llvm-svn: 99190	2010-03-22 18:14:12 +00:00
Eric Christopher	08f135274d	Add sse4.2 header and builtin support. llvm-svn: 99051	2010-03-20 07:43:28 +00:00
Eric Christopher	8c6f61394f	Add remaining sse4.1 intrinsics and builtins. llvm-svn: 98587	2010-03-15 23:22:58 +00:00
Eric Christopher	6932b2e8b7	Add SSE4 packed integer comparisons and corresponding intrinsics. llvm-svn: 98323	2010-03-12 01:22:33 +00:00
Eric Christopher	e486f68b59	Integer array extraction for sse4.1. llvm-svn: 98305	2010-03-11 23:50:18 +00:00
Eric Christopher	e7594305bc	Add packed integer array insertion. llvm-svn: 98299	2010-03-11 23:36:29 +00:00
Eric Christopher	1dca62055a	Add insert/extract_ps and related random macros. llvm-svn: 98114	2010-03-10 00:50:58 +00:00
Eric Christopher	4c70358296	Add sse4.1 packed min and max intrinsics. llvm-svn: 97907	2010-03-07 07:00:42 +00:00
Eric Christopher	7288890b51	Add load hint instruction intrinsic. llvm-svn: 97904	2010-03-07 06:29:09 +00:00
Eric Christopher	87990fe5df	Add in support for dword multiply and fp dot product intrinsics. llvm-svn: 97902	2010-03-07 06:17:19 +00:00
Eric Christopher	b0759be4d0	Fix _MM_FROUND_NEARBYINT and move rounding intrinsics to macros. llvm-svn: 97874	2010-03-06 10:31:44 +00:00
Eric Christopher	94567c04bb	First start on smmintrin.h, rounding and blending. llvm-svn: 97717	2010-03-04 02:56:19 +00:00

49 Commits