llvm-project

Commit Graph

Author	SHA1	Message	Date
serge-sans-paille	9aeecdfa8e	Check supported architectures in sseXYZ/avxXYZ headers It doesn't make sense to include those headers on the wrong architecture, provide an explicit error message in that case. Fix https://bugs.llvm.org/show_bug.cgi?id=48915 Differential Revision: https://reviews.llvm.org/D109686	2021-09-14 09:57:54 +02:00
Craig Topper	4190d99dfc	[X86] Add parentheses around casts in some of the X86 intrinsic headers. This covers the SSE and AVX/AVX2 headers. AVX512 has a lot more macros due to rounding mode. Fixes part of PR51324. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D107843	2021-08-13 09:36:16 -07:00
Pierre Gousseau	08fab9ebec	[X86] Fix implicit sign conversion warnings in X86 headers. Warnings in emmintrin.h and xmmintrin.h are reported by -fsanitize=implicit-integer-sign-change. Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D77393	2020-04-07 11:25:08 +01:00
Craig Topper	16b9410caa	[X86] Cast to __v4hi instead of __m64 in the implementation of _mm_extract_pi16 and _mm_insert_pi16. __m64 is a vector of 1 long long. But the builtins these intrinsics are calling expect a vector of 4 shorts. Fixes PR44589	2020-01-22 16:00:23 -06:00
Warren Ristow	7fcd9e3f70	[X86] Mark various pointer arguments in builtins as const Enabling `-Wcast-qual` identified many casts in various system headers that were dropping the `const` qualifier. Fixing those missing qualifiers pointed out that a few of the definitions of the builtins did not properly identify their arguments as `const` pointers. This commit fixes those builtin definitions, and the system header files so that they no longer drop the qualifier. Differential Revision: https://reviews.llvm.org/D71718	2019-12-19 11:42:11 -08:00
Craig Topper	caf6b71ab2	[X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669	2019-07-10 17:11:29 +00:00
Chandler Carruth	4cf5743b77	Move the builtin headers to use the new license file header. Summary: These all had somewhat custom file headers with different text from the ones I searched for previously, and so I missed them. Thanks to Hal and Kristina and others who prompted me to fix this, and sorry it took so long. Reviewers: hfinkel Subscribers: mcrosier, javed.absar, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60406 llvm-svn: 357941	2019-04-08 20:51:30 +00:00
Reid Kleckner	79d7f4114d	[X86] Use __m128_u for _mm_loadu_ps after r353555 Add secondary triple to existing SSE test for it. I audited other uses of __attribute__((__packed__)) in the intrinsic headers, and this seemed to be the only missing one. llvm-svn: 353878	2019-02-12 21:04:21 +00:00
Craig Topper	be4cbe8726	[X86] Add explicit alignment to __m128/__m128i/__m128d/etc. to allow matching of MSVC behavior with #pragma pack. Summary: With MSVC, #pragma pack is ignored when there is explicit alignment. This differs from gcc. Clang emulates this difference when compiling for Windows. It appears that MSVC and its headers consider the __m128/__m128i/__m128d/etc. types to be explicitly aligned and ignores #pragma pack for them. Since we don't have explicit alignment on them in our headers, we don't match the MSVC behavior here. This patch adds explicit alignment to match this behavior. I'm hoping this won't cause any problems when we're not emulating MSVC. But if someone knows of something that would be different we can switch to conditionally adding the alignment based on _MSC_VER. I had to add explicitly unaligned types as well so we could use them in the loadu/storeu intrinsics which use __attribute__(__packed__). Using the now explicitly aligned types wouldn't produce align 1 accesses when targeting Windows. Reviewers: rnk, erichkeane, spatel, RKSimon Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D57961 llvm-svn: 353555	2019-02-08 19:45:08 +00:00
Craig Topper	638426fc36	[X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that is suitable for use in scalar mask intrinsics. This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)' and ternary operator used in some of the intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics. This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling. I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle. llvm-svn: 336622	2018-07-10 00:37:25 +00:00
Craig Topper	74c10e3236	[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 intrinsics with it This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targeting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter. Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible. To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be responsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future. To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins (can be target specific or target independent) or vector extension code. Or as a macro wrapper around a target specific builtin. I believe I've removed all cases where the macro was around a target independent builtin. To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute. To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins. There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch. Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue. Differential Revision: https://reviews.llvm.org/D48617 llvm-svn: 336583	2018-07-09 19:00:16 +00:00
Craig Topper	422a1bbb84	[X86] Add builtins for shufps and shufpd to enable target feature and immediate range checking. llvm-svn: 334266	2018-06-08 07:18:33 +00:00
Craig Topper	9b0c61e9de	[X86] Mark all the builtins and intrinsics that require MMX and an SSE feature as requiring both mmx and the sse feature. Previously we only checked the sse feature, but this means that if you passed -mno-mmx, the builtins/intrinsics wouldn't be disabled in the frontend and would instead fail backend isel. llvm-svn: 333980	2018-06-05 03:12:14 +00:00
Craig Topper	c633867944	[X86] Remove __extension__ from macro intrinsics when it's not needed. I think this is a holdover from when we used to declare variables inside the macros. And then it's been copied and pasted forward for years every time a new macro intrinsic gets added. Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load. It also removed a bogus error message on another test. llvm-svn: 333613	2018-05-31 00:51:20 +00:00
Craig Topper	63ec0ea7bc	[X86] Add __extension__ to a bunch of places in our intrinsic headers that fail if you run it through -pedantic -ansi. All of these are lines that create a 'compound literal' to concatenate elements together. llvm-svn: 333593	2018-05-30 21:08:27 +00:00
Craig Topper	c5ec55e921	[X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss. We don't need the insertion back into the original vector at the end. The builtin already understands that. This is different than _mm_sqrt_sd which takes two arguments and we do need to insert. llvm-svn: 333572	2018-05-30 18:27:07 +00:00
Craig Topper	819f2a20c3	[X86] Remove 'return' from a bunch of intrinsics that return void and use a builtin that returns void. Found by running the intrinsic headers through -pedantic -ansi. llvm-svn: 333563	2018-05-30 17:23:45 +00:00
Ekaterina Romanova	9b412153bf	[DOXYGEN] Formatting changes for better intrinsics documentation rendering (1) I added some \see cross-references to a few select intrinsics that are related (and have the same or similar semantics). (2) pmmintrin.h, smmintrin.h, xmmintrin.h have very few minor formatting changes. They make rendering of our intrinsics documentation better. llvm-svn: 333065	2018-05-23 06:33:22 +00:00
Adrian Prantl	9fc8faf9e6	Remove \brief commands from doxygen comments. This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834	2018-05-09 01:00:01 +00:00
Douglas Yung	46474dae4d	[DOXYGEN] Fix doxygen and content issues in xmmintrin.h - Fix inaccurate instruction listings. - Fix small issues in _mm_getcsr and _mm_setcsr. - Fix description of NaN handling in comparison intrinsics. - Fix inaccurate description of _mm_movemask_pi8. - Fix inaccurate instruction mappings. - Fix typos. - Clarify wording on some descriptions. - Fix bit ranges in return value. - Fix typo in _mm_move_ms intrinsic instruction since it operates on single-precision values, not double. - This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41523 llvm-svn: 322778	2018-01-17 22:53:15 +00:00
Craig Topper	170de4b4ba	[X86] Allow _mm_prefetch (both the header implementation and the builtin) to accept bit 2 which is supposed to indicate the prefetched addresses will be written to Add the appropriate _MM_HINT_ET0/ET1 defines to match gcc. llvm-svn: 321325	2017-12-21 23:50:22 +00:00
Ekaterina Romanova	cb3603a4eb	[DOXYGEN] Corrected several typos and incorrect parameter descriptions that Sony's technical writer found during review. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 304840	2017-06-06 22:58:01 +00:00
Ekaterina Romanova	bfc1e3a84e	(1) Fixed mismatch in intrinsics names in declarations and in doxygen comments. (2) Removed \c commands that are no longer necessary, since the same effect will be achieved by the <c> ... </c> sequence. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 303228	2017-05-17 01:46:11 +00:00
Ekaterina Romanova	1d4a0f270c	[DOXYGEN] Minor improvements in doxygen comments. Separated very long brief sections into two sections. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 303031	2017-05-15 03:25:04 +00:00
Ekaterina Romanova	0a40d67b20	[DOXYGEN] Minor improvements in doxygen comments. - To be consistent with the rest of the intrinsics headers, I removed the tags <i> .. </i> for marking instruction names in italics in smmintrin.h. - Formatting changes to fit into 80 characters. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 300578	2017-04-18 19:44:07 +00:00
Ekaterina Romanova	2e041c9c20	[DOXYGEN] Documentation for the newly added x86 intrinsics. Added doxygen comments for the newly added intrinsics in avxintrin.h, namely _mm256_cvtsd_f64, _mm256_cvtsi256_si32 and _mm256_cvtss_f32 Added doxygen comments for the new intrinsics in emmintrin.h, namely _mm_loadu_si64 and _mm_load_sd. Explicit parameter names were added for _mm_clflush and _mm_setcsr The rest of the changes are editorial, removing trailing spaces at the end of the lines. Differential Revision: https://reviews.llvm.org/D28503 llvm-svn: 291876	2017-01-13 01:14:08 +00:00
Ekaterina Romanova	c9ed514632	[DOXYGEN] Improved doxygen comments for xmmintrin.h intrinsics. Added \n commands to insert a line breaks where necessary, since one long line of documentation is nearly unreadable. Formatted comments to fit into 80 chars. In some cases added \a command in front of the parameter names to display them in italics. llvm-svn: 290619	2016-12-27 18:53:29 +00:00
Ekaterina Romanova	0c1c3bbc78	[DOXYGEN] Improved doxygen comments for x86 intrinsics headers. Tagged instruction names with <c> INSTR_NAME </c> to display them in typewriter font. In the past, \c command was used, unfortunately it applied to only one word. <c> .. </c> has the same meaning, but applies to all words in between the tags. llvm-svn: 289249	2016-12-09 18:35:50 +00:00
Ekaterina Romanova	08da283295	[DOXYGEN] Improved doxygen comments for xmmintrin.h intrinsics. Tagged parameter names with \a doxygen command to display parameters in italics. Formatted comments to fit into 80 chars. llvm-svn: 289159	2016-12-08 23:58:39 +00:00
Ekaterina Romanova	2174b6fe72	Minor changes in x86 intrinsics headers; NFC I made several changes for consistency with the rest of x86 intrinsics header files. Some of these changes help to render doxygen comments better. 1. avxintrin.h – Moved the opening bracket on a separate line for several intrinsics (for consistency with the rest of the intrinsics). 2. emmintrin.h - Moved the doxygen comment next to the body of the function; - Added braces after extern "C" even though there is only one declaration each time 3. xmmintrin.h - Moved the doxygen comment next to the body of the function; - Added intrinsic prototypes for a couple of macro definitions into the doxygen comment; - Added braces after extern "C" even though there is only one declaration each time 4. ammintrin.h – Removed extra line between the doxygen comment and the body of the functions (for consistency with the rest of the files). Desk reviewed by Paul Robinson. llvm-svn: 287278	2016-11-17 23:02:00 +00:00
Yunzhong Gao	d9fa56a4fb	[NFC] Fixing the description for _mm_store_ps and _mm_store_ps1. It seems that the doxygen description of these two intrinsics were swapped by mistake. llvm-svn: 284080	2016-10-12 23:27:27 +00:00
Yunzhong Gao	c37e2231ad	[NFC] Trial change to remove a redundant blank line. llvm-svn: 284033	2016-10-12 19:33:33 +00:00
Albert Gutowski	727ab8a803	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: alexshap, cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281540	2016-09-14 21:19:43 +00:00
Albert Gutowski	9918cb6573	Reverse commit 281375 (breaks building Chromium) llvm-svn: 281399	2016-09-13 21:24:51 +00:00
Albert Gutowski	ae3fb3113f	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281375	2016-09-13 19:26:42 +00:00
Craig Topper	45db56c375	[X86] Add missing __x86_64__ qualifiers on a bunch of intrinsics that assume 64-bit GPRs are available. Usages of these intrinsics in a 32-bit build results in assertions in the backend. llvm-svn: 276249	2016-07-21 07:38:39 +00:00
Simon Pilgrim	e3b9ee0645	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. Differential Revision: https://reviews.llvm.org/D22105 llvm-svn: 276102	2016-07-20 10:18:01 +00:00
Craig Topper	95b61b0544	[X86] Use __builtin_ia32_vec_ext_v4hi and __builtin_ia32_vec_set_v4hi to implement pextrw/pinsertw MMX intrinsics instead of trying to use native IR. Without this we end up generating code that doesn't use mmx registers and probably doesn't work well with other mmx intrinsics. llvm-svn: 274968	2016-07-09 05:30:41 +00:00
Craig Topper	2a383c9273	[X86] Use undefined instead of setzero in shufflevector based intrinsics when the second source is unused. Rewrite immediate extractions in shuffle intrinsics to be in ((c >> x) & y) form instead of ((c & z) >> x). This way only x varies between each use instead of having to vary x and z. llvm-svn: 274525	2016-07-04 22:18:01 +00:00
Zvi Rackover	453d734201	[X86] _MM_ALIGN16 attribute support for non-windows targets Summary: This patch adds support for the _MM_ALIGN16 attribute on non-windows targets. This aligns Clang with ICC which supports the attribute on all targets. Fixes PR28056 Reviewers: aaboud, echristo, cfe-commits, mkuper Subscribers: zvi, mehdi_amini Projects: #clang-c Differential Revision: http://reviews.llvm.org/D21173 llvm-svn: 273095	2016-06-18 20:01:07 +00:00
Simon Pilgrim	beca5f295c	[Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores. The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load Differential Revision: http://reviews.llvm.org/D21272 llvm-svn: 272540	2016-06-13 09:57:52 +00:00
Craig Topper	3a0c7260f4	[X86] Add void to the argument list of intrinsics that don't take arguments since empty argument list mean something else in C. llvm-svn: 272244	2016-06-09 05:14:28 +00:00
Ekaterina Romanova	50e94a3b34	Add doxygen comments to xmmintrin.h's intrinsics. Only half of the intrinsics in this file is documented here. The patch for the other half will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 272121	2016-06-08 07:34:31 +00:00
Craig Topper	6a77b62640	[X86] Use unsigned types for vector arithmetic in intrinsics to avoid undefined behavior for signed integer overflow. This is really only needed for addition, subtraction, and multiplication, but I did the bitwise ops too for overall consistency. Clang currently doesn't set NSW for signed vector operations so the undefined behavior shouldn't happen today. llvm-svn: 271778	2016-06-04 05:43:41 +00:00
Simon Pilgrim	645e1ad33a	[X86][SSE] _mm_store1_ps/_mm_store1_pd should require an aligned pointer According to the gcc headers, intel intrinsics docs and msdn codegen the _mm_store1_pd (and its _mm_store_pd1 equivalent) should use an aligned pointer - the clang headers are the only implementation I can find that assume non-aligned stores (by storing with _mm_storeu_pd). Additionally, according to the intel intrinsics docs and msdn codegen the _mm_store1_ps (_mm_store_ps1) requires a similarly aligned pointer. This patch raises the alignment requirements to match the other implementations by calling _mm_store_ps/_mm_store_pd instead. I've also added the missing _mm_store_pd1 intrinsic (which maps to _mm_store1_pd like _mm_store_ps1 does to _mm_store1_ps). As a followup I'll update the llvm fast-isel tests to match this codegen. Differential Revision: http://reviews.llvm.org/D20617 llvm-svn: 271218	2016-05-30 17:55:25 +00:00
Craig Topper	09175dab31	[X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code that will compile to a native unaligned store. Remove the builtins since they are no longer used. Intrinsics will be removed from llvm in a future commit. llvm-svn: 271214	2016-05-30 17:10:30 +00:00
Craig Topper	1aa231e3aa	[X86] Add typecasts to remove most assumptions about what __m128i/__m256i is defined as. Add similar typecasts for the fp types as well. llvm-svn: 269632	2016-05-16 06:38:42 +00:00
Richard Smith	e0fa4c83b2	[modules] Make the tweak to avoid circular inclusion of emmintrin.h and xmmintrin.h a bit more directed. If for whatever reason modules are enabled but we textually include one of these headers, don't deploy the special case for modules. To make this work cleanly, extend __building_module to be defined even when modules is disabled. llvm-svn: 266945	2016-04-21 01:46:37 +00:00
Ekaterina Romanova	e2961f71d2	Add doxygen comments to xmmintrin.h's intrinsics. Only half of the intrinsics in this file is documented here. The patch for the other half will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 263098	2016-03-10 09:37:04 +00:00
Craig Topper	7148166785	[X86] Remove temporary variables from macros in x86 intrinsic headers. Prevents duplicate names appearing from multiple macro expansions. NFC llvm-svn: 252586	2015-11-10 05:08:05 +00:00


112 Commits