llvm-project

Author	SHA1	Message	Date
Craig Topper	31730ae761	[X86] Rename __builtin_ia32_pslldqi128 to __builtin_ia32_pslldqi128_byteshift and similar for other sizes. Remove the multiply by 8 from the header files. The previous names took the shift amount in bits to match gcc and required a multiply by 8 in the header. This creates a misleading error message when we check the range of the immediate to the builtin since the allowed range also got multiplied by 8. This commit changes the builtins to use a byte shift amount to match the underlying instruction and the Intel intrinsic. Fixes the remaining issue from PR37795. llvm-svn: 334773	2018-06-14 22:02:35 +00:00
Craig Topper	422a1bbb84	[X86] Add builtins for shufps and shufpd to enable target feature and immediate range checking. llvm-svn: 334266	2018-06-08 07:18:33 +00:00
Craig Topper	03de166ccd	[X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking. llvm-svn: 334265	2018-06-08 06:13:16 +00:00
Craig Topper	d3623155a2	[X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics. We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits. This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously. llvm-svn: 334208	2018-06-07 17:28:03 +00:00
Craig Topper	f3914b74c1	[X86] Add builtins for vector element insert and extract for different 128 and 256 bit vector types. Use them to implement the extract and insert intrinsics. Previously we were just using extended vector operations in the header file. This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load. By adding the builtins we can check for the index to be a constant and ensure its in range of the vector element count. User code still has the option to use extended vector operations themselves if they need non-constant indexing. llvm-svn: 334057	2018-06-06 00:24:55 +00:00
Craig Topper	9b0c61e9de	[X86] Mark all the builtins and intrinsics that require MMX and an SSE feature as requiring both mmx and the sse feature. Previously we only checked the sse feature, but this means that if you passed -mno-mmx, the builtins/intrinsics wouldn't be disabled in the frontend and would instead fail backend isel. llvm-svn: 333980	2018-06-05 03:12:14 +00:00
Tim Shen	f811de484c	[X86] Fix wrong intrinsic semantic. llvm-svn: 333617	2018-05-31 01:51:07 +00:00
Craig Topper	c633867944	[X86] Remove __extension__ from macro intrinsics when its not needed. I think this is a holdover from when we used to declare variables inside the macros. And then its been copy and pasted forward for years every time a new macro intrinsic gets added. Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load. It also removed a bogus error message on another test. llvm-svn: 333613	2018-05-31 00:51:20 +00:00
Craig Topper	63ec0ea7bc	[X86] Add __extension__ to a bunch of places in our intrinsic headers that fail if you run it through -pedantic -ansi. All of these are lines that create a 'compound literal' to concatenate elements together. llvm-svn: 333593	2018-05-30 21:08:27 +00:00
Craig Topper	819f2a20c3	[X86] Remove 'return' from a bunch of intrinsics that return void and use a builtin that returns void. Found by running the intrinsic headers through -pedantic -ansi. llvm-svn: 333563	2018-05-30 17:23:45 +00:00
Craig Topper	25caca72f4	[X86] As mentioned in post-commit feedback in D47174, move the 128 bit f16c intrinsics into f16cintrin.h and remove __emmintrin_f16c.h These were included in emmintrin.h to match Intel Intrinsics Guide documentation. But this is because icc is capable of emulating them on targets that don't support F16C using library calls. Clang/LLVM doesn't have this emulation support. So it makes more sense to include them in immintrin.h instead. I've left a comment behind to hopefully deter someone from trying to move them again in the future. llvm-svn: 333033	2018-05-22 22:19:19 +00:00
Craig Topper	34c8c0d858	[X86] Move 128-bit f16c intrinsics to __emmintrin_f16c.h include from emmintrin.h. Move 256-bit f16c intrinsics back to f16cintrin.h Intel documents the 128-bit versions as being in emmintrin.h and the 256-bit version as being in immintrin.h. This patch makes a new __emmtrin_f16c.h to hold the 128-bit versions to be included from emmintrin.h. And makes the existing f16cintrin.h contain the 256-bit versions and include it from immintrin.h with an error if its included directly. Differential Revision: https://reviews.llvm.org/D47174 llvm-svn: 333014	2018-05-22 18:54:19 +00:00
Craig Topper	842171de36	[X86] Use __builtin_convertvector to implement some of the packed integer to packed float conversion intrinsics. I believe this is safe assuming default default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions. We already do something similar for scalar conversions. Differential Revision: https://reviews.llvm.org/D46863 llvm-svn: 332882	2018-05-21 20:19:17 +00:00
Adrian Prantl	9fc8faf9e6	Remove \brief commands from doxygen comments. This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834	2018-05-09 01:00:01 +00:00
Ekaterina Romanova	f28751849e	[DOXYGEN] There was a request in the review D41507 to change the notation for hex numbers in doxygen documentation from <...>h to 0x<...>. Both of these notations were used in x86 intrinsics documentation. I promised to change them to 0x<...> for consistency. Differential Revision: https://reviews.llvm.org/D41888 llvm-svn: 325312	2018-02-16 03:11:35 +00:00
Douglas Yung	0686df106c	[DOXYGEN] Fix doxygen and content issues in emmintrin.h - Fixed innaccurate instruction mappings for various intrinsics. - Fixed description of NaN handling in comparison intrinsics. - Unify description of _mm_store_pd1 to match _mm_store1_pd. - Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed". - Fix typos. - Add missing italics command (\a) for params and fixed some parameter spellings. This patch was made by Craig Flores Differential Revision: https://reviews.llvm.org/D41516 llvm-svn: 321669	2018-01-02 20:39:29 +00:00
Yael Tsafrir	23e7733230	[X86] Lower _mm[256|512]_[mask[z]]_avg_epu[8|16] intrinsics to native llvm IR Differential Revision: https://reviews.llvm.org/D37562 llvm-svn: 313011	2017-09-12 07:46:32 +00:00
Ekaterina Romanova	cb3603a4eb	[DOXYGEN] Corrected several typos and incorrect parameters description that Sony's techinical writer found during review. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 304840	2017-06-06 22:58:01 +00:00
Ekaterina Romanova	1d4a0f270c	[DOXYGEN] Minor improvements in doxygen comments. Separated very long brief sections into two sections. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 303031	2017-05-15 03:25:04 +00:00
Simon Pilgrim	99ed27053d	[X86][SSE] Add _mm_set_pd1 (PR32827) Matches _mm_set_ps1 implementation llvm-svn: 301637	2017-04-28 10:28:32 +00:00
Ekaterina Romanova	6a5702a093	[DOXYGEN] Improvements to smmintrin.h and emmintrin.h intrinsics. I made some small changes in smmintrin.h and emmintrin.h intrinsics. - changed some regular comments '//' into doxygen-style comments '///' where necessary - removed some trailing spaces in doxygen comments. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 298371	2017-03-21 13:34:06 +00:00
Oren Ben Simhon	259b091669	[X86] DAZ Macros Relocation The DAZ feature introduces the denormal zero support for x86. Currently the definitions are located under SSE3 header, however there are some SSE2 targets that support the feature as well. Differential Revision: https://reviews.llvm.org/D30194 llvm-svn: 296296	2017-02-26 11:58:15 +00:00
Ekaterina Romanova	ff266f5236	Added doxygen comments to smmintrin.h's intrinsics. Note: The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 295404	2017-02-17 02:49:50 +00:00
Ekaterina Romanova	2e041c9c20	[DOXYGEN] Documentation for the newly added x86 intrinsics. Added doxygen comments for the newly added intrinsics in avxintrin.h, namely _mm256_cvtsd_f64, _mm256_cvtsi256_si32 and _mm256_cvtss_f32 Added doxygen comments for the new intrinsics in emmintrin.h, namely _mm_loadu_si64 and _mm_load_sd. Explicit parameter names were added for _mm_clflush and _mm_setcsr The rest of the changes are editorial, removing trailing spaces at the end of the lines. Differential Revision: https://reviews.llvm.org/D28503 llvm-svn: 291876	2017-01-13 01:14:08 +00:00
Ekaterina Romanova	dffe45b3e6	[DOXYGEN] Improved doxygen comments for x86 intrinsics. Improved doxygen comments for the following intrinsics headers: __wmmintrin_pclmul.h, bmiintrin.h, emmintrin.h, f16cintrin.h, immintrin.h, mmintrin.h, pmmintrin.h, tmmintrin.h Added \n commands to insert a line breaks where necessary, since one long line of documentation is nearly unreadable. Formatted comments to fit into 80 chars. In some cases added \a command in front of the parameter names to display them in italics. llvm-svn: 290561	2016-12-27 00:49:38 +00:00
Ekaterina Romanova	0c1c3bbc78	[DOXYGEN] Improved doxygen comments for x86 intrinsics headers. Tagged instruction names with <c> INSTR_NAME </c> to display them in typewriter font. In the past, \c command was used, unfortunately it applied to only one word. <c> .. </c> has the same meaning, but applies to all words in between the tags. llvm-svn: 289249	2016-12-09 18:35:50 +00:00
Ekaterina Romanova	797b0ebf2d	[DOXYGEN] Improved doxygen comments for emmintrin.h intrinsics. Tagged parameter names with \a doxygen command to display parameters in italics. Formatted comments to fit into 80 chars. llvm-svn: 289116	2016-12-08 22:10:51 +00:00
Ekaterina Romanova	2174b6fe72	Minor changes in x86 intrinsics headers; NFC I made several changes for consistency with the rest of x86 instrinsics header files. Some of these changes help to render doxygen comments better. 1. avxintrin.h – Moved the opening bracket on a separate line for several intrinsics (for consistency with the rest of the intrinsics). 2. emmintrin.h - Moved the doxygen comment next to the body of the function; - Added braces after extern "C" even though there is only one declaration each time 3. xmmintrin.h - Moved the doxygen comment next to the body of the function; - Added intrinsic prototypes for a couple of macro definitions into the doxygen comment; - Added braces after extern "C" even though there is only one declaration each time 4. ammintrin.h – Removed extra line between the doxygen comment and the body of the functions (for consistency with the rest of the files). Desk reviewed by Paul Robinson. llvm-svn: 287278	2016-11-17 23:02:00 +00:00
Ekaterina Romanova	64adc38e51	Doxygen comments for avxintrin.h. Added doxygen comments to avxintrin.h's intrinsics. As of now, around 75% of the intrinsics in this file are documented here. The patches for the other 25% will be sent out later. Removed extra spaces in emmitrin.h. Note: The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 286336	2016-11-09 03:58:30 +00:00
Ekaterina Romanova	06477bf035	Add more doxygen comments to emmintrin.h's intrinsics. With this patch, all intrinsics in this file (with an exception of a handful of a recently added ones) will be documented. I will send out a patch for 4 missining intrisics later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao. llvm-svn: 284934	2016-10-23 07:30:50 +00:00
Ekaterina Romanova	493091fdef	Add more doxygen comments to emmintrin.h's intrinsics. With this patch, 75% of the intrinsics in this file will be documented now. The patches for the rest of the intrisics in this file will be send out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Yunzhong Gao. llvm-svn: 284754	2016-10-20 17:59:15 +00:00
Albert Gutowski	727ab8a803	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: alexshap, cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281540	2016-09-14 21:19:43 +00:00
Albert Gutowski	9918cb6573	Reverse commit 281375 (breaks building Chromium) llvm-svn: 281399	2016-09-13 21:24:51 +00:00
Albert Gutowski	ae3fb3113f	Add some MS aliases for existing intrinsics Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281375	2016-09-13 19:26:42 +00:00
Craig Topper	d0681d528d	[X86] Use v2i64 vectors to implement _mm_and/andn/or/xor_pd. These will be reused when removing some builtins from avx512vldqintrin.h and this will make the tests for that change show a better number of vector elements. llvm-svn: 280196	2016-08-31 05:38:55 +00:00
Ekaterina Romanova	a84c24f39c	Add doxygen comments to emmintrin.h's intrinsics. Only around 50% of the intrinsics in this file are documented now. The patches for the rest of the intrisics in this file will be send out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. This patch was internally reviewed by Paul Robinson. llvm-svn: 276499	2016-07-22 23:49:37 +00:00
Simon Pilgrim	e3b9ee0645	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. Differential Revision: https://reviews.llvm.org/D22105 llvm-svn: 276102	2016-07-20 10:18:01 +00:00
Craig Topper	2a383c9273	[X86] Use undefined instead of setzero in shufflevector based intrinsics when the second source is unused. Rewrite immediate extractions in shuffle intrinsics to be in ((c >> x) & y) form instead of ((c & z) >> x). This way only x varies between each use instead of having to vary x and z. llvm-svn: 274525	2016-07-04 22:18:01 +00:00
Asaf Badouh	57819aa185	[X86] add _mm_loadu_si64 Differential Revision: http://reviews.llvm.org/D21504 llvm-svn: 273812	2016-06-26 13:51:54 +00:00
Craig Topper	50e3dfe9d0	[X86] Fix pslldq/psrldq intrinsics to not fail compilation with immediates larger than 16. This was accidentally broken in r272246. llvm-svn: 273775	2016-06-25 07:31:14 +00:00
Simon Pilgrim	beca5f295c	[Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores. The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load Differential Revision: http://reviews.llvm.org/D21272 llvm-svn: 272540	2016-06-13 09:57:52 +00:00
Craig Topper	2769bb5753	[X86] Handle AVX2 pslldqi and psrldqi intrinsics shufflevector creation directly in the header file instead of in CGBuiltin.cpp. Simplify the sse2 equivalents as well. llvm-svn: 272246	2016-06-09 05:15:12 +00:00
Craig Topper	3a0c7260f4	[X86] Add void to the argument list of intrinsics that don't take arguments since empty argument list mean something else in C. llvm-svn: 272244	2016-06-09 05:14:28 +00:00
Craig Topper	6a77b62640	[X86] Use unsigned types for vector arithmetic in intrinsics to avoid undefined behavior for signed integer overflow. This is really only needed for addition, subtraction, and multiplication, but I did the bitwise ops too for overall consistency. Clang currently doesn't set NSW for signed vector operations so the undefined behavior shouldn't happen today. llvm-svn: 271778	2016-06-04 05:43:41 +00:00
Simon Pilgrim	00880511b1	[X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (clang) The 'cvtt' truncation (round to zero) conversions can be safely represented as generic __builtin_convertvector (fptosi) calls instead of x86 intrinsics. We already do this (implicitly) for the scalar equivalents. Note: I looked at updating _mm_cvttpd_epi32 as well but this still requires a lot more backend work to correctly lower (both for debug and optimized builds). Differential Revision: http://reviews.llvm.org/D20859 llvm-svn: 271436	2016-06-01 21:46:51 +00:00
Simon Pilgrim	645e1ad33a	[X86][SSE] _mm_store1_ps/_mm_store1_pd should require an aligned pointer According to the gcc headers, intel intrinsics docs and msdn codegen the _mm_store1_pd (and its _mm_store_pd1 equivalent) should use an aligned pointer - the clang headers are the only implementation I can find that assume non-aligned stores (by storing with _mm_storeu_pd). Additionally, according to the intel intrinsics docs and msdn codegen the _mm_store1_ps (_mm_store_ps1) requires a similarly aligned pointer. This patch raises the alignment requirements to match the other implementations by calling _mm_store_ps/_mm_store_pd instead. I've also added the missing _mm_store_pd1 intrinsic (which maps to _mm_store1_pd like _mm_store_ps1 does to _mm_store1_ps). As a followup I'll update the llvm fast-isel tests to match this codegen. Differential Revision: http://reviews.llvm.org/D20617 llvm-svn: 271218	2016-05-30 17:55:25 +00:00
Craig Topper	09175dab31	[X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code that will compile to a native unaligned store. Remove the builtins since they are no longer used. Intrinsics will be removed from llvm in a future commit. llvm-svn: 271214	2016-05-30 17:10:30 +00:00
Simon Pilgrim	6d1a0c4c75	[X86][SSE] Make unsigned integer vector types generally available As discussed on http://reviews.llvm.org/D20684, move the unsigned integer vector types used for zero extension to make them available for general use. llvm-svn: 271187	2016-05-29 18:49:08 +00:00
Simon Pilgrim	90770c7c76	[X86][SSE] Replace lossless i32/f32 to f64 conversion intrinsics with generic IR Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen. This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work. Differential Revision: http://reviews.llvm.org/D20528 llvm-svn: 270499	2016-05-23 22:13:02 +00:00
Craig Topper	1aa231e3aa	[X86] Add typecasts to remove most assumptions about what __m128i/__m256i is defined as. Add similar typecasts for the fp types as well. llvm-svn: 269632	2016-05-16 06:38:42 +00:00

111 Commits