llvm-project

Commit Graph

Author	SHA1	Message	Date
Gadi Haber	19c4fc5e62	This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible. There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers. The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled. Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky Differential Revision: https://reviews.llvm.org/D27901 llvm-svn: 290663	2016-12-28 10:12:48 +00:00
Craig Topper	8e7498976a	[X86] Fix VEX encoded VPMADDUBSW to not be marked commutable. This was accidentallly broken in r285515 when we started lowering the intrinsic to an ISD node. Should fix PR31241. llvm-svn: 288578	2016-12-03 05:35:44 +00:00
Craig Topper	da73a09fcd	[X86] Add test cases demonstrating where we incorrectly commute VEX VPMADDUSBW due to a bug introduced in r285515. I believe this is the cause of PR31241. llvm-svn: 288577	2016-12-03 05:35:38 +00:00
Craig Topper	a4a51f1afe	[AVX-512] Add -show-mc-encoding to legacy vector intrinsic tests so we can see when VEX or EVEX encoded instructions are being emitted. Make sure the tests all have an avx2 command line and an skx command line. llvm-svn: 286055	2016-11-06 02:03:58 +00:00
Craig Topper	5c913e84df	[AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step. llvm-svn: 275763	2016-07-18 06:14:34 +00:00
Igor Breger	e59165ca63	[AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering. Differential Revision: http://reviews.llvm.org/D20897 llvm-svn: 273138	2016-06-20 07:05:43 +00:00
Craig Topper	8287fd8abd	[X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions. llvm-svn: 271236	2016-05-30 23:15:56 +00:00
Simon Pilgrim	9602d678cb	[X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused. Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 271131	2016-05-28 18:03:41 +00:00
Simon Pilgrim	4642a57fbf	Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) llvm-svn: 270976	2016-05-27 09:02:25 +00:00
Simon Pilgrim	c013e5737b	[X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused. A companion patch (D20684) removes/auto-upgrade the clang intrinsics. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 270973	2016-05-27 08:49:15 +00:00
Craig Topper	25363178bb	[X86] Run the AVX/AVX2 intrinsic tests in AVX512VL mode too just to make sure we don't break any older intrinsics. llvm-svn: 270183	2016-05-20 05:10:32 +00:00
Simon Pilgrim	46e57fd073	[X86][AVX2] Regenerated intrinsics tests llvm-svn: 262401	2016-03-01 21:38:41 +00:00
Craig Topper	ecae476e4c	[X86] int_x86_avx2_permps and X86ISD::VPERMV should take an integer vector for its shuffle indices. llvm-svn: 254269	2015-11-29 22:53:22 +00:00
Ahmed Bougacha	1a498705e4	[X86] Replace avx2 broadcast intrinsics with native IR. Since r245605, the clang headers don't use these anymore. r245165 updated some of the tests already; update the others, add an autoupgrade, remove the intrinsics, and cleanup the definitions. Differential Revision: http://reviews.llvm.org/D10555 llvm-svn: 245606	2015-08-20 20:36:19 +00:00
Sanjay Patel	4339abe66f	[X86, AVX2] Replace inserti128 and extracti128 intrinsics with generic shuffles This should complete the job started in r231794 and continued in r232045: We want to replace as much custom x86 shuffling via intrinsics as possible because pushing the code down the generic shuffle optimization path allows for better codegen and less complexity in LLVM. AVX2 introduced proper integer variants of the hacked integer insert/extract C intrinsics that were created for this same functionality with AVX1. This should complete the removal of insert/extract128 intrinsics. The Clang precursor patch for this change was checked in at r232109. llvm-svn: 232120	2015-03-12 23:16:18 +00:00
Quentin Colombet	f59b2d034c	[X86] Fix a regression introduced by r223641. The permps and permd instructions have their operands swapped compared to the intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP category. I did not create a new category for those two, as they are the only one AFAICT in that case. <rdar://problem/20108262> llvm-svn: 232085	2015-03-12 19:34:12 +00:00
Juergen Ributzka	1f7a17661c	Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic. The intrinsic is no longer generated by the front-end. Remove the intrinsic and auto-upgrade it to a vector shuffle. Reviewed by Nadav This is related to rdar://problem/18742778. llvm-svn: 231182	2015-03-04 00:13:25 +00:00
Craig Topper	b324e43aed	[X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles. llvm-svn: 229640	2015-02-18 06:24:44 +00:00
Craig Topper	49df44e2e2	[X86] Remove the multiply by 8 that goes into the shift constant for X86ISD::VSHLDQ and X86ISD::VSRLDQ. This simplifies the pattern matching in isel and allows these nodes to become the patterns embedded in the instruction. llvm-svn: 229431	2015-02-16 20:52:07 +00:00
Chandler Carruth	c491f72e7a	[x86] Remove some windows line endings that snuck into the tests here. Folks on Windows, remember to set up your subversion to strip these when submitting... llvm-svn: 225593	2015-01-11 01:36:20 +00:00
Simon Pilgrim	a798e9ffdf	[X86][SSE] pslldq/psrldq shuffle mask decodes Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions. Differential Revision: http://reviews.llvm.org/D5598 llvm-svn: 219738	2014-10-14 22:31:34 +00:00
Chandler Carruth	373b2b1728	[x86] Fix a pretty horrible bug and inconsistency in the x86 asm parsing (and latent bug in the instruction definitions). This is effectively a revert of r136287 which tried to address a specific and narrow case of immediate operands failing to be accepted by x86 instructions with a pretty heavy hammer: it introduced a new kind of operand that behaved differently. All of that is removed with this commit, but the test cases are both preserved and enhanced. The core problem that r136287 and this commit are trying to handle is that gas accepts both of the following instructions: insertps $192, %xmm0, %xmm1 insertps $-64, %xmm0, %xmm1 These will encode to the same byte sequence, with the immediate occupying an 8-bit entry. The first form was fixed by r136287 but that broke the prior handling of the second form! =[ Ironically, we would still emit the second form in some cases and then be unable to re-assemble the output. The reason why the first instruction failed to be handled is because prior to r136287 the operands ere marked 'i32i8imm' which forces them to be sign-extenable. Clearly, that won't work for 192 in a single byte. However, making thim zero-extended or "unsigned" doesn't really address the core issue either because it breaks negative immediates. The correct fix is to make these operands 'i8imm' reflecting that they can be either signed or unsigned but must be 8-bit immediates. This patch backs out r136287 and then changes those places as well as some others to use 'i8imm' rather than one of the extended variants. Naturally, this broke something else. The custom DAG nodes had to be updated to have a much more accurate type constraint of an i8 node, and a bunch of Pat immediates needed to be specified as i8 values. The fallout didn't end there though. We also then ceased to be able to match the instruction-specific intrinsics to the instructions so modified. Digging, this is because they too used i32 rather than i8 in their signature. So I've also switched those intrinsics to i8 arguments in line with the instructions. In order to make the intrinsic adjustments of course, I also had to add auto upgrading for the intrinsics. I suspect that the intrinsic argument types may have led everything down this rabbit hole. Pretty happy with the result. llvm-svn: 217310	2014-09-06 10:00:01 +00:00
Quentin Colombet	6f12ae0d5c	[X86] Add broadcast instructions to the table used by ExeDepsFix pass. Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table. That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are across domain (int <-> float). In particular, prior to this patch we were generating: vpbroadcastd LCPI1_0(%rip), %ymm2 vpand %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty Now, we generate the following nice sequence where everything is in the float domain: vbroadcastss LCPI1_0(%rip), %ymm2 vandps %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 <rdar://problem/16354675> llvm-svn: 204770	2014-03-26 00:10:22 +00:00
Cameron McInally	45dc489403	Fix AVX2 Gather execution domains. llvm-svn: 204713	2014-03-25 12:36:38 +00:00
Craig Topper	f7755df776	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. llvm-svn: 160110	2012-07-12 06:52:41 +00:00
Manman Ren	98a5bf24a9	X86: add more GATHER intrinsics in LLVM Corrected type for index of llvm.x86.avx2.gather.d.pd.256 from 256-bit to 128-bit. Corrected types for src\|dst\|mask of llvm.x86.avx2.gather.q.ps.256 from 256-bit to 128-bit. Support the following intrinsics: llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256 llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256 llvm-svn: 159402	2012-06-29 00:54:20 +00:00
Manman Ren	a09820414a	X86: add GATHER intrinsics (AVX2) in LLVM Support the following intrinsics: llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256 llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256 Modified Disassembler to handle VSIB addressing mode. llvm-svn: 159221	2012-06-26 19:47:59 +00:00
Craig Topper	bfc9a5f7d3	Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors. llvm-svn: 154778	2012-04-15 22:43:31 +00:00
Craig Topper	b85e40f738	Remove pcmpgt/pcmpeq intrinsics as clang is not using them. llvm-svn: 149367	2012-01-31 06:52:44 +00:00
Craig Topper	12b72def4e	Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled. llvm-svn: 145376	2011-11-29 05:37:58 +00:00
Craig Topper	a6d409d543	Add AVX2 variable shift instructions and intrinsics. llvm-svn: 143915	2011-11-07 08:26:24 +00:00
Craig Topper	ff39be0afc	Add AVX2 VPMOVMASK instructions and intrinsics. llvm-svn: 143904	2011-11-07 03:20:35 +00:00
Craig Topper	e122dcbf4a	Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects. llvm-svn: 143902	2011-11-07 02:00:04 +00:00
Craig Topper	f01f1b5cb9	More AVX2 instructions and their intrinsics. llvm-svn: 143895	2011-11-06 23:04:08 +00:00
Craig Topper	05d1cb98e7	Add more AVX2 instructions and intrinsics. llvm-svn: 143861	2011-11-06 06:12:20 +00:00
Craig Topper	0e7cbbabea	Add new X86 AVX2 VBROADCAST instructions. llvm-svn: 143612	2011-11-03 07:35:53 +00:00
Craig Topper	a47b05c7f3	More AVX2 instructions and intrinsics. llvm-svn: 143536	2011-11-02 06:54:17 +00:00
Craig Topper	682b850602	Add a bunch more X86 AVX2 instructions and their corresponding intrinsics. llvm-svn: 143529	2011-11-02 04:42:13 +00:00
Craig Topper	cfcfdf2aab	Begin adding AVX2 instructions. No selection support yet other than intrinsics. llvm-svn: 143331	2011-10-31 02:15:10 +00:00

39 Commits