Craig Topper
b6767f3acd
Move more SSE/AVX convert instruction patterns into their definitions.
...
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Craig Topper
fc93281c07
Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
...
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper
024797b9a2
Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
...
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Craig Topper
44f9b5343d
Make CVTSS2SI instruction definition consistent with CVTSD2SI.
...
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper
1c1aef07b8
Fix up memory load types for SSE scalar convert intrinsic patterns.
...
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Jakob Stoklund Olesen
77cd55b4ee
Remove the last mentions of sub_ss and sub_sd from patterns.
...
I'll remove these two sub-register indexes shortly.
llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen
b96d0b4e08
Eliminate sub_ss, sub_sd from broadcast patterns.
...
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).
llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen
206b825f5c
Eliminate more sub_ss / sub_sd patterns.
...
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.
llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen
75d17b0577
Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
...
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.
llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen
ceee4a9d0c
Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
...
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves. They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.
This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.
The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.
llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Craig Topper
c7690ac7ac
Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
...
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Nadav Rotem
4c12245b3a
The vbroadcast family of instructions has 'fallback patterns' in case where the
...
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replacing 0 to 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.
Patch by Michael Kuperstein.
llvm-svn: 160430
2012-07-18 08:14:48 +00:00
Craig Topper
01deb5f2df
Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
...
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Nadav Rotem
ee3552f88d
Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
...
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.
PR12782.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Craig Topper
b3bac4908e
Mark VINSERTI128rm as MayLoad=1. Fixes PR13348.
...
llvm-svn: 160162
2012-07-13 05:46:28 +00:00
Craig Topper
f7755df776
Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
...
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Craig Topper
be41e2daa6
Reverse assembler/disassembler operand order for gather instructions.
...
llvm-svn: 159983
2012-07-10 06:38:33 +00:00
Craig Topper
85c938f44f
Remove extra space.
...
llvm-svn: 159647
2012-07-03 06:48:58 +00:00
Craig Topper
f067f9aa51
Change i128mem/i256mem to f128mem/f256mem on some floating point vector instructions.
...
llvm-svn: 159646
2012-07-03 06:11:06 +00:00
Craig Topper
676dcd8c39
Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252.
...
llvm-svn: 159644
2012-07-03 05:49:45 +00:00
Elena Demikhovsky
9af899fa88
Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
...
llvm-svn: 159504
2012-07-01 06:12:26 +00:00
Manman Ren
98a5bf24a9
X86: add more GATHER intrinsics in LLVM
...
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
from 256-bit to 128-bit.
Support the following intrinsics:
llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256
llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Manman Ren
a09820414a
X86: add GATHER intrinsics (AVX2) in LLVM
...
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256
Modified Disassembler to handle VSIB addressing mode.
llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Craig Topper
94bf0f3855
Remove some duplicate instructions that exist only to given different mnemonics for the assembler. Use InstAlias instead.
...
llvm-svn: 159184
2012-06-26 04:12:49 +00:00
Craig Topper
357de815b4
Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there are no patterns in the instruction.
...
llvm-svn: 159127
2012-06-25 06:51:42 +00:00
Craig Topper
b6eb513c68
Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.
...
llvm-svn: 159126
2012-06-25 06:16:00 +00:00
Craig Topper
fd5e6e7db1
Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
...
llvm-svn: 159109
2012-06-24 07:07:16 +00:00
Craig Topper
b925230fb1
Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
...
llvm-svn: 159108
2012-06-24 06:55:37 +00:00
Craig Topper
f48ec7a708
Fix build failures from r159106.
...
llvm-svn: 159107
2012-06-24 06:08:31 +00:00
Craig Topper
bab2b89944
Remove intrinsic specific instructions for CVTPD2PS and replace with just patterns.
...
llvm-svn: 159106
2012-06-24 05:44:31 +00:00
Craig Topper
3cee08ce7d
Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns.
...
llvm-svn: 159105
2012-06-24 05:33:24 +00:00
Craig Topper
a899cc15f1
Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead instead.
...
llvm-svn: 159090
2012-06-23 22:33:14 +00:00
Craig Topper
7e9415220a
Make CVTDQ2PS instruction use SSE2 predicate instead of SSE1. No functional change because there are no patterns in the instructions. Also fix a typo in a comment.
...
llvm-svn: 159087
2012-06-23 20:52:45 +00:00
Craig Topper
24e3418215
Move CVTPD2DQ to use SSE2 predicate instead of SSE3. Move DQ2PD and PD2DQ to the SSE2 section of the file.
...
llvm-svn: 159086
2012-06-23 20:15:42 +00:00
Craig Topper
8c03ea79c4
Use correct memory types for (V)CVTDQ2PD instructions.
...
llvm-svn: 159075
2012-06-23 08:30:27 +00:00
Craig Topper
431f1e7192
Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits.
...
llvm-svn: 159070
2012-06-23 04:23:36 +00:00
Craig Topper
21d04fc118
Add predicate check around some patterns.
...
llvm-svn: 158797
2012-06-20 07:30:23 +00:00
Craig Topper
3b662a6279
Add predicate check around some patterns.
...
llvm-svn: 158795
2012-06-20 07:01:11 +00:00
Kay Tiong Khoo
390edb0d91
*no need to pollute Intel syntax with bonus mnemonics; operand size is explicitly specified
...
llvm-svn: 158603
2012-06-16 17:19:49 +00:00
Craig Topper
bf2409e8aa
Mark several instructions SSE2 instead of SSE3 as they should be.
...
llvm-svn: 158049
2012-06-06 06:45:27 +00:00
Benjamin Kramer
a0396e4583
X86: Rename the CLMUL target feature to PCLMUL.
...
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.
llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Craig Topper
c1ac05dad5
Add intrinsic for pclmulqdq instruction.
...
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Benjamin Kramer
ef479ea854
Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
...
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.
llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Craig Topper
7daf897678
Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
...
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Craig Topper
dbb98b4917
Fix some issues in the f16c instructions.
...
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Craig Topper
d4e1894ec1
Add SSE4A MOVNTSS/MOVNTSD instructions.
...
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Nadav Rotem
810734b7f4
AVX: Add additional vbroadcast replacement sequences for integers.
...
Remove the v2f64 patterns because it does not match any vbroadcast
instruction.
llvm-svn: 155461
2012-04-24 18:09:59 +00:00
Nadav Rotem
aa3ff8da00
AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
...
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).
llvm-svn: 155437
2012-04-24 11:07:03 +00:00
Elena Demikhovsky
8d7e56c409
ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
...
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Craig Topper
4badeb3f0d
Replace vpermd/vpermps intrinic patterns with custom lowering to target specific nodes.
...
llvm-svn: 154801
2012-04-16 07:13:00 +00:00
Craig Topper
c0075aa7ff
Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.
...
llvm-svn: 154797
2012-04-16 06:26:15 +00:00
Craig Topper
b86fa404d3
Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
...
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper
bfc9a5f7d3
Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
...
llvm-svn: 154778
2012-04-15 22:43:31 +00:00
Nadav Rotem
42bcd04ee3
Fix PR12529. The Vxx family of instructions are only supported by AVX.
...
Use non-vex instructions for SSE4.
llvm-svn: 154770
2012-04-15 19:36:44 +00:00
Elena Demikhovsky
779a72b49e
Added VPERM optimization for AVX2 shuffles
...
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
Craig Topper
d0271b27cb
Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
...
llvm-svn: 154580
2012-04-12 07:23:00 +00:00
Nadav Rotem
9bc178ac5c
Reapply 154396 after fixing a test.
...
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Eric Christopher
65ada95b84
Temporarily revert this patch to see if it brings the buildbots back.
...
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Nadav Rotem
f934f91709
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
...
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154396
2012-04-10 14:33:13 +00:00
Craig Topper
d024cef233
Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
...
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper
aa9aab5ad2
Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
...
llvm-svn: 154268
2012-04-07 21:57:43 +00:00
Craig Topper
7629d63bc4
Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
...
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Chad Rosier
4106917355
[avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmobdqu to
...
vextractf128 with 128-bit mem dest.
Combines
vextractf128 $0, %ymm0, %xmm0
vmovaps %xmm0, (%rdi)
to
vextractf128 $0, %ymm0, (%rdi)
rdar://11082570
llvm-svn: 153139
2012-03-20 21:43:40 +00:00
Chad Rosier
0158ae2e5b
[avx] Add the AddedComplexity to the VINSERTI128 avx2 patterns to give
...
precedence over the VINSERTF128 avx1 patterns.
llvm-svn: 153114
2012-03-20 19:45:07 +00:00
Chad Rosier
93d5427c69
Whitespace.
...
llvm-svn: 153105
2012-03-20 18:38:33 +00:00
Chad Rosier
5a6011267a
[avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove
...
whitespace from test case. No functional change intended.
llvm-svn: 153103
2012-03-20 18:24:55 +00:00
Chad Rosier
07a4cb9382
[avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads.
...
This results in things such as
vmovups 16(%rdi), %xmm0
vinsertf128 $1, %xmm0, %ymm0, %ymm0
to be combined to
vinsertf128 $1, 16(%rdi), %ymm0, %ymm0
rdar://11076953
llvm-svn: 153092
2012-03-20 17:08:51 +00:00
Chad Rosier
b9b73170e3
[avx] Add patterns for VINSERTF128rm.
...
This results in things such as
vmovaps -96(%rbx), %xmm1
vinsertf128 $1, %xmm1, %ymm0, %ymm0
to be combined to
vinsertf128 $1, -96(%rbx), %ymm0, %ymm0
rdar://10643481
llvm-svn: 152762
2012-03-15 00:45:30 +00:00
Kay Tiong Khoo
57c8e7f364
*fix typo in comment; test of commit access
...
llvm-svn: 152507
2012-03-10 21:29:49 +00:00
Chad Rosier
a281afc676
Fix a regression from r147481.
...
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Preston Gurd
a49ef92a76
This patch adds instruction latencies for the SSE instructions
...
to the instruction scheduler for the Intel Atom.
llvm-svn: 151590
2012-02-27 23:35:03 +00:00
Pete Cooper
682c76b7d4
Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
...
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Jia Liu
e1d619691b
some comment fix for X86 and ARM
...
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Jia Liu
b22310fda6
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
...
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper
ba172d2d59
Remove the last of the old vector_shuffle patterns from X86 isel.
...
llvm-svn: 150795
2012-02-17 07:02:34 +00:00
Craig Topper
cfad98f745
Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel.
...
llvm-svn: 150462
2012-02-14 08:14:53 +00:00
Craig Topper
8b19d78808
Still more vector_shuffle pattern removal.
...
llvm-svn: 150365
2012-02-13 07:23:41 +00:00
Craig Topper
74650add0e
Remove more vector_shuffle patterns for unpack. These should be target specific nodes when they get to isel.
...
llvm-svn: 150363
2012-02-13 05:48:49 +00:00
Craig Topper
6d471c9e49
Recommit r150328. Previous test failures should be fixed by r150360.
...
llvm-svn: 150362
2012-02-13 05:10:10 +00:00
NAKAMURA Takumi
0826c17d00
Revert r150328, "Remove more vector_shuffle patterns."
...
It caused 3 failures on pre-penryn and non-x86(generic) hosts.
llvm-svn: 150357
2012-02-13 00:10:15 +00:00
Craig Topper
e24c94af81
Remove more vector_shuffle patterns.
...
llvm-svn: 150328
2012-02-12 08:14:35 +00:00
Craig Topper
d40d9eb2b3
Remove more vector_shuffle patterns.
...
llvm-svn: 150321
2012-02-12 01:07:34 +00:00
Craig Topper
330ca97700
Remove more vector_shuffle patterns.
...
llvm-svn: 150314
2012-02-11 23:31:01 +00:00
Craig Topper
981c6cf7b3
Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel.
...
llvm-svn: 150299
2012-02-11 07:43:35 +00:00
Craig Topper
172b9243cd
Remove a couple unneeded intrinsic patterns
...
llvm-svn: 150067
2012-02-08 08:29:30 +00:00
Craig Topper
5405571fe0
Remove GCC builtins for vpermilp* intrinsics as clang no longer needs them. Custom lower the intrinsics to the vpermilp target specific node and remove intrinsic patterns.
...
llvm-svn: 150060
2012-02-08 06:36:57 +00:00
Craig Topper
b27fd77c3f
Add instruction selection for 256-bit VPSHUFD and 128-bit VPERMILPS/VPERMILPD.
...
llvm-svn: 149968
2012-02-07 06:28:42 +00:00
Craig Topper
1d471e31ba
Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.
...
llvm-svn: 149807
2012-02-05 03:14:49 +00:00
Elena Demikhovsky
fb44980b41
Optimization for SIGN_EXTEND operation on AVX.
...
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.
llvm-svn: 149600
2012-02-02 09:10:43 +00:00
Andrew Trick
8523b16ff5
Instruction scheduling itinerary for Intel Atom.
...
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
llvm-svn: 149558
2012-02-01 23:20:51 +00:00
Craig Topper
516cba3380
Fix pattern for memory form of PSHUFD for use with FP vectors to remove bitcast to an integer vector that normal code wouldn't have. Also remove bitcasts from code that turns splat vector loads into a shuffle as it was making the broken pattern necessary.
...
llvm-svn: 149232
2012-01-30 07:50:31 +00:00
Craig Topper
5639e9e8fb
Move some patterns back near their instructions and use AddedComplexity to fix priority. Merge some patterns into their instruction definition.
...
llvm-svn: 149122
2012-01-27 07:09:40 +00:00
Victor Umansky
5f29b0e57b
Fix for the following bug in AVX codegen for double-to-int conversions:
...
. "fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
. Currently for AVX mode for <4xdouble> and <8xdouble> the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions are selected (for .fp_to_sint. DAG node operation ) by AVX codegen. However they use round-to-nearest-even rounding mode.
. Consequently, the conversion produces incorrect numbers.
The fix is to replace selection of VCVTPD2DQ instructions with VCVTTPD2DQ instructions. The latter use truncate (i.e. round-to-zero) rounding mode.
As .fp_to_sint. DAG node operation is used only for lowering of "fptosi" and "fptoui" IR instructions, the fix in X86InstrSSE.td definition file doesn.t have an impact on other LLVM flows.
The patch includes changes in the .td file, LIT test for the changes and a fix in a legacy LIT test (which produced asm code conflicting with LLVN IR spec).
llvm-svn: 149056
2012-01-26 08:51:39 +00:00
Craig Topper
1c0e22f57a
Fix AVX vs SSE patterns ordering issue for VPCMPESTRM and VPCMPISTRM.
...
llvm-svn: 149053
2012-01-26 07:31:30 +00:00
Craig Topper
b91760eff8
Remove some more patterns by custom lowering intrinsics to target specific nodes.
...
llvm-svn: 149052
2012-01-26 07:18:03 +00:00
Craig Topper
7834900950
Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.
...
llvm-svn: 148933
2012-01-25 06:43:11 +00:00
Craig Topper
ce4f9c5668
Custom lower phadd and phsub intrinsics to target specific nodes. Remove the patterns that are no longer necessary.
...
llvm-svn: 148927
2012-01-25 05:37:32 +00:00
Craig Topper
5bcf070e68
Remove AVX 256-bit unaligned load intrinsics. 128-bit versions had been removed a while ago.
...
llvm-svn: 148922
2012-01-25 04:42:03 +00:00
Craig Topper
3ad5bc019a
Merge intrinsic pattern and no pattern versions of VCVTSD2SI intruction definitions. Matches non-AVX version of same instructions.
...
llvm-svn: 148914
2012-01-25 03:52:09 +00:00
Craig Topper
edd1d0acfc
Custom lower PCMPEQ/PCMPGT intrinsics to target specific nodes and remove the intrinsic patterns.
...
llvm-svn: 148687
2012-01-23 08:18:28 +00:00
Craig Topper
5e80db4e4f
Custom lower vector shift intrinsics to target specific nodes and remove the patterns that are no longer needed.
...
llvm-svn: 148684
2012-01-23 06:16:53 +00:00
Craig Topper
20c98df340
Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments.
...
llvm-svn: 148672
2012-01-23 00:06:44 +00:00
Craig Topper
0b7ad76bd0
Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching.
...
llvm-svn: 148670
2012-01-22 23:36:02 +00:00
Craig Topper
bd4884371b
Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching.
...
llvm-svn: 148667
2012-01-22 22:42:16 +00:00
Craig Topper
094626414d
Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.
...
llvm-svn: 148664
2012-01-22 19:15:14 +00:00
Craig Topper
123adfa0f3
Move some vector shift patterns into their instruction definitions.
...
llvm-svn: 148643
2012-01-22 00:41:20 +00:00
Craig Topper
dcaa5fbd08
Add memory patterns for some of the fp<->integer conversion instructions. Fold some patterns into instruction definitions.
...
llvm-svn: 148641
2012-01-21 18:37:15 +00:00
Craig Topper
3469212c82
Add support for selecting 256-bit PALIGNR.
...
llvm-svn: 148532
2012-01-20 05:53:00 +00:00
Craig Topper
db8890aedd
Give priority to AVX over SSE for 128-bit floating point unpck instructions.
...
llvm-svn: 148233
2012-01-16 09:56:42 +00:00
Craig Topper
c10e1abaf3
Fix the memop type on a couple 256-bit AVX instructions that were using f128mem instead of f256mem.
...
llvm-svn: 148196
2012-01-14 18:29:57 +00:00
Chad Rosier
71a185c5c6
Fix pasto from r146196.
...
llvm-svn: 148167
2012-01-14 01:50:21 +00:00
Craig Topper
e52d86a740
Convert SHUFPD with the same register for both sources to PSHUFD if it would prevent a register copy. Similar to SHUFPS, but requires the mask to be converted.
...
llvm-svn: 148112
2012-01-13 09:21:41 +00:00
Craig Topper
cb7e13d7c0
Make X86 instruction selection use 256-bit VPXOR for build_vector of all ones if AVX2 is enabled. This gives the ExeDepsFix pass a chance to choose FP vs int as appropriate. Also use v8i32 as the type for getZeroVector if AVX2 is enabled. This is consistent with SSE2 using prefering v4i32.
...
llvm-svn: 148108
2012-01-13 08:12:35 +00:00
Craig Topper
9f14d9f939
Add patterns for v16i16 and v32i8 immAllZerosV to select VPXOR to match v4i64 and v8i32.
...
llvm-svn: 148106
2012-01-13 06:59:47 +00:00
Chad Rosier
1a8f0ccd8c
Add missing VEX predicates to VMOVSDto64rr/VMOVSDto64mr. This fixes a few
...
failing test cases on our internal AVX nightly tester.
rdar://10663637
llvm-svn: 147881
2012-01-10 22:14:06 +00:00
Craig Topper
eb8f9e9e5b
Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.
...
llvm-svn: 147841
2012-01-10 06:30:56 +00:00
Craig Topper
b89805c77d
Add HasAVX predicate to some of the AVX patterns.
...
llvm-svn: 147769
2012-01-09 08:34:00 +00:00
Craig Topper
a51f7f75c2
Reorder a bunch of patterns to put the AVX version first thus giving it priority over the SSE version. Another step towards trying to remove the AVX hack that disables SSE from X86Subtarget.
...
llvm-svn: 147768
2012-01-09 08:10:38 +00:00
Craig Topper
ef7f5bf8c9
Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
...
llvm-svn: 147767
2012-01-09 06:52:46 +00:00
Craig Topper
c1f5622ad3
Mark MOVNTI as being supported in SSE2 OR AVX mode. This instruction has no AVX equivalent so we should use the SSE version.
...
llvm-svn: 147766
2012-01-09 06:38:55 +00:00
Craig Topper
a081644f8a
Move SSE2 logical operations PAND/POR/PXOR/PANDN above SSE1 logical operations ANDPS/ORPS/XORPS/ANDNPS. This fixes a pattern ordering issue that meant that the SSE2 instructions could never be directly selected since the SSE1 patterns would always match first. This is largely moot with the ExeDepsFix pass, but I'm trying to audit for all such ordering issues.
...
llvm-svn: 147765
2012-01-09 05:07:01 +00:00
Chad Rosier
493c1b3152
Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather
...
then a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409
llvm-svn: 147481
2012-01-03 21:05:52 +00:00
Craig Topper
53d559641f
Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection.
...
llvm-svn: 147428
2012-01-02 08:46:48 +00:00
Craig Topper
1c064e0a89
Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.
...
llvm-svn: 147409
2012-01-01 19:40:22 +00:00
Craig Topper
6e54ba7eee
Merge X86 SHUFPS and SHUFPD node types.
...
llvm-svn: 147394
2011-12-31 23:50:21 +00:00
Craig Topper
d51092d93a
Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load.
...
llvm-svn: 147393
2011-12-31 23:24:49 +00:00
Craig Topper
0e796fee11
Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.
...
llvm-svn: 147392
2011-12-31 23:15:11 +00:00
Craig Topper
9e61291bf5
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
...
llvm-svn: 147342
2011-12-29 17:41:56 +00:00
Chad Rosier
3172488cc0
Fix 80-column violations.
...
llvm-svn: 147095
2011-12-21 20:59:09 +00:00
Elena Demikhovsky
ec7e6e0946
This is the second fix related to VZEXT_MOVL node.
...
The failure that I see in the current version is:
LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14]
0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13]
0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12]
0x18b9870: v4i64 = undef [ID=4]
0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10]
0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9970: i32 = Constant<0> [ID=3]
0x18b9170: v2i64 = undef [ORD=1] [ID=1]
0x18b9570: i32 = Constant<2> [ID=5]
llvm-svn: 146975
2011-12-20 13:34:28 +00:00
Eli Friedman
64944090ff
Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586.
...
llvm-svn: 146709
2011-12-15 23:46:18 +00:00
Chad Rosier
41dbf59e12
Add missing zmovl AVX patterns which were causing crashes.
...
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!
llvm-svn: 146689
2011-12-15 22:11:31 +00:00
Benjamin Kramer
16bbfbec66
X86: Add patterns for the various rounding ops for SSE4.1 and AVX.
...
llvm-svn: 146257
2011-12-09 15:44:03 +00:00
Benjamin Kramer
2dc5dec41d
X86: Split (v)rounds[sd] into a normal and an intrinsic version.
...
llvm-svn: 146256
2011-12-09 15:43:55 +00:00
Evan Cheng
b96bca81e7
Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417
...
llvm-svn: 146196
2011-12-08 22:30:45 +00:00
Evan Cheng
2a217be25f
Add various missing AVX patterns which was causing crashes. Sadly, the generated
...
code looks pretty bad compared to SSE.
rdar://10538793
llvm-svn: 146191
2011-12-08 22:05:28 +00:00
Evan Cheng
4d1a2d449f
Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp
...
if (HasAVX)
X86SSELevel = NoMMXSSE;
This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected.
However, this breaks instructions which do not have AVX variants.
The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.
However, we need to audit all the patterns before we make the change. This patch is workaround that fixes one specific case,
the prefetch instructions. rdar://10538297
llvm-svn: 146163
2011-12-08 19:00:42 +00:00
Craig Topper
1d578e8835
Fix a bunch of SSE/AVX patterns to use proper memop types. In particular, not using integer loads other than v2i64/v4i64 since the others are all promoted.
...
llvm-svn: 146031
2011-12-07 08:30:53 +00:00
Craig Topper
6572e0f203
Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those.
...
llvm-svn: 145927
2011-12-06 09:04:59 +00:00
Craig Topper
8d4ba198d6
Merge floating point and integer UNPCK X86ISD node types.
...
llvm-svn: 145926
2011-12-06 08:21:25 +00:00
Craig Topper
0a672eaf9e
Merge VPERM2F128/VPERM2I128 ISD node types.
...
llvm-svn: 145485
2011-11-30 07:47:51 +00:00
Craig Topper
bafd224c8b
Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
...
llvm-svn: 145483
2011-11-30 06:25:25 +00:00
Evan Cheng
648e48d02e
Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins.
...
llvm-svn: 145448
2011-11-29 22:48:34 +00:00
Jakob Stoklund Olesen
bde32d36bb
Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.
...
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.
This also makes the AVX variants redundant.
llvm-svn: 145440
2011-11-29 22:27:25 +00:00
Elena Demikhovsky
7a81dea516
Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
...
Added a test.
Thanks Bruno for reviewing the patch.
llvm-svn: 145403
2011-11-29 15:00:45 +00:00
Craig Topper
c16db840be
Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD.
...
llvm-svn: 145390
2011-11-29 07:49:05 +00:00
Craig Topper
12b72def4e
Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
...
llvm-svn: 145376
2011-11-29 05:37:58 +00:00
Craig Topper
897a7d4b9c
Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2.
...
llvm-svn: 145370
2011-11-29 03:57:34 +00:00
Evan Cheng
aa93ceb164
Add missing avx pattern.
...
llvm-svn: 145272
2011-11-28 20:27:23 +00:00
Craig Topper
818a983e93
Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
...
llvm-svn: 145238
2011-11-28 10:14:51 +00:00
Craig Topper
51280d565b
Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonalizing that occurs when shuffle nodes are created.
...
llvm-svn: 145153
2011-11-26 22:55:48 +00:00
Craig Topper
7704bd7ac3
Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
...
llvm-svn: 145148
2011-11-26 20:47:44 +00:00
Craig Topper
d65a444478
Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type disinquish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
...
llvm-svn: 145126
2011-11-24 22:57:10 +00:00
Craig Topper
d26466748b
Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
...
llvm-svn: 145125
2011-11-24 22:20:08 +00:00
Craig Topper
6270d072c5
Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
...
llvm-svn: 145028
2011-11-21 08:26:50 +00:00
Craig Topper
669199ca94
Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled.
...
llvm-svn: 145026
2011-11-21 06:57:39 +00:00
Craig Topper
e79761df73
Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
...
llvm-svn: 145005
2011-11-20 00:12:05 +00:00
Craig Topper
a3a6583694
Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
...
llvm-svn: 145004
2011-11-19 22:34:59 +00:00
Craig Topper
bac86038ac
Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns.
...
llvm-svn: 145003
2011-11-19 21:01:54 +00:00
Craig Topper
3af6ae089f
Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns.
...
llvm-svn: 144999
2011-11-19 17:46:46 +00:00
Craig Topper
f984efbfce
Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
...
llvm-svn: 144989
2011-11-19 09:02:40 +00:00
Craig Topper
81390be00f
Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
...
llvm-svn: 144988
2011-11-19 07:33:10 +00:00
Craig Topper
de6b73bb4d
Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
...
llvm-svn: 144987
2011-11-19 07:07:26 +00:00
Craig Topper
66e2b5a61e
Remove unused parameters from the AVX maskmov classes.
...
llvm-svn: 144985
2011-11-19 04:49:22 +00:00
Nadav Rotem
1ec141d0f9
Add AVX2 vpbroadcast support
...
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Craig Topper
f41e1d0246
Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments.
...
llvm-svn: 144896
2011-11-17 07:49:38 +00:00
Craig Topper
f17b600577
Remove seemingly unnecessary duplicate VROUND definitions.
...
llvm-svn: 144885
2011-11-17 07:04:00 +00:00
Evan Cheng
011538dc79
Another missing X86ISD::MOVLPD pattern. rdar://10450317
...
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Craig Topper
3ed7d9ee5a
Fix the execution domain on a bunch of SSE/AVX instructions.
...
llvm-svn: 144784
2011-11-16 07:30:46 +00:00
Evan Cheng
fb13d32b3f
Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
...
llvm-svn: 144566
2011-11-14 20:35:52 +00:00
Craig Topper
a331515c82
Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway.
...
llvm-svn: 144522
2011-11-14 06:46:21 +00:00
Craig Topper
3dc75f9e3b
Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code.
...
llvm-svn: 144457
2011-11-12 09:58:49 +00:00
Craig Topper
ea28a34c43
Add lowering for AVX2 shift instructions.
...
llvm-svn: 144380
2011-11-11 07:39:23 +00:00
Nadav Rotem
0a2f797dec
AVX2: Add variable shift from memory.
...
Note: These patterns only works in some cases because
many times the load sd node is bitcasted from a load
node of a different type.
llvm-svn: 144266
2011-11-10 06:54:20 +00:00
Nadav Rotem
1938482bfa
AVX2: Add patterns for variable shift operations
...
llvm-svn: 144212
2011-11-09 21:22:13 +00:00
Nadav Rotem
79135d844d
Add AVX2 support for vselect of v32i8
...
llvm-svn: 144187
2011-11-09 13:21:28 +00:00
Craig Topper
c9eb09d3b8
Add instruction selection for AVX2 integer comparisons.
...
llvm-svn: 144176
2011-11-09 08:06:13 +00:00
Evan Cheng
91b56e0390
Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222
...
llvm-svn: 144052
2011-11-08 00:31:58 +00:00
Craig Topper
a6d409d543
Add AVX2 variable shift instructions and intrinsics.
...
llvm-svn: 143915
2011-11-07 08:26:24 +00:00
Craig Topper
ff39be0afc
Add AVX2 VPMOVMASK instructions and intrinsics.
...
llvm-svn: 143904
2011-11-07 03:20:35 +00:00
Craig Topper
e122dcbf4a
Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects.
...
llvm-svn: 143902
2011-11-07 02:00:04 +00:00
Craig Topper
f01f1b5cb9
More AVX2 instructions and their intrinsics.
...
llvm-svn: 143895
2011-11-06 23:04:08 +00:00
Craig Topper
05d1cb98e7
Add more AVX2 instructions and intrinsics.
...
llvm-svn: 143861
2011-11-06 06:12:20 +00:00
Craig Topper
caba032f48
Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions
...
llvm-svn: 143683
2011-11-04 06:59:49 +00:00
Craig Topper
0e7cbbabea
Add new X86 AVX2 VBROADCAST instructions.
...
llvm-svn: 143612
2011-11-03 07:35:53 +00:00
Craig Topper
a47b05c7f3
More AVX2 instructions and intrinsics.
...
llvm-svn: 143536
2011-11-02 06:54:17 +00:00
Craig Topper
682b850602
Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
...
llvm-svn: 143529
2011-11-02 04:42:13 +00:00
Craig Topper
cfcfdf2aab
Begin adding AVX2 instructions. No selection support yet other than intrinsics.
...
llvm-svn: 143331
2011-10-31 02:15:10 +00:00
Jakob Stoklund Olesen
eafa9d50c2
V_SET0 has no side effects.
...
TableGen will mark any pattern-less instruction as having unmodeled side
effects. This is extra bad for V_SET0 which gets rematerialized a lot.
This was part of the cause for PR11125, but the real bug was fixed
in r141923.
llvm-svn: 141924
2011-10-14 00:39:50 +00:00
Craig Topper
2fdcb1f045
Add 'implicit EFLAGS' to patterns for popcnt and lzcnt
...
llvm-svn: 141853
2011-10-13 06:18:52 +00:00
Craig Topper
63bc541196
Add HasPOPCNT predicate to the POPCNT instructions. Also mark POPCNT as modifying EFLAGS.
...
llvm-svn: 141656
2011-10-11 07:13:09 +00:00
Craig Topper
0fbca75c17
Make Ivy Bridge 16-bit floating point conversion instructions require AVX.
...
llvm-svn: 141654
2011-10-11 07:01:37 +00:00
Craig Topper
fe9179fa4f
Add Ivy Bridge 16-bit floating point conversion instructions for the X86 disassembler.
...
llvm-svn: 141505
2011-10-09 07:31:39 +00:00
Craig Topper
f18c896337
Add support in the disassembler for ignoring the L-bit on certain VEX instructions. Mark instructions that have this behavior. Fixes PR10676.
...
llvm-svn: 141065
2011-10-04 06:30:42 +00:00
Craig Topper
786bdb9e14
Add support for MOVBE and RDRAND instructions for the assembler and disassembler. Includes feature flag checking, but no instrinsic support. Fixes PR10832, PR11026 and PR11027.
...
llvm-svn: 141007
2011-10-03 17:28:23 +00:00
Jakob Stoklund Olesen
dd1904e7a6
Expand the x86 V_SET0* pseudos right after register allocation.
...
This also makes it possible to reduce the number of pseudo instructions
and get rid of the encoding information.
llvm-svn: 140776
2011-09-29 05:10:54 +00:00
Duncan Sands
a54fd541c2
Implement Chris's suggestion of legalizing the various SSE and AVX
...
hadd/hsub intrinsics into the new fhadd/fhsub X86 node.
llvm-svn: 140383
2011-09-23 16:10:22 +00:00
Duncan Sands
0e4fcb8e3b
Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from
...
floating point add/sub of appropriate shuffle vectors. Does not
synthesize the 256 bit AVX versions because they work differently.
llvm-svn: 140332
2011-09-22 20:15:48 +00:00
Bruno Cardoso Lopes
8058234b32
Revert r140097, working on a better approach
...
llvm-svn: 140203
2011-09-20 23:19:29 +00:00
Bruno Cardoso Lopes
33e91a6cf7
The wrong relocation was being emitted for several SSSE3 instructions.
...
This fixes PR10963. Thanks to Benjamin for finding the wrong tablegen
declaration.
llvm-svn: 140184
2011-09-20 21:39:21 +00:00
Bruno Cardoso Lopes
c4398d2c7b
Fix PR10949. Fix the encoding of VMOVPQIto64rr.
...
llvm-svn: 140098
2011-09-19 23:36:59 +00:00
Bruno Cardoso Lopes
51792dcc4d
Based on the small opt Zvi's patch was trying to achieve, eliminate
...
128-bit undef subvector insertion into a 256-bit vector
llvm-svn: 140097
2011-09-19 23:36:50 +00:00
Bruno Cardoso Lopes
d4a3d452d4
Match X86ISD::FSETCCsd and X86ISD::FSETCCss while in AVX mode. This fix
...
PR10955 and PR10948.
llvm-svn: 140069
2011-09-19 21:29:24 +00:00
Bruno Cardoso Lopes
4641efe304
Describe more AVX 128-bit convert instructions without patterns to have
...
mayLoad = 1
llvm-svn: 139973
2011-09-16 23:41:29 +00:00
Bruno Cardoso Lopes
5389ed5dfb
Add mayLoad attribute to AVX convert instructions, since non of them
...
are declared with load patterns. This fix the crash in PR10941. No testcases,
since a fold is triggered and then converted back to the register form
afterwards.
llvm-svn: 139953
2011-09-16 22:02:14 +00:00
Craig Topper
ee8157cb41
Fix mem type for VEX.128 form of VROUNDP*. Remove filter preventing VROUND from being recognized by disassembler.
...
llvm-svn: 139691
2011-09-14 06:41:26 +00:00
Bruno Cardoso Lopes
d560b8c8e9
Teach the foldable tables about 128-bit AVX instructions and make the
...
alignment check for 256-bit classes more strict. There're no testcases
but we catch more folding cases for AVX while running single and multi
sources in the llvm testsuite.
Since some 128-bit AVX instructions have different number of operands
than their SSE counterparts, they are placed in different tables.
256-bit AVX instructions should also be added in the table soon. And
there a few more 128-bit versions to handled, which should come in
the following commits.
llvm-svn: 139687
2011-09-14 02:36:58 +00:00
Nadav Rotem
9cfbeaff15
swap vselect operand order - pr10907
...
llvm-svn: 139630
2011-09-13 19:56:38 +00:00
Bruno Cardoso Lopes
03d6002d68
Add versions 256-bit versions of alignedstore and alignedload, to be
...
more strict about the alignment checking. This was found by inspection
and I don't have any testcases so far, although the llvm testsuite runs
without any problem.
llvm-svn: 139625
2011-09-13 19:33:03 +00:00
Craig Topper
e98d8a5c84
Remove filter that was preventing MOVDQU/MOVDQA and their VEX forms from being disassembled. Also added encodings for the other register/register form of these instructions. Fixes PR10848.
...
llvm-svn: 139588
2011-09-13 06:54:58 +00:00
Craig Topper
b7ae29e404
Fix encoding of VMOVDQU to not simultaneously be 'TB OpSize' and 'XS'. 'XS' is correct and seems to have been taking priority.
...
llvm-svn: 139587
2011-09-13 06:39:34 +00:00
Bruno Cardoso Lopes
ff8d8a830e
Fix PR10845. SUBREG_TO_REG shouldn't be used when the input and
...
destination types are equal!
llvm-svn: 139553
2011-09-12 22:59:23 +00:00
Bruno Cardoso Lopes
f6382979f2
Organize a bit the operand names for CMPPS and CMPPD
...
llvm-svn: 139527
2011-09-12 19:30:36 +00:00
Bruno Cardoso Lopes
2e4bee16bb
Realign BLEND patterns to match the general style for patterns in .td file.
...
llvm-svn: 139526
2011-09-12 19:30:33 +00:00
Bruno Cardoso Lopes
9c9f64918c
Fix 80-columns
...
llvm-svn: 139525
2011-09-12 19:30:29 +00:00
Nadav Rotem
c0c71e162a
Format patterns, remove unused X86blend patterns
...
llvm-svn: 139491
2011-09-12 08:41:50 +00:00
Craig Topper
48f2b36911
Fix disassembling of one of the register/register forms of MOVUPS/MOVUPD/MOVAPS/MOVAPD/MOVSS/MOVSD and their VEX equivalents. Fixes PR10877.
...
llvm-svn: 139486
2011-09-11 23:19:54 +00:00
Nadav Rotem
b873b18721
CR fixes per Bruno's request.
...
Undo the changes from r139285 which added custom lowering to vselect.
Add tablegen lowering for vselect.
llvm-svn: 139479
2011-09-11 15:02:23 +00:00
Nadav Rotem
de838daefd
Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type
...
llvm-svn: 139400
2011-09-09 20:29:17 +00:00
Bruno Cardoso Lopes
46b9cde019
Add a AVX version of a simple i64 -> f64 bitcast. This could be
...
triggered using llc with -O0, which wouldn't let it be folded and
expose the lack of this pattern.
llvm-svn: 139320
2011-09-08 21:52:33 +00:00
Bruno Cardoso Lopes
fb113a0051
Add AVX versions of blend vector operations and fix some issues noticed
...
in Nadav's r139285 and r139287 commits.
1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions
llvm-svn: 139305
2011-09-08 18:05:08 +00:00
Bruno Cardoso Lopes
ea8d803bb0
Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl.
...
Triggered using llc -O0. Also fix some SET0PS patterns to their AVX
forms and test it on the testcase.
llvm-svn: 139304
2011-09-08 18:05:02 +00:00
Nadav Rotem
2550ba2a27
Add X86-SSE4 codegen support for vector-select.
...
llvm-svn: 139285
2011-09-08 08:11:19 +00:00
Bruno Cardoso Lopes
07d9914620
Add AVX versions to match AESENC/AESDEC intrinsics. This hopefully ends
...
the cycle of missing AVX counterparts of already present SSE* patterns
llvm-svn: 139073
2011-09-03 00:47:08 +00:00
Bruno Cardoso Lopes
1d5c2d9227
Add AVX version of a SSE4.1 VPBLENDVB pattern
...
llvm-svn: 139072
2011-09-03 00:47:05 +00:00
Bruno Cardoso Lopes
212a8c4357
Add AVX versions of SSE4.1 EXTRACTPS patterns
...
llvm-svn: 139071
2011-09-03 00:47:03 +00:00
Bruno Cardoso Lopes
3d581a36b6
Add AVX versions for SSE4.1 MOVZX* patterns
...
llvm-svn: 139070
2011-09-03 00:47:01 +00:00
Bruno Cardoso Lopes
6d701fcef0
Add one more AVX pattern for MOVZPQILo2PQI
...
llvm-svn: 139069
2011-09-03 00:46:58 +00:00
Bruno Cardoso Lopes
9923c51564
Move PUNPCKLQDQ splat pattern close to the instruction definition and
...
duplicate it for AVX mode.
llvm-svn: 139068
2011-09-03 00:46:56 +00:00
Bruno Cardoso Lopes
96b11f39e2
Add AVX pattern versions for PSHUFB,PSIGN{B,W,D}
...
llvm-svn: 139067
2011-09-03 00:46:54 +00:00
Bruno Cardoso Lopes
9a0da1e57a
Add AVX versions of MOVZDI2PDI patterns. Use SUBREG_TO_REG to indicate
...
that the AVX versions (even the 128-bit ones) all clear the upper part
of the destination register.
llvm-svn: 139066
2011-09-03 00:46:51 +00:00
Bruno Cardoso Lopes
903952223a
Enforce subtarget checks in a few places to be explicit when the
...
pattern should be matched
llvm-svn: 139065
2011-09-03 00:46:49 +00:00
Bruno Cardoso Lopes
521b0cfdc6
Tidy up code moving patterns to their appropriate place!
...
llvm-svn: 139064
2011-09-03 00:46:47 +00:00
Bruno Cardoso Lopes
aad5e50ded
Add AVX versions of FsMOVAPS and FsMOVAPS. Teach X86InstrInfo how to use
...
it!
llvm-svn: 139063
2011-09-03 00:46:45 +00:00
Bruno Cardoso Lopes
006c9371a1
Fix 80-column and style
...
llvm-svn: 139061
2011-09-03 00:46:40 +00:00
Bruno Cardoso Lopes
dbb40015ff
Tidy up some SSE/AVX convert intrinsics. Also add an AVX version of
...
OptForSize pattern
llvm-svn: 139060
2011-09-03 00:46:38 +00:00
Bruno Cardoso Lopes
a0d85139e5
Move more code around and duplicate AVX patterns: MOVHPS and MOVLPS
...
llvm-svn: 138897
2011-08-31 21:15:32 +00:00
Bruno Cardoso Lopes
21a180367b
Move MOVAPS,MOVUPS patterns close to the instructions definition
...
llvm-svn: 138896
2011-08-31 21:15:29 +00:00
Bruno Cardoso Lopes
941001312a
Remove "_Int" forms of MOVUPSmr and MOVAPSmr
...
llvm-svn: 138895
2011-08-31 21:15:22 +00:00
Bruno Cardoso Lopes
9fc6b8be03
- Move all MOVSS and MOVSD patterns close to their definitions
...
- Duplicate some store patterns to their AVX forms!
- Catched a bug while restricting the patterns subtarget, fix it
and update a testcase to check it properly
llvm-svn: 138851
2011-08-31 03:04:20 +00:00
Bruno Cardoso Lopes
aa1daa63da
Remove unnecessary AVX checks
...
llvm-svn: 138850
2011-08-31 03:04:14 +00:00
Evan Cheng
cb1e5bae4c
Fix (movhps load) lowering / pattern to match more cases. rdar://10050549
...
llvm-svn: 138848
2011-08-31 02:05:24 +00:00
Bruno Cardoso Lopes
50e0170fa5
Move non-intruction patterns to a more appropriate place!
...
llvm-svn: 138744
2011-08-29 17:51:24 +00:00
Craig Topper
c66d50d1a2
Fix disassembling of VCVTSD2SI
...
llvm-svn: 138623
2011-08-26 04:49:29 +00:00
Bruno Cardoso Lopes
ed834810be
Do the same as r138461. Mark VZEROALL as clobbering all YMM registers
...
llvm-svn: 138592
2011-08-25 22:23:58 +00:00
Bruno Cardoso Lopes
8347b86293
Add support for AVX 256-bit version of MOVDDUP!
...
llvm-svn: 138588
2011-08-25 21:40:37 +00:00
Craig Topper
14380ff9a0
Add more missing TB encodings to VEX instructions to allow them to be disassembled. Fixes remainder of PR10678.
...
llvm-svn: 138553
2011-08-25 08:11:01 +00:00
Craig Topper
e1541838f9
Add TB encoding to VEROALL, VZEROUPPER, and VCVTPS2PD to allow them to be disassembled. Fixes PR10723.
...
llvm-svn: 138551
2011-08-25 06:57:46 +00:00
Bruno Cardoso Lopes
296256fb32
Add support for 256-bit versions of VSHUFPD and VSHUFPS.
...
llvm-svn: 138546
2011-08-25 02:58:26 +00:00
Bruno Cardoso Lopes
50d74211df
Create a section for non-instructions patterns in the beginning of the
...
file, and move more code around!
llvm-svn: 138521
2011-08-24 23:18:11 +00:00