Michael Liao
00b20cc924
[PATCH] Fix VGATHER* operand constraints
...
Add earlyclobber constraints to prevent an input register from being allocated
as the output register because, according to the Intel spec [1], "If any pair
of the index, mask, or destination registers are the same, this
instruction results a UD fault."
---
[1] http://software.intel.com/sites/default/files/319433-014.pdf
llvm-svn: 183327
2013-06-05 18:12:26 +00:00
Elena Demikhovsky
fad029202f
Removed SSEPacked domain from all forms (AVX, SSE, signed, unsigned) scalar compare instructions, like COMISS, COMISD.
...
No functional changes.
llvm-svn: 182371
2013-05-21 12:04:22 +00:00
Preston Gurd
9264c95400
Corrected Atom latencies for SSE SQRT instructions.
...
llvm-svn: 181346
2013-05-07 19:57:34 +00:00
Rafael Espindola
817c1d92b4
Put VMOVPQIto64rr in the VRPDI class.
...
Patch by Joshua Magee.
llvm-svn: 180842
2013-05-01 13:00:16 +00:00
Benjamin Kramer
aec90531f9
X86: Now that we have a canonical form for vector integer abs, match it into pabs.
...
llvm-svn: 180600
2013-04-26 12:05:21 +00:00
Jakob Stoklund Olesen
e440d476ee
Annotate the remaining x86 instructions with SchedRW lists.
...
Now all x86 instructions that have itinerary classes also have SchedRW
lists. This is required before the new scheduling models can be used.
There are still unannotated instructions remaining, but they don't have
itinerary classes either.
llvm-svn: 178051
2013-03-26 18:24:22 +00:00
Jakob Stoklund Olesen
4d39e81fb8
Remove IIC_DEFAULT from X86Schedule.td
...
All the instructions tagged with IIC_DEFAULT had nothing in common, and
we already have a NoItineraries class to represent untagged
instructions.
llvm-svn: 177937
2013-03-25 23:12:41 +00:00
Jakob Stoklund Olesen
712f674880
Model prefetches and barriers as loads.
...
It's not yet clear if these instructions need a more careful model.
llvm-svn: 177599
2013-03-20 23:09:53 +00:00
Jakob Stoklund Olesen
5b535c965e
Add a catch-all WriteSystem SchedWrite type.
...
This is used for all the expensive system instructions.
llvm-svn: 177598
2013-03-20 23:09:50 +00:00
Jakob Stoklund Olesen
cd4ebb7639
Annotate the remaining SSE MOV instructions.
...
llvm-svn: 177592
2013-03-20 22:37:16 +00:00
Jakob Stoklund Olesen
c6dc70d865
Annotate SSE horizontal and integer instructions.
...
llvm-svn: 177591
2013-03-20 22:37:13 +00:00
Jakob Stoklund Olesen
7a8bb72a3a
Add some missing SSE annotations.
...
llvm-svn: 177540
2013-03-20 16:56:39 +00:00
Jakob Stoklund Olesen
3a546156c7
Annotate various null idioms with SchedRW lists.
...
llvm-svn: 177461
2013-03-19 23:23:31 +00:00
Jakob Stoklund Olesen
24aac1dc92
Annotate SSE float conversions with SchedRW lists.
...
llvm-svn: 177460
2013-03-19 23:23:29 +00:00
Jakob Stoklund Olesen
a5158c8f0a
Add SchedRW annotations to most of X86InstrSSE.td.
...
We hitch a ride with the existing OpndItins class that was used to add
instruction itinerary classes in the many multiclasses in this file.
Use the link provided by the X86FoldableSchedWrite.Folded to find the
right SchedWrite for folded loads.
llvm-svn: 177326
2013-03-18 22:01:35 +00:00
Nadav Rotem
adfa5eaf8c
Unaligned loads should use the VMOVUPS opcode.
...
llvm-svn: 177130
2013-03-14 23:49:44 +00:00
Craig Topper
8fb09f0abb
Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction.
...
llvm-svn: 173667
2013-01-28 06:48:25 +00:00
Craig Topper
c7e6feee42
Combine AVX and SSE forms of MOVSS and MOVSD into the same multiclasses so they get instantiated together.
...
llvm-svn: 172704
2013-01-17 06:59:42 +00:00
Craig Topper
0d2c29e807
Simplify nested strconcats in X86 td files since strconcat can take more than 2 arguments.
...
llvm-svn: 172379
2013-01-14 07:46:34 +00:00
Craig Topper
4c69a05d2d
Create a single multiclass for SSE and AVX version of MOVL/MOVH. Prevents needing to specify everything twice. No functional change intended
...
llvm-svn: 172378
2013-01-14 07:26:58 +00:00
Benjamin Kramer
bcd14a0f26
X86: Add patterns for X86ISD::VSEXT in registers.
...
Those can occur when something between the sextload and the store is on the same
chain and blocks isel. Fixes PR14887.
llvm-svn: 172353
2013-01-13 11:37:04 +00:00
Craig Topper
bd62d64cbf
Remove unnecessary # tokens at the beginning and end of defm names.
...
llvm-svn: 171694
2013-01-07 05:04:39 +00:00
Craig Topper
4f1c7256f9
Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior.
...
cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.
Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.
llvm-svn: 171668
2013-01-06 20:39:29 +00:00
Craig Topper
9791afb182
Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP.
...
llvm-svn: 171356
2013-01-02 08:00:39 +00:00
Craig Topper
4bc5c4e152
Merge SSE and AVX instruction definitions for PSHUFD/PSHUFHW/PSHUFLW.
...
llvm-svn: 171355
2013-01-02 07:27:49 +00:00
Rafael Espindola
db1a84c84a
Revert 171351. It broke MC/X86/x86-32-avx.s.
...
llvm-svn: 171352
2013-01-02 01:35:11 +00:00
Craig Topper
86d0cdb82f
Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP.
...
llvm-svn: 171351
2013-01-01 20:53:20 +00:00
Craig Topper
12ed9cd6ae
Remove unused argument from a multiclass.
...
llvm-svn: 171340
2013-01-01 03:42:44 +00:00
Craig Topper
2edafc059d
Merge intrinsic instruction definitions for SSE and AVX versions of RCPPS and RSQRTPS.
...
llvm-svn: 171339
2013-01-01 03:30:21 +00:00
Craig Topper
d04dbec6c9
Remove 2 unused multiclasses.
...
llvm-svn: 171338
2013-01-01 02:02:45 +00:00
Craig Topper
7cc4f322cf
Merge AVX/SSE instruction definitions for SQRTPS/PD, RSQRTPS, RCPPS. No functional change intended.
...
llvm-svn: 171337
2013-01-01 00:11:07 +00:00
Craig Topper
c2521cd309
Use packed instead of scalar itineraries for SSE1/2 SQRTPS/PD, RCPPS, and RSQRTPS. VEX-encoded forms already use packed.
...
llvm-svn: 171336
2012-12-31 23:49:05 +00:00
Craig Topper
fe82eb6bcd
Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those.
...
llvm-svn: 171237
2012-12-29 18:18:20 +00:00
Craig Topper
6b27251a76
Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled.
...
llvm-svn: 171227
2012-12-29 16:44:25 +00:00
Craig Topper
ab2e6842cc
Merge basic_sse12_fp_binop_p_int and basic_sse12_fp_binop_p_y_int multiclasses.
...
llvm-svn: 171171
2012-12-27 22:53:47 +00:00
Craig Topper
e2eec3c52b
Merge basic_sse12_fp_binop_p and basic_sse12_fp_binop_p_y multiclasses.
...
llvm-svn: 171166
2012-12-27 18:51:50 +00:00
Craig Topper
757f3fc394
Add hasSideEffects=0 to some forms of ROUND, RCP, and RSQRT.
...
llvm-svn: 171143
2012-12-27 07:16:08 +00:00
Craig Topper
09ce4b9efe
Move single letter 'P' prefix out of multiclass now that tablegen allows defm to start with #NAME. This makes instruction names more searchable again.
...
llvm-svn: 171141
2012-12-27 06:34:54 +00:00
Craig Topper
1b8c0750ee
Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier.
...
llvm-svn: 171118
2012-12-26 21:30:22 +00:00
Craig Topper
18f2675e9b
Remove a special conditional setting of neverHasSideEffects if the instruction didn't have a pattern. This was leftover from when tablegen used to complain if things were already inferred from patterns.
...
llvm-svn: 171117
2012-12-26 21:04:30 +00:00
Craig Topper
24f316e4db
Merge still more SSE/AVX instruction definitions.
...
llvm-svn: 171103
2012-12-26 07:54:43 +00:00
Craig Topper
af629e2700
Merge more SSE/AVX instruction definitions.
...
llvm-svn: 171102
2012-12-26 07:20:35 +00:00
Craig Topper
65fe30450d
Fix 80 column violation.
...
llvm-svn: 171097
2012-12-26 06:15:53 +00:00
Craig Topper
f4d0fe8fcd
Fix class name in comment.
...
llvm-svn: 171096
2012-12-26 06:15:09 +00:00
Craig Topper
59747c4dbd
Merge SSE/AVX PCMPEQ/PCMPGT instruction definitions.
...
llvm-svn: 171095
2012-12-26 06:14:15 +00:00
Craig Topper
8a48677586
Remove 'v' from mnemonic to fix asm matching failures.
...
llvm-svn: 171093
2012-12-26 06:02:15 +00:00
Craig Topper
b4ef0fa3a1
Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for a bunch of SSE2 integer arithmetic instructions.
...
llvm-svn: 171092
2012-12-26 05:49:15 +00:00
Craig Topper
a2594dd5f0
Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for PAND/POR/PXOR/PANDN
...
llvm-svn: 171087
2012-12-26 04:36:03 +00:00
Craig Topper
97730a0d6a
Merge an AVX/SSE 256-bit and 128-bit multiclass.
...
llvm-svn: 171086
2012-12-26 03:56:47 +00:00
Craig Topper
8b59746390
Mark VANDNPD/VANDNPDS as not commutable.
...
llvm-svn: 171085
2012-12-26 03:48:10 +00:00
Benjamin Kramer
4669d18893
X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
...
This is very mechanical, no functionality change. Preparation for PR14667.
llvm-svn: 170898
2012-12-21 14:04:55 +00:00
Elena Demikhovsky
14a4af0e66
Optimized load + SIGN_EXTEND patterns in the X86 backend.
...
llvm-svn: 170506
2012-12-19 07:50:20 +00:00
Benjamin Kramer
b16ccde7a4
X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
...
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.
llvm-svn: 170273
2012-12-15 16:47:44 +00:00
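As a rough illustration of the match described in r170273, the sketch below checks in plain Python that the select form "x >= y ? x-y : 0" agrees with an unsigned saturating subtract on 8-bit values. The function names are hypothetical, and the code models only the per-lane arithmetic, not the DAG combine itself.

```python
def select_form(x: int, y: int) -> int:
    """The pattern quoted in the commit message: x >= y ? x - y : 0."""
    return x - y if x >= y else 0


def subus_u8(x: int, y: int) -> int:
    """Unsigned saturating subtract on one 8-bit lane: clamp at zero."""
    return max(x - y, 0)


def check_all_u8() -> bool:
    """Exhaustively compare both forms over every pair of 8-bit inputs."""
    return all(
        select_form(x, y) == subus_u8(x, y)
        for x in range(256)
        for y in range(256)
    )
```

Because the two forms agree on every input pair, replacing the compare-and-select with a single saturating subtract does not change the result for these lane values.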
Craig Topper
216bcd522b
Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions.
...
llvm-svn: 169482
2012-12-06 07:31:16 +00:00
Craig Topper
922f10aec4
Mark MOVDQ(A/U)rm as ReMaterializable. Mark all MOVDQ(A/U) instructions as neverHasSideEffects.
...
llvm-svn: 169477
2012-12-06 06:49:16 +00:00
Elena Demikhovsky
cd3c1c4a16
Simplified BLEND pattern matching for shuffles.
...
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.
llvm-svn: 169366
2012-12-05 09:24:57 +00:00
Craig Topper
70601ba6f9
Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types.
...
llvm-svn: 168141
2012-11-16 06:37:56 +00:00
Craig Topper
9268c94b15
Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support.
...
llvm-svn: 167652
2012-11-10 01:23:36 +00:00
Michael Liao
ec47090b1e
Remove trailing whitespace
...
llvm-svn: 167445
2012-11-06 08:06:35 +00:00
Manman Ren
6b223a4f06
X86 SSE: update rsqrtss and rcpss to use two source operands and
...
the first source operand is tied to the destination operand.
This is to accurately model the corresponding instructions where the upper
bits are unmodified.
rdar://12558838
PR14221
llvm-svn: 167064
2012-10-30 23:53:59 +00:00
Michael Liao
ad0b69fe3e
Fix PR14204
...
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.
llvm-svn: 166947
2012-10-29 17:57:12 +00:00
Michael Liao
c5af149e70
Add custom conversion from v2u32 to v2f32 in 32-bit mode
...
- As there are no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
v2f32 is added to improve the efficiency of the code generated.
llvm-svn: 166545
2012-10-24 04:09:32 +00:00
Michael Liao
1be96bb5ce
Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
...
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Michael Liao
e999b865dd
Add support for FP_ROUND from v2f64 to v2f32
...
- Due to the current matching vector elements constraints in
ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
v2f32) is scalarized. Add a customized v2f32 widening to convert it
into a target-specific X86ISD::VFPROUND to work around this
constraint.
llvm-svn: 165631
2012-10-10 16:53:28 +00:00
Craig Topper
3f23c1a8b9
Remove code for setting the VEX L-bit as a function of operand size from the code emitters and the disassembler table builder. Fix a couple instructions that were still missing VEX_L.
...
llvm-svn: 164204
2012-09-19 06:37:45 +00:00
Craig Topper
a73be890a1
Add explicit VEX_L tags to all 256-bit instructions. This will allow us to remove code from the code emitters that examined operands to set the L-bit.
...
llvm-svn: 164202
2012-09-19 06:06:34 +00:00
Nadav Rotem
37521aa89c
The PMOVZXWD family of instructions had patterns that extend narrow vector types to wide vector types.
...
It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast,
and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics.
rdar://11897677
llvm-svn: 163995
2012-09-16 07:39:07 +00:00
Michael Liao
400f7ef871
Enhance PR11334 fix to support extload from v2f32/v4f32
...
- Fix a remaining issue of PR11674 as well
llvm-svn: 163528
2012-09-10 18:33:51 +00:00
Craig Topper
4ed79bd7d7
Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.
...
llvm-svn: 163473
2012-09-08 17:42:27 +00:00
Craig Topper
f3e4aa8cdd
Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder.
...
llvm-svn: 163293
2012-09-06 06:09:01 +00:00
Craig Topper
daa5ed1e0a
Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
...
llvm-svn: 163292
2012-09-06 05:15:01 +00:00
Craig Topper
81f06df699
Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing.
...
llvm-svn: 163198
2012-09-05 07:26:35 +00:00
Craig Topper
f7c87d6eea
Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS.
...
llvm-svn: 163196
2012-09-05 06:58:39 +00:00
Craig Topper
2db2353b21
Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
...
llvm-svn: 163192
2012-09-05 05:48:09 +00:00
Craig Topper
d6cc4062be
Typos
...
llvm-svn: 163053
2012-09-01 06:33:50 +00:00
Michael Liao
969f3913dd
Clean up AddedComplexity further after adding UseSSEx
...
llvm-svn: 162973
2012-08-31 03:01:35 +00:00
Jim Grosbach
e423e865fe
X86: Fix encoding of 'movd %xmm0, %rax'
...
The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v'
prefix, resulting in mis-assembly of the vanilla movd instruction.
llvm-svn: 162963
2012-08-31 00:30:30 +00:00
Michael Liao
bbd10792c2
Introduce 'UseSSEx' to force SSE legacy encoding
...
- Add 'UseSSEx' to keep SSE legacy insns from being selected when AVX is
enabled.
Because of the penalty for inter-mixing SSE and AVX instructions, we need to
prevent SSE legacy insns from being generated unless they are explicitly
requested through some intrinsics. For patterns supported by both
SSE and AVX, we have so far forced the AVX insn to be tried first by relying
on AddedComplexity or position in the td file. That is error-prone and
introduces bugs accidentally.
'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
by AVX, we need this predicate to force VEX encoding or SSE legacy
encoding only.
For insns not inherited by AVX, we still use the previous predicates,
i.e. 'HasSSEx'. So far, these insns fall into the following
categories:
* SSE insns with MMX operands
* SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
CRC, etc.)
* SSE4A insns.
* MMX insns.
* x87 insns added by SSE.
2 test cases are modified:
- test/CodeGen/X86/fast-isel-x86-64.ll
AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
selected by fast-isel due to its complicated pattern, so fast-isel
falls back to materializing it from the constant pool.
- test/CodeGen/X86/widen_load-1.ll
AVX code generation is different from SSE one after fixing SSE/AVX
inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
'vmovaps'.
llvm-svn: 162919
2012-08-30 16:54:46 +00:00
Bill Wendling
cc56718038
The commutative flag is already correctly set within the multiclass. If we set
...
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>
llvm-svn: 162741
2012-08-28 07:36:46 +00:00
Craig Topper
72f51c3986
Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos.
...
llvm-svn: 162740
2012-08-28 07:30:47 +00:00
Craig Topper
bd509eea4a
Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
...
llvm-svn: 162738
2012-08-28 07:05:28 +00:00
Jakob Stoklund Olesen
89d6b29d16
More missing mayLoad flags on AVX multiclasses.
...
llvm-svn: 162714
2012-08-28 00:02:01 +00:00
Craig Topper
5af2fed5f2
Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads so shouldn't fold unaligned stores as it can cause an alignment fault to occur.
...
llvm-svn: 162658
2012-08-27 07:19:59 +00:00
Craig Topper
6d44554cd4
Fold some patterns into instruction definitions so tablegen can infer flags removing the need for an explicit 'neverHasSideEffects = 1'
...
llvm-svn: 162656
2012-08-27 07:04:50 +00:00
Craig Topper
f7828f91ee
Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity.
...
llvm-svn: 162654
2012-08-27 06:08:57 +00:00
Jakob Stoklund Olesen
3d91b43ad2
Add missing mayLoad flags to a large class of AVX *_Int instructions.
...
llvm-svn: 162622
2012-08-24 23:29:07 +00:00
Jakob Stoklund Olesen
d3511235d1
Remove some spurious mayLoad = 0 flags.
...
They were inserted to silence TableGen's warning about
redundant properties. That warning is now gone.
llvm-svn: 162517
2012-08-24 00:31:20 +00:00
Nadav Rotem
178250ad87
When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
...
this allows for better code generation.
Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and
FMINC, which are commutative.
For example:
movaps %xmm0, %xmm1
movsd LC(%rip), %xmm0
minsd %xmm1, %xmm0
becomes:
minsd LC(%rip), %xmm0
llvm-svn: 162187
2012-08-19 13:06:16 +00:00
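The reason r162187 needs unsafe math can be sketched in Python. The model below assumes the Intel-documented rule that minsd returns its first operand only when it compares strictly less than the second, and otherwise returns the second operand (including for NaN inputs and for two zeros). The name x86_min is hypothetical and is not an LLVM API.

```python
import math


def x86_min(dst: float, src: float) -> float:
    """Model of 'minsd src, dst': keep dst only if dst < src, else take src."""
    return dst if dst < src else src


nan = float("nan")

# Swapping the operands changes the result when one input is NaN ...
nan_first = x86_min(nan, 1.0)    # comparison is false, so src (1.0) is returned
nan_second = x86_min(1.0, nan)   # comparison is false, so src (NaN) is returned

# ... and the sign of the result changes when both inputs are zeros.
zero_a = x86_min(0.0, -0.0)      # returns src, which is -0.0
zero_b = x86_min(-0.0, 0.0)      # returns src, which is +0.0
```

Under this model the operation is commutative only when NaNs and signed zeros can be ignored, which is the condition under which the operands may be swapped to fold the load.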
Michael Liao
34107b9177
fix PR11334
...
- FP_EXTEND only supports extending from vectors with matching elements.
This results in the scalarization of extending to v2f64 from v2f32,
which will be legalized to v4f32, which does not match v2f64.
- add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover the original
extending from v4f32 to v2f64.
- test case is enhanced to include different vector widths.
llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Craig Topper
ab47fe4e16
Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
...
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Craig Topper
6d0408d3a5
Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern.
...
llvm-svn: 161306
2012-08-05 00:36:57 +00:00
Manman Ren
4059145396
X86: mark GATHER instructions as mayLoad
...
llvm-svn: 161143
2012-08-01 23:28:59 +00:00
Craig Topper
14eac5dda8
Give VCVTTPD2DQ priority over CVTTPD2DQ.
...
llvm-svn: 160942
2012-07-30 02:20:32 +00:00
Craig Topper
f881d385da
Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
...
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Craig Topper
415b3586d0
Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.
...
llvm-svn: 160939
2012-07-30 01:38:57 +00:00
Craig Topper
28402efcb6
Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns.
...
llvm-svn: 160938
2012-07-29 23:26:34 +00:00
Craig Topper
b6767f3acd
Move more SSE/AVX convert instruction patterns into their definitions.
...
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Craig Topper
fc93281c07
Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
...
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper
024797b9a2
Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
...
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Craig Topper
44f9b5343d
Make CVTSS2SI instruction definition consistent with CVTSD2SI.
...
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper
1c1aef07b8
Fix up memory load types for SSE scalar convert intrinsic patterns.
...
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Jakob Stoklund Olesen
77cd55b4ee
Remove the last mentions of sub_ss and sub_sd from patterns.
...
I'll remove these two sub-register indexes shortly.
llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen
b96d0b4e08
Eliminate sub_ss, sub_sd from broadcast patterns.
...
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).
llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen
206b825f5c
Eliminate more sub_ss / sub_sd patterns.
...
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.
llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen
75d17b0577
Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
...
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.
llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen
ceee4a9d0c
Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
...
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves. They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.
This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.
The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.
llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Craig Topper
c7690ac7ac
Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
...
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Nadav Rotem
4c12245b3a
The vbroadcast family of instructions has 'fallback patterns' in case where the
...
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replace 0 with 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.
Patch by Michael Kuperstein.
llvm-svn: 160430
2012-07-18 08:14:48 +00:00
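The immediate fix in r160430 can be checked with a small Python sketch. It assumes the usual pshufd encoding, in which each 2-bit field of the immediate, starting from the low bits, selects the source element for one destination element. The helper names are hypothetical.

```python
from typing import List, Sequence


def decode_shuffle_imm(imm: int) -> List[int]:
    """Split an 8-bit shuffle immediate into four 2-bit selectors, low field first."""
    return [(imm >> (2 * i)) & 0b11 for i in range(4)]


def shuffle_i32(src: Sequence[str], imm: int) -> List[str]:
    """Build the four destination elements chosen by the immediate."""
    return [src[sel] for sel in decode_shuffle_imm(imm)]


lanes = ["a", "b", "c", "d"]        # four i32 elements, element 0 first
fixed = shuffle_i32(lanes, 0x44)    # ['a', 'b', 'a', 'b']
buggy = shuffle_i32(lanes, 0x00)    # ['a', 'a', 'a', 'a']
```

With 0x44 the low two i32 elements, which together form the low i64, are copied into the upper half, giving a v2i64 broadcast. With 0 only the lowest i32 is replicated, so the upper half of each i64 is wrong.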
Craig Topper
01deb5f2df
Make the x86 asm parser check for xmm vs ymm for the index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
...
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Nadav Rotem
ee3552f88d
Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
...
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.
PR12782.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Craig Topper
b3bac4908e
Mark VINSERTI128rm as MayLoad=1. Fixes PR13348.
...
llvm-svn: 160162
2012-07-13 05:46:28 +00:00
Craig Topper
f7755df776
Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
...
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Craig Topper
be41e2daa6
Reverse assembler/disassembler operand order for gather instructions.
...
llvm-svn: 159983
2012-07-10 06:38:33 +00:00
Craig Topper
85c938f44f
Remove extra space.
...
llvm-svn: 159647
2012-07-03 06:48:58 +00:00
Craig Topper
f067f9aa51
Change i128mem/i256mem to f128mem/f256mem on some floating point vector instructions.
...
llvm-svn: 159646
2012-07-03 06:11:06 +00:00
Craig Topper
676dcd8c39
Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252.
...
llvm-svn: 159644
2012-07-03 05:49:45 +00:00
Elena Demikhovsky
9af899fa88
Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
...
llvm-svn: 159504
2012-07-01 06:12:26 +00:00
Manman Ren
98a5bf24a9
X86: add more GATHER intrinsics in LLVM
...
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
from 256-bit to 128-bit.
Support the following intrinsics:
llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256
llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Manman Ren
a09820414a
X86: add GATHER intrinsics (AVX2) in LLVM
...
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256
Modified Disassembler to handle VSIB addressing mode.
llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Craig Topper
94bf0f3855
Remove some duplicate instructions that exist only to give different mnemonics for the assembler. Use InstAlias instead.
...
llvm-svn: 159184
2012-06-26 04:12:49 +00:00
Craig Topper
357de815b4
Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there are no patterns in the instruction.
...
llvm-svn: 159127
2012-06-25 06:51:42 +00:00
Craig Topper
b6eb513c68
Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.
...
llvm-svn: 159126
2012-06-25 06:16:00 +00:00
Craig Topper
fd5e6e7db1
Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
...
llvm-svn: 159109
2012-06-24 07:07:16 +00:00
Craig Topper
b925230fb1
Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
...
llvm-svn: 159108
2012-06-24 06:55:37 +00:00
Craig Topper
f48ec7a708
Fix build failures from r159106.
...
llvm-svn: 159107
2012-06-24 06:08:31 +00:00
Craig Topper
bab2b89944
Remove intrinsic specific instructions for CVTPD2PS and replace with just patterns.
...
llvm-svn: 159106
2012-06-24 05:44:31 +00:00
Craig Topper
3cee08ce7d
Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns.
...
llvm-svn: 159105
2012-06-24 05:33:24 +00:00
Craig Topper
a899cc15f1
Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead.
...
llvm-svn: 159090
2012-06-23 22:33:14 +00:00
Craig Topper
7e9415220a
Make CVTDQ2PS instruction use SSE2 predicate instead of SSE1. No functional change because there are no patterns in the instructions. Also fix a typo in a comment.
...
llvm-svn: 159087
2012-06-23 20:52:45 +00:00
Craig Topper
24e3418215
Move CVTPD2DQ to use SSE2 predicate instead of SSE3. Move DQ2PD and PD2DQ to the SSE2 section of the file.
...
llvm-svn: 159086
2012-06-23 20:15:42 +00:00
Craig Topper
8c03ea79c4
Use correct memory types for (V)CVTDQ2PD instructions.
...
llvm-svn: 159075
2012-06-23 08:30:27 +00:00
Craig Topper
431f1e7192
Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits.
...
llvm-svn: 159070
2012-06-23 04:23:36 +00:00
Craig Topper
21d04fc118
Add predicate check around some patterns.
...
llvm-svn: 158797
2012-06-20 07:30:23 +00:00
Craig Topper
3b662a6279
Add predicate check around some patterns.
...
llvm-svn: 158795
2012-06-20 07:01:11 +00:00
Kay Tiong Khoo
390edb0d91
*no need to pollute Intel syntax with bonus mnemonics; operand size is explicitly specified
...
llvm-svn: 158603
2012-06-16 17:19:49 +00:00
Craig Topper
bf2409e8aa
Mark several instructions SSE2 instead of SSE3 as they should be.
...
llvm-svn: 158049
2012-06-06 06:45:27 +00:00
Benjamin Kramer
a0396e4583
X86: Rename the CLMUL target feature to PCLMUL.
...
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.
llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Craig Topper
c1ac05dad5
Add intrinsic for pclmulqdq instruction.
...
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Benjamin Kramer
ef479ea854
Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
...
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.
llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Craig Topper
7daf897678
Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
...
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Craig Topper
dbb98b4917
Fix some issues in the f16c instructions.
...
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Craig Topper
d4e1894ec1
Add SSE4A MOVNTSS/MOVNTSD instructions.
...
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Nadav Rotem
810734b7f4
AVX: Add additional vbroadcast replacement sequences for integers.
...
Remove the v2f64 patterns because it does not match any vbroadcast
instruction.
llvm-svn: 155461
2012-04-24 18:09:59 +00:00
Nadav Rotem
aa3ff8da00
AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
...
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).
llvm-svn: 155437
2012-04-24 11:07:03 +00:00
Elena Demikhovsky
8d7e56c409
ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
...
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Craig Topper
4badeb3f0d
Replace vpermd/vpermps intrinsic patterns with custom lowering to target specific nodes.
...
llvm-svn: 154801
2012-04-16 07:13:00 +00:00
Craig Topper
c0075aa7ff
Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.
...
llvm-svn: 154797
2012-04-16 06:26:15 +00:00
Craig Topper
b86fa404d3
Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
...
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper
bfc9a5f7d3
Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
...
llvm-svn: 154778
2012-04-15 22:43:31 +00:00
Nadav Rotem
42bcd04ee3
Fix PR12529. The Vxx family of instructions are only supported by AVX.
...
Use non-vex instructions for SSE4.
llvm-svn: 154770
2012-04-15 19:36:44 +00:00
Elena Demikhovsky
779a72b49e
Added VPERM optimization for AVX2 shuffles
...
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
Craig Topper
d0271b27cb
Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
...
llvm-svn: 154580
2012-04-12 07:23:00 +00:00
Nadav Rotem
9bc178ac5c
Reapply 154396 after fixing a test.
...
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
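As a rough illustration of the distinction drawn above, the following Python sketch models the two blend styles on plain lists. The function names are hypothetical and are not part of LLVM; lanes of the selection register are modelled as signed integers. An immediate blend encodes the per-lane choice as bits of a constant, while a variable blend reads the choice from the sign bit of each lane of a register.

```python
def shuffle_mask_to_blend_imm(mask):
    """Turn a blend-style shuffle mask into an immediate.

    mask[i] must be either i (take lane i of the first source) or
    i + n (take lane i of the second source), where n = len(mask).
    Bit i of the result is set when lane i comes from the second source.
    """
    n = len(mask)
    imm = 0
    for i, m in enumerate(mask):
        if m == i + n:
            imm |= 1 << i
        elif m != i:
            raise ValueError("not a blend: lane %d reads index %d" % (i, m))
    return imm


def blend_imm(a, b, imm):
    """Immediate blend: bit i of imm picks b[i], otherwise a[i]."""
    return [b[i] if (imm >> i) & 1 else a[i] for i in range(len(a))]


def blend_var(a, b, sel):
    """Variable blend: a negative sel[i] (sign bit set) picks b[i]."""
    return [b[i] if sel[i] < 0 else a[i] for i in range(len(a))]
```

Because a shuffle mask is known at compile time, the selection can always be folded into the immediate form, which avoids materializing a selection register.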
llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Eric Christopher
65ada95b84
Temporarily revert this patch to see if it brings the buildbots back.
...
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Nadav Rotem
f934f91709
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
...
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154396
2012-04-10 14:33:13 +00:00
Craig Topper
d024cef233
Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. The same was already done for avx1.
...
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper
aa9aab5ad2
Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
...
llvm-svn: 154268
2012-04-07 21:57:43 +00:00
Craig Topper
7629d63bc4
Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
...
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Chad Rosier
4106917355
[avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmovdqu to
...
vextractf128 with 128-bit mem dest.
Combines
vextractf128 $0, %ymm0, %xmm0
vmovaps %xmm0, (%rdi)
to
vextractf128 $0, %ymm0, (%rdi)
rdar://11082570
llvm-svn: 153139
2012-03-20 21:43:40 +00:00
Chad Rosier
0158ae2e5b
[avx] Add the AddedComplexity to the VINSERTI128 avx2 patterns to give
...
precedence over the VINSERTF128 avx1 patterns.
llvm-svn: 153114
2012-03-20 19:45:07 +00:00
Chad Rosier
93d5427c69
Whitespace.
...
llvm-svn: 153105
2012-03-20 18:38:33 +00:00
Chad Rosier
5a6011267a
[avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove
...
whitespace from test case. No functional change intended.
llvm-svn: 153103
2012-03-20 18:24:55 +00:00
Chad Rosier
07a4cb9382
[avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads.
...
This results in things such as
vmovups 16(%rdi), %xmm0
vinsertf128 $1, %xmm0, %ymm0, %ymm0
to be combined to
vinsertf128 $1, 16(%rdi), %ymm0, %ymm0
rdar://11076953
llvm-svn: 153092
2012-03-20 17:08:51 +00:00
Chad Rosier
b9b73170e3
[avx] Add patterns for VINSERTF128rm.
...
This results in things such as
vmovaps -96(%rbx), %xmm1
vinsertf128 $1, %xmm1, %ymm0, %ymm0
to be combined to
vinsertf128 $1, -96(%rbx), %ymm0, %ymm0
rdar://10643481
llvm-svn: 152762
2012-03-15 00:45:30 +00:00
Kay Tiong Khoo
57c8e7f364
*fix typo in comment; test of commit access
...
llvm-svn: 152507
2012-03-10 21:29:49 +00:00
Chad Rosier
a281afc676
Fix a regression from r147481.
...
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
than a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Preston Gurd
a49ef92a76
This patch adds instruction latencies for the SSE instructions
...
to the instruction scheduler for the Intel Atom.
llvm-svn: 151590
2012-02-27 23:35:03 +00:00
Pete Cooper
682c76b7d4
Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
...
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Jia Liu
e1d619691b
some comment fix for X86 and ARM
...
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Jia Liu
b22310fda6
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
...
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper
ba172d2d59
Remove the last of the old vector_shuffle patterns from X86 isel.
...
llvm-svn: 150795
2012-02-17 07:02:34 +00:00
Craig Topper
cfad98f745
Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel.
...
llvm-svn: 150462
2012-02-14 08:14:53 +00:00
Craig Topper
8b19d78808
Still more vector_shuffle pattern removal.
...
llvm-svn: 150365
2012-02-13 07:23:41 +00:00
Craig Topper
74650add0e
Remove more vector_shuffle patterns for unpack. These should be target specific nodes when they get to isel.
...
llvm-svn: 150363
2012-02-13 05:48:49 +00:00
Craig Topper
6d471c9e49
Recommit r150328. Previous test failures should be fixed by r150360.
...
llvm-svn: 150362
2012-02-13 05:10:10 +00:00
NAKAMURA Takumi
0826c17d00
Revert r150328, "Remove more vector_shuffle patterns."
...
It caused 3 failures on pre-penryn and non-x86(generic) hosts.
llvm-svn: 150357
2012-02-13 00:10:15 +00:00
Craig Topper
e24c94af81
Remove more vector_shuffle patterns.
...
llvm-svn: 150328
2012-02-12 08:14:35 +00:00
Craig Topper
d40d9eb2b3
Remove more vector_shuffle patterns.
...
llvm-svn: 150321
2012-02-12 01:07:34 +00:00
Craig Topper
330ca97700
Remove more vector_shuffle patterns.
...
llvm-svn: 150314
2012-02-11 23:31:01 +00:00
Craig Topper
981c6cf7b3
Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel.
...
llvm-svn: 150299
2012-02-11 07:43:35 +00:00
Craig Topper
172b9243cd
Remove a couple unneeded intrinsic patterns
...
llvm-svn: 150067
2012-02-08 08:29:30 +00:00
Craig Topper
5405571fe0
Remove GCC builtins for vpermilp* intrinsics as clang no longer needs them. Custom lower the intrinsics to the vpermilp target specific node and remove intrinsic patterns.
...
llvm-svn: 150060
2012-02-08 06:36:57 +00:00
Craig Topper
b27fd77c3f
Add instruction selection for 256-bit VPSHUFD and 128-bit VPERMILPS/VPERMILPD.
...
llvm-svn: 149968
2012-02-07 06:28:42 +00:00
Craig Topper
1d471e31ba
Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.
...
llvm-svn: 149807
2012-02-05 03:14:49 +00:00
Elena Demikhovsky
fb44980b41
Optimization for SIGN_EXTEND operation on AVX.
...
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.
llvm-svn: 149600
2012-02-02 09:10:43 +00:00
Andrew Trick
8523b16ff5
Instruction scheduling itinerary for Intel Atom.
...
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
llvm-svn: 149558
2012-02-01 23:20:51 +00:00
Craig Topper
516cba3380
Fix pattern for memory form of PSHUFD for use with FP vectors to remove bitcast to an integer vector that normal code wouldn't have. Also remove bitcasts from code that turns splat vector loads into a shuffle as it was making the broken pattern necessary.
...
llvm-svn: 149232
2012-01-30 07:50:31 +00:00
Craig Topper
5639e9e8fb
Move some patterns back near their instructions and use AddedComplexity to fix priority. Merge some patterns into their instruction definition.
...
llvm-svn: 149122
2012-01-27 07:09:40 +00:00
Victor Umansky
5f29b0e57b
Fix for the following bug in AVX codegen for double-to-int conversions:
...
- "fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
- Currently, in AVX mode, the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions are selected for <4xdouble> and <8xdouble> (for the "fp_to_sint" DAG node operation) by the AVX codegen. However, they use round-to-nearest-even rounding mode.
- Consequently, the conversion produces incorrect numbers.
The fix is to replace selection of VCVTPD2DQ instructions with VCVTTPD2DQ instructions. The latter use truncating (i.e. round-to-zero) rounding mode.
As the "fp_to_sint" DAG node operation is used only for lowering of "fptosi" and "fptoui" IR instructions, the fix in the X86InstrSSE.td definition file doesn't have an impact on other LLVM flows.
The patch includes changes in the .td file, a LIT test for the changes, and a fix in a legacy LIT test (which produced asm code conflicting with the LLVM IR spec).
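To make the difference between the two rounding modes concrete, here is a small Python sketch. It is only a scalar model under the assumption that the non-truncating conversion runs in the default round-to-nearest-even mode; the helper names are made up for illustration and do not come from the patch.

```python
import math


def convert_nearest_even(x):
    """Round to the nearest integer, ties to even.

    Python's built-in round() with one argument uses this rule.
    """
    return round(x)


def convert_truncate(x):
    """Round toward zero, which is what fptosi requires."""
    return math.trunc(x)


# Values where the two modes disagree, plus one tie case.
samples = [1.7, -1.7, 2.5, 3.5]
for value in samples:
    print(value, convert_nearest_even(value), convert_truncate(value))
```

Any input whose fractional part is above one half (in magnitude) shows the mismatch, so such values make good test inputs for a conversion like this.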
llvm-svn: 149056
2012-01-26 08:51:39 +00:00
Craig Topper
1c0e22f57a
Fix AVX vs SSE patterns ordering issue for VPCMPESTRM and VPCMPISTRM.
...
llvm-svn: 149053
2012-01-26 07:31:30 +00:00
Craig Topper
b91760eff8
Remove some more patterns by custom lowering intrinsics to target specific nodes.
...
llvm-svn: 149052
2012-01-26 07:18:03 +00:00
Craig Topper
7834900950
Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.
...
llvm-svn: 148933
2012-01-25 06:43:11 +00:00
Craig Topper
ce4f9c5668
Custom lower phadd and phsub intrinsics to target specific nodes. Remove the patterns that are no longer necessary.
...
llvm-svn: 148927
2012-01-25 05:37:32 +00:00
Craig Topper
5bcf070e68
Remove AVX 256-bit unaligned load intrinsics. 128-bit versions had been removed a while ago.
...
llvm-svn: 148922
2012-01-25 04:42:03 +00:00
Craig Topper
3ad5bc019a
Merge intrinsic pattern and no pattern versions of VCVTSD2SI instruction definitions. Matches non-AVX version of same instructions.
...
llvm-svn: 148914
2012-01-25 03:52:09 +00:00
Craig Topper
edd1d0acfc
Custom lower PCMPEQ/PCMPGT intrinsics to target specific nodes and remove the intrinsic patterns.
...
llvm-svn: 148687
2012-01-23 08:18:28 +00:00
Craig Topper
5e80db4e4f
Custom lower vector shift intrinsics to target specific nodes and remove the patterns that are no longer needed.
...
llvm-svn: 148684
2012-01-23 06:16:53 +00:00
Craig Topper
20c98df340
Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments.
...
llvm-svn: 148672
2012-01-23 00:06:44 +00:00
Craig Topper
0b7ad76bd0
Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching.
...
llvm-svn: 148670
2012-01-22 23:36:02 +00:00
Craig Topper
bd4884371b
Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching.
...
llvm-svn: 148667
2012-01-22 22:42:16 +00:00
Craig Topper
094626414d
Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.
...
llvm-svn: 148664
2012-01-22 19:15:14 +00:00
Craig Topper
123adfa0f3
Move some vector shift patterns into their instruction definitions.
...
llvm-svn: 148643
2012-01-22 00:41:20 +00:00
Craig Topper
dcaa5fbd08
Add memory patterns for some of the fp<->integer conversion instructions. Fold some patterns into instruction definitions.
...
llvm-svn: 148641
2012-01-21 18:37:15 +00:00
Craig Topper
3469212c82
Add support for selecting 256-bit PALIGNR.
...
llvm-svn: 148532
2012-01-20 05:53:00 +00:00
Craig Topper
db8890aedd
Give priority to AVX over SSE for 128-bit floating point unpck instructions.
...
llvm-svn: 148233
2012-01-16 09:56:42 +00:00
Craig Topper
c10e1abaf3
Fix the memop type on a couple 256-bit AVX instructions that were using f128mem instead of f256mem.
...
llvm-svn: 148196
2012-01-14 18:29:57 +00:00
Chad Rosier
71a185c5c6
Fix pasto from r146196.
...
llvm-svn: 148167
2012-01-14 01:50:21 +00:00
Craig Topper
e52d86a740
Convert SHUFPD with the same register for both sources to PSHUFD if it would prevent a register copy. Similar to SHUFPS, but requires the mask to be converted.
...
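The mask conversion mentioned in the subject can be sketched as follows. This is a Python model written for illustration, not the code from the commit, and all function names are hypothetical. SHUFPD picks one 64-bit element per result half using one immediate bit each; PSHUFD picks one 32-bit element per result lane using two immediate bits each, so every 64-bit choice expands into a pair of adjacent 32-bit choices.

```python
def shufpd(src1, src2, imm):
    """Model SHUFPD on two 2 x 64-bit vectors (element 0 is the low half)."""
    return [src1[imm & 1], src2[(imm >> 1) & 1]]


def pshufd(src, imm):
    """Model PSHUFD on a 4 x 32-bit vector: 2 bits of imm per result lane."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]


def shufpd_imm_to_pshufd_imm(imm):
    """Convert a SHUFPD immediate to PSHUFD, assuming both sources are equal."""
    lo = imm & 1
    hi = (imm >> 1) & 1
    # Each 64-bit element k covers 32-bit lanes 2k and 2k + 1.
    lanes = [2 * lo, 2 * lo + 1, 2 * hi, 2 * hi + 1]
    return sum(lane << (2 * i) for i, lane in enumerate(lanes))


def as_dwords(qwords):
    """Split each 64-bit element into its low and high 32-bit halves."""
    out = []
    for q in qwords:
        out += [q & 0xFFFFFFFF, q >> 32]
    return out
```

For example, the SHUFPD immediate 1 (swap the two halves of one register) maps to the PSHUFD immediate 0x4E.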
llvm-svn: 148112
2012-01-13 09:21:41 +00:00
Craig Topper
cb7e13d7c0
Make X86 instruction selection use 256-bit VPXOR for build_vector of all ones if AVX2 is enabled. This gives the ExeDepsFix pass a chance to choose FP vs int as appropriate. Also use v8i32 as the type for getZeroVector if AVX2 is enabled. This is consistent with SSE2 preferring v4i32.
...
llvm-svn: 148108
2012-01-13 08:12:35 +00:00
Craig Topper
9f14d9f939
Add patterns for v16i16 and v32i8 immAllZerosV to select VPXOR to match v4i64 and v8i32.
...
llvm-svn: 148106
2012-01-13 06:59:47 +00:00
Chad Rosier
1a8f0ccd8c
Add missing VEX predicates to VMOVSDto64rr/VMOVSDto64mr. This fixes a few
...
failing test cases on our internal AVX nightly tester.
rdar://10663637
llvm-svn: 147881
2012-01-10 22:14:06 +00:00
Craig Topper
eb8f9e9e5b
Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.
...
llvm-svn: 147841
2012-01-10 06:30:56 +00:00
Craig Topper
b89805c77d
Add HasAVX predicate to some of the AVX patterns.
...
llvm-svn: 147769
2012-01-09 08:34:00 +00:00
Craig Topper
a51f7f75c2
Reorder a bunch of patterns to put the AVX version first thus giving it priority over the SSE version. Another step towards trying to remove the AVX hack that disables SSE from X86Subtarget.
...
llvm-svn: 147768
2012-01-09 08:10:38 +00:00
Craig Topper
ef7f5bf8c9
Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
...
llvm-svn: 147767
2012-01-09 06:52:46 +00:00
Craig Topper
c1f5622ad3
Mark MOVNTI as being supported in SSE2 OR AVX mode. This instruction has no AVX equivalent so we should use the SSE version.
...
llvm-svn: 147766
2012-01-09 06:38:55 +00:00
Craig Topper
a081644f8a
Move SSE2 logical operations PAND/POR/PXOR/PANDN above SSE1 logical operations ANDPS/ORPS/XORPS/ANDNPS. This fixes a pattern ordering issue that meant that the SSE2 instructions could never be directly selected since the SSE1 patterns would always match first. This is largely moot with the ExeDepsFix pass, but I'm trying to audit for all such ordering issues.
...
llvm-svn: 147765
2012-01-09 05:07:01 +00:00
Chad Rosier
493c1b3152
Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather
...
than a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409
llvm-svn: 147481
2012-01-03 21:05:52 +00:00
Craig Topper
53d559641f
Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection.
...
llvm-svn: 147428
2012-01-02 08:46:48 +00:00
Craig Topper
1c064e0a89
Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.
...
llvm-svn: 147409
2012-01-01 19:40:22 +00:00
Craig Topper
6e54ba7eee
Merge X86 SHUFPS and SHUFPD node types.
...
llvm-svn: 147394
2011-12-31 23:50:21 +00:00
Craig Topper
d51092d93a
Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load.
...
llvm-svn: 147393
2011-12-31 23:24:49 +00:00
Craig Topper
0e796fee11
Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.
...
llvm-svn: 147392
2011-12-31 23:15:11 +00:00
Craig Topper
9e61291bf5
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
...
llvm-svn: 147342
2011-12-29 17:41:56 +00:00
Chad Rosier
3172488cc0
Fix 80-column violations.
...
llvm-svn: 147095
2011-12-21 20:59:09 +00:00
Elena Demikhovsky
ec7e6e0946
This is the second fix related to VZEXT_MOVL node.
...
The failure that I see in the current version is:
LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14]
0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13]
0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12]
0x18b9870: v4i64 = undef [ID=4]
0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10]
0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9970: i32 = Constant<0> [ID=3]
0x18b9170: v2i64 = undef [ORD=1] [ID=1]
0x18b9570: i32 = Constant<2> [ID=5]
llvm-svn: 146975
2011-12-20 13:34:28 +00:00
Eli Friedman
64944090ff
Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586.
...
llvm-svn: 146709
2011-12-15 23:46:18 +00:00
Chad Rosier
41dbf59e12
Add missing zmovl AVX patterns which were causing crashes.
...
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!
llvm-svn: 146689
2011-12-15 22:11:31 +00:00
Benjamin Kramer
16bbfbec66
X86: Add patterns for the various rounding ops for SSE4.1 and AVX.
...
llvm-svn: 146257
2011-12-09 15:44:03 +00:00
Benjamin Kramer
2dc5dec41d
X86: Split (v)rounds[sd] into a normal and an intrinsic version.
...
llvm-svn: 146256
2011-12-09 15:43:55 +00:00
Evan Cheng
b96bca81e7
Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417
...
llvm-svn: 146196
2011-12-08 22:30:45 +00:00
Evan Cheng
2a217be25f
Add various missing AVX patterns which were causing crashes. Sadly, the generated
...
code looks pretty bad compared to SSE.
rdar://10538793
llvm-svn: 146191
2011-12-08 22:05:28 +00:00
Evan Cheng
4d1a2d449f
Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp
...
if (HasAVX)
X86SSELevel = NoMMXSSE;
This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected.
However, this breaks instructions which do not have AVX variants.
The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.
However, we need to audit all the patterns before we make the change. This patch is a workaround that fixes one specific case,
the prefetch instructions. rdar://10538297
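The trade-off described above can be modelled in a few lines of Python. This is only a toy sketch: the level constants, function names, and the choice of SSE1 as the minimum level for the prefetch example are assumptions made for illustration and do not mirror the actual X86Subtarget code.

```python
NO_SSE, SSE1, SSE2, SSE3 = range(4)


def hacked_sse_level(sse_level, has_avx):
    """The workaround: report no SSE at all whenever AVX is enabled."""
    return NO_SSE if has_avx else sse_level


def enabled_with_hack(min_sse, sse_level, has_avx):
    """Pattern predicate as seen through the hacked SSE level."""
    return hacked_sse_level(sse_level, has_avx) >= min_sse


def enabled_with_predicate(min_sse, has_avx_variant, sse_level, has_avx):
    """Proposed scheme: only patterns with an AVX twin exclude AVX."""
    if has_avx_variant:
        return sse_level >= min_sse and not has_avx
    return sse_level >= min_sse
```

With the hack, a pattern that has no AVX variant becomes unselectable on an AVX machine; with the explicit predicate it stays selectable, while SSE patterns that do have an AVX twin are still suppressed.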
llvm-svn: 146163
2011-12-08 19:00:42 +00:00
Craig Topper
1d578e8835
Fix a bunch of SSE/AVX patterns to use proper memop types. In particular, not using integer loads other than v2i64/v4i64 since the others are all promoted.
...
llvm-svn: 146031
2011-12-07 08:30:53 +00:00
Craig Topper
6572e0f203
Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those.
...
llvm-svn: 145927
2011-12-06 09:04:59 +00:00
Craig Topper
8d4ba198d6
Merge floating point and integer UNPCK X86ISD node types.
...
llvm-svn: 145926
2011-12-06 08:21:25 +00:00
Craig Topper
0a672eaf9e
Merge VPERM2F128/VPERM2I128 ISD node types.
...
llvm-svn: 145485
2011-11-30 07:47:51 +00:00
Craig Topper
bafd224c8b
Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
...
llvm-svn: 145483
2011-11-30 06:25:25 +00:00
Evan Cheng
648e48d02e
Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins.
...
llvm-svn: 145448
2011-11-29 22:48:34 +00:00
Jakob Stoklund Olesen
bde32d36bb
Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.
...
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.
This also makes the AVX variants redundant.
llvm-svn: 145440
2011-11-29 22:27:25 +00:00
Elena Demikhovsky
7a81dea516
Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
...
Added a test.
Thanks Bruno for reviewing the patch.
llvm-svn: 145403
2011-11-29 15:00:45 +00:00
Craig Topper
c16db840be
Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD.
...
llvm-svn: 145390
2011-11-29 07:49:05 +00:00
Craig Topper
12b72def4e
Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
...
llvm-svn: 145376
2011-11-29 05:37:58 +00:00
Craig Topper
897a7d4b9c
Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2.
...
llvm-svn: 145370
2011-11-29 03:57:34 +00:00
Evan Cheng
aa93ceb164
Add missing avx pattern.
...
llvm-svn: 145272
2011-11-28 20:27:23 +00:00
Craig Topper
818a983e93
Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
...
llvm-svn: 145238
2011-11-28 10:14:51 +00:00
Craig Topper
51280d565b
Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonicalization that occurs when shuffle nodes are created.
...
llvm-svn: 145153
2011-11-26 22:55:48 +00:00
Craig Topper
7704bd7ac3
Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
...
llvm-svn: 145148
2011-11-26 20:47:44 +00:00
Craig Topper
d65a444478
Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
...
llvm-svn: 145126
2011-11-24 22:57:10 +00:00
Craig Topper
d26466748b
Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
...
llvm-svn: 145125
2011-11-24 22:20:08 +00:00