Chris Lattner
f4b0c99d63
add some missing flags.
...
llvm-svn: 45859
2008-01-11 06:59:07 +00:00
Chris Lattner
317332fc2a
Start inferring side effect information more aggressively, and fix many bugs in the
...
x86 backend where instructions were not marked maystore/mayload, and perf issues where
instructions were not marked neverHasSideEffects. It would be really nice if we could
write patterns for copy instructions.
I have audited all the x86 instructions down to MOVDQAmr. The flags on others and on
other targets are probably not right in all cases, but no clients currently use this
info that are enabled by default.
llvm-svn: 45829
2008-01-10 07:59:24 +00:00
Chris Lattner
aca7ca3730
remove explicit sets of 'neverHasSideEffects' that can now be
...
inferred from the instr patterns.
llvm-svn: 45824
2008-01-10 05:45:39 +00:00
Chris Lattner
a4ce4f6987
rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate.
...
llvm-svn: 45667
2008-01-06 23:38:27 +00:00
Chris Lattner
f3ebc3f3d2
Remove attribution from file headers, per discussion on llvmdev.
...
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Evan Cheng
01c7c198ee
Fix JIT encoding for CMPSD as well.
...
llvm-svn: 45268
2007-12-20 19:57:09 +00:00
Bill Wendling
b3d85a5d4b
Add "mayHaveSideEffects" and "neverHasSideEffects" flags to some instructions. I
...
based what flag to set on whether it was already marked as
"isRematerializable". If there was a further check to determine if it's "really"
rematerializable, then I marked it as "mayHaveSideEffects" and created a check
in the X86 back-end similar to the remat one.
llvm-svn: 45132
2007-12-17 23:07:56 +00:00
Chris Lattner
dab6bd902e
Fix the JIT encoding of cmp*ss, which aborts with this assertion currently:
...
X86CodeEmitter.cpp:378: failed assertion `0 && "Immediate size not set!"'
I *think* this is right, but Evan, please verify. It also looks like
CMPSDrr and maybe others are missing this info. Evan, plz investigate.
llvm-svn: 45074
2007-12-16 20:12:41 +00:00
Evan Cheng
23d2d4dc6c
Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs.
...
llvm-svn: 45058
2007-12-15 03:00:47 +00:00
Evan Cheng
6e68381e02
Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled.
...
llvm-svn: 44960
2007-12-12 23:12:09 +00:00
Evan Cheng
c829e5cdf0
Remove a bogus optimization. It's not possible to do a move to low element to a <8 x i16> or <16 x i8> vector.
...
llvm-svn: 44669
2007-12-06 22:14:22 +00:00
Chris Lattner
5728bdd4db
Fix a long standing deficiency in the X86 backend: we would
...
sometimes emit "zero" and "all one" vectors multiple times,
for example:
_test2:
pcmpeqd %mm0, %mm0
movq %mm0, _M1
pcmpeqd %mm0, %mm0
movq %mm0, _M2
ret
instead of:
_test2:
pcmpeqd %mm0, %mm0
movq %mm0, _M1
movq %mm0, _M2
ret
This patch fixes this by always arranging for zero/one vectors
to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be
any random type. This ensures they get trivially CSE'd on the dag.
This fix is also important for LegalizeDAGTypes, as it gets unhappy
when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when
'i64' isn't legal.
This patch makes the following changes:
1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into
their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or
immAllZerosV in the wrong form now use *_bc to match them with a
bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle
bitcast'd zero vectors, which simplifies the code actually.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that
is legal, instead of generating one that is illegal and expecting
a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) several other minor tweaks.
This patch is definite goodness, but has the potential to cause random
code quality regressions. Please be on the lookout for these and let
me know if they happen.
llvm-svn: 44310
2007-11-25 00:24:49 +00:00
Nate Begeman
d4d45c268c
Add support for vectors to int <-> float casts.
...
llvm-svn: 44204
2007-11-17 03:58:34 +00:00
Dale Johannesen
d50c8bcef6
Add missing SSE builtins: CVTPD2PI, CVTPS2PI,
...
CVTTPD2PI, CVTTPS2PI, CVTPI2PD, CVTPI2PS.
llvm-svn: 43523
2007-10-30 22:15:38 +00:00
Arnold Schwaighofer
1f0da1fefb
Corrected many typing errors. And removed 'nest' parameter handling
...
for fastcc from X86CallingConv.td. This means that nested functions
are not supported for calling convention 'fastcc'.
llvm-svn: 42934
2007-10-12 21:30:57 +00:00
Dale Johannesen
62f65edc32
Add missing argument to PALIGNR
...
llvm-svn: 42874
2007-10-11 20:58:37 +00:00
Evan Cheng
f4b5d491df
Added DAG xforms. e.g.
...
(vextract (v4f32 s2v (f32 load $addr)), 0) -> (f32 load $addr)
(vextract (v4i32 bc (v4f32 s2v (f32 load $addr))), 0) -> (i32 load $addr)
Remove x86 specific patterns.
llvm-svn: 42677
2007-10-06 02:46:29 +00:00
Evan Cheng
a1b7e95039
Typo. X86comi doesn't read / write chain's.
...
llvm-svn: 42492
2007-10-01 18:12:48 +00:00
Evan Cheng
5fb5a1f389
Enabling new condition code modeling scheme.
...
llvm-svn: 42459
2007-09-29 00:00:36 +00:00
Evan Cheng
e95f391ef1
Added support for new condition code modeling scheme (i.e. physical register dependency). These are a bunch of instructions that are duplicated so the x86 backend can support both the old and new schemes at the same time. They will be deleted after
...
all the kinks are worked out.
llvm-svn: 42285
2007-09-25 01:57:46 +00:00
Dale Johannesen
e36c400255
Fix PR 1681. When X86 target uses +sse -sse2,
...
keep f32 in SSE registers and f64 in x87. This
is effectively a new codegen mode.
Change addLegalFPImmediate to permit float and
double variants to do different things.
Adjust callers.
llvm-svn: 42246
2007-09-23 14:52:20 +00:00
Evan Cheng
483e1ce16e
Add implicit def of EFLAGS on those instructions that may modify flags.
...
llvm-svn: 41962
2007-09-14 21:48:26 +00:00
Evan Cheng
3e18e504ae
Remove (somewhat confusing) Imp<> helper, use let Defs = [], Uses = [] instead.
...
llvm-svn: 41863
2007-09-11 19:55:27 +00:00
Dan Gohman
a95cbb0007
Avoid storing and reloading zeros and other constants from stack slots
...
by flagging the associated instructions as being trivially rematerializable.
llvm-svn: 41775
2007-09-07 21:32:51 +00:00
Evan Cheng
c2081fe573
Mark load instructions with isLoad = 1.
...
llvm-svn: 41595
2007-08-30 05:49:43 +00:00
Bill Wendling
cdbd82ee37
64-bit SSSE3 ops that use MMX registers don't require 16-byte alignment.
...
Make a 'memop' pattern just for them.
llvm-svn: 41017
2007-08-11 09:52:53 +00:00
Bill Wendling
7014615087
For kicks, I though it would be fun to use the correct opcode.
...
llvm-svn: 40985
2007-08-10 09:00:17 +00:00
Bill Wendling
2377206923
Adding SSSE3 intrinsics.
...
llvm-svn: 40982
2007-08-10 06:22:27 +00:00
Dan Gohman
8932bff7fe
Fix the alignment requirements of several unpck and shuf instructions.
...
Generalize isPSHUFDMask and add a unary SHUFPD pattern so that SHUFPD's
memory operand alignment can be tested as well, with a fix to avoid
breaking MMX's use of isPSHUFDMask.
llvm-svn: 40756
2007-08-02 21:17:01 +00:00
Dan Gohman
4d436e2b7d
Fix pastos in vector arithmetic intrinsics.
...
llvm-svn: 40754
2007-08-02 21:06:40 +00:00
Dan Gohman
fa3eeeedc0
Mark the SSE and MMX load instructions that
...
X86InstrInfo::isReallyTriviallyReMaterializable knows how to handle
with the isReMaterializable flag so that it is given a chance to handle
them. Without hoisting constant-pool loads from loops this isn't very
visible, though it does keep CodeGen/X86/constant-pool-remat-0.ll from
making a copy of the constant pool on the stack.
llvm-svn: 40736
2007-08-02 14:27:55 +00:00
Evan Cheng
da549ece5c
Missing Requires.
...
llvm-svn: 40691
2007-08-01 21:42:24 +00:00
Dan Gohman
54ec4bfa5f
Change the x86 assembly output to use tab characters to separate the
...
mnemonics from their operands instead of single spaces. This makes the
assembly output a little more consistent with various other compilers
(f.e. GCC), and slightly easier to read. Also, update the regression
tests accordingly.
llvm-svn: 40648
2007-07-31 20:11:57 +00:00
Evan Cheng
12c6be84ff
Redo and generalize previously removed opt for pinsrw: (vextract (v4i32 bc (v4f32 s2v (f32 load ))), 0) -> (i32 load )
...
llvm-svn: 40628
2007-07-31 08:04:03 +00:00
Dan Gohman
4788552deb
Re-apply 40504, but with a fix for the segfault it caused in oggenc:
...
Make the alignedload and alignedstore patterns always require 16-byte
alignment. This way when they are used in the "Fs" instructions, in which
a vector instruction is used for a scalar purpose, they can still require
the full vector alignment. And add a regression test for this.
llvm-svn: 40555
2007-07-27 17:16:43 +00:00
Evan Cheng
931de40afa
Reverting 40504 for now. It's breaking oggenc.
...
llvm-svn: 40547
2007-07-27 01:37:47 +00:00
Dan Gohman
cecd4b3793
Fix a whitespace difference between CMPSSrr and CMPSDrr.
...
llvm-svn: 40528
2007-07-26 15:11:50 +00:00
Dan Gohman
8455bd3fae
Remove X86ISD::LOAD_PACK and X86ISD::LOAD_UA and associated code from the
...
x86 target, replacing them with the new alignment attributes on memory
references.
llvm-svn: 40504
2007-07-26 00:31:09 +00:00
Evan Cheng
8fefeffb37
Because we promote SSE logical ops and loads to v2i64, we often end up generate
...
code that cross integer / floating point domains (e.g. generate pxor / pand for
logical ops on floating point value, movdqa to load / store floating point SSE
values). Given that, it's better to use movaps instead of movdqa and movups
instead of movdqu. They have the same latency but the "aps" variants are one
byte shorter.
If the domain crossing problem is a real performance issue, then we will have to
fix it with dynamic programming based isel.
llvm-svn: 40076
2007-07-20 00:27:43 +00:00
Evan Cheng
7ca3555bfa
Fix patterns so we isel the xorps, etc. for floating pt logical SSE ops. DAG combiner may fold away the (bit_convert (load)).
...
llvm-svn: 40070
2007-07-19 23:34:10 +00:00
Evan Cheng
94b5a80b93
Change instruction description to split OperandList into OutOperandList and
...
InOperandList. This gives one piece of important information: # of results
produced by an instruction.
An example of the change:
def ADD32rr : I<0x01, MRMDestReg, (ops GR32:$dst, GR32:$src1, GR32:$src2),
"add{l} {$src2, $dst|$dst, $src2}",
[(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
=>
def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
"add{l} {$src2, $dst|$dst, $src2}",
[(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
llvm-svn: 40033
2007-07-19 01:14:50 +00:00
Dan Gohman
776962a97a
Implement initial memory alignment awareness for SSE instructions. Vector loads
...
and stores that have a specified alignment of less than 16 bytes now use
instructions that support misaligned memory references.
llvm-svn: 40015
2007-07-18 20:23:34 +00:00
Dan Gohman
57111e7a60
Define non-intrinsic instructions for vector min, max, sqrt, rsqrt, and rcp,
...
in addition to the intrinsic forms. Add spill-folding entries for these new
instructions, and for the scalar min and max instrinsic instructions which
were missing. And add some preliminary ISelLowering code for using the new
non-intrinsic vector sqrt instruction, and fneg and fabs.
llvm-svn: 38478
2007-07-10 00:05:58 +00:00
Dale Johannesen
a2b3c175db
Fix for PR 1505 (and 1489). Rewrite X87 register
...
model to include f32 variants. Some factoring
improvments forthcoming.
llvm-svn: 37847
2007-07-03 00:53:03 +00:00
Dan Gohman
e8c1e428f2
Revert the earlier change that removed the M_REMATERIALIZABLE machine
...
instruction flag, and use the flag along with a virtual member function
hook for targets to override if there are instructions that are only
trivially rematerializable with specific operands (i.e. constant pool
loads).
llvm-svn: 37728
2007-06-26 00:48:07 +00:00
Dan Gohman
2e84e3f7b7
Make minor adjustments to whitespace and comments to reduce differences
...
between SSE1 instructions and their respective SSE2 analogues.
llvm-svn: 37718
2007-06-25 15:44:19 +00:00
Dan Gohman
33209bd6b8
Fix loadv2i32 to be loadv4i32, though it isn't actually used anywhere yet.
...
llvm-svn: 37717
2007-06-25 15:19:03 +00:00
Dan Gohman
9e82064924
Replace M_REMATERIALIZIBLE and the newly-added isOtherReMaterializableLoad
...
with a general target hook to identify rematerializable instructions. Some
instructions are only rematerializable with specific operands, such as loads
from constant pools, while others are always rematerializable. This hook
allows both to be identified as being rematerializable with the same
mechanism.
llvm-svn: 37644
2007-06-19 01:48:05 +00:00
Evan Cheng
632c3f01ed
Added missing patterns for UNPCKH* and PUNPCKH*.
...
llvm-svn: 37172
2007-05-17 18:44:37 +00:00
Bill Wendling
b5ce7c5466
Non-algorithmic change. Moved definitions around into separate sections
...
for SSE1, SSE2, SSE3, and SSSE3.
llvm-svn: 36656
2007-05-02 23:11:52 +00:00
Dan Gohman
29845cd40d
Fix the spelling of the prefetchnta instruction.
...
llvm-svn: 36256
2007-04-18 14:09:14 +00:00
Bill Wendling
f099841573
Add support for our first SSSE3 instruction "pmulhrsw".
...
llvm-svn: 35869
2007-04-10 22:10:25 +00:00
Evan Cheng
61eee86487
Mark re-materializable instructions.
...
llvm-svn: 35230
2007-03-21 00:16:56 +00:00
Chris Lattner
d647f92a14
add missing braces
...
llvm-svn: 34905
2007-03-04 06:13:52 +00:00
Evan Cheng
b68a774fd9
How the heck did I forget patterns for llvm.x86.sse2.cmp.sd?
...
llvm-svn: 34434
2007-02-20 00:39:09 +00:00
Evan Cheng
82241c86e9
- FCOPYSIGN custom lowering bug. Clear the sign bit of operand 0 first before
...
or'ing in the sign bit of operand 1.
- Tweaking: rather than left shift the sign bit, fp_extend operand 1 first
before taking its sign bit if its type is smaller than that of operand 0.
llvm-svn: 32932
2007-01-05 21:37:56 +00:00
Evan Cheng
4363e884c0
With SSE2, expand FCOPYSIGN to a series of SSE bitwise operations.
...
llvm-svn: 32900
2007-01-05 07:55:56 +00:00
Evan Cheng
fccea9b2a9
- Rename MOVDSS2DIrr to MOVSS2DIrr for consistency sake.
...
- Add MOVDI2SSrm and MOVSS2DImr to fold load / store for i32 <-> f32 bit_convert
patterns.
llvm-svn: 32582
2006-12-14 19:43:11 +00:00
Chris Lattner
c20b7e878a
If we have ScalarSSE, we can select bitconvert into single instructions.
...
This compiles bitcast.ll:test3/test4 into:
_test3:
movd %xmm0, %eax
ret
_test4:
movd %edi, %xmm0
ret
llvm-svn: 32230
2006-12-05 18:45:06 +00:00
Evan Cheng
572dc9cb4e
Correct instructions for moving data between GR64 and SSE registers; also correct load i64 / store i64 from v2i64.
...
llvm-svn: 31795
2006-11-16 23:33:25 +00:00
Evan Cheng
49683ba236
Don't dag combine floating point select to max and min intrinsics. Those
...
take v4f32 / v2f64 operands and may end up causing larger spills / restores.
Added X86 specific nodes X86ISD::FMAX, X86ISD::FMIN instead.
This fixes PR996.
llvm-svn: 31645
2006-11-10 21:43:37 +00:00
Evan Cheng
922e191116
Fixed a bug which causes x86 be to incorrectly match
...
shuffle v, undef, <2, ?, 3, ?>
to movhlps
It should match to unpckhps instead.
Added proper matching code for
shuffle v, undef, <2, 3, 2, 3>
llvm-svn: 31519
2006-11-07 22:14:24 +00:00
Chris Lattner
9ac6442db6
remove dead/redundant vars
...
llvm-svn: 31435
2006-11-03 23:48:56 +00:00
Evan Cheng
94e5bc9e83
Fix ldmxcsr JIT encoding.
...
llvm-svn: 31343
2006-11-01 06:53:52 +00:00
Evan Cheng
e056dd5928
Fixed a significant bug where unpcklpd is incorrectly used to extract element 1 from a v2f64 value.
...
llvm-svn: 31228
2006-10-27 21:08:32 +00:00
Evan Cheng
7e065ff7e4
X86ISD::PEXTRW 3rd operand type is always target pointer type.
...
llvm-svn: 31185
2006-10-25 21:35:05 +00:00
Evan Cheng
4090dc4703
ComplexPatterns sse_load_f32 and sse_load_f64 returns in / out chain operands.
...
llvm-svn: 30892
2006-10-11 21:06:01 +00:00
Evan Cheng
57ccb6d372
Don't go too crazy with these AddComplexity. Try matching shufps with load
...
folding first.
llvm-svn: 30848
2006-10-09 21:42:15 +00:00
Evan Cheng
e71fe34d75
Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
...
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner
398195ebbe
completely disable folding of loads into scalar sse instructions and provide
...
a framework for doing it right. This fixes
CodeGen/X86/2006-10-07-ScalarSSEMiscompile.ll.
Once X86DAGToDAGISel::SelectScalarSSELoad is implemented right, this task
will be done.
llvm-svn: 30817
2006-10-07 21:55:32 +00:00
Chris Lattner
942009fee5
convert packed FP add/sub/mul/div to use a multiclass.
...
llvm-svn: 30815
2006-10-07 21:17:13 +00:00
Chris Lattner
4005f4e49c
one multiclass now defines all 8 variants of binary-scalar-sse-fp operations.
...
llvm-svn: 30814
2006-10-07 20:55:57 +00:00
Chris Lattner
6eaee2c8e3
Switch ADD/MUL/DIV/SUB scalarsse fp ops to a multiclass
...
llvm-svn: 30813
2006-10-07 20:35:44 +00:00
Chris Lattner
c8c6441821
Random acts of shrinkage
...
llvm-svn: 30812
2006-10-07 19:49:05 +00:00
Chris Lattner
b5df7e554d
Convert pand/por/pxor to use multiclass
...
llvm-svn: 30811
2006-10-07 19:37:30 +00:00
Chris Lattner
6138cba5f1
Convert some more instructions over to use a new multiclass.
...
Fix a bug where the asmstring for PSUBQrm was wrong.
llvm-svn: 30810
2006-10-07 19:34:33 +00:00
Chris Lattner
662ba43f08
Fix a bug where PADDQrm printed paddd instead of paddq.
...
llvm-svn: 30809
2006-10-07 19:15:46 +00:00
Chris Lattner
29c62a3c88
Add multiclass for SSE2 instructions that correspond to simple binops.
...
llvm-svn: 30808
2006-10-07 19:14:49 +00:00
Chris Lattner
e0928d9d7b
rename:
...
PDI_binop_rm -> PDI_binop_rm_int
PDI_binop_rmi -> PDI_binop_rmi_int
to make it clear that these are for use with intrinsics.
llvm-svn: 30807
2006-10-07 19:02:31 +00:00
Chris Lattner
489b63089d
Convert saturating PADD/PSUB's to use a multiclass
...
llvm-svn: 30806
2006-10-07 18:48:46 +00:00
Chris Lattner
fa2ce8824d
Convert PAVG*, PMADDWD, and PMUL* to use multiclasses.
...
llvm-svn: 30805
2006-10-07 18:39:00 +00:00
Chris Lattner
cab92e4c0c
Fix typo in packsswb instr definition, where the load had the wrong type.
...
This allows us to use the multiclass for other packs.
llvm-svn: 30804
2006-10-07 18:23:58 +00:00
Chris Lattner
e746a9cd6a
handle pmin/pmax with multiclasses
...
llvm-svn: 30800
2006-10-07 07:49:33 +00:00
Chris Lattner
b14e6a0f8c
simplify pack and shift intrinsics with multiclasses
...
llvm-svn: 30797
2006-10-07 07:06:17 +00:00
Chris Lattner
521fc4e33f
Use a multiclass to simplify 'SSE2 Integer comparison'
...
llvm-svn: 30796
2006-10-07 06:47:08 +00:00
Chris Lattner
c6138cec61
move class defns close to uses to make it easier to read
...
llvm-svn: 30795
2006-10-07 06:33:36 +00:00
Chris Lattner
87e692323c
simplify horizontal op definitions
...
llvm-svn: 30794
2006-10-07 06:31:41 +00:00
Chris Lattner
3e9fc37458
remove more unneeded type info
...
llvm-svn: 30793
2006-10-07 06:27:03 +00:00
Chris Lattner
807be0a715
remove unneeded definitions and type info
...
llvm-svn: 30792
2006-10-07 06:19:41 +00:00
Chris Lattner
5b1358a8eb
remove some unneeded type info
...
llvm-svn: 30791
2006-10-07 06:17:43 +00:00
Chris Lattner
3414c022af
simplify patterns by merging in operand info
...
llvm-svn: 30790
2006-10-07 05:50:25 +00:00
Chris Lattner
ca21ce5f08
Factor operands into packed unary classes
...
llvm-svn: 30789
2006-10-07 05:47:20 +00:00
Chris Lattner
0052c3ff5b
remove dead/duplicate instructions
...
llvm-svn: 30788
2006-10-07 05:41:52 +00:00
Chris Lattner
904c6e9c92
Pull operand info up into parent class for scalar sse intrinsics.
...
llvm-svn: 30787
2006-10-07 05:26:13 +00:00
Chris Lattner
e698c90ee9
convert the sole sd unary intrinsic to a multiclass for consistency
...
llvm-svn: 30786
2006-10-07 05:19:31 +00:00
Chris Lattner
2bb2f050f5
pull operand string into the multiclass
...
llvm-svn: 30785
2006-10-07 05:13:26 +00:00
Chris Lattner
069679c7b6
Remove RSQRTSS[rm] RCPSS[rm], which are dead.
...
Introduce SS_IntUnary, a multiclass to replace SS_Int[rm].
llvm-svn: 30784
2006-10-07 05:09:48 +00:00
Chris Lattner
f13a7b376c
eliminate redundancy
...
llvm-svn: 30783
2006-10-07 04:52:09 +00:00
Evan Cheng
a36e6cf44f
These don't have immediate operands.
...
llvm-svn: 30694
2006-10-03 06:55:11 +00:00
Evan Cheng
4259a0f654
X86ISD::CMP now produces a chain as well as a flag. Make that the chain
...
operand of a conditional branch to allow load folding into CMP / TEST
instructions.
llvm-svn: 30241
2006-09-11 02:19:56 +00:00
Evan Cheng
17c28b2e0e
JIT encoding bug.
...
llvm-svn: 30112
2006-09-05 05:59:25 +00:00
Evan Cheng
66ed41cac1
Can't commute shufps. The high / low parts elements come from different vectors.
...
llvm-svn: 29275
2006-07-25 20:25:40 +00:00
Evan Cheng
5987cfb7b1
X86 target specific DAG combine: turn build_vector (load x), (load x+4),
...
(load x+8), (load x+12), <0, 1, 2, 3> to a single 128-bit load (aligned and
unaligned).
e.g.
__m128 test(float a, float b, float c, float d) {
return _mm_set_ps(d, c, b, a);
}
_test:
movups 4(%esp), %xmm0
ret
llvm-svn: 29042
2006-07-07 08:33:52 +00:00
Evan Cheng
390922f979
Should just use xorps to clear XMM registers for all data types. pxor is also one byte longer.
...
llvm-svn: 28984
2006-06-29 18:04:54 +00:00
Evan Cheng
fc8cdda070
Always use xorps to clear XMM registers.
...
llvm-svn: 28979
2006-06-29 00:34:23 +00:00
Chris Lattner
dbec49d574
Remove some ugly now-redundant casts.
...
llvm-svn: 28864
2006-06-20 00:25:29 +00:00
Chris Lattner
55594634d7
Fix some mismatched type constraints
...
llvm-svn: 28862
2006-06-20 00:12:37 +00:00
Evan Cheng
cd58e9d8b9
Minor clean up.
...
llvm-svn: 28860
2006-06-19 19:25:30 +00:00
Evan Cheng
de7156f12c
Type of vector extract / insert index operand should be iPTR.
...
llvm-svn: 28796
2006-06-15 08:14:54 +00:00
Evan Cheng
25e44e008d
Rename instructions for consistency sake.
...
llvm-svn: 28594
2006-05-31 19:00:07 +00:00
Evan Cheng
8abf45e22d
Select vector_shuffle v1, undef <2, 3, ?, ?> to MOVHLPS.
...
llvm-svn: 28582
2006-05-31 00:51:37 +00:00
Evan Cheng
57399704b3
MAXP{D|S} and MINP{D|S} are commutable.
...
llvm-svn: 28578
2006-05-30 23:47:30 +00:00
Evan Cheng
c0f90bef47
Commute shufps / shufpd.
...
llvm-svn: 28577
2006-05-30 23:34:30 +00:00
Evan Cheng
66f849bd7b
Allow shufps x, x, mask to be converted to pshufd x, mask to save a move.
...
llvm-svn: 28565
2006-05-30 20:26:50 +00:00
Evan Cheng
9fee442e63
X86 integer register classes naming changes. Make them consistent with FP, vector classes.
...
llvm-svn: 28324
2006-05-16 07:21:53 +00:00
Chris Lattner
44a73e9fa5
Teach the code generator to use cvtss2sd as extload f32 -> f64
...
llvm-svn: 28131
2006-05-05 21:35:18 +00:00
Evan Cheng
8b1cde2bbe
Use movsd to shuffle in the lowest two elements of a v4f32 / v4i32 vector when
...
movlps cannot be used (e.g. when load from m64 has multiple uses).
llvm-svn: 28089
2006-05-03 20:32:03 +00:00
Evan Cheng
4cc3e0b05f
Fix a typo.
...
llvm-svn: 27968
2006-04-25 17:48:41 +00:00
Evan Cheng
fb46b2bf5d
Explicitly specify result type for def : Pat<> patterns (if it produces a vector
...
result). Otherwise tblgen will pick the default (v16i8 for 128-bit vector).
llvm-svn: 27965
2006-04-25 00:50:01 +00:00
Evan Cheng
25b09295f8
Added X86 SSE2 intrinsics which can be represented as vector_shuffles. This is
...
a temporary workaround for the 2-wide vector_shuffle problem (i.e. its mask
would have type v2i32 which is not legal).
llvm-svn: 27964
2006-04-24 23:34:56 +00:00
Evan Cheng
63bd4d3730
Some missing movlps, movhps, movlpd, and movhpd patterns.
...
llvm-svn: 27960
2006-04-24 21:58:20 +00:00
Evan Cheng
e8b5180044
Now generating perfect (I think) code for "vector set" with a single non-zero
...
scalar value.
e.g.
_mm_set_epi32(0, a, 0, 0);
==>
movd 4(%esp), %xmm0
pshufd $69, %xmm0, %xmm0
_mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
movzbw 4(%esp), %ax
movzwl %ax, %eax
pxor %xmm0, %xmm0
pinsrw $5, %eax, %xmm0
llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Evan Cheng
6d5297dac3
Prefer {p}unpack* and mov*dup over {p}shuf* as well.
...
llvm-svn: 27844
2006-04-19 21:15:24 +00:00
Evan Cheng
b416a25174
- Renamed AddedCost to AddedComplexity.
...
- Added more movhlps and movlhps patterns.
llvm-svn: 27842
2006-04-19 20:37:34 +00:00
Evan Cheng
cc7abc6c38
More mov{h|l}p{d|s} patterns.
...
llvm-svn: 27836
2006-04-19 18:20:17 +00:00
Evan Cheng
aeb09ccdd3
- More mov{h|l}ps patterns.
...
- Increase cost (complexity) of patterns which match mov{h|l}ps ops. These
are preferred over shufps in most cases.
llvm-svn: 27835
2006-04-19 18:11:52 +00:00
Evan Cheng
3823aa1d0f
- PEXTRW cannot take a memory location as its first source operand.
...
- PINSRWrmi encoding bug.
llvm-svn: 27818
2006-04-18 21:59:43 +00:00
Evan Cheng
a179ea631d
Name change for clarity sake
...
llvm-svn: 27816
2006-04-18 21:55:35 +00:00
Evan Cheng
d799d680f4
Name change for clarity sake
...
llvm-svn: 27814
2006-04-18 21:29:50 +00:00
Evan Cheng
0ee281f37c
Left a pattern out
...
llvm-svn: 27813
2006-04-18 21:29:08 +00:00
Evan Cheng
e2d25a1a50
Fixed an encoding bug: movd from XMM to R32.
...
llvm-svn: 27807
2006-04-18 18:19:00 +00:00
Evan Cheng
5421206c4b
Use movss to insert_vector_elt(v, s, 0).
...
llvm-svn: 27782
2006-04-17 22:45:49 +00:00
Evan Cheng
22c06f054b
Encoding bug
...
llvm-svn: 27773
2006-04-17 21:33:57 +00:00
Evan Cheng
5112b5c544
Errors in patterns preventing load folding
...
llvm-svn: 27762
2006-04-17 18:05:01 +00:00
Evan Cheng
20712deecb
movduprm, movshduprm bugs
...
llvm-svn: 27734
2006-04-16 18:11:28 +00:00
Evan Cheng
3064f9aaa6
Encoding bugs
...
llvm-svn: 27733
2006-04-16 07:02:22 +00:00
Evan Cheng
8f1d801389
More encoding bugs
...
llvm-svn: 27722
2006-04-15 06:10:09 +00:00
Evan Cheng
91944e8699
pslldrm, psrawrm, etc. encoding bug
...
llvm-svn: 27721
2006-04-15 05:59:08 +00:00
Evan Cheng
1220b31a31
hsubp{s|d} encoding bug
...
llvm-svn: 27720
2006-04-15 05:52:42 +00:00
Evan Cheng
6222cf2a36
Silly bug
...
llvm-svn: 27719
2006-04-15 05:37:34 +00:00
Evan Cheng
00a5b3d9d3
Some clean up
...
llvm-svn: 27715
2006-04-14 23:32:40 +00:00
Evan Cheng
5d247f81c1
Last few SSE3 intrinsics.
...
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng
3bd605397b
Misc. SSE2 intrinsics: clflush, lfench, mfence
...
llvm-svn: 27699
2006-04-14 07:43:12 +00:00
Evan Cheng
eb0063a34f
pcmpeq* and pcmpgt* intrinsics.
...
llvm-svn: 27685
2006-04-14 01:39:53 +00:00
Evan Cheng
16287444ff
psll*, psrl*, and psra* intrinsics.
...
llvm-svn: 27684
2006-04-14 00:14:05 +00:00
Evan Cheng
a84319719c
Doh. PANDrm, etc. are not commutable.
...
llvm-svn: 27668
2006-04-13 18:11:28 +00:00
Evan Cheng
ed3996743f
psad, pmax, pmin intrinsics.
...
llvm-svn: 27647
2006-04-13 06:11:45 +00:00
Evan Cheng
58dad55959
Various SSE2 packed integer intrinsics: pmulhuw, pavgw, etc.
...
llvm-svn: 27645
2006-04-13 05:24:54 +00:00
Evan Cheng
b3fe00bdc6
padds{b|w}, paddus{b|w}, psubs{b|w}, psubus{b|w} intrinsics.
...
llvm-svn: 27639
2006-04-13 00:43:35 +00:00
Evan Cheng
0aab735a1a
Naming inconsistency.
...
llvm-svn: 27638
2006-04-13 00:00:23 +00:00