Commit Graph

2364 Commits

Author SHA1 Message Date
Anton Korobeynikov 2fa75184f3 Another comments fixing
llvm-svn: 48683
2008-03-22 07:53:40 +00:00
Evan Cheng 31604a62f6 Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE.
llvm-svn: 48673
2008-03-22 01:55:50 +00:00
Dan Gohman 30e44a4b40 Fix -view-sunit-dags to support cross-rc-copy nodes.
llvm-svn: 48664
2008-03-21 22:51:06 +00:00
Duncan Sands d97eea372a Introduce a new node for holding call argument
flags.  This is needed by the new legalize types
infrastructure which wants to expand the 64 bit
constants previously used to hold the flags on
32 bit machines.  There are two functional changes:
(1) in LowerArguments, if a parameter has the zext
attribute set then that is marked in the flags;
before it was being ignored; (2) PPC had some bogus
code for handling two word arguments when using the
ELF 32 ABI, which was hard to convert because of
the bogusness.  As suggested by the original author
(Nicolas Geoffray), I've disabled it for the moment.
Tested with "make check" and the Ada ACATS testsuite.

llvm-svn: 48640
2008-03-21 09:14:45 +00:00
Christopher Lamb 3e9f49716e Check even more carefully before applying this DAGCombine transform.
llvm-svn: 48580
2008-03-20 04:31:39 +00:00
Evan Cheng 7a3e750fd2 Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m)))
llvm-svn: 48578
2008-03-20 02:18:41 +00:00
Chris Lattner a7cca362af detabify llvm, patch by Mike Stump!
llvm-svn: 48577
2008-03-20 01:22:40 +00:00
Christopher Lamb 8fe9109469 Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491.
llvm-svn: 48542
2008-03-19 08:30:06 +00:00
Bill Wendling efb4d9ef80 Temporarily revert r48491. It's breaking test/CodeGen/X86/xorl.ll.
llvm-svn: 48510
2008-03-18 22:29:51 +00:00
Dale Johannesen 12c76db312 Make conversions of i8/i16 to ppcf128 work.
llvm-svn: 48493
2008-03-18 17:28:38 +00:00
Christopher Lamb 3e408d4d82 Target independent DAG transform to use truncate for field extraction + sign extend on targets where this is profitable. Passes nightly on x86-64.
llvm-svn: 48491
2008-03-18 16:46:39 +00:00
Christopher Lamb d3d0ad3f58 Make insert_subreg a two-address instruction, vastly simplifying LowerSubregs pass. Add a new TII, subreg_to_reg, which is like insert_subreg except that it takes an immediate implicit value to insert into rather than a register.
llvm-svn: 48412
2008-03-16 03:12:01 +00:00
Evan Cheng 0e7b00d79f Replace all target specific implicit def instructions with a target independent one: TargetInstrInfo::IMPLICIT_DEF.
llvm-svn: 48380
2008-03-15 00:03:38 +00:00
Duncan Sands 858e6385f7 Do not generate special entries in the dwarf eh
table for nounwind calls.

llvm-svn: 48373
2008-03-14 21:36:24 +00:00
Duncan Sands a06e4f3050 Simplify using getIntPtrConstant.
llvm-svn: 48355
2008-03-14 05:23:57 +00:00
Nate Begeman 63eb03f800 Tabs -> spaces
Use getIntPtrConstant in a couple places to shorten stuff up
Handle splitting vector shuffles with undefs in the mask

llvm-svn: 48351
2008-03-14 00:53:31 +00:00
Evan Cheng db443ca377 Livein copy scheduling fixes: do not coalesce physical register copies, correctly determine the safe location to insert the copies.
llvm-svn: 48348
2008-03-14 00:14:55 +00:00
Dan Gohman b72127ac4c More APInt-ification.
llvm-svn: 48344
2008-03-13 22:13:53 +00:00
Evan Cheng 65e9d5f1a8 Experimental scheduler change to schedule / coalesce the copies added for function livein's. Take 2008-03-10-RegAllocInfLoop.ll, the schedule looks like this after these copies are inserted:
entry: 0x12049d0, LLVM BB @0x1201fd0, ID#0:
Live Ins: %EAX %EDX %ECX
        %reg1031<def> = MOVPC32r 0
        %reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def>
        %reg1028<def> = MOV32rr %EAX
        %reg1029<def> = MOV32rr %EDX
        %reg1030<def> = MOV32rr %ECX
        %reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x1201910 + 0]
        %reg1025<def> = MOV32rr %reg1029
        %reg1026<def> = MOV32rr %reg1030
        %reg1024<def> = MOV32rr %reg1028

The copies unnecessarily increase register pressure and it will end up requiring a physical register to be spilled.

With -schedule-livein-copies:
entry: 0x12049d0, LLVM BB @0x1201fa0, ID#0:
Live Ins: %EAX %EDX %ECX
        %reg1031<def> = MOVPC32r 0
        %reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def>
        %reg1024<def> = MOV32rr %EAX
        %reg1025<def> = MOV32rr %EDX
        %reg1026<def> = MOV32rr %ECX
        %reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x12018e0 + 0]

Much better!

llvm-svn: 48307
2008-03-12 22:19:41 +00:00
Duncan Sands 723849a17f Initial soft-float support for LegalizeTypes. I rewrote
the fcopysign expansion from LegalizeDAG to get rid of
what seems to be a bug: the use of sign extension means
that when copying the sign bit from an f32 to an f64,
the upper 32 bits of the f64 (now an i64) are set, not
just the top bit...  I also generalized it to work for
any sized floating point types, and removed the bogosity:
  SDOperand Mask1 = (SrcVT == MVT::f64)
    ? DAG.getConstantFP(BitsToDouble(1ULL << 63), SrcVT)
    : DAG.getConstantFP(BitsToFloat(1U << 31), SrcVT);
  Mask1 = DAG.getNode(ISD::BIT_CONVERT, SrcNVT, Mask1);
(here SrcNVT is an integer with the same size as SrcVT).
As far as I can see this takes a 1 << 63, converts to
a double, converts that to a floating point constant
then converts that to an integer constant, ending up
with... 1 << 63 as an integer constant!  So I just
generate this integer constant directly.

llvm-svn: 48305
2008-03-12 21:27:04 +00:00
Duncan Sands c54fe97f08 Fix typo.
llvm-svn: 48295
2008-03-12 20:35:19 +00:00
Duncan Sands 87de65fc29 Don't try to extract an i32 from an f64. This
getCopyToParts problem was noticed by the new
LegalizeTypes infrastructure.  In order to avoid
this kind of thing in the future I've added a
check that EXTRACT_ELEMENT is only used with
integers.  Once LegalizeTypes is up and running
most likely BUILD_PAIR and EXTRACT_ELEMENT can
be removed, in favour of using apints instead.

llvm-svn: 48294
2008-03-12 20:30:08 +00:00
Evan Cheng 99ee78ef63 Clean up my own mess.
X86 lowering normalize vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases.

llvm-svn: 48279
2008-03-12 07:02:50 +00:00
Evan Cheng 0903aef2ff Total brain cramp.
llvm-svn: 48274
2008-03-12 02:05:05 +00:00
Anton Korobeynikov e8fa50f63a Correctly propagate thread-local flag from aliasee to alias. This fixes PR2137
llvm-svn: 48257
2008-03-11 22:38:53 +00:00
Dan Gohman 44b4c07cd1 Use the correct value for InSignBit.
llvm-svn: 48245
2008-03-11 21:29:43 +00:00
Dan Gohman 1351025a91 Initial codegen support for functions and calls with multiple return values.
llvm-svn: 48244
2008-03-11 21:11:25 +00:00
Christopher Lamb aa7c2105de Recommitting parts of r48130. These do not appear to cause the observed failures.
llvm-svn: 48223
2008-03-11 10:09:17 +00:00
Evan Cheng e88a625ecd When the register allocator runs out of registers, spill a physical register around the def's and use's of the interval being allocated to make it possible for the interval to target a register and spill it right away and restore a register for uses. This likely generates terrible code but is before than aborting.
llvm-svn: 48218
2008-03-11 07:19:34 +00:00
Duncan Sands b29f93613d Some LegalizeTypes code factorization and minor
enhancements.

llvm-svn: 48215
2008-03-11 06:41:14 +00:00
Chris Lattner 5c7bda440f compile: double test() {}
into:

_test:
	fldz
	ret

instead of:

_test:
	subl	$12, %esp
	#IMPLICIT_DEF %xmm0
	movsd	%xmm0, (%esp)
	fldl	(%esp)
	addl	$12, %esp
	ret

llvm-svn: 48213
2008-03-11 06:21:08 +00:00
Chris Lattner 3e0ec65678 variadic instructions don't have operand info for variadic arguments.
llvm-svn: 48208
2008-03-11 03:14:42 +00:00
Dan Gohman d6819da453 Generalize ExpandIntToFP to handle the case where the operand is legal
and it's the result that requires expansion. This code is a little confusing
because the TargetLoweringInfo tables for [US]INT_TO_FP use the operand type
(the integer type) rather than the result type. 

llvm-svn: 48206
2008-03-11 01:59:03 +00:00
Chris Lattner d3090bcfc8 If a register operand comes from the variadic part of a node, don't
verify the register constraint matches what the instruction expects.

llvm-svn: 48205
2008-03-11 00:59:28 +00:00
Dan Gohman 10f7d850cf More APInt-ification.
llvm-svn: 48201
2008-03-11 00:11:06 +00:00
Dan Gohman 2a3aeb1f72 Correctly clone FlaggedNodes.
llvm-svn: 48196
2008-03-10 23:48:14 +00:00
Dan Gohman 830d86cab8 APInt-ify this.
llvm-svn: 48194
2008-03-10 23:38:17 +00:00
Dan Gohman f4300950f1 Implement more support for fp-to-i128 and i128-to-fp conversions.
llvm-svn: 48189
2008-03-10 23:03:31 +00:00
Dan Gohman 272e234477 Fix mul expansion to check the correct number of bits for
zero extension when checking if an unsigned multiply is
safe.

llvm-svn: 48171
2008-03-10 20:42:19 +00:00
Evan Cheng b9e4280e94 Somewhat better solution.
llvm-svn: 48170
2008-03-10 19:58:22 +00:00
Evan Cheng ae2c56d93e Default ISD::PREFETCH to expand.
llvm-svn: 48169
2008-03-10 19:38:10 +00:00
Evan Cheng d4e1d9eeb2 Revert 48125, 48126, and 48130 for now to unbreak some x86-64 tests.
llvm-svn: 48167
2008-03-10 19:31:26 +00:00
Scott Michel a6729e8666 Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's
return ValueType can depend its operands' ValueType.

This is a cosmetic change, no functionality impacted.

llvm-svn: 48145
2008-03-10 15:42:14 +00:00
Evan Cheng 831ae49599 Doh
llvm-svn: 48140
2008-03-10 07:59:01 +00:00
Evan Cheng b5d11980d9 Avoid creating BUILD_VECTOR of all zero elements of "non-normalized" type (e.g. v8i16 on x86) after legalizer. Instruction selection does not expect to see them. In all likelihood this can only be an issue in a bugpoint reduced test case.
llvm-svn: 48136
2008-03-10 07:19:13 +00:00
Christopher Lamb 4ba3f0430b Allow insert_subreg into implicit, target-specific values.
Change insert/extract subreg instructions to be able to be used in TableGen patterns.
Use the above features to reimplement an x86-64 pseudo instruction as a pattern.

llvm-svn: 48130
2008-03-10 06:12:08 +00:00
Dale Johannesen 4e622ec86d Increase ISD::ParamFlags to 64 bits. Increase the ByValSize
field to 32 bits, thus enabling correct handling of ByVal
structs bigger than 0x1ffff.  Abstract interface a bit.
Fixes gcc.c-torture/execute/pr23135.c and 
gcc.c-torture/execute/pr28982b.c in gcc testsuite (were ICE'ing
on ppc32, quietly producing wrong code on x86-32.)

llvm-svn: 48122
2008-03-10 02:17:22 +00:00
Chris Lattner 4c4234b59c remove an extraneous (and ugly) default argument, thanks Duncan.
llvm-svn: 48117
2008-03-09 20:04:36 +00:00
Chris Lattner ce5f841bb5 fp_round's produced by getCopyFromParts should always be exact, because
they are produced by calls (which are known exact) and by cross block copies
which are known to be produced by extends.

This improves:

define double @test2() {
	%tmp85 = call double asm sideeffect "fld0", "={st(0)}"()
	ret double %tmp85
}

from:

_test2:
	subl	$20, %esp
	# InlineAsm Start
	fld0
	# InlineAsm End
	fstpl	8(%esp)
	movsd	8(%esp), %xmm0
	movsd	%xmm0, (%esp)
	fldl	(%esp)
	addl	$20, %esp
	#FP_REG_KILL
	ret

to:

_test2:
	# InlineAsm Start
	fld0
	# InlineAsm End
	#FP_REG_KILL
	ret

by avoiding a f64 <-> f80 trip

llvm-svn: 48108
2008-03-09 09:38:46 +00:00
Chris Lattner 86829f0ff7 teach X86InstrInfo::copyRegToReg how to copy into ST(0) from
an RFP register class.

Teach ScheduleDAG how to handle CopyToReg with different src/dst 
reg classes.

This allows us to compile trivial inline asms that expect stuff
on the top of x87-fp stack.

llvm-svn: 48107
2008-03-09 09:15:31 +00:00
Chris Lattner 9e07537e8c Add ScheduleDAG support for copytoreg where the src/dst register are
in different register classes, e.g. copy of ST(0) to RFP*.  This gets
some really trivial inline asm working that plops things on the top of
stack (PR879)

llvm-svn: 48105
2008-03-09 08:49:15 +00:00
Chris Lattner 381bbdb924 fix 80 col violation
llvm-svn: 48100
2008-03-09 07:51:01 +00:00
Chris Lattner 83b3473dd8 extend fp values with FP_EXTEND not FP_ROUND.
llvm-svn: 48097
2008-03-09 07:47:22 +00:00
Chris Lattner 322c826c9d Fix two problems in SelectionDAGLegalize::ExpandBUILD_VECTOR's handling
of BUILD_VECTORS that only have two unique elements:

1. The previous code was nondeterminstic, because it walked a map in
   SDOperand order, which isn't determinstic.
2. The previous code didn't handle the case when one element was undef
   very well.  Now we ensure that the generated shuffle mask has the
   undef vector on the RHS (instead of potentially being on the LHS)
   and that any elements that refer to it are themselves undef.  This
   allows us to compile CodeGen/X86/vec_set-9.ll into:

_test3:
	movd	%rdi, %xmm0
	punpcklqdq	%xmm0, %xmm0
	ret

instead of:

_test3:
	movd	%rdi, %xmm1
	#IMPLICIT_DEF %xmm0
	punpcklqdq	%xmm1, %xmm0
	ret

... saving a register.

llvm-svn: 48060
2008-03-09 00:29:42 +00:00
Chris Lattner a1f25b0020 Teach SD some vector identities, allowing us to compile vec_set-9 into:
_test3:
	movd	%rdi, %xmm1
	#IMPLICIT_DEF %xmm0
	punpcklqdq	%xmm1, %xmm0
	ret

instead of:

_test3:
	#IMPLICIT_DEF %rax
	movd	%rax, %xmm0
	movd	%rdi, %xmm1
	punpcklqdq	%xmm1, %xmm0
	ret

This is still not ideal.  There is no reason to two xmm regs.

llvm-svn: 48058
2008-03-08 23:43:36 +00:00
Evan Cheng 95cf661534 Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0|1|2} and prefetchnta instructions.
llvm-svn: 48042
2008-03-08 00:58:38 +00:00
Evan Cheng 34173f0a43 80 col violation.
llvm-svn: 47998
2008-03-06 17:42:34 +00:00
Evan Cheng a3cb090446 Constant fold SIGN_EXTEND_INREG with ashr not lshr.
llvm-svn: 47992
2008-03-06 08:20:51 +00:00
Dale Johannesen 8ee39c61f2 Clarify that CALLSEQ_START..END may not be nested,
and add some protection against creating such.

llvm-svn: 47957
2008-03-05 19:14:03 +00:00
Chris Lattner 78e9cab229 Generalize FP constant shrinking optimization to apply to any vt
except ppc long double.  This allows us to shrink constant pool
entries for x86 long double constants, which in turn allows us to
use flds/fldl instead of fldt.

llvm-svn: 47938
2008-03-05 06:48:13 +00:00
Chris Lattner 3dc3899007 Improve comment, pass in the original VT so that we can shrink a long double constant
all the way to float, not stopping at double.

llvm-svn: 47937
2008-03-05 06:46:58 +00:00
Dan Gohman da7897c4e1 Codegen support for i128 UINT_TO_FP. This just fixes a
bug in r47928 (Int64Ty is the correct type for the constant
pool entry here) and removes the asserts, now that the code
is capable of handling i128.

llvm-svn: 47932
2008-03-05 02:07:31 +00:00
Evan Cheng 0a62cb44ce Add a target lowering hook to control whether it's worthwhile to compress fp constant.
For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constant since fldt is very expensive.

llvm-svn: 47931
2008-03-05 01:30:59 +00:00
Andrew Lenharth 357061a74d 64bit CAS on 32bit x86.
llvm-svn: 47929
2008-03-05 01:15:49 +00:00
Dan Gohman d9d874b0cd Codegen support for i128 SINT_TO_FP.
llvm-svn: 47928
2008-03-05 01:08:17 +00:00
Roman Levenstein c62c2bb4d0 Some improvements related to the computation of heights, depths of SUnits.
The basic idea is that all these algorithms are computing the longest paths from the root node or to the exit node. Therefore the existing implementation that uses and iterative and potentially
exponential algorithm was changed to a well-known graph algorithm based on dynamic programming. It has a linear run-time.

llvm-svn: 47884
2008-03-04 11:19:43 +00:00
Evan Cheng 38caf77419 Refactor ExpandConstantFP so it can optimize load from constpool of types larger than f64 into extload from smaller types.
llvm-svn: 47883
2008-03-04 08:05:30 +00:00
Evan Cheng 567d2e5b57 Rename isOperand() to isOperandOf() (and other similar methods). It always confuses me.
llvm-svn: 47872
2008-03-04 00:41:45 +00:00
Dan Gohman e1c4f99549 Misc. APInt-ification in the DAGCombiner.
llvm-svn: 47869
2008-03-03 23:51:38 +00:00
Dan Gohman 10f34077f1 More APInt-ification.
llvm-svn: 47868
2008-03-03 23:35:36 +00:00
Dan Gohman 0e238dc813 Yet more APInt-ification.
llvm-svn: 47867
2008-03-03 22:37:52 +00:00
Dan Gohman 2fa65b7997 More APInt-ification.
llvm-svn: 47866
2008-03-03 22:22:56 +00:00
Dan Gohman f2bbfa3ba0 More APInt-ification.
llvm-svn: 47864
2008-03-03 22:20:46 +00:00
Andrew Lenharth d032c33300 all but CAS working on x86
llvm-svn: 47798
2008-03-01 21:52:34 +00:00
Dale Johannesen 208cc8f1b9 Add MVT::is128BitVector and is64BitVector. Shrink
unaligned load/store code using them.  Per review
of unaligned load/store vector patch.

llvm-svn: 47782
2008-03-01 03:40:57 +00:00
Evan Cheng 73bdf043a1 Refactor / clean up code; remove td list scheduler special tie breaker (no real benefit).
llvm-svn: 47779
2008-03-01 00:39:47 +00:00
Dan Gohman bd2fa566e4 More APInt-ification.
llvm-svn: 47746
2008-02-29 01:47:35 +00:00
Dan Gohman 837a6dccd7 Use the new convertFromAPInt instead of convertFromZeroExtendedInteger,
which allows more of the surrounding arithmetic to be done with APInt
instead of uint64_t.

llvm-svn: 47745
2008-02-29 01:44:25 +00:00
Dan Gohman ec6be4a782 Use the new APInt-enabled form of getConstant instead of converting
an APInt into a uint64_t to call getConstant.

llvm-svn: 47742
2008-02-29 01:41:59 +00:00
Dale Johannesen cbde4c2206 Interface of getByValTypeAlignment differed between
generic & x86 versions; change generic to follow x86
and improve comments.  Add PPC version (not right
for non-Darwin.)

llvm-svn: 47734
2008-02-28 22:31:51 +00:00
Dale Johannesen c4c3de2b52 Fix an assertion message.
llvm-svn: 47722
2008-02-28 18:36:51 +00:00
Evan Cheng a465bfb87c Keep track how many commutes are performed by the scheduler.
llvm-svn: 47710
2008-02-28 07:40:24 +00:00
Chris Lattner 9824ffef0c implement expand for ISD::DECLARE by just deleting it.
llvm-svn: 47708
2008-02-28 05:53:40 +00:00
Evan Cheng c799065cc3 Add a quick and dirty "loop aligner pass". x86 uses it to align its loops to 16-byte boundaries.
llvm-svn: 47703
2008-02-28 00:43:03 +00:00
Dale Johannesen bf76a08e7c Handle load/store of misaligned vectors that are the
same size as an int type by doing a bitconvert of
load/store of the int type (same algorithm as floating point).
This makes them work for ppc Altivec.  There was some
code that purported to handle loads of (some) vectors
by splitting them into two smaller vectors, but getExtLoad
rejects subvector loads, so this could never have worked;
the patch removes it.

llvm-svn: 47696
2008-02-27 22:36:00 +00:00
Dan Gohman e5e32ec8f7 Remove the `else', at Evan's insistence.
llvm-svn: 47686
2008-02-27 19:44:57 +00:00
Duncan Sands ef40c5b204 Add a FIXME about the VECTOR_SHUFFLE evil hack.
llvm-svn: 47676
2008-02-27 17:39:13 +00:00
Duncan Sands e158a82f26 LegalizeTypes support for EXTRACT_VECTOR_ELT. The
approach taken is different to that in LegalizeDAG
when it is a question of expanding or promoting the
result type: for example, if extracting an i64 from
a <2 x i64>, when i64 needs expanding, it bitcasts
the vector to <4 x i32>, extracts the appropriate
two i32's, and uses those for the Lo and Hi parts.
Likewise, when extracting an i16 from a <4 x i16>,
and i16 needs promoting, it bitcasts the vector to
<2 x i32>, extracts the appropriate i32, twiddles
the bits if necessary, and uses that as the promoted
value.  This puts more pressure on bitcast legalization,
and I've added the appropriate cases.  They needed to
be added anyway since users can generate such bitcasts
too if they want to.  Also, when considering various
cases (Legal, Promote, Expand, Scalarize, Split) it is
a pain that expand can correspond to Expand, Scalarize
or Split, so I've changed the LegalizeTypes enum so it
lists those different cases - now Expand only means
splitting a scalar in two.
The code produced is the same as by LegalizeDAG for
all relevant testcases, except for
2007-10-31-extractelement-i64.ll, where the code seems
to have improved (see below; can an expert please tell
me if it is better or not).
Before < vs after >.

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
<       movaps  %xmm0, (%esp)
<       movl    4(%esp), %eax
<       movl    %eax, 28(%esp)
<       movl    (%esp), %eax
<       movl    %eax, 24(%esp)
<       movq    24(%esp), %mm0
<       movq    %mm0, 56(%esp)
---
>       subl    $44, %esp
>       movaps  %xmm0, 16(%esp)
>       pshufd  $1, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movd    %xmm0, (%esp)
>       movq    (%esp), %mm0
>       movq    %mm0, 8(%esp)

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
<       movaps  %xmm0, (%esp)
<       movl    12(%esp), %eax
<       movl    %eax, 28(%esp)
<       movl    8(%esp), %eax
<       movl    %eax, 24(%esp)
<       movq    24(%esp), %mm0
<       movq    %mm0, 56(%esp)
---
>       subl    $44, %esp
>       movaps  %xmm0, 16(%esp)
>       pshufd  $3, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movhlps %xmm0, %xmm0
>       movd    %xmm0, (%esp)
>       movq    (%esp), %mm0
>       movq    %mm0, 8(%esp)

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
---
>       subl    $44, %esp

<       movl    16(%esp), %eax
<       movl    %eax, 48(%esp)
<       movl    20(%esp), %eax
<       movl    %eax, 52(%esp)
<       movaps  %xmm0, (%esp)
<       movl    4(%esp), %eax
<       movl    %eax, 60(%esp)
<       movl    (%esp), %eax
<       movl    %eax, 56(%esp)
---
>       pshufd  $1, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movd    %xmm0, (%esp)
>       movd    %xmm1, 12(%esp)
>       movd    %xmm0, 8(%esp)

<       subl    $92, %esp
<       movaps  %xmm0, 64(%esp)
---
>       subl    $44, %esp

<       movl    24(%esp), %eax
<       movl    %eax, 48(%esp)
<       movl    28(%esp), %eax
<       movl    %eax, 52(%esp)
<       movaps  %xmm0, (%esp)
<       movl    12(%esp), %eax
<       movl    %eax, 60(%esp)
<       movl    8(%esp), %eax
<       movl    %eax, 56(%esp)
---
>       pshufd  $3, %xmm0, %xmm1
>       movd    %xmm1, 4(%esp)
>       movhlps %xmm0, %xmm0
>       movd    %xmm0, (%esp)
>       movd    %xmm1, 12(%esp)
>       movd    %xmm0, 8(%esp)

llvm-svn: 47672
2008-02-27 13:34:40 +00:00
Duncan Sands 2111bd2e37 LegalizeTypes support for legalizing the mask
operand of a VECTOR_SHUFFLE.  The mask is a
vector of constant integers.  The code in
LegalizeDAG doesn't bother to legalize the
mask, since it's basically just storage for
a bunch of constants, however LegalizeTypes
is more picky.  The problem is that there may
not exist any legal vector-of-integers type
with a legal element type, so it is impossible
to create a legal mask!  Unless of course you
cheat by creating a BUILD_VECTOR where the
operands have a different type to the element
type of the vector being built...  This is
pretty ugly but works - all relevant tests in
the testsuite pass, and produce the same
assembler with and without LegalizeTypes.

llvm-svn: 47670
2008-02-27 13:03:44 +00:00
Duncan Sands 5d5bc484d0 LegalizeTypes support for INSERT_VECTOR_ELT.
llvm-svn: 47669
2008-02-27 10:18:23 +00:00
Duncan Sands 96658d0189 Support for legalizing MEMBARRIER.
llvm-svn: 47667
2008-02-27 08:53:44 +00:00
Bill Wendling 97925ec704 Final de-tabification.
llvm-svn: 47663
2008-02-27 06:33:05 +00:00
Dan Gohman 66272a545b Teach Legalize how to expand an EXTRACT_ELEMENT.
llvm-svn: 47656
2008-02-27 01:52:30 +00:00
Dan Gohman f19609abe8 Convert the last remaining users of the non-APInt form of
ComputeMaskedBits to use the APInt form, and remove the
non-APInt form.

llvm-svn: 47654
2008-02-27 01:23:58 +00:00
Dan Gohman ae2b6fbb8e Convert SimplifyDemandedMask and ShrinkDemandedConstant to use APInt.
Change several cases in SimplifyDemandedMask that don't ever do any
simplifying to reuse the logic in ComputeMaskedBits instead of
duplicating it.

llvm-svn: 47648
2008-02-27 00:25:32 +00:00
Bill Wendling d7a258d325 Rename PrintableName to Name.
llvm-svn: 47629
2008-02-26 21:47:57 +00:00
Bill Wendling c24ea4fb41 Change "Name" to "AsmName" in the target register info. Gee, a refactoring tool
would have been a Godsend here!

llvm-svn: 47625
2008-02-26 21:11:01 +00:00
Dan Gohman 9db0aa86d9 Avoid aborting on invalid shift counts.
llvm-svn: 47612
2008-02-26 18:50:50 +00:00
Chris Lattner 07c83cc86e Fix PR2096, a regression introduced with my patch last night. This
also fixes cfrac, flops, and 175.vpr

llvm-svn: 47605
2008-02-26 17:09:59 +00:00
Duncan Sands 7cdbbfd067 Fix a nasty bug in LegalizeTypes (spotted in
CodeGen/PowerPC/illegal-element-type.ll): suppose
a node X is processed, and processing maps it to
a node Y.  Then X continues to exist in the DAG,
but with no users.  While processing some other
node, a new node may be created that happens to
be equal to X, and thus X will be reused rather
than a truly new node.  This can cause X to
"magically reappear", and since it is in the
Processed state in will not be reprocessed, so
at the end of type legalization the illegal node
X can still be present.  The solution is to replace
X with Y whenever X gets resurrected like this.

llvm-svn: 47601
2008-02-26 11:21:42 +00:00
Chris Lattner e7c14013f5 Fix isNegatibleForFree to not return true for ConstantFP nodes
after legalize.  Just because a constant is legal (e.g. 0.0 in SSE) 
doesn't mean that its negated value is legal (-0.0).  We could make
this stronger by checking to see if the negated constant is actually
legal post negation, but it doesn't seem like a big deal.

llvm-svn: 47591
2008-02-26 07:04:54 +00:00
Evan Cheng ccc0c996a4 Refactor inline asm constraint matching code out of SDIsel into TargetLowering.
llvm-svn: 47587
2008-02-26 02:33:44 +00:00
Dan Gohman 432e4a6742 Make some static variables const.
llvm-svn: 47566
2008-02-25 21:39:34 +00:00
Dan Gohman 1f372edd97 Convert MaskedValueIsZero and all its users to use APInt. Also add
a SignBitIsZero function to simplify a common use case.

llvm-svn: 47561
2008-02-25 21:11:39 +00:00
Duncan Sands 896c519d19 In debug builds check that the key property holds: all
result and operand types are legal.

llvm-svn: 47546
2008-02-25 16:21:21 +00:00
Duncan Sands ba3d7e8e7d Add support to LegalizeTypes for building legal vectors
out of illegal elements (BUILD_VECTOR).  Uses and beefs
up BUILD_PAIR, though it didn't really have to.  Like
most of LegalizeTypes, does not support soft-float.
This cures all "make check" vector building failures.

llvm-svn: 47537
2008-02-24 07:36:03 +00:00
Dale Johannesen eabc5f39af Pass alignment on ByVal parameters, from FE, all
the way through.  It is now used for codegen.

llvm-svn: 47484
2008-02-22 17:49:45 +00:00
Dan Gohman f3057a939d Fix a regression in 403.gcc and 186.crafty introduced in 47383. To test
that a value is >= 32, check that all of the high bits are zero, not
just one or more.

llvm-svn: 47467
2008-02-22 01:12:31 +00:00
Chris Lattner 3422b673d1 Make the clobber analysis a bit more smart: we only are careful about
early clobbers if the clobber list contains a *register* not some thing
like {memory}, {dirflag} etc.

llvm-svn: 47457
2008-02-21 20:54:31 +00:00
Chris Lattner bdd4c8b04d Treat clobber operands like early clobbers: if we have
any, we force sdisel to do all regalloc for an asm.  This
leads to gross but correct codegen.

This fixes the rest of PR2078.

llvm-svn: 47454
2008-02-21 19:43:13 +00:00
Andrew Lenharth 7254826c40 Better names as per Evan's request
llvm-svn: 47435
2008-02-21 16:11:38 +00:00
Andrew Lenharth 95528943e9 Atomic op support. If any gcc test uses __sync builtins, it might start failing on archs that haven't implemented them yet
llvm-svn: 47430
2008-02-21 06:45:13 +00:00
Chris Lattner 4da4f85090 Add support for matching mem operands. This fixes PR1133, patch by
Eli Friedman.  This implements CodeGen/Generic/2008-02-20-MatchingMem.ll.

llvm-svn: 47428
2008-02-21 05:27:19 +00:00
Chris Lattner 83c93d5afd Fix a (harmless) but where vregs were added to the used reg lists for
inline asms.

Fix PR2078 by marking aliases of registers used when a register is 
marked used.  This prevents EAX from being allocated when AX is listed
in the clobber set for the asm.

llvm-svn: 47426
2008-02-21 04:55:52 +00:00
Devang Patel 57b4eedad9 assert is more effective reminder then FIXME tag for unimplemented features.
llvm-svn: 47388
2008-02-20 18:37:40 +00:00
Duncan Sands e7b462b329 LegalizeTypes support for scalarizing a vector store
and splitting extract_subvector.  This fixes nine
"make check" testcases, for example
2008-02-04-ExtractSubvector.ll and (partially)
CodeGen/Generic/vector.ll.

llvm-svn: 47384
2008-02-20 17:38:09 +00:00
Dan Gohman 34fc7dbf5b Convert Legalize to use the APInt form of ComputeMaskedBits.
llvm-svn: 47383
2008-02-20 16:57:27 +00:00
Dan Gohman 360c86aed5 Add explicit keywords.
llvm-svn: 47382
2008-02-20 16:44:09 +00:00
Dan Gohman d0ff91dac5 Convert DAGCombiner to use the APInt form of ComputeMaskedBits.
llvm-svn: 47381
2008-02-20 16:33:30 +00:00
Dan Gohman b717fdaa7b Use APInt::intersects.
llvm-svn: 47380
2008-02-20 16:30:17 +00:00
Anton Korobeynikov 035eaacd1f Update gcc 4.3 warnings fix patch with recent head changes
llvm-svn: 47368
2008-02-20 11:10:28 +00:00
Chris Lattner 2a8037b5f5 Fix an incredibly subtle bug exposed by Ted's change to APInt profiling.
AddNodeIDNode does profiling for a ConstantSDNode, but so does 
SelectionDAG::getConstant.  This profiling should be moved to a common
static function in ConstantSDNode.

llvm-svn: 47359
2008-02-20 06:28:01 +00:00
Devang Patel 295711f583 Add GetResultInst. First step for multiple return value support.
llvm-svn: 47348
2008-02-19 22:15:16 +00:00
Evan Cheng 6200c225e0 - When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
- X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.

llvm-svn: 47290
2008-02-18 23:04:32 +00:00
Andrew Lenharth fedcf477b5 I cannot find a libgcc function for this builtin. Therefor expanding it to a noop (which is how it use to be treated). If someone who knows the x86 backend better than me could tell me how to get a lock prefix on an instruction, that would be nice to complete x86 support.
llvm-svn: 47213
2008-02-16 14:46:26 +00:00
Duncan Sands b289516a71 Teach LegalizeTypes how to expand the operands of
br_cc.  This fixes 5 "make check" failures.

llvm-svn: 47212
2008-02-16 10:29:26 +00:00
Andrew Lenharth 9b254eed32 llvm.memory.barrier, and impl for x86 and alpha
llvm-svn: 47204
2008-02-16 01:24:58 +00:00
Dan Gohman 27ae573900 Rename CountMemOperands to ComputeMemOperandsEnd to reflect what
it actually does. Simplify CountOperands a little by reusing
ComputeMemOperandsEnd. And reword some comments for both.

llvm-svn: 47198
2008-02-16 00:36:48 +00:00
Dan Gohman 856c01204b Revert 47177, which was incorrect.
llvm-svn: 47196
2008-02-16 00:25:40 +00:00
Scott Michel a3cefeaf0c Make tblgen a little smarter about constants smaller than i32. Currently,
tblgen will complain if a sign-extended constant does not fit into a
data type smaller than i32, e.g., i16. This causes a problem when certain
hex constants are used, such as 0xff for byte masks or immediate xor
values.

tblgen will try the sign-extended value first and, if the sign extended
value would overflow, it tries to see if the unsigned value will fit.
Consequently, a software developer can now safely incant:

	(XORHIr16 R16C:$rA, 0xffff)

which is somewhat clearer and more informative than incanting:

	(XORHIr16 R16C:$rA, (i16 -1))

even if the two are bitwise equivalent.

Tblgen also outputs the 64-bit unsigned constant in the generated ISel code
when getTargetConstant() is invoked.

llvm-svn: 47188
2008-02-15 23:05:48 +00:00
Dan Gohman c278c4aba0 Skip over the defs and start at the uses when looking for operands
with the TIED_TO attribute.

llvm-svn: 47177
2008-02-15 20:59:17 +00:00
Dan Gohman 0340d1e2cd Use the TargetInstrDescr to determine the number of operands
that should be checked for the TIED_TO attribute instead of
using CountOperands.

llvm-svn: 47176
2008-02-15 20:50:13 +00:00
Duncan Sands 5560281c06 Teach LegalizeTypes how to promote the flags
in a ret node.  These are created as i32 constants
but on some platforms i32 is not legal.  This
fixes 26 "make check" failures, for example
Alpha/2005-07-12-TwoMallocCalls.ll.

llvm-svn: 47172
2008-02-15 19:34:17 +00:00
Dan Gohman a36ade5595 Use StoreSDNode::getValue instead of calling getOperand directly
with a hard-coded operand number.

llvm-svn: 47163
2008-02-15 18:11:59 +00:00
Chris Lattner 558a3ba17f Fix a miscompilation from Dan's recent apintification.
llvm-svn: 47128
2008-02-14 18:48:56 +00:00
Duncan Sands 4c95dbd69f In TargetLowering::LowerCallTo, don't assert that
the return value is zero-extended if it isn't
sign-extended.  It may also be any-extended.
Also, if a floating point value was returned
in a larger floating point type, pass 1 as the
second operand to FP_ROUND, which tells it
that all the precision is in the original type.
I think this is right but I could be wrong.
Finally, when doing libcalls, set isZExt on
a parameter if it is "unsigned".  Currently
isSExt is set when signed, and nothing is
set otherwise.  This should be right for all
calls to standard library routines.

llvm-svn: 47122
2008-02-14 17:28:50 +00:00
Nate Begeman 53e1b3f9d5 Change how FP immediates are handled.
1) ConstantFP is now expand by default
2) ConstantFP is not turned into TargetConstantFP during Legalize
   if it is legal.

This allows ConstantFP to be handled like Constant, allowing for 
targets that can encode FP immediates as MachineOperands.

As a bonus, fix up Itanium FP constants, which now correctly match,
and match more constants!  Hooray.

llvm-svn: 47121
2008-02-14 08:57:00 +00:00
Dan Gohman 7e22a5d8df Allow the APInt form of ComputeMaskedBits to operate on i128 types.
llvm-svn: 47101
2008-02-13 23:13:32 +00:00
Dan Gohman 95d25d39d0 Avoid setting bits that aren't demanded.
llvm-svn: 47098
2008-02-13 22:43:25 +00:00
Dan Gohman e1d9ee66ed Simplify some logic in ComputeMaskedBits. And change ComputeMaskedBits
to pass the mask APInt by value, not by reference. 

llvm-svn: 47096
2008-02-13 22:28:48 +00:00
Duncan Sands f8d29f228d Teach LegalizeTypes how to expand and promote CTLZ,
CTTZ and CTPOP.  The expansion code differs from
that in LegalizeDAG in that it chooses to take the
CTLZ/CTTZ count from the Hi/Lo part depending on
whether the Hi/Lo value is zero, not on whether
CTLZ/CTTZ of Hi/Lo returned 32 (or whatever the
width of the type is) for it.  I made this change
because the optimizers may well know that Hi/Lo
is zero and exploit it.  The promotion code for
CTTZ also differs from that in LegalizeDAG: it
uses an "or" to get the right result when the
original value is zero, rather than using a compare
and select.  This also means the value doesn't
need to be zero extended.

llvm-svn: 47075
2008-02-13 18:01:53 +00:00
Chris Lattner a08af08a88 In SDISel, for targets that support FORMAL_ARGUMENTS nodes, lower this
node as soon as we create it in SDISel.  Previously we would lower it in
legalize.  The problem with this is that it only exposes the argument
loads implied by FORMAL_ARGUMENTs after legalize, so that only dag combine 2
can hack on them.  This causes us to miss some optimizations because 
datatype expansion also happens here.

Exposing the loads early allows us to do optimizations on them.  For example
we now compile arg-cast.ll to:

_foo:
	movl	$2147483647, %eax
	andl	8(%esp), %eax
	ret

where we previously produced:

_foo:
	subl	$12, %esp
	movsd	16(%esp), %xmm0
	movsd	%xmm0, (%esp)
	movl	$2147483647, %eax
	andl	4(%esp), %eax
	addl	$12, %esp
	ret

It might also make sense to do this for ISD::CALL nodes, which have implicit
stores on many targets.

llvm-svn: 47054
2008-02-13 07:39:09 +00:00
Chris Lattner ee322b44a4 teach dag combiner how to eliminate MERGE_VALUES nodes.
llvm-svn: 47052
2008-02-13 07:25:05 +00:00
Nate Begeman 735ab3ce67 Support legalizing insert_vector_elt on targets where the element
type is not legal.

llvm-svn: 47048
2008-02-13 06:43:04 +00:00
Dan Gohman f990faf23b Convert SelectionDAG::ComputeMaskedBits to use APInt instead of uint64_t.
Add an overload that supports the uint64_t interface for use by clients
that haven't been updated yet.

llvm-svn: 47039
2008-02-13 00:35:47 +00:00
Duncan Sands f213e82bc5 Generalize getCopyFromParts and getCopyToParts to
handle arbitrary precision integers and any number
of parts.  For example, on a 32 bit machine an i50
corresponds to two i32 parts.  getCopyToParts will
extend the i50 to an i64 then write half of the i64
to each part; getCopyFromParts will combine the two
i32 parts into an i64 then truncate the result to
i50.

llvm-svn: 47024
2008-02-12 20:46:31 +00:00
Duncan Sands a6ab6e7adb Generalize the handling of call and return arguments,
in preparation for apint support.  These changes are
intended to have no functional effect.

llvm-svn: 46967
2008-02-11 20:58:28 +00:00
Dan Gohman 11f6212bc0 From Chris' review: use isa instead of explicitly using classof.
llvm-svn: 46964
2008-02-11 19:00:34 +00:00
Dan Gohman 991056808b From Chris' review: minor corrections in comments.
llvm-svn: 46963
2008-02-11 19:00:03 +00:00
Dan Gohman 54d3b5a1f5 From Chris' review: use cast instead of dyn_cast with an assert.
llvm-svn: 46962
2008-02-11 18:58:42 +00:00