Commit Graph

645 Commits

Author SHA1 Message Date
Evan Cheng 87fbd66f9f Fix PR1975: dag isel emitter produces patterns that isel wrong flag result.
llvm-svn: 46776
2008-02-05 22:50:29 +00:00
Evan Cheng 8d78b0597b If a vr is already marked alive in a bb, then it has PHI uses that are visited earlier, then it is not killed in the def block (i.e. not dead).
llvm-svn: 46763
2008-02-05 20:04:18 +00:00
Duncan Sands 3342d4083f Crashes LegalizeTypes with "Do not know how to
expand the result of this operator!" (node: ctlz).

llvm-svn: 46713
2008-02-04 18:07:02 +00:00
Duncan Sands ff1a444879 Crashes LegalizeTypes with "Do not know how to split
this operator's operand" (node: extract_subvector).

llvm-svn: 46712
2008-02-04 18:05:42 +00:00
Chris Lattner 69f90ccb17 remove target triple to make this test more "generic"
llvm-svn: 46711
2008-02-04 18:02:37 +00:00
Duncan Sands 331cd706f5 Crashed the new type legalizer. Not likely to catch
any bugs in the future since to get the crash you also
need hacked in fake libcall support (which creates odd
but legal trees), but since adding it doesn't hurt...
Thanks to Chris for this ultimately reduced version.

llvm-svn: 46706
2008-02-04 09:40:27 +00:00
Lauro Ramos Venancio 192c07b727 CBackend: Implement unaligned load/store.
llvm-svn: 46646
2008-02-01 21:25:59 +00:00
Chris Lattner f4e5e556fd Add target triples to these so they don't fail on linux.
llvm-svn: 46496
2008-01-29 06:26:07 +00:00
Scott Michel ceae3bbf4d Overhaul Cell SPU's addressing mode internals so that there are now
only two addressing mode nodes, SPUaform and SPUindirect (vice the
three previous ones, SPUaform, SPUdform and SPUxform). This improves
code somewhat because we now avoid using reg+reg addressing when
it can be avoided. It also simplifies the address selection logic,
which was the main point for doing this.

Also, for various global variables that would be loaded using SPU's
A-form addressing, prefer D-form offs[reg] addressing, keeping the
base in a register if the variable is used more than once.

llvm-svn: 46483
2008-01-29 02:16:57 +00:00
Chris Lattner 34d6b6a319 Update this test. Due to dag combiner improvements, we now compile
f7/f11 to:

_f7:
	eor r0, r0, #2, 2 @ -2147483648
	bx lr
_f11:
	bic r0, r0, #2, 2 @ -2147483648
	bx lr

instead of:

_f7:
	fmsr s0, r0
	fnegs s0, s0
	fmrs r0, s0
	bx lr

_f11:
	fmsr s0, r0
	fabss s0, s0
	fmrs r0, s0
	bx lr

llvm-svn: 46423
2008-01-27 23:26:37 +00:00
Chris Lattner 888560d62c Implement some dag combines that allow doing fneg/fabs/fcopysign in integer
registers if used by a bitconvert or using a bitconvert.  This allows us to
avoid constant pool loads and use cheaper integer instructions when the
values come from or end up in integer regs anyway.  For example, we now 
compile CodeGen/X86/fp-in-intregs.ll to:

_test1:
	movl	$2147483648, %eax
	xorl	4(%esp), %eax
	ret
_test2:
	movl	$1065353216, %eax
	orl	4(%esp), %eax
	andl	$3212836864, %eax
	ret

Instead of:
_test1:
	movss	4(%esp), %xmm0
	xorps	LCPI2_0, %xmm0
	movd	%xmm0, %eax
	ret
_test2:
	movss	4(%esp), %xmm0
	andps	LCPI3_0, %xmm0
	movss	LCPI3_1, %xmm1
	andps	LCPI3_2, %xmm1
	orps	%xmm0, %xmm1
	movd	%xmm1, %eax
	ret

bitconverts can happen due to various calling conventions that require
fp values to passed in integer regs in some cases, e.g. when returning
a complex.

llvm-svn: 46414
2008-01-27 17:42:27 +00:00
Chris Lattner 596704405f New test to verify that "merging 4 loads into a vec load" continues to work and
continues to infer alignment info.

llvm-svn: 46403
2008-01-26 20:06:45 +00:00
Chris Lattner e30e33af4f Infer alignment of loads and increase their alignment when we can tell they are
from the stack.  This allows us to compile stack-align.ll to:

_test:
	movsd	LCPI1_0, %xmm0
	movapd	%xmm0, %xmm1
***	andpd	4(%esp), %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

instead of:

_test:
	movsd	LCPI1_0, %xmm0
**	movsd	4(%esp), %xmm1
**	andpd	%xmm0, %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

llvm-svn: 46401
2008-01-26 19:45:50 +00:00
Chris Lattner 364963d41c remove a useless xfailed test.
llvm-svn: 46400
2008-01-26 19:35:46 +00:00
Bill Wendling 1a17ef02c8 If there's no instructions being emitted on X86 for a function, emit a
nop. Emit the nop directly for PPC.

llvm-svn: 46398
2008-01-26 09:03:52 +00:00
Bill Wendling a60c61dc1a Need to convert to LLVM code and not C.
llvm-svn: 46397
2008-01-26 06:56:08 +00:00
Bill Wendling 0b973210f8 Rename the .c to .ll
llvm-svn: 46396
2008-01-26 06:53:40 +00:00
Bill Wendling 0f69974fdb Move testcase to the code gen directory.
llvm-svn: 46395
2008-01-26 06:53:06 +00:00
Chris Lattner 31e9edce1c Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to
delete a node even if it was not dead in some cases.  Instead, just add it to
the worklist.  Also, make sure to use the CombineTo methods, as it was doing
things that were unsafe: the top level combine loop could touch dangling memory.

This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll

llvm-svn: 46384
2008-01-26 01:09:19 +00:00
Chris Lattner 84ab724e06 Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows
us to compile:

double test(double X) {
  return copysign(0.0, X);
}

into:

_test:
	andpd	LCPI1_0(%rip), %xmm0
	ret

instead of:
_test:
	pxor	%xmm1, %xmm1
	andpd	LCPI1_0(%rip), %xmm1
	movapd	%xmm0, %xmm2
	andpd	LCPI1_1(%rip), %xmm2
	movapd	%xmm1, %xmm0
	orpd	%xmm2, %xmm0
	ret

llvm-svn: 46344
2008-01-25 05:46:26 +00:00
Chris Lattner a91f77eaac Significantly simplify and improve handling of FP function results on x86-32.
This case returns the value in ST(0) and then has to convert it to an SSE
register.  This causes significant codegen ugliness in some cases.  For 
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:

_bar:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.

Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always 
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.

This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case.  This gives 
us this code for the example above:

_bar:
	subl	$12, %esp
	call	L_foo$stub
	addl	$12, %esp
	ret

The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert).  This is gross, but
less gross than the code it is replacing :)

This also allows us to generate better code in several other cases.  For 
example on fp-stack-ret-conv.ll, we now generate:

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstps	8(%esp)
	movl	16(%esp), %eax
	cvtss2sd	8(%esp), %xmm0
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

where before we produced (incidentally, the old bad code is identical to what
gcc produces):

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	cvtsd2ss	(%esp), %xmm0
	cvtss2sd	%xmm0, %xmm0
	movl	16(%esp), %eax
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

Note that we generate slightly worse code on pr1505b.ll due to a scheduling 
deficiency that is unrelated to this patch.

llvm-svn: 46307
2008-01-24 08:07:48 +00:00
Chris Lattner 001d781c41 take these with a pr #
llvm-svn: 46303
2008-01-24 06:35:44 +00:00
Evan Cheng 35abd840a6 Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type.
llvm-svn: 46286
2008-01-23 23:17:41 +00:00
Evan Cheng 1e0d4d2aa8 SSE varargs arguments are passed in memory.
llvm-svn: 46262
2008-01-22 23:26:53 +00:00
Dale Johannesen 5c94cb3596 Implement flt_rounds for PowerPC.
llvm-svn: 46174
2008-01-18 19:55:37 +00:00
Chris Lattner 1b35211fcc remove extraneous &&'s from tests, as Scott is apparently not going to.
llvm-svn: 46173
2008-01-18 19:53:43 +00:00
Dale Johannesen 4768c3c9b6 Test is correct again for the moment.
llvm-svn: 46172
2008-01-18 19:53:31 +00:00
Chris Lattner f5b46f7dad Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to:
_test:
	fctiwz f0, f1
	stfiwx f0, 0, r4
	blr 

instead of:

_test:
	fctiwz f0, f1
	stfd f0, -8(r1)
	nop
	nop
	lwz r2, -4(r1)
	stb r2, 0(r4)
	blr 

The former is not correct (stores 4 bytes, not 1).

llvm-svn: 46161
2008-01-18 16:54:56 +00:00
Scott Michel e4d3e3c0e7 Forward progress: crtbegin.c now compiles successfully!
Fixed CellSPU's A-form (local store) address mode, so that all globals,
externals, constant pool and jump table symbols are now wrapped within
a SPUISD::AFormAddr pseudo-instruction. This now identifies all local
store memory addresses, although it requires a bit of legerdemain during
instruction selection to properly select loads to and stores from local
store, properly generating "LQA" instructions.

Also added mul_ops.ll test harness for exercising integer multiplication.

llvm-svn: 46142
2008-01-17 20:38:41 +00:00
Chris Lattner 1ea55cf816 This commit changes:
1. Legalize now always promotes truncstore of i1 to i8. 
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
   X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
   safe.

The later allows us to compile CodeGen/X86/storetrunc-fp.ll to:

_foo:
	fldt	20(%esp)
	fldt	4(%esp)
	faddp	%st(1)
	movl	36(%esp), %eax
	fstps	(%eax)
	ret

instead of:

_foo:
	subl	$4, %esp
	fldt	24(%esp)
	fldt	8(%esp)
	faddp	%st(1)
	fstps	(%esp)
	movl	40(%esp), %eax
	movss	(%esp), %xmm0
	movss	%xmm0, (%eax)
	addl	$4, %esp
	ret

llvm-svn: 46140
2008-01-17 19:59:44 +00:00
Chris Lattner 9f7fed1c1b new testcase.
llvm-svn: 46139
2008-01-17 19:47:23 +00:00
Chris Lattner 89126bde19 add testcase that has been sitting in my tree for awhile.
llvm-svn: 46124
2008-01-17 06:54:09 +00:00
Evan Cheng 54c20b559e When a live virtual register is being clobbered by an implicit def, it is spilled
and the spill is its kill. However, if the local allocator has determined the
register has not been modified (possible when its value was reloaded), it would
not issue a restore. In that case, mark the last use of the virtual register as
kill.

llvm-svn: 46111
2008-01-17 02:08:17 +00:00
Evan Cheng 7be1528004 Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0.
It's not safe to use the two value CombineTo variant to combine away a dead load.
e.g. 
v1, chain2 = load chain1, loc
v2, chain3 = load chain2, loc
v3         = add v2, c 
Now we replace use of v1 with undef, use of chain2 with chain1.
ReplaceAllUsesWith() will iterate through uses of the first load and update operands:
v1, chain2 = load chain1, loc
v2, chain3 = load chain1, loc
v3         = add v2, c 
Now the second load is the same as the first load, SelectionDAG cse will ensure
the use of second load is replaced with the first load.
v1, chain2 = load chain1, loc
v3         = add v1, c
Then v1 is replaced with undef and bad things happen.

llvm-svn: 46099
2008-01-16 23:11:54 +00:00
Duncan Sands 32b0ff6814 Trampoline support for x86-64. This looks like
it should work, but I have no machine to test
it on.  Committed because it will at least
cause no harm, and maybe someone can test it
for me!

llvm-svn: 46098
2008-01-16 22:55:25 +00:00
Chris Lattner aebbe4700a add testcase for regression
llvm-svn: 46073
2008-01-16 18:03:52 +00:00
Chris Lattner 6e3379c07b make sure to use a cpu that has sse.
llvm-svn: 46060
2008-01-16 06:32:02 +00:00
Chris Lattner 8f7cec859e My previous commit had an incomplete message, it should have been:
make the 'fp return in ST(0)' optimization smart enough to
look through token factor nodes.  THis allows us to compile 
testcases like CodeGen/X86/fp-stack-retcopy.ll into:

_carg:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	fldl	(%esp)
	addl	$12, %esp
	ret

instead of:

_carg:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

Still not optimal, but much better and this is a trivial patch.  Fixing 
the rest requires invasive surgery that is is not llvm 2.2 material.

llvm-svn: 46054
2008-01-16 05:56:59 +00:00
Chris Lattner 915ec14073 verify x86 generates ud2 for llvm.trap
llvm-svn: 46023
2008-01-15 22:22:02 +00:00
Chris Lattner 50baecd31e new testcase for llvm.trap.
llvm-svn: 46020
2008-01-15 22:17:26 +00:00
Scott Michel a8f67e04bd More CellSPU refinements:
- struct_2.ll: Completely unaligned load/store testing

- call_indirect.ll, struct_1.ll: Add test lines to exercise
   X-form [$reg($reg)] addressing

At this point, loads and stores should be under control (he says
in an optimistic tone of voice.)

llvm-svn: 45882
2008-01-11 21:01:19 +00:00
Dale Johannesen 04b99780cf Disable for now.
llvm-svn: 45881
2008-01-11 20:47:33 +00:00
Scott Michel 8d5841ae3c More CellSPU refinement and progress:
- Cleaned up custom load/store logic, common code is now shared [see note
  below], cleaned up address modes

- More test cases: various intrinsics, structure element access (load/store
  test), updated target data strings, indirect function calls.

Note: This patch contains a refactoring of the LoadSDNode and StoreSDNode
structures: they now share a common base class, LSBaseSDNode, that
provides an interface to their common functionality. There is some hackery
to access the proper operand depending on the derived class; otherwise,
to do a proper job would require finding and rearranging the SDOperands
sent to StoreSDNode's constructor. The current refactor errs on the
side of being conservatively and backwardly compatible while providing
functionality that reduces redundant code for targets where loads and
stores are custom-lowered.

llvm-svn: 45851
2008-01-11 02:53:15 +00:00
Duncan Sands 53c954fa86 Output sinl for a long double FSIN node, not sin.
Likewise fix up a bunch of other libcalls.  While
there I remove NEG_F32 and NEG_F64 since they are
not used anywhere.  This fixes 9 Ada ACATS failures.

llvm-svn: 45833
2008-01-10 10:28:30 +00:00
Evan Cheng 0f8c7c4a73 Codegen improvement has reduced one spill.
llvm-svn: 45814
2008-01-10 02:54:40 +00:00
Chris Lattner e34d7d0e24 new testcase for PR1845
llvm-svn: 45795
2008-01-10 00:30:38 +00:00
Evan Cheng 0e400d4cb7 Special copy SUnit's do not have SDNode's.
llvm-svn: 45787
2008-01-09 23:01:55 +00:00
Evan Cheng a31824a08e Fix sse2.psrl.w and sse2.psrl.q definitions.
llvm-svn: 45772
2008-01-09 02:16:44 +00:00
Chris Lattner 51b01bf8a5 Make load->store deletion a bit smarter. This allows us to compile this:
void test(long long *P) { *P ^= 1; }

into just:

_test:
	movl	4(%esp), %eax
	xorl	$1, (%eax)
	ret

instead of code like this:

_test:
	movl	4(%esp), %ecx
        xorl    $1, (%ecx)
	movl	4(%ecx), %edx
	movl	%edx, 4(%ecx)
	ret

llvm-svn: 45762
2008-01-08 23:08:06 +00:00
Duncan Sands 7b1460cca4 Crashes llc when using Chris's new legalization logic.
llvm-svn: 45758
2008-01-08 21:51:53 +00:00