Commit Graph

892 Commits

Author SHA1 Message Date
Evan Cheng 3f40c69083 On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16.
llvm-svn: 51019
2008-05-13 00:54:02 +00:00
Evan Cheng b980f6fb3d Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
llvm-svn: 51008
2008-05-12 23:04:07 +00:00
Dale Johannesen e6942c31ea New test for tail merging
llvm-svn: 51007
2008-05-12 22:59:44 +00:00
Evan Cheng 71b9afb053 When transforming a vector_shuffle to a load, the base address must not be an undef.
llvm-svn: 50940
2008-05-10 06:46:49 +00:00
Evan Cheng 9c4d685165 Add nounwind.
llvm-svn: 50931
2008-05-10 02:22:25 +00:00
Evan Cheng bec201fa06 If all sources of a PHI node are defined by an implicit_def, just emit an implicit_def instead of a copy.
llvm-svn: 50927
2008-05-10 00:17:50 +00:00
Evan Cheng 867af2678f Add a pattern to do move the low element of a v4f32 and zero extend the rest.
llvm-svn: 50922
2008-05-09 23:37:55 +00:00
Evan Cheng 961339bbdb Handle a few more cases of folding load i64 into xmm and zero top bits.
Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch.

llvm-svn: 50918
2008-05-09 21:53:03 +00:00
Evan Cheng 0352e63e39 Simplify test.
llvm-svn: 50911
2008-05-09 19:56:32 +00:00
Evan Cheng 0360ecbec1 Use movq to move low half of XMM register and zero-extend the rest.
llvm-svn: 50874
2008-05-08 22:35:02 +00:00
Evan Cheng 78af38c392 Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine.
llvm-svn: 50838
2008-05-08 00:57:18 +00:00
Evan Cheng 0d6311d46c Add nounwind.
llvm-svn: 50837
2008-05-07 22:59:08 +00:00
Evan Cheng 7ca4a67ca1 Yet another nasty spiller bug.
%ecx = op
store %cl<kill>, (addr)
(addr) = op %al

It's not safe to unfold the last operand and eliminate store even though %cl is marked kill. It's a sub-register use which means one of its super-register(s) may be used below.

llvm-svn: 50794
2008-05-07 00:49:28 +00:00
Anton Korobeynikov f5d2c3b45a Use target triple in tests, not 'realign-stack=0' option. Per request.
llvm-svn: 50778
2008-05-06 23:09:29 +00:00
Evan Cheng ef3faa1b1c Fix PR2287. Darwin passes mmx values in register in 64-mode, not Linux.
llvm-svn: 50716
2008-05-06 07:23:50 +00:00
Mon P Wang 3e58393c3d Added addition atomic instrinsics and, or, xor, min, and max.
llvm-svn: 50663
2008-05-05 19:05:59 +00:00
Chris Lattner 9c0c60d080 no need for eh info
llvm-svn: 50658
2008-05-05 18:24:33 +00:00
Dan Gohman bcde172222 Add AsmPrinter support for emitting a directive to declare that
the code being generated does not require an executable stack.

Also, add target-specific code to make use of this on Linux
on x86. 

llvm-svn: 50634
2008-05-05 00:28:39 +00:00
Evan Cheng d9481366e3 Select vector shift with non-immediate i32 shift amount operand by first moving the operand into the right register.
llvm-svn: 50619
2008-05-04 09:15:50 +00:00
Evan Cheng cdf22f2953 Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allow us to simplify the horribly complicated matching code.
llvm-svn: 50601
2008-05-03 00:52:09 +00:00
Chris Lattner 34931afff7 specify an arch for non-x86 hosts.
llvm-svn: 50576
2008-05-02 15:11:58 +00:00
Bill Wendling fb4191cc04 Adding testcase.
llvm-svn: 50536
2008-05-01 18:41:09 +00:00
Chris Lattner d4b2a67cf3 don't randomly miscompile seto/setuo just because we are in
ffastmath mode.  This fixes rdar://5902801, a miscompilation
of gcc.dg/builtins-8.c.

Bill, please pull this into Tak.

llvm-svn: 50523
2008-05-01 07:26:11 +00:00
Arnold Schwaighofer 68988d13f6 Really commit the test checking the argument lowering behaviour on x86-64 :).
llvm-svn: 50478
2008-04-30 09:19:47 +00:00
Arnold Schwaighofer be0de34ede Tail call optimization improvements:
Move platform independent code (lowering of possibly overwritten
arguments, check for tail call optimization eligibility) from
target X86ISelectionLowering.cpp to TargetLowering.h and
SelectionDAGISel.cpp.

Initial PowerPC tail call implementation:

Support ppc32 implemented and tested (passes my tests and
test-suite llvm-test).  
Support ppc64 implemented and half tested (passes my tests).
On ppc tail call optimization is performed if 
  caller and callee are fastcc
  call is a tail call (in tail call position, call followed by ret)
  no variable argument lists or byval arguments
  option -tailcallopt is enabled
Supported:
 * non pic tail calls on linux/darwin
 * module-local tail calls on linux(PIC/GOT)/darwin(PIC)
 * inter-module tail calls on darwin(PIC)
If constraints are not met a normal call will be emitted.

A test checking the argument lowering behaviour on x86-64 was added.

llvm-svn: 50477
2008-04-30 09:16:33 +00:00
Chris Lattner 5c88f7b1ad make the vector conversion magic handle multiple results.
We now compile test2/test3 to:

_test2:
	## InlineAsm Start
	set %xmm0, %xmm1
	## InlineAsm End
	addps	%xmm1, %xmm0
	ret
_test3:
	## InlineAsm Start
	set %xmm0, %xmm1
	## InlineAsm End
	paddd	%xmm1, %xmm0
	ret

as expected.

llvm-svn: 50389
2008-04-29 04:48:56 +00:00
Chris Lattner f9a49c4322 add support for multiple return values in inline asm. This is a step
towards PR2094.  It now compiles the attached .ll file to:

_sad16_sse2:
	movslq	%ecx, %rax
	## InlineAsm Start
	%ecx %rdx %rax %rax %r8d %rdx %rsi
	## InlineAsm End
	## InlineAsm Start
	set %eax
	## InlineAsm End
	ret

which is pretty decent for a 3 output, 4 input asm.

llvm-svn: 50386
2008-04-29 04:29:54 +00:00
Evan Cheng 11b98b6612 Another extract_subreg coalescing bug.
e.g.
vr1024<2> extract_subreg vr1025, 2
If vr1024 do not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64.

llvm-svn: 50385
2008-04-29 01:41:44 +00:00
Evan Cheng 73c3b4741a Add -march=x86.
llvm-svn: 50380
2008-04-28 23:31:41 +00:00
Dan Gohman c73845b387 Update and_ops.ll according to the recent dagcombiner changes.
Add a new test, and_ops_more.ll, which is XFAIL'd, to
record the parts of and_ops.ll that were affected by this
change.

llvm-svn: 50379
2008-04-28 23:26:22 +00:00
Evan Cheng 315e3cb9a3 Test case.
llvm-svn: 50377
2008-04-28 22:14:34 +00:00
Chris Lattner 2237973438 Implement a signficant optimization for inline asm:
When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible.  This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:

void test () {
  asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}

Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.

Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??

Incidentally, this was the todo in 
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll

Please do NOT pull this into Tak.

llvm-svn: 50315
2008-04-27 00:37:18 +00:00
Nate Begeman 98f0898e42 Feedback from chris
llvm-svn: 50305
2008-04-25 21:47:35 +00:00
Nate Begeman f10b493fc0 Add a testcase for the recent "handle variable vector insert elt in mem" patch
llvm-svn: 50303
2008-04-25 21:26:59 +00:00
Evan Cheng 402572a149 Update tests.
llvm-svn: 50293
2008-04-25 20:13:47 +00:00
Evan Cheng ccde6dd016 Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers.
llvm-svn: 50289
2008-04-25 19:11:04 +00:00
Dan Gohman ca95a5f49f Remove the code from CodeGenPrepare that moved getresult instructions
to the block that defines their operands. This doesn't work in the
case that the operand is an invoke, because invoke is a terminator
and must be the last instruction in a block.

Replace it with support in SelectionDAGISel for copying struct values
into sequences of virtual registers.

llvm-svn: 50279
2008-04-25 18:27:55 +00:00
Evan Cheng df38b35a1e MMX argument passing fixes:
On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].                                                                                                                                      
On Darwin / Linux x86-32, v1i64 values are passed in memory.                                                                                                                                                    
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].                                                                                                                                     
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.

llvm-svn: 50257
2008-04-25 07:56:45 +00:00
Chris Lattner 741c7a3b49 Loosen up an assertion to allow intrinsics. I really have no
idea what this code (findNonImmUse) does, so I'm only guessing 
that this is the right thing.  It would be really really nice
if this had comments and perhaps switched to SmallPtrSet
(hint hint) :)

This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c

llvm-svn: 50252
2008-04-25 05:13:01 +00:00
Evan Cheng 9165e165dc Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.
llvm-svn: 50239
2008-04-25 00:26:43 +00:00
Evan Cheng a42d24003d New test.
llvm-svn: 50229
2008-04-24 20:01:58 +00:00
Dan Gohman b418aafabf Add support to codegen for getresult instructions with undef operands.
llvm-svn: 50180
2008-04-23 20:21:29 +00:00
Anton Korobeynikov dd4ef2e30c Disable stack realignment for these tests
llvm-svn: 50172
2008-04-23 18:25:44 +00:00
Anton Korobeynikov c3ada5c9c4 Fix test becase ABI stack alignment dropped to 'normal' value
llvm-svn: 50171
2008-04-23 18:25:16 +00:00
Anton Korobeynikov 955a8a9101 Fix test, instruction count is valid only if stack is not realigned
llvm-svn: 50170
2008-04-23 18:24:48 +00:00
Dan Gohman f166d2d0d6 Implement an x86-64 ABI detail of passing structs by hidden first
argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.

Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.

llvm-svn: 50075
2008-04-21 23:59:07 +00:00
Chris Lattner 470ab00c76 A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2.
llvm-svn: 49986
2008-04-20 05:52:46 +00:00
Chris Lattner a124f1e219 Not all x86-64 machines have sse3 apparently.
llvm-svn: 49985
2008-04-20 05:47:56 +00:00
Chris Lattner 50fb77f829 rename *.llx -> *.ll
llvm-svn: 49970
2008-04-19 22:29:10 +00:00
Evan Cheng 5102bd9359 64-bit atomic operations.
llvm-svn: 49949
2008-04-19 02:30:38 +00:00