Commit Graph

3318 Commits

Author SHA1 Message Date
Evan Cheng 2034d9f2da - Clean up some crappy code which deals with coalescing of copies which look at
extract_subreg / insert_subreg, etc.
- Add support for more aggressive insert_subreg coalescing.

llvm-svn: 101971
2010-04-21 00:44:22 +00:00
Dan Gohman ad33d33719 Add another variant of this test which found a place where
CodeGen's ComputeMaskedBits was being over-conservative when computing
bits for an ADD.

llvm-svn: 101963
2010-04-21 00:19:28 +00:00
Chris Lattner 84776786a7 teach the x86 address matching stuff to handle
(shl (or x,c), 3) the same as (shl (add x, c), 3)
when x doesn't have any bits from c set.

This finishes off PR1135.  Before we compiled the block to:
to:

LBB0_3:                                 ## %bb
	cmpb	$4, %dl
	sete	%dl
	addb	%dl, %cl
	movb	%cl, %dl
	shlb	$2, %dl
	addb	%r8b, %dl
	shlb	$2, %dl
	movzbl	%dl, %edx
	movl	%esi, (%rdi,%rdx,4)
	leaq	2(%rdx), %r9
	movl	%esi, (%rdi,%r9,4)
	leaq	1(%rdx), %r9
	movl	%esi, (%rdi,%r9,4)
	addq	$3, %rdx
	movl	%esi, (%rdi,%rdx,4)
	incb	%r8b
	decb	%al
	movb	%r8b, %dl
	jne	LBB0_1

Now we produce:

LBB0_3:                                 ## %bb
	cmpb	$4, %dl
	sete	%dl
	addb	%dl, %cl
	movb	%cl, %dl
	shlb	$2, %dl
	addb	%r8b, %dl
	shlb	$2, %dl
	movzbl	%dl, %edx
	movl	%esi, (%rdi,%rdx,4)
	movl	%esi, 8(%rdi,%rdx,4)
	movl	%esi, 4(%rdi,%rdx,4)
	movl	%esi, 12(%rdi,%rdx,4)
	incb	%r8b
	decb	%al
	movb	%r8b, %dl
	jne	LBB0_1

llvm-svn: 101958
2010-04-20 23:18:40 +00:00
Bill Wendling a8ae1783b4 Move CodeGen/X86/2010-04-19-DAGCombineCrash.ll into CodeGen/X86/crash.ll. Also
reduce.

llvm-svn: 101925
2010-04-20 18:14:47 +00:00
Chris Lattner 5100367ff3 Bill's change in r95336 broke empty aggregates embedded
in other types.  fix this by only bumping zero-byte globals
up to a single byte if the *entire global* is zero size,
fixing PR6340.

This also fixes empty arrays etc to be handled correctly,
and only does this on subsection-via-symbols targets (aka
darwin) which is the only place where this matters.

llvm-svn: 101879
2010-04-20 06:20:21 +00:00
Chris Lattner 38c1a1a247 teach cellspu how to return i8 and i16 from calls,
patch by Kalle Raiskila!

llvm-svn: 101875
2010-04-20 05:36:09 +00:00
Bill Wendling 467e6c2deb The visitXOR method can return the same SDNode. If so, we don't want to delete
it as it's not dead.

llvm-svn: 101855
2010-04-20 01:25:01 +00:00
Bob Wilson 92a4685dd2 Fix tests for Neon load/store intrinsics to match the i8* types expected by
the intrinsics.  The reason for those i8* types is that the intrinsics are
overloaded on the vector type and we don't have a way to declare an intrinsic
where one argument is an overloaded vector type and another argument is a
pointer to the vector element type.  The bitcasts added here will match what
the frontend will typically generate when these intrinsics are used.

llvm-svn: 101840
2010-04-20 00:17:16 +00:00
Nick Lewycky fbe8d2803d Fix declarations in a few more tests.
llvm-svn: 101676
2010-04-17 21:29:25 +00:00
Chris Lattner 0a8d91a816 fix PR6332, allowing an index of zero into a zero sized array
even if the element of the array has no size.

llvm-svn: 101662
2010-04-17 19:02:33 +00:00
Dan Gohman 4fee6f3bdd Start function numbering at 0.
llvm-svn: 101638
2010-04-17 16:29:15 +00:00
Evan Cheng 3af19e80c9 Add nounwind.
llvm-svn: 101613
2010-04-17 03:43:36 +00:00
Jakob Stoklund Olesen dc6d42dbf8 Add test case for machine-sink on critical edges
llvm-svn: 101416
2010-04-15 23:19:16 +00:00
Evan Cheng f7f97b4bbd Use default lowering of DYNAMIC_STACKALLOC. As far as I can tell, ARM isle is doing the right thing and codegen looks correct for both Thumb and Thumb2.
llvm-svn: 101410
2010-04-15 22:20:34 +00:00
Jakob Stoklund Olesen b642a27525 Fix PR6847. RegScavenger should ignore DebugValues.
llvm-svn: 101392
2010-04-15 20:28:39 +00:00
Evan Cheng 1ba1428577 ARM SelectDYN_ALLOC should emit a copy from SP rather than referencing SP directly. In cases where there are two dyn_alloc in the same BB it would have caused the old SP value to be reused and badness ensues. rdar://7493908
llvm is generating poor code for dynamic alloca, I'll fix that later.

llvm-svn: 101383
2010-04-15 18:42:28 +00:00
Chris Lattner 3245afdf05 enhance the load/store narrowing optimization to handle a
tokenfactor in between the load/store.  This allows us to 
optimize test7 into:

_test7:                                 ## @test7
## BB#0:                                ## %entry
	movl	(%rdx), %eax
                                        ## kill: SIL<def> ESI<kill>
	movb	%sil, 5(%rdi)
	ret

instead of:

_test7:                                 ## @test7
## BB#0:                                ## %entry
	movl	4(%esp), %ecx
	movl	$-65281, %eax           ## imm = 0xFFFFFFFFFFFF00FF
	andl	4(%ecx), %eax
	movzbl	8(%esp), %edx
	shll	$8, %edx
	addl	%eax, %edx
	movl	12(%esp), %eax
	movl	(%eax), %eax
	movl	%edx, 4(%ecx)
	ret

llvm-svn: 101355
2010-04-15 06:10:49 +00:00
Chris Lattner 6ebd8674eb teach codegen to turn trunc(zextload) into load when possible.
This doesn't occur much at all, it only seems to formed in the case
when the trunc optimization kicks in due to phase ordering.  In that
case it is saves a few bytes on x86-32.

llvm-svn: 101350
2010-04-15 05:40:59 +00:00
Chris Lattner f9b2e3c68a add a simple dag combine to replace trivial shl+lshr with
and.  This happens with the store->load narrowing stuff.

llvm-svn: 101348
2010-04-15 05:28:43 +00:00
Chris Lattner 4041ab6e00 Implement rdar://7860110 (also in target/readme.txt) narrowing
a load/or/and/store sequence into a narrower store when it is
safe.  Daniel tells me that clang will start producing this sort
of thing with bitfields, and this does  trigger a few dozen times
on 176.gcc produced by llvm-gcc even now.

This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll 
into:

        movl    %eax, 36(%rdi)

instead of:

        movl    $4294967295, %eax       ## imm = 0xFFFFFFFF
        andq    32(%rdi), %rax
        shlq    $32, %rcx
        addq    %rax, %rcx
        movq    %rcx, 32(%rdi)

and each of the testcases into a single store.  Each of them used
to compile into craziness like this:

_test4:
	movl	$65535, %eax            ## imm = 0xFFFF
	andl	(%rdi), %eax
	shll	$16, %esi
	addl	%eax, %esi
	movl	%esi, (%rdi)
	ret

llvm-svn: 101343
2010-04-15 04:48:01 +00:00
Chris Lattner 60bbb8c356 further tweak this to do something useful.
llvm-svn: 101341
2010-04-15 04:31:42 +00:00
Chris Lattner 9ebaf531ab remove undef control flow.
llvm-svn: 101340
2010-04-15 04:30:19 +00:00
Jakob Stoklund Olesen 938f2ae310 Remove unneeded types from test.
llvm-svn: 101286
2010-04-14 20:56:09 +00:00
Bob Wilson c05b887c84 Don't custom lower bit converts to ARM VMOVDRRD or VMOVDRR when the operand
does not have a legal type.  The legalizer does not know how to handle those
nodes.  Radar 7854640.

llvm-svn: 101282
2010-04-14 20:45:23 +00:00
Evan Cheng 6c35893aa6 Add test for post-ra machine licm.
llvm-svn: 101182
2010-04-13 22:10:03 +00:00
Bob Wilson 699bdf7adf Handle a v2f64 formal parameter that is split between registers and memory
such that the entire second half is in memory.  Radar 7855014.

llvm-svn: 101181
2010-04-13 22:03:22 +00:00
Evan Cheng 4d89dd8353 Fix test on non-x86 hosts.
llvm-svn: 101163
2010-04-13 18:54:04 +00:00
Evan Cheng 4ca4bc6f95 Re-apply 101075 and fix it properly. Just reuse the debug info of the branch instruction being optimized. There is no need to --I which can deref off start of the BB.
llvm-svn: 101162
2010-04-13 18:50:27 +00:00
Eric Christopher d67f66dc0c Temporarily revert r101075, it's causing invalid iterator assertions
in a nightly tester.

llvm-svn: 101158
2010-04-13 18:37:58 +00:00
Chris Lattner 5b212a31a2 add llvm codegen support for -ffunction-sections and -fdata-sections,
patch by Sylvere Teissier!

llvm-svn: 101106
2010-04-13 00:36:43 +00:00
Evan Cheng d0d8e3343a Use .set expression for x86 pic jump table reference to reduce assembly relocation. rdar://7738756
llvm-svn: 101085
2010-04-12 23:07:17 +00:00
Bill Wendling caaf445a01 Third time's a charm...
llvm-svn: 101081
2010-04-12 22:43:21 +00:00
Bill Wendling 4fc5a4d8b8 Genericize the label test.
llvm-svn: 101079
2010-04-12 22:40:37 +00:00
Bill Wendling 4627917b9a Correct test to test what I mean it to test.
llvm-svn: 101077
2010-04-12 22:25:42 +00:00
Bill Wendling b02bbe416f Micro-optimization:
If we have this situation:

    jCC  L1
    jmp  L2
L1:
  ...
L2:
  ...

We can get a small performance boost by emitting this instead:

    jnCC L2
L1:
  ...
L2:
  ...

This testcase shows an example of this:

float func(float x, float y) {
    double product = (double)x * y;
    if (product == 0.0)
        return product;
    return product - 1.0;
}

llvm-svn: 101075
2010-04-12 22:19:57 +00:00
Evan Cheng 250283916d Enable post regalloc machine licm by default.
llvm-svn: 101023
2010-04-12 06:25:28 +00:00
Benjamin Kramer 7e4a475929 Make sure this test tests something.
llvm-svn: 100879
2010-04-09 19:03:31 +00:00
Bob Wilson 030591320d Add a testcase for svn r100568.
llvm-svn: 100876
2010-04-09 18:29:29 +00:00
Chris Lattner 1ef9826ff8 "On SPU, variables in the .bss section that are allocated with the .lcomm directive are not aligned on 16 byte boundaries. This causes misaligned loads, as the generated assembly assumes this "default" alignment.
this patch disables .lcomm in favour of '.local .comm'

Patch by Kalle Raisklia!

llvm-svn: 100875
2010-04-09 18:27:03 +00:00
Dan Gohman d23fa7d90d Merge a few fast-isel tests.
llvm-svn: 100860
2010-04-09 15:03:55 +00:00
Evan Cheng b083c47c21 Coalescer should not delete copy instructions whose defs are partially dead. e.g.
%RDI<def,dead> = MOV64rr %RAX<kill>, %EDI<imp-def>

llvm-svn: 100804
2010-04-08 20:02:37 +00:00
Evan Cheng ebe47c872f Avoid using f64 to lower memcpy from constant string. It's cheaper to use i32 store of immediates.
llvm-svn: 100751
2010-04-08 07:37:57 +00:00
Dan Gohman 4506539d84 When expanding expressions which are using post-inc mode for multiple loops,
ensure that the expansion is dominated by the increments of those loops.

llvm-svn: 100748
2010-04-08 05:57:57 +00:00
Chris Lattner 3ae2dd2ba5 add newlines at the end of files.
llvm-svn: 100705
2010-04-07 22:53:17 +00:00
Dan Gohman d006ab90dd Generalize IVUsers to track arbitrary expressions rather than expressions
explicitly split into stride-and-offset pairs. Also, add the
ability to track multiple post-increment loops on the same expression.

This refines the concept of "normalizing" SCEV expressions used for
to post-increment uses, and introduces a dedicated utility routine for
normalizing and denormalizing expressions.

This fixes the expansion of expressions which are post-increment users
of more than one loop at a time. More broadly, this takes LSR another
step closer to being able to reason about more than one loop at a time.

llvm-svn: 100699
2010-04-07 22:27:08 +00:00
Dale Johannesen f118f9788b Split big test into multiple directories to cater to
those who don't build all targets.

llvm-svn: 100688
2010-04-07 20:43:35 +00:00
Chris Lattner 2c88f8a8c4 this has a pr!
llvm-svn: 100637
2010-04-07 18:04:56 +00:00
Chris Lattner f839ee0c13 fix a latent bug my inline asm stuff exposed:
MachineOperand::isIdenticalTo wasn't handling metadata operands.

llvm-svn: 100636
2010-04-07 18:03:19 +00:00
Sanjiv Gupta 4948793041 Remove XFAIL for vg_leak as the leaks are fixed by 100601.
llvm-svn: 100612
2010-04-07 07:06:48 +00:00
Jakob Stoklund Olesen 41051a0bfe Don't try to collapse DomainValues onto an incompatible SSE domain.
This fixes the Bullet regression on i386/nocona.

llvm-svn: 100553
2010-04-06 19:48:56 +00:00
Evan Cheng b7a20ee5b5 Add nounwind.
llvm-svn: 100482
2010-04-05 22:30:05 +00:00
Dan Gohman 918a90a3ca Don't do code sinking on unreachable blocks. It's unprofitable and hazardous.
llvm-svn: 100455
2010-04-05 19:17:22 +00:00
Chris Lattner 4e4549deea resolve a fixme.
llvm-svn: 100346
2010-04-04 19:28:59 +00:00
Evan Cheng 61399375a2 Correctly lower memset / memcpy of undef. It should be a nop. PR6767.
llvm-svn: 100208
2010-04-02 19:36:14 +00:00
Dan Gohman 4bd755419f Revert the recent alignment changes. They're broken for -Os because,
in particular, they end up aligning strings at 16-byte boundaries, and
there's no way for GlobalOpt to check OptForSize.

llvm-svn: 100172
2010-04-02 03:04:37 +00:00
Evan Cheng 604bc162da After trivial coalescing, the MI being visited may have become a copy. Avoid adding it to CSE hash table since copies aren't being considered for CSE and they may be deleted.
rdar://7819990

llvm-svn: 100170
2010-04-02 02:21:24 +00:00
Dan Gohman 8ceeeb444e Remove this initializer so that the optimizer doesn't convert
unaligned loads into aligned loads.

llvm-svn: 100166
2010-04-02 01:26:13 +00:00
Dan Gohman ffb9c71174 Update this test for the new preferred alignment heuristics.
llvm-svn: 100165
2010-04-02 01:24:08 +00:00
Evan Cheng f997c31598 In 64-bit mode, use i64 to lower memcpy / memset instead of f64.
llvm-svn: 100137
2010-04-01 20:27:45 +00:00
Evan Cheng 4c014c892a - Avoid using floating point stores to implement memset unless the value is zero.
- Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has opaque type.

llvm-svn: 100118
2010-04-01 18:19:11 +00:00
Evan Cheng 1e8ee79957 Add -mcpu to memcpy / memset tests to ensure they behave the same on all hosts / targets.
llvm-svn: 100101
2010-04-01 08:25:26 +00:00
Evan Cheng 43cd9e3845 Fix sdisel memcpy, memset, memmove lowering:
1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless it's fast.
3. Fix some memcpy lowering logic bug related to when to optimize a
   load from constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
   ops more effectively.
rdar://7774704

llvm-svn: 100090
2010-04-01 06:04:33 +00:00
Jakob Stoklund Olesen 9986ba954c Replace V_SET0 with variants for each SSE execution domain.
llvm-svn: 99975
2010-03-31 00:40:13 +00:00
Jakob Stoklund Olesen 710c6892be Fix typo. Thank you, valgrind.
llvm-svn: 99974
2010-03-31 00:40:08 +00:00
Jakob Stoklund Olesen 19aa6f72a0 Not all platforms start symbols with _
llvm-svn: 99959
2010-03-30 23:12:48 +00:00
Jakob Stoklund Olesen 6f6ebb663c Enable -sse-domain-fix by default. Now with tests!
llvm-svn: 99954
2010-03-30 22:47:00 +00:00
Eric Christopher 6ad8167714 Remove the pmulld intrinsic and autoupdate it as a vector multiply.
Rewrite the pmulld patterns, and make sure that they fold in loads of
arguments into the instruction.

llvm-svn: 99910
2010-03-30 18:49:01 +00:00
Benjamin Kramer 0c1dcb083e XFAIL some PIC16 tests when running under valgrind-leaks. I don't expect these
to be fixed any time soon.

llvm-svn: 99888
2010-03-30 14:34:13 +00:00
Evan Cheng 742db6874a Fix PR4975. Avoid referencing empty vector.
llvm-svn: 99840
2010-03-29 21:27:30 +00:00
Chris Lattner f60c556b91 From Kalle Raiskila:
"the bigstack patch for SPU, with testcase. It is essentially the patch committed as 97091, and reverted as 97099, but with the following additions:
-in vararg handling, registers are marked to be live, to not confuse the register scavenger
-function prologue and epilogue are not emitted, if the stack size is 16. 16 means it is empty - there is only the register scavenger emergency spill slot, which is not used as there is no stack."

llvm-svn: 99819
2010-03-29 17:38:47 +00:00
Chris Lattner a787c9e23a teach tblgen to allow patterns like (add (i32 (bitconvert (i32 GPR))), 4),
transforming it into (add (i32 GPR), 4).  This allows us to write type
generic multi patterns and have tblgen automatically drop the bitconvert
in the case when the types align.  This allows us to fold an extra load
in the changed testcase.

llvm-svn: 99756
2010-03-28 08:38:32 +00:00
Chris Lattner 6bba2f3c69 add some nounwinds
llvm-svn: 99752
2010-03-28 07:58:37 +00:00
Chris Lattner 108667f3ec this takes an insane amount of time to run, disable it for now (PR6727)
llvm-svn: 99751
2010-03-28 07:58:09 +00:00
Evan Cheng 3365fb1412 Do not sibcall if stack needs to be dynamically aligned.
llvm-svn: 99620
2010-03-26 16:26:03 +00:00
Evan Cheng 00a620c61e Allow trivial sibcall of vararg callee when no arguments are being passed.
llvm-svn: 99598
2010-03-26 02:13:13 +00:00
Evan Cheng 7b4a1a221b Try trivial remat before the coalescer gives up on a vr / physreg coalescing for fear of tying up a physical register.
llvm-svn: 99575
2010-03-26 00:07:25 +00:00
Jim Grosbach 71fcb4fedd switch the flag for using NEON for SP floating point to a subtarget 'feature'.
Re-commit. This time complete with testsuite updates.

llvm-svn: 99570
2010-03-25 23:47:34 +00:00
Evan Cheng dbcf861a96 Add nounwind.
llvm-svn: 99546
2010-03-25 20:01:07 +00:00
Chris Lattner 4690af8567 Make the NDEBUG assertion stronger and more clear what is
happening.

Enhance scheduling to set the DEAD flag on implicit defs
more aggressively.  Before, we'd set an implicit def operand
to dead if it were present in the SDNode corresponding to
the machineinstr but had no use.  Now we do it in this case
AND if the implicit def does not exist in the SDNode at all.

This exposes a couple of problems: one is the FIXME, which
causes a live intervals crash on CodeGen/X86/sibcall.ll.
The second is that it makes machinecse and licm more 
aggressive (which is a good thing) but also exposes a case
where licm hoists a set0 and then it doesn't get resunk.

Talking to codegen folks about both these issues, but I need
this patch in in the meantime.

llvm-svn: 99485
2010-03-25 05:40:48 +00:00
Nate Begeman 583e05d8ce BUILD_VECTOR was missing out on some prime opportunities to use SSE 4.1 inserts.
llvm-svn: 99423
2010-03-24 20:49:50 +00:00
Bob Wilson 4d87012eb3 Revert Edwin's change that is breaking MultiSource/Applications/ClamAV/clamscan.
--- Reverse-merging r99400 into '.':
D    test/CodeGen/Generic/2010-03-24-liveintervalleak.ll
U    lib/CodeGen/LiveIntervalAnalysis.cpp

llvm-svn: 99419
2010-03-24 20:25:25 +00:00
Torok Edwin 4bbfdd41ea Fix memory leak in liveintervals: the destructor for VNInfos must be called,
otherwise the SmallVector it contains doesn't free its memory.
In most cases LiveIntervalAnalysis could get away by not calling the destructor,
because VNInfos are bumpptr-allocated, and smallvectors usually don't grow.
However when the SmallVector does grow it always leaks.

This is the valgrind shown leak from the original testcase:
==8206== 18,304 bytes in 151 blocks are definitely lost in loss record 164 of 164
==8206==    at 0x4A079C7: operator new(unsigned long) (vg_replace_malloc.c:220)
==8206==    by 0x4DB7A7E: llvm::SmallVectorBase::grow_pod(unsigned long, unsigned long) (in /home/edwin/clam/git/builds/defaul
t/libclamav/.libs/libclamav.so.6.1.0)
==8206==    by 0x4F90382: llvm::VNInfo::addKill(llvm::SlotIndex) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libcl
amav.so.6.1.0)
==8206==    by 0x5126B5C: llvm::LiveIntervals::handleVirtualRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::M
achineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int, llvm::LiveInterval&) (in /home/edwin/clam/git/builds/defau
lt/libclamav/.libs/libclamav.so.6.1.0)
==8206==    by 0x512725E: llvm::LiveIntervals::handleRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::MachineI
nstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav
.so.6.1.0)
==8206==    by 0x51278A8: llvm::LiveIntervals::computeIntervals() (in /home/edwin/clam/git/builds/default/libclamav/.libs/libc
lamav.so.6.1.0)
==8206==    by 0x5127CB4: llvm::LiveIntervals::runOnMachineFunction(llvm::MachineFunction&) (in /home/edwin/clam/git/builds/de
fault/libclamav/.libs/libclamav.so.6.1.0)
==8206==    by 0x4DAE935: llvm::FPPassManager::runOnFunction(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclama
v/.libs/libclamav.so.6.1.0)
==8206==    by 0x4DAEB10: llvm::FunctionPassManagerImpl::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclama
v/.libs/libclamav.so.6.1.0)
==8206==    by 0x4DAED3D: llvm::FunctionPassManager::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclamav/.l
ibs/libclamav.so.6.1.0)
==8206==    by 0x4D8BE8E: llvm::JIT::runJITOnFunctionUnlocked(llvm::Function*, llvm::MutexGuard const&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0)
==8206==    by 0x4D8CA72: llvm::JIT::getPointerToFunction(llvm::Function*) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0)

llvm-svn: 99400
2010-03-24 13:50:36 +00:00
Chris Lattner b1c4f62cac Fix PR6673: updating the callback should not clear the map.
llvm-svn: 99227
2010-03-22 23:15:57 +00:00
Bob Wilson 162242b63b pr6652: Use LDM to restore PC to the return address on ARMv4.
Patch by John Tytgat!

llvm-svn: 99096
2010-03-20 22:20:40 +00:00
Evan Cheng b8d1fd0553 Stupid svn. Add back to the lost sibcall tests.
llvm-svn: 99033
2010-03-20 03:17:05 +00:00
Kevin Enderby cf0843ed93 Fixed the encoding problems of the crc32 instructions. All had the Operand size
override prefix and only the r/m16 forms should have had that.  Also for variant
one, the AT&T syntax, added suffixes to all forms.  Also added the missing
64-bit form for 'CRC32 r64, r/m8'.  Plus added test cases for all forms and
tweaked one test case to add the needed suffixes.

llvm-svn: 98980
2010-03-19 20:04:42 +00:00
Mon P Wang 7ad43f8768 Fixed a widening bug where we were not using the correct size for the load
llvm-svn: 98920
2010-03-19 01:19:52 +00:00
Evan Cheng bf724b9ee0 Turning off post-ra scheduling for x86. It isn't a consistent win.
llvm-svn: 98810
2010-03-18 06:55:42 +00:00
Evan Cheng 68333f5c6e X86 address mode matching code MatchAddressRecursively does some aggressive hack which require doing a RAUW. It may end up deleting some SDNode up stream. It should avoid referencing deleted nodes.
llvm-svn: 98780
2010-03-17 23:58:35 +00:00
Johnny Chen 8f3004cff2 Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm
instructions to help disassembly.

We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>.  See, for example, A8.6.57/58/60.

And modified test cases to not expect '+' in +reg or #+num.  For example,

; CHECK:       ldr.w	r9, [r7, #28]

llvm-svn: 98745
2010-03-17 17:52:21 +00:00
Evan Cheng 403062313f Fix liveintervals handling of dbg_value instructions.
llvm-svn: 98686
2010-03-16 21:51:27 +00:00
Dan Gohman 5a6dc1dd09 Add an rdar number to this test.
llvm-svn: 98654
2010-03-16 19:08:20 +00:00
Bob Wilson 1b4e8cc69c --- Reverse-merging r98637 into '.':
U    test/CodeGen/ARM/tls2.ll
U    test/CodeGen/ARM/arm-negative-stride.ll
U    test/CodeGen/ARM/2009-10-30.ll
U    test/CodeGen/ARM/globals.ll
U    test/CodeGen/ARM/str_pre-2.ll
U    test/CodeGen/ARM/ldrd.ll
U    test/CodeGen/ARM/2009-10-27-double-align.ll
U    test/CodeGen/Thumb2/thumb2-strb.ll
U    test/CodeGen/Thumb2/ldr-str-imm12.ll
U    test/CodeGen/Thumb2/thumb2-strh.ll
U    test/CodeGen/Thumb2/thumb2-ldr.ll
U    test/CodeGen/Thumb2/thumb2-str_pre.ll
U    test/CodeGen/Thumb2/thumb2-str.ll
U    test/CodeGen/Thumb2/thumb2-ldrh.ll
U    utils/TableGen/TableGen.cpp
U    utils/TableGen/DisassemblerEmitter.cpp
D    utils/TableGen/RISCDisassemblerEmitter.h
D    utils/TableGen/RISCDisassemblerEmitter.cpp
U    Makefile.rules
U    lib/Target/ARM/ARMInstrNEON.td
U    lib/Target/ARM/Makefile
U    lib/Target/ARM/AsmPrinter/ARMInstPrinter.cpp
U    lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp
U    lib/Target/ARM/AsmPrinter/ARMInstPrinter.h
D    lib/Target/ARM/Disassembler
U    lib/Target/ARM/ARMInstrFormats.td
U    lib/Target/ARM/ARMAddressingModes.h
U    lib/Target/ARM/Thumb2ITBlockPass.cpp

llvm-svn: 98640
2010-03-16 16:59:47 +00:00
Johnny Chen 3d9327bd06 Initial ARM/Thumb disassembler check-in. It consists of a tablgen backend
(RISCDisassemblerEmitter) which emits the decoder functions for ARM and Thumb,
and the disassembler core which invokes the decoder function and builds up the
MCInst based on the decoded Opcode.

Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm
instructions to help disassembly.

We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>.  See, for example, A8.6.57/58/60.

And modified test cases to not expect '+' in +reg or #+num.  For example,

; CHECK:       ldr.w	r9, [r7, #28]

llvm-svn: 98637
2010-03-16 16:36:54 +00:00
Bob Wilson 298a83ecfe Stop using the old pre-UAL syntax for LDM/STM instruction suffixes.
This does not move entirely to UAL syntax, since the default "increment after"
suffix is empty but we still use "IA" for that.

llvm-svn: 98635
2010-03-16 16:19:07 +00:00
Bob Wilson 5125770346 Add a testcase for the change in r98586.
llvm-svn: 98610
2010-03-16 05:33:29 +00:00
Bill Wendling 31d7f0d96a Forgot testcase for r98599.
llvm-svn: 98602
2010-03-16 01:54:20 +00:00
Chris Lattner db035a0af2 Fix the third (and last known) case of code update problems due
to LLVM IR changes with addr label weirdness.  In the testcase, we
generate references to the two bb's when codegen'ing the first
function:

_test1:                                 ## @test1
	leaq	Ltmp0(%rip), %rax
..
	leaq	Ltmp1(%rip), %rax

Then continue to codegen the second function where the blocks
get merged.  We're now smart enough to emit both labels, producing
this code:

_test_fun:                              ## @test_fun
## BB#0:                                ## %entry
Ltmp1:                                  ## Block address taken
Ltmp0:
## BB#1:                                ## %ret
	movl	$-1, %eax
	ret

Rejoice.

llvm-svn: 98595
2010-03-16 00:29:39 +00:00
Daniel Dunbar 5599256415 MC: Allow modifiers in MCSymbolRefExpr, and eliminate X86MCTargetExpr.
- Although it would be nice to allow this decoupling, the assembler needs to be able to reason about MCSymbolRefExprs in too many places to make this viable. We can use a target specific encoding of the variant if this becomes an issue.
 - This patch also extends llvm-mc to support parsing of the modifiers, as opposed to lumping them in with the symbol.

llvm-svn: 98592
2010-03-15 23:51:06 +00:00
Dan Gohman c6ddebd6d1 Recognize code for doing vector gather/scatter index calculations with
32-bit indices. Instead of shuffling each element out of the index vector,
when all indices are needed, just store the input vector to the stack and
load the elements out.

llvm-svn: 98588
2010-03-15 23:23:03 +00:00
Chris Lattner 561334a81f Implement support for the case when a reference to a addr-of-bb
label is generated, but then the block is deleted.  Since the
value is undefined, we just emit the label right after the entry 
label of the function.  It might matter that the label is in the
same section as the function was afterall.

llvm-svn: 98579
2010-03-15 20:39:00 +00:00
Chris Lattner 347a0eb85c Fix the case when a reference to an address taken BB is emitted in one
function, then the BB is RAUW'd before the definition is emitted.  There
are still two cases not being handled, but this should improve us back to
the situation before I touched anything.

llvm-svn: 98566
2010-03-15 19:09:43 +00:00
Chris Lattner d03a956a01 filecheckize a test and mark these wiht a cpu so it passes
on hosts without cmovs.

llvm-svn: 98521
2010-03-14 22:31:16 +00:00
Duncan Sands ca595495e4 Turn calls to copysignl into an FCOPYSIGN node. Handle FCOPYSIGN nodes
with ppc_f128 type by having the type legalizer turn these back into a
call to copysignl.

llvm-svn: 98514
2010-03-14 21:08:40 +00:00
Chris Lattner f71cb6c439 fix ShrinkDemandedOps to not leave dead nodes around,
fixing PR6607

llvm-svn: 98512
2010-03-14 19:46:02 +00:00
Chris Lattner 5049f23592 don't have i386-specific tests in CodeGen/Generic, PR6601.
llvm-svn: 98508
2010-03-14 18:51:18 +00:00
Chris Lattner 6feb7e3325 fix PR6605, X86ISD::CMP always returns i32 (EFLAGS), not
the operand type.

llvm-svn: 98507
2010-03-14 18:44:35 +00:00
Anton Korobeynikov 79a7c7823d Fix typo
llvm-svn: 98506
2010-03-14 18:42:52 +00:00
Anton Korobeynikov 846a117892 Feature test for half precision FP.
llvm-svn: 98504
2010-03-14 18:42:43 +00:00
Chris Lattner 9efbbcbe45 fix AsmPrinter::GetBlockAddressSymbol to always return a unique
label instead of trying to form one based on the BB name (which
causes collisions if the name is empty).  This fixes PR6608

llvm-svn: 98495
2010-03-14 17:53:23 +00:00
Chris Lattner 6e52e9db31 get MMI out of the label uniquing business, just go to MCContext
to get unique assembler temporary labels.

llvm-svn: 98489
2010-03-14 08:36:50 +00:00
Evan Cheng d703df67ce Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions.
llvm-svn: 98465
2010-03-14 03:48:46 +00:00
Chris Lattner d75813970a simplify code to use OutContext.GetOrCreateTemporarySymbol with
no arguments instead of having to come up with a unique name.
This also makes the code less fragile.

llvm-svn: 98364
2010-03-12 18:47:50 +00:00
Chris Lattner 53ebf8a7ca fix PR6577, a bug in sdbuilder lowering select instructions
whose true value was not Val#0.

llvm-svn: 98336
2010-03-12 07:15:36 +00:00
Bill Wendling 00810c39da revert r98270.
llvm-svn: 98281
2010-03-11 19:50:31 +00:00
Evan Cheng 31fe835bf2 Bad bad bug. x86 force indirect tail call address into eax when it's meant to force it into a call preserved register instead. Change it to ecx for now.
llvm-svn: 98270
2010-03-11 18:49:14 +00:00
Richard Osborne 4780109254 Add dag combine to simplify lmul(x, 0, a, b)
llvm-svn: 98258
2010-03-11 16:26:35 +00:00
Evan Cheng 8c4df8160e The check for coalescing a virtual register to a physical register, e.g.
cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check
for overlaps of vr's live interval with the super registers of the
physical register (ECX in this case) and let JoinIntervals() handle checking
the coalescing feasibility against the physical register (cl in this case).

llvm-svn: 98251
2010-03-11 08:20:21 +00:00
Eric Christopher 304f13c637 Have fast-isel understand llvm.objectsize. Update testcase for slightly
different codegen.

llvm-svn: 98244
2010-03-11 06:20:22 +00:00
Chris Lattner a179e4d0a8 add support, testcases, and dox for the new GHC calling
convention.  Patch by David Terei!

llvm-svn: 98212
2010-03-11 00:22:57 +00:00
Chris Lattner 4ec0b670d5 fix PR6533 by updating the br(xor) code to remember the case
when it looked past a trunc.

llvm-svn: 98203
2010-03-10 23:46:44 +00:00
Richard Osborne 54a2c32670 Handle MVT::i64 type in DAG combine for ISD::ADD. Fold 64 bit
expression add(add(mul(x,y),a),b) -> lmul(x,y,a,b) if all
operands are zero extended.

llvm-svn: 98168
2010-03-10 18:12:27 +00:00
Richard Osborne 1a396d53ed Fold add(add(mul(x,y),a),b) -> lmul(x,y,a,b) if the intermediate
results are unused elsewhere.

llvm-svn: 98157
2010-03-10 16:19:31 +00:00
Richard Osborne f57aea3d38 Prefer LMUL to MACCU as LMUL has no tied operands.
llvm-svn: 98153
2010-03-10 13:27:10 +00:00
Richard Osborne 0012bc1e41 Custom lower (S|U)MUL_LOHI -> MACC(S|U)
llvm-svn: 98152
2010-03-10 13:20:07 +00:00
Richard Osborne 54dfa01adc Lower add (mul a, b), c into MACCU / MACCS nodes which translate
directly to the maccu / maccs instructions. We handle this in
ExpandADDSUB since after type legalisation it is messy to
recognise these operations.

llvm-svn: 98150
2010-03-10 11:41:08 +00:00
Richard Osborne e35eabdd69 Convert test to FileCheck.
llvm-svn: 98148
2010-03-10 11:24:03 +00:00
Evan Cheng 72811e8714 Fix typo.
llvm-svn: 98142
2010-03-10 07:07:55 +00:00
Evan Cheng a3b6739749 Unbreak test on Linux.
llvm-svn: 98141
2010-03-10 07:07:45 +00:00
Evan Cheng 80ad113731 Enable machine cse pass.
llvm-svn: 98132
2010-03-10 03:07:41 +00:00
Dale Johannesen 90eab67320 The address of an indirect call must be in R12 on Darwin.
Make it so.  (This patch is in LowerCall_Darwin, which seems
to be used by SVR4 code as well; since that doesn't belong here,
I haven't worried about this case.)

llvm-svn: 98077
2010-03-09 20:15:42 +00:00
Richard Osborne c420c4cb4e In cases where the carry / borrow unused converted ladd / lsub
to an add or a sub.

llvm-svn: 98059
2010-03-09 16:34:25 +00:00
Richard Osborne f4e76cf44d Add DAG combine for ladd / lsub.
llvm-svn: 98057
2010-03-09 16:07:47 +00:00
Chris Lattner 9889c1eb9e move .set generation out of DwarfPrinter into AsmPrinter and
MCize it.

llvm-svn: 98010
2010-03-08 23:58:37 +00:00
Chris Lattner 27a9732450 simplify EmitSectionOffset to always use .set if it is
available, the only thing this affects is that we produce
.set in one case we didn't before, which shouldn't harm
anything.  Make EmitSectionOffset call EmitDifference
instead of duplicating it.

llvm-svn: 98005
2010-03-08 23:23:25 +00:00
Bob Wilson 0bfbd9b68c Fix a crash compiling 254.gap for Thumb2. The Thumb2 add/sub with 12-bit
immediate instructions cannot set the condition codes, so they do not have
the extra cc_out operand.  We hit an assertion during tail duplication
because the instruction being duplicated had more operands that expected.

llvm-svn: 98001
2010-03-08 22:56:15 +00:00
Evan Cheng 5967649780 Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll.
llvm-svn: 97980
2010-03-08 21:05:02 +00:00
Wesley Peck 1fb4edc05d Re-committing the failed r97807 commit with changes to eliminate warnings.
llvm-svn: 97891
2010-03-06 23:23:12 +00:00
Anton Korobeynikov bf16a17fc1 Initial bits of ARMv4-only support.
Patch by John Tytgat!

llvm-svn: 97886
2010-03-06 19:39:36 +00:00
Anton Korobeynikov 6f5523aa8b Do not use '&' prefix for globals when register base field is non-zero, otherwise msp430-as will silently miscompile the code (TI's assembler report an error though).
This fixes PR6349

llvm-svn: 97877
2010-03-06 11:41:12 +00:00
Chris Lattner 4279a078a5 revert r97807, it introduced build warnings.
llvm-svn: 97869
2010-03-06 04:32:46 +00:00
Charles Davis 8545afe0b0 Don't emit global symbols into the (__TEXT,__ustring) section on Darwin. This
is a workaround for <rdar://problem/7672401/> (which I filed).

This let's us build Wine on Darwin, and it gets the Qt build there a little bit
further (so Doug says).

llvm-svn: 97845
2010-03-05 22:28:45 +00:00
Jakob Stoklund Olesen 2664d295cb Better handling of dead super registers in LiveVariables. We used to do this:
CALL ... %RAX<imp-def>
   ... [not using %RAX]
   %EAX = ..., %RAX<imp-use, kill>
   RET %EAX<imp-use,kill>

Now we do this:

   CALL ... %RAX<imp-def, dead>
   ... [not using %RAX]
   %EAX = ...
   RET %EAX<imp-use,kill>

By not artificially keeping %RAX alive, we lower register pressure a bit.

The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously
55, anybody can see that. Sheesh.

llvm-svn: 97838
2010-03-05 21:49:17 +00:00
Jakob Stoklund Olesen 8c5b8db5cd We don't really care about correct register liveness information after the
post-ra scheduler has run. Disable the verifier checks that late in the game.

llvm-svn: 97837
2010-03-05 21:49:13 +00:00
Jakob Stoklund Olesen b0503beff1 Avoid creating bad PHI instructions when BR is being const-folded.
llvm-svn: 97836
2010-03-05 21:49:10 +00:00
Chris Lattner f0692603d5 fix bss section printing for cell, patch by Kalle Raiskila!
llvm-svn: 97814
2010-03-05 18:55:36 +00:00
Wesley Peck 34004170c5 Reworking the stack layout that the MicroBlaze backend generates.
The MicroBlaze backend was generating stack layouts that did not
conform correctly to the ABI. This update generates stack layouts
which are closer to what GCC does.

Variable arguments support was added as well but the stack layout
for varargs has not been finalized.

llvm-svn: 97807
2010-03-05 15:26:02 +00:00
Evan Cheng 654ec2a663 Fix an oops in x86 sibcall optimization. If the ByVal callee argument is itself passed as a pointer, then it's obviously not safe to do a tail call.
llvm-svn: 97797
2010-03-05 08:38:04 +00:00
Chris Lattner 55e81eb49f Fix PR6497, a bug where we'd fold a load into an addc
node which has a flag.  That flag in turn was used by an
already-selected adde which turned into an ADC32ri8 which
used a selected load which was chained to the load we
folded.  This flag use caused us to form a cycle.  Fix
this by not ignoring chains in IsLegalToFold even in
cases where the isel thinks it can.

llvm-svn: 97791
2010-03-05 06:19:13 +00:00
Chris Lattner bfdd17a2ea cleanup
llvm-svn: 97790
2010-03-05 06:17:43 +00:00
Evan Cheng cf67ffa500 Rever 96389 and 96990. They are causing some miscompilation that I do not fully understand.
llvm-svn: 97782
2010-03-05 03:08:23 +00:00
Bill Wendling 543ce1f64a Revert r97766. It's deleting a tag.
llvm-svn: 97768
2010-03-05 00:33:59 +00:00
Bill Wendling 6517f88f25 Micro-optimization:
This code:

float floatingPointComparison(float x, float y) {
    double product = (double)x * y;
    if (product == 0.0)
        return product;
    return product - 1.0;
}

produces this:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jne             0x00000004
0000000000000016        jp              0x00000002
0000000000000018        jmp             0x00000008
000000000000001a        addsd           0x00000006(%rip),%xmm0
0000000000000022        cvtsd2ss        %xmm0,%xmm0
0000000000000026        ret

The "jne/jp/jmp" sequence can be reduced to this instead:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jp              0x00000002
0000000000000016        je              0x00000008
0000000000000018        addsd           0x00000006(%rip),%xmm0
0000000000000020        cvtsd2ss        %xmm0,%xmm0
0000000000000024        ret

for a savings of 2 bytes.

This xform can happen when we recognize that jne and jp jump to the same "true"
MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch
is the fall-through MBB.

llvm-svn: 97766
2010-03-05 00:24:26 +00:00
Johnny Chen ece1797542 Drop the ".w" qualifier for t2UXTB16* instructions as there is no 16-bit version
of either sxtb16 or uxtb16, and the unified syntax does not specify ".w".

llvm-svn: 97760
2010-03-04 22:24:41 +00:00
Bob Wilson 749ba9a7d5 pr6478: The frame pointer spill frame index is only defined when there is a
frame pointer.

llvm-svn: 97755
2010-03-04 21:42:36 +00:00
Bob Wilson cf6e29a818 pr6480: Don't try producing ld/st-multiple instructions when the address is
an undef value.  This is only going to come up for bugpoint-reduced tests --
correct programs will not access memory at undefined addresses -- so it's not
worth the effort of doing anything more aggressive.

llvm-svn: 97745
2010-03-04 21:04:38 +00:00
Jakob Stoklund Olesen af6ca23294 Fix the remaining MUL8 and DIV8 to define AX instead of AL,AH.
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.

Fix PR6489.

llvm-svn: 97742
2010-03-04 20:42:07 +00:00
Dan Gohman b8ebd408da Fix recognition of 16-bit bswap for C front-ends which emit the
clobber registers in a different order.

llvm-svn: 97741
2010-03-04 19:58:08 +00:00
Dan Gohman 2850b41412 Revert r97580; that's not the right way to fix this.
llvm-svn: 97639
2010-03-03 04:36:42 +00:00
Bill Wendling af13d82945 This test case:
long test(long x) { return (x & 123124) | 3; }

Currently compiles to:

_test:
        orl     $3, %edi
        movq    %rdi, %rax
        andq    $123127, %rax
        ret

This is because instruction and DAG combiners canonicalize

  (or (and x, C), D) -> (and (or, D), (C | D))

However, this is only profitable if (C & D) != 0. It gets in the way of the
3-addressification because the input bits are known to be zero.

llvm-svn: 97616
2010-03-03 00:35:56 +00:00
Chris Lattner dd030701bd Fix some issues in WalkChainUsers dealing with
CopyToReg/CopyFromReg/INLINEASM.  These are annoying because
they have the same opcode before an after isel.  Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.

With that done, give IsLegalToFold a new flag that causes it to
ignore chains.  This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing.  This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching it's
multiple mem operand instructions more aggressively.

I currently #if out the dead code in the X86 backend and MSP 
backend, I'll remove it for real in a follow-on patch.

The testcase changes are:
  test/CodeGen/X86/sse3.ll: we generate better code
  test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was 
      miscompiling this before, we now generate correct code
      Convert it to filecheck while I'm at it.
  test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
      folding to make anton happy. :)

llvm-svn: 97596
2010-03-02 22:20:06 +00:00
Chris Lattner f61e34d120 this testcase is failing because pic16 doesn't define a reg/reg
xor pattern.  I have no plans to fix this XFAIL.

llvm-svn: 97587
2010-03-02 20:48:24 +00:00
Chris Lattner 7ecdcadcc9 xfail this for now.
llvm-svn: 97584
2010-03-02 19:53:25 +00:00
Dan Gohman d55f574589 When expanding an expression such as (A + B + C + D), sort the operands
by loop depth and emit loop-invariant subexpressions outside of loops.
This speeds up MultiSource/Applications/viterbi and others.

llvm-svn: 97580
2010-03-02 19:32:21 +00:00
Chris Lattner 35ec683b78 clean up some testcases.
llvm-svn: 97576
2010-03-02 18:56:03 +00:00
Chris Lattner 925ac71f26 Fix the xfail I added a couple of patches back. The issue
was that we weren't properly handling the case when interior
nodes of a matched pattern become dead after updating chain
and flag uses.  Now we handle this explicitly in 
UpdateChainsAndFlags.

llvm-svn: 97561
2010-03-02 07:50:03 +00:00
Chris Lattner b884fe867e Rewrite chain handling validation and input TokenFactor handling
stuff now that we don't care about emulating the old broken 
behavior of the old isel.  This eliminates the 
'CheckChainCompatible' check (along with IsChainCompatible) which
did an incorrect and inefficient scan *up* the chain nodes which
happened as the pattern was being formed and does the validation
at the end in HandleMergeInputChains when it forms a structural 
pattern.  This scans "down" the graph, which means that it is
quickly bounded by nodes already selected.  This also handles
token factors that get "trapped" in the dag.

Removing the CheckChainCompatible nodes also shrinks the 
generated tables by about 6K for X86 (down to 83K).

There are two pieces remaining before I can nuke PreprocessRMW:
1. I xfailed a test because we're now producing worse code in a 
   case that has nothing to do with the change: it turns out that
   our use of MorphNodeTo will leave dead nodes in the graph
   which (depending on how the graph is walked) end up causing
   bogus uses of chains and blocking matches.  This is really 
   bad for other reasons, so I'll fix this in a follow-up patch.

2. CheckFoldableChainNode needs to be improved to handle the TF.

llvm-svn: 97539
2010-03-02 02:22:10 +00:00
Dan Gohman 4cec543952 Fix several places to handle vector operands properly.
Based on a patch by Micah Villmow for PR6438.

llvm-svn: 97538
2010-03-02 02:14:38 +00:00
Chris Lattner d39f75ba39 Fix PR2590 by making PatternSortingPredicate actually be
ordered correctly.  Previously it would get in trouble when
two patterns were too similar and give them nondet ordering.
We force this by using the record ID order as a fallback.

The testsuite diff is due to alpha patterns being ordered
slightly differently, the change is a semantic noop afaict:

< 	lda $0,-100($16)
---
> 	subq $16,100,$0

llvm-svn: 97509
2010-03-01 22:09:11 +00:00
Chris Lattner 8c1132746b stop using anders-aa
llvm-svn: 97491
2010-03-01 20:24:05 +00:00
Devang Patel 3bf0571bb0 Rewrite test to test VLA using new debug info encoding scheme.
llvm-svn: 97465
2010-03-01 18:30:58 +00:00
Devang Patel 4aefd92040 Remove this generic debug info intrinsic test. LLVM does not use this llvm.dbg.stoppoint intrinsic anymore. There are tests to check new implementation, which attaches location information directly with an instruction using metadata.
llvm-svn: 97464
2010-03-01 18:30:08 +00:00
Chris Lattner 90e7924cf0 add some random nounwinds.
llvm-svn: 97411
2010-02-28 20:36:49 +00:00
Dan Gohman 34021b7445 Don't try to replace physical registers when doing CSE.
llvm-svn: 97360
2010-02-28 01:33:43 +00:00
Dan Gohman 45e7ffc350 Add nounwinds.
llvm-svn: 97349
2010-02-27 23:53:53 +00:00
Evan Cheng 228c31f045 Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap.
llvm-svn: 97310
2010-02-27 07:36:59 +00:00
Jakob Stoklund Olesen ddbf7a858e Use the right floating point load/store instructions in PPCInstrInfo::foldMemoryOperandImpl().
The PowerPC floating point registers can represent both f32 and f64 via the
two register classes F4RC and F8RC. F8RC is considered a subclass of F4RC to
allow cross-class coalescing. This coalescing only affects whether registers
are spilled as f32 or f64.

Spill slots must be accessed with load/store instructions corresponding to the
class of the spilled register. PPCInstrInfo::foldMemoryOperandImpl was looking
at the instruction opcode which is wrong.

X86 has similar floating point register classes, but doesn't try to fold
memory operands, so there is no problem there.

llvm-svn: 97262
2010-02-26 21:09:24 +00:00
Sanjiv Gupta 2bdbb3c167 Reapply things reverted back in 97220, with the fixed test case.
llvm-svn: 97228
2010-02-26 17:59:28 +00:00
Richard Osborne 333300e0df Fix XCoreTargetLowering::isLegalAddressingMode() to handle VoidTy.
Previously LoopStrengthReduce would sometimes be unable to find
a legal formula, causing an assertion failure.

llvm-svn: 97226
2010-02-26 16:44:51 +00:00
Chris Lattner f7fc2d8b86 change the scope node to include a list of children to be checked
instead of to have a chained series of scope nodes.  This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more 
reasonable to implement.

llvm-svn: 97160
2010-02-25 19:00:39 +00:00
Dan Gohman 9b80f86e5b Revert r97064. Duncan pointed out that bitcasts are defined in
terms of store and load, which means bitcasting between scalar
integer and vector has endian-specific results, which undermines
this whole approach.

llvm-svn: 97137
2010-02-25 15:20:39 +00:00
Dan Gohman a9c205cc88 Make LoopSimplify change conditional branches in loop exiting blocks
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.

Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.

llvm-svn: 97126
2010-02-25 06:57:05 +00:00
Jakob Stoklund Olesen 63af51c1c8 Create a stack frame on ARM when
- Function uses all scratch registers AND
- Function does not use any callee saved registers AND
- Stack size is too big to address with immediate offsets.

In this case a register must be scavenged to calculate the address of a stack
object, and the scavenger needs a spare register or emergency spill slot.

llvm-svn: 97071
2010-02-24 22:43:17 +00:00
Bob Wilson ba8ac74fd9 Check for comparisons of +/- zero when optimizing less-than-or-equal and
greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions.  This is
only allowed when UnsafeFPMath is set or when at least one of the operands
is known to be nonzero.

llvm-svn: 97065
2010-02-24 22:15:53 +00:00
Dan Gohman 4b2b48daba Make getTypeSizeInBits work correctly for array types; it should return
the number of value bits, not the number of bits of allocation for in-memory
storage.

Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and
vectors.

Fix several places in CodeGen which compute offsets into in-memory vectors
to use TargetData information.

This fixes PR1784.

llvm-svn: 97064
2010-02-24 22:05:23 +00:00
Daniel Dunbar 4811d004be Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in
the hopes of fixing PPC bootstrap.

llvm-svn: 97040
2010-02-24 17:05:47 +00:00
Dan Gohman 3860521406 When forming SSE min and max nodes for UGE and ULE comparisons, it's
necessary to swap the operands to handle NaN and negative zero properly.

Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.

llvm-svn: 97025
2010-02-24 06:52:40 +00:00
Chris Lattner df8a8a8c6f Change the scheduler from adding nodes in allnodes order
to adding them in a determinstic order (bottom up from 
the root) based on the structure of the graph itself.

This updates tests for some random changes, interesting
bits: CodeGen/Blackfin/promote-logic.ll no longer crashes.
I have no idea why, but that's good right?

CodeGen/X86/2009-07-16-LoadFoldingBug.ll also fails, but
now compiles to have one fewer constant pool entry, making
the expected load that was being folded disappear.  Since it
is an unreduced mass of gnast, I just removed it.

This fixes PR6370

llvm-svn: 97023
2010-02-24 06:11:37 +00:00
Jim Grosbach 6ad4bcb0da LowerCall() should always do getCopyFromReg() to reference the stack pointer.
Machine instruction selection is much happier when operands are in virtual
registers.

llvm-svn: 97012
2010-02-24 01:43:03 +00:00
Evan Cheng 328a607490 Re-apply 96540 and 96556 with fixes.
llvm-svn: 97011
2010-02-24 01:42:31 +00:00
Jakob Stoklund Olesen a2d8c97b65 DIV8r must define %AX since X86DAGToDAGISel::Select() sometimes uses it
instead of %AL/%AH.

llvm-svn: 97006
2010-02-24 00:39:35 +00:00
Jakob Stoklund Olesen fe0a8cd210 Remember to handle sub-registers when moving imp-defs to a rematted instruction.
llvm-svn: 96995
2010-02-23 22:44:02 +00:00
Jakob Stoklund Olesen 38b76e27a7 Keep track of phi join registers explicitly in LiveVariables.
Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.

llvm-svn: 96994
2010-02-23 22:43:58 +00:00
Wesley Peck e4801e49c9 Adding the MicroBlaze backend.
The MicroBlaze is a highly configurable 32-bit soft-microprocessor for
use on Xilinx FPGAs. For more information see:
http://www.xilinx.com/tools/microblaze.htm
http://en.wikipedia.org/wiki/MicroBlaze

The current LLVM MicroBlaze backend generates assembly which can be
compiled using the an appropriate binutils assembler.

llvm-svn: 96969
2010-02-23 19:15:24 +00:00
Richard Osborne f578196968 Lower BR_JT on the XCore to a jump into a series of jump instructions.
llvm-svn: 96942
2010-02-23 13:25:07 +00:00
Evan Cheng 2a33390e2b These should not have been committed.
llvm-svn: 96827
2010-02-22 23:37:48 +00:00
Chris Lattner 72334622d6 no need to run llvm-as here.
llvm-svn: 96826
2010-02-22 23:34:12 +00:00
Evan Cheng 3688b8fa68 Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting.
llvm-svn: 96825
2010-02-22 23:34:00 +00:00
Dan Gohman 6c5ac6de5c Canonicalize ConstantInts to the right operand of commutative
operators.

The test difference is just due to the multiplication operands
being commuted (and thus requiring a more elaborate match). In
optimized code, that expression would be folded.

llvm-svn: 96816
2010-02-22 22:43:23 +00:00
Dan Gohman be24c455c4 Actually enable the -enable-unsafe-fp-math tests.
llvm-svn: 96796
2010-02-22 18:53:26 +00:00
Arnold Schwaighofer 30ece5b807 Mark the return address stack slot as mutable when moving the return address
during a tail call. A parameter might overwrite this stack slot during the tail
call. 

The sequence during a tail call is:
1.) load return address to temp reg
2.) move parameters (might involve storing to return address stack slot)
3.) store return address to new location from temp reg

If the stack location is marked immutable CodeGen can colocate load (1) with the
store (3).

This fixes bug 6225.

llvm-svn: 96783
2010-02-22 16:18:09 +00:00
Dan Gohman b87de8d30d Remove the logic for reasoning about NaNs from the code that forms
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.

Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.

llvm-svn: 96775
2010-02-22 04:03:39 +00:00
Dan Gohman 4506fcb3c2 When emitting an instruction which depends on both a post-incremented
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.

llvm-svn: 96774
2010-02-22 03:59:54 +00:00
Chris Lattner 745219ea64 add some no-unwinds, other minor cleanups.
llvm-svn: 96756
2010-02-21 20:33:20 +00:00
Chris Lattner c43c88ebce add a triple so that this doesn't fail due to linux/ppc register printing
syntax.

llvm-svn: 96748
2010-02-21 19:27:38 +00:00
Chris Lattner 53485469b4 filecheckize and add nouwinds.
llvm-svn: 96745
2010-02-21 18:53:28 +00:00
Anton Korobeynikov e96503faa1 IT turns out that during jumpless setcc lowering eq and ne were swapped.
This fixes PR6348

llvm-svn: 96734
2010-02-21 12:28:58 +00:00
Chris Lattner 3c29aff9ff fix and un-xfail X86/vec_ss_load_fold.ll
llvm-svn: 96720
2010-02-21 04:53:34 +00:00
Chris Lattner 7d5f4a4c03 temporarily disable this.
llvm-svn: 96717
2010-02-21 03:24:41 +00:00
Dan Gohman 85af256779 Check for overflow when scaling up an add or an addrec for
scaled reuse.

llvm-svn: 96692
2010-02-19 19:32:49 +00:00
Charles Davis 7e47767763 Add support for the 'alignstack' attribute to the x86 backend. Fixes PR5254.
Also, FileCheck'ize a test.

llvm-svn: 96686
2010-02-19 18:17:13 +00:00
Duncan Sands d0bf6f640f Revert commits 96556 and 96640, because commit 96556 breaks the
dragonegg self-host build.  I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier.  The
symptom of the 96556 miscompile is the following crash:

  llvm[3]: Compiling AlphaISelLowering.cpp for Release build
  cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
  Stack dump:
  0.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
  g++: Internal error: Aborted (program cc1plus)

This occurs when building LLVM using LLVM built by LLVM (via
dragonegg).  Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM.  Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.

Found by bisection.

r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines

Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines

Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96672
2010-02-19 11:30:41 +00:00
Evan Cheng d2d9252f35 Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96640
2010-02-19 00:34:39 +00:00
Dan Gohman 2446f57503 When determining the set of interesting reuse factors, consider
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.

llvm-svn: 96629
2010-02-19 00:05:23 +00:00
Mon P Wang c94892513d getSplatIndex assumes that the first element of the mask contains the splat index
which is not always true if the mask contains undefs. Modified it to return
the first non undef value.

llvm-svn: 96621
2010-02-18 22:33:18 +00:00
Jakob Stoklund Olesen c953acbd7f Always normalize spill weights, also for intervals created by spilling.
Moderate the weight given to very small intervals.

The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.

This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.

llvm-svn: 96613
2010-02-18 21:33:05 +00:00
Dan Gohman 5ffef745c2 Make CodePlacementOpt detect special EH control flow by
checking whether AnalyzeBranch disagrees with the CFG
directly, rather than looking for EH_LABEL instructions.
EH_LABEL instructions aren't always at the end of the
block, due to FP_REG_KILL and other things. This fixes
an infinite loop compiling MultiSource/Benchmarks/Bullet.

llvm-svn: 96611
2010-02-18 21:25:53 +00:00
Chris Lattner 6a9bdade29 remove empty file
llvm-svn: 96573
2010-02-18 06:29:06 +00:00
Bob Wilson c6c13a3515 Use NEON vmin/vmax instructions for floating-point selects.
Radar 7461718.

llvm-svn: 96572
2010-02-18 06:05:53 +00:00
Evan Cheng 0ceb68a552 Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

llvm-svn: 96556
2010-02-18 02:13:50 +00:00
Dan Gohman 104207b4c5 Don't check for comments, which vary between subtargets.
llvm-svn: 96434
2010-02-17 01:08:57 +00:00
Dan Gohman 5f10d6c52c Don't attempt to divide INT_MIN by -1; consider such cases to
have overflowed.

llvm-svn: 96428
2010-02-17 00:41:53 +00:00
Chris Lattner 1fc2773a33 roundss is an sse 4 thing, fix the test on non-sse41 builders
like llvm-gcc-x86_64-darwin10-selfhost

llvm-svn: 96417
2010-02-17 00:29:06 +00:00
Dale Johannesen cee887425e Make g5 target explicit; scheduling affects register choice.
llvm-svn: 96413
2010-02-16 23:25:23 +00:00
Chris Lattner afac7dad21 fix rdar://7653908, a crash on a case where we would fold a load
into a roundss intrinsic, producing a cyclic dag.  The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection.  Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.

llvm-svn: 96408
2010-02-16 22:35:06 +00:00
Dale Johannesen 0062f7bf59 Adjust register numbers in tests to compensate for the
new lack of R2.

llvm-svn: 96407
2010-02-16 22:31:31 +00:00
Chris Lattner c98beb567c filecheckize
llvm-svn: 96404
2010-02-16 22:13:43 +00:00
Evan Cheng 82b04130cb Look for SSE and instructions of this form: (and x, (build_vector c1,c2,c3,c4)).
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).

Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335

llvm-svn: 96389
2010-02-16 21:09:44 +00:00
David Greene 9641d06809 Add support for emitting non-temporal stores for DAGs marked
non-temporal.  Fix from r96241 for botched encoding of MOVNTDQ.

Add documentation for !nontemporal metadata.

Add a simpler movnt testcase.

llvm-svn: 96386
2010-02-16 20:50:18 +00:00
Bob Wilson 70aa8d0745 Fix pr6111: Avoid using the LR register for the target address of an indirect
branch in ARM v4 code, since it gets clobbered by the return address before
it is used.  Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.

llvm-svn: 96360
2010-02-16 17:24:15 +00:00
Dan Gohman 521efe68ab Split the main for-each-use loop again, this time for GenerateTruncates,
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.

llvm-svn: 96308
2010-02-16 01:42:53 +00:00
Anton Korobeynikov ae4ccc10da Preliminary patch to improve dwarf EH generation - Hooks to return Personality / FDE / LSDA / TType encoding depending on target / options (e.g. code model / relocation model) - MCIzation of Dwarf EH printer to use encoding information - Stub generation for ELF target (needed for indirect references) - Some other small changes here and there
llvm-svn: 96285
2010-02-15 22:35:59 +00:00
Jakob Stoklund Olesen 2988d573e5 Fix PR6300.
A virtual register can be used before it is defined in the same MBB if the MBB
is part of a loop. Teach the implicit-def pass about this case.

llvm-svn: 96279
2010-02-15 22:03:29 +00:00
Bob Wilson 9be7200b08 Last week we were generating code with duplicate induction variables in this
test, but the problem seems to have gone away today.  Add a check to make sure
it doesn't come back.

llvm-svn: 96277
2010-02-15 21:56:40 +00:00
Chris Lattner 3818d9763d remove empty file.
llvm-svn: 96271
2010-02-15 21:14:50 +00:00
Chris Lattner bcbaaba532 revert r96241. It breaks two regression tests, isn't documented,
and the testcase needs improvement.

llvm-svn: 96265
2010-02-15 20:53:01 +00:00
Chris Lattner 6fbfe5897c fix PR6305 by handling BlockAddress in a helper function
called by jump threading.

llvm-svn: 96263
2010-02-15 20:47:49 +00:00
David Greene 63cedef74b Add support for emitting non-temporal stores for DAGs marked
non-temporal.

llvm-svn: 96241
2010-02-15 17:02:56 +00:00
Jakob Stoklund Olesen b659c76c77 Fix PR6283.
When coalescing with a physreg, remember to add imp-def and imp-kill when
dealing with sub-registers.

Also fix a related bug in VirtRegRewriter where substitutePhysReg may
reallocate the operand list on an instruction and invalidate the reg_iterator.
This can happen when a register is mentioned twice on the same instruction.

llvm-svn: 96072
2010-02-13 02:06:10 +00:00
Bob Wilson 01abf8fc2f Besides removing phi cycles that reduce to a single value, also remove dead
phi cycles.  Adjust a few tests to keep dead instructions from being optimized
away.  This (together with my previous change for phi cycles) fixes Apple
radar 7627077.

llvm-svn: 96057
2010-02-13 00:31:44 +00:00
Dale Johannesen 26062150fa When save/restoring CR at prolog/epilog, in a large
stack frame, the prolog/epilog code was using the same
register for the copy of CR and the address of the save slot.  Oops.
This is fixed here for Darwin, sort of, by reserving R2 for this case.
A better way would be to do the store before the decrement of SP,
which is safe on Darwin due to the red zone.

SVR4 probably has the same problem, but I don't know how to fix it;
there is no red zone and R2 is already used for something else.
I'm going to leave it to someone interested in that target.

Better still would be to rewrite the CR-saving code completely;
spilling each CR subregister individually is horrible code.

llvm-svn: 96015
2010-02-12 21:35:34 +00:00
Anton Korobeynikov b9ce3cc458 Testcases for recent stdcall / fastcall mangling improvements
llvm-svn: 95982
2010-02-12 15:29:13 +00:00
Anton Korobeynikov c9276dfe04 Cleanup stdcall / fastcall name mangling.
This should fix alot of problems we saw so far, e.g. PRs 5851 & 2936

llvm-svn: 95980
2010-02-12 15:28:40 +00:00
Dan Gohman 45774ce0ad Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.

llvm-svn: 95975
2010-02-12 10:34:29 +00:00
Bob Wilson 0827e040e0 Add a new pass on machine instructions to optimize away PHI cycles that
reduce down to a single value.  InstCombine already does this transformation
but DAG legalization may introduce new opportunities.  This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized.  I measured the compile time 
impact of this (running llc on 176.gcc) and it was not significant.

llvm-svn: 95951
2010-02-12 01:30:21 +00:00
Jakob Stoklund Olesen 93c92225af Reapply coalescer fix for better cross-class coalescing.
This time with fixed test cases.

llvm-svn: 95938
2010-02-11 23:55:29 +00:00
Mon P Wang 5b77f0dac1 The previous fix of widening divides that trap was too fragile as it depends on custom
lowering and requires that certain types exist in ValueTypes.h.  Modified widening to
check if an op can trap and if so, the widening algorithm will apply only the op on
the defined elements.  It is safer to do this in widening because the optimizer can't
guarantee removing unused ops in some cases.

llvm-svn: 95823
2010-02-10 23:37:45 +00:00
Bob Wilson 0f52d0c074 Delete dead PHI machine instructions. These can be created due to type
legalization even when the IR-level optimizer has removed dead phis, such
as when the high half of an i64 value is unused on a 32-bit target.
I had to adjust a few test cases that had dead phis.
This is a partial fix for Radar 7627077.

llvm-svn: 95816
2010-02-10 22:58:57 +00:00
Evan Cheng 29b8f554fc Now that ShrinkDemandedOps() is separated out from DAG combine. It sometimes leave some obvious nops which dag combine used to clean up afterwards e.g. (trunk (ext n)) -> n. Look for them and squash them.
llvm-svn: 95757
2010-02-10 02:17:34 +00:00
Chris Lattner 78360a8184 move tests that depend on the x86 backend out of codegen/generic,
and remove a few old and unreduced ones.  Fixes PR5624. 

llvm-svn: 95656
2010-02-09 06:41:03 +00:00