Commit Graph

11351 Commits

Author SHA1 Message Date
Chris Lattner 5b2be1f890 Fix two bugs in my patch earlier today that broke int->fp conversion on X86.
llvm-svn: 23522
2005-09-29 06:44:39 +00:00
Chris Lattner 87ef943a4c Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%,
bringing the LLC time down to the CBE time.

llvm-svn: 23521
2005-09-29 06:17:27 +00:00
Chris Lattner 5f6035feb0 remove a bunch of unneeded stuff, or self evident comments
llvm-svn: 23519
2005-09-29 06:16:11 +00:00
Chris Lattner c244e7c178 Implement a couple of memcmp folds from the todo list
llvm-svn: 23517
2005-09-29 04:54:20 +00:00
Jeff Cohen b01a41a06d Silence VC++ redeclaration warnings.
llvm-svn: 23516
2005-09-29 01:59:49 +00:00
Chris Lattner 08c319fbdd Never rely on ReplaceAllUsesWith when selecting, use CodeGenMap instead.
ReplaceAllUsesWith does not replace scalars SDOperand floating around on
the stack, permitting things to be selected multiple times.

llvm-svn: 23515
2005-09-29 00:59:32 +00:00
Chris Lattner d4e9e8b7ec Codegen ADD X, IMM -> addis/addi if needed.
This implements PowerPC/fold-li.ll

llvm-svn: 23514
2005-09-28 23:07:13 +00:00
Chris Lattner b9b2e77295 Autogen MUL, move FP cases together
llvm-svn: 23512
2005-09-28 22:53:16 +00:00
Chris Lattner 5769311c92 disentangle FP from INT versions of div/mul
llvm-svn: 23511
2005-09-28 22:50:24 +00:00
Chris Lattner 585131baaf Use the autogenerated matcher for ADD/SUB
llvm-svn: 23510
2005-09-28 22:47:28 +00:00
Chris Lattner f023b2cda2 add a patter for SUBFIC
llvm-svn: 23509
2005-09-28 22:47:06 +00:00
Chris Lattner 21551ea5ab Mark int binops as int-only, add FP binops. Mark FADD/FMUL as commutative but
not associative.  Add [SU]REM.

llvm-svn: 23508
2005-09-28 22:38:27 +00:00
Chris Lattner cd002b2461 wrap a long line
llvm-svn: 23507
2005-09-28 22:30:58 +00:00
Chris Lattner d3ea19b51a Add FP versions of the binary operators, keeping the int and fp worlds seperate.
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner 0815dcae3f Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23505
2005-09-28 22:29:17 +00:00
Chris Lattner 6f3b577ee6 Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Chris Lattner 7fe6734dff Mark associative nodes as associative
llvm-svn: 23503
2005-09-28 20:58:39 +00:00
Chris Lattner b97b054ba7 Nate pointed out that mulh[us] are commutative as well. Thanks!
llvm-svn: 23500
2005-09-28 19:01:44 +00:00
Chris Lattner 89d168ceb3 expose commutativity information
llvm-svn: 23498
2005-09-28 18:27:58 +00:00
Chris Lattner fab48b3285 All (xor *) cases are autogenerated now
llvm-svn: 23497
2005-09-28 18:12:37 +00:00
Chris Lattner 037d69a404 add support for missed eqv tests
llvm-svn: 23496
2005-09-28 18:10:51 +00:00
Chris Lattner 33f8e08c8f Implement PowerPC/eqv-andc-orc-nor.ll:EQV3
llvm-svn: 23494
2005-09-28 18:04:52 +00:00
Chris Lattner 8cd7b88a88 learn to codegen not as NOR instead of xoris/xori
llvm-svn: 23490
2005-09-28 17:13:15 +00:00
Chris Lattner bb5939a436 These nodes are all autogenerated
llvm-svn: 23489
2005-09-28 17:07:09 +00:00
Chris Lattner ea7214b23d Constant fold llvm.sqrt
llvm-svn: 23487
2005-09-28 01:34:32 +00:00
Chris Lattner 3b63bb375c add a note about a way to improve this code further, that I won't be getting
to right now.

llvm-svn: 23485
2005-09-27 22:44:59 +00:00
Chris Lattner eb953f0ef8 Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll
and PR632.

llvm-svn: 23484
2005-09-27 22:28:11 +00:00
Chris Lattner a028e7a39c Darwin, like many BSD systems, has a setjmp/longjmp which saves the signal mask
on setjmp calls and restores it on longjmp calls (both of which require syscalls).

This makes the calls REALLY slow.  Use _setjmp/_longjmp instead.  This speeds up
hexxagon from 120.31s to 15.68s: from 5.53x slower than GCC to 28% faster than GCC.

llvm-svn: 23482
2005-09-27 22:18:25 +00:00
Chris Lattner 0fd8f9fbc9 If the target prefers it, use _setjmp/_longjmp should be used instead of setjmp/longjmp for llvm.setjmp/llvm.longjmp.
llvm-svn: 23481
2005-09-27 22:15:53 +00:00
Chris Lattner 59dc1e082c initialize new flag
llvm-svn: 23480
2005-09-27 22:13:56 +00:00
Chris Lattner e285f5ed8f Avoid spilling stack slots... to stack slots.
llvm-svn: 23478
2005-09-27 21:33:12 +00:00
Chris Lattner 87eb249300 Completely rewrite 'correct' eh support. This changes how setjmp insertion
is performed so it is only at most once per function that contains an invoke
instead of once per invoke in the function.  This patch has the following perks:

1. It fixes PR631, which complains about slowness.
2. If fixes PR240, which complains about non-volatile vars being live across
   setjmp/longjmps.
3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not
   forcing the jmpbufs to always be 8-bytes off the alignment of the structure.
4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us
   now about 4% faster than GCC.

Further improvements are also possible.

llvm-svn: 23477
2005-09-27 21:18:17 +00:00
Chris Lattner 92233d2175 Make the pass name simpler
llvm-svn: 23476
2005-09-27 21:10:32 +00:00
Chris Lattner 5635cc069f fix CBackend/2005-09-27-VolatileFuncPtr.ll
llvm-svn: 23475
2005-09-27 20:52:44 +00:00
Chris Lattner 16cd356fb2 allow demotion to volatile values, add support for invoke
llvm-svn: 23473
2005-09-27 19:39:00 +00:00
Chris Lattner c628f00845 Make sure to clear the CodeGenMap after each basic block is selected to avoid
cross MBB pollution.

llvm-svn: 23470
2005-09-27 17:45:33 +00:00
Jim Laskey 63523f98d5 Remove some redundancies.
llvm-svn: 23469
2005-09-27 17:32:45 +00:00
Chris Lattner e7e139e8e8 Split SimpleConstantVal up into its components, so each Constant subclass getsa different enum value. This allows 'classof' for these to be really simple,not needing to call getType() anymore.
This speeds up isa/dyncast/etc for constants, and also makes them smaller.
For example, the text section of a release build of InstCombine.cpp shrinks
from 230037 bytes to 216363 bytes, a 6% reduction.

llvm-svn: 23467
2005-09-27 06:09:08 +00:00
Chris Lattner 3d27e7f27f Add support for external calls that we know how to constant fold. This implements
ctor-list-opt.ll:CTOR8

llvm-svn: 23465
2005-09-27 05:02:43 +00:00
Chris Lattner 29b2780c8a Fix a bug where we would evaluate stores into linkonce objects which could be
potentially replaced at link-time.

llvm-svn: 23463
2005-09-27 04:50:03 +00:00
Chris Lattner 65a3a0918f Implement support for static constructors with calls in them. This is useful
because gccas runs globalopt before inlining.

This implements ctor-list-opt.ll:CTOR7

llvm-svn: 23462
2005-09-27 04:45:34 +00:00
Chris Lattner da1889b778 Refactor this code a bit, no functionality changes.
llvm-svn: 23460
2005-09-27 04:27:01 +00:00
Chris Lattner 54ec5f2089 Move the post-lsr simplify cfg pass after lowereh, so it can clean up after
eh lowering as well.

llvm-svn: 23459
2005-09-27 00:14:41 +00:00
Chris Lattner 4435b149a0 minor pattern shuffling
llvm-svn: 23458
2005-09-26 22:20:16 +00:00
Jim Laskey 5f2443c8a3 Addition of a simple two pass scheduler. This version is currently hacked up
for testing and will require target machine info to do a proper scheduling.
The simple scheduler can be turned on using -sched=simple (defaults
to -sched=none)

llvm-svn: 23455
2005-09-26 21:57:04 +00:00
Chris Lattner f2f89af69a Remove some dead code. ctor evaluation subsumes empty ctor elim
llvm-svn: 23453
2005-09-26 20:38:20 +00:00
Chris Lattner 6bf2cd5735 Add support for alloca, implementing ctor-list-opt.ll:CTOR6
llvm-svn: 23452
2005-09-26 17:07:09 +00:00
Chris Lattner 46d9ff081d Add a debug printout, fix a crash on kc++
llvm-svn: 23450
2005-09-26 07:34:35 +00:00
Chris Lattner 46af55e0e4 Implement loads/stores through GEP's of globals. This implements
ctor-list-opt.ll:CTOR5.

llvm-svn: 23449
2005-09-26 06:52:44 +00:00
Chris Lattner 61ff32cd70 Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr
llvm-svn: 23447
2005-09-26 05:34:07 +00:00
Chris Lattner 02ae21e1e0 Eliminate GetGEPGlobalInitializer in favor of the more powerful
ConstantFoldLoadThroughGEPConstantExpr function in the utils lib.

llvm-svn: 23446
2005-09-26 05:28:52 +00:00
Chris Lattner 0b011ec8e2 Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils
as ConstantFoldLoadThroughGEPConstantExpr.

llvm-svn: 23445
2005-09-26 05:28:06 +00:00
Chris Lattner c13c7b9376 Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine
pass.

llvm-svn: 23444
2005-09-26 05:27:10 +00:00
Chris Lattner b009663e27 add a comment
llvm-svn: 23442
2005-09-26 05:16:34 +00:00
Chris Lattner 4b05c322d5 Add support for getelementptr, load, and correctly reject volatile stores.
llvm-svn: 23441
2005-09-26 05:15:37 +00:00
Chris Lattner 3e9ea5ffec Add support for br/brcond/switch and phi
llvm-svn: 23439
2005-09-26 04:57:38 +00:00
Chris Lattner 99e23fa74c Add a simple interpreter to this code, allowing us to statically evaluate
global ctors that are simple enough.  This implements ctor-list-opt.ll:CTOR2.

llvm-svn: 23437
2005-09-26 04:44:35 +00:00
Chris Lattner 696beefabb factor some code into a InstallGlobalCtors method, add comments. No functionality change.
llvm-svn: 23435
2005-09-26 02:31:18 +00:00
Chris Lattner 838bdc1836 Make the global opt optimizer work on modules with a null terminator, by
accepting the null even with a non-65535 init prio

llvm-svn: 23434
2005-09-26 02:19:27 +00:00
Chris Lattner 41b6a5a693 Factor this code out into a few methods.
Implement the start of global ctor optimization.  It is currently smart
enough to remove the global ctor for cases like this:

struct foo {
  foo() {}
} x;

... saving a bit of startup time for the program.

llvm-svn: 23433
2005-09-26 01:43:45 +00:00
Chris Lattner f487768062 Fix some logic I broke that caused a regression on
SimplifyLibCalls/2005-05-20-sprintf-crash.ll

llvm-svn: 23430
2005-09-25 07:06:48 +00:00
Chris Lattner 0b3557f54a Move MaskedValueIsZero up.
Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll

llvm-svn: 23428
2005-09-24 23:43:33 +00:00
Chris Lattner 175463a165 Simplify this code a bit by relying on recursive simplification. Support
sprintf("%s", P)'s that have uses.

s/hasNUses(0)/use_empty()/

llvm-svn: 23425
2005-09-24 22:17:06 +00:00
Chris Lattner cc9c03386f Add support for a marker byte that indicates that we shouldn't add the user
prefix to a symbol name

llvm-svn: 23421
2005-09-24 08:24:28 +00:00
Chris Lattner 6736a6cdd2 Teach the dag isel generator how to construct arbitrary immediates. The
generated isel now tries li then lis, then lis+ori.

llvm-svn: 23418
2005-09-24 00:41:58 +00:00
Chris Lattner 499e33646e remove some debugging code
llvm-svn: 23411
2005-09-23 18:49:09 +00:00
Chris Lattner c59a371d45 Fold two consequtive branches that share a common destination between them.
This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy
code

llvm-svn: 23410
2005-09-23 18:47:20 +00:00
Chris Lattner 3a978bf66d simplify some logic further
llvm-svn: 23408
2005-09-23 07:23:18 +00:00
Chris Lattner cc14ebc17b pull a bunch of logic out of SimplifyCFG into a helper fn
llvm-svn: 23407
2005-09-23 06:39:30 +00:00
Chris Lattner 1e3d3148bb speed up Archive::isBytecodeArchive in the case when the archive doesn't have
an llvm-ranlib symtab.  This speeds up gccld -native on an almost empty .o file
from 1.63s to 0.18s.

llvm-svn: 23406
2005-09-23 06:22:58 +00:00
Chris Lattner 59a05bdde6 Turn (X^C1) == C2 into X == C1^C2 iff X&~C1 = 0 (and move a function)
This happens all the time on PPC for bool values, e.g. eliminating a xori
in inverted-bool-compares.ll.

This should be added to the dag combiner as well.

llvm-svn: 23403
2005-09-23 00:55:52 +00:00
Chris Lattner b1f8982ff0 Expose the LiveInterval interfaces as public headers.
llvm-svn: 23400
2005-09-21 04:19:09 +00:00
Chris Lattner 6c70106053 Start threading across blocks with code in them, so long as the code does
not define a value that is used outside of it's block.  This catches many
more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc.

This implements branch-phi-thread.ll:test3.ll

llvm-svn: 23397
2005-09-20 01:48:40 +00:00
Chris Lattner f0bd8d0107 Implement merging of blocks with the same condition if the block has multiple
predecessors.  This implements branch-phi-thread.ll::test1

llvm-svn: 23395
2005-09-20 00:43:16 +00:00
Chris Lattner 049cb4482f Reject a case we don't handle yet
llvm-svn: 23393
2005-09-19 23:57:04 +00:00
Chris Lattner a160924d57 remove debugging code :-/
llvm-svn: 23392
2005-09-19 23:50:15 +00:00
Chris Lattner 748f903046 Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading
control across branches with determined outcomes.  More generality to follow.
This triggers a couple thousand times in specint.

llvm-svn: 23391
2005-09-19 23:49:37 +00:00
Nate Begeman c760f80fed Stub out the rest of the DAG Combiner. Just need to fill in the
select_cc bits and then wrap it in a convenience function for  use with
regular select.

llvm-svn: 23389
2005-09-19 22:34:01 +00:00
Chris Lattner 2f838f2192 Teach the local spiller to turn stack slot loads into register-register copies
when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388
2005-09-19 06:56:21 +00:00
Chris Lattner de3c87a2ab Implement the isLoadFromStackSlot interface
llvm-svn: 23387
2005-09-19 05:23:44 +00:00
Chris Lattner b4b2530a1a Refactor this code a bit and make it more general. This now compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)

llvm-svn: 23386
2005-09-18 07:22:02 +00:00
Chris Lattner 797dee7705 Compile
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23385
2005-09-18 06:30:59 +00:00
Chris Lattner 01f56c68e9 Generalize this transform, using MaskedValueIsZero, allowing us to compile:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23384
2005-09-18 06:02:59 +00:00
Chris Lattner 4ebc8ab4e0 fix typeo
llvm-svn: 23383
2005-09-18 05:25:20 +00:00
Chris Lattner e5b23a6d67 Remove unintentionally committed code
llvm-svn: 23382
2005-09-18 05:12:51 +00:00
Chris Lattner 27cb9dbd35 implement shift.ll:test25. This compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr

llvm-svn: 23381
2005-09-18 05:12:10 +00:00
Chris Lattner af517574ce Implement add.ll:test29. Codegening:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

llvm-svn: 23379
2005-09-18 04:24:45 +00:00
Chris Lattner 027eaf01cf remove debug output
llvm-svn: 23377
2005-09-18 03:50:25 +00:00
Chris Lattner 1521298993 Implement or.ll:test21. This teaches instcombine to be able to turn this:
struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret

llvm-svn: 23376
2005-09-18 03:42:07 +00:00
Chris Lattner 4d9cf68023 Implement hook for ppc
llvm-svn: 23374
2005-09-17 01:03:26 +00:00
Nate Begeman 24a7eca282 More DAG combining. Still need the branch instructions, and select_cc
llvm-svn: 23371
2005-09-16 00:54:12 +00:00
Chris Lattner 0ebec06671 disable this for now
llvm-svn: 23366
2005-09-15 21:44:00 +00:00
Chris Lattner 9e4a4ee3dc Give all operands names
llvm-svn: 23357
2005-09-14 21:11:13 +00:00
Chris Lattner 2e84be22a8 give all operands names
llvm-svn: 23356
2005-09-14 21:10:24 +00:00
Chris Lattner f006d15e7f Fix some issues exposed by more testing. XORIS had the wrong operands
specified.  The various *imm operands defined by PPC are really all i32,
even though the actual immediate is restricted to a smaller value in it.

llvm-svn: 23352
2005-09-14 20:53:05 +00:00
Chris Lattner 6b013fc923 Fix some bugs noticed by new checking code
llvm-svn: 23350
2005-09-14 18:18:39 +00:00
Chris Lattner a393e4d4b3 Fix the regression last night compiling povray
llvm-svn: 23348
2005-09-14 17:32:56 +00:00
Chris Lattner b42e962d23 fix a major regression from my patch this afternoon
llvm-svn: 23347
2005-09-14 06:06:45 +00:00
Chris Lattner b011cb2746 we don't need this proto any longer
llvm-svn: 23342
2005-09-13 22:05:21 +00:00
Chris Lattner 03e08eefc7 move the #include for the generated code into the isel class body so we
can use/define class methods

llvm-svn: 23339
2005-09-13 22:03:06 +00:00
Chris Lattner 0f965a615e Change the arg lowering code to use copyfromreg from vregs associated
with incoming arguments instead of the pregs themselves.  This fixes
the scheduler from causing problems by moving a copyfromreg for an argument
to after a select_cc node (now it can, and bad things won't happen).

llvm-svn: 23334
2005-09-13 19:33:40 +00:00
Chris Lattner ee8113293e This has been moved to the target-indep code
llvm-svn: 23333
2005-09-13 19:32:18 +00:00
Chris Lattner fb96e50b8c This code is no longer needed, it is moved to the target-indep code
llvm-svn: 23332
2005-09-13 19:31:44 +00:00
Chris Lattner d4382f0afa If a function has liveins, and if the target requested that they be plopped
into particular vregs, emit copies into the entry MBB.

llvm-svn: 23331
2005-09-13 19:30:54 +00:00
Chris Lattner 64685b4ca2 Majik numbers are bad
llvm-svn: 23330
2005-09-13 19:03:13 +00:00
Chris Lattner aa6cbd90c5 Remove some dead vectors
llvm-svn: 23329
2005-09-13 18:47:49 +00:00
Chris Lattner 2a8932960d Add a simple xform to simplify array accesses with casts in the way.
This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.

llvm-svn: 23328
2005-09-13 18:36:04 +00:00
Chris Lattner fd018c8dfe Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI.
This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.

llvm-svn: 23327
2005-09-13 02:09:55 +00:00
Chris Lattner 567b81f0d2 Add a helper function, allowing us to simplify some code a bit, changing
indentation, no functionality change

llvm-svn: 23325
2005-09-13 00:40:14 +00:00
Chris Lattner 219175c84d Implement a simple xform to turn code like this:
if () { store A -> P; } else { store B -> P; }

into a PHI node with one store, in the most trival case.  This implements
load.ll:test10.

llvm-svn: 23324
2005-09-12 23:23:25 +00:00
Chris Lattner e0bfdf1485 Another load-peephole optimization: do gcse when two loads are next to
each other.  This implements InstCombine/load.ll:test9

llvm-svn: 23322
2005-09-12 22:21:03 +00:00
Chris Lattner b990f7d8ed Implement a trivial form of store->load forwarding where the store and the
load are exactly consequtive.  This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).

llvm-svn: 23320
2005-09-12 22:00:15 +00:00
Chris Lattner 8048b85e8f Fix a regression from last night, which caused this pass to create invalid
code for IV uses outside of loops that are not dominated by the latch block.
We should only convert these uses to use the post-inc value if they ARE
dominated by the latch block.

Also use a new LoopInfo method to simplify some code.

This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll

llvm-svn: 23318
2005-09-12 17:11:27 +00:00
Chris Lattner b35df5f5bc Add a new getLoopLatch() method.
llvm-svn: 23315
2005-09-12 17:03:55 +00:00
Chris Lattner a67648396a _test:
li r2, 0
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r2, 1
        stw r2, 0(r4)
        blr
[zion ~/llvm]$ cat > ~/xx
Uses of IV's outside of the loop should use hte post-incremented version
of the IV, not the preincremented version.  This helps many loops (e.g. in sixtrack)
which used to generate code like this (this is the code from the
dont-hoist-simple-loop-constants.ll testcase):

_test:
        li r2, 0                 **** IV starts at 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2            **** Copy for loop exit
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2           **** IV+2
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2       ****  IV+2
        stw r2, 0(r4)
        blr

And now generated code like this:

_test:
        li r2, 1               *** IV starts at 1
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701     *** IV.postinc + 0
        blt cr0, LBB_test_1
LBB_test_2:     ; loopexit.2.loopexit
        stw r2, 0(r4)          *** IV.postinc + 0
        blr

llvm-svn: 23313
2005-09-12 06:04:47 +00:00
Chris Lattner 530fe6ab30 implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll.
We used to emit this code for it:

_test:
        li r2, 1     ;; Value tying up a register for the whole loop
        li r5, 0
LBB_test_1:     ; no_exit.2
        or r6, r5, r5
        li r5, 0
        stw r5, 0(r3)
        addi r5, r6, 1
        addi r3, r3, 4
        add r7, r2, r5  ;; should be addi r7, r5, 1
        cmpwi cr0, r7, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r6, 2
        stw r2, 0(r4)
        blr

now we emit this:

_test:
        li r2, 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2   ;; whoa, fold those adds!
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2
        stw r2, 0(r4)
        blr

more improvement coming.

llvm-svn: 23306
2005-09-10 01:18:45 +00:00
Chris Lattner 4309c3a785 PowerPC cannot truncstore i1 natively
llvm-svn: 23304
2005-09-10 00:21:06 +00:00
Chris Lattner 2d454bf5be Allow targets to say they don't support truncstore i1 (which includes a mask
when storing to an 8-bit memory location), as most don't.

llvm-svn: 23303
2005-09-10 00:20:18 +00:00
Chris Lattner bd39c1a4c6 Add a missing #include, patch courtesy of Baptiste Lepilleur.
llvm-svn: 23302
2005-09-09 23:53:39 +00:00
Chris Lattner 331b311f7b Fix a problem duraid encountered on itanium where this folding:
select (x < y), 1, 0 -> (x < y) incorrectly: the setcc returns i1 but the
select returned i32.  Add the zero extend as needed.

llvm-svn: 23301
2005-09-09 23:00:07 +00:00
Chris Lattner 16e5cb87ba Fix a crash viewing dags that have target nodes in them
llvm-svn: 23300
2005-09-09 22:35:03 +00:00
Chris Lattner 0f2146bb5d I forgot that we always spill fp values as 64-bits. Implement spill folding
for FP as well.  This triggers a couple dozen times on 177.mesa (for example).

llvm-svn: 23299
2005-09-09 21:59:44 +00:00
Chris Lattner 712e78ee28 Fix a problem that Nate noticed, where spill code was not getting coallesced
with copies, leading to code like this:

       lwz r4, 380(r1)
       or r10, r4, r4    ;; Last use of r4

By teaching the PPC backend how to fold spills into copies, we now get this
code:

       lwz r10, 380(r1)

wow. :)

This reduces a testcase nate sent me from 1505 instructions to 1484.

Note that this could handle FP values but doesn't currently, for reasons
mentioned in the patch

llvm-svn: 23298
2005-09-09 21:46:49 +00:00
Chris Lattner f540c1a2e8 code cleanup
llvm-svn: 23297
2005-09-09 20:51:08 +00:00
Chris Lattner 1410003751 Use continue in the use-processing loop to make it clear what the early exits
are, simplify logic, and cause things to not be nested as deeply.  This also
uses MRI->areAliases instead of an explicit loop.

No functionality change, just code cleanup.

llvm-svn: 23296
2005-09-09 20:29:51 +00:00
Nate Begeman 049b748c76 Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such
as setcc and select next.

llvm-svn: 23295
2005-09-09 19:49:52 +00:00
Chris Lattner ce3662f2a2 remove debugging code *slaps head*
llvm-svn: 23294
2005-09-09 19:19:20 +00:00
Chris Lattner c9053083eb When spilling a live range that is used multiple times by one instruction,
only add a reload live range once for the instruction.  This is one step
towards fixing a regalloc pessimization that Nate notice, but is later undone
by the spiller (so no code is changed).

llvm-svn: 23293
2005-09-09 19:17:47 +00:00
Chris Lattner c37a2f13c4 Teach the code generator that rlwimi is commutable if the rotate amount
is zero.  This lets the register allocator elide some copies in some cases.

This implements CodeGen/PowerPC/rlwimi-commute.ll

llvm-svn: 23292
2005-09-09 18:17:41 +00:00
Chris Lattner 39b4d83f6a Introduce two new concepts:
1. Add support for defining Pattern's, which can match expressions when there
   is no instruction that directly implements something.  Instructions usually
   implicitly define patterns.
2. Add support for defining SDNodeXForm's, which are node transformations.
   This seperates the concept of a node xform out from the existing predicate
   support.

Using this new stuff, we add a few instruction patterns, one for testing, and
two for OR/XOR by an arbitrary immediate.

llvm-svn: 23286
2005-09-09 00:39:56 +00:00
Chris Lattner 4b09f3c6f5 whitespace/comment changes, no functionality diffs
llvm-svn: 23283
2005-09-08 23:17:26 +00:00
Nate Begeman 85c1cc4523 Move yet more folds over to the dag combiner from sd.cpp
llvm-svn: 23278
2005-09-08 20:18:10 +00:00
Chris Lattner 0ec8fa0880 Add a bunch of stuff needed for node type inference. Move 'BLR' down with
the rest of the instructions, add comment markers to seperate portions of
the file into logical parts

llvm-svn: 23277
2005-09-08 19:50:41 +00:00
Chris Lattner 76cb006e2c add patterns for x?oris?
llvm-svn: 23268
2005-09-08 17:40:49 +00:00
Chris Lattner 2d8032b54c add patterns to the addi/addis/mulli etc instructions. Define predicates
for matching signed 16-bit and shifted 16-bit ppc immediates

llvm-svn: 23267
2005-09-08 17:33:10 +00:00
Chris Lattner cf9b0e6673 Add patterns for some new instructions, allowing the use of the ineg fragment.
llvm-svn: 23266
2005-09-08 17:01:54 +00:00
Chris Lattner 7c8dba750c ignore generated files
llvm-svn: 23263
2005-09-07 23:47:44 +00:00
Chris Lattner 498915dafa Remove some cases handled by the generated portion of the isel
llvm-svn: 23262
2005-09-07 23:45:15 +00:00
Nate Begeman 2cc2c9a79c Another round of dag combiner changes. This fixes some missing XOR folds
as well as fixing how we replace old values with new values.

llvm-svn: 23260
2005-09-07 23:25:52 +00:00
Chris Lattner 5d16dbd5bb Fix a bug that Tzu-Chien Chiu noticed: live interval analysis does NOT
preserve livevar

llvm-svn: 23259
2005-09-07 17:34:39 +00:00
Nate Begeman 6791d63e55 Implement a common missing fold, (add (add x, c1), c2) -> (add x, c1+c2).
This restores all of stanford to being identical with and without the dag
combiner with the add folding turned off in sd.cpp.

llvm-svn: 23258
2005-09-07 16:09:19 +00:00
Chris Lattner 5ea0ee7b19 On non-apple systems, when using -march=ppc32, do not print:
'' is not a recognized processor for this target (ignoring processor)

Default to "generic" instead of "" for the default CPU.

llvm-svn: 23257
2005-09-07 05:45:33 +00:00
Chris Lattner 8d0e9d90aa Print:
'' is not a recognized processor for this target (ignoring processor)

instead of:

 is not a recognized processor for this target (ignoring processor)

llvm-svn: 23256
2005-09-07 05:44:14 +00:00
Chris Lattner fe883adfd2 Fix a bug nate ran into with replacealluseswith. In the recursive cse case,
we were losing a node, causing an assertion to fail.  Now we eagerly delete
discovered CSE's, and provide an optional vector to keep track of these
discovered equivalences.

llvm-svn: 23255
2005-09-07 05:37:01 +00:00
Nate Begeman 007c650699 Add an option to the DAG Combiner to enable it for beta runs, and turn on
that option for PowerPC's beta.

llvm-svn: 23253
2005-09-07 00:15:36 +00:00
Nate Begeman 6095214bf0 Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.

This speeds up 189.lucas from 81.99 to 32.64 seconds.

llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Andrew Lenharth a63a066205 Fix up the AssertXext problem, as well as adding it at calls
llvm-svn: 23246
2005-09-06 17:00:23 +00:00
Nate Begeman e9e2c6d314 Add note about future optimization noted in the ppc compiler writer's guide
llvm-svn: 23245
2005-09-06 15:30:48 +00:00
Nate Begeman 2dded8302a Add accessor for 64bit flag, so that we can tell when it is safe to
generate the fun in-register fp<->long instructions.

llvm-svn: 23244
2005-09-06 15:30:12 +00:00
Nate Begeman d23739d020 Next round of DAGCombiner changes. This version now passes all the tests
I have run so far when run before Legalize.  It still needs to pick up the
SetCC folds, and nodes that use SetCC.

llvm-svn: 23243
2005-09-06 04:43:02 +00:00
Andrew Lenharth c8bd5bda59 revert part of the last change, should fix regressions
llvm-svn: 23241
2005-09-04 06:12:19 +00:00
Chris Lattner aa833d4571 explicitly specify an operands list for patterns with inputs (e.g. neg)
llvm-svn: 23240
2005-09-03 01:28:40 +00:00
Chris Lattner 8ae9525bd0 include the dag isel fragment
llvm-svn: 23239
2005-09-03 01:17:22 +00:00
Chris Lattner 0442dcfabc ask for a dag isel
llvm-svn: 23238
2005-09-03 01:15:41 +00:00
Chris Lattner 821628ff2a Fix a checking failure in gs
llvm-svn: 23235
2005-09-03 01:04:40 +00:00
Chris Lattner 5f12cf14be Change the isel to not break out of the big giant switch. Instead, the
switch should never be exited, so its bottom is now unreachable.

llvm-svn: 23234
2005-09-03 00:53:47 +00:00
Chris Lattner 9220f92c41 rearrange logical ops to group them together more consistently.
Define the PatFrag class which can be used to define subpatterns to match
things with.  Define 'not', and use it to define the patterns for andc,
nand, etc.

llvm-svn: 23233
2005-09-03 00:21:51 +00:00
Chris Lattner dcbb561b76 Add AND/OR/XOR
llvm-svn: 23232
2005-09-02 22:35:53 +00:00
Nate Begeman 7cea6ef16e Next round of DAG Combiner changes. Just need to support multiple return
values, and then we should be able to hook it up.

llvm-svn: 23231
2005-09-02 21:18:40 +00:00
Chris Lattner 3a1002d529 Add some initial patterns to simple binary instructions, though they
currently don't do anything.  This elides patterns for binary operators
that ping on the carry flag, since we don't model it yet.

This patch also removes PPC::SUB, because it is dead.

llvm-svn: 23230
2005-09-02 21:18:00 +00:00
Chris Lattner 1a570f1fe4 Clean up some code from the last checkin
llvm-svn: 23229
2005-09-02 20:32:45 +00:00
Chris Lattner 630226697f Fix a bug in legalize where it would emit two calls to libcalls that return
i64 values on targets that need that expanded to 32-bit registers.  This fixes
PowerPC/2005-09-02-LegalizeDuplicatesCalls.ll and speeds up 189.lucas from
taking 122.72s to 81.96s on my desktop.

llvm-svn: 23228
2005-09-02 20:26:58 +00:00
Chris Lattner 06e237f253 turn on dag isel by default
llvm-svn: 23226
2005-09-02 19:53:54 +00:00
Chris Lattner b95b280bee Make sure to auto-cse nullary ops
llvm-svn: 23224
2005-09-02 19:36:17 +00:00
Jim Laskey 27d628dfc9 Add help support for -mcpu and -mattr.
llvm-svn: 23222
2005-09-02 19:27:43 +00:00
Chris Lattner 1e89e36dcd Fix some buggy logic where we would try to remove nodes with two operands
from the binary ops map, even if they had multiple results.  This latent bug
caused a few failures with the dag isel last night.

To prevent stuff like this from happening in the future, add some really
strict checking to make sure that the CSE maps always match up with reality!

llvm-svn: 23221
2005-09-02 19:15:44 +00:00
Andrew Lenharth 9690a4f321 Pull out Lowering in preperation for multiple ISels. Oh, and get rid of some stuff
llvm-svn: 23220
2005-09-02 18:46:02 +00:00
Chris Lattner b0b4ec5655 Don't create zero sized stack objects even for array allocas with a zero
number of elements.

llvm-svn: 23219
2005-09-02 18:41:28 +00:00
Chris Lattner aa3b1fcc58 Decouple fsqrt from gpul optimizations, implementing fsqrt.ll.
Remove the -enable-gpopt option which is subsumed by feature flags.

llvm-svn: 23218
2005-09-02 18:33:05 +00:00
Chris Lattner b6cde17d29 Fix the release build, noticed by Eric van Riet Paap
llvm-svn: 23215
2005-09-02 07:09:28 +00:00
Chris Lattner b5e381a8cf Fix a problem that Dan Berlin noticed, where reassociation would not succeed
in building maximal expressions before simplifying them.  In particular, i
cases like this:

X-(A+B+X)

the code would consider A+B+X to be a maximal expression (not understanding
that the single use '-' would be turned into a + later), simplify it (a noop)
then later get simplified again.

Each of these simplify steps is where the cost of reassociation comes from,
so this patch should speed up the already fast pass a bit.

Thanks to Dan for noticing this!

llvm-svn: 23214
2005-09-02 07:07:58 +00:00
Chris Lattner 9fe263aa75 Avoid creating garbage instructions, just move the old add instruction
to where we need it when converting -(A+B+C) -> -A + -B + -C.

llvm-svn: 23213
2005-09-02 06:38:04 +00:00
Chris Lattner d1325da091 add some assertions and fix problems where reassociate could access the
Ops vector out of range

llvm-svn: 23211
2005-09-02 05:23:22 +00:00
Jeff Cohen a6dde9962d Fix VC++ build errors
llvm-svn: 23210
2005-09-02 02:51:42 +00:00
Chris Lattner 763a3a0fa7 Restore this patch now that the latent bug has been fixed
llvm-svn: 23209
2005-09-02 01:24:55 +00:00
Chris Lattner d9af1aab51 Make sure to legalize assert[zs]ext's operand correctly
llvm-svn: 23208
2005-09-02 01:15:01 +00:00
Chris Lattner 06d440f2ee Revert the previous patch which causes a mysterious regression in toast.
llvm-svn: 23207
2005-09-02 00:47:05 +00:00
Chris Lattner 7138f91424 Teach live intervals to not crash on dead livein regs
llvm-svn: 23206
2005-09-02 00:20:32 +00:00
Chris Lattner a66403dbf7 For values that are live across basic blocks and need promotion, use ANY_EXTEND
instead of ZERO_EXTEND to eliminate extraneous extensions.  This eliminates
dead zero extensions on formal arguments and other cases on PPC, implementing
the newly tightened up test/Regression/CodeGen/PowerPC/small-arguments.ll test.

llvm-svn: 23205
2005-09-02 00:19:37 +00:00
Chris Lattner 7753f175e6 legalize ANY_EXTEND appropriately
llvm-svn: 23204
2005-09-02 00:18:10 +00:00
Chris Lattner 8c393c218b Add support for ANY_EXTEND and add a few minor folds for it
llvm-svn: 23203
2005-09-02 00:17:32 +00:00
Chris Lattner 210975cfbb Handle any_extend like zext
llvm-svn: 23202
2005-09-02 00:16:09 +00:00
Chris Lattner 2493f0e5fd Handle ANY_EXTEND like ZERO_EXTEND. Simplify the extend/truncate code on
the observation that it only has to handle i1 -> i64 and i64 -> i1.

llvm-svn: 23201
2005-09-02 00:15:30 +00:00
Chris Lattner 9ee867b93b Implement small-arguments.ll:test3 by teaching the DAG optimizer that
the results of calls to functions returning small values are properly
sign/zero extended.

llvm-svn: 23198
2005-09-01 23:44:32 +00:00
Nate Begeman d78d975437 Fix some code in the current node combining code, spotted when it was moved
over to DAGCombiner.cpp

1. Don't assume that SetCC returns i1 when folding (xor (setcc) constant)
2. Don't duplicate code in folding AND with AssertZext that is handled by
   MaskedValueIsZero

llvm-svn: 23196
2005-09-01 23:25:49 +00:00
Nate Begeman 2504fe2613 Implement first round of feedback from chris (there's still a couple things
left to do).

llvm-svn: 23195
2005-09-01 23:24:04 +00:00
Chris Lattner 68d15fdfea Align functions to 16-byte boundaries, to eliminate noise in performance measurements. This improves the performance of 'treeadd' by about 20% with the dag
isel, restoring it to the pattern-isel level (which happens to get the alignment right).

llvm-svn: 23194
2005-09-01 23:08:50 +00:00
Chris Lattner e40a3ccd60 Local labels on darwin apparently start with just 'L', not .L like other
platforms.  This reduces executable size and makes shark realize the actual
bounds of functions instead of showing each MBB as a function :)

llvm-svn: 23193
2005-09-01 21:48:35 +00:00
Jim Laskey 19058c3989 1. Use SubtargetFeatures in llc/lli.
2. Propagate feature "string" to all targets.

3. Implement use of SubtargetFeatures in PowerPCTargetSubtarget.

llvm-svn: 23192
2005-09-01 21:38:21 +00:00
Jim Laskey 3fee6a51a9 This new class provides support for platform specific "features". The intent
is to manage processor specific attributes from the command line.  See examples
of use in llc/lli and PowerPCTargetSubtarget.

llvm-svn: 23191
2005-09-01 21:36:18 +00:00
Chris Lattner a305d28cf6 Implement dynamic allocas correctly. In particular, because we were copying
directly out of R1 (without using a CopyFromReg, which uses a chain), multiple
allocas were getting CSE'd together, producing bogus code.  For this:

int %foo(bool %X, int %A, int %B) {
        br bool %X, label %T, label %F
F:
        %G = alloca int
        %H = alloca int
        store int %A, int* %G
        store int %B, int* %H
        %R = load int* %G
        ret int %R
T:
        ret int 0
}

We were generating:

_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        li r2, 16
        subf r2, r2, r1   ;; One alloca
        or r1, r2, r2
        or r3, r1, r1
        or r1, r2, r2
        or r2, r1, r1
        stw r4, 0(r3)
        stw r5, 0(r2)
        lwz r3, 0(r3)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr

Now we generate:

_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        or r2, r1, r1
        li r3, 16
        subf r2, r3, r2  ;; Alloca 1
        or r1, r2, r2
        or r2, r1, r1
        or r6, r1, r1
        subf r3, r3, r6  ;; Alloca 2
        or r1, r3, r3
        or r3, r1, r1
        stw r4, 0(r2)
        stw r5, 0(r3)
        lwz r3, 0(r2)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr

This fixes Povray and SPASS with the dag isel, the last two failing cases.
Tommorow we will hopefully turn it on by default! :)

llvm-svn: 23190
2005-09-01 21:31:30 +00:00
Chris Lattner 293b3a68e0 Fix a bug where we were useing HA to get the high part, which seems like it
could cause a miscompile.  Fixing this didn't fix the two programs that fail
though.  :(

This also changes the implementation to follow the pattern selector more
closely, causing us to select 0 to li instead of lis.

llvm-svn: 23189
2005-09-01 19:38:28 +00:00
Chris Lattner 34182aff7f Do not select the operands being passed into SelectCC. IT does this itself
and selecting early prevents folding immediates into the cmpw* instructions

llvm-svn: 23188
2005-09-01 19:20:44 +00:00
Chris Lattner 975f5c9f46 It is NDEBUG not _NDEBUG
llvm-svn: 23186
2005-09-01 18:44:10 +00:00
Nate Begeman e8f78d1aab Add the rest of the currently implemented visit routines to the switch
statement in visit().

llvm-svn: 23185
2005-09-01 00:33:32 +00:00
Nate Begeman 21158fc485 First pass at the DAG Combiner. It isn't used anywhere yet, but it should
be mostly functional.  It currently has all folds from SelectionDAG.cpp
that do not involve a condition code.

llvm-svn: 23184
2005-09-01 00:19:25 +00:00
Chris Lattner d4d10fff99 If a function has live ins/outs, print them
llvm-svn: 23181
2005-08-31 22:34:59 +00:00
Chris Lattner da2e04c69d Move FCTIWZ handling out of the instruction selectors and into legalization,
getting them out of the business of making stack slots.

llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner 6bad1fb19e Remove dead code
llvm-svn: 23179
2005-08-31 20:25:15 +00:00
Chris Lattner e675a08e10 Move SHL,SHR i64 -> legalizer
llvm-svn: 23178
2005-08-31 20:23:54 +00:00
Chris Lattner 3a04a4b767 Remove code that is now dead from the pattern isel.
llvm-svn: 23177
2005-08-31 19:11:36 +00:00
Chris Lattner 2f03896a0f lower sra_parts on the dag, implementing it for the dag isel, and exposing
the ops to dag optimization.

llvm-svn: 23176
2005-08-31 19:09:57 +00:00
Chris Lattner 8a1a5f2818 Allow targets to custom expand shifts that are too large for their registers
llvm-svn: 23173
2005-08-31 19:01:53 +00:00
Chris Lattner 2bd2af8ecd add assert zext/sext to the dag isel
llvm-svn: 23171
2005-08-31 18:08:46 +00:00
Chris Lattner 46ff6aa993 Handle AssertSext/AssertZext nodes, fixing the regressions last night.
llvm-svn: 23170
2005-08-31 17:48:04 +00:00
Jeff Cohen d8c84e3c7e Fix VC++ precedence warnings
llvm-svn: 23169
2005-08-31 02:47:06 +00:00
Nate Begeman e3287b85b7 Enable generation of AssertSext and AssertZext in the PPC backend.
llvm-svn: 23168
2005-08-31 01:58:39 +00:00
Chris Lattner f4d594370b Fix 'ret long' to return the high and lo parts in the right registers. This
fixes crafty and probably others.

llvm-svn: 23167
2005-08-31 01:34:29 +00:00
Nate Begeman 539e7c892c Sigh, not my day. Fix typo.
llvm-svn: 23166
2005-08-31 00:43:49 +00:00
Nate Begeman d513d8a662 Fix a mistake in my previous patch pointed out by sabre; the AssertZext
case in MaskedValueIsZero was wrong.

llvm-svn: 23165
2005-08-31 00:43:08 +00:00
Nate Begeman e07bc28cca Remove some unnecessary casts, and add the AssertZext case to
MaskedValueIsZero.

llvm-svn: 23164
2005-08-31 00:27:53 +00:00
Chris Lattner 69e9a9a94c now that physregs can exist in the same dag with multiple types, remove some
ugly hacks

llvm-svn: 23162
2005-08-30 22:59:48 +00:00
Chris Lattner 5764da422a Allow physregs to occur in the dag with multiple types. Though I don't likethis, it is a requirement on PPC, which can have an f32 value in r3 at onepoint in a function and a f64 value in r3 at another point. :(
This fixes compilation of mesa

llvm-svn: 23161
2005-08-30 22:38:38 +00:00
Chris Lattner 8f8d539746 Fix type mismatches when passing f32 values to calls
llvm-svn: 23159
2005-08-30 21:28:19 +00:00
Chris Lattner 4d602bed10 When checking the fixed intervals, don't forget to check for register aliases.
This fixes PR621 and Regression/CodeGen/X86/2005-08-30-RegAllocAliasProblem.ll

llvm-svn: 23158
2005-08-30 21:03:36 +00:00
Chris Lattner 9f23ae226f Fix some indentation (first hunks).
Remove code (last hunk) that miscompiled immediate and's, such as
  and uint %tmp.30, 4294958079

into

 andi. r8, r8, 56319
 andis. r8, r8, 65535

instead of:

 li r9, -9217
 and r8, r8, r9

The first always generates zero.

This fixes espresso.

llvm-svn: 23155
2005-08-30 18:37:48 +00:00
Chris Lattner 6a41fd75cd Fix a problem Nate found where we swapped the operands of SHL/SHR_PARTS. This
fixes fourinarow

llvm-svn: 23153
2005-08-30 17:42:59 +00:00
Chris Lattner bdf3d3defb codegen ADD_PARTS correctly: put the results in the right registers! This
fixes fhourstones

llvm-svn: 23152
2005-08-30 17:40:13 +00:00
Chris Lattner 61d21b1f3c Fix FreeBench/fourinarow with the dag isel, by not adding a bogus result
to SHIFT_PARTS nodes

llvm-svn: 23151
2005-08-30 17:21:17 +00:00
Chris Lattner 45706e9fb8 add operands in the right order, fixing McCat/18-imp with the dag isel
llvm-svn: 23150
2005-08-30 17:13:58 +00:00
Chris Lattner 9a4ad487f0 Fix a miscompile of PtrDist/bc. Sign extending bools is not the right thing,
at least tends to expose problems elsewhere.

llvm-svn: 23149
2005-08-30 16:56:19 +00:00
Nate Begeman a3da8c4819 Remove a bogus piece of my AssertSext/AssertZext patch. oops.
llvm-svn: 23148
2005-08-30 02:54:28 +00:00
Nate Begeman 43144a2fe0 Add support for AssertSext and AssertZext, folding other extensions with
them.  This allows for elminination of redundant extends in the entry
blocks of functions on PowerPC.

Add support for i32 x i32 -> i64 multiplies, by recognizing when the inputs
to ISD::MUL in ExpandOp are actually just extended i32 values and not real
i64 values.  this allows us to codegen

int mulhs(int a, int b) { return ((long long)a * b) >> 32; }
as:
_mulhs:
        mulhw r3, r4, r3
        blr

instead of:
_mulhs:
        mulhwu r2, r4, r3
        srawi r5, r3, 31
        mullw r5, r4, r5
        add r2, r2, r5
        srawi r4, r4, 31
        mullw r3, r4, r3
        add r3, r2, r3
        blr

with a similar improvement on x86.

llvm-svn: 23147
2005-08-30 02:44:00 +00:00
Chris Lattner 08a1e38730 Name this variable to be what it really is!
llvm-svn: 23145
2005-08-30 01:58:51 +00:00
Chris Lattner 04cb82278a Handle CopyToReg nodes with flag operands correctly
llvm-svn: 23144
2005-08-30 01:57:23 +00:00
Chris Lattner 7a59b1cf90 Make sure the selector emits register register copies with flag operands
linking them to calls when appropriate, this prevents the scheduler from
pulling these copies away from the call.

This fixes Ptrdist/yacr2

llvm-svn: 23143
2005-08-30 01:57:02 +00:00
Chris Lattner e413b60632 The first operand to AND does not always have more than two operands. This
fixes MediaBench/toast with the dag selector

llvm-svn: 23141
2005-08-30 00:59:16 +00:00
Chris Lattner e75b5e63a7 Fix a bug in my patch for legalizing to fsel. It cannot handle seteq/setne,
which I failed to include when I moved the code over.  This fixes
MallocBench/gs.

llvm-svn: 23140
2005-08-30 00:45:18 +00:00
Chris Lattner 61f7c3e843 emit FMR instructions to convert f64<->f32 instructions, so things like
STOREs, know the right type to store.

llvm-svn: 23139
2005-08-30 00:30:43 +00:00
Chris Lattner 62b9a5d1f8 Fix some really strange indentation that xcode likes to use.
no xcode, this is not right:

   if (!foo) break;
     X;

llvm-svn: 23138
2005-08-30 00:19:00 +00:00
Chris Lattner 12357281b8 fix a crash in cfrac
llvm-svn: 23137
2005-08-29 23:49:25 +00:00
Chris Lattner 1cbbe1015a Implement DYNAMIC_STACKALLOC, wrap some long lines
llvm-svn: 23136
2005-08-29 23:30:11 +00:00
Chris Lattner f7e5ec84c6 Add a hack to avoid some horrible code in some cases by always emitting
token chains first.  For this C function:

int test() {
  int i;
  for (i = 0; i < 100000; ++i)
    foo();
}

Instead of emitting this (condition before call)

.LBB_test_1:    ; no_exit
        addi r30, r30, 1
        lis r2, 1
        ori r2, r2, 34464
        cmpw cr2, r30, r2
        bl L_foo$stub
        bne cr2, .LBB_test_1    ; no_exit

Emit this:

.LBB_test_1:    ; no_exit
        bl L_foo$stub
        addi r30, r30, 1
        lis r2, 1
        ori r2, r2, 34464
        cmpw cr0, r30, r2
        bne cr0, .LBB_test_1    ; no_exit

Which makes it so we don't have to save/restore cr2 in the prolog/epilog of
the function.

This also makes the code much more similar to what the pattern isel produces.

llvm-svn: 23135
2005-08-29 23:21:29 +00:00
Chris Lattner b2b418509b Fix a dumb bug of mine where we were mishandling the PPC ABI (undef handling).
This fixes voronoi and bh in Olden, allowing all of olden to pass!

llvm-svn: 23133
2005-08-29 22:22:57 +00:00
Chris Lattner c738d000d5 Add a new API for Nate
llvm-svn: 23131
2005-08-29 21:59:31 +00:00
Andrew Lenharth 835cbb364d Some of us cared about the the promote path
llvm-svn: 23130
2005-08-29 20:46:51 +00:00
Chris Lattner dcde1b2b6a Fix an infinite loop on x86
llvm-svn: 23129
2005-08-29 17:30:00 +00:00
Chris Lattner 1a1ecf0679 Allow bugpoint+PPC codegen to use fsqrt
llvm-svn: 23128
2005-08-29 13:14:24 +00:00
Chris Lattner c429ab2fb1 Fix a bug the last patch exposed in treeadd among others
llvm-svn: 23127
2005-08-29 01:07:02 +00:00
Chris Lattner d4d683a47b A hack to fix a problem folding immedaites. This fixes Olden/power.
llvm-svn: 23126
2005-08-29 01:01:01 +00:00
Chris Lattner 3ccad3fb8c Fix order of operands for copytoreg node when emitting calls. This fixes
Olden/msFix order of operands for copytoreg node when emitting calls.  This fixes
Olden/mstt.

llvm-svn: 23125
2005-08-29 00:26:57 +00:00
Chris Lattner 46d4c75cd1 Fix a bug in my previous patch that was using the wrong iterator. This fixes
Olden/bisort among others.

llvm-svn: 23124
2005-08-29 00:10:46 +00:00
Chris Lattner 66ddc8d3bf add operands in the correct order
llvm-svn: 23123
2005-08-29 00:02:01 +00:00
Chris Lattner 87421c8658 Fix a bug in ReplaceAllUsesWith
llvm-svn: 23122
2005-08-28 23:59:36 +00:00
Chris Lattner dfcde88d07 Fix a bug in FP_EXTEND, implement FP_TO_SINT
llvm-svn: 23121
2005-08-28 23:59:09 +00:00
Chris Lattner 38660c6666 fix an assertion failure in treeadd
llvm-svn: 23120
2005-08-28 23:39:22 +00:00
Reid Spencer aa7fbca285 Adjust to member variable name change.
llvm-svn: 23119
2005-08-27 19:09:48 +00:00
Reid Spencer 85d93a3ec9 Change the names of member variables per Chris' instructions, and document
them more clearly.

llvm-svn: 23118
2005-08-27 19:09:02 +00:00
Reid Spencer dfb3fb4a25 Implement PR614:
These changes modify the makefiles so that the output of flex and bison are
placed in the SRC directory, not the OBJ directory. It is intended that they
be checked in as any other LLVM source so that platforms without convenient
access to flex/bison can be compiled. From now on, if you change a .y or
.l file you *must* also commit the generated .cpp and .h files.

llvm-svn: 23115
2005-08-27 18:50:39 +00:00
Chris Lattner 075250bda1 Disable this code, which broke many tests last night
llvm-svn: 23114
2005-08-27 16:16:51 +00:00
Chris Lattner 5ee85e89b6 fix PHI node emission for basic blocks that have select_cc's in them on ppc32
llvm-svn: 23113
2005-08-27 00:58:02 +00:00
Chris Lattner 787e962795 The condition register being branched on may not be cr0, as such, print it.
This fixes: UnitTests/2005-07-17-INT-To-FP.c

llvm-svn: 23112
2005-08-26 23:42:05 +00:00
Chris Lattner 29bfaa7ef0 Propagate cr# from COND_BRANCH to the actual branch instruction as appropriate
llvm-svn: 23111
2005-08-26 23:41:27 +00:00
Chris Lattner 56ca46ee04 Nate noticed that Andrew never did this. This fixes PR600
llvm-svn: 23110
2005-08-26 22:50:40 +00:00
Chris Lattner e7a2998064 Don't copy regs that are only used in the entry block into a vreg. This
changes the code generated for:

short %test(short %A) {
  %B = xor short %A, -32768
  ret short %B
}

to:

_test:
        xori r2, r3, 32768
        xoris r2, r2, 65535
        extsh r3, r2
        blr

instead of:

_test:
        rlwinm r2, r3, 0, 16, 31
        xori r2, r3, 32768
        xoris r2, r2, 65535
        extsh r3, r2
        blr

llvm-svn: 23109
2005-08-26 22:49:59 +00:00
Chris Lattner d4f43f7967 Make this code safe for when loadRegFromStackSlot inserts multiple instructions.
llvm-svn: 23108
2005-08-26 22:18:32 +00:00
Chris Lattner 422e23dd02 allow code using mtcrf to assemble
llvm-svn: 23107
2005-08-26 22:05:54 +00:00
Nate Begeman 72f23815bc Remove operand type 'crbit', since it is no longer used
llvm-svn: 23106
2005-08-26 22:04:17 +00:00
Chris Lattner c3d1bdd0a9 teach getClass what a condition reg is
llvm-svn: 23105
2005-08-26 21:51:29 +00:00
Chris Lattner 97345405a6 Minor cleanups:
* avoid calling getClass() multiple times (it is relatively expensive)
  * Allow -disable-fp-elim to turn of frame pointer elimination.

llvm-svn: 23104
2005-08-26 21:49:18 +00:00
Chris Lattner 4a5ebe94ba Checking types here is not safe, because multiple types can map to the same
register class.

llvm-svn: 23103
2005-08-26 21:39:15 +00:00
Chris Lattner 9b577f108a implement SELECT_CC fully for the DAG->DAG isel!
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner c6a0338c04 spell this right
llvm-svn: 23099
2005-08-26 20:55:40 +00:00
Chris Lattner 13d7c252e5 Call the InsertAtEndOfBasicBlock hook if the usesCustomDAGSchedInserter
flag is set on an instruction.

llvm-svn: 23098
2005-08-26 20:54:47 +00:00
Chris Lattner 0081dfa91e Add a flag
llvm-svn: 23092
2005-08-26 20:29:01 +00:00
Chris Lattner b2854fadda Make fsel emission work with both the pattern and dag-dag selectors, by
giving it a non-instruction opcode.  The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.

llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner bec817ce6f implement the fold for:
bool %test(int %X, int %Y) {
        %C = setne int %X, 0
        ret bool %C
}

to:

_test:
        addic r2, r3, -1
        subfe r3, r2, r3
        blr

llvm-svn: 23089
2005-08-26 18:46:49 +00:00
Chris Lattner a9e6a82d66 Changes to adjust to new ReplaceAllUsesWith syntax. Change FP_EXTEND to
just return its input, instead of emitting an explicit copy.

llvm-svn: 23088
2005-08-26 18:37:23 +00:00
Chris Lattner 373f048a79 Revampt ReplaceAllUsesWith to be more efficient and easier to use.
llvm-svn: 23087
2005-08-26 18:36:28 +00:00
Nate Begeman 76eea9a480 Remove some code made dead by the fsel patch
llvm-svn: 23085
2005-08-26 17:45:06 +00:00
Chris Lattner c75e047245 now that fsel is formed during legalization, this code is dead
llvm-svn: 23084
2005-08-26 17:40:39 +00:00
Chris Lattner 7f1fa8eaef implement the other half of the select_cc -> fsel lowering, which handles
when the RHS of the comparison is 0.0.  Turn this on by default.

llvm-svn: 23083
2005-08-26 17:36:52 +00:00
Chris Lattner d0dc6f4299 Fix a bug in my previous checkin
llvm-svn: 23082
2005-08-26 17:18:44 +00:00
Chris Lattner c30405e0ee Change ConstantPoolSDNode to actually hold the Constant itself instead of
putting it into the constant pool.  This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.

llvm-svn: 23081
2005-08-26 17:15:30 +00:00
Chris Lattner 7bbdae53d6 Fix some warnings in an optimized build
llvm-svn: 23080
2005-08-26 16:38:51 +00:00
Chris Lattner 2091a36631 Fix a huge annoyance: SelectNodeTo took types before the opcode unlike
every other SD API.  Fix it to take the opcode before the types.

llvm-svn: 23079
2005-08-26 16:36:26 +00:00
Nate Begeman 7b809f593b Fix JIT encoding of conditional branches
llvm-svn: 23076
2005-08-26 04:11:42 +00:00
Chris Lattner f3d06c6417 add initial support for converting select_cc -> fsel in the legalizer
instead of in the backend.  This currently handles fsel cases with registers,
but doesn't have the 0.0 and -0.0 optimization enabled yet.

Once this is finished, special hack for fp immediates can go away.

llvm-svn: 23075
2005-08-26 00:52:45 +00:00
Chris Lattner c6d481db7a the 5th operand is the 4th number
llvm-svn: 23074
2005-08-26 00:43:46 +00:00
Nate Begeman 89093ca62a SUBFIC produces two results, not one.
llvm-svn: 23073
2005-08-26 00:34:06 +00:00
Nate Begeman bed4f2b982 Implement SHL_PARTS and SRL_PARTS
llvm-svn: 23072
2005-08-26 00:28:00 +00:00
Chris Lattner 5f573416cd Add support for targets that want to custom expand select_cc in some cases.
llvm-svn: 23071
2005-08-26 00:23:59 +00:00
Chris Lattner dff50cadaa Allow LowerOperation to return a null SDOperand in case it wants to lower
some things given to it, but not all.

llvm-svn: 23070
2005-08-26 00:14:16 +00:00
Chris Lattner 1cb550c603 Fix a nasty bug from a previous patch of mine
llvm-svn: 23069
2005-08-26 00:13:12 +00:00
Chris Lattner b81431b012 Emit the lo/hi parts in the right order :)
llvm-svn: 23068
2005-08-25 23:36:49 +00:00
Chris Lattner 02884fe41c implement support for 64-bit add/sub, fix a broken assertion for 64-bit
return.  Allow the udiv breaker-upper to work with any non-zero constant
operand.

llvm-svn: 23066
2005-08-25 23:21:06 +00:00
Chris Lattner abbd8ea048 simplify the add/sub_parts code
llvm-svn: 23065
2005-08-25 23:19:58 +00:00
Chris Lattner 6e184f2b3d Finish implementing SDIV/UDIV by copying over the majik constant code from
ISelPattern

llvm-svn: 23062
2005-08-25 22:04:30 +00:00
Chris Lattner 717f97a5c8 Simplify some code. It's not clear why the UDIV expanded sequence
doesn't work for large uint constants, but we'll keep the current behavior

llvm-svn: 23061
2005-08-25 22:03:50 +00:00
Chris Lattner b746dd1cf6 Implement setcc correctly for G5 and non-G5 systems
llvm-svn: 23060
2005-08-25 21:39:42 +00:00
Chris Lattner 3dcd75bc54 implement setcc on the G5. We're still missing the non-g5 specific bits, but
they will come later.

llvm-svn: 23059
2005-08-25 20:08:18 +00:00
Nate Begeman 33840c3268 New fold for SELECT_CC
llvm-svn: 23058
2005-08-25 20:04:38 +00:00
Nate Begeman 65ffd8fbf4 Remove option to make SetCC illegal on PowerPC after long discussion with
Chris.  This will be accomplished through correctly modeling CR's and
subregs.

llvm-svn: 23056
2005-08-25 20:01:10 +00:00
Chris Lattner f9c19157df Don't auto-cse nodes that return flags
llvm-svn: 23055
2005-08-25 19:12:10 +00:00
Chris Lattner 12756be53b add printer support for flag operands
llvm-svn: 23054
2005-08-25 17:59:23 +00:00
Chris Lattner 9d28a56d55 simplify the code a bit using isOperationLegal
llvm-svn: 23053
2005-08-25 17:54:58 +00:00
Chris Lattner dc66457022 Add support for sdiv by 2^k and -2^k. Producing code like:
_test:
        srawi r2, r3, 2
        addze r3, r2
        blr

llvm-svn: 23052
2005-08-25 17:50:06 +00:00
Chris Lattner 4bd2aab6c1 fit in 80 cols
llvm-svn: 23051
2005-08-25 17:49:31 +00:00
Chris Lattner 8a93f64efa Add support for flag operands
llvm-svn: 23050
2005-08-25 17:48:54 +00:00
Chris Lattner d24ad52efa add an enum value
llvm-svn: 23048
2005-08-25 17:07:09 +00:00
Chris Lattner 25db699671 Implement support for taking the address of constant pool indices, which
is used by the int -> FP code among other things.  This gets
2005-05-12-Int64ToFP past that failure, to dying on lack of support for add_parts

llvm-svn: 23042
2005-08-25 05:04:11 +00:00
Chris Lattner 407c6415b4 ADd support for TargetConstantPool nodes
llvm-svn: 23041
2005-08-25 05:03:06 +00:00
Chris Lattner 666512c832 Add support for FP constants, fixing UnitTests/2004-02-02-NegativeZero
llvm-svn: 23038
2005-08-25 04:47:18 +00:00
Chris Lattner e4c338d0d8 Fully implement frame index, so that we can pass the address of alloca's
around to functions and stuff

llvm-svn: 23036
2005-08-25 00:45:43 +00:00
Chris Lattner bbe0e7df2c add a new TargetFrameIndex node
llvm-svn: 23035
2005-08-25 00:43:01 +00:00
Chris Lattner 66a6a13225 implement unconditional branches, fixing UnitTests/2003-05-02-DependentPHI.c
llvm-svn: 23034
2005-08-25 00:29:58 +00:00
Chris Lattner 4ae278a760 LFS/STFS load and store FP values, not integer ones. This change allows us
to codegen this: float foo() { return 1.245; }

into this:

_foo:
        lis r2, ha16(.CPI_foo_0)
        lfs f1, lo16(.CPI_foo_0)(r2)
        blr

instead of this:

_foo:
        lis r2, ha16(.CPI_foo_0)
        lfs r2, lo16(.CPI_foo_0)(r2)   <-- ouch
        or f1, r2, r2                  <-- ouch
        blr

with the dag isel.

llvm-svn: 23033
2005-08-25 00:26:22 +00:00
Chris Lattner 794eb6684d Fix a broken assertion
llvm-svn: 23032
2005-08-25 00:19:12 +00:00
Chris Lattner c146940f0d Fix a warning
llvm-svn: 23031
2005-08-25 00:05:15 +00:00
Chris Lattner daae1e10f7 fix a warning in optimized build
llvm-svn: 23030
2005-08-25 00:03:21 +00:00
Chris Lattner 751c6c3944 Fix some warnings
llvm-svn: 23029
2005-08-25 00:00:26 +00:00
Chris Lattner a3fbdae515 Split IMPLICIT_DEF into IMPLICIT_DEF_GPR and IMPLICIT_DEF_FP, so that the
instructions take a consistent reg class.  Implement ISD::UNDEF in the dag->dag
selector to generate this, fixing UnitTests/2003-07-06-IntOverflow.

llvm-svn: 23028
2005-08-24 23:08:16 +00:00
Chris Lattner 45e1ce4e28 add a method
llvm-svn: 23027
2005-08-24 23:00:29 +00:00
Chris Lattner d83cd354bd implement support for calls
llvm-svn: 23026
2005-08-24 22:45:17 +00:00
Chris Lattner d7ee4d8671 Add ReplaceAllUsesWith that can take a vector of replacement values.
Add some foldings to hopefully help the illegal setcc issue, and move some code around.

llvm-svn: 23025
2005-08-24 22:44:39 +00:00
Chris Lattner 1fc2a7f006 Remove some dead cases.
Emit the indcall sequence as:

mtctr inreg
mr R12, inreg
btctr

If inreg and R12 aren't coallesced, this reduces the odds of having the mtctr
and btctr in the same dispatch group.  :)

llvm-svn: 23023
2005-08-24 22:21:47 +00:00
Chris Lattner ad9565dfbe Add support for external symbols, and support for variable arity instructions
llvm-svn: 23022
2005-08-24 22:02:41 +00:00
Chris Lattner bb8cc0acb2 Fix pasto that prevented VT ndoes from showing up in -view-isel-dags correctly
llvm-svn: 23021
2005-08-24 18:30:00 +00:00
Chris Lattner 1e98a330f2 add an idea
llvm-svn: 23020
2005-08-24 18:15:24 +00:00
Chris Lattner 8ca5b2a6d2 Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll
llvm-svn: 23019
2005-08-24 17:55:32 +00:00
Chris Lattner 4201cd1bbc Transform floor((double)FLT) -> (double)floorf(FLT), implementing
Regression/Transforms/SimplifyLibCalls/floor.ll.  This triggers 19 times in
177.mesa.

llvm-svn: 23017
2005-08-24 17:22:17 +00:00
Chris Lattner 898e50ecb3 floor/ceil don't read/write memory. This allows gcse to eliminate 6 calls
in mesa.

llvm-svn: 23015
2005-08-24 16:58:56 +00:00
Chris Lattner 86b1658d58 teach selection dag mask tracking about the fact that select_cc operates like
select.  Also teach it that the bit count instructions can only set the low bits
of the result, depending on the size of the input.

This allows us to compile this:

int %eq0(int %a) {
        %tmp.1 = seteq int %a, 0                ; <bool> [#uses=1]
        %tmp.2 = cast bool %tmp.1 to int                ; <int> [#uses=1]
        ret int %tmp.2
}

To this:

_eq0:
        cntlzw r2, r3
        srwi r3, r2, 5
        blr

instead of this:

_eq0:
        cntlzw r2, r3
        rlwinm r3, r2, 27, 31, 31
        blr

when setcc is marked illegal on ppc (which restores parity to non-illegal
setcc).  Thanks to Nate for pointing this out.

llvm-svn: 23013
2005-08-24 16:46:55 +00:00
Chris Lattner f12eb4d676 Start using isOperationLegal and isTypeLegal to simplify the code
llvm-svn: 23012
2005-08-24 16:35:28 +00:00
Chris Lattner ade525491f Adjust to new interface
llvm-svn: 23010
2005-08-24 16:34:12 +00:00
Reid Spencer f85fabeb71 For PR616:
These patches make threading optional in LLVM. The configuration scripts are now
modified to accept a --disable-threads switch. If this is used, the Mutex class
will be implemented with all functions as no-op. Furthermore, linking against
libpthread will not be done. Finally, the ParallelJIT example needs libpthread
so its makefile was changed to always add -lpthread to the link line.

llvm-svn: 23003
2005-08-24 10:07:20 +00:00
Nate Begeman 7c1ba938be Whoops, fix a thinko. All cases except SETNE are now handled by the
target independent code in SelectionDAG.cpp

llvm-svn: 23002
2005-08-24 05:06:48 +00:00
Nate Begeman a1e0a2f72b Remove unused statistic
Prefer 'neg X' to 'subfic 0, X' since neg does not set XER[CA]

llvm-svn: 23001
2005-08-24 05:03:20 +00:00
Nate Begeman 6948b79b26 Add the "ppc specific" setcc-equivalent select_cc cases
Prefer 'neg X' to 'subfic 0, X' since it does not set XER[CA]

llvm-svn: 23000
2005-08-24 04:59:21 +00:00
Nate Begeman 45bbbb3f11 Teach SelectionDAG how to simplify a few more setcc-equivalent select_cc
nodes so that backends don't have to.

llvm-svn: 22999
2005-08-24 04:57:57 +00:00
Chris Lattner b6d034a841 Add callseq_begin/end support
Call stil not supported yet

llvm-svn: 22998
2005-08-24 00:47:15 +00:00
Chris Lattner 99282c7b92 Make -view-isel-dags show the dag before instruction selecting, in case
the target isel crashes due to unimplemented features like calls :)

llvm-svn: 22997
2005-08-24 00:34:29 +00:00
Nate Begeman 72eab5dd5c Fix optimization of select_cc seteq X, 0, 1, 0 -> srl (ctlz X), log2 X size
llvm-svn: 22995
2005-08-24 00:21:28 +00:00
Chris Lattner eeacce5a60 Implement LiveVariables.h change
llvm-svn: 22994
2005-08-24 00:09:33 +00:00
Chris Lattner 469652752c adjust to new live variables interface
llvm-svn: 22992
2005-08-23 23:42:17 +00:00
Chris Lattner cdc0cbbcd0 Adjust to new livevars interface
llvm-svn: 22991
2005-08-23 23:41:14 +00:00
Chris Lattner 774158239b Simplify this code by using higher-level LiveVariables methods
llvm-svn: 22989
2005-08-23 22:51:41 +00:00
Chris Lattner 7c1c6e06f3 Simplify this code by using LiveVariables::KillsRegister
llvm-svn: 22988
2005-08-23 22:49:55 +00:00
Chris Lattner 22e91cc3b5 Keep track of which registers are related to which other registers.
Use this information to avoid doing expensive interval intersections for
registers that could not possible be interesting.  This speeds up linscan
on ia64 compiling kc++ in release mode from taking 7.82s to 4.8s(!), total
itanium llc time on this program is 27.3s now.  This marginally speeds up
PPC and X86, but they appear to be limited by other parts of linscan, not
this code.

On this program, on itanium, live intervals now takes 41% of llc time.

llvm-svn: 22986
2005-08-23 22:27:31 +00:00
Chris Lattner 9c0a243ce5 Fix PR618 and Regression/CodeGen/CBackend/2005-08-23-Fmod.ll by not emitting
x%y for 'rem' on fp values.

llvm-svn: 22984
2005-08-23 20:22:50 +00:00
Chris Lattner 5e3953d761 add a note
llvm-svn: 22982
2005-08-23 06:27:59 +00:00
Nate Begeman f3ce09b36e Ack, typo
llvm-svn: 22981
2005-08-23 05:45:10 +00:00
Nate Begeman 7216ad415b Add an option to make SetCC illegal as a beta option
llvm-svn: 22979
2005-08-23 05:42:36 +00:00
Nate Begeman bf8c3939d7 Teach the SelectionDAG how to transform select_cc eq, X, 0, 1, 0 into
either seteq X, 0 or srl (ctlz X), size(X-1), depending on what's legal
for the target.

llvm-svn: 22978
2005-08-23 05:41:12 +00:00
Nate Begeman 987121a61a Teach Legalize how to turn setcc into select_cc
llvm-svn: 22977
2005-08-23 04:29:48 +00:00
Nate Begeman 06436b2b7d Remove some instructions we no longer generate
llvm-svn: 22976
2005-08-23 01:16:46 +00:00
Chris Lattner 46323cf0e2 Remove some regs that are not used.
llvm-svn: 22975
2005-08-22 22:32:13 +00:00
Chris Lattner 956820d989 Nate noticed that 30% of the malloc/frees in llc come from calls to LowercaseString
in the asmprinter.  This changes the .td files to use lower case register names,
avoiding the need to do this call.  This speeds up the asmprinter from 1.52s
to 1.06s on kc++ in a release build.

llvm-svn: 22974
2005-08-22 22:00:02 +00:00
Chris Lattner d2f2aff484 Fix a crash I introduced into the IA64 backend with my copyfromreg change.
It used to crash on any function that took float arguments.

llvm-svn: 22973
2005-08-22 21:33:11 +00:00
Chris Lattner 834a2316a3 Try to avoid scanning the fixed list. On architectures with a non-stupid
number of regs (e.g. most riscs), many functions won't need to use callee
clobbered registers.  Do a speculative check to see if we can get a free
register without processing the fixed list (which has all of these).  This
saves a lot of time on machines with lots of callee clobbered regs (e.g.
ppc and itanium, also x86).

This reduces ppc llc compile time from 184s -> 172s on kc++.  This is probably
worth FAR FAR more on itanium though.

llvm-svn: 22972
2005-08-22 20:59:30 +00:00