Commit Graph

1723 Commits

Author SHA1 Message Date
Bill Wendling 00810c39da revert r98270.
llvm-svn: 98281
2010-03-11 19:50:31 +00:00
Evan Cheng 31fe835bf2 Bad bad bug. x86 force indirect tail call address into eax when it's meant to force it into a call preserved register instead. Change it to ecx for now.
llvm-svn: 98270
2010-03-11 18:49:14 +00:00
Evan Cheng 8c4df8160e The check for coalescing a virtual register to a physical register, e.g.
cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check
for overlaps of vr's live interval with the super registers of the
physical register (ECX in this case) and let JoinIntervals() handle checking
the coalescing feasibility against the physical register (cl in this case).

llvm-svn: 98251
2010-03-11 08:20:21 +00:00
Eric Christopher 304f13c637 Have fast-isel understand llvm.objectsize. Update testcase for slightly
different codegen.

llvm-svn: 98244
2010-03-11 06:20:22 +00:00
Chris Lattner a179e4d0a8 add support, testcases, and dox for the new GHC calling
convention.  Patch by David Terei!

llvm-svn: 98212
2010-03-11 00:22:57 +00:00
Chris Lattner 4ec0b670d5 fix PR6533 by updating the br(xor) code to remember the case
when it looked past a trunc.

llvm-svn: 98203
2010-03-10 23:46:44 +00:00
Evan Cheng 72811e8714 Fix typo.
llvm-svn: 98142
2010-03-10 07:07:55 +00:00
Evan Cheng a3b6739749 Unbreak test on Linux.
llvm-svn: 98141
2010-03-10 07:07:45 +00:00
Evan Cheng 80ad113731 Enable machine cse pass.
llvm-svn: 98132
2010-03-10 03:07:41 +00:00
Chris Lattner 9889c1eb9e move .set generation out of DwarfPrinter into AsmPrinter and
MCize it.

llvm-svn: 98010
2010-03-08 23:58:37 +00:00
Chris Lattner 27a9732450 simplify EmitSectionOffset to always use .set if it is
available, the only thing this affects is that we produce
.set in one case we didn't before, which shouldn't harm
anything.  Make EmitSectionOffset call EmitDifference
instead of duplicating it.

llvm-svn: 98005
2010-03-08 23:23:25 +00:00
Evan Cheng 5967649780 Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll.
llvm-svn: 97980
2010-03-08 21:05:02 +00:00
Charles Davis 8545afe0b0 Don't emit global symbols into the (__TEXT,__ustring) section on Darwin. This
is a workaround for <rdar://problem/7672401/> (which I filed).

This let's us build Wine on Darwin, and it gets the Qt build there a little bit
further (so Doug says).

llvm-svn: 97845
2010-03-05 22:28:45 +00:00
Jakob Stoklund Olesen 2664d295cb Better handling of dead super registers in LiveVariables. We used to do this:
CALL ... %RAX<imp-def>
   ... [not using %RAX]
   %EAX = ..., %RAX<imp-use, kill>
   RET %EAX<imp-use,kill>

Now we do this:

   CALL ... %RAX<imp-def, dead>
   ... [not using %RAX]
   %EAX = ...
   RET %EAX<imp-use,kill>

By not artificially keeping %RAX alive, we lower register pressure a bit.

The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously
55, anybody can see that. Sheesh.

llvm-svn: 97838
2010-03-05 21:49:17 +00:00
Jakob Stoklund Olesen 8c5b8db5cd We don't really care about correct register liveness information after the
post-ra scheduler has run. Disable the verifier checks that late in the game.

llvm-svn: 97837
2010-03-05 21:49:13 +00:00
Jakob Stoklund Olesen b0503beff1 Avoid creating bad PHI instructions when BR is being const-folded.
llvm-svn: 97836
2010-03-05 21:49:10 +00:00
Evan Cheng 654ec2a663 Fix an oops in x86 sibcall optimization. If the ByVal callee argument is itself passed as a pointer, then it's obviously not safe to do a tail call.
llvm-svn: 97797
2010-03-05 08:38:04 +00:00
Chris Lattner 55e81eb49f Fix PR6497, a bug where we'd fold a load into an addc
node which has a flag.  That flag in turn was used by an
already-selected adde which turned into an ADC32ri8 which
used a selected load which was chained to the load we
folded.  This flag use caused us to form a cycle.  Fix
this by not ignoring chains in IsLegalToFold even in
cases where the isel thinks it can.

llvm-svn: 97791
2010-03-05 06:19:13 +00:00
Chris Lattner bfdd17a2ea cleanup
llvm-svn: 97790
2010-03-05 06:17:43 +00:00
Evan Cheng cf67ffa500 Rever 96389 and 96990. They are causing some miscompilation that I do not fully understand.
llvm-svn: 97782
2010-03-05 03:08:23 +00:00
Bill Wendling 543ce1f64a Revert r97766. It's deleting a tag.
llvm-svn: 97768
2010-03-05 00:33:59 +00:00
Bill Wendling 6517f88f25 Micro-optimization:
This code:

float floatingPointComparison(float x, float y) {
    double product = (double)x * y;
    if (product == 0.0)
        return product;
    return product - 1.0;
}

produces this:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jne             0x00000004
0000000000000016        jp              0x00000002
0000000000000018        jmp             0x00000008
000000000000001a        addsd           0x00000006(%rip),%xmm0
0000000000000022        cvtsd2ss        %xmm0,%xmm0
0000000000000026        ret

The "jne/jp/jmp" sequence can be reduced to this instead:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jp              0x00000002
0000000000000016        je              0x00000008
0000000000000018        addsd           0x00000006(%rip),%xmm0
0000000000000020        cvtsd2ss        %xmm0,%xmm0
0000000000000024        ret

for a savings of 2 bytes.

This xform can happen when we recognize that jne and jp jump to the same "true"
MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch
is the fall-through MBB.

llvm-svn: 97766
2010-03-05 00:24:26 +00:00
Jakob Stoklund Olesen af6ca23294 Fix the remaining MUL8 and DIV8 to define AX instead of AL,AH.
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.

Fix PR6489.

llvm-svn: 97742
2010-03-04 20:42:07 +00:00
Dan Gohman b8ebd408da Fix recognition of 16-bit bswap for C front-ends which emit the
clobber registers in a different order.

llvm-svn: 97741
2010-03-04 19:58:08 +00:00
Dan Gohman 2850b41412 Revert r97580; that's not the right way to fix this.
llvm-svn: 97639
2010-03-03 04:36:42 +00:00
Bill Wendling af13d82945 This test case:
long test(long x) { return (x & 123124) | 3; }

Currently compiles to:

_test:
        orl     $3, %edi
        movq    %rdi, %rax
        andq    $123127, %rax
        ret

This is because instruction and DAG combiners canonicalize

  (or (and x, C), D) -> (and (or, D), (C | D))

However, this is only profitable if (C & D) != 0. It gets in the way of the
3-addressification because the input bits are known to be zero.

llvm-svn: 97616
2010-03-03 00:35:56 +00:00
Chris Lattner dd030701bd Fix some issues in WalkChainUsers dealing with
CopyToReg/CopyFromReg/INLINEASM.  These are annoying because
they have the same opcode before an after isel.  Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.

With that done, give IsLegalToFold a new flag that causes it to
ignore chains.  This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing.  This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching it's
multiple mem operand instructions more aggressively.

I currently #if out the dead code in the X86 backend and MSP 
backend, I'll remove it for real in a follow-on patch.

The testcase changes are:
  test/CodeGen/X86/sse3.ll: we generate better code
  test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was 
      miscompiling this before, we now generate correct code
      Convert it to filecheck while I'm at it.
  test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
      folding to make anton happy. :)

llvm-svn: 97596
2010-03-02 22:20:06 +00:00
Dan Gohman d55f574589 When expanding an expression such as (A + B + C + D), sort the operands
by loop depth and emit loop-invariant subexpressions outside of loops.
This speeds up MultiSource/Applications/viterbi and others.

llvm-svn: 97580
2010-03-02 19:32:21 +00:00
Chris Lattner 35ec683b78 clean up some testcases.
llvm-svn: 97576
2010-03-02 18:56:03 +00:00
Chris Lattner 925ac71f26 Fix the xfail I added a couple of patches back. The issue
was that we weren't properly handling the case when interior
nodes of a matched pattern become dead after updating chain
and flag uses.  Now we handle this explicitly in 
UpdateChainsAndFlags.

llvm-svn: 97561
2010-03-02 07:50:03 +00:00
Chris Lattner b884fe867e Rewrite chain handling validation and input TokenFactor handling
stuff now that we don't care about emulating the old broken 
behavior of the old isel.  This eliminates the 
'CheckChainCompatible' check (along with IsChainCompatible) which
did an incorrect and inefficient scan *up* the chain nodes which
happened as the pattern was being formed and does the validation
at the end in HandleMergeInputChains when it forms a structural 
pattern.  This scans "down" the graph, which means that it is
quickly bounded by nodes already selected.  This also handles
token factors that get "trapped" in the dag.

Removing the CheckChainCompatible nodes also shrinks the 
generated tables by about 6K for X86 (down to 83K).

There are two pieces remaining before I can nuke PreprocessRMW:
1. I xfailed a test because we're now producing worse code in a 
   case that has nothing to do with the change: it turns out that
   our use of MorphNodeTo will leave dead nodes in the graph
   which (depending on how the graph is walked) end up causing
   bogus uses of chains and blocking matches.  This is really 
   bad for other reasons, so I'll fix this in a follow-up patch.

2. CheckFoldableChainNode needs to be improved to handle the TF.

llvm-svn: 97539
2010-03-02 02:22:10 +00:00
Dan Gohman 4cec543952 Fix several places to handle vector operands properly.
Based on a patch by Micah Villmow for PR6438.

llvm-svn: 97538
2010-03-02 02:14:38 +00:00
Devang Patel 3bf0571bb0 Rewrite test to test VLA using new debug info encoding scheme.
llvm-svn: 97465
2010-03-01 18:30:58 +00:00
Chris Lattner 90e7924cf0 add some random nounwinds.
llvm-svn: 97411
2010-02-28 20:36:49 +00:00
Dan Gohman 34021b7445 Don't try to replace physical registers when doing CSE.
llvm-svn: 97360
2010-02-28 01:33:43 +00:00
Dan Gohman 45e7ffc350 Add nounwinds.
llvm-svn: 97349
2010-02-27 23:53:53 +00:00
Evan Cheng 228c31f045 Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap.
llvm-svn: 97310
2010-02-27 07:36:59 +00:00
Chris Lattner f7fc2d8b86 change the scope node to include a list of children to be checked
instead of to have a chained series of scope nodes.  This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more 
reasonable to implement.

llvm-svn: 97160
2010-02-25 19:00:39 +00:00
Dan Gohman 9b80f86e5b Revert r97064. Duncan pointed out that bitcasts are defined in
terms of store and load, which means bitcasting between scalar
integer and vector has endian-specific results, which undermines
this whole approach.

llvm-svn: 97137
2010-02-25 15:20:39 +00:00
Dan Gohman a9c205cc88 Make LoopSimplify change conditional branches in loop exiting blocks
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.

Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.

llvm-svn: 97126
2010-02-25 06:57:05 +00:00
Dan Gohman 4b2b48daba Make getTypeSizeInBits work correctly for array types; it should return
the number of value bits, not the number of bits of allocation for in-memory
storage.

Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and
vectors.

Fix several places in CodeGen which compute offsets into in-memory vectors
to use TargetData information.

This fixes PR1784.

llvm-svn: 97064
2010-02-24 22:05:23 +00:00
Daniel Dunbar 4811d004be Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in
the hopes of fixing PPC bootstrap.

llvm-svn: 97040
2010-02-24 17:05:47 +00:00
Dan Gohman 3860521406 When forming SSE min and max nodes for UGE and ULE comparisons, it's
necessary to swap the operands to handle NaN and negative zero properly.

Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.

llvm-svn: 97025
2010-02-24 06:52:40 +00:00
Chris Lattner df8a8a8c6f Change the scheduler from adding nodes in allnodes order
to adding them in a determinstic order (bottom up from 
the root) based on the structure of the graph itself.

This updates tests for some random changes, interesting
bits: CodeGen/Blackfin/promote-logic.ll no longer crashes.
I have no idea why, but that's good right?

CodeGen/X86/2009-07-16-LoadFoldingBug.ll also fails, but
now compiles to have one fewer constant pool entry, making
the expected load that was being folded disappear.  Since it
is an unreduced mass of gnast, I just removed it.

This fixes PR6370

llvm-svn: 97023
2010-02-24 06:11:37 +00:00
Evan Cheng 328a607490 Re-apply 96540 and 96556 with fixes.
llvm-svn: 97011
2010-02-24 01:42:31 +00:00
Jakob Stoklund Olesen a2d8c97b65 DIV8r must define %AX since X86DAGToDAGISel::Select() sometimes uses it
instead of %AL/%AH.

llvm-svn: 97006
2010-02-24 00:39:35 +00:00
Jakob Stoklund Olesen fe0a8cd210 Remember to handle sub-registers when moving imp-defs to a rematted instruction.
llvm-svn: 96995
2010-02-23 22:44:02 +00:00
Jakob Stoklund Olesen 38b76e27a7 Keep track of phi join registers explicitly in LiveVariables.
Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.

llvm-svn: 96994
2010-02-23 22:43:58 +00:00
Evan Cheng 2a33390e2b These should not have been committed.
llvm-svn: 96827
2010-02-22 23:37:48 +00:00
Evan Cheng 3688b8fa68 Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting.
llvm-svn: 96825
2010-02-22 23:34:00 +00:00