5304 Commits

Author SHA1 Message Date
Mikhail Glushenkov f83f33e8d3 Fix: 'sink' handling was broken.
llvm-svn: 51750
2008-05-30 06:23:29 +00:00
Nick Lewycky 048fc8db62 Unbreak this test.
llvm-svn: 51726
2008-05-30 05:02:37 +00:00
Dan Gohman 96af4ddb62 Add patterns for CALL32m and CALL64m. They aren't matched in most
cases due to an isel deficiency already noted in
lib/Target/X86/README.txt, but they can be matched in this fold-call.ll
testcase, for example.

This is interesting mainly because it exposes a tricky tblgen bug;
tblgen was incorrectly computing the starting index for variable_ops
in the case of a complex pattern.

llvm-svn: 51706
2008-05-29 21:50:34 +00:00
Dan Gohman 714663ab94 Expand small memmovs using inline code. Set the X86 threshold for expanding
memmove to a more plausible value, now that it's actually being used.

llvm-svn: 51696
2008-05-29 19:42:22 +00:00
Anton Korobeynikov d8734cf916 For PR1338: Rename test dirs
llvm-svn: 51695
2008-05-29 19:17:15 +00:00
Owen Anderson 50d602cda2 Move these tests into the proper directory.
llvm-svn: 51685
2008-05-29 16:30:29 +00:00
Owen Anderson 7686b555e2 Replace the old ADCE implementation with a new one that more simply solves
the one case that ADCE catches that normal DCE doesn't: non-induction variable
loop computations.

This implementation handles this problem without using postdominators.

llvm-svn: 51668
2008-05-29 08:45:13 +00:00
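
The idea behind an aggressive DCE can be sketched as follows. This is only a minimal illustration, not the LLVM implementation: it assumes every instruction is dead until proven live, seeds liveness from instructions with side effects, and propagates liveness through operands. The function name `aggressive_dce`, the dictionary-based instruction model, and the choice to treat the branch as a side-effecting root are all assumptions made for this example.

```python
def aggressive_dce(instructions):
    """Return the set of instruction names that are dead.

    `instructions` maps a name to (operand_names, has_side_effects).
    This toy model is hypothetical and is not LLVM's data structure.
    """
    live = set()
    # Roots: anything with an observable effect (stores, calls, branches).
    worklist = [name for name, (_, effect) in instructions.items() if effect]
    while worklist:
        name = worklist.pop()
        if name in live:
            continue
        live.add(name)
        # Everything a live instruction reads must be live as well.
        for operand in instructions[name][0]:
            if operand in instructions and operand not in live:
                worklist.append(operand)
    return {name for name in instructions if name not in live}


# A loop with an induction variable (%i) that controls the branch and an
# accumulator (%acc) whose value is never used outside its own cycle.
loop = {
    "%i": (["%i.next"], False),
    "%i.next": (["%i"], False),
    "%cmp": (["%i.next"], False),
    "br": (["%cmp"], True),
    "%acc": (["%acc.next"], False),
    "%acc.next": (["%acc"], False),
}
dead = aggressive_dce(loop)
```

A use-count-based DCE would keep `%acc` and `%acc.next` because each one has a user (the other). Starting from "everything is dead" lets the cycle be removed, which is the non-induction-variable loop computation case the commit message refers to.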
Evan Cheng 5e28227dbd Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq.
llvm-svn: 51667
2008-05-29 08:22:04 +00:00
Evan Cheng 6892c5507f Add nounwind.
llvm-svn: 51665
2008-05-29 07:09:24 +00:00
Evan Cheng 68079268f5 Fix PR2289: vr defined by multiple implicit_def as result of coalescing.
llvm-svn: 51648
2008-05-28 17:40:10 +00:00
Evan Cheng 427412e7c8 Teach local register allocator to deal with landing pad MBB's.
llvm-svn: 51647
2008-05-28 17:22:32 +00:00
Chris Lattner ecdefb5df7 Implement PR2370: memmove(x,x,size) -> noop.
llvm-svn: 51636
2008-05-28 05:30:41 +00:00
Dan Gohman 221e9d0d22 Specify a target so that this test tests what it's intended to test.
llvm-svn: 51600
2008-05-27 17:55:57 +00:00
Dan Gohman 923a375053 Make this test independent of the target-triple; the stack alignment
is specifically what this test depends on.

llvm-svn: 51599
2008-05-27 17:44:23 +00:00
Nick Lewycky a61cc6ece0 Whoops -- forgot PR reference on this test.
llvm-svn: 51569
2008-05-26 20:23:33 +00:00
Nick Lewycky 213e114a2c The Linux ABI emits an extra "movl %esp, %ebp" in function prologue and
sometimes a "mov %ebp, %esp" in the epilogue.

Force these tests that rely on counting 'mov' to use i686-apple-darwin8.8.0
where they were written.

llvm-svn: 51568
2008-05-26 20:18:56 +00:00
Nick Lewycky be993358a7 Use {} instead of "" in RUN lines.
llvm-svn: 51561
2008-05-26 01:27:08 +00:00
Nick Lewycky 3195b393d6 Don't treat values as signed when looking at loop steppings in HowForToNonZero.
llvm-svn: 51560
2008-05-25 23:43:32 +00:00
Nick Lewycky f6ccd2580c "ret (constexpr)" can't be folded into a Constant. Add a method to
Analysis/ConstantFolding to fold ConstantExpr's, then make instcombine use it
to try to use targetdata to fold constant expressions on void instructions.

Also extend the icmp(inttoptr, inttoptr) folding to handle the case where
int size != ptr size.

llvm-svn: 51559
2008-05-25 20:56:15 +00:00
Chris Lattner 87a099a057 Fix a serious brain-o. Obviously no-one reviewed my patch :(
This fixes PR2359

llvm-svn: 51536
2008-05-24 04:06:28 +00:00
Chris Lattner 5c207c83c6 Fix PR2358 by resolving calls with undef arguments to overdefined.
llvm-svn: 51535
2008-05-24 03:59:33 +00:00
Evan Cheng 91a2e56b06 Eliminate x86.sse2.punpckh.qdq and x86.sse2.punpckl.qdq.
llvm-svn: 51533
2008-05-24 02:56:30 +00:00
Evan Cheng 2146270c9b Eliminate x86.sse2.movs.d, x86.sse2.shuf.pd, x86.sse2.unpckh.pd, and x86.sse2.unpckl.pd intrinsics. These will be lowered into shuffles.
llvm-svn: 51531
2008-05-24 02:14:05 +00:00
Evan Cheng 948627aadd New loadl_pd and loadh_pd tests.
llvm-svn: 51525
2008-05-24 00:10:02 +00:00
Evan Cheng 5065932276 Autoupgrade x86.sse2.loadh.pd and x86.sse2.loadl.pd.
llvm-svn: 51523
2008-05-24 00:08:39 +00:00
Dan Gohman fa9dac2859 Don't silently truncate array extents to 32 bits.
llvm-svn: 51505
2008-05-23 21:40:55 +00:00
Evan Cheng 04d24edcbb Use movlps / movhps to modify low / high half of 16-byte memory location.
llvm-svn: 51501
2008-05-23 21:23:16 +00:00
Dan Gohman 5e7863de1b Remove lingering references to .llx and .tr in the tests.
llvm-svn: 51500
2008-05-23 21:15:35 +00:00
Dan Gohman 3388d022ac Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add
load-folding table entries for PMULDQ and PMULLD.

llvm-svn: 51489
2008-05-23 17:49:40 +00:00
Matthijs Kooijman aef2b8198b Restructure a part of the SimplifyCFG pass and include a testcase.
The SimplifyCFG pass looks at basic blocks that contain only phi nodes,
followed by an unconditional branch. In a lot of cases, such a block (BB) can
be merged into its successor (Succ).

This merging is performed by TryToSimplifyUncondBranchFromEmptyBlock. It does
this by taking all phi nodes in the successor block Succ and expanding them to
include the predecessors of BB. Furthermore, any phi nodes in BB are moved to
Succ and expanded to include the predecessors of Succ as well.

Before attempting this merge, CanPropagatePredecessorsForPHIs checks to see if
all phi nodes can be properly merged. All functional changes are made to
this function, only comments were updated in
TryToSimplifyUncondBranchFromEmptyBlock.

In the original code, CanPropagatePredecessorsForPHIs looked quite convoluted
and more like a stack of checks added to handle different kinds of situations
than a comprehensive check. In particular the first check in the function did
some value checking for the case that BB and Succ have a common predecessor,
while the last check in the function simply rejected all cases where BB and
Succ have a common predecessor. The first check was still useful in the case
that BB did not contain any phi nodes at all, though, so it was not completely
useless.

Now, CanPropagatePredecessorsForPHIs is restructured to look a lot more
similar to the code that actually performs the merge. Both functions now look
at the same phi nodes in about the same order.  Any conflicts (phi nodes with
different values for the same source) that could arise from merging or moving
phi nodes are detected. If no conflicts are found, the merge can happen.

Apart from only restructuring the checks, two main changes in functionality
happened.

Firstly, the old code rejected blocks with common predecessors in most cases.
The new code performs some extra checks so common predecessors can be handled
in a lot of cases. Wherever common predecessors still pose problems, the
blocks are left untouched.

Secondly, the old code rejected the merge when values (phi nodes) from BB were
used in any other place than Succ. However, it does not seem that there is any
situation that would require this check. Even more, this can be proven.

Consider that BB is a block consisting of a single phi node "%a" and a branch
to Succ. Now, since the definition of %a will dominate all of its uses, BB
will dominate all blocks that use %a. Furthermore, since the branch from BB to
Succ is unconditional, Succ will also dominate all uses of %a.

Now, assume that one predecessor of Succ, call it X, is not dominated by BB (and
thus not dominated by Succ). Since at least one use of %a (but in reality all of
them) is reachable from Succ, you could end up at a use of %a without passing
through its definition in BB (by coming from X through Succ). This is a
contradiction, meaning that our original assumption is wrong. Thus, all
predecessors of Succ must also be dominated by BB (and thus also by Succ).

This means that moving the phi node %a from BB to Succ does not pose any
problems when the two blocks are merged, and any use checks are not needed.

llvm-svn: 51478
2008-05-23 09:09:41 +00:00
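
The conflict check described above can be sketched roughly as follows. This is a simplified model and not the code from SimplifyCFG: phi nodes are plain dictionaries from predecessor name to incoming value, and the function name `can_merge_into_successor` and its parameters are hypothetical. It only covers the check on Succ's phi nodes for common predecessors; it does not model moving BB's phi nodes into Succ.

```python
def can_merge_into_successor(bb, bb_preds, bb_phis, succ_phis):
    """Return True if folding block `bb` into its successor is conflict-free.

    bb_phis / succ_phis map a phi name to {predecessor: incoming value}.
    Every phi in succ_phis is assumed to have an entry for `bb`.
    """
    for incoming in succ_phis.values():
        value_via_bb = incoming[bb]
        for pred in bb_preds:
            # After the merge, `pred` feeds Succ directly. The value it
            # supplies is either looked up through a phi in BB or is the
            # value that previously arrived from BB unchanged.
            if value_via_bb in bb_phis:
                new_value = bb_phis[value_via_bb][pred]
            else:
                new_value = value_via_bb
            # A common predecessor already has an entry in Succ's phi;
            # the two values for the same source must agree.
            if pred in incoming and incoming[pred] != new_value:
                return False
    return True


bb_phis = {"%a": {"A": "1", "B": "2"}}
# A is a predecessor of both BB and Succ.
agreeing = {"%x": {"BB": "%a", "A": "1"}}
conflicting = {"%x": {"BB": "%a", "A": "3"}}
ok = can_merge_into_successor("BB", ["A", "B"], bb_phis, agreeing)
bad = can_merge_into_successor("BB", ["A", "B"], bb_phis, conflicting)
```

In the first case the common predecessor A supplies the same value on both paths, so the merge can proceed. In the second case A would need two different values for `%x`, so the blocks are left untouched.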
Nick Lewycky 3bf5512d87 Constant integer vectors may also be negated.
llvm-svn: 51476
2008-05-23 04:54:45 +00:00
Nick Lewycky 4f3d878507 Revert X + X --> X * 2 optz'n which pessimizes heavily on x86.
llvm-svn: 51474
2008-05-23 04:34:58 +00:00
Nick Lewycky 452fb32927 Implement X + X for vectors.
llvm-svn: 51472
2008-05-23 04:14:51 +00:00
Nick Lewycky 2ec9a01173 Fix a recently added optimization to not crash on vectors.
llvm-svn: 51471
2008-05-23 03:26:47 +00:00
Dan Gohman 6d5f120c5c Generalize the new code in instcombine's ComputeNumSignBits for handling
and/or to handle more cases (such as this add-sitofp.ll testcase), and
port it to selectiondag's ComputeNumSignBits.

llvm-svn: 51469
2008-05-23 02:28:01 +00:00
Dan Gohman 30499844ea Make structs and arrays first-class types, and add assembly
and bitcode support for the extractvalue and insertvalue
instructions and constant expressions.

Note that this does not yet include CodeGen support.

llvm-svn: 51468
2008-05-23 01:55:30 +00:00
Evan Cheng f3be7a7ea7 Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which were missing the proper alignment checks.
Also fixed some 80 col. violations.

llvm-svn: 51462
2008-05-23 00:37:07 +00:00
Evan Cheng a1100782d5 Add a couple of test cases.
llvm-svn: 51441
2008-05-22 21:19:19 +00:00
Evan Cheng 53963b775e Add missing patterns.
llvm-svn: 51435
2008-05-22 18:56:56 +00:00
Chris Lattner 79be90c3c7 Add support for multiple-return values in inline asm. This should
get inline asm working as well as it did previously with the CBE
with the new MRV support for inline asm.

llvm-svn: 51420
2008-05-22 06:19:37 +00:00
Chris Lattner a87f1a568c testcase for PR2267
llvm-svn: 51408
2008-05-22 04:45:22 +00:00
Evan Cheng a5d27ae586 Fix PR2343. An *interesting* coalescer bug.
BB1:
  vr1025 = copy vr1024
  ..
BB2:
  vr1024 = op
         = op vr1025
  <loop eventually branch back to BB1>

Even though vr1025 is copied from vr1024, it's not safe to coalesce them since the live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop.

llvm-svn: 51394
2008-05-21 22:34:12 +00:00
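
The interference condition in the message above can be illustrated with a toy model. This is only a sketch under strong simplifying assumptions: instructions are numbered linearly, a live range is a single (def index, last-use index) pair, and the name `safe_to_coalesce` is hypothetical. It does not reflect how LLVM's live interval analysis or coalescer is implemented.

```python
def safe_to_coalesce(copy_dst_range, src_def_points):
    """Return True if no def of the copy source falls strictly inside the
    live range of the copy destination.

    copy_dst_range: (def_index, last_use_index) of the copied-to register.
    src_def_points: instruction indices at which the source is redefined.
    """
    start, end = copy_dst_range
    return not any(start < d < end for d in src_def_points)


# Linearized version of the loop in the commit message:
#   0: vr1025 = copy vr1024
#   1: ..
#   2: vr1024 = op
#   3:        = op vr1025
unsafe_case = safe_to_coalesce((0, 3), [2])

# If vr1025 were last used before vr1024 is redefined, merging is fine.
safe_case = safe_to_coalesce((0, 1), [2])
```

In the first case vr1025 still holds the previous iteration's value when vr1024 is redefined at index 2, so giving both the same register would clobber a value that is read at index 3.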
Gabor Greif 78d66e4ac1 resurrect lost tests by renaming them to not end with .tr
llvm-svn: 51375
2008-05-21 14:48:24 +00:00
Gabor Greif d01c562e48 Eliminate questionable syntax for stdin redirection. This probably also speeds things up a bit.
llvm-svn: 51357
2008-05-20 22:07:21 +00:00
Chris Lattner b76ad168dc Fix PR2346 by marking vaarg as volatile so that licm doesn't try to
hoist them.

llvm-svn: 51356
2008-05-20 22:05:28 +00:00
Dan Gohman 0843435b36 Oops, commit the version of this test that actually works.
llvm-svn: 51351
2008-05-20 21:19:36 +00:00
Dan Gohman 81ab753b14 Port SelectionDAG's ComputeNumSignBits-using code to instcombine,
now that instcombine also has ComputeNumSignBits.

llvm-svn: 51350
2008-05-20 21:01:12 +00:00
Gabor Greif 1e427c3264 sabre brings to my attention that the 'tr' suffix is also obsolete
llvm-svn: 51349
2008-05-20 21:00:03 +00:00
Gabor Greif f45ff35bfe Rename the last test with .llx extension to .ll, resolve duplicate test by renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too.
llvm-svn: 51328
2008-05-20 19:52:04 +00:00
Evan Cheng 0609ab646b More local spiller complexity!
If local spiller optimization turns some instruction into an identity copy, it will be removed. If the output register happens to be dead (and source is obviously killed), transfer the kill / dead information to last use / def in the same MBB.

llvm-svn: 51306
2008-05-20 08:13:21 +00:00