Commit Graph

388 Commits

Author SHA1 Message Date
Gabor Greif aafd209632 rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101364
2010-04-15 10:49:53 +00:00
Gabor Greif df323a51f5 performance: get rid of repeated dereferencing of use_iterator by caching its result
llvm-svn: 100550
2010-04-06 19:32:30 +00:00
Mon P Wang c576ee9040 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100304
2010-04-04 03:10:48 +00:00
Mon P Wang 999c1b927b Revert r100191 since it breaks objc in clang
llvm-svn: 100199
2010-04-02 18:43:02 +00:00
Mon P Wang a972ab8564 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100191
2010-04-02 18:04:15 +00:00
Bob Wilson 6f7fd28824 Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots.
llvm-svn: 99948
2010-03-30 22:27:04 +00:00
Mon P Wang 7460571381 Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.

llvm-svn: 99928
2010-03-30 20:55:56 +00:00
Duncan Sands 19d0b47b1f There are two ways of checking for a given type, for example isa<PointerType>(T)
and T->isPointerTy().  Convert most instances of the first form to the second form.
Requested by Chris.

llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Duncan Sands 9dff9bec31 Uniformize the names of type predicates: rather than having isFloatTy and
isInteger, we now have isFloatTy and isIntegerTy.  Requested by Chris!

llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Bob Wilson 04365c5f72 Adjust the heuristics used to decide when SROA is likely to be profitable.
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements.  A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA.  Even if the total aggregate size is large, it may still be good to
perform SROA.  Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.

We have also been checking the number of elements to decide if an aggregate
should be split up.  The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway.  I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types.  One thing suggested by the data is that distinguishing between structs
and arrays might be useful.  There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now).  Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements.  And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial.  So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.

This fixes Apple Radar 7563690.

llvm-svn: 95224
2010-02-03 17:23:56 +00:00
Benjamin Kramer 40582a891c Use the less expensive getName function instead of getNameStr.
llvm-svn: 94683
2010-01-27 19:46:52 +00:00
Bob Wilson fc060e4337 Change Value::getUnderlyingObject to have the MaxLookup value specified as a
parameter with a default value, instead of just hardcoding it in the
implementation.  The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.

llvm-svn: 94433
2010-01-25 18:26:54 +00:00
Victor Hernandez 1df65186d1 DbgInfoIntrinsics no longer appear in an instruction's use list; so clean up looking for them in use iterations and remove OnlyUsedByDbgInfoIntrinsics()
llvm-svn: 94111
2010-01-21 23:05:53 +00:00
Bob Wilson 58d59fe394 Fix a crash in scalarrepl for memcpy/memmove where the source and destination
are the same.  I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value.  Radar 7552893.

llvm-svn: 93848
2010-01-19 04:32:48 +00:00
Dan Gohman 28943873e6 Use do+while instead of while for loops which obviously have a
non-zero trip count. Use SmallVector's pop_back_val().

llvm-svn: 92734
2010-01-05 16:27:25 +00:00
David Greene 48c86bedbd Change errs() to dbgs().
llvm-svn: 92610
2010-01-05 01:27:09 +00:00
Chris Lattner c0f6402a94 Fix the Convert to scalar to not insert dead loads in the store case. The
load is needed when we have a small store into a large alloca (at which 
point we get a load/insert/store sequence), but when you do a full-sized
store, this load ends up being dead.

This dead load is bad in really large nasty testcases where the load ends
up causing mem2reg to insert large chains of dependent phi nodes which only
ADCE can delete.  Instead of doing this, just don't insert the dead load.

This fixes rdar://6864035

llvm-svn: 91917
2009-12-22 19:33:28 +00:00
Chris Lattner fda3b559e6 fix some fixme's by using twines
llvm-svn: 91916
2009-12-22 19:23:33 +00:00
Bob Wilson 62a84ea8e3 Generalize SROA to allow the first index of a GEP to be non-zero. Add a
missing check that an array reference doesn't go past the end of the array,
and remove some redundant checks for in-bound array and vector references
that are no longer needed.

llvm-svn: 91897
2009-12-22 06:57:14 +00:00
Bob Wilson 88a0598fe8 Remove special-case SROA optimization of variable indexes to one-element and
two-element arrays.  After restructuring the SROA code, it was not safe to
do this without adding more checking.  It is not clear that this special-case
has really been useful, and removing this simplifies the code quite a bit.

llvm-svn: 91828
2009-12-21 18:39:47 +00:00
Bob Wilson c16811b575 Update my SROA changes in response to review.
* change FindElementAndOffset to return a uint64_t instead of unsigned, and
  to identify the type to be used for that result in a GEP instruction.
* move "isa<ConstantInt>" to be first in conditional.
* replace some dyn_casts with casts.
* add a comment about handling mem intrinsics.

llvm-svn: 91762
2009-12-19 06:53:17 +00:00
Bob Wilson 532cd232fb Reapply 91459 with a simple fix for the problem that broke the x86_64-darwin
bootstrap.  This also replaces the WeakVH references that Chris objected to
with normal Value references.

llvm-svn: 91711
2009-12-18 20:14:40 +00:00
Bob Wilson f3927b7994 Re-revert 91459. It's breaking the x86_64 darwin bootstrap.
llvm-svn: 91607
2009-12-17 18:34:24 +00:00
Daniel Dunbar ab42d42390 Reapply r91459, it was only unmasking the bug, and since TOT is still broken having it reverted does no good.
llvm-svn: 91559
2009-12-16 20:09:53 +00:00
Daniel Dunbar 133efc317e Revert "Reapply 91184 with fixes and an addition to the testcase to cover the
problem", this broke llvm-gcc bootstrap for release builds on
x86_64-apple-darwin10.

This reverts commit db22309800b224a9f5f51baf76071d7a93ce59c9.

llvm-svn: 91534
2009-12-16 10:56:17 +00:00
Bob Wilson e44756d7c2 Reapply 91184 with fixes and an addition to the testcase to cover the problem
found last time.  Instead of trying to modify the IR while iterating over it,
I've change it to keep a list of WeakVH references to dead instructions, and
then delete those instructions later.  I also added some special case code to
detect and handle the situation when both operands of a memcpy intrinsic are
referencing the same alloca.

llvm-svn: 91459
2009-12-15 22:00:51 +00:00
Chris Lattner aaa6ac10a6 revert r91184, because it causes a crash on a .bc file I just
sent to Bob.

llvm-svn: 91268
2009-12-14 05:11:02 +00:00
Bob Wilson 895f364ae6 Revise scalar replacement to be more flexible about handle bitcasts and GEPs.
While scanning through the uses of an alloca, keep track of the current offset
relative to the start of the alloca, and check memory references to see if
the offset & size correspond to a component within the alloca.  This has the
nice benefit of unifying much of the code from isSafeUseOfAllocation,
isSafeElementUse, and isSafeUseOfBitCastedAllocation.  The code to rewrite
the uses of a promoted alloca, after it is determined to be safe, is
reorganized in the same way.

Also, when rewriting GEP instructions, mark them as "in-bounds" since all the
indices are known to be safe.

llvm-svn: 91184
2009-12-11 23:47:40 +00:00
Bob Wilson 1c5a6fb299 Fix a comment.
llvm-svn: 90975
2009-12-09 18:05:27 +00:00
Bob Wilson c5d082fd5d Some superficial cleanups.
llvm-svn: 90866
2009-12-08 18:27:03 +00:00
Bob Wilson 2029ea04f9 Clean up dead operands left around after SROA replaces a mem intrinsic.
I'm not aware that this does anything significant on its own, but it's
needed for another patch that I'm working on.

llvm-svn: 90864
2009-12-08 18:22:03 +00:00
Bob Wilson 050b812fe7 Fix up some comments.
llvm-svn: 90603
2009-12-04 21:57:37 +00:00
Bob Wilson 5ca37b274c Fix 80-column violations.
llvm-svn: 90601
2009-12-04 21:51:35 +00:00
Benjamin Kramer 3efc050ac4 Revert r90089 for now, it's breaking selfhost.
llvm-svn: 90097
2009-11-29 21:17:48 +00:00
Benjamin Kramer bfa993ab20 Fix two FIXMEs.
llvm-svn: 90089
2009-11-29 20:29:30 +00:00
Chris Lattner 2226db66ab fix PR5436 by making the 'simple' case of SRoA not promote out of range
array indexes.  The "complex" case of SRoA still handles them, and correctly.

This fixes a weirdness where we'd correctly avoid transforming A[0][42] if
the 42 was too large, but we'd only do it if it was one gep, not two separate
ones.

llvm-svn: 90007
2009-11-27 16:37:41 +00:00
Nick Lewycky 15a1287c1f Pull LLVMContext out of PromoteMemToReg.
llvm-svn: 89645
2009-11-23 03:50:44 +00:00
Victor Hernandez 8acf2956b8 Remove AllocationInst. Since MallocInst went away, AllocaInst is the only subclass of AllocationInst, so it no longer is necessary.
llvm-svn: 84969
2009-10-23 21:09:37 +00:00
Chris Lattner fdd8790718 strength reduce a ton of type equality tests to check the typeid (Through
the new predicates I added) instead of going through a context and doing a
pointer comparison.  Besides being cheaper, this allows a smart compiler
to turn the if sequence into a switch.

llvm-svn: 83297
2009-10-05 05:54:46 +00:00
Nick Lewycky 7465cd769c Add more newlines to make up for the ones removed from the end of instructions.
llvm-svn: 81851
2009-09-15 07:08:25 +00:00
Chris Lattner e9a4992399 add newline to debug dump
llvm-svn: 81840
2009-09-15 05:14:57 +00:00
Chris Lattner 2dd09dbdf7 eliminate VISIBILITY_HIDDEN from Transforms/Scalar. PR4861
llvm-svn: 80766
2009-09-02 06:11:42 +00:00
Chris Lattner b25de3ff60 eliminate the "Value" printing methods that print to a std::ostream.
This required converting a bunch of stuff off DOUT and other cleanups.

llvm-svn: 79819
2009-08-23 04:37:46 +00:00
Dan Gohman 915302c605 Make SROA and PredicateSimplifier cope if TargetData is not
available. This is very conservative for now.

llvm-svn: 79442
2009-08-19 18:22:18 +00:00
Nick Lewycky aa464002f0 Don't crash trying to promote VLAs.
llvm-svn: 79226
2009-08-17 05:37:31 +00:00
Owen Anderson 55f1c09e31 Push LLVMContexts through the IntegerType APIs.
llvm-svn: 78948
2009-08-13 21:58:54 +00:00
Owen Anderson 5a1acd9912 Move a few more APIs back to 2.5 forms. The only remaining ones left to change back are
metadata related, which I'm waiting on to avoid conflicting with Devang.

llvm-svn: 77721
2009-07-31 20:28:14 +00:00
Owen Anderson b292b8ce70 Move more code back to 2.5 APIs.
llvm-svn: 77635
2009-07-30 23:03:37 +00:00
Daniel Dunbar 132f78395a Twines: Don't allow implicit conversion from integers, this is too tricky.
llvm-svn: 77605
2009-07-30 17:37:43 +00:00
Daniel Dunbar 6afdc5e694 Switch obvious clients to Twine instead of utostr (when they were already using
a Twine, e.g., for names).
 - I am a little ambivalent about this; we don't want the string conversion of
   utostr, but using overload '+' mixed with string and integer arguments is
   sketchy. On the other hand, this particular usage is something of an idiom.

llvm-svn: 77579
2009-07-30 04:20:37 +00:00
Owen Anderson 4056ca9568 Move types back to the 2.5 API.
llvm-svn: 77516
2009-07-29 22:17:13 +00:00
Owen Anderson 487375e9a2 Move ConstantExpr to 2.5 API.
llvm-svn: 77494
2009-07-29 18:55:55 +00:00
Owen Anderson 4aa3295a65 Return ConstantVector to 2.5 API.
llvm-svn: 77366
2009-07-28 21:19:26 +00:00
Daniel Dunbar 4975db6276 Initial update to VMCore to use Twines for string arguments.
- The only meat here is in Value.{h,cpp} the rest is essential 'const
   std::string &' -> 'const Twine &'.

llvm-svn: 77048
2009-07-25 04:41:11 +00:00
Owen Anderson edb4a70325 Revert the ConstantInt constructors back to their 2.5 forms where possible, thanks to contexts-on-types. More to come.
llvm-svn: 77011
2009-07-24 23:12:02 +00:00
Owen Anderson 47db941fd3 Get rid of the Pass+Context magic.
llvm-svn: 76702
2009-07-22 00:24:57 +00:00
Owen Anderson 4fdeba9706 Revert yesterday's change by removing the LLVMContext parameter to AllocaInst and MallocInst.
llvm-svn: 75863
2009-07-15 23:53:25 +00:00
Owen Anderson b6b2530000 Move EVER MORE stuff over to LLVMContext.
llvm-svn: 75703
2009-07-14 23:09:55 +00:00
Torok Edwin fbcc663cbf llvm_unreachable->llvm_unreachable(0), LLVM_UNREACHABLE->llvm_unreachable.
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").

llvm-svn: 75640
2009-07-14 16:55:14 +00:00
Torok Edwin 56d0659726 assert(0) -> LLVM_UNREACHABLE.
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.

llvm-svn: 75379
2009-07-11 20:10:48 +00:00
Torok Edwin ccb29cd290 Convert more assert(0)+abort() -> LLVM_UNREACHABLE,
and abort()/exit() -> llvm_report_error().

llvm-svn: 75363
2009-07-11 13:10:19 +00:00
Owen Anderson 1e5f00e7a7 This started as a small change, I swear. Unfortunately, lots of things call the [I|F]CmpInst constructors. Who knew!?
llvm-svn: 75200
2009-07-09 23:48:35 +00:00
Owen Anderson 38264b1554 "LLVMContext* " --> "LLVMContext *"
llvm-svn: 74878
2009-07-06 23:00:19 +00:00
Owen Anderson e70b637033 More LLVMContext-ification.
llvm-svn: 74807
2009-07-05 22:41:43 +00:00
Owen Anderson 340288c621 Even more passes being LLVMContext'd.
llvm-svn: 74781
2009-07-03 19:42:02 +00:00
Dan Gohman adfd42a3c8 Use Type::getScalarType.
llvm-svn: 73451
2009-06-16 00:20:26 +00:00
Jay Foad e57ba2eab5 Use cast<> instead of dyn_cast<> for things that are known to be
Instructions.

llvm-svn: 73002
2009-06-06 17:49:35 +00:00
Eli Friedman ee94e3cc9e PR4286: Make RewriteLoadUserOfWholeAlloca and
RewriteStoreUserOfWholeAlloca deal with tail padding because 
isSafeUseOfBitCastedAllocation expects them to.  Otherwise, we crash 
trying to erase the bitcast.

llvm-svn: 72688
2009-06-01 09:14:32 +00:00
Duncan Sands af9eaa830a Rename PaddedSize to AllocSize, in the hope that this
will make it more obvious what it represents, and stop
it being confused with the StoreSize.

llvm-svn: 71349
2009-05-09 07:06:46 +00:00
Chris Lattner c48091f141 fix RewriteStoreUserOfWholeAlloca to use the correct type size
method, fixing a crash on PR4146.  While the store will 
ultimately overwrite the "padded size" number of bits in memory,
the stored value may be a subset of this size.  This function
only wants to handle the case where all bits are stored.

llvm-svn: 71224
2009-05-08 15:54:41 +00:00
Chris Lattner 69223bb7f5 fix a crash on a pointless but valid zero-length memset, rdar://6808691
llvm-svn: 69680
2009-04-21 16:52:12 +00:00
Zhou Sheng 4e2af3cb55 Explicitly check for StoreInst, do not lose the chance to delete
unused loads or bitcasts.

llvm-svn: 67202
2009-03-18 12:48:48 +00:00
Zhou Sheng 05bea906c1 Revert my previous change on Local.cpp, instead, fix the bug on scalarrepl.
If the instruction has no users, it is also not only used by debug info 
and should not be deleted.

llvm-svn: 67194
2009-03-18 10:13:08 +00:00
Chris Lattner 21a84f3054 teach SROA to handle promoting vector allocas with a memset into them into
a vector type instead of into an integer type.

llvm-svn: 66368
2009-03-08 04:17:04 +00:00
Chris Lattner c009757761 Enhance SROA to "promote to scalar" allocas which are
memcpy/memmove'd into or out of.  This fixes a serious
perf issue that Nate ran into.

llvm-svn: 66366
2009-03-08 04:04:21 +00:00
Chris Lattner dc35e5b43a change the MemIntrinsic get/setAlignment method to take an unsigned
instead of a Constant*, which is what the clients of it really want.

llvm-svn: 66364
2009-03-08 03:59:00 +00:00
Chris Lattner 334268a211 Introduce a new MemTransferInst pseudo class, which is a common
parent between MemCpyInst and MemMoveInst, simplify some code to
use it.

llvm-svn: 66361
2009-03-08 03:37:16 +00:00
Devang Patel 25b625165f While converting an aggregate to scalare, ignore and remove aggregate's debug info.
llvm-svn: 66262
2009-03-06 07:03:54 +00:00
Evan Cheng 5fd4fc76bf SRThreshold is meant to be inclusive.
llvm-svn: 66227
2009-03-06 00:56:43 +00:00
Chris Lattner a41bb40458 complete comment.
llvm-svn: 66055
2009-03-04 19:23:25 +00:00
Chris Lattner b5b0c87be6 this wasn't intended to be committed.
llvm-svn: 66054
2009-03-04 19:22:30 +00:00
Chris Lattner 5c204c92a4 Fix PR3720 by properly propagating alignment information from memcpy/memmove
onto element accesses.

llvm-svn: 66053
2009-03-04 19:20:50 +00:00
Bill Wendling a68fc7af63 Use > instead of >=. We want to promote aggregates of 128-bytes.
llvm-svn: 65960
2009-03-03 19:18:49 +00:00
Bill Wendling 3e44bf3c4b Reapply r65755, but reversing "<" to ">=".
llvm-svn: 65945
2009-03-03 12:12:58 +00:00
Bill Wendling 38eae046cf Temporarily revert r65755. It was causing failures in the self-hosting
testsuite:

Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/dg.exp ...
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/nancvt.ll
Failed with exit(1) at line 2
while running: grep 2147027116 nancvt.ll.tmp | count 3
count: expected 3 lines and got        0.
child process exited abnormally
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/vec_ins_extract.ll
Failed with exit(1) at line 1
while running:  llvm-as < /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/vec_ins_extract.ll |  opt -scalarrepl -instcombine |   llc -march=x86 -mcpu=yonah | not /usr/bin/grep sub.*esp
      subl      $28, %esp
      subl      $28, %esp
child process exited abnormally

And more.

llvm-svn: 65758
2009-03-01 03:55:12 +00:00
Chris Lattner e2bb5e31c8 hoist the check for alloca size up so that it controls CanConvertToScalar
as well as isSafeAllocaToScalarRepl.

llvm-svn: 65755
2009-03-01 02:26:47 +00:00
Devang Patel da1a632a87 Use early exits. Reduce indentation.
llvm-svn: 64226
2009-02-10 19:28:07 +00:00
Devang Patel caf4485781 Enable scalar replacement of AllocaInst whose one of the user is dbg info.
llvm-svn: 64207
2009-02-10 07:00:59 +00:00
Chris Lattner bbbb74372b fix PR3489, use bits instead of bytes.
llvm-svn: 63916
2009-02-06 04:34:07 +00:00
Chris Lattner ef37dc8511 teach "convert from scalar" to handle loads of fca's.
llvm-svn: 63659
2009-02-03 21:08:45 +00:00
Chris Lattner f5df53cb46 refactor the interface to ConvertUsesOfLoadToScalar,
renaming it to ConvertScalar_ExtractValue

llvm-svn: 63658
2009-02-03 21:01:03 +00:00
Chris Lattner 576baa4adf convert ConvertUsesOfLoadToScalar to use IRBuilder,
no functionality change.

llvm-svn: 63652
2009-02-03 19:45:44 +00:00
Chris Lattner c1fb96d347 switch ConvertScalar_InsertValue to use an IRBuilder, no
functionality change.

llvm-svn: 63651
2009-02-03 19:41:50 +00:00
Chris Lattner 18f56c295c make scalar conversion handle stores of first class
aggregate values.  loads are not yet handled (coming
soon to an sroa near you).

llvm-svn: 63649
2009-02-03 19:30:11 +00:00
Chris Lattner 73eff2e6e8 Make SROA produce a vector only when the alloca is actually
accessed at least once as a vector.  This prevents it from
compiling the example in not-a-vector into:

define double @test(double %A, double %B) {
	%tmp4 = insertelement <7 x double> undef, double %A, i32 0
	%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
	%tmp2 = extractelement <7 x double> %tmp, i32 4
	ret double %tmp2
}

instead, producing the integer code.  Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.

llvm-svn: 63638
2009-02-03 18:15:05 +00:00
Chris Lattner 80810b4c2d add another case of undefined behavior without crashing, PR3466.
llvm-svn: 63620
2009-02-03 07:08:57 +00:00
Chris Lattner 6aa6b1f263 Teach ConvertUsesToScalar to handle memset, allowing it to handle
crazy cases like:

struct f {  int A, B, C, D, E, F; };
short test4() {
  struct f A;
  A.A = 1;
  memset(&A.B, 2, 12);
  return A.C;
}

llvm-svn: 63596
2009-02-03 02:01:43 +00:00
Chris Lattner 09b65ab288 rearrange how SRoA handles promotion of allocas to vectors.
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type 
itself.  This allows us to un-xfail 
test/CodeGen/X86/vec_ins_extract.ll.

llvm-svn: 63590
2009-02-03 01:30:09 +00:00
Chris Lattner 43cecd7c26 inline SROA::ConvertToScalar, no functionality change.
llvm-svn: 63544
2009-02-02 20:44:45 +00:00
Chris Lattner 18eba4f211 Fix a bug which caused us to miscompile a couple of Ada
tests.  Thanks for the beautiful reduced testcase Duncan!

llvm-svn: 63529
2009-02-02 18:02:59 +00:00
Duncan Sands 6f361ff345 Fix a comment (bytes -> bits), reformat a comment
and remove trailing whitespace.  No functionality
change.

llvm-svn: 63511
2009-02-02 10:06:20 +00:00
Duncan Sands 33d6e97e33 Fix an obvious thinko.
llvm-svn: 63510
2009-02-02 09:53:14 +00:00
Chris Lattner ec99c46d44 Simplify and generalize the SROA "convert to scalar" transformation to
be able to handle *ANY* alloca that is poked by loads and stores of 
bitcasts and GEPs with constant offsets.  Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc.  This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.

One case that is pretty great is that we compile 
2006-11-07-InvalidArrayPromote.ll into:

define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
	%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
	%tmp105 = bitcast <4 x i32> %tmp10 to i128
	%tmp1056 = zext i128 %tmp105 to i256	
	%tmp.upgrd.43 = lshr i256 %tmp1056, 96
	%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32	
	ret i32 %tmp.upgrd.44
}

which turns into:

_func:
	subl	$28, %esp
	cvttps2dq	%xmm1, %xmm0
	movaps	%xmm0, (%esp)
	movl	12(%esp), %eax
	addl	$28, %esp
	ret

Which is pretty good code all things considering :).

One effect of this is that SROA will start generating arbitrary bitwidth 
integers that are a multiple of 8 bits.  In the case above, we got a 
256 bit integer, but the codegen guys assure me that it can handle the 
simple and/or/shift/zext stuff that we're doing on these operations.

This addresses rdar://6532315

llvm-svn: 63469
2009-01-31 02:28:54 +00:00
Chris Lattner df17987c19 Fix some issues with volatility, move "CanConvertToScalar" check
after the others.

llvm-svn: 63227
2009-01-28 20:16:43 +00:00
Duncan Sands dc020f9c3c Rename getABITypeSize to getTypePaddedSize, as
suggested by Chris.

llvm-svn: 62099
2009-01-12 20:38:59 +00:00
Chris Lattner ae0e857b98 Fix PR3304
llvm-svn: 61995
2009-01-09 18:18:43 +00:00
Chris Lattner c518dfd11b This implements the second half of the fix for PR3290, handling
loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915
2009-01-08 05:42:05 +00:00
Chris Lattner f2b8c82ad1 Implement the first half of PR3290: if there is a store of an
integer to a (transitive) bitcast the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853
2009-01-07 08:11:13 +00:00
Chris Lattner 9a2de65fd6 Factor a bunch of code out into a helper method.
llvm-svn: 61852
2009-01-07 07:18:45 +00:00
Chris Lattner db561146aa use continue to simplify code and reduce nesting, no functionality
change.

llvm-svn: 61851
2009-01-07 06:39:58 +00:00
Chris Lattner 938b54f383 Get TargetData once up front and cache as an ivar instead of
requerying it all over the place.

llvm-svn: 61850
2009-01-07 06:34:28 +00:00
Chris Lattner a63dba9e6c Use the hasAllZeroIndices predicate to simplify some
code, no functionality change.

llvm-svn: 61849
2009-01-07 06:25:07 +00:00
Dale Johannesen 0a7b4f5800 Allow SROA of vectors. Removing this caused a
huge performance regression in something we care
about.  This may not be final fix.

llvm-svn: 58718
2008-11-04 20:54:03 +00:00
Matthijs Kooijman cbe5e16eb5 Allow scalarrepl to treat an all-zero GEP just as bitcast.
This includes not marking a GEP involving a vector as unsafe, but only when it
has all zero indices. This allows scalarrepl to work in a few more cases.

llvm-svn: 57177
2008-10-06 16:23:31 +00:00
Dan Gohman a79db30d28 Tidy up several unbeseeming casts from pointer to intptr_t.
llvm-svn: 55779
2008-09-04 17:05:41 +00:00
Chris Lattner 3f972c9150 Fix PR2423 by checking all indices for out of range access, not only
indices that start with an array subscript.  x->field[10000] is just 
as bad as (*X)[14][10000].

llvm-svn: 55226
2008-08-23 05:21:06 +00:00
Chris Lattner 4d754bc97b minor tidying of comments.
llvm-svn: 52630
2008-06-23 17:11:23 +00:00
Chris Lattner 6ff85681e4 Fix PR2369 by making scalarrepl more careful about promoting
structures.  Its default threshold is to promote things that are
smaller than 128 bytes, which is sane.  However, it is not sane
to do this for things that turn into 128 *registers*.  Add a cap
on the number of registers introduced, defaulting to 128/4=32.

llvm-svn: 52611
2008-06-22 17:46:21 +00:00
Matthijs Kooijman 812989b147 Learn ScalarReplAggregrates how stores and loads of first class aggregrates
work and how to replace them into individual values. Also, when trying to
replace an aggregrate that is used by load or store with a single (large)
integer, don't crash (but don't replace the aggregrate either).

Also adds a testcase for both structs and arrays.

llvm-svn: 51997
2008-06-05 12:51:53 +00:00
Duncan Sands fc3c489b52 Change packed struct layout so that field sizes
are the same as in unpacked structs, only field
positions differ.  This only matters for structs
containing x86 long double or an apint; it may
cause backwards compatibility problems if someone
has bitcode containing a packed struct with a
field of one of those types.
The issue is that only 10 bytes are needed to
hold an x86 long double: the store size is 10
bytes, but the ABI size is 12 or 16 bytes (linux/
darwin) which comes from rounding the store size
up by the alignment.  Because it seemed silly not
to pack an x86 long double into 10 bytes in a
packed struct, this is what was done.  I now
think this was a mistake.  Reserving the ABI size
for an x86 long double field even in a packed
struct makes things more uniform: the ABI size is
now always used when reserving space for a type.
This means that developers are less likely to
make mistakes.  It also makes life easier for the
CBE which otherwise could not represent all LLVM
packed structs (PR2402).
Front-end people might need to adjust the way
they create LLVM structs - see following change
to llvm-gcc.

llvm-svn: 51928
2008-06-04 08:21:45 +00:00
Dan Gohman 7a0566b9cd Use isSingleValueType instead of isFirstClassType to
exclude struct and array types.

llvm-svn: 51456
2008-05-23 00:12:03 +00:00
Gabor Greif e1f6e4b21d API change for {BinaryOperator|CmpInst|CastInst}::create*() --> Create. Legacy interfaces will be in place for some time. (Merge from use-diet branch.)
llvm-svn: 51200
2008-05-16 19:29:10 +00:00
Dan Gohman d78c400b5b Clean up the use of static and anonymous namespaces. This turned up
several things that were neither in an anonymous namespace nor static
but not intended to be global.

llvm-svn: 51017
2008-05-13 00:00:25 +00:00
Gabor Greif e9ecc68d8f API changes for class Use size reduction, wave 1.
Specifically, introduction of XXX::Create methods
for Users that have a potentially variable number of
Uses.

llvm-svn: 49277
2008-04-06 20:25:17 +00:00
Chris Lattner c966cebe93 fix a bug Anders ran into where scalarrepl would crash when promoting
a union containing a vector and an array whose elements were smaller than
the vector elements.  this means we need to compile the load of the 
array elements into an extract element plus a truncate.

llvm-svn: 47752
2008-02-29 07:12:06 +00:00
Chris Lattner 77205def10 Refactor some code out of ConvertUsesToScalar into their own methods, no
functionality change.

llvm-svn: 47751
2008-02-29 07:03:13 +00:00
Chris Lattner dcddd64424 Fix scalarrepl to not 'miscompile' undefined code, part #2.
This fixes the store case, my previous patch just fixed the load
case.  rdar://5707076.

llvm-svn: 46932
2008-02-10 19:05:37 +00:00
Chris Lattner b9e5b8fb9e Fix a bug where scalarrepl would discard offset if type would match.
In practice this can only happen on code with already undefined behavior, 
but this is still a good thing to handle correctly.

llvm-svn: 46539
2008-01-30 00:39:15 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Duncan Sands f042e862fd At the point of calculating the shift amount, the
type of SV has changed from what it originally was.
However we need the store width of the original.

llvm-svn: 43775
2007-11-06 20:39:11 +00:00
Duncan Sands f07fa24289 If a long double is in a packed struct, it may be
that there is no padding.

llvm-svn: 43691
2007-11-05 00:35:07 +00:00
Duncan Sands 399d97987b Change uses of getTypeSize to getABITypeSize, getTypeStoreSize
or getTypeSizeInBits as appropriate in ScalarReplAggregates.
The right change to make was not always obvious, so it would
be good to have an sroa guru review this.  While there I noticed
some bugs, and fixed them: (1) arrays of x86 long double have
holes due to alignment padding, but this wasn't being spotted
by HasStructPadding (renamed to HasPadding).  The same goes
for arrays of oddly sized ints.  Vectors also suffer from this,
in fact the problem for vectors is much worse because basic
vector assumptions seem to be broken by vectors of type with
alignment padding.   I didn't try to fix any of these vector
problems.  (2) The code for extracting smaller integers from
larger ones (in the "int union" case) was wrong on big-endian
machines for integers with size not a multiple of 8, like i1.
Probably this is impossible to hit via llvm-gcc, but I fixed
it anyway while there and added a testcase.  I also got rid of
some trailing whitespace and changed a function name which
had an obvious typo in it.

llvm-svn: 43672
2007-11-04 14:43:57 +00:00
Dale Johannesen 1d1d0e7735 Don't do SRA for unions with long double fields.
Fixes a SWB crash.

llvm-svn: 42422
2007-09-28 00:21:38 +00:00
David Greene c656cbb8c2 Update GEP constructors to use an iterator interface to fix
GLIBCXX_DEBUG issues.

llvm-svn: 41697
2007-09-04 15:46:09 +00:00
Chris Lattner 1f70816c73 Fix an accidental commit.
llvm-svn: 40758
2007-08-02 21:33:36 +00:00
Chris Lattner 2740694450 wrap some long lines. Major offenders that are left include
gvn, gvnpre, dse, and predsimplify.  To see these, use:

  make check-line-length

llvm-svn: 40738
2007-08-02 16:53:43 +00:00
Dan Gohman 34d442f274 More explicit keywords.
llvm-svn: 40673
2007-08-01 15:32:29 +00:00
David Greene 17a5dfe6f7 New CallInst interface to address GLIBCXX_DEBUG errors caused by
indexing an empty std::vector.

Updates to all clients.

llvm-svn: 40660
2007-08-01 03:43:44 +00:00
Dan Gohman 06c60b6032 Fix comments about vectors to use the current wording.
llvm-svn: 39921
2007-07-16 14:29:03 +00:00
Devang Patel e8ec7661ea Expose struct size threhold to allow users to tweak their own setting.
llvm-svn: 38472
2007-07-09 21:19:23 +00:00
Zhou Sheng 1ee941dac4 Correct a typo.
llvm-svn: 37936
2007-07-06 06:01:16 +00:00
Devang Patel fc7fdef7d2 Use DominatorTree instead of ETForest.
This allows faster immediate domiantor walk.

llvm-svn: 37500
2007-06-07 21:57:03 +00:00
Chris Lattner 8767920f20 Fix Transforms/ScalarRepl/2007-05-29-MemcpyPreserve.ll and the second
half of PR1421, by not decimating structs with holes that are the source and
destination of a memcpy.

llvm-svn: 37358
2007-05-30 06:11:23 +00:00
Chris Lattner 80c94a4a04 Fix PR1446 by not scalarrepl'ing giant structures.
llvm-svn: 37326
2007-05-24 18:43:04 +00:00
Nick Lewycky e7da2d6ac3 Fix typo in comment.
llvm-svn: 36873
2007-05-06 13:37:16 +00:00
Devang Patel 8c78a0bff0 Drop 'const'
llvm-svn: 36662
2007-05-03 01:11:54 +00:00
Devang Patel e95c6ad802 Use 'static const char' instead of 'static const int'.
Due to darwin gcc bug, one version of darwin linker coalesces
static const int, which defauts PassID based pass identification.

llvm-svn: 36652
2007-05-02 21:39:20 +00:00
Devang Patel 09f162ca6a Do not use typeinfo to identify pass in pass manager.
llvm-svn: 36632
2007-05-01 21:15:47 +00:00
Devang Patel d3ccc073a2 Mem2Reg does not need TargetData.
llvm-svn: 36444
2007-04-25 18:32:35 +00:00
Devang Patel 073be55d8e Remove unused function argument.
llvm-svn: 36441
2007-04-25 17:15:20 +00:00
Chris Lattner 827cb98a0a If an alloca only has two types of uses: 1) reads 2) a memcpy/memmove that
copies from a constant global, then we can change the reads to read from the
global instead of from the alloca.  This eliminates the alloca and the memcpy,
and promotes secondary optimizations (because the loads are now loads from
a constant global).

This is important for a common C idiom:

void foo() {
   int A[] = {1,2,3,4,5,6,7,8,9...};
   ... only reads of A ...
}

For some reason, people forget to mark the array static or const.

This triggers on these multisource benchmarks:
JM/ldecode: block_pos, [3 x [4 x [4 x i32]]]
FreeBench/mason: m, [18 x i32], inlined 4 times
MiBench/office-stringsearch: search_strings, [1332 x i8*]
MiBench/office-stringsearch: find_strings, [1333 x i8*]
Prolangs-C++/city: dirs, [9 x i8*], inlined 4 places

and these spec benchmarks:
177.mesa: message, [8 x [32 x i8]]
186.crafty: bias_rl45, [64 x i32]
186.crafty: diag_sq, [64 x i32]
186.crafty: empty, [9 x i8]
186.crafty: xlate, [15 x i8]
186.crafty: status, [13 x i8]
186.crafty: bdinfo, [25 x i8]
445.gobmk: routines, [16 x i8*]
458.sjeng: piece_rep, [14 x i8*]
458.sjeng: t, [13 x i32], inlined 4 places.
464.h264ref: block8x8_idx, [3 x [4 x [4 x i32]]]
464.h264ref: block_pos, [3 x [4 x [4 x i32]]]
464.h264ref: j_off_tab, [12 x i32]

This implements Transforms/ScalarRepl/memcpy-from-global.ll

llvm-svn: 36429
2007-04-25 06:40:51 +00:00
Chris Lattner 31e5addb67 refactor the SROA code out into its own method, no functionality change.
llvm-svn: 36426
2007-04-25 05:02:56 +00:00
Owen Anderson 2da606c757 Move more passes to using ETForest instead of DominatorTree.
llvm-svn: 36271
2007-04-20 06:27:13 +00:00
Zhou Sheng aafe4e216e Make use of ConstantInt::isZero instead of ConstantInt::isNullValue.
llvm-svn: 36261
2007-04-19 05:39:12 +00:00
Chris Lattner 5ee4d0726a Fix Transforms/ScalarRepl/union-pointer.ll
llvm-svn: 35906
2007-04-11 15:45:25 +00:00
Chris Lattner 32104034f8 fix a regression introduced by my last patch.
llvm-svn: 35879
2007-04-11 03:27:24 +00:00
Chris Lattner daa012d1fb Simplify SROA conversion to integer in some ways, make it more general in others.
We now tolerate small amounts of undefined behavior, better emulating what
would happen if the transaction actually occurred in memory.  This fixes
SingleSource/UnitTests/2007-04-10-BitfieldTest.c on PPC, at least until
Devang gets a chance to fix the CFE from doing undefined things with bitfields :)

llvm-svn: 35875
2007-04-11 00:57:54 +00:00
Dan Gohman dcb291faa4 Change uses of Function::front to Function::getEntryBlock for readability.
llvm-svn: 35265
2007-03-22 16:38:57 +00:00
Chris Lattner 9c62db7c8c fix ScalarRepl/2007-03-19-CanonicalizeMemcpy.ll
llvm-svn: 35169
2007-03-19 18:25:57 +00:00
Chris Lattner 877a3b424d implement the next chunk of SROA with memset/memcpy's of aggregates. This
implements Transforms/ScalarRepl/memset-aggregate-byte-leader.ll

llvm-svn: 35150
2007-03-19 00:16:43 +00:00
Chris Lattner abd3bff4f2 This appears correct, enable it so we can see perf changes on testers
llvm-svn: 35024
2007-03-08 07:03:55 +00:00
Chris Lattner 9f022d550b Second half of PR1226. This is currently still disabled, until I have a chance to
do the correctness/performance analysis testing.

llvm-svn: 35023
2007-03-08 06:36:54 +00:00
Chris Lattner 66e6a8229a This is the first major step of implementing PR1226. We now successfully
scalarrepl things down to elements, but mem2reg can't promote elements that
are memset/memcpy'd.  Until then, the code is disabled "0 &&".

llvm-svn: 34924
2007-03-05 07:52:57 +00:00
Reid Spencer 09575bac2e For PR1195:
Change use of "packed" term to "vector" in comments, strings, variable
names, etc.

llvm-svn: 34300
2007-02-15 03:39:18 +00:00
Reid Spencer d84d35ba70 For PR1195:
Rename PackedType -> VectorType, ConstantPacked -> ConstantVector, and
PackedTyID -> VectorTyID. No functional changes.

llvm-svn: 34293
2007-02-15 02:26:10 +00:00
Chris Lattner a731513406 stop using methods that take vectors.
llvm-svn: 34205
2007-02-12 22:56:41 +00:00
Chris Lattner 6e0123b17f Simplify code by using value::takename
llvm-svn: 34176
2007-02-11 01:23:03 +00:00
Chris Lattner c473d8e431 Privatize StructLayout::MemberOffsets, adding an accessor
llvm-svn: 34156
2007-02-10 19:55:17 +00:00
Reid Spencer 0d5f9237b6 Use short form of binary operator create functions.
llvm-svn: 33783
2007-02-02 14:08:20 +00:00
Reid Spencer 2341c22ec7 Changes to support making the shift instructions be true BinaryOperators.
This feature is needed in order to support shifts of more than 255 bits
on large integer types.  This changes the syntax for llvm assembly to
make shl, ashr and lshr instructions look like a binary operator:
   shl i32 %X, 1
instead of
   shl i32 %X, i8 1
Additionally, this should help a few passes perform additional optimizations.

llvm-svn: 33776
2007-02-02 02:16:23 +00:00
Reid Spencer 2eadb5310d For PR970:
Clean up handling of isFloatingPoint() and dealing with PackedType.
Patch by Gordon Henriksen!

llvm-svn: 33415
2007-01-21 00:29:26 +00:00
Reid Spencer a94d394ad2 For PR1043:
This is the final patch for this PR. It implements some minor cleanup
in the use of IntegerType, to wit:
1. Type::getIntegerTypeMask -> IntegerType::getBitMask
2. Type::Int*Ty changed to IntegerType* from Type*
3. ConstantInt::getType() returns IntegerType* now, not Type*

This also fixes PR1120.

Patch by Sheng Zhou.

llvm-svn: 33370
2007-01-19 21:13:56 +00:00
Chris Lattner 03c4953cdd rename Type::isIntegral to Type::isInteger, eliminating the old Type::isInteger.
rename Type::getIntegralTypeMask to Type::getIntegerTypeMask.

This makes naming much more consistent.  For example, there are now no longer any
instances of IntegerType that are not considered isInteger! :)

llvm-svn: 33225
2007-01-15 02:27:26 +00:00
Chris Lattner 1942249c5b Eliminate calls to isInteger, generalizing code and tightening checks as needed.
llvm-svn: 33218
2007-01-15 01:55:30 +00:00
Reid Spencer 7a9c62baa6 For PR1064:
Implement the arbitrary bit-width integer feature. The feature allows
integers of any bitwidth (up to 64) to be defined instead of just 1, 8,
16, 32, and 64 bit integers.

This change does several things:
1. Introduces a new Derived Type, IntegerType, to represent the number of
   bits in an integer. The Type classes SubclassData field is used to
   store the number of bits. This allows 2^23 bits in an integer type.
2. Removes the five integer Type::TypeID values for the 1, 8, 16, 32 and
   64-bit integers. These are replaced with just IntegerType which is not
   a primitive any more.
3. Adjust the rest of LLVM to account for this change.

Note that while this incremental change lays the foundation for arbitrary
bit-width integers, LLVM has not yet been converted to actually deal with
them in any significant way. Most optimization passes, for example, will
still only deal with the byte-width integer types.  Future increments
will rectify this situation.

llvm-svn: 33113
2007-01-12 07:05:14 +00:00
Reid Spencer 8f166b0ef3 Comparison of primitive type sizes should now be done in bits, not bytes.
This patch converts getPrimitiveSize to getPrimitiveSizeInBits where it is
appropriate to do so (comparison of integer primitive types).

llvm-svn: 33012
2007-01-08 16:32:00 +00:00
Reid Spencer c635f47d9a For PR950:
This patch replaces signed integer types with signless ones:
1. [US]Byte -> Int8
2. [U]Short -> Int16
3. [U]Int   -> Int32
4. [U]Long  -> Int64.
5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion
   and other methods related to signedness. In a few places this warranted
   identifying the signedness information from other sources.

llvm-svn: 32785
2006-12-31 05:48:39 +00:00
Reid Spencer 266e42b312 For PR950:
This patch removes the SetCC instructions and replaces them with the ICmp
and FCmp instructions. The SetCondInst instruction has been removed and
been replaced with ICmpInst and FCmpInst.

llvm-svn: 32751
2006-12-23 06:05:41 +00:00
Chris Lattner f171af97d5 add a simple fast-path for dead allocas
llvm-svn: 32750
2006-12-22 23:14:42 +00:00
Chris Lattner 79a42ac941 Switch over Transforms/Scalar to use the STATISTIC macro. For each statistic
converted, we lose a static initializer.  This also allows GCC to emit warnings
about unused statistics.

llvm-svn: 32690
2006-12-19 21:40:18 +00:00
Chris Lattner 8f7b775bf4 re-enable a temporarily-reverted patch
llvm-svn: 32595
2006-12-15 07:32:38 +00:00
Chris Lattner 7c1dff99dc revert my recent int<->fp and vector union promotion changes, they expose
obscure bugs affecting the X86 code generator.  I will reenable this
when fixed.

llvm-svn: 32524
2006-12-13 02:26:45 +00:00
Chris Lattner 6e5fe376ec Patch for PR1045 and Transforms/ScalarRepl/2006-12-11-SROA-Crash.ll
llvm-svn: 32468
2006-12-12 04:24:41 +00:00
Chris Lattner e810140c4b trunc to integer, not to FP.
llvm-svn: 32426
2006-12-11 01:17:00 +00:00
Chris Lattner 23f4b68f7e implement promotion of unions containing two packed types of the same width.
This implements Transforms/ScalarRepl/union-packed.ll

llvm-svn: 32422
2006-12-11 00:35:08 +00:00
Chris Lattner 216c3028e6 * Eliminate calls to CastInst::createInferredCast.
* Add support for promoting unions with fp values in them.  This produces
   our new int<->fp bitcast instructions, implementing
   Transforms/ScalarRepl/union-fp-int.ll

As an example, this allows us to compile this:

union intfloat { int i; float f; };
float invsqrt(const float arg_x) {
    union intfloat x = { .f = arg_x };
    const float xhalf = arg_x * 0.5f;
    x.i = 0x5f3759df - (x.i >> 1);
    return x.f * (1.5f - xhalf * x.f * x.f);
}

into:

_invsqrt:
        movss 4(%esp), %xmm0
        movd %xmm0, %eax
        sarl %eax
        movl $1597463007, %ecx
        subl %eax, %ecx
        movd %ecx, %xmm1
        mulss LCPI1_0, %xmm0
        mulss %xmm1, %xmm0
        movss LCPI1_1, %xmm2
        mulss %xmm1, %xmm0
        subss %xmm0, %xmm2
        movl 8(%esp), %eax
        mulss %xmm2, %xmm1
        movss %xmm1, (%eax)
        ret

instead of:

_invsqrt:
        subl $4, %esp
        movss 8(%esp), %xmm0
        movss %xmm0, (%esp)
        movl (%esp), %eax
        movl $1597463007, %ecx
        sarl %eax
        subl %eax, %ecx
        movl %ecx, (%esp)
        mulss LCPI1_0, %xmm0
        movss (%esp), %xmm1
        mulss %xmm1, %xmm0
        mulss %xmm1, %xmm0
        movss LCPI1_1, %xmm2
        subss %xmm0, %xmm2
        mulss %xmm2, %xmm1
        movl 12(%esp), %eax
        movss %xmm1, (%eax)
        addl $4, %esp
        ret

llvm-svn: 32418
2006-12-10 23:56:50 +00:00
Chris Lattner 700b873130 Detemplatize the Statistic class. The only type it is instantiated with
is 'unsigned'.

llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Reid Spencer 6c38f0bb07 For PR950:
The long awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr which fails only on a slight floating point output
difference.

llvm-svn: 31931
2006-11-27 01:05:10 +00:00
Bill Wendling 5dbf43c983 Removed #include <iostream> and replaced with llvm_* streams.
llvm-svn: 31923
2006-11-26 09:46:52 +00:00
Reid Spencer fdff938a7e For PR950:
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
dependent on the sign of their operands.

llvm-svn: 31542
2006-11-08 06:47:33 +00:00
Chris Lattner 4967f6ddea scalarrepl should not split the two elements of the vsiidx array:
int func(vFloat v0, vFloat v1) {
        int ii;
        vSInt32 vsiidx[2];
        vsiidx[0] = _mm_cvttps_epi32(v0);
        vsiidx[1] = _mm_cvttps_epi32(v1);
        ii = ((int *) vsiidx)[4];
        return ii;
}

This fixes Transforms/ScalarRepl/2006-11-07-InvalidArrayPromote.ll

llvm-svn: 31524
2006-11-07 22:42:47 +00:00
Reid Spencer de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Chris Lattner ebb1ad4382 Fix Transforms/ScalarRepl/2006-10-23-PointerUnionCrash.ll
llvm-svn: 31151
2006-10-24 06:26:32 +00:00
Reid Spencer e0fc4dfc22 For PR950:
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063
2006-10-20 07:07:24 +00:00
Chris Lattner 41b442242d Implement SROA of unions with mixed pointers/integers in them. This implements
PR892 and Transforms/ScalarRepl/union-pointer.ll:test2

llvm-svn: 30825
2006-10-08 23:53:04 +00:00
Chris Lattner 05f8272afa Implement Transforms/ScalarRepl/union-pointer.ll:test
llvm-svn: 30823
2006-10-08 23:28:04 +00:00
Chris Lattner c2d3d3112e eliminate RegisterOpt. It does the same thing as RegisterPass.
llvm-svn: 29925
2006-08-27 22:42:52 +00:00
Chris Lattner 3d27be1333 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner 996795b0dd Use hidden visibility to make symbols in an anonymous namespace get
dropped.  This shrinks libllvmgcc.dylib another 67K

llvm-svn: 28975
2006-06-28 23:17:24 +00:00
Chris Lattner dae49df407 Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll
llvm-svn: 27912
2006-04-20 20:48:50 +00:00