use raw_ostream instead of std::ostream. Among other goodness,
this speeds up llvm-dis of kc++ with a release build from 0.85s
to 0.49s (88% faster).
Other interesting changes:
1) This makes Value::print be non-virtual.
2) AP[S]Int and ConstantRange can no longer print to ostream directly,
use raw_ostream instead.
3) This fixes a bug in raw_os_ostream where it didn't flush itself
when destroyed.
4) This adds a new SDNode::print method, instead of only allowing "dump".
A lot of APIs have both std::ostream and raw_ostream versions, it would
be useful to go through and systematically anihilate the std::ostream
versions.
This passes dejagnu, but there may be minor fallout, plz let me know if
so and I'll fix it.
llvm-svn: 55263
continue past the first conditional branch when looking for a
relevant test. This helps it avoid using MAX expressions in
loop trip counts in more cases.
llvm-svn: 54697
type lattice value for an Argument*, giving clients the opportunity to
use something other than Top for it if they choose to."
Patch by John McCall!
llvm-svn: 54589
version uses a new algorithm for evaluating the binomial coefficients
which is significantly more efficient for AddRecs of more than 2 terms
(see the comments in the code for details on how the algorithm works).
It also fixes some bugs: it removes the arbitrary length restriction for
AddRecs, it fixes the silent generation of incorrect code for AddRecs
which require a wide calculation width, and it fixes an issue where we
were incorrectly truncating the iteration count too far when evaluating
an AddRec expression narrower than the induction variable.
There are still a few related issues I know of: I think there's
still an issue with the SCEVExpander expansion of AddRec in terms of
the width of the induction variable used. The hack to avoid generating
too-wide integers shouldn't be necessary; instead, the callers should be
considering the cost of the expansion before expanding it (in addition
to not expanding too-wide integers, we might not want to expand
expressions that are really expensive, especially when optimizing for
size; calculating an length-17 32-bit AddRec currently generates about 250
instructions of straight-line code on X86). Also, for long 32-bit
AddRecs on X86, CodeGen really sucks at scheduling the code. I'm planning on
filing follow-up PRs for these issues.
llvm-svn: 54332
time applying to the implicit comparison in smin expressions. The
correct way to transform an inequality into the opposite
inequality, either signed or unsigned, is with a not expression.
I looked through the SCEV code, and I don't think there are any more
occurrences of this issue.
llvm-svn: 54194
SGT exit condition. Essentially, the correct way to flip an inequality
in 2's complement is the not operator, not the negation operator.
That said, the difference only affects cases involving INT_MIN.
Also, enhance the pre-test search logic to be a bit smarter about
inequalities flipped with a not operator, so it can eliminate the smax
from the iteration count for simple loops.
llvm-svn: 54184
circumstances we could end up remapping a dependee to the same instruction
that we're trying to remove. Handle this properly by just falling back to
a conservative solution.
llvm-svn: 54132
bail after 256-bits to avoid producing code that the backends can't handle.
Previously, we capped it at 64-bits, preferring to miscompile in those cases.
This change also reverts much of r52248 because the invariants the code was
expecting are now being met.
llvm-svn: 53812
Move GetConstantStringInfo to lib/Analysis. Remove
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
This unbreaks llvm-gcc bootstrap.
llvm-svn: 52884
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
llvm-svn: 52748
inserting extractvalues. In particular, this prevents the insertion of
extractvalues that can't be folded away later. Also add an example of when this
stuff is needed.
llvm-svn: 52328
I'm at it, rename it to FindInsertedValue.
The only functional change is that newly created instructions are no longer
added to instcombine's worklist, but that is not really necessary anyway (and
I'll commit some improvements next that will completely remove the need).
llvm-svn: 52315
This fixes several minor bugs (such as returning noalias
for comparisons between external weak functions an null) but
is mostly a cleanup.
llvm-svn: 52299
take into account the instrucion pointed by InsertPt. Thanks to it,
returning the new value of InsertPt to the InsertBinop() caller can be
avoided. The bug was, actually, in visitAddRecExpr() method which wasn't
correctly handling changes of InsertPt. There shouldn't be any
performance regression, as -gvn pass (run after -indvars) removes any
redundant binops.
llvm-svn: 52291
Add a safety measure. It isn't safe to assume in ScalarEvolutionExpander that
all loops are in canonical form (but it should be safe for loops that have
AddRecs).
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
llvm-svn: 52275
with code that was expecting different bit widths for different values.
Make getTruncateOrZeroExtend a method on ScalarEvolution, and use it.
llvm-svn: 52248
is longer than the second one) should stop after finding one. Added break
instruction guarantees it. It also changes difference between offsets to
absolute value of this difference in the condition.
llvm-svn: 51875
out of instcombine into a new file in libanalysis. This also teaches
ComputeNumSignBits about the number of sign bits in a constantint.
llvm-svn: 51863
Analysis/ConstantFolding to fold ConstantExpr's, then make instcombine use it
to try to use targetdata to fold constant expressions on void instructions.
Also extend the icmp(inttoptr, inttoptr) folding to handle the case where
int size != ptr size.
llvm-svn: 51559
SCCP like sparse lattice analysis with relative ease. Just pick your
lattice function and implement the transfer function and you're good.
Just make sure you don't break monotonicity ;-)
llvm-svn: 50961
by an instance of LibCallInfo to provide mod/ref info of
standard library functions. This is powerful enough to
say that 'sqrt' is readonly except that it modifies errno,
or that "printf doesn't store to memory unless the %n
constraint is present" etc.
llvm-svn: 50827
Currently is sufficient to describe mod/ref behavior but will hopefully
eventually be extended for other purposes.
This isn't used by anything yet.
llvm-svn: 50820
manually performing the comparison. This allows the special
case to work correctly even in the case where someone is
experimenting with a different comparison function :-).
llvm-svn: 49670
Parse reversed smax and umax as smin and umin and express them with negative
or binary-not SCEVs (which are really just subtract under the hood).
Parse 'xor %x, -1' as (-1 - %x).
Remove dead code (ConstantInt::get always returns a ConstantInt).
Don't use getIntegerSCEV(-1, Ty). The first value is an int, then it gets
passed into a uint64_t. Instead, create the -1 directly from
ConstantInt::getAllOnesValue().
llvm-svn: 47360
Also, noalias arguments are be considered "like" stack allocated ones for this purpose, because
the only way they can be modref'ed is if they escape somewhere in the current function.
llvm-svn: 47247
variable (with step 1) and m is its final value. Then, the correct trip
count is SMAX(m,n)-n. Previously, we used SMAX(0,m-n), but m-n may
overflow and can't in general be interpreted as signed.
Patch by Nick Lewycky.
llvm-svn: 47007
to the RHS. This simple change allows to compute loop iteration count
for loops with condition similar to the one in the testcase (which seems
to be quite common).
llvm-svn: 46959
arbitrary iteration.
The patch:
1) changes SCEVSDivExpr into SCEVUDivExpr,
2) replaces PartialFact() function with BinomialCoefficient(); the
computations (essentially, the division) in BinomialCoefficient() are
performed with the apprioprate bitwidth necessary to avoid overflow;
unsigned division is used instead of the signed one.
Computations in BinomialCoefficient() require support from the code
generator for APInts. Currently, we use a hack rounding up the
neccessary bitwidth to the nearest power of 2. The hack is easy to turn
off in future.
One remaining issue: we assume the divisor of the binomial coefficient
formula can be computed accurately using 16 bits. It means we can handle
AddRecs of length up to 9. In future, we should use APInts to evaluate
the divisor.
Thanks to Nicholas for cooperation!
llvm-svn: 46955
and readnone for functions with bodies because it
broke llvm-gcc-4.2 bootstrap. It turns out that,
because of LLVM's array_ref hack, gcc was computing
pure/const attributes wrong (now fixed by turning
off the gcc ipa-pure-const pass).
llvm-svn: 44937
Reimplement the xform in Analysis/ConstantFolding.cpp where we can use
targetdata to validate that it is safe. While I'm in there, fix some const
correctness issues and generalize the interface to the "operand folder".
llvm-svn: 44817
throw exceptions", just mark intrinsics with the nounwind
attribute. Likewise, mark intrinsics as readnone/readonly
and get rid of special aliasing logic (which didn't use
anything more than this anyway).
llvm-svn: 44544
into alias analysis. This meant updating the API
which now has versions of the getModRefBehavior,
doesNotAccessMemory and onlyReadsMemory methods
which take a callsite parameter. These should be
used unless the callsite is not known, since in
general they can do a better job than the versions
that take a function. Also, users should no longer
call the version of getModRefBehavior that takes
both a function and a callsite. To reduce the
chance of misuse it is now protected.
llvm-svn: 44487
the function type, instead they belong to functions
and function calls. This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully
a bitcode guru (who might that be? :) ) will fix it.
llvm-svn: 44359
OnlyReadsMemoryFns tables are dead! We
get more, and more accurate, information
from gcc via the readnone and readonly
function attributes.
llvm-svn: 44288
The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment. This gives a primitive type for
which getTypeSize differed from getABITypeSize. For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwriten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).
This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition). Instead there is:
(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type. For a primitive type, this is the minimum number
of bits. For an i36 this is 36 bits. For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.
(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it). For an
i36 this is 40 bits, for an x86 long double it is 80 bits. This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes). There doesn't seem to be anything
corresponding to this in gcc.
(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment. For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS. This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes). This is
TYPE_SIZE in gcc.
Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize. This means that the size of an array
is the length times the getABITypeSize. It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize. Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case. So alloca's and mallocs should use getABITypeSize. Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.
Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.
In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases). I will get around to auditing these too at some point,
but I could do with some help.
Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize. I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only effects structs containing long doubles and arbitrary
precision integers. If someone wants to pack these types more
tightly they can always use a packed struct.
llvm-svn: 43620
and time usage.
Fixup operator == to make this work, and add a resize method to DenseMap
so we can resize our hashtable once we know how big it should be.
llvm-svn: 42269
Use APFloat in UpgradeParser and AsmParser.
Change all references to ConstantFP to use the
APFloat interface rather than double. Remove
the ConstantFP double interfaces.
Use APFloat functions for constant folding arithmetic
and comparisons.
(There are still way too many places APFloat is
just a wrapper around host float/double, but we're
getting there.)
llvm-svn: 41747
us to fold the entry block of PR1602 to false instead of:
br i1 icmp eq (i32 and (i32 ptrtoint (void (%struct.S*)* inttoptr (i64
1 to void (%struct.S*)*) to i32), i32 1), i32 0), label %cond_next, label
%cond_true
llvm-svn: 41023