nodes. This allows it to do fairly general phi insertion if a
load from a pointer global wants to be SRAd but the load is used
by (recursive) phi nodes. This fixes a pessimization on ppc
introduced by Load PRE.
llvm-svn: 61123
s/ParamAttr/Attribute/g
s/PAList/AttrList/g
s/FnAttributeWithIndex/AttributeWithIndex/g
s/FnAttr/Attribute/g
This sets the stage
- to implement function notes as function attributes and
- to distinguish between function attributes and return value attributes.
This requires corresponding changes in llvm-gcc and clang.
llvm-svn: 56622
1. There is now a "PAListPtr" class, which is a smart pointer around
the underlying uniqued parameter attribute list object, and manages
its refcount. It is now impossible to mess up the refcount.
2. PAListPtr is now the main interface to the underlying object, and
the underlying object is now completely opaque.
3. Implementation details like SmallVector and FoldingSet are now no
longer part of the interface.
4. You can create a PAListPtr with an arbitrary sequence of
ParamAttrsWithIndex's, no need to make a SmallVector of a specific
size (you can just use an array or scalar or vector if you wish).
5. All the client code that had to check for a null pointer before
dereferencing the pointer is simplified to just access the
PAListPtr directly.
6. The interfaces for adding attrs to a list and removing them is a
bit simpler.
Phase #2 will rename some stuff (e.g. PAListPtr) and do other less
invasive changes.
llvm-svn: 48289
The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment. This gives a primitive type for
which getTypeSize differed from getABITypeSize. For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwriten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).
This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition). Instead there is:
(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type. For a primitive type, this is the minimum number
of bits. For an i36 this is 36 bits. For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.
(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it). For an
i36 this is 40 bits, for an x86 long double it is 80 bits. This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes). There doesn't seem to be anything
corresponding to this in gcc.
(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment. For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS. This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes). This is
TYPE_SIZE in gcc.
Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize. This means that the size of an array
is the length times the getABITypeSize. It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize. Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case. So alloca's and mallocs should use getABITypeSize. Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.
Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.
In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases). I will get around to auditing these too at some point,
but I could do with some help.
Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize. I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only effects structs containing long doubles and arbitrary
precision integers. If someone wants to pack these types more
tightly they can always use a packed struct.
llvm-svn: 43620
miscompilation of 188.ammp. Reject select and bitcast in
ValueIsOnlyUsedLocallyOrStoredToOneGlobal because RewriteHeapSROALoadUser can't handle it.
llvm-svn: 41950
the Transforms library. This reduces debug library size by 132 KB, debug
binary size by 376 KB, and reduces link time for llvm tools slightly.
llvm-svn: 33939
This feature is needed in order to support shifts of more than 255 bits
on large integer types. This changes the syntax for llvm assembly to
make shl, ashr and lshr instructions look like a binary operator:
shl i32 %X, 1
instead of
shl i32 %X, i8 1
Additionally, this should help a few passes perform additional optimizations.
llvm-svn: 33776
recommended that getBoolValue be replaced with getZExtValue and that
get(bool) be replaced by get(const Type*, uint64_t). This implements
those changes.
llvm-svn: 33110
This patch replaces signed integer types with signless ones:
1. [US]Byte -> Int8
2. [U]Short -> Int16
3. [U]Int -> Int32
4. [U]Long -> Int64.
5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion
and other methods related to signedness. In a few places this warranted
identifying the signedness information from other sources.
llvm-svn: 32785
This patch removes the SetCC instructions and replaces them with the ICmp
and FCmp instructions. The SetCondInst instruction has been removed and
been replaced with ICmpInst and FCmpInst.
llvm-svn: 32751
The long awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr which fails only on a slight floating point output
difference.
llvm-svn: 31931
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.
llvm-svn: 31380
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.
llvm-svn: 31063
DLL* linkages got full (I hope) codegeneration support in C & both x86
assembler backends.
External weak linkage added for future use, we don't provide any
codegeneration, etc. support for it.
llvm-svn: 30374
Add support for specifying alignment and size of setjmp jmpbufs.
No targets currently do anything with this information, nor is it presrved
in the bytecode representation. That's coming up next.
llvm-svn: 24196
Implement the start of global ctor optimization. It is currently smart
enough to remove the global ctor for cases like this:
struct foo {
foo() {}
} x;
... saving a bit of startup time for the program.
llvm-svn: 23433
in SPEC, the subsequent optimziations that we are after don't play with
with FP values, so disable this xform for them. Really we just don't want
stuff like:
double G; (always 0 or 412312.312)
= G;
turning into:
bool G_b;
= G_b ? 412312.312 : 0;
We'd rather just do the load.
-Chris
llvm-svn: 18819
down to actually BE a bool. This allows simple value range propagation
stuff work harder, deleting comparisons in bzip2 in some hot loops.
This implements GlobalOpt/integer-bool.ll, which is the essence of the
loop condition distilled into a testcase.
llvm-svn: 18817
in scary and unknown ways before we promote it. This fixes the miscompilation
of 188.ammp that has been plauging us since a globalopt patch went in.
Thanks a ton to Tanya for helping me diagnose the problem!
llvm-svn: 18418
value. This allows us to turn more globals into constants and eliminate them.
This patch implements GlobalOpt/load-store-global.llx.
Note that this patch speeds up 255.vortex from:
Output/255.vortex.out-cbe.time:program 7.640000
Output/255.vortex.out-llc.time:program 9.810000
to:
Output/255.vortex.out-cbe.time:program 7.250000
Output/255.vortex.out-llc.time:program 9.490000
Which isn't bad at all!
llvm-svn: 17746
First, it allows SRA of globals that have embedded arrays, implementing
GlobalOpt/globalsra-partial.llx. This comes up infrequently, but does allow,
for example, deleting several stores to dead parts of globals in dhrystone.
Second, this implements GlobalOpt/malloc-promote-*.llx, which is the
following nifty transformation:
Basically if a global pointer is initialized with malloc, and we can tell
that the program won't notice, we transform this:
struct foo *FooPtr;
...
FooPtr = malloc(sizeof(struct foo));
...
FooPtr->A FooPtr->B
Into:
struct foo FooPtrBody;
...
FooPtrBody.A FooPtrBody.B
This comes up occasionally, for example, the 'disp' global in 183.equake (where
the xform speeds the CBE version of the program up from 56.16s to 52.40s (7%)
on apoc), and the 'desired_accept', 'fixLRBT', 'macroArray', & 'key_queue'
globals in 300.twolf (speeding it up from 22.29s to 21.55s (3.4%)).
The nice thing about this xform is that it exposes the resulting global to
global variable optimization and makes alias analysis easier in addition to
eliminating a few loads.
llvm-svn: 16916
still optimize away all of the indirect calls and loads, etc from it.
This turns code like this:
if (G != 0)
G();
into
if (G != 0)
ActualCallee();
This triggers a couple of times in gcc and libstdc++.
llvm-svn: 16901
stored to, but are stored at variable indexes. This occurs at least in
176.gcc, but probably others, and we should handle it for completeness.
llvm-svn: 16876
has a large number of users. Instead, just keep track of whether we're
making changes as we do so.
This patch has no functionlity changes.
llvm-svn: 16874
we know that all uses of the global will trap if the pointer contained is
null. In this case, we forward substitute the stored value to any uses.
This has the effect of devirtualizing trivial globals in trivial cases. For
example, 164.gzip contains this:
gzip.h:extern int (*read_buf) OF((char *buf, unsigned size));
bits.c: read_buf = file_read;
deflate.c: lookahead = read_buf((char*)window,
deflate.c: n = read_buf((char*)window+strstart+lookahead, more);
Since read_buf has to point to file_read at every use, we just replace
the calls through read_buf with a direct call to file_read.
This occurs in several benchmarks, including 176.gcc and 164.gzip. Direct
calls are good and stuff.
llvm-svn: 16871
* Do not lead dangling dead constants prevent optimization
* Iterate global optimization while we're making progress.
These changes allow us to be more aggressive, handling cases like
GlobalOpt/iterate.llx without a problem (turning it into 'ret int 0').
llvm-svn: 16857
optimizations to trigger much more often. This allows the elimination of
several dozen more global variables in Programs/External. Note that we only
do this for non-constant globals: constant globals will already be optimized
out if the accesses to them permit it.
This implements Transforms/GlobalOpt/globalsra.llx
llvm-svn: 16842
* Instead of handling dead functions specially, just nuke them.
* Be more aggressive about cleaning up after constification, in
particular, handle getelementptr instructions and constantexprs.
* Be a little bit more structured about how we process globals.
*** Delete globals that are only stored to, and never read. These are
clearly not useful, so they should go. This implements deadglobal.llx
This last one triggers quite a few times. In particular, 2208 in the
external tests, 1865 of which are in 252.eon. This shrinks eon from
1995094 to 1732341 bytes of bytecode.
llvm-svn: 16802