and StoreSDNode into their common base class LSBaseSDNode. Member
functions getLoadedVT and getStoredVT are replaced with the common
getMemoryVT to simplify code that will handle both loads and stores.
llvm-svn: 46538
type that matters but the operand type. This fixes
2008-01-08-IllegalCMP.ll which crashed with the new
legalize infrastructure because SETCC with result
type i8 and operand type i64 was being custom expanded
by the X86 backend. With this fix, the gcc build gets
as far as the first libcall.
llvm-svn: 46525
only two addressing mode nodes, SPUaform and SPUindirect (vice the
three previous ones, SPUaform, SPUdform and SPUxform). This improves
code somewhat because we now avoid using reg+reg addressing when
it can be avoided. It also simplifies the address selection logic,
which was the main point for doing this.
Also, for various global variables that would be loaded using SPU's
A-form addressing, prefer D-form offs[reg] addressing, keeping the
base in a register if the variable is used more than once.
llvm-svn: 46483
way or the other. Rewriting the code itself prevents subsequent analysis
passes from making contradictory conclusions about the code that could
cause an infeasible path to be made feasible.
llvm-svn: 46427
registers if used by a bitconvert or using a bitconvert. This allows us to
avoid constant pool loads and use cheaper integer instructions when the
values come from or end up in integer regs anyway. For example, we now
compile CodeGen/X86/fp-in-intregs.ll to:
_test1:
movl $2147483648, %eax
xorl 4(%esp), %eax
ret
_test2:
movl $1065353216, %eax
orl 4(%esp), %eax
andl $3212836864, %eax
ret
Instead of:
_test1:
movss 4(%esp), %xmm0
xorps LCPI2_0, %xmm0
movd %xmm0, %eax
ret
_test2:
movss 4(%esp), %xmm0
andps LCPI3_0, %xmm0
movss LCPI3_1, %xmm1
andps LCPI3_2, %xmm1
orps %xmm0, %xmm1
movd %xmm1, %eax
ret
bitconverts can happen due to various calling conventions that require
fp values to passed in integer regs in some cases, e.g. when returning
a complex.
llvm-svn: 46414
void bork() {
int *address = 0;
*address = 0;
}
It's compiled into LLVM code that looks like this:
define void @bork() noreturn nounwind {
entry:
unreachable
}
This is bad on some platforms (like PPC) because it will generate the label for
the function but no body. The label could end up being associated with some
non-code related stuff, like a section. This places a "trap" instruction if the
SimplifyCFG pass removed all code from the function leaving only one
"unreachable" instruction.
llvm-svn: 46387
delete a node even if it was not dead in some cases. Instead, just add it to
the worklist. Also, make sure to use the CombineTo methods, as it was doing
things that were unsafe: the top level combine loop could touch dangling memory.
This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll
llvm-svn: 46384
was actually passing a completely incorrect size to sys_icache_invalidate.
Instead of having the JITEmitter do this (which doesn't have the correct
size), just make the target sync its own stubs.
llvm-svn: 46354
arrays. Also, as a convenience, don't barf, just
return false, if someone calls isTruncStoreLegal
or isLoadXLegal with an extended type for the in
memory type.
llvm-svn: 46352
APInt.
While some operators were already specifically overloaded for APSInt, others
resulted in using the overloaded operator methods in APInt, which would result
in the signedness bit being lost.
Modified the APSInt(APInt&) constructor to be "explicit" and to take an
extra (optional) flag to indicate the signedness. Making the ctor explicit
will catch any implicit conversations between APSInt -> APInt -> APSInt that
results in the signedness flag being lost.
llvm-svn: 46316
This case returns the value in ST(0) and then has to convert it to an SSE
register. This causes significant codegen ugliness in some cases. For
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:
_bar:
subl $28, %esp
call L_foo$stub
fstpl 16(%esp)
movsd 16(%esp), %xmm0
movsd %xmm0, 8(%esp)
fldl 8(%esp)
addl $28, %esp
ret
because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.
Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.
This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case. This gives
us this code for the example above:
_bar:
subl $12, %esp
call L_foo$stub
addl $12, %esp
ret
The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert). This is gross, but
less gross than the code it is replacing :)
This also allows us to generate better code in several other cases. For
example on fp-stack-ret-conv.ll, we now generate:
_test:
subl $12, %esp
call L_foo$stub
fstps 8(%esp)
movl 16(%esp), %eax
cvtss2sd 8(%esp), %xmm0
movsd %xmm0, (%eax)
addl $12, %esp
ret
where before we produced (incidentally, the old bad code is identical to what
gcc produces):
_test:
subl $12, %esp
call L_foo$stub
fstpl (%esp)
cvtsd2ss (%esp), %xmm0
cvtss2sd %xmm0, %xmm0
movl 16(%esp), %eax
movsd %xmm0, (%eax)
addl $12, %esp
ret
Note that we generate slightly worse code on pr1505b.ll due to a scheduling
deficiency that is unrelated to this patch.
llvm-svn: 46307
1. we already know the value is dead, so don't bother replacing
it with undef.
2. The very case the comment describes actually makes the load
live which asserts in deletenode. If we do the replacement
and the node becomes live, just treat it as new. This fixes
a failure on X86/2008-01-16-InvalidDAGCombineXform.ll with
some local changes in my tree.
llvm-svn: 46306
dead stuff around. This gets fed into the isel pass and causes certain foldings from
happening because nodes have extraneous uses floating around. For example, if we turned
foo(bar(x)) -> baz(x), we sometimes left bar(x) around.
llvm-svn: 46305
getNodeLabel(); these sequences allow the user to specify the characters '{',
'}', and '|' in the label, which facilitate breaking the label into multiple
record segments.
llvm-svn: 46283
precision integers. This won't actually work
(and most of the code is dead) unless the new
legalization machinery is turned on. While
there, I rationalized the handling of i1, and
removed some bogus (and unused) sextload patterns.
For i1, this could result in microscopically
better code for some architectures (not X86).
It might also result in worse code if annotating
with AssertZExt nodes turns out to be more harmful
than helpful.
llvm-svn: 46280
NDEBUG. This is in response to a really nasty bug I introduced that
Dale tracked down, hopefully this won't happen in the future.
Many thanks Dale.
llvm-svn: 46254
integers. Handle truncstore of a legal type to an unusual
number of bits. Most of this code is not reachable unless
the new legalize infrastructure is turned on.
llvm-svn: 46249
problem was that we previously hashed based on the pointers of the left and
right children, but this is bogus: we can easily have different trees that
represent the same set. Now we use a hashing based scheme that compares the
*contents* of the trees, but not without having to do a full scan of a tree. The
only caveat is that with hashing is that we may have collisions, which result in
two different trees being falsely labeled as equivalent. If this becomes a
problem, we can add extra data to the profile to hopefully resolve most
collisions.
llvm-svn: 46224
external symbols (e.g. 'fmod') as needing a stub. This regression
was introduced by Evan's jit patch here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20071231/056749.html
With this fixed, the two ExecutionEngine failures are passing on ppc,
and the ppc jit works on freebench and olden.
This should be pulled into the 2.2 release branch.
llvm-svn: 46222
that return an opaque type by value, as long as you don't
call it or provide a body (you can take the address of it).
So it is wrong to insist that sret parameters not be an
opaque*. And I guess it is really up to codegen to complain
if someone tries to call such a function. I'm also removing
the analogous check from byval parameters, since I don't
see why we shouldn't allow them as long as no-one tries to
call the function or give it a body.
llvm-svn: 46216
parameters, since otherwise it won't be passed in
the right register. With this change trampolines
work on x86-64 (thanks to Luke Guest for providing
access to an x86-64 box).
llvm-svn: 46192
'FoldingSetNodeImpl' (previously 'FoldingSetNodeID' was a typedef of
'FoldingSetNodeImpl::NodeID').
Why? Clients can now easily forward declare 'FoldingSetNodeID' without having
to include FoldingSet.h.
llvm-svn: 46187
instead of always assuming that the stored objects had a method called
'Profile'. The default behavior is to dispatch to a 'Profile' method (as
before), but via template specialization this behavior can now be overridden by
clients.
Added templated class 'FoldingSetNodeWrapper', a generic wrapper class that
allows one to insert objects into a FoldingSet that do not directly inherit from
FoldingSetNode. This is useful for inserting objects that do not always need to
pay the overhead of inheriting from FoldingSetNode, or were designed with that
behavior in mind.
llvm-svn: 46186
an iterator, since the implementation returned an iterator that pointed to a
different node! Renamed this implementation to SlimFind() so that users do not
expect it to return an iterator (it is a more efficient implementation than
returning an iterator if the user just wants to find the value of a key).
Added a FIXME to implement ImmutableMap::find() that returns an iterator.
llvm-svn: 46150
as weak globals rather than commons. While not wrong,
this change tickled a latent bug in Darwin's strip,
so revert it for now as a workaround.
llvm-svn: 46147
as weak globals rather than commons. While not wrong,
this change tickled a latent bug in Darwin's strip,
so revert it for now as a workaround.
llvm-svn: 46144
Fixed CellSPU's A-form (local store) address mode, so that all globals,
externals, constant pool and jump table symbols are now wrapped within
a SPUISD::AFormAddr pseudo-instruction. This now identifies all local
store memory addresses, although it requires a bit of legerdemain during
instruction selection to properly select loads to and stores from local
store, properly generating "LQA" instructions.
Also added mul_ops.ll test harness for exercising integer multiplication.
llvm-svn: 46142
1. Legalize now always promotes truncstore of i1 to i8.
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
safe.
The later allows us to compile CodeGen/X86/storetrunc-fp.ll to:
_foo:
fldt 20(%esp)
fldt 4(%esp)
faddp %st(1)
movl 36(%esp), %eax
fstps (%eax)
ret
instead of:
_foo:
subl $4, %esp
fldt 24(%esp)
fldt 8(%esp)
faddp %st(1)
fstps (%esp)
movl 40(%esp), %eax
movss (%esp), %xmm0
movss %xmm0, (%eax)
addl $4, %esp
ret
llvm-svn: 46140
and not just the key value when comparing trees. To do this we added data_type
and data_type_ref to the ImutContainerInfo trait classes. For values stored in
the tree that do not have separate key and data components, data_type is simply
a typedef of bool, and isDataEqual() always evaluates to true. This allows us to
support both ImmutableSet and ImmutableMap using the same underlying logic.
llvm-svn: 46130
and switch various codegen pieces and the X86 backend over
to using it.
* Add some comments to SelectionDAGNodes.h
* Introduce a second argument to FP_ROUND, which indicates
whether the FP_ROUND changes the value of its input. If
not it is safe to xform things like fp_extend(fp_round(x)) -> x.
llvm-svn: 46125
and the spill is its kill. However, if the local allocator has determined the
register has not been modified (possible when its value was reloaded), it would
not issue a restore. In that case, mark the last use of the virtual register as
kill.
llvm-svn: 46111
It's not safe to use the two value CombineTo variant to combine away a dead load.
e.g.
v1, chain2 = load chain1, loc
v2, chain3 = load chain2, loc
v3 = add v2, c
Now we replace use of v1 with undef, use of chain2 with chain1.
ReplaceAllUsesWith() will iterate through uses of the first load and update operands:
v1, chain2 = load chain1, loc
v2, chain3 = load chain1, loc
v3 = add v2, c
Now the second load is the same as the first load, SelectionDAG cse will ensure
the use of second load is replaced with the first load.
v1, chain2 = load chain1, loc
v3 = add v1, c
Then v1 is replaced with undef and bad things happen.
llvm-svn: 46099
it should work, but I have no machine to test
it on. Committed because it will at least
cause no harm, and maybe someone can test it
for me!
llvm-svn: 46098
into the ANY_EXTEND/ZERO_EXTEND/SIGN_EXTEND code to simplify it.
Unmerge the code for FP_ROUND and FP_EXTEND from each other to
make each one simpler.
llvm-svn: 46061
make the 'fp return in ST(0)' optimization smart enough to
look through token factor nodes. THis allows us to compile
testcases like CodeGen/X86/fp-stack-retcopy.ll into:
_carg:
subl $12, %esp
call L_foo$stub
fstpl (%esp)
fldl (%esp)
addl $12, %esp
ret
instead of:
_carg:
subl $28, %esp
call L_foo$stub
fstpl 16(%esp)
movsd 16(%esp), %xmm0
movsd %xmm0, 8(%esp)
fldl 8(%esp)
addl $28, %esp
ret
Still not optimal, but much better and this is a trivial patch. Fixing
the rest requires invasive surgery that is is not llvm 2.2 material.
llvm-svn: 46054