named the same, so it had to qualify type names according to the enclosing
scope to ensure uniqueness. This is no longer needed for correctness (though
it may be helpful when reading the IR), so this test has lost its importance.
Zap it because dragonegg will never be able to produce the qualified type name
since modern gcc zaps language specific info (such as whether a type is nested
inside another - needed to get X::Y here) before dragonegg is reached.
llvm-svn: 120721
Lifted adjustFixupValue() from Darwin for sharing w ELF.
Test added
TODO:
refactor ELFObjectWriter::RecordRelocation more.
Possibly share more code with Darwin?
Lots more relocations...
llvm-svn: 120534
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777
llvm-svn: 120501
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate. This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.
This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet. Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.
llvm-svn: 120406
contains "ref".
Enhance DSE to use a modref query instead of a store-specific hack
to generalize the "ignore may-alias stores" optimization to handle
memset and memcpy.
llvm-svn: 120368
instructions. I choose to handle this with an asmparser hack,
though it could be handled by changing all the instruction definitions
to allow be "setneb" instead of "setne". The asm parser hack is
better in this case, because we want the disassembler to produce
setne, not setneb.
llvm-svn: 120260
Unittests need LLVM_BUILD_MODE to pick up each test.
Confirmed on CentOS5, Mingw, MSYS, and with possible configurations on VS8 and VS10.
llvm-svn: 120212
automatically. Use -S with llvm-gcc rather than -c, so tests can
work when llvm-gcc is really dragonegg (which can output IR with -S
but not -c). Yes, dragonegg supports objective-c++ (poorly though).
llvm-svn: 120164
automatically. Use -S with llvm-gcc rather than -c, so tests can
work when llvm-gcc is really dragonegg (which can output IR with -S
but not -c). Yes, dragonegg supports objective-c (poorly though).
llvm-svn: 120163
automatically. Use -S with llvm-gcc rather than -c, so tests can
work when llvm-gcc is really dragonegg (which can output IR with -S
but not -c).
llvm-svn: 120160
the output of this test. Since it was producing bitcode, that clearly
wasn't happening! Have it produce target assembler and assemble that
instead.
llvm-svn: 120159
automatically. Use -S with llvm-gcc rather than -c, so tests can
work when llvm-gcc is really dragonegg (which can output IR with -S
but not -c).
llvm-svn: 120158
We need to check if the individual vector elements are sign/zero-extended
values. For now this only handles constants values. Radar 8687140.
llvm-svn: 120034
fairly systematic way in instcombine. Some of these cases were already dealt
with, in which case I removed the existing code. The case of Add has a bunch of
funky logic which covers some of this plus a few variants (considers shifts to be
a form of multiplication), which I didn't touch. The simplification performed is:
A*B+A*C -> A*(B+C). The improvement is to do this in cases that were not already
handled [such as A*B-A*C -> A*(B-C), which was reported on the mailing list], and
also to do it more often by not checking for "only one use" if "B+C" simplifies.
llvm-svn: 120024
state. Previously Thumb2 would restore sp from fp like this:
mov sp, r7
sub, sp, #4
If an interrupt is taken after the 'mov' but before the 'sub', callee-saved
registers might be clobbered by the interrupt handler. Instead, try
restoring directly from sp:
add sp, #4
Or, if necessary (with VLA, etc.) use a scratch register to compute sp and
then restore it:
sub.w r4, r7, #8
mov sp, r7
rdar://8465407
llvm-svn: 119977
folding improvements: if P points to a type of size zero, turn "gep P, N" into "P".
More generally, if a gep index type has size zero, instcombine could replace the
index with zero, but that is not done here.
llvm-svn: 119942
allowing the memcpy to be eliminated.
Unfortunately, the requirements on byval's without explicit
alignment are really weak and impossible to predict in the
mid-level optimizer, so this doesn't kick in much with current
frontends. The fix is to change clang to set alignment on all
byval arguments.
llvm-svn: 119916
lr" instruction cannot be tested just yet. It requires matching a "condition
code", but adding one of those makes things go south quickly...
llvm-svn: 119774
if the extension types were not the same. The result was that if you
fed a select with sext and zext loads, as in the testcase, then it
would get turned into a zext (or sext) of the select, which is wrong
in the cases when it should have been an sext (resp. zext). Reported
and diagnosed by Sebastien Deldon.
llvm-svn: 119728
preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class. Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form. Fixes PR8622.
llvm-svn: 119727
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.
Adjust all testcases accordingly.
llvm-svn: 119725
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed. This led to really bad performance on tall, narrow CFGs.
We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be
quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.
This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.
The one downside to this approach is that we can no longer use GVN to perform simple conditional progation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP.
llvm-svn: 119714
refusing to optimize two memcpy's like this:
copy A <- B
copy C <- A
if it couldn't prove that noalias(B,C). We can eliminate
the copy by producing a memmove instead of memcpy.
llvm-svn: 119694
if it is passed as a byval argument. The byval argument will just be a
read, so it is safe to read from the original global instead. This allows
us to promote away the %agg.tmp alloca in PR8582
llvm-svn: 119686
and testing is easier. A good example is the unknown-location.ll test that
now can just look for ".loc 1 0 0". We also don't use a DW_LNE_set_address for
every address change anymore.
llvm-svn: 119613
It is generally not sufficient to check if the starting offset is in range
of the maximum offset that can be efficiently used for the target.
llvm-svn: 119565
This makes it more clear that the symbol is an internal, compiler-generated
name and gives a little more description about its contents.
llvm-svn: 119564
It was mistakenly looking at the pointer type when checking for the size of
global variables. This is a partial fix for Radar 8673120.
llvm-svn: 119563
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.
Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.
Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.
2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.
rdar://8663787, rdar://8241368
llvm-svn: 119548
instructions have to distinguish between lists of single- and double-precision
registers in order for the ASM matcher to do a proper job. In all other
respects, a list of single- or double-precision registers are the same as a list
of GPR registers.
llvm-svn: 119460
over a phi node by applying it to each operand may be wrong if the
operation and the phi node are mutually interdependent (the testcase
has a simple example of this). So only do this transform if it would
be correct to perform the operation in each predecessor of the block
containing the phi, i.e. if the other operands all dominate the phi.
This should fix the FFMPEG snow.c regression reported by İsmail Dönmez.
llvm-svn: 119347
The live range of a register defined by an early clobber starts at the use slot,
not the def slot.
Except when it is an early clobber tied to a use operand. Then it starts at the
def slot like a standard def.
llvm-svn: 119305