one implementation into its one caller. This eliminates a totally
awesome and gratuitous hack where we casted a Function* to
GlobalVariable*.
llvm-svn: 81967
and use PersonalityPrefix/Suffix to achieve the same effect (like
the x86 backend).
This changes the code generated for ppc static mode, but guess what,
we were generating this before:
.byte 0x9B ; Personality (indirect pcrel sdata4)
.long ___gxx_personality_v0-. ; Personality
which is not correct! (it is not an 'indirect' reference).
llvm-svn: 81965
interrupt instruction, which shouldn't arise any other way). 0xcd is
also used by JITMemoryManager to initialize the buffer to garbage,
which means it could appear following a noreturn call even when
that is not a stub, confusing X86CompilationCallback2. PR 4929.
llvm-svn: 81888
VLDM/VSTM instructions, and without this check, the code assumes that an
offset is allowed, as it would be with VLDR/VSTR. The asm printer,
however, silently drops the offset, producing incorrect code. Since the
address register in this case is either the stack or frame pointer, the
spill location ends up conflicting with some other stack slot or with
outgoing arguments on the stack.
llvm-svn: 81879
instead of cloning and RAUWing it.
- Make AbstractTypeUser a friend of Value so that it can offer
its subclasses a way to update a Value's type in place. This
is better than a universally visible setType method on Value,
and it's sufficient for the immediate need.
- Eliminate the constant "convert" functions. This eliminates a
lot of logic duplication, and fixes a complicated bug where a
constant can't actually be cloned during the type refinement
process because some of the types that its folder needs are
half-destroyed, being in the middle of refinement themselves.
- Move the getValType functions from being static overloaded
functions in Constants.cpp to be members of class template
specializations in ConstantsContext.h. This means that the
code ends up getting instantiated twice, however it also
makes it possible to eliminate all "convert" functions, so
it's not a big net code size increase. And if desired, the
duplicate instantiations could be eliminated with some
reorganization.
llvm-svn: 81861
While I'm there, change code that does:
SomeTy == Type::getFooType(Context)
into:
SomeTy->getTypeID() == FooTyID
to decrease the amount of useless type creation which may involve locking, etc.
llvm-svn: 81846
argpromote to avoid invalidating an iterator. This fixes PR4977.
All clang tests now pass with expensive checking (on my system
at least).
llvm-svn: 81843
has multiple uses, as one of the other uses may be on a path
to a different node above the callseq_start, because that
leads to a cyclic graph. This problem is exposed when
-combiner-global-alias-analysis is used. This fixes PR4880.
llvm-svn: 81821
parses the .word directive as 4 bytes and ARMAsmParser::ParseInstruction will
give an error is called. Broke out the test of the .word directive into two
different test cases, one for x86 and one for arm.
llvm-svn: 81817
1. Switch from an std::set to a SmallPtrSet for visited chain nodes.
2. Do not force the recursive flattening of token factor nodes, regardless of
use count.
3. Immediately process newly created TokenFactor nodes.
Also, improve combiner-aa by teaching it that loads to non-overlapping offsets
of relatively aligned objects cannot alias.
These changes result in a >5x speedup for combiner-aa on most testcases.
llvm-svn: 81816
The gist of this is if source of some of the copies that feed into a phi join is defined by the phi join, we'd like to eliminate them. However, if any of the non-identity source overlaps the live interval of the phi join then the coalescer won't be able to coalesce them. The early coalescer's job is to eliminate the identity copies by partially-coalescing the two live intervals.
llvm-svn: 81796
full AsmPrinter, and change TargetRegistry to keep track
of registered MCInstPrinters.
llvm-mc is still linking in the entire
target foo to get the code emitter stuff, but this is an
important step in the right direction.
llvm-svn: 81754
change as types are refined. Remove abstract types from CheckedTypes when they
we're informed that they have been refined. The only way types get refined in
the verifier is when later function passes start optimizing. Fixes PR4970.
llvm-svn: 81716
Change the picbase symbol on non-darwin systems from ".Lllvm$4.$piclabel" to
".L4$pb". The actual name doesn't matter and the darwin name is shorter.
llvm-svn: 81688
Move GetMBBSymbol up to AsmPrinter and make printBasicBlockLabel use it so that
we only have one place that decides what to name bb labels. Hopefully various
clients of printBasicBlockLabel can start using GetMBBSymbol instead.
llvm-svn: 81652
this means that it can only lower one MachineInstr to one MCInst. To
make this fly, we need to pull out handling of MO_GOT_ABSOLUTE_ADDRESS
(which generates an implicit label) out of X86MCInstLower.
llvm-svn: 81629
working. To support this, add an is_displayed() function to raw_ostream,
and generalize Process::StandardOutIsDisplayed and friends in order to
support it.
Also, call RemoveFileOnSignal before creating a file instead of after, so
that the file isn't left behind if the program is interrupted between when
the file is created and RemoveFileOnSignal is called.
While here, add a -S to llvm-extract and port it to IRReader so that it
supports assembly input.
llvm-svn: 81568
object, the timer it creates was not being deleted. Since the
timer belonged to a static timer group, the timer group would
be destroyed on shutdown, and would notice and complain that
not all timers it contained were destroyed.
llvm-svn: 81533
more efficient SmallPtrSet<MCSymbol*>. This eliminates string
craziness and fixes CodeGen/X86/darwin-quote.ll with the new asmprinter.
Codegen is producing stubs in a nondeterminstic order, but it was doing
this before anyway.
llvm-svn: 81511
Mangler::getNameWithPrefix. In addition to avoiding some over
quoting, this also is more efficient because it uses smallvector
instead of std::string thrashing.
llvm-svn: 81508
(uniqued if unnamed) global variable name with the prefix that
it is supposed to get. It doesn't do "mangling" in the sense of
adding quotes and hacking on bad characters.
llvm-svn: 81505
safe. This can happen we a subreg_to_reg 0 has been coalesced. One
exception is when the instruction that folds the load is a move, then we
can simply turn it into a 32-bit load from the stack slot.
rdar://7170444
llvm-svn: 81494
how to fold notionally-out-of-bounds array getelementptr indices instead
of just doing these in lib/Analysis/ConstantFolding.cpp, because it can
be done in a fairly general way without TargetData, and because not all
constants are visited by lib/Analysis/ConstantFolding.cpp. This enables
more constant folding.
Also, set the "inbounds" flag when the getelementptr indices are
one-past-the-end.
llvm-svn: 81483
within the notional bounds of the static type of the getelementptr (which
is not the same as "inbounds") from GlobalOpt into a utility routine,
and use it in ConstantFold.cpp to check whether there are any mis-behaved
indices.
llvm-svn: 81478
that things like .word can be parsed as target specific. Moved parsing .word
out of AsmParser.cpp into X86AsmParser.cpp as it is 2 bytes on X86 and 4 bytes
for other targets that support the .word directive.
llvm-svn: 81461
from the exception tables. However, Duncan explained why it's a can of worms to
do it the GCC way. I went back to doing it the LLVM way and added Duncan's
explanation so that I don't do this again in the future.
llvm-svn: 81434
like what GCC outputs. The mysterious code to insert padding wasn't in GCC at
all. I modified the TType base offset code to calculate the offset like GCC
does, though.
llvm-svn: 81424
code within it was the same inside and out. There's still a problem of the
TypeInfoSize should be the size of the TType format encoding (at least that's
what GCC thinks it should be).
llvm-svn: 81417
Basically, this patch is working towards removing the hard-coded values that are
output for the CIE. In particular, the CIE augmentation and the CIE augmentation
size. Both of these should be calculated. In the process, I was able to make a
bunch of code simpler.
The encodings for the personality, LSDA, and FDE in the CIE are still not
correct. They should be generated either from target-specific callbacks (blech!)
or grokked from first-principles.
llvm-svn: 81404
the MCInst path of the asmprinter. Instead, pull comment printing
out of the autogenerated asmprinter into each target that uses the
autogenerated asmprinter. This causes code duplication into each
target, but in a way that will be easier to clean up later when more
asmprinter stuff is commonized into the base AsmPrinter class.
This also fixes an xcore strangeness where it inserted two tabs
before every instruction.
llvm-svn: 81396
all disassemblers.
Modified the MemoryObject to support 64-bit address
spaces, regardless of the LLVM process's address
width.
Modified the Target class to allow extraction of a
MCDisassembler.
llvm-svn: 81392
asm printer into the "printInstruction" routine. This
fixes a problem where the experimental asmprinter would
drop debug labels in some cases, and fixes issues on ppc/xcore
where pseudo instructions like "mr" didn't get debug locs properly.
It is annoying that this moves the call from one place into each
target, but a future set of more invasive refactorings will fix
that problem.
llvm-svn: 81377
loop exit edge -- new PHIs may be needed not only for the additional
splits that are made to preserve LoopSimplify form, but also for the
original split. Factor out the code that inserts new PHIs so that it
can be used for both. Remove LoopRotation.cpp's code for manually
updating LCSSA form, as it is now redundant. This fixes PR4934.
llvm-svn: 81363
Fixed non working -profile-verifier-noassert option.
Fixed missing newline in debugEntry().
Cleaned up assert messages. (assert(0 && Message) is still shown, but the message is printed before.)
When verifiying loaded profiles the ProfileVerifier got confused when block was a setjmp target, this is checked now.
When verifiying loaded profiles the ProfileVerifier got confused when block eventually reaching an exit(), this is checked now.
llvm-svn: 81338
to instructions instead of zero extended ones. This makes the asmprinter
print signed values more consistently. This apparently only really affects
the X86 backend.
llvm-svn: 81265
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
llvm-svn: 81221
instruction to insert before can be end(). getDebugLoc on
end() returns an invalid value, therefore use the debug
loc of the call instruction, and give it to InsertLabel.
llvm-svn: 81207
depth first order, so it wouldn't process unreachable blocks.
When compiling at -O0, late dead block elimination isn't done
and the bad instructions got to isel.
llvm-svn: 81187
extractelement operations into a bitcast of the pointer,
then a gep, then a scalar load. Disable this when the vector
only has one element, because it leads to infinite loops in
instcombine (PR4908).
This transformation seems like a really bad idea to me, as it
will likely disable CSE of vector load/stores etc and can be
better done in the code generator when profitable. This
goes all the way back to the first days of packed types,
r25299 specifically.
I'll let those people who care about the performance of vector
code decide what to do with this.
llvm-svn: 81185
so "Assert1(isa<>); cast<>" is a valid idiom.
Actually check the PHI node's odd-numbered operands for BasicBlock-ness, like
the comment said.
llvm-svn: 81182
Make the verifier more robust by avoiding unprotected cast<> calls. Notably,
Assert1(isa<>); cast<> is not safe as Assert1 does not terminate the program.
llvm-svn: 81179
from floating-point to integer first, and bitcast the result
back to floating-point. Previously, this test was passing by
falling back to SelectionDAG lowering. The resulting code isn't
as nice, but it's correct and CodeGen now stays on the fast path.
llvm-svn: 81171
compile-time constant integers or that are out of bounds for their
corresponding static array types. These can cause aliasing that
GlobalOpt assumes won't happen.
llvm-svn: 81165
is missing the inbounds flag. This is slightly conservative, but it
avoids problems with two constants pointing to the same address but
getting distinct entries in the Memory DenseMap.
llvm-svn: 81163
- I think there are more instances of this, but I think they are fixed in Dan's
incoming patch. This one was preventing me from doing a bugpoint reduction
though.
llvm-svn: 81103
a new class, MachineInstrIndex, which hides arithmetic details from
most clients. This is a step towards allowing the register allocator
to update/insert code during allocation.
llvm-svn: 81040
Constant uniquing tables. This allows distinct ConstantExpr objects
with the same operation and different flags.
Even though a ConstantExpr "a + b" is either always overflowing or
never overflowing (due to being a ConstantExpr), it's still necessary
to be able to represent it both with and without overflow flags at
the same time within the IR, because the safety of the flag may
depend on the context of the use. If the constant really does overflow,
it wouldn't ever be safe to use with the flag set, however the use
may be in code that is never actually executed.
This also makes it possible to merge all the flags tests into a single test.
llvm-svn: 80998
D test/Analysis/Profiling
--- Reverse-merging r80907 into '.':
U lib/Analysis/ProfileInfoLoaderPass.cpp
Attempt to remove failure in the self-hosting build bot.
llvm-svn: 80966
and exact flags. Because ConstantExprs are uniqued, creating an
expression with this flag causes all expressions with the same operands
to have the same flag, which may not be safe. Add, sub, mul, and sdiv
ConstantExprs are usually folded anyway, so the main interesting flag
here is inbounds, and the constant folder already knows how to set the
inbounds flag automatically in most cases, so there isn't an urgent need
for the API support.
This can be reconsidered in the future, but for now just removing these
API bits eliminates a source of potential trouble with little downside.
llvm-svn: 80959
for the complicated case where one register is tied to multiple destinations.
This avoids the extra scan of instruction operands that was introduced by
my recent change. I also pulled some code out into a separate
TryInstructionTransform method, added more comments, and renamed some
variables.
Besides all those changes, this takes care of a FIXME in the code regarding
an assumption about there being a single tied use of a register when
converting to a 3-address form. I'm not aware of cases where that assumption
is violated, but the code now only attempts to transform an instruction,
either by commuting its operands or by converting to a 3-address form,
for the simple case where there is a single pair of tied operands.
llvm-svn: 80945
- when transforming a vector shift of a non-immediate scalar shift amount, zero
extend the i32 shift amount to i64 since the vector shift reads 64 bits
- when transforming i16 vectors to use a vector shift, zero extend i16 shift amount
- improve the code quality in some cases when transforming vectors to use a vector shift
llvm-svn: 80935
disabling the use of 16-bit operations on x86. This doesn't yet work for
inline asms with 16-bit constraints, vectors with 16-bit elements,
trampoline code, and perhaps other obscurities, but it's enough to try
some experiments.
llvm-svn: 80930
from MCAsmLexer.h in preparation of supporting other targets. Changed the
X86AsmParser code to reflect this by removing AsmLexer::LexPercent and looking
for AsmToken::Percent when parsing in places that used AsmToken::Register.
Then changed X86ATTAsmParser::ParseRegister to parse out registers as an
AsmToken::Percent followed by an AsmToken::Identifier.
llvm-svn: 80929
instead of a bool argument, and to do the dominator check itself.
This makes it eaiser to use when DominatorTree information is
available.
llvm-svn: 80920
different formatting from the old asmprinter, but it should be
semantically the same. We used to get:
popl %eax
addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$6.$piclabel], %eax
...
Now we get:
popl %eax
.Lpicbaseref6:
addl $(_GLOBAL_OFFSET_TABLE_ + (.Lpicbaseref6 - .Lllvm$6.$piclabel)), %eax
...
llvm-svn: 80905
simplifylibcalls optimization is thus valid for C++ but not C.
It's not important enough to worry about for C++ apps, so just
remove it.
rdar://7191924
llvm-svn: 80887
avoid reloads by reusing clobbered registers.
This was causing issues in 256.bzip2 when compiled with PIC for
a while (starting at r78217), though the problem has since been masked.
llvm-svn: 80872
Use CallbackVH, instead of WeakVH, to hold MDNode elements.
Use FoldingSetNode to unique MDNodes in a context.
Use CallbackVH hooks to update context's MDNodeSet appropriately.
llvm-svn: 80868
instruction tables to support segmented addressing (and other objects
of obscure type).
Modified the X86 assembly printers to handle these new operand types.
Added JMP and CALL instructions that use segmented addresses.
llvm-svn: 80857
desired triplet is a sub-target, e.g. thumbv7 vs. arm host). Reverting the
patch isn't quite right either since the previous behavior does not allow the
triplet to be overridden with -march.
llvm-svn: 80742
Add MO flags to simplify the printing of relocations.
Remove the support for printing large code model relocs (which
aren't supported anyway).
llvm-svn: 80691
Add statistics for regular edge profiling, this enables the comparation of the
number of edges inserted by regular and optimal edge profiling.
llvm-svn: 80668
Optimal edge profiling is only possible when blocks with no predecessors get an
virtual edge (BB,0) that counts the execution frequencies of this
function-exiting blocks.
This patch makes the necessary changes before actually enabling optimal edge profiling.
llvm-svn: 80667
This adds a pass to verify the current profile against the flow conditions.
This is very helpful when later on trying to perserve the profiling information
during all passes.
llvm-svn: 80666
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
tied to different source registers, the TwoAddressInstructionPass needs to
be smarter. Change it to check before replacing a source register whether
that source register is tied to a different destination register, and if so,
defer handling it until a subsequent iteration.
llvm-svn: 80654
makes an eggregious hack somewhat more palatable. Bringing the LSDA forward
and making it a GV available for reference would be even better, but is
beyond the scope of what I'm looking to solve at this point.
Objective C++ code could generate function names that broke the previous
scheme. This fixes that.
llvm-svn: 80649
SCEVUnknowns, as the non-SCEVUnknown cases in the getSCEVAtScope code
can also end up repeatedly climing through the same expression trees,
which can be unusably slow when the trees are very tall.
Also, add a quick check for SCEV pointer equality to the main
SCEV comparison routine, as the full comparison code can be expensive
in the case of large expression trees.
These fix compile-time problems in some pathlogical cases.
llvm-svn: 80623
This fixes leaks from LLVMContext in multithreaded apps.
Since constants are only deleted if they have no uses, it is safe to not delete
a Module on shutdown, as many single-threaded tools do.
Multithreaded apps should however delete the Module before destroying the
Context to ensure that there are no leaks (assuming they use a different context
for each thread).
llvm-svn: 80590
stem from the fact that we have two types of passes that need to update it:
1. callgraphscc and module passes that are explicitly aware of it
2. Functionpasses (and loop passes etc) that are interlaced with CGSCC passes
by the CGSCC Passmgr.
In the case of #1, we can reasonably expect the passes to update the call
graph just like any analysis. However, functionpasses are not and generally
should not be CG aware. This has caused us no end of problems, so this takes
a new approach. Logically, the CGSCC Pass manager can rescan every function
after it runs a function pass over it to see if the functionpass made any
updates to the IR that affect the callgraph. This allows it to catch new calls
introduced by the functionpass.
In practice, doing this would be slow. This implementation keeps track of
whether or not the current scc is dirtied by a function pass, and, if so,
delays updating the callgraph until it is actually needed again. This was
we avoid extraneous rescans, but we still have good invariants when the
callgraph is needed.
Step #2 of the "give Callgraph some sane invariants" is to change CallGraphNode
to use a CallBackVH for the callsite entry of the CallGraphNode. This way
we can immediately remove entries from the callgraph when a FunctionPass is
active instead of having dangling pointers. The current pass tries to tolerate
these dangling pointers, but it is just an evil hack.
This is related to PR3601/4835/4029. This also reverts r80541, a hack working
around the sad lack of invariants.
llvm-svn: 80566
changes: SimplifyDemandedBits can't use the builder yet because it
has the wrong insertion point. This fixes a crash building
MultiSource/Benchmarks/PAQ8p
llvm-svn: 80537
instead of CallGraphNode*'s. This also papers over a callgraph
problem where a pass (in this case, MemCpyOpt) introduces a new
function into the module (llvm.memset.i64) but doesn't add it to
the call graph (nor should it, since it is a function pass).
While it might be a good idea for MemCpyOpt to not synthesize
functions in a runOnFunction(), there is no need for FunctionAttrs
to be boneheaded, so fix it there. This fixes an assertion building
176.gcc.
llvm-svn: 80535
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
llvm-svn: 80533
Shared landing pads run into trouble with SJLJ, as the dispatch table is
mapped to call sites, and merging the pads will throw that off. There needs
to be a one-to-one mapping of landing pad exception table entries to invoke
call points.
Detecting the shared pad during lowering of SJLJ info insn't sufficient, as
the dispatch function may still need separate destinations to properly
handle phi-nodes.
llvm-svn: 80530
argpromotion and structretpromote. Basically, when replacing
a function, they used the 'changeFunction' api which changes
the entry in the function map (and steals/reuses the callgraph
node).
This has some interesting effects: first, the problem is that it doesn't
update the "callee" edges in any callees of the function in the call graph.
Second, this covers for a major problem in all the CGSCC pass stuff, which
is that it is completely broken when functions are deleted if they *don't*
reuse a CGN. (there is a cute little fixme about this though :).
This patch changes the protocol that CGSCC passes must obey: now the CGSCC
pass manager copies the SCC and preincrements its iterator to avoid passes
invalidating it. This allows CGSCC passes to mutate the current SCC. However
multiple passes may be run on that SCC, so if passes do this, they are now
required to *update* the SCC to be current when they return.
Other less interesting parts of this patch are that it makes passes update
the CG more directly, eliminates changeFunction, and requires clients of
replaceCallSite to specify the new callee CGN if they are changing it.
llvm-svn: 80527
is itself a bitcast. Since we have gep(bitcast(bitcast(y))) in this
case, just wait for the two bitcasts to get zapped. This prevents
instcombine from confusing some aliasing stuff, and allows it to
directly eliminate the load in the testcase.
llvm-svn: 80508
workslist and is set to insert new instructions before the current one.
Convert a bunch of stuff that used to call InsertNewInstBefore over to
use it, greatly simplifying code and making it more natural.
There is still a lot more to go, but this is a good start.
llvm-svn: 80492
if the operand is not an instruction.
Simplify most uses of AddOperandsToWorkList to use AddValue and
inline it into the one remaining callsite.
llvm-svn: 80488
former looks too much like AddUsersToWorkList and keeps
confusing me.
Remove AddSoonDeadInstToWorklist and change its two callers
to do the same thing in a simpler way.
llvm-svn: 80486
into their callers. simplify ReplaceInstUsesWith. Make
EraseInstFromFunction only add operands to the worklist if
there aren't too many of them (this was a scalability win
for crazy programs that was only infrequently enforced).
Switch more code to using EraseInstFromFunction instead of
duplicating it inline. Change some fcmp/icmp optimizations
to modify fcmp/icmp in place instead of creating a new one
and deleting the old one just to change the predicate.
llvm-svn: 80483
encodings.
- Make some of the values emitted by the FDEs dependent upon the pointer
size. This is in line with how GCC does things. And it has the benefit of
working for Darwin in 64-bit mode now.
llvm-svn: 80428
This implements the maximum spanning tree algorithm on CFGs according to
weights given by the ProfileEstimator. This is then used to implement Optimal
Edge Profiling.
llvm-svn: 80358