correlated pairs of pointer arguments at the callsite. This is designed
to recognize the common C++ idiom of begin/end pointer pairs when the
end pointer is a constant offset from the begin pointer. With the
C-based idiom of a pointer and size, the inline cost saw the constant
size calculation, and this provides the same level of information for
begin/end pairs.
In order to propagate this information we have to search for candidate
operations on a pair of pointer function arguments (or derived from
them) which would be simplified if the pointers had a known constant
offset. Then the callsite analysis looks for such pointer pairs in the
argument list, and applies the appropriate bonus.
This helps LLVM detect that half of bounds-checked STL algorithms
(such as hash_combine_range, and some hybrid sort implementations)
disappear when inlined with a constant size input. However, it's not
a complete fix due the inaccuracy of our cost metric for constants in
general. I'm looking into that next.
Benchmarks showed no significant code size change, and very minor
performance changes. However, specific code such as hashing is showing
significantly cleaner inlining decisions.
llvm-svn: 152752
Commit r152704 exposed a latent MSVC limitation (aka bug).
Both ilist and and iplist contains the same function:
template<class InIt> void insert(iterator where, InIt first, InIt last) {
for (; first != last; ++first) insert(where, *first);
}
Also ilist inherits from iplist and ilist contains a "using iplist<NodeTy>::insert".
MSVC doesn't know which one to pick and complain with an error.
I think it is safe to delete ilist::insert since it is redundant anyway.
llvm-svn: 152746
which are small enough to themselves be inlined. Delaying in this manner
can be harmful if the function is inelligible for inlining in some (or
many) contexts as it pessimizes the code of the function itself in the
event that inlining does not eventually happen.
Previously the check was written to only do this delaying of inlining
for static functions in the hope that they could be entirely deleted and
in the knowledge that all callers of static functions will have the
opportunity to inline if it is in fact profitable. However, with C++ we
get two other important sources of functions where the definition is
always available for inlining: inline functions and templated functions.
This patch generalizes the inliner to allow linkonce-ODR (the linkage
such C++ routines receive) to also qualify for this delay-based
inlining.
Benchmarking across a range of large real-world applications shows
roughly 2% size increase across the board, but an average speedup of
about 0.5%. Some benhcmarks improved over 2%, and the 'clang' binary
itself (when bootstrapped with this feature) shows a 1% -O0 performance
improvement when run over all Sema, Lex, and Parse source code smashed
into a single file. A clean re-build of Clang+LLVM with a bootstrapped
Clang shows approximately 2% improvement, but that measurement is often
noisy.
llvm-svn: 152737
There were cases where a value could be used and it's both crossing an invoke
and NOT crossing an invoke. This could happen in the landing pads. In that case,
we will demote the value to the stack like we did before.
<rdar://problem/10609139>
llvm-svn: 152705
New flags: -misched-topdown, -misched-bottomup. They can be used with
the default scheduler or with -misched=shuffle. Without either
topdown/bottomup flag -misched=shuffle now alternates scheduling
direction.
LiveIntervals update is unimplemented with bottom-up scheduling, so
only -misched-topdown currently works.
Capped the ScheduleDAG hierarchy with a concrete ScheduleDAGMI class.
ScheduleDAGMI is aware of the top and bottom of the unscheduled zone
within the current region. Scheduling policy can be plugged into
the ScheduleDAGMI driver by implementing MachineSchedStrategy.
ConvergingScheduler is now the default scheduling algorithm.
It exercises the new driver but still does no reordering.
llvm-svn: 152700
output (we're emitting a specification already and the information
isn't changing).
Saves 1% on the debug information for a build of llvm.
Fixes rdar://11043421
llvm-svn: 152697
(i16 load $addr+c*sizeof(i16)) and replace uses of (i32 vextract) with the
i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)).
rdar://11035895
llvm-svn: 152675
take a TargetLibraryInfo parameter. Internally, rather than passing TD, TLI
and DT parameters around all over the place, introduce a struct for holding
them.
llvm-svn: 152623
Also refactor the existing OProfile profiling code to reuse the same interfaces with the VTune profiling code.
In addition, unit tests for the profiling interfaces were added.
This patch was prepared by Andrew Kaylor and Daniel Malea, and reviewed in the llvm-commits list by Jim Grosbach
llvm-svn: 152620
offset accumulation to use a boring APInt instead of ConstantExprs.
I didn't go all the way to an 'int64_t' because I wanted APInt to handle
any magic required to properly wrap the arithmetic when the pointer
width is <64 bits. If there is a significant penalty from using APInt
here, first off WTF, and secondly let me know and I'll do the math by
hand.
I've left one layer still operating w/ ConstantExpr because it makes the
interface quite a bit simpler, and that one isn't iterative so has much
lower cost.
I suppose this may potentially speed up some strang compilation
situations, but I don't really expect much. It should have no functional
impact either way.
llvm-svn: 152590
candidate set for subsequent inlining, try to simplify the arguments to
the inner call site now that inlining has been performed.
The goal here is to propagate and fold constants through deeply nested
call chains. Without doing this, we loose the inliner bonus that should
be applied because the arguments don't match the exact pattern the cost
estimator uses.
Reviewed on IRC by Benjamin Kramer.
llvm-svn: 152556
Typically instcombine has handled this, but pointer differences show up
in several contexts where we would like to get constant folding, and
cannot afford to run instcombine. Specifically, I'm working on improving
the constant folding of arguments used in inline cost analysis with
instsimplify.
Doing this in instsimplify implies some algorithm changes. We have to
handle multiple layers of all-constant GEPs because instsimplify cannot
fold them into a single GEP the way instcombine can. Also, we're only
interested in all-constant GEPs. The result is that this doesn't really
replace the instcombine logic, it's just complimentary and focused on
constant folding.
Reviewed on IRC by Benjamin Kramer.
llvm-svn: 152555
Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default.
Added some notes relative to case iterators.
llvm-svn: 152532
This requires a C++ change to EDDisassembler's ctor to function properly
(the llvm::InitializeAll* functions aren't being called currently and
there is no way to call them from Python).
Code is partially tested and works well enough for initial commit. There
are probably many small bugs.
llvm-svn: 152506
The 'CmpInst::isFalseWhenEqual' function returns 'false' for values other than
simply equality. For instance, it returns 'false' for <= or >=. This isn't the
correct behavior for this transformation, which is checking for strict equality
and non-equality. It was causing the gcc.c-torture/execute/frame-address.c test
to fail because it would completely (and incorrectly) optimize a whole function
into a 'ret i32 0'.
llvm-svn: 152497
a common collection of methods on Value, and share their implementation.
We had two variations in two different places already, and I need the
third variation for inline cost estimation.
Reviewed by Duncan Sands on IRC, but further comments here welcome.
llvm-svn: 152490
The old way of determine when and where to spill a value that was used inside of
a landing pad resulted in spilling that value everywhere and not just at the
invoke edge.
This algorithm determines which values are used within a landing pad. It then
spills those values before the invoke and reloads them before the uses. This
should prevent excessive spilling in many cases, e.g. inside of loops.
<rdar://problem/10609139>
llvm-svn: 152486
* Add enums and structures for GNU version information.
* Implement extraction of that information on a per-symbol basis (ELFObjectFile::getSymbolVersion).
* Implement a generic interface, GetELFSymbolVersion(), for getting the symbol version from the ObjectFile (hides the templating).
* Have llvm-readobj print out the version, when available.
* Add a test for the new feature: readobj-elf-versioning.test
llvm-svn: 152436
traversal, consider nodes for which the only successors are backedges
which the traversal is ignoring to be exit nodes. This fixes a problem
where the bottom-up traversal was failing to visit split blocks along
split loop backedges. This fixes rdar://10989035.
llvm-svn: 152421
~0U might be i32 on 32-bit hosts, then (uint64_t)~0U might not be expected as (i64)0xFFFFFFFF_FFFFFFFF, but as (i64)0x00000000_FFFFFFFF.
llvm-svn: 152407
This contains a semi-functional skeleton for the implementation of the
LLVM bindings for Python.
The API for the Object.h interface is roughly designed but not
implemented. MemoryBufferRef is implemented and actually appears to
work!
The ObjectFile unit test fails with a segmentation fault because the
LLVM library isn't being properly initialized. The build system doesn't
know about this code yet, so no alerts should fire.
llvm-svn: 152397
caused several clients to select the slow variation. =[ This is extra
annoying because we don't have any realistic way of testing this -- by
design, these two functions *must* compute the same value.
Found while inspecting the output of some benchmarks I'm working on.
llvm-svn: 152369
introduced. Specifically, there are cost reductions for all
constant-operand icmp instructions against an alloca, regardless of
whether the alloca will in fact be elligible for SROA. That means we
don't want to abort the icmp reduction computation when we abort the
SROA reduction computation. That in turn frees us from the need to keep
a separate worklist and defer the ICmp calculations.
Use this new-found freedom and some judicious function boundaries to
factor the innards of computing the cost factor of any given instruction
out of the loop over the instructions and into static helper functions.
This greatly simplifies the code, and hopefully makes it more clear what
is happening here.
Reviewed by Eric Christopher. There is some concern that we'd like to
ensure this doesn't get out of hand, and I plan to benchmark the effects
of this change over the next few days along with some further fixes to
the inline cost.
llvm-svn: 152368
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
llvm-svn: 152366
buildbots. Original commit message:
[ADT] Change the trivial FoldingSetNodeID::Add* methods to be inline, reapplied
with a fix for the longstanding over-read of 32-bit pointer values.
llvm-svn: 152304
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html
Implemented CaseIterator and it solves almost all described issues: we don't need to mix operand/case/successor indexing anymore. Base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".
ConstCaseIt is just a read-only iterator.
CaseIt is read-write iterator; it allows to change case successor and case value.
Usage of iterator allows totally remove resolveXXXX methods. All indexing convertions done automatically inside the iterator's getters.
Main way of iterator usage looks like this:
SwitchInst *SI = ... // intialize it somehow
for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
BasicBlock *BB = i.getCaseSuccessor();
ConstantInt *V = i.getCaseValue();
// Do something.
}
If you want to convert case number to TerminatorInst successor index, just use getSuccessorIndex iterator's method.
If you want initialize iterator from TerminatorInst successor index, use CaseIt::fromSuccessorIndex(...) method.
There are also related changes in llvm-clients: klee and clang.
llvm-svn: 152297
Original commit message:
Use uint16_t to store InstrNameIndices in MCInstrInfo. Add asserts to protect all 16-bit string table offsets. Also make sure the string to offset table string is not larger than 65536 characters since larger string literals aren't portable.
llvm-svn: 152296
For example, this pattern
(select (setcc lhs, rhs, cc), true, 0)
is transformed to this one:
(select (setcc lhs, rhs, inverse(cc)), 0, true)
This enables MipsDAGToDAGISel::ReplaceUsesWithZeroReg (added in r152280) to
replace 0 with $zero.
llvm-svn: 152285
analysis to be methods on the cost analysis's function info object
instead of the code metrics object. These really are just users of the
code metrics, they're building the information for the function's
analysis.
This is the first step of growing the amount of information we collect
about a function in order to cope with pair-wise simplifications due to
allocas.
llvm-svn: 152283
For example, the first instruction in the code below can be eliminated if the
use of $vr0 is replaced with $zero:
addiu $vr0, $zero, 0
add $vr2, $vr1, $vr0
add $vr2, $vr1, $zero
llvm-svn: 152280
Allow targets to provide their own schedulers (subclass of
ScheduleDAGInstrs) to the misched pass. Select schedulers using
-misched=...
llvm-svn: 152278
The ARM code generator makes aggressive assumptions about the encodings
being selected for branches which MCRelaxAll invalidates.
rdar://11006355
llvm-svn: 152268
code that will be relocated into another memory space.
Now when relocations are resolved, the address of
the relocation in the host memory (where the JIT is)
is passed separately from the address that the
relocation will be at in the target memory (where
the code will run).
llvm-svn: 152264
ScheduleDAGInstrs will be the main interface for MI-level
schedulers. Make sure it's readable: one page of protected fields, one
page of public methids.
llvm-svn: 152258