Asserting with cast<T> did not actually make much sense because there was no
need to use dynamic casting in the first place. We could make the compiler to
statically type check these objects.
llvm-svn: 205350
This provides an initial implementation of getUnrollingPreferences for x86.
getUnrollingPreferences is used by the generic (concatenation) unroller, which
is distinct from the unrolling done by the loop vectorizer. Many modern x86
cores have some kind of uop cache and loop-stream detector (LSD) used to
efficiently dispatch small loops, and taking full advantage of this requires
unrolling small loops (small here means 10s of uops).
These caches also have limits on the number of taken branches in the loop, and
so we also cap the loop unrolling factor based on the maximum "depth" of the
loop. This is currently calculated with a partial DFS traversal (partial
because it will stop early if the path length grows too much). This is still an
approximation, and one that is both conservative (because it does not account
for branches eliminated via block placement) and optimistic (because it is only
recording the maximum depth over minimum paths). Nevertheless, because the
loops that fit in these uop caches are so small, it is not clear how much the
details matter.
The original set of patches posted for review produced the following test-suite
performance results (from the TSVC benchmark) at that time:
ControlLoops-dbl - 13% speedup
ControlLoops-flt - 15% speedup
Reductions-dbl - 7.5% speedup
llvm-svn: 205348
In preparation for an upcoming commit implementing unrolling preferences for
x86, this adds additional fields to the UnrollingPreferences structure:
- PartialThreshold and PartialOptSizeThreshold - Like Threshold and
OptSizeThreshold, but used when not fully unrolling. These are necessary
because we need different thresholds for full unrolling from those used when
partially unrolling (the full unrolling thresholds are generally going to be
larger).
- MaxCount - A cap on the unrolling factor when partially unrolling. This can
be used by a target to prevent the unrolled loop from exceeding some
resource limit independent of the loop size (such as number of branches).
There should be no functionality change for any in-tree targets.
llvm-svn: 205347
The implementation of getUserCost had duplicated (and hard-coded) the default
logic in getGEPCost. Instead, it is better to use getGEPCost directly, which
limits the default logic to the implementation of one function, and allows
targets to override the behavior.
No functionality change intended.
llvm-svn: 205346
Identical to Win32 method except the GS segment register is used for TLS
instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14.
llvm-svn: 205342
On FreeBSD ptrace(PT_KILL) is used to terminate the traced process
(as if PT_CONTINUE had been used with SIGKILL as the signal to be
delivered), and is the desired behaviour for ProcessPOSIX::DoDestroy.
On Linux, after ptrace(PTRACE_KILL) the traced process still exists
and can be interrogated. It is only upon resume that it exits as though
it received SIGKILL.
As the Linux PTRACE_KILL behaviour is not used by LLDB, rename
BringProcessIntoLimbo to Kill, and change the implementation to simply
call kill() instead of using ptrace.
Thanks to Todd F for testing (Ubuntu 12.04, gcc 4.8.2).
Sponsored by: DARPA, AFRL
Differential Revision: http://llvm-reviews.chandlerc.com/D3159
llvm-svn: 205337
cast<X> asserts the type is correct and does not return null on failure.
So we should use cast<X> rather than dyn_cast<X> at such places where we
don't expect type conversion could fail.
llvm-svn: 205332
Store the gpr data in a DataBufferHeap and use a DataExtractor to
extract register values with appropriate endianness. This avoids hard-
coding the register count, and with some further work would allow this
class to provide generic register context storage for any CPU.
llvm-svn: 205329
If we're trying to get the zero element region of something that's not a region,
we should be returning UnknownVal, which is what ProgramState::getLValue will
do for us.
Patch by Alex McCarthy!
llvm-svn: 205327
During code preperation trivial PHI nodes (mainly introduced by lcssa) are
deleted to decrease the number of introduced allocas (==> dependences). However
simply replacing them by their only incoming value would cause the independent
block pass to introduce new allocas. To prevent this we try to share stack slots
during code preperarion, hence to reuse a already created alloca 'to demote' the
trivial PHI node. This works if we know that the value stored in this alloca
will be the incoming value of the trivial PHI at the end of the predecessor
block of this trivial PHI.
Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 205320
On FreeBSD ptrace(PT_KILL) is used to terminate the traced process
(as if PT_CONTINUE had been used with SIGKILL as the signal to be
delivered), and is the desired behaviour for ProcessPOSIX::DoDestroy.
On Linux, after ptrace(PTRACE_KILL) the traced process still exists
and can be interrogated. It is only upon resume that it exits as though
it received SIGKILL.
For now I'm committing only the FreeBSD change, until the Linux change
(review D3159) is successfully tested.
http://llvm.org/pr18894
llvm-svn: 205315
The Cyclone CPU is similar to swift for most LLVM purposes, but does have two
preferred instructions for zeroing a VFP register. This teaches LLVM about
them.
llvm-svn: 205309