The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!
llvm-svn: 156258
It caused test/Index/index-many-call-ops.cpp to fail in stage2 c-index-test on selfhosting i686-cygwin and x86_64-linux since r156229 (Reverting making RecursiveASTVisitor data recursive).
llvm-svn: 156253
- Just use sys::Process::GetRandomNumber instead of having two poor
implementations.
- This is ~70 times (!) faster on my OS X machine.
llvm-svn: 156238
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:
ucomisd (%rdi), %xmm0
cmovbel %edx, %esi
cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.
This patch uses a dumb conservative heuristic, it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-Order architectures are
unlikely to benefit as well, those are included in the
"predictableSelectIsExpensive" flag.
It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.
Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.
Thanks to Chandler for the initial test case to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)
llvm-svn: 156234
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
llvm-svn: 156233
which is leading to swapping in some cases when it didn't before. We need to see if we can make this change
without leading to a massive compile-time bloat.
llvm-svn: 156229
to get a const char* if necessary.
This avoids unnecessary conversions when we want to use the result of getName as
a StringRef.
Part of rdar://10796159
llvm-svn: 156227
This is still a topological ordering such that every register class gets
a smaller enum value than its sub-classes.
Placing the smaller spill sizes first makes a difference for the
super-register class bit masks. When looking for a super-register class,
we usually want the smallest possible kind of super-register. That is
now available as the first bit set in the bit mask.
llvm-svn: 156222
No one was using it and Locker(pthread_mutex_t *) immediately asserts for
pthread_mutex_t's that don't come from a Mutex anyway. Rather than try to make
that work, we should maintain the Mutex abstraction and not pass around the
platform implementation...
Make Mutex::Locker::Lock take a Mutex & or a Mutex *, and remove the constructor
taking a pthread_mutex_t *. You no longer need to call Mutex::GetMutex to pass
your mutex to a Locker (you can't in fact, since I made it private.)
llvm-svn: 156221
We want the representative register class to contain the largest
super-registers available. This makes the function less sensitive to the
register class numbering.
llvm-svn: 156220
Sema::ConvertToIntegralOrEnumerationType() from PartialDiagnostics to
abstract "diagnoser" classes. Not much of a win here, but we're
-several PartialDiagnostics.
llvm-svn: 156217
In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53:
../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
default: assert(0 && "Unknown condition code");
^
1 warning generated.
The prevailing pattern in LLVM is to not use a default label, and instead to
use llvm_unreachable to denote that the switch in fact covers all return paths
from the function.
llvm-svn: 156209