We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not be desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.
Differential Revision: http://reviews.llvm.org/D17782
llvm-svn: 262465
For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any instruction it would be pointing at the next operand.
Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally.
llvm-svn: 262462
Fix checking the same instruction twice instead of the
second branch that uses vccz. I don't think this matters
currently because s_branch_vccnz is always used currently.
llvm-svn: 262457
For some reason MSVC seems to think I'm calling getConstant() from a
static context. Try to avoid this issue by explicitly specifying
'this->' (though I'm not confident that this will actually work).
llvm-svn: 262451
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.
The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).
llvm-svn: 262438
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case.
Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.
But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
llvm-svn: 262424
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17742
llvm-svn: 262419
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
llvm-svn: 262397
This adds some missing generic schedule info definitions, enables
completeness checking for cyclone and fixes a typo uncovered by that.
Differential Revision: http://reviews.llvm.org/D17748
llvm-svn: 262393
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.
llvm-svn: 262391
Revert r262248 in an attempt to fix the clang-native-aarch64-full
bot and to investigate a performance regression in
SingleSource/Benchmarks/CoyoteBench/huffbench
llvm-svn: 262388
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
llvm-svn: 262387