Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs
pass, and a few are reinserted by PHIElimination when a PHI argument is
<undef>.
RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look
like those created by PHIElimination, and that their live range never
leaves the basic block.
The PR14732 test case does tricks with PHI nodes that causes a longer
IMPLICIT_DEF live range to appear. This happens very rarely, but
RegisterCoalescer should be able to handle it.
llvm-svn: 171435
sections for debug info. These are some of the dwo sections from the
DWARF5 split debug info proposal. Update the fission-cu.ll testcase
to show what we should be able to dump more of now.
Work in progress: Ultimately the relocations will be gone for the
dwo section and the strings will be a different form (as well as
the rest of the sections will be included).
llvm-svn: 171428
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:
1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.
2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.
llvm-svn: 171420
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.
PR14725
llvm-svn: 171251
propagating one of the values it simplified to a constant across
a myriad of instructions. Notably, ptrtoint instructions when we had
a constant pointer (say, 0) didn't propagate that, blocking a massive
number of down-stream optimizations.
This was uncovered when investigating why we fail to inline and delete
the boilerplate in:
void f() {
std::vector<int> v;
v.push_back(1);
}
It turns out most of the efforts I've made thus far to improve the
analysis weren't making it far purely because of this. After this is
fixed, the store-to-load forwarding patch enables LLVM to optimize the
above to an empty function. We still can't nuke a second push_back, but
for different reasons.
There is a very real chance this will cause somewhat noticable changes
in inlining behavior, so please let me know if you see regressions (or
improvements!) because of this patch.
llvm-svn: 171196
how to propagate constants through insert and extract value
instructions.
With the recent improvements to instsimplify, this allows inline cost
analysis to constant fold through intrinsic functions, including notably
the with.overflow intrinsic math routines which often show up inside of
STL abstractions. This is yet another piece in the puzzle of breaking
down the code for:
void f() {
std::vector<int> v;
v.push_back(1);
}
But it still isn't enough. There are a pile of bugs in inline cost still
blocking this.
llvm-svn: 171195
constant folding calls. Add the initial tests for this which show that
now instsimplify can simplify blindingly obvious code patterns expressed
with both intrinsics and library calls.
llvm-svn: 171194
register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.
PR14657.
llvm-svn: 171148
information doesn't return an addend for Rel relocations. Go ahead
and use this information to fix relocation handling inside dwarfdump
for 32-bit ELF REL.
llvm-svn: 171126
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.
Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.
llvm-svn: 171079
As with the prefetch intrinsic to which it maps, simply have dcbt
marked as reading from and writing to its arguments instead of having
unmodeled side effects. While this might cause unwanted code motion
(because aliasing checks don't really capture cache-line sharing),
it is more important that prefetches in unrolled loops don't block
the scheduler from rearranging the unrolled loop body.
llvm-svn: 171073
Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.
llvm-svn: 171072