This patch migrates the math library call simplifications from the
simplify-libcalls pass into the instcombine library call simplifier.
I have typically migrated just one simplifier at a time, but the math
simplifiers are interdependent because:
1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt.
2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on
the option -enable-double-float-shrink.
These two factors made migrating each of these simplifiers individually
more of a pain than it would be worth. So, I migrated them all together.
llvm-svn: 167815
Don't choose a vectorization plan containing only shuffles and
vector inserts/extracts. Due to inperfections in the cost model,
these can lead to infinite recusion.
llvm-svn: 167811
If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the
same for both of them because we use the 'upper_bound' attribute. Instead use
the 'count' attrbute, which gives the correct number of elements in the array.
<rdar://problem/12566646>
llvm-svn: 167806
This fixes another infinite recursion case when using target costs.
We can only replace insert element input chains that are pure (end
with inserting into an undef).
llvm-svn: 167784
getNumContainedPasses() used to compute the size of the vector on demand. It is
called repeated in loops (such as runOnFunction()) and it can be updated while
inside the loop.
llvm-svn: 167759
This teaches the register coalescer to be less prone to split critical
edges. I am currently benchmarking this with the new (post-coalescer)
scheduler. I plan to enable this by default and remove the option as
soon as misched is enabled.
llvm-svn: 167758
The old checking code, which assumed that input shuffles and insert-elements
could always be folded (and thus were free) is too simple.
This can only happen in special circumstances.
Using the simple check caused infinite recursion.
llvm-svn: 167750
Uses the infrastructure from r167742 to support clustering instructure
that the target processor can "fuse". e.g. cmp+jmp.
Next step: target hook implementations with test cases, and enable.
llvm-svn: 167744
The pass would previously assert when trying to compute the cost of
compare instructions with illegal vector types (like struct pointers).
llvm-svn: 167743
This infrastructure is generally useful for any target that wants to
strongly prefer two instructions to be adjacent after scheduling.
A following checkin will add target-specific hooks with unit
tests. Then this feature will be enabled by default with misched.
llvm-svn: 167742
The assertion is trigged when the Reassociater tries to transform expression
... + 2 * n * 3 + 2 * m + ...
into:
... + 2 * (n*3 + m).
In the process of the transformation, a helper routine folds the constant 2*3 into 6,
confusing optimizer which is trying the to eliminate the common factor 2, and cannot
find 2 any more.
Review is pending. But I'd like commit first in order to help those who are waiting
for this fix.
llvm-svn: 167740
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.
llvm-svn: 167738
This fixes a bug where shuffles were being fused such that the
resulting input types were not legal on the target. This would
occur only when both inputs and dependencies were also foldable
operations (such as other shuffles) and there were other connected
pairs in the same block.
llvm-svn: 167731
The library call simplifier folds memcmp calls with all constant arguments
to a constant. For example:
memcmp("foo", "foo", 3) -> 0
memcmp("hel", "foo", 3) -> 1
memcmp("foo", "hel", 3) -> -1
The folding is implemented in terms of the system memcmp that LLVM gets
linked with. It currently just blindly uses the value returned from
the system memcmp as the folded constant.
This patch normalizes the values returned from the system memcmp to
(-1, 0, 1) so that we get consistent results across multiple platforms.
The test cases were adjusted accordingly.
llvm-svn: 167726
Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally,
PTX 3.1 is added as the default PTX version to be out-of-the-box compatible
with CUDA 5.0.
Available CPUs for this target:
sm_10 - Select the sm_10 processor.
sm_11 - Select the sm_11 processor.
sm_12 - Select the sm_12 processor.
sm_13 - Select the sm_13 processor.
sm_20 - Select the sm_20 processor.
sm_21 - Select the sm_21 processor.
sm_30 - Select the sm_30 processor.
sm_35 - Select the sm_35 processor.
Available features for this target:
ptx30 - Use PTX version 3.0.
ptx31 - Use PTX version 3.1.
sm_10 - Target SM 1.0.
sm_11 - Target SM 1.1.
sm_12 - Target SM 1.2.
sm_13 - Target SM 1.3.
sm_20 - Target SM 2.0.
sm_21 - Target SM 2.1.
sm_30 - Target SM 3.0.
sm_35 - Target SM 3.5.
llvm-svn: 167699
In some cases the library call simplifier may need to replace instructions
other than the library call being simplified. In those cases it may be
necessary for clients of the simplifier to override how the replacements
are actually done. As such, a new overrideable method for replacing
instructions was added to LibCallSimplifier.
A new subclass of LibCallSimplifier is also defined which overrides
the instruction replacement method. This is because the instruction
combiner defines its own replacement method which updates the worklist
when instructions are replaced.
llvm-svn: 167681
Several of the simplifiers migrated from the simplify-libcalls pass to
the instcombine pass were not correctly checking the target library
information to gate the simplifications. This patch ensures that the
check is made.
llvm-svn: 167660
In the process of migrating optimizations from the simplify-libcalls pass
to the instcombine pass I noticed that a few functions are missing from
the target library information. These functions need to be available for
querying in the instcombine library call simplifiers. More functions will
probably be added in the future as more simplifiers are migrated to
instcombine.
llvm-svn: 167659
mov lr, pc
b.w _foo
The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.
rdar://12663632
llvm-svn: 167657