Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Patch by James Molloy and Pablo Barrio.
Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop
Subscribers: jojo, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D25477
llvm-svn: 284971
Add synci to the microMIPS instruction definitions, mark the MIPS sync & synci
as not being part of microMIPS. This does not cover the sync instruction alias,
as that will be handled with a different patch. Add sync to the valid tests for
microMIPS.
Reviewers: vkalintiris
Differential Revision: https://reviews.llvm.org/D25795
llvm-svn: 284962
We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results.
llvm-svn: 284939
In BasicAA GEP operand values get adjusted ("wrap-around") based on the
pointersize. Otherwise, in non-64b modes, AA could report false negatives.
However, a wrap-around is valid only for a fully evaluated expression.
It had been introduced to fix an alias problem in
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html.
This commit restricts the wrap-around to constant gep operands only where the
value is known at compile-time.
llvm-svn: 284908
Summary:
When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV
multiple times if this SCEV is referenced by multiple other SCEVs. This has
exponential time complexity in the worst case. Memoizing the results will
avoid re-visiting the same SCEV. Add a map to save the results, and override
the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each
SCEV once and thus returns the same result for the same input SCEV.
This patch fixes PR18606, PR18607.
Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin
Differential Revision: https://reviews.llvm.org/D25810
llvm-svn: 284868
iterating over an archive with object and non-object members that
would cause an Abort because to was not calling consumeError()
when the code was wanting to ignore a non-object file.
Found by Justin Bogner!
llvm-svn: 284867
If a 64-bit value is tested against a bit which is known to be in the range
[0..31) (modulo 64), we can use the 32-bit BT instruction, which has a slightly
shorter encoding.
Differential Revision: https://reviews.llvm.org/D25862
llvm-svn: 284864
Summary: This adds support for dumping the globals stream from PDB files using llvm-pdbdump, similar to the support we have for the publics stream.
Reviewers: ruiu, zturner
Subscribers: beanz, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D25801
llvm-svn: 284861
Summary:
Utility pass to remove gc.relocates created by rewrite statepoints for GC.
With respect to safepoint verification, the IR generated would be incorrect, and cannot run
as such.
This would be a single transformation on the final optimized IR.
The benefit of the pass is for easy analysis when the IRs are 'polluted' by too
many gc.relocates.
Added tests.
test run: All RS4GC tests with -verify option. Local downstream tests on large
IR files. This also works when the pointer being gc.relocated is another
gc.relocate.
Reviewers: sanjoy, reames
Subscribers: beanz, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D25096
llvm-svn: 284855
the ARM_THREAD_STATE in the same format as
otool-classic(1) on darwin.
Also remove an extra space in printing the initprot to make
the output match otool-classic(1) on darwin.
rdar://28851457
llvm-svn: 284852
0 - X --> 0, if the sub is NUW
0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW
0 - X --> X, if X is 0 or the minimum signed value
This is the DAG equivalent of:
https://reviews.llvm.org/rL284649
plus the fold for the NUW case which already existed in InstSimplify.
Note that we miss a vector fold because of a deficiency in the DAG version of
computeKnownBits().
llvm-svn: 284844
These are the backend equivalents for the tests added in r284627.
The patterns may emerge late, so we should have folds for these in the DAG too.
llvm-svn: 284842
After register allocation it is possible to have a spill of a register
that is only partially defined. That in itself it fine, but creates a
problem for double vector registers. Stores of such registers are pseudo
instructions that are expanded into pairs of individual vector stores,
and in case of a partially defined source, one of the stores may use
an entirely undefined register. To avoid this, track the defined parts
and only generate actual stores for those.
llvm-svn: 284841
Summary:
Need to reorder the operands to have the callee as the last argument.
Adds a pseudo-instruction, and a pass to lower it into a real
call_indirect.
This is the first of two options for how to fix the problem.
Reviewers: dschuff, sunfish
Subscribers: jfb, beanz, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D25708
llvm-svn: 284840
As discussed in D24815, let's start the process of killing off the broken fast-math global
state housed in TargetOptions and eliminate the need for function-level fast-math attributes.
Here we enable two similar folds that are possible when we don't care about signed-zero:
fadd nsz x, 0 --> x
fsub nsz 0, x --> -x
Note that although the test cases include a 'sin' function call, I'm side-stepping the
FMF-on-calls question (and lack of support in the DAG) for now. It's not needed for these
tests - isNegatibleForFree/GetNegatedExpression just look through a ISD::FSIN node.
Also, when we create an FNEG node and propagate the Flags of the FSUB to it, this doesn't
actually do anything today because Flags are silently dropped for any node that is not a
binary operator.
Differential Revision: https://reviews.llvm.org/D25297
llvm-svn: 284824
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.
Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.
Differential Revision: https://reviews.llvm.org/D25682
llvm-svn: 284818
There's no agreement about this patch. I personally find the
PRE machinery of the current GVN hard enough to reason about
that I'm not sure I'll try to land this again, instead of working
on the rewrite).
llvm-svn: 284796
This is to avoid inlining too many multiplication operands into a SCEV, which could
take exponential time in the worst case.
Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin
Differential Revision: https://reviews.llvm.org/D25794
llvm-svn: 284784
load commands that use the MachO::twolevel_hints_command type
which includes only the LC_TWOLEVEL_HINTS load command.
This is not used in llvm libObject code or in llvm tool code. But
does appear in one of the binary test files. While this load command is
obsolete it is easier to add code for it in libObject than edit or change
the binary test case.
llvm-svn: 284769
Summary:
The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass.
The performance impact on speccpu2006 on Intel sandybridge machines:
spec/2006/fp/C++/444.namd 25.3 +0.26%
spec/2006/fp/C++/447.dealII 45.96 -0.10%
spec/2006/fp/C++/450.soplex 41.97 +1.49%
spec/2006/fp/C++/453.povray 36.83 -0.96%
spec/2006/fp/C/433.milc 23.81 +0.32%
spec/2006/fp/C/470.lbm 41.17 +0.34%
spec/2006/fp/C/482.sphinx3 48.13 +0.69%
spec/2006/int/C++/471.omnetpp 22.45 +3.25%
spec/2006/int/C++/473.astar 21.35 -2.06%
spec/2006/int/C++/483.xalancbmk 36.02 -2.39%
spec/2006/int/C/400.perlbench 33.7 -0.17%
spec/2006/int/C/401.bzip2 22.9 +0.52%
spec/2006/int/C/403.gcc 32.42 -0.54%
spec/2006/int/C/429.mcf 39.59 +0.19%
spec/2006/int/C/445.gobmk 26.98 -0.00%
spec/2006/int/C/456.hmmer 24.52 -0.18%
spec/2006/int/C/458.sjeng 28.26 +0.02%
spec/2006/int/C/462.libquantum 55.44 +3.74%
spec/2006/int/C/464.h264ref 46.67 -0.39%
geometric mean +0.20%
Manually checked 473 and 471 to verify the diff is in the noise range.
Reviewers: rengolin, davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24818
llvm-svn: 284757
Summary:
While promoting *_EXTEND_VECTOR_INREG nodes whose inputs are already
promoted, perform the appropriate sign extension for the promoted node
before doing the *_EXTEND_VECTOR_INREG operation. If not, the undefined
high-order bits of the promoted operand may (a) be garbage inc ase of
zext) or (b) contribute the wrong sign-bit (in case of sext)
Updated the promote-vec3.ll test after this change. The diff shows
explicit zeroing in case of zext and intermediate sign extension in case
of sext.
Reviewers: RKSimon
Subscribers: llvm-commits, srhines
Differential Revision: https://reviews.llvm.org/D25790
llvm-svn: 284752