Commit Graph

7594 Commits

Author SHA1 Message Date
Sanjay Patel 43aeb001c9 [InstCombine] use m_APInt to allow icmp (binop X, Y), C folds with constant splat vectors
This removes the restriction for the icmp constant, but as noted by the FIXME comments, 
we still need to change individual checks for binop operand constants.

llvm-svn: 277629
2016-08-03 18:59:03 +00:00
Duncan P. N. Exon Smith 9cbc69d1fe IR: Drop uniquing when an MDNode Value operand is deleted
This is a fix for PR28697.

An MDNode can indirectly refer to a GlobalValue, through a
ConstantAsMetadata.  When the GlobalValue is deleted, the MDNode operand
is reset to `nullptr`.  If the node is uniqued, this can lead to a
hard-to-detect cache invalidation in a Metadata map that's shared across
an LLVMContext.

Consider:

 1. A map from Metadata* to `T` called RemappedMDs.
 2. A node that references a global variable, `!{i1* @GV}`.
 3. Insert `!{i1* @GV} -> SomeT` in the map.
 4. Delete `@GV`, leaving behind `!{null} -> SomeT`.

Looking up the generic and uninteresting `!{null}` gives you `SomeT`,
which is likely related to `@GV`.  Worse, `SomeT`'s lifetime may be tied
to the deleted `@GV`.

This occurs in practice in the shared ValueMap used since r266579 in the
IRMover.  Other code that handles more than one Module (with different
lifetimes) in the same LLVMContext could hit it too.

The fix here is a partial revert of r225223: in the rare case that an
MDNode operand is a ConstantAsMetadata (i.e., wrapping a node from the
Value hierarchy), drop uniquing if it gets replaced with `nullptr`.
This changes step #4 above to leave behind `distinct !{null} -> SomeT`,
which can't be confused with the generic `!{null}`.

In theory, this can cause some churn in the LLVMContext's MDNode
uniquing map when Values are being deleted.  However:

  - The number of GlobalValues referenced from uniqued MDNodes is
    expected to be quite small.  E.g., the debug info metadata schema
    only references GlobalValues from distinct nodes.

  - Other Constants have the lifetime of the LLVMContext, whose teardown
    is careful to drop references before deleting the constants.

As a result, I don't expect a compile time regression from this change.

llvm-svn: 277625
2016-08-03 18:19:43 +00:00
David Majnemer fad0490869 [CloneFunction] Don't remove side effecting calls
We were able to figure out that the result of a call is some constant.
While propagating that fact, we added the constant to the value map.
This is problematic because it results in us losing the call site when
processing the value map.

This fixes PR28802.

llvm-svn: 277611
2016-08-03 17:12:47 +00:00
Renato Golin f583097434 Revert "Teach CorrelatedValuePropagation to mark adds as no wrap"
This reverts commit r277592, trying to fix the AArch64 42VMA buildbot.

llvm-svn: 277607
2016-08-03 16:20:48 +00:00
Sanjay Patel 91bab5364e add a vector variant of each test
llvm-svn: 277598
2016-08-03 14:25:55 +00:00
Nicolai Haehnle c1f1ad998a [InstCombine] Add select-bitext.ll tests
As requested for D22747.

llvm-svn: 277596
2016-08-03 13:37:56 +00:00
Artur Pilipenko 68cb947cc9 Teach CorrelatedValuePropagation to mark adds as no wrap
Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23059

llvm-svn: 277592
2016-08-03 13:11:39 +00:00
Chandler Carruth 6cb2ab2c60 [PM] Significantly refactor the pass pipeline parsing to be easier to
reason about and less error prone.

The core idea is to fully parse the text without trying to identify
passes or structure. This is done with a single state machine. There
were various bugs in the logic around this previously that were repeated
and scattered across the code. Having a single routine makes it much
easier to fix and get correct. For example, this routine doesn't suffer
from PR28577.

Then the actual pass construction is handled using *much* easier to read
code and simple loops, with particular pass manager construction sunk to
live with other pass construction. This is especially nice as the pass
managers *are* in fact passes.

Finally, the "implicit" pass manager synthesis is done much more simply
by forming "pre-parsed" structures rather than having to duplicate tons
of logic.

One of the bugs fixed by this was evident in the tests where we accepted
a pipeline that wasn't really well formed. Another bug is PR28577 for
which I have added a test case.

The code is less efficient than the previous code but I'm really hoping
that's not a priority. ;]

Thanks to Sean for the review!

Differential Revision: https://reviews.llvm.org/D22724

llvm-svn: 277561
2016-08-03 03:21:41 +00:00
Ivan Krasin 3aade11252 Add -lowertypetests-bitsets-level to control bitsets generation.
Summary:
Sometimes, bitsets could get really large (>300k entries) and
we might want to drop a check, as it would have a too much cost.

Adding a flag to control how much penalty are we willing to pay
for bitsets.

Reviewers: kcc

Differential Revision: https://reviews.llvm.org/D23088

llvm-svn: 277556
2016-08-03 00:59:38 +00:00
Sanjay Patel 287b81d27b add vector test for icmp+sub
llvm-svn: 277555
2016-08-03 00:36:54 +00:00
Daniel Berlin df10119e4e Support for lifetime begin/end markers in the MemorySSA use optimizer
Summary: Depends on D23072

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23076

llvm-svn: 277553
2016-08-03 00:01:46 +00:00
Evgeniy Stepanov d99f80b48e [safestack] Layout large allocas first to reduce fragmentation.
llvm-svn: 277544
2016-08-02 23:21:30 +00:00
Michael Zolotukhin ca0d48b742 [LoopUnroll] Fix a PowerPC test broken by r277524.
llvm-svn: 277527
2016-08-02 21:43:25 +00:00
Michael Zolotukhin b2738e41bf [LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value.
As agreed in post-commit review of r265388, I'm switching the flag to
its original value until the 90% runtime performance regression on
SingleSource/Benchmarks/Stanford/Bubblesort is addressed.

llvm-svn: 277524
2016-08-02 21:24:14 +00:00
Wei Mi dc7001afb2 [LoopVectorize] Change comment for isOutOfScope in collectLoopUniforms, NFC
Update comment for isOutOfScope and add a testcase for uniform value being used
out of scope.

Differential Revision: https://reviews.llvm.org/D23073

llvm-svn: 277515
2016-08-02 20:27:49 +00:00
Sanjoy Das f45e03e201 [IRCE] Preserve DomTree and LCSSA
This changes IRCE to "preserve" LCSSA and DomTree by recomputing them.
It still does not preserve LoopSimplify.

llvm-svn: 277505
2016-08-02 19:31:54 +00:00
Michael Zolotukhin d9b6ad3c01 [LoopUnroll] Ensure we create prolog loops in simplified form.
llvm-svn: 277502
2016-08-02 19:19:31 +00:00
Matthew Simpson 18d8898317 [LV] Generate both scalar and vector integer induction variables
This patch enables the vectorizer to generate both scalar and vector versions
of an integer induction variable for a given loop. Previously, we only
generated a scalar induction variable if we knew all its users were going to be
scalar. Otherwise, we generated a vector induction variable. In the case of a
loop with both scalar and vector users of the induction variable, we would
generate the vector induction variable and extract scalar values from it for
the scalar users. With this patch, we now generate both versions of the
induction variable when there are both scalar and vector users and select which
version to use based on whether the user is scalar or vector.

Differential Revision: https://reviews.llvm.org/D22869

llvm-svn: 277474
2016-08-02 15:25:16 +00:00
Matthew Simpson 58f562887b [LV] Untangle the concepts of uniform and scalar
This patch refactors the logic in collectLoopUniforms and
collectValuesToIgnore, untangling the concepts of "uniform" and "scalar". It
adds isScalarAfterVectorization along side isUniformAfterVectorization to
distinguish the two. Known scalar values include those that are uniform,
getelementptr instructions that won't be vectorized, and induction variables
and induction variable update instructions whose users are all known to be
scalar.

This patch includes the following functional changes:

- In collectLoopUniforms, we mark uniform the pointer operands of interleaved
  accesses. Although non-consecutive, these pointers are treated like
  consecutive pointers during vectorization.

- In collectValuesToIgnore, we insert a value into VecValuesToIgnore if it
  isScalarAfterVectorization rather than isUniformAfterVectorization. This
  differs from the previous functionaly in that we now add getelementptr
  instructions that will not be vectorized into VecValuesToIgnore.

This patch also removes the ValuesNotWidened set used for induction variable
scalarization since, after the above changes, it is now equivalent to
isScalarAfterVectorization.

Differential Revision: https://reviews.llvm.org/D22867

llvm-svn: 277460
2016-08-02 14:29:41 +00:00
Igor Breger f44b79d08e [AVX512] Don't use i128 masked gather/scatter/load/store. Do more accurately dataWidth check.
Differential Revision: http://reviews.llvm.org/D23055

llvm-svn: 277435
2016-08-02 09:15:28 +00:00
Sean Silva f801575fd0 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277411
2016-08-02 02:15:45 +00:00
Derek Schuff c64d7655b2 [WebAssembly] Support CFI for WebAssembly target
Summary: This patch implements CFI for WebAssembly. It modifies the
LowerTypeTest pass to pre-assign table indexes to functions that are
called indirectly, and lowers type checks to test against the
appropriate table indexes. It also modifies the WebAssembly backend to
support a special ".indidx" assembly directive that propagates the table
index assignments out to the linker.

Patch by Dominic Chen

Differential Revision: https://reviews.llvm.org/D21768

llvm-svn: 277398
2016-08-01 22:25:02 +00:00
Michael Kuperstein c40618610f [PM] Port SpeculativeExecution to the new PM
Differential Revision: https://reviews.llvm.org/D23033

llvm-svn: 277393
2016-08-01 21:48:33 +00:00
David Majnemer ba6665d88a [Verifier] Resume instructions can only be in functions w/ a personality
This fixes PR28799.

llvm-svn: 277360
2016-08-01 18:06:34 +00:00
Simon Pilgrim 7fd4ad6849 Fixed test check ordering issue on windows buildbots
llvm-svn: 277337
2016-08-01 10:40:15 +00:00
James Molloy bade86cedc [SimplifyCFG] Fix nasty RAUW bug from r277325
Using RAUW was wrong here; if we have a switch transform such as:
  18 -> 6 then
  6 -> 0

If we use RAUW, while performing the second transform the  *transformed* 6
from the first will be also replaced, so we end up with:
  18 -> 0
  6 -> 0

Found by clang stage2 bootstrap; testcase added.

llvm-svn: 277332
2016-08-01 09:34:48 +00:00
Craig Topper d2b2d745ff [AVX-512] Fix a test missed in r277327.
llvm-svn: 277330
2016-08-01 08:15:30 +00:00
James Molloy 91821bd0b4 [SimplifyCFG] Try and pacify buildbots after r277325
It looks like the two independent parts of the rotate operation (a lshr and shl) are being reordered on some bots. Add CHECK-DAGs to account for this.

llvm-svn: 277329
2016-08-01 08:09:55 +00:00
James Molloy b2e436de42 [SimplifyCFG] Range reduce switches
If a switch is sparse and all the cases (once sorted) are in arithmetic progression, we can extract the common factor out of the switch and create a dense switch. For example:

    switch (i) {
    case 5: ...
    case 9: ...
    case 13: ...
    case 17: ...
    }

can become:

    if ( (i - 5) % 4 ) goto default;
    switch ((i - 5) / 4) {
    case 0: ...
    case 1: ...
    case 2: ...
    case 3: ...
    }

or even better:

   switch ( ROTR(i - 5, 2) {
   case 0: ...
   case 1: ...
   case 2: ...
   case 3: ...
   }

The division and remainder operations could be costly so we only do this if the factor is a power of two, and emit a right-rotate instead of a divide/remainder sequence. Dense switches can be lowered significantly better than sparse switches and can even be transformed into lookup tables.

llvm-svn: 277325
2016-08-01 07:45:11 +00:00
Sean Silva 423c7149dc Revert r277313 and r277314.
They seem to trigger an LSan failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/15140/steps/check-llvm%20asan/logs/stdio

Revert "Add the tests for r277313"

This reverts commit r277314.

Revert "CodeExtractor : Add ability to preserve profile data."

This reverts commit r277313.

llvm-svn: 277317
2016-08-01 04:16:09 +00:00
Sean Silva e5a5c966cd Move this test to x86-specific directory.
No bots have yelled yet, but this test references an x86 intrinsic.
Also, it invokes llc on x86 IR.

Fixup to r277315.

llvm-svn: 277316
2016-08-01 03:22:05 +00:00
Sean Silva a0a802abe3 Fix - CodeExtractor : Inherit Target Dependent Attributes from the parent function.
When extracting a set of blocks make sure to inherit all of the target
dependent attributes to make sure that the function will be valid for
lowering. One example is the "target-features" attribute for x86, if the
extracted region has functionality that relies on a specific feature it
will fail to be lowered.
This also allows for extracted functions to be valid for inlining, at
least back into the parent function, as the target attributes are tested
when inlining for compatibility.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22713

llvm-svn: 277315
2016-08-01 03:15:32 +00:00
Sean Silva 72be9a6937 Add the tests for r277313
Forgot to `git add` them.

llvm-svn: 277314
2016-08-01 03:04:34 +00:00
Simon Pilgrim 9e201eac32 [SLPVectorizer][X86] Added vXi8/vXi16 sitofp/uitofp tests
Dropped useless 2i32-2f32 test

llvm-svn: 277281
2016-07-30 21:01:34 +00:00
Simon Pilgrim f5134a2867 [SLPVectorizer][X86] Added SITOFP/UITOFP vectorization tests
llvm-svn: 277275
2016-07-30 18:43:30 +00:00
Adam Nemet 12937c361f [LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.

The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.

This is how the patch affects the O3 pipeline:

         Dominator Tree Construction
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Rotate Loops
           Loop Invariant Code Motion
           Unswitch loops
         Simplify the CFG
         Dominator Tree Construction
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Combine redundant instructions
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Induction Variable Simplification
           Recognize loop idioms
           Delete dead loops
           Unroll loops
...

llvm-svn: 277203
2016-07-29 19:29:47 +00:00
David Majnemer 718da3d1f6 [ConstantFolding] Handle bitcasts of undef fp vector elements
We used the wrong type for constructing a zero vector element which led
to type mismatches.

This fixes PR28771.

llvm-svn: 277197
2016-07-29 18:48:27 +00:00
Andrew Kaylor b99d1cc7ed Recommitting r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Third try, now following fixes to MSan to handle mempcy in such a way that this commit won't break the MSan buildbots. (Thanks, Evegenii!)

llvm-svn: 277189
2016-07-29 18:23:18 +00:00
Matt Masten a6669a1e05 Initial support for vectorization using svml (short vector math library).
Differential Revision: https://reviews.llvm.org/D19544

llvm-svn: 277166
2016-07-29 16:42:44 +00:00
David Majnemer 130b9f99d6 [EarlyCSE] Correctly handle simplified, but live, instructions
Some instructions may have their uses replaced with a symbolic constant.
However, the instruction may still have side effects which percludes it
from being removed from the function.  EarlyCSE treated such an
instruction as if it were removed, resulting in PR28763.

llvm-svn: 277114
2016-07-29 05:39:21 +00:00
David Majnemer e4218cf11e [ConstantFolding] Fold bitcasts of vectors w/ undef elements
An undef vector element can be treated as if it had any value.  Folding
such a vector element to 0 in a bitcast can open up further folding
opportunities.

llvm-svn: 277104
2016-07-29 04:06:09 +00:00
David Majnemer 57b94c8d6a [ConstantFolding] Use ConstantExpr::getWithOperands
ConstantExpr::getWithOperands does much of the hard work that
ConstantFoldInstOperandsImpl tries to do but more completely.

This lets us fold ExtractValue/InsertValue expressions.

llvm-svn: 277100
2016-07-29 03:27:31 +00:00
David Majnemer d536f2328e [ConstnatFolding] Teach the folder how to fold ConstantVector
A ConstantVector can have ConstantExpr operands and vice versa.
However, the folder had no ability to fold ConstantVectors which, in
some cases, was an optimization barrier.

Instead, rephrase the folder in terms of Constants instead of
ConstantExprs and teach callers how to deal with failure.

llvm-svn: 277099
2016-07-29 03:27:26 +00:00
Piotr Padlewski 84abc74f2c Added ThinLTO inlining statistics
Summary:
copypasta doc of ImportedFunctionsInliningStatistics class
 \brief Calculate and dump ThinLTO specific inliner stats.
 The main statistics are:
 (1) Number of inlined imported functions,
 (2) Number of imported functions inlined into importing module (indirect),
 (3) Number of non imported functions inlined into importing module
 (indirect).
 The difference between first and the second is that first stat counts
 all performed inlines on imported functions, but the second one only the
 functions that have been eventually inlined to a function in the importing
 module (by a chain of inlines). Because llvm uses bottom-up inliner, it is
 possible to e.g. import function `A`, `B` and then inline `B` to `A`,
 and after this `A` might be too big to be inlined into some other function
 that calls it. It calculates this statistic by building graph, where
 the nodes are functions, and edges are performed inlines and then by marking
 the edges starting from not imported function.

 If `Verbose` is set to true, then it also dumps statistics
 per each inlined function, sorted by the greatest inlines count like
 - number of performed inlines
 - number of performed inlines to importing module

Reviewers: eraman, tejohnson, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D22491

llvm-svn: 277089
2016-07-29 00:27:16 +00:00
Adam Nemet aa3506c5f0 [BPI] Add new LazyBPI analysis
Summary:
The motivation is the same as in D22141: In order to add the hotness
attribute to optimization remarks we need BFI to be available in all
passes that emit optimization remarks.  BFI depends on BPI so unless we
make this lazy as well we would still compute BPI unconditionally.

The solution is to use the new LazyBPI pass in LazyBFI and only compute
BPI when computation of BFI is requested by the client.

I extended the laziness test using a LoopDistribute test to also cover
BPI.

Reviewers: hfinkel, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22835

llvm-svn: 277083
2016-07-28 23:31:12 +00:00
Vitaly Buka 0ab23cf1c8 Do not remove empty lifetime.start/lifetime.end ranges
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.

This is possible for code like this:
for (int i = 0; i < 3; i++) {
  int x;
  p = &x;
}

"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.

PR27453

Reviewers: eugenis

Differential Revision: https://reviews.llvm.org/D22842

llvm-svn: 277072
2016-07-28 22:59:03 +00:00
Vitaly Buka 2fae6a7702 Should be committed as one CL.
This reverts commits r277068 r277067 r277066.

llvm-svn: 277071
2016-07-28 22:59:01 +00:00
Vitaly Buka f0500b6ae5 Do not remove empty lifetime.start/lifetime.end ranges
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.

This is possible for code like this:
for (int i = 0; i < 3; i++) {
  int x;
  p = &x;
}

"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.

PR27453

Reviewers: eugenis

Differential Revision: https://reviews.llvm.org/D22842

llvm-svn: 277068
2016-07-28 22:50:48 +00:00
Vitaly Buka 3645793872 maned
llvm-svn: 277067
2016-07-28 22:50:45 +00:00
Michael Kuperstein e45d4d9b35 [PM] Port LowerGuardIntrinsic to the new PM.
llvm-svn: 277057
2016-07-28 22:08:41 +00:00
David Majnemer 3d32b7ed0d [coroutines] Part 3 of N: Adding Boilerplate for Coroutine Passes
This adds boilerplate code for all coroutine passes,
the passes are no-ops for now.
Also, a small test has been added to verify that passes execute in
the expected order or not at all if coroutine support is disabled.

Patch by Gor Nishanov!

Differential Revision: https://reviews.llvm.org/D22847

llvm-svn: 277033
2016-07-28 21:04:31 +00:00
David Majnemer 19d024b2fd [ConstantFolding] Don't bail on folding if ConstantFoldConstantExpression fails
When folding an expression, we run ConstantFoldConstantExpression on
each operand of that expression.
However, ConstantFoldConstantExpression can fail and retur nullptr.

Previously, we would bail on further refining the expression.
Instead, use the original operand and see if we can refine a later
operand.

llvm-svn: 276959
2016-07-28 06:39:48 +00:00
David Majnemer 0be7155350 [InstCombine] Handle failures from ConstantFoldConstantExpression
ConstantFoldConstantExpression returns null when folding fails.

This fixes PR28745.

llvm-svn: 276952
2016-07-28 02:29:06 +00:00
Wei Mi 315bb33f27 Fix the assertion error in collectLoopUniforms caused by empty Worklist before expanding.
Contributed-by: David Callahan

Differential Revision: https://reviews.llvm.org/D22886

llvm-svn: 276943
2016-07-27 23:53:58 +00:00
Justin Lebar 23a9686011 [LSV] Don't assume that bitcast ops are Instructions.
Summary:
When we ask the builder to create a bitcast on a constant, we get back a
constant, not an instruction.

Reviewers: asbirlea

Subscribers: jholewinski, mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22878

llvm-svn: 276922
2016-07-27 21:45:48 +00:00
Sebastian Pop 55c3007b88 GVN-hoist: improve code generation for recursive GEPs
When loading or storing in a field of a struct like "a.b.c", GVN is able to
detect the equivalent expressions, and GVN-hoist would fail in the code
generation.  This is because the GEPs are not hoisted as scalar operations to
avoid moving the GEPs too far from their ld/st instruction when the ld/st is not
movable.  So we end up having to generate code for the GEP of a ld/st when we
move the ld/st.  In the case of a GEP referring to another GEP as in "a.b.c" we
need to code generate all the GEPs necessary to make all the operands available
at the new location for the ld/st.  With this patch we recursively walk through
the GEP operands checking whether all operands are available, and in the case of
a GEP operand, it recursively makes all its operands available. Code generation
happens from the inner GEPs out until reaching the GEP that appears as an
operand of the ld/st.

Differential Revision: https://reviews.llvm.org/D22599

llvm-svn: 276841
2016-07-27 05:48:12 +00:00
David Majnemer bc36b15253 [ConstantFolding] Correctly handle failures in ConstantFoldConstantExpressionImpl
Failures in ConstantFoldConstantExpressionImpl were ignored causing
crashes down the line.

This fixes PR28725.

llvm-svn: 276827
2016-07-27 02:39:16 +00:00
Andrew Kaylor f990fa5f7b Reverting r276771 due to MSan failures.
llvm-svn: 276824
2016-07-27 01:19:24 +00:00
David Majnemer 6774d612d4 [InstSimplify] Cast folding can be made more generic
Use isEliminableCastPair to determine if a pair of casts are foldable.

llvm-svn: 276777
2016-07-26 17:58:05 +00:00
Andrew Kaylor 3104a6bad0 Re-committing r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 276771
2016-07-26 17:23:13 +00:00
David Majnemer a90a621d1e Reapply: [InstSimplify] Add support for bitcasts"
This reverts commit r276700 and reapplies r276698.
The relevant clang tests have been updated.

llvm-svn: 276727
2016-07-26 05:52:29 +00:00
Sebastian Pop 91d4a30159 GVN-hoist: use a DFS numbering of instructions (PR28670)
Instead of DFS numbering basic blocks we now DFS number instructions that avoids
the costly operation of which instruction comes first in a basic block.

Patch mostly written by Daniel Berlin.

Differential Revision: https://reviews.llvm.org/D22777

llvm-svn: 276714
2016-07-26 00:15:10 +00:00
Evgeniy Stepanov 906f6fb565 [safestack] Fix stack guard live range.
Stack guard slot is live throughout the function.

llvm-svn: 276712
2016-07-26 00:05:14 +00:00
David Majnemer 6e06b577cc Revert "[InstSimplify] Add support for bitcasts"
This reverts commit r276698.  Clang has tests which rely on the
optimizer :(

llvm-svn: 276700
2016-07-25 22:24:59 +00:00
David Majnemer 62611fd3f7 [InstSimplify] Add support for bitcasts
BitCasts of BitCasts can be folded away as can BitCasts which don't
change the type of the operand.

llvm-svn: 276698
2016-07-25 22:04:58 +00:00
Matt Arsenault 7cddfed7e8 Scalarizer: Support scalarizing intrinsics
llvm-svn: 276681
2016-07-25 20:02:54 +00:00
Evgeniy Stepanov 8d78bd5041 Fix invalid iterator use in safestack coloring.
llvm-svn: 276676
2016-07-25 19:25:40 +00:00
Rong Xu 705f7775bb [PGO] Fix profile mismatch in COMDAT function with pre-inliner
Pre-instrumentation inline (pre-inliner) greatly improves the IR
instrumentation code performance, among other benefits. One issue of the
pre-inliner is it can introduce CFG-mismatch for COMDAT functions. This
is due to the fact that the same COMDAT function may have different early
inline decisions across different modules -- that means different copies
of COMDAT functions will have different CFG checksum.

In this patch, we propose a partially renaming the COMDAT group and its
member function/variable so we have different profile counter for each
version. We will post-fix the COMDAT function and the group name with its
FunctionHash.

Differential Revision: http://reviews.llvm.org/D22600

llvm-svn: 276673
2016-07-25 18:45:37 +00:00
Sean Silva fe5abd5e0c Fix : Partial Inliner requires AssumptionCacheTracker
The public InlineFunction utility assumes that the passed in
InlineFunctionInfo has a valid AssumptionCacheTracker.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22706

llvm-svn: 276609
2016-07-25 05:00:00 +00:00
David Majnemer 68623a0e9f [GVNHoist] Merge metadata on hoisted instructions less conservatively
We can combine metadata from multiple instructions intelligently for
certain metadata nodes.

llvm-svn: 276602
2016-07-25 02:21:25 +00:00
David Majnemer 4728569d0a [GVNHoist] Properly merge alignments when hoisting
If we two loads of two different alignments, we must use the minimum of
the two alignments when hoisting.  Same deal for stores.

For allocas, use the maximum of the two allocas.

llvm-svn: 276601
2016-07-25 02:21:23 +00:00
Elena Demikhovsky 376a18bd92 [Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this:
float *A;
float x = init;
for (int i=0; i < N; ++i) {
  A[i] = x;
  x -= fp_inc;
}

The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.

Differential Revision: https://reviews.llvm.org/D21330

llvm-svn: 276554
2016-07-24 07:24:54 +00:00
Sanjay Patel 1271bf9178 [InstCombine] allow icmp (bit-manipulation-intrinsic(), C) folds for vectors
llvm-svn: 276523
2016-07-23 13:06:49 +00:00
Xinliang David Li 9239245401 [Profile] Use explicit flag to enable IR PGO
Patch by Jake VanAdrighem

Differential Revision: http://reviews.llvm.org/D22607

llvm-svn: 276516
2016-07-23 04:28:52 +00:00
Sanjay Patel 8d8594acb9 auto-generate checks
llvm-svn: 276501
2016-07-23 00:09:54 +00:00
Adam Nemet 9e6e63fba2 [LoopDataPrefetch] Include hotness of region in opt remark
llvm-svn: 276488
2016-07-22 22:53:17 +00:00
Sanjay Patel e063ddb347 add tests for icmp vector folds
llvm-svn: 276482
2016-07-22 22:19:52 +00:00
Michael Kuperstein 38e7298093 [SLPVectorizer] Vectorize reverse-order loads in horizontal reductions
When vectorizing a tree rooted at a store bundle, we currently try to sort the
stores before building the tree, so that the stores can be vectorized. For other
trees, the order of the root bundle - which determines the order of all other
bundles - is arbitrary. That is bad, since if a leaf bundle of consecutive loads
happens to appear in the wrong order, we will not vectorize it.

This is partially mitigated when the root is a binary operator, by trying to
build a "reversed" tree when that's considered profitable. This patch extends the
workaround we have for binops to trees rooted in a horizontal reduction.

This fixes PR28474.

Differential Revision: https://reviews.llvm.org/D22554

llvm-svn: 276477
2016-07-22 21:28:48 +00:00
Sanjay Patel cbc4377af1 add tests for icmp vector folds
llvm-svn: 276476
2016-07-22 21:28:20 +00:00
Sanjay Patel 97e61dcc2d add tests for icmp vector folds
llvm-svn: 276475
2016-07-22 21:13:08 +00:00
Sanjay Patel b73d7aed71 add tests for icmp vector folds
llvm-svn: 276472
2016-07-22 21:02:33 +00:00
Sanjay Patel 859278005d update to use FileCheck and auto-generate checks
llvm-svn: 276466
2016-07-22 20:39:07 +00:00
Sanjay Patel 296a776a5b add tests for icmp vector folds
llvm-svn: 276464
2016-07-22 20:11:08 +00:00
Jun Bum Lim 6a7dc5c430 Recommit - [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Recommiting r275571 after fixing crash reported in PR28270.
Now we erase elements of IOL in deleteDeadInstruction().

Original Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

llvm-svn: 276452
2016-07-22 18:27:24 +00:00
Sanjay Patel beaea95a0d add tests for vector bit manipulation intrinsics
llvm-svn: 276451
2016-07-22 18:22:25 +00:00
Anna Thomas 0be4a0e6a4 Invariant start/end intrinsics overloaded for address space
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address
space.

With this change, these intrinsics are overloaded for any adddress space
for memory objects
and we can use these llvm invariant intrinsics in non-default address
spaces.

Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)

This overloaded intrinsic is needed for representing final or invariant
memory in managed languages.

Reviewers: apilipenko, reames

Subscribers: llvm-commits
llvm-svn: 276447
2016-07-22 17:49:40 +00:00
David Majnemer 522a91181a Don't remove side effecting instructions due to ConstantFoldInstruction
Just because we can constant fold the result of an instruction does not
imply that we can delete the instruction.  It may have side effects.

This fixes PR28655.

llvm-svn: 276389
2016-07-22 04:54:44 +00:00
Sanjoy Das aae623f4c2 [IRCE] Don't misuse CHECK-LABEL; NFC
llvm-svn: 276373
2016-07-22 00:41:02 +00:00
Sanjoy Das bb969791b4 [IRCE] Add an option to skip profitability checks
If `-irce-skip-profitability-checks` is passed in, IRCE will kick in in
all cases where it is legal for it to kick in.  This flag is intended to
help diagnose and analyse performance issues.

llvm-svn: 276372
2016-07-22 00:40:56 +00:00
Sebastian Pop 31fd506623 GVH-hoist: only clone GEPs (PR28606)
Do not clone stored values unless they are GEPs that are special cased to avoid
hoisting them without hoisting their associated ld/st.

Differential revision: https://reviews.llvm.org/D22652

llvm-svn: 276358
2016-07-21 23:22:10 +00:00
Wei Mi 1cf58f8996 [PM] Port NaryReassociate to the new PM
Differential Revision: https://reviews.llvm.org/D22648

llvm-svn: 276349
2016-07-21 22:28:52 +00:00
Sanjay Patel e9fc79bb13 [InstSimplify] don't crash handling a pointer or aggregate type
llvm-svn: 276345
2016-07-21 21:56:00 +00:00
Sanjay Patel a3bfb4e313 [InstSimplify] recognize trunc + icmp sgt/slt variants of select simplifications (PR28466)
rL245171 exposed a hole in InstSimplify that manifested in a strange way in PR28466:
https://llvm.org/bugs/show_bug.cgi?id=28466

It's possible to use trunc + icmp sgt/slt in place of an and + icmp eq/ne, so we need to
recognize that pattern to eliminate selects that are choosing between some value and some
bitmasked version of that value.

Note that there is significant room for improvement (refactoring) and enhancement (more
patterns, possibly in InstCombine rather than here).

Differential Revision: https://reviews.llvm.org/D22537

llvm-svn: 276341
2016-07-21 21:26:45 +00:00
Adam Nemet 84a6425d61 [OptDiag,LDist] Convert remaining opt remarks to use the new API
llvm-svn: 276340
2016-07-21 21:21:34 +00:00
Matthew Simpson 102729cf1b [LV] Move vector int induction update to end of latch
This patch moves the update instruction for vectorized integer induction phi
nodes to the end of the latch block. This ensures consistent placement of all
induction updates across all the kinds of int inductions we create (scalar,
splat vector, or vector phi).

Differential Revision: https://reviews.llvm.org/D22416

llvm-svn: 276339
2016-07-21 21:20:15 +00:00
Sanjay Patel 9eec550a2b add vector tests and a simpler version of the negative tests
llvm-svn: 276328
2016-07-21 20:11:08 +00:00
Anna Thomas c858faa244 Revert "Invariant start/end intrinsics overloaded for address space"
This reverts commit r276316.

llvm-svn: 276320
2016-07-21 19:06:28 +00:00
Anna Thomas 29b24dfe44 Invariant start/end intrinsics overloaded for address space
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address space.

With this change, these intrinsics are overloaded for any adddress space for memory objects
and we can use these llvm invariant intrinsics in non-default address spaces.

Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)

This overloaded intrinsic is needed for representing final or invariant memory in managed languages.

Reviewers: tstellarAMD, reames, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22519

llvm-svn: 276316
2016-07-21 18:41:44 +00:00
David Majnemer 825e4ab9e3 [GVNHoist] Preserve optimization hints which agree
If we have optimization hints with agree with each other along different
paths, preserve them.

llvm-svn: 276248
2016-07-21 07:16:26 +00:00
David Majnemer 4808f26422 [GVNHoist] Don't wrongly preserve TBAA
We hoisted loads/stores without taking into account which can cause
miscompiles.

llvm-svn: 276240
2016-07-21 05:59:53 +00:00
Adam Nemet 7cfd5971ab [OptDiag,LV] Add hotness attribute to applied-optimization remarks
Test coverage is provided by modifying the function in the FP-math
testcase that we are allowed to vectorize.

llvm-svn: 276223
2016-07-21 01:07:13 +00:00
Sanjay Patel 0753c06d9c [InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476)
The benefits of this change include:
1. Remove DeMorgan-matching code that was added specifically to work-around 
   the missing transform in http://reviews.llvm.org/rL248634.
2. Makes the DeMorgan transform work for vectors too.
3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476

Extending this transform to other casts and other associative operators may
be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for
doing that though.

Differential Revision: https://reviews.llvm.org/D22271

llvm-svn: 276221
2016-07-21 00:24:18 +00:00
Adam Nemet 0e0e2d5d26 [OptDiag,LV] Add hotness attribute to the derived analysis remarks
This includes FPCompute and Aliasing.

Testcase is based on no_fpmath.ll.

llvm-svn: 276211
2016-07-20 23:50:32 +00:00
Sanjay Patel 5f3c70307d [InstSimplify][InstCombine] don't crash when folding vector selects of icmp
Differential Revision: https://reviews.llvm.org/D22602

llvm-svn: 276209
2016-07-20 23:40:01 +00:00
Justin Lebar cd564c6b46 [NVPTX] Enable the load-store vectorizer on nvptx.
Reviewers: tra

Subscribers: jholewinski, arsenm, asbirlea

Differential Revision: https://reviews.llvm.org/D22592

llvm-svn: 276196
2016-07-20 22:11:36 +00:00
Adam Nemet 5b3a5cf6b0 [OptDiag,LV] Add hotness attribute to analysis remarks
The earlier change added hotness attribute to missed-optimization
remarks.  This follows up with the analysis remarks (the ones explaining
the reason for the missed optimization).

llvm-svn: 276192
2016-07-20 21:44:26 +00:00
David Majnemer bd21012c6c [GVNHoist] Don't hoist PHI nodes
We hoisted PHIs without respecting their special insertion point in the
block, leading to verfier errors.

This fixes PR28626.

llvm-svn: 276181
2016-07-20 21:05:01 +00:00
Davide Italiano 15ff2d6d0c [SCCP] Zap multiple return values.
We can replace the return values with undef if we replaced all
the call uses with a constant/undef.

Differential Revision:  https://reviews.llvm.org/D22336

llvm-svn: 276174
2016-07-20 20:17:13 +00:00
Justin Lebar a272c12b73 [LSV] Don't move stores across may-load instrs, and loosen restrictions on moving loads.
Summary:
Previously we wouldn't move loads/stores across instructions that had
side-effects, where that was defined as may-write or may-throw.  But
this is not sufficiently restrictive: Stores can't safely be moved
across instructions that may load.

This patch also adds a DEBUG check that all instructions in our chain
are either loads or stores.

Reviewers: asbirlea

Subscribers: llvm-commits, jholewinski, arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22547

llvm-svn: 276171
2016-07-20 20:07:37 +00:00
Justin Lebar 62b03e344e [LSV] Vectorize up to side-effecting instructions.
Summary:
Previously if we had a chain that contained a side-effecting
instruction, we wouldn't vectorize it at all.  Now we'll vectorize
everything that comes before the side-effecting instruction.

Reviewers: asbirlea

Subscribers: arsenm, jholewinski, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22536

llvm-svn: 276170
2016-07-20 20:07:34 +00:00
Sanjay Patel c0812702f8 minimize tests and auto-generate checks
llvm-svn: 276147
2016-07-20 17:58:20 +00:00
Benjamin Kramer b4d64cf27d Revert "[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))"
Makes InstCombine infloop when compiling v8.

This reverts commit r275989 and r276105.

llvm-svn: 276106
2016-07-20 11:40:16 +00:00
Tobias Grosser 8c6201b49f [InstCombine] Provide more test cases for cast-folding [NFC]
Summary: In r275989 we enabled the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))`. Here we add more test cases to assure this folding works for all logical operations `and`/`or`/`xor`.

Reviewers: grosser

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22561

Contributed-by: Matthias Reisinger
llvm-svn: 276105
2016-07-20 11:24:27 +00:00
Simon Pilgrim 1b4f511aaa [X86][SSE] Add cost model values for CTPOP of vectors
This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better.

Differential Revision: https://reviews.llvm.org/D22456

llvm-svn: 276104
2016-07-20 10:41:28 +00:00
David Majnemer a75736087d Forgot to add a test for r276008.
llvm-svn: 276082
2016-07-20 04:13:05 +00:00
Adam Nemet 67c8929a2c [LV] Add hotness attribute to missed-optimization remarks
The new OptimizationRemarkEmitter analysis pass is hooked up to both new
and old PM passes.

llvm-svn: 276080
2016-07-20 04:03:43 +00:00
Michael Zolotukhin 6bc56d552a Revert "Revert r275883 and r275891. They seem to cause PR28608."
This reverts commit r276064, and thus reapplies r275891 and r275883 with
a fix for PR28608.

llvm-svn: 276077
2016-07-20 01:55:27 +00:00
Justin Lebar 6114b37838 [LSV] Don't assume that loads/stores appear in address order in the BB.
Summary:
getVectorizablePrefix previously didn't work properly in the face of
aliasing loads/stores.  It unwittingly assumed that the loads/stores
appeared in the BB in address order.  If they didn't, it would do the
wrong thing.

Reviewers: asbirlea, tstellarAMD

Subscribers: arsenm, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22535

llvm-svn: 276072
2016-07-20 00:55:12 +00:00
Sean Silva 554efb28d2 Revert r275883 and r275891. They seem to cause PR28608.
Revert "[LoopSimplify] Update LCSSA after separating nested loops."

This reverts commit r275891.

Revert "[LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form."

This reverts commit r275883.

llvm-svn: 276064
2016-07-19 23:54:29 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Justin Lebar 8778c62629 [LSV] Insert stores at the right point.
Summary:
Previously, the insertion point for stores was the last instruction in
Chain *before calling getVectorizablePrefixEndIdx*.  Thus if
getVectorizablePrefixEndIdx didn't return Chain.size(), we still would
insert at the last instruction in Chain.

This patch changes our internal API a bit in an attempt to make it less
prone to this sort of error.  As a result, we end up recalculating the
Chain's boundary instructions, but I think worrying about the speed hit
of this is a premature optimization right now.

Reviewers: asbirlea, tstellarAMD

Subscribers: mzolotukhin, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D22534

llvm-svn: 276056
2016-07-19 23:19:20 +00:00
Justin Lebar d9446d3770 [LSV] Add detail to correct-order.ll test.
Summary:
This helps keep us honest -- there were a number of ways we could screw
up and still have passed this test.

Reviewers: asbirlea

Subscribers: llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22531

llvm-svn: 276053
2016-07-19 23:18:59 +00:00
Sanjay Patel d4ea94eb94 regenerate checks
llvm-svn: 276042
2016-07-19 22:32:15 +00:00
Sanjay Patel 2d477e59e8 [InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type
The pattern may look more obviously like a sext if written as:

  define i32 @g(i16 %x) {
    %zext = zext i16 %x to i32
    %xor = xor i32 %zext, 32768
    %add = add i32 %xor, -32768
    ret i32 %add
  }

We already have that fold in visitAdd().

Differential Revision: https://reviews.llvm.org/D22477

llvm-svn: 276035
2016-07-19 22:09:34 +00:00
Sanjay Patel 47c04f9543 add even more missing tests for simplifySelectBitTest()
llvm-svn: 276024
2016-07-19 20:47:00 +00:00
David Majnemer 5246e0b2c2 [FunctionAttrs] Correct the safety analysis for inference of 'returned'
We skipped over ReturnInsts which didn't return an argument which would
lead us to incorrectly conclude that an argument returned by another
ReturnInst was 'returned'.

This reverts commit r275756.

This fixes PR28610.

llvm-svn: 276008
2016-07-19 18:50:26 +00:00
David Majnemer 07ea344222 Add a testcase for r275581
llvm-svn: 276002
2016-07-19 17:52:41 +00:00
Sanjay Patel 8b76ebe5b8 add tests related to PR28466
llvm-svn: 275995
2016-07-19 17:07:35 +00:00
Sanjay Patel d2ff6d727f add missing test for simplifySelectBitTest()
llvm-svn: 275990
2016-07-19 16:49:55 +00:00
Tobias Grosser 1c38262279 [InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:

> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.

Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:

`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`

This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation

Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).

As an example, consider the following IR snippet

```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```

which would now be transformed to

```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```

This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.

Reviewers: grosser, vtjnash, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22511

Contributed-by: Matthias Reisinger
llvm-svn: 275989
2016-07-19 16:39:17 +00:00
Simon Pilgrim 0ea8d275cc [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.

A companion clang patch is at D22105

Differential Revision: https://reviews.llvm.org/D22106

llvm-svn: 275981
2016-07-19 15:07:43 +00:00
George Burgess IV 5f30897b7b [MemorySSA] Update to the new shiny walker.
This patch updates MemorySSA's use-optimizing walker to be more
accurate and, in some cases, faster.

Essentially, this changed our core walking algorithm from a
cache-as-you-go DFS to an iteratively expanded DFS, with all of the
caching happening at the end. Said expansion happens when we hit a Phi,
P; we'll try to do the smallest amount of work possible to see if
optimizing above that Phi is legal in the first place. If so, we'll
expand the search to see if we can optimize to the next phi, etc.

An iteratively expanded DFS lets us potentially quit earlier (because we
don't assume that we can optimize above all phis) than our old walker.
Additionally, because we don't cache as we go, we can now optimize above
loops.

As an added bonus, this patch adds a ton of verification (if
EXPENSIVE_CHECKS are enabled), so finding bugs is easier.

Differential Revision: https://reviews.llvm.org/D21777

llvm-svn: 275940
2016-07-19 01:29:15 +00:00
Wei Mi 79997a24d7 Recommit the patch "Use uniforms set to populate VecValuesToIgnore".
For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.

The change will make the vector RegUsages estimation less conservative.

Differential Revision: https://reviews.llvm.org/D20474

The recommit fixed the testcase global_alias.ll.

llvm-svn: 275936
2016-07-19 00:50:43 +00:00
Sanjoy Das ab73c9d88e [LoopReroll] Reroll loops with unordered atomic memory accesses
Reviewers: hfinkel, jfb, reames

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22385

llvm-svn: 275932
2016-07-19 00:23:54 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Teresa Johnson 2124157102 [PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.

Reviewers: mehdi_amini, davide

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D22475

llvm-svn: 275916
2016-07-18 21:22:24 +00:00
Wei Mi f9afff71a2 Revert rL275912.
llvm-svn: 275915
2016-07-18 21:14:43 +00:00
Wei Mi 1fd25726af Use uniforms set to populate VecValuesToIgnore.
For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.

The change will make the vector RegUsages estimation less conservative.

Differential Revision: https://reviews.llvm.org/D20474

llvm-svn: 275912
2016-07-18 20:59:53 +00:00
Sanjay Patel dbf44f5016 add tests for missed sext transform
llvm-svn: 275908
2016-07-18 20:37:51 +00:00
Sanjay Patel 8a2bf3099f auto-generate checks
llvm-svn: 275899
2016-07-18 20:06:51 +00:00
Michael Zolotukhin ea5b72825b [LoopSimplify] Update LCSSA after separating nested loops.
Summary:
Usually LCSSA survives this transformation, but in some cases (see
attached test) it doesn't: values from the original loop after
separating might be used from the outer loop. Before the transformation
it was the same loop, so LCSSA phis were not required.

This fixes PR28272.

Reviewers: sanjoy, hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21665

llvm-svn: 275891
2016-07-18 19:44:19 +00:00
Michael Zolotukhin 7a3040dc83 [LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form.
Summary:
SSAUpdate might insert PHI-nodes inside loops, which can break LCSSA
form unless we fix it up.

This fixes PR28424.

Reviewers: sanjoy, chandlerc, hfinkel

Subscribers: uabelho, llvm-commits

Differential Revision: http://reviews.llvm.org/D21997

llvm-svn: 275883
2016-07-18 19:05:08 +00:00
Adam Nemet d6ba0bf831 [LoopDist] This test does not require ASSERTS
Only its counterpart, diagnostics-with-hotness-lazy-BFI.ll, which
invokes opt with -debug-only=.

llvm-svn: 275812
2016-07-18 16:37:32 +00:00
Adam Nemet b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Alexander Kornienko 63dd36faa5 Revert "r275571 [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals"
Causes https://llvm.org/bugs/show_bug.cgi?id=28588

llvm-svn: 275801
2016-07-18 15:51:31 +00:00
Simon Pilgrim 1b2ab113fb [SLPVectorizer][X86] Added sqrt vectorization tests
llvm-svn: 275788
2016-07-18 13:20:54 +00:00
David Majnemer 04c7c225a1 [GVNHoist] Change the key for VNtoInsns to a pair
While debugging GVNHoist, I found it confusing that the entries in a
VNtoInsns were not always value numbers.  They _usually_ were except for
StoreInst in which case they were a hash of two different value numbers.

This leads to two observations:
- It is more difficult to debug things when the semantic contents of
  VNtoInsns changes over time.
- Using a single value number is not much cheaper, the value of
  VNtoInsns is a SmallVector.
- It is not immediately clear what the algorithm would do if there were
  hash collisions in the StoreInst case.

Using a DenseMap of std::pair sidesteps all of this.

N.B.  The changes in the test were due their sensitivity to the
iteration order of VNtoInsns which has changed.

llvm-svn: 275761
2016-07-18 06:11:37 +00:00
NAKAMURA Takumi 966bde50c3 Revert r275678, "Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute""
This reverts also r275029, "Update Clang tests after adding inference for the returned argument attribute"

It broke LTO build. Seems miscompilation.

llvm-svn: 275756
2016-07-18 03:23:25 +00:00
Davide Italiano 4edd54794b [GVN] Move other PRE tests to a subdirectory.
llvm-svn: 275742
2016-07-17 23:55:20 +00:00
Davide Italiano ed8e0881c1 [GVN] Move the PRE/LOADPRE test in a subdirectory.
llvm-svn: 275741
2016-07-17 23:48:18 +00:00
Davide Italiano 6a69f829bd [GVN] Use FileCheck instead of grep for tests.
llvm-svn: 275739
2016-07-17 23:21:26 +00:00
Teresa Johnson cd21a646f6 [ThinLTO] Perform profile-guided indirect call promotion
Summary:
To enable profile-guided indirect call promotion in ThinLTO mode, we
simply add call graph edges for each profitable target from the profile
to the summaries, then the summary-guided importing will consider the
callee for importing as usual.

Also we need to enable the indirect call promotion pass creation in the
PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO
backend), so that the newly imported functions are considered for
promotion in the backends.

The IC promotion profiles refer to callees by GUID, which required
adding GUIDs to the per-module VST in bitcode (and assigning them
valueIds similar to how they are assigned valueIds in the combined
index).

Reviewers: mehdi_amini, xur

Subscribers: mehdi_amini, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21932

llvm-svn: 275707
2016-07-17 14:47:01 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Sanjay Patel 79acd2a96b [InstCombine] allow X + signbit --> X ^ signbit for vector splats
llvm-svn: 275691
2016-07-16 18:29:26 +00:00
Sanjay Patel 040bd16e56 add vector test to show missing transform
llvm-svn: 275690
2016-07-16 18:24:18 +00:00
Sanjay Patel eb50476f77 update tests to use FileCheck, consolidate tests, fix comments
llvm-svn: 275688
2016-07-16 18:08:22 +00:00
Sanjay Patel e540a31f59 update test to use FileCheck
llvm-svn: 275687
2016-07-16 16:31:58 +00:00
Sanjay Patel 972a53fb42 auto-generate checks
llvm-svn: 275686
2016-07-16 16:27:58 +00:00
Sanjay Patel 309f98ef3a auto-ggenerate checks
llvm-svn: 275685
2016-07-16 16:24:06 +00:00
Sanjay Patel f9d2b20daf [InstCombine] reassociate logic ops with constants separated by a zext
This is a partial implementation of a general fold for associative+commutative operators:
(op (cast (op X, C2)), C1) --> (cast (op X, op (C1, C2)))
(op (cast (op X, C2)), C1) --> (op (cast X), op (C1, C2))

There are 7 associative operators and 13 cast types, so this could potentially go a lot further.

Differential Revision: https://reviews.llvm.org/D22421

llvm-svn: 275684
2016-07-16 15:20:19 +00:00
Hal Finkel 660096b260 Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute"
This reverts commit r275042; the initial commit triggered self-hosting failures
on ARM/AArch64. James Molloy identified the problematic backend code, which has
been disabled in r275677. Trying again...

Original commit message:

Let FuncAttrs infer the 'returned' argument attribute

A function can have one argument with the 'returned' attribute, indicating that
the associated argument is always the return value of the function. Add
FuncAttrs inference logic.

llvm-svn: 275678
2016-07-16 07:21:28 +00:00
Matt Arsenault 93be6e8c0a StructurizeCFG: Fix inverting constantexpr conditions
llvm-svn: 275626
2016-07-15 22:13:16 +00:00
Jingyue Wu 2b353a9522 [ReassociateGEP] Update tests to allow missing "inbounds" on certain GEPs.
With r275532 fixing miscompilation of GVN, "inbounds" on certain GEPs in these
tests cannot be preserved any more. Left a TODO in the tests for future
reference.

llvm-svn: 275596
2016-07-15 18:47:17 +00:00
Sanjay Patel 27fefb2fcf add tests for associative ops blocked by a cast
These are more generalized versions of the cases added in
r275302 and r275297.

llvm-svn: 275594
2016-07-15 18:39:02 +00:00
Rong Xu 96a19d35ae [PGO] IRPGO pre-cleanup pass changes
This patch adds a selected set of cleanup passes including a pre-inline pass
before LLVM IR PGO instrumentation. The inline is only intended to apply those
obvious/trivial ones before instrumentation so that much less instrumentation
is needed to get better profiling information. This will drastically improve
the instrumented code performance for large C++ applications. Another benefit
is the context sensitive counts that can potentially improve the PGO
optimization.

Differential Revision: http://reviews.llvm.org/D21405

llvm-svn: 275588
2016-07-15 18:10:49 +00:00
Adam Nemet aad816083e [OptRemark,LDist] RFC: Add hotness attribute
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334

This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution.  My goal is to shake out the design issues before scaling
it up to other types and passes.

Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count.  It's only printed in opt
currently since clang prints the diagnostic fields directly.  E.g.:

  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)

A new API added is similar to emitOptimizationRemarkMissed.  The
difference is that it additionally takes a code region that the
diagnostic corresponds to.  From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI.  (Thanks to Hal for the analysis pass idea.)

This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context.  If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.

A new command-line option is added to turn this on in opt.

My plan is to switch all user of emitOptimizationRemark* to use this
module instead.

Reviewers: hfinkel

Subscribers: rcox2, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21771

llvm-svn: 275583
2016-07-15 17:23:20 +00:00
Jun Bum Lim a5737d8eac [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

Reviewers: hfinkel, eeckstein, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D21909

llvm-svn: 275571
2016-07-15 16:14:34 +00:00
Sebastian Pop 4177480aad code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275561
2016-07-15 13:45:20 +00:00
David Majnemer 959a6623b5 XFAIL two SeparateConstOffsetFromGEP tests
They appear to have relied on bugs hidden in copyIRFlags/andIRFlags.

This has been filed as PR28564.

llvm-svn: 275533
2016-07-15 05:37:22 +00:00
David Majnemer 92f84ccf0f [IR] andIRFlags and copyIRFlags needs to handle GEP
We didn't consider the inbounds flag on GEPs leading to downstream users
introducing UB.

This fixes PR28562.

llvm-svn: 275532
2016-07-15 05:02:31 +00:00
Adam Nemet 74730d9ab0 [LoopDist] Fix typo in diagnostic
llvm-svn: 275495
2016-07-14 22:33:46 +00:00
Ekaterina Romanova 7aea5906c0 [GVN] Fold constant expression in GVN.
Fix for PR 28418.

opt never finishes compiling a test when -gvn option is passed.
The problem is caused by the fact that GVN fails to fold a constant expression.

Differential Revision: https://reviews.llvm.org/D22185

llvm-svn: 275483
2016-07-14 22:02:25 +00:00
Matthew Simpson 65ca32b83c [LV] Allow interleaved accesses in loops with predicated blocks
This patch allows the formation of interleaved access groups in loops
containing predicated blocks. However, the predicated accesses are prevented
from forming groups.

Differential Revision: https://reviews.llvm.org/D19694

llvm-svn: 275471
2016-07-14 20:59:47 +00:00
Sanjoy Das 13623ad009 [JumpThreading] PRE unordered loads
Summary: Extend JumpThreading's PRE to unordered atomic loads.

Reviewers: hfinkel, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D22326

llvm-svn: 275456
2016-07-14 19:21:15 +00:00
Jun Bum Lim c837af306e [PM] Port Dead Loop Deletion Pass to the new PM
Summary: Port Dead Loop Deletion Pass to the new pass manager.

Reviewers: silvas, davide

Subscribers: llvm-commits, sanjoy, mcrosier

Differential Revision: https://reviews.llvm.org/D21483

llvm-svn: 275453
2016-07-14 18:28:29 +00:00
Nico Weber 755cd760cd Revert r275401, it caused PR28551.
llvm-svn: 275420
2016-07-14 14:41:25 +00:00
Matthew Simpson 3c3b4a257b [LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions
This patch prevents increases in the number of instructions, pre-instcombine,
due to induction variable scalarization. An increase in instructions can lead
to an increase in the compile-time required to simplify the induction
variables. We now maintain a new map for scalarized induction variables to
prevent us from converting between the scalar and vector forms.

This patch should resolve compile-time regressions seen after r274627.

llvm-svn: 275419
2016-07-14 14:36:06 +00:00
Sjoerd Meijer 716abbb2f5 This converts a signed remainder instruction to unsigned remainder, which
enables the code size optimisation to fold a rem and div into a single
aeabi_uidivmod call. This was not happening before because sdiv was converted
but srem not, and instructions with different signedness are not combined.

Differential Revision: http://reviews.llvm.org/D22214

llvm-svn: 275403
2016-07-14 12:23:48 +00:00
Sebastian Pop 63847d04e7 code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275401
2016-07-14 12:18:53 +00:00
Sjoerd Meijer 38c2cd0c14 This implements a more optimal algorithm for selecting a base constant in
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.

Differential Revision: http://reviews.llvm.org/D21183

llvm-svn: 275382
2016-07-14 07:44:20 +00:00
David Majnemer 666aa945a5 [InstCombine] Masked loads with undef masks can fold to normal loads
We were able to fold masked loads with an all-ones mask to a normal
load.  However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.

llvm-svn: 275380
2016-07-14 06:58:42 +00:00
David Majnemer 17a95aaa7b Simplify llvm.masked.load w/ undef masks
We can always pick the passthru value if the mask is undef: we are
permitted to treat the mask as-if it were filled with zeros.

llvm-svn: 275379
2016-07-14 06:58:37 +00:00
Davide Italiano 7dac027ed7 [IPSCCP] Constant fold struct argument/instructions when all the lattice values are constant.
This now should also work with the interprocedural variant of the pass.
Slightly easier now that the yak is shaved.

Differential Revision:   http://reviews.llvm.org/D22329

llvm-svn: 275363
2016-07-14 02:51:41 +00:00
Mehdi Amini 8484f92f7f [Scalarizer] PR28108: Skip over nullptr rather than crashing on it.
Summary:
In Scalarizer::gather we see if we already have a scattered form of Op,
and in that case use the new form.

In the particular case of PR28108, the found ValueVector SV has size 2,
where the first Value is nullptr, and the second is indeed a proper Value.
The nullptr then caused an assert to blow when we tried to do
cast<Instruction>(SV[I]).

With this patch we check SV[I] before doing the cast, and if it's nullptr
we just skip over it.

I don't know the Scalarizer well enough to know if this is the best fix
or if something should be done else where to prevent the nullptr from
being in the ValueVector at all, but at least this avoids the crash
and looking at the test case output it looks reasonable.

Reviewers: hfinkel, frasercrmck, wala, mehdi_amini

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21518

llvm-svn: 275359
2016-07-14 01:31:25 +00:00
David Majnemer 7f781aba97 [ConstantFolding] Fold masked loads
We can constant fold a masked load if the operands are appropriately
constant.

Differential Revision: http://reviews.llvm.org/D22324

llvm-svn: 275352
2016-07-14 00:29:50 +00:00
David Majnemer f89660aba7 [ConstantFolding] Extend FoldReinterpretLoadFromConstPtr to handle negative offsets
Treat loads which clip before the start of a global initializer the same
way we treat clipping beyond the end of the initializer: use zeros.

llvm-svn: 275345
2016-07-13 23:33:07 +00:00
Alina Sbirlea 640a61cd8b Extended LoadStoreVectorizer to vectorize subchains.
Summary:
LSV used to abort vectorizing a chain for interleaved load/store accesses that alias.
Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain.

Reviewers: llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22119

llvm-svn: 275317
2016-07-13 21:20:01 +00:00
Andrew Kaylor 346dd7f1bd Reverting r275284 due to platform-specific test failures
llvm-svn: 275304
2016-07-13 19:09:16 +00:00
Sanjay Patel eff2aa70fc add more tests for zexty xor sandwiches
...mmm sandwiches

llvm-svn: 275302
2016-07-13 18:58:55 +00:00
Sanjay Patel 904a88025a add test for zexty xor sandwich
llvm-svn: 275297
2016-07-13 18:40:38 +00:00
Sanjay Patel c00e48a3db [InstCombine] extend vector select matching for non-splat constants
In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean
way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the
direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could
move that helper function to a higher level.

There is an open question as to which is of these forms should be considered the canonical IR:
  %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
  %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>

Differential Revision: http://reviews.llvm.org/D22114

llvm-svn: 275289
2016-07-13 18:07:02 +00:00
Andrew Kaylor 12cccdd731 Fix for Bug 26903, adds support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 275284
2016-07-13 17:25:11 +00:00
David Majnemer 1b3db33e3d [ConstantFolding] Don't treat negative GEP offsets as positive
GEP offsets are signed, don't treat them as huge positive numbers.

llvm-svn: 275251
2016-07-13 05:16:16 +00:00
Dehao Chen 9cba1f4e7e New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275222
2016-07-12 22:37:48 +00:00
Michael Kuperstein a99c46cc73 [LV] Remove wrong assumption about LCSSA
The LCSSA pass itself will not generate several redundant PHI nodes in a single
exit block. However, such redundant PHI nodes don't violate LCSSA form, and may
be introduced by passes that preserve LCSSA, and/or preserved by the LCSSA pass
itself. So, assuming a single PHI node per exit block is not safe.

llvm-svn: 275217
2016-07-12 21:24:06 +00:00
Davide Italiano 0080269342 [SCCP] Constant fold structs if all the lattice value are constant.
Differential Revision:   http://reviews.llvm.org/D22269

llvm-svn: 275208
2016-07-12 19:54:19 +00:00
Dehao Chen b9f8e29290 [PM] Port LoopIdiomRecognize Pass to new PM
Summary: Port LoopIdiomRecognize Pass to new PM

Reviewers: davidxl

Subscribers: davide, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D22250

llvm-svn: 275202
2016-07-12 18:45:51 +00:00
Xinliang David Li 9eb472ba4b [PGO] Don't include full file path in static function profile counter names
Patch by Jake VanAdrighem
Differential Revision: http://reviews.llvm.org/D22028

llvm-svn: 275193
2016-07-12 17:14:51 +00:00
Sanjay Patel 4a6a751dce add tests for missing DeMorgan's Law folds
llvm-svn: 275192
2016-07-12 17:05:04 +00:00
Sanjay Patel 3900191ecc auto-generate checks
llvm-svn: 275188
2016-07-12 16:21:55 +00:00
Sanjay Patel 93dffe629a auto-generate checks
llvm-svn: 275187
2016-07-12 16:17:30 +00:00
Sanjay Patel 6d1f227e6b auto-generate checks
llvm-svn: 275186
2016-07-12 16:13:04 +00:00
Vitaly Buka 204dc533c5 Revert "New pass manager for LICM."
Summary: This reverts commit r275118.

Subscribers: sanjoy, mehdi_amini

Differential Revision: http://reviews.llvm.org/D22259

llvm-svn: 275156
2016-07-12 06:25:32 +00:00
Ivan Krasin 5474645dc8 Print remarks from WholeProgramDevirt pass for each call site.
Summary:
It's useful to have some visibility about which call sites are devirtualized,
especially for debug purposes. Another use case is a regression test on the
application side (like, Chromium).

Reviewers: pcc

Differential Revision: http://reviews.llvm.org/D22252

llvm-svn: 275145
2016-07-12 02:38:37 +00:00
Dehao Chen 7ef5820fa3 New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275118
2016-07-11 22:45:24 +00:00
Alina Sbirlea cbc6ac2afd Correct ordering of loads/stores.
Summary:
Aiming to correct the ordering of loads/stores. This patch changes the
insert point for loads to the position of the first load.
It updates the ordering method for loads to insert before, rather than after.

Before this patch the following sequence:
"load a[1], store a[1], store a[0], load a[2]"
Would incorrectly vectorize to "store a[0,1], load a[1,2]".
The correctness check was assuming the insertion point for loads is at
the position of the first load, when in practice it was at the last
load. An alternative fix would have been to invert the correctness check.
The current fix changes insert position but also requires reordering of
instructions before the vectorized load.

Updated testcases to reflect the changes.

Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22071

llvm-svn: 275117
2016-07-11 22:34:29 +00:00
Michael Kuperstein f0c59330e9 [X86] Make some cast costs more precise
Make some AVX and AVX512 cast costs more precise.
Based on part of a patch by Elena Demikhovsky (D15604).

Differential Revision: http://reviews.llvm.org/D22064

llvm-svn: 275106
2016-07-11 21:39:44 +00:00
Alina Sbirlea 327955e057 Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.

Reviewers: llvm-commits, jlebar

Subscribers: arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21935

llvm-svn: 275100
2016-07-11 20:46:17 +00:00
Jingyue Wu 641cfee976 [SLSR] Call getPointerSizeInBits with the correct address space.
llvm-svn: 275083
2016-07-11 18:13:28 +00:00
Davide Italiano e8ae0b5eb4 [PM/IPO] Port LowerTypeTests to the new PassManager.
There's a little bit of churn in this patch because the initialization
mechanism is now shared between the old and the new PM. Other than
that, it's just a pretty mechanical translation.

llvm-svn: 275082
2016-07-11 18:10:06 +00:00
Dehao Chen 9232f98279 Implement callsite-hotness based inline cost for Sample-based PGO
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.

E.g.

if (A1 && A2 && A3 && ..... && A10) {
  for (i=0; i < 100000000; i++) {
    callsite();
  }
}

Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.

In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.

Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.

Reviewers: davidxl, eraman, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22118

llvm-svn: 275073
2016-07-11 16:48:54 +00:00
Dehao Chen 29d2641f52 Tune the weight propagation algorithm for sample profile.
Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22180

llvm-svn: 275072
2016-07-11 16:40:17 +00:00
Nicolai Haehnle 889a20cf40 [Sink] Don't move calls to readonly functions across stores
Summary:

Reviewers: hfinkel, majnemer, tstellarAMD, sunfish

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17279

llvm-svn: 275066
2016-07-11 14:11:51 +00:00
Hal Finkel 02012bcfee Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute
Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot.

llvm-svn: 275042
2016-07-11 04:51:23 +00:00
Hal Finkel 2cac58f604 Pointer-comparison folding should look through returned-argument functions
For functions which are known to return a specific argument, pointer-comparison
folding can look through the function calls as part of its analysis.

Differential Revision: http://reviews.llvm.org/D9387

llvm-svn: 275039
2016-07-11 03:37:59 +00:00
Hal Finkel 6fd5e1f02b Teach computeKnownBits to look through returned-argument functions
If a function is known to return one of its arguments, we can use that in order
to compute known bits of the return value.

Differential Revision: http://reviews.llvm.org/D9397

llvm-svn: 275036
2016-07-11 02:25:14 +00:00
Hal Finkel d66a7b05db Let FuncAttrs infer the 'returned' argument attribute
A function can have one argument with the 'returned' attribute, indicating that
the associated argument is always the return value of the function. Add
FuncAttrs inference logic.

Differential Revision: http://reviews.llvm.org/D22202

llvm-svn: 275027
2016-07-10 22:02:55 +00:00
Sean Silva db90d4d9c1 [PM] Port LoopVectorize to the new PM.
llvm-svn: 275000
2016-07-09 22:56:50 +00:00
Jingyue Wu debce55ac3 [SLSR] Fix crash on handling 128-bit integers.
ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call
getSExtValue only on narrow integers.

As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions.

llvm-svn: 274982
2016-07-09 19:13:18 +00:00
Davide Italiano 92b933a55c [PM] Port CrossDSOCFI to the new pass manager.
llvm-svn: 274962
2016-07-09 03:25:35 +00:00
Davide Italiano cd96cfd8df [PM] Port LoopSimplify to the new pass manager.
While here move simplifyLoop() function to the new header, as
suggested by Chandler in the review.

Differential Revision:  http://reviews.llvm.org/D21404

llvm-svn: 274959
2016-07-09 03:03:01 +00:00
Piotr Padlewski 3b77612839 Add 'thinlto_src_module' md with asserts or -enable-import-metadata
Summary:
This way the metadata will be only generated when asserts enabled,
or when -enable-import-metadata specified

FIXED missing colon on requires.

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D22167

llvm-svn: 274947
2016-07-08 23:01:49 +00:00
Piotr Padlewski d4b792346c Revert "Add 'thinlto_src_module' md with asserts or -enable-import-metadata"
Reverting because of 17463
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17463

This reverts commit d20cb431bba2ba43b4c65a8556cff445bfefbb7c.

llvm-svn: 274946
2016-07-08 22:55:48 +00:00
Anna Thomas 9ad45adfd7 Revert "InstCombine rule to fold truncs whose value is available"
This reverts commit r274853.
Caused failure in ppcBE build

llvm-svn: 274943
2016-07-08 22:15:08 +00:00
Piotr Padlewski d6efefa2b8 Add 'thinlto_src_module' md with asserts or -enable-import-metadata
Summary:
This way the metadata will be only generated when asserts enabled,
or when -enable-import-metadata specified

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D22167

llvm-svn: 274938
2016-07-08 21:25:39 +00:00
Sanjay Patel 664514f7fe [InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use
This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), 
but we'll try to err on the conservative side by going with the case that has
less IR instructions.

Note: This question came up in http://reviews.llvm.org/D22114 , but this part is
independent of that patch proposal, so I'm making this small change ahead of that
one. 

See also:
http://reviews.llvm.org/rL274926

llvm-svn: 274932
2016-07-08 21:17:51 +00:00
Sanjay Patel 5246482c7a add another multi-use test for logic->select transform
llvm-svn: 274929
2016-07-08 21:08:16 +00:00
Sanjay Patel f4a08ede03 [InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops
llvm-svn: 274926
2016-07-08 20:53:29 +00:00
Sanjay Patel 297a0e67b6 adjust test so it won't completely optimize away
llvm-svn: 274925
2016-07-08 20:35:53 +00:00
Sanjay Patel 0733e6b61c add tests for multi-use folding to select
llvm-svn: 274922
2016-07-08 20:22:27 +00:00
Dehao Chen 429f5c735f Remove inline hints computation from SampleProfile.cpp
Summary: As we will move to use uniformed hotness check in inliner, we do not need inline hints in SampleProfile pass any more.

Reviewers: dnovillo, davidxl

Subscribers: eraman, llvm-commits

Differential Revision: http://reviews.llvm.org/D19287

llvm-svn: 274918
2016-07-08 20:12:44 +00:00
Davide Italiano d555bde59f [SCCP] Fold constants as we build them whne visiting cast instructions.
This should be slightly more efficient and could avoid spurious overdefined
markings, as Eli pointed out.

Differential Revision:  http://reviews.llvm.org/D22122

llvm-svn: 274905
2016-07-08 19:13:40 +00:00
Sanjay Patel 1b6b824548 [InstCombine] check for one-use before turning simple logic op into a select
llvm-svn: 274891
2016-07-08 17:26:47 +00:00
Simon Pilgrim 4ca42e232d [SLPVectorizer][X86] Added fma vectorization tests
llvm-svn: 274889
2016-07-08 17:19:13 +00:00
Sanjay Patel 910ce0d511 add test to show multi-use output
llvm-svn: 274887
2016-07-08 17:12:27 +00:00
Sanjay Patel cbfca9e8ef [InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors
llvm-svn: 274883
2016-07-08 17:01:15 +00:00
Sanjay Patel 647174c8a4 add vector tests to show missing transform
llvm-svn: 274876
2016-07-08 16:39:53 +00:00
Sanjay Patel 46df968326 minimize tests
The cmp and load aren't required.

llvm-svn: 274864
2016-07-08 16:11:48 +00:00
Sanjay Patel e1acad9b61 regenerate checks
llvm-svn: 274860
2016-07-08 16:06:38 +00:00
Anna Thomas 3124f6273a InstCombine rule to fold truncs whose value is available
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.

This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.

Differential Revision: http://reviews.llvm.org/D21791

llvm-svn: 274853
2016-07-08 15:18:56 +00:00
Simon Pilgrim 4f1877fb57 [X86][SSE] Improve constant folding tests for CVTSD/CVTSS/CVTTSD/CVTTSS
As discussed on D22106, improve the testing for constant folding sse scalar conversion intrinsics to ensure we are correctly handling special/out of range cases

llvm-svn: 274846
2016-07-08 13:28:34 +00:00
Davide Italiano 16284df8ec [PM] Port InstSimplify to the new pass manager.
llvm-svn: 274796
2016-07-07 21:14:36 +00:00
Anna Thomas 6a78c78a03 [DSE] Remove dead stores in end blocks containing fence
We can remove dead stores in the presence of fence instructions. Fence
does not change an otherwise thread local store to visible.

reviewers: reames, dexonsmith, jfb
Differential Revision: http://reviews.llvm.org/D22001

llvm-svn: 274795
2016-07-07 20:51:42 +00:00
Sjoerd Meijer 17c08dc701 Code size optimisation: don't rewrite fputs to fwrite when optimising for size
because fwrite requires more arguments and thus extra MOVs are required.

llvm-svn: 274753
2016-07-07 13:56:23 +00:00
David Majnemer 7afb46d3c8 [LoopAccessAnalysis] Fix an integer overflow
We were inappropriately using 32-bit types to account for quantities
that can be far larger.

Fixed in PR28443.

llvm-svn: 274737
2016-07-07 06:24:36 +00:00
Elena Demikhovsky fc1e969dfc Fixed a bug in vectorizing GEP before gather/scatter intrinsic.
Vectorizing GEP was incorrect and broke SSA in some cases.
 
The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997.

Differential revision: http://reviews.llvm.org/D22035

llvm-svn: 274735
2016-07-07 06:06:46 +00:00
Sean Silva 59fe82f4ce [PM] Port TailCallElim
llvm-svn: 274708
2016-07-06 23:48:41 +00:00
Sean Silva b025d375a1 [PM] Port CorrelatedValuePropagation
llvm-svn: 274705
2016-07-06 23:26:29 +00:00
Sanjay Patel 65a51c25c1 [InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.

Differential Revision: http://reviews.llvm.org/D21899

llvm-svn: 274696
2016-07-06 22:23:01 +00:00
Chad Rosier 232e29ebea [MemorySSA] Reinstate the legacy printer and verifier.
Differential Revision: http://reviews.llvm.org/D22058

llvm-svn: 274679
2016-07-06 21:20:47 +00:00
Haicheng Wu a95cd1267f [LIR] Fix mis-compilation with unwinding.
To fix PR27859, bail out if there is an instruction may throw.

Differential Revision: http://reviews.llvm.org/D20638

llvm-svn: 274673
2016-07-06 21:05:40 +00:00
Piotr Padlewski 6deaa6afae Add 'thinlto_src_module' metadata to imported function
Added metadata to be able to make statistics on how many functions
that have been imported have been removed. Also module name might
be helpfull when debugging.

Reviewers: tejohnson, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D21943

llvm-svn: 274668
2016-07-06 20:26:25 +00:00
Justin Bogner a463537a36 NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0
Everywhere where cuda.syncthreads or __syncthreads is used, use the
properly namespaced nvvm.barrier0 instead.

llvm-svn: 274664
2016-07-06 20:02:45 +00:00
Chad Rosier dcfce2d0ec [DSE] Avoid iterator invalidation bugs.
The dse_with_dbg_value.ll test committed with r273141 is removed because this
we no longer performs any type of back tracking, which is what was causing the
codegen differences with and without debug information.

Differential Revision: http://reviews.llvm.org/D21613

llvm-svn: 274660
2016-07-06 19:48:52 +00:00
Sean Silva f50d4b6cdc Work around PR28400 a bit harder.
We were still crashing in the "no change" case because LVI was not
getting invalidated.

See the thread "Should analyses be able to hold AssertingVH to IR?
(related to PR28400)" for more discussion.

llvm-svn: 274656
2016-07-06 19:05:41 +00:00
Michael Kuperstein aa71bdd3af [TTI] The cost model should not assume vector casts get completely scalarized
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast  gets split, the resulting casts
end up legal and cheap.

Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.

Differential Revision: http://reviews.llvm.org/D21251

llvm-svn: 274642
2016-07-06 17:30:56 +00:00
Matthew Simpson 433cb1dfe3 [LV] Don't widen trivial induction variables
We currently always vectorize induction variables. However, if an induction
variable is only used for counting loop iterations or computing addresses with
getelementptr instructions, we don't need to do this. Vectorizing these trivial
induction variables can create vector code that is difficult to simplify later
on. This is especially true when the unroll factor is greater than one, and we
create vector arithmetic when computing step vectors. With this patch, we check
if an induction variable is only used for counting iterations or computing
addresses, and if so, scalarize the arithmetic when computing step vectors
instead. This allows for greater simplification.

This patch addresses the suboptimal pointer arithmetic sequence seen in
PR27881.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27881
Differential Revision: http://reviews.llvm.org/D21620

llvm-svn: 274627
2016-07-06 14:26:59 +00:00
Elena Demikhovsky 971fbfda1e Vector GEP test: renamed + some comments
Differential revision: http://reviews.llvm.org/D21957

llvm-svn: 274611
2016-07-06 08:11:23 +00:00
Daniel Berlin fc7e651bfd Fix handling of forward unreachable but reverse-reachable blocks in MemorySSA construction
llvm-svn: 274606
2016-07-06 05:32:05 +00:00
Sanjay Patel cbaac41856 [InstCombine] enable vector select of bools -> logic folds
llvm-svn: 274465
2016-07-03 14:34:39 +00:00
Sanjay Patel 42396ae0ea add vector bool select tests and regenerate checks for scalar bool select tests
llvm-svn: 274460
2016-07-03 13:26:02 +00:00
Sean Silva 45835e731d Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

llvm-svn: 274455
2016-07-02 23:47:27 +00:00
Michael Kuperstein 071d8306b0 [PM] Port ConstantHoisting to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D21945

llvm-svn: 274411
2016-07-02 00:16:47 +00:00
Alina Sbirlea 8d8aa5dd6c Address two correctness issues in LoadStoreVectorizer
Summary:
GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundry set is not being tested by isVectorizable().
Partially solve reordering of instructions. More extensive solution to follow.

Reviewers: tstellarAMD, llvm-commits, jlebar

Subscribers: escha, arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21934

llvm-svn: 274389
2016-07-01 21:44:12 +00:00
Duncan P. N. Exon Smith e60719b3fa Revert "add tests for bugs fixed by the GVN hoist pass"
This reverts commit r274327 since the tests fail.  E.g.:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17240

It looks like this commit is building on r274305, but that commit caused
a miscompile and was reverted in r274320.

llvm-svn: 274332
2016-07-01 04:55:13 +00:00
Sebastian Pop 196ba4f844 add tests for bugs fixed by the GVN hoist pass
https://llvm.org/bugs/show_bug.cgi?id=20242
https://llvm.org/bugs/show_bug.cgi?id=22005

llvm-svn: 274327
2016-07-01 03:03:19 +00:00
Matt Arsenault 0101ecade0 LoadStoreVectorizer: Don't increase alignment with no align set
If no alignment was set on the load/stores, it would vectorize
to the new type even though this increases the default alignment.

llvm-svn: 274323
2016-07-01 02:09:38 +00:00
Matt Arsenault 370e8226c7 LoadStoreVectorizer: Check TTI for vec reg bit width
llvm-svn: 274322
2016-07-01 02:07:22 +00:00
Matt Arsenault 42ad17059a LoadStoreVectorizer: Fix assert when merging pointer ops
This needs to use inttoptr/ptrtoint if combining an int and pointer
load. If a pointer is used always do an integer load.

llvm-svn: 274321
2016-07-01 01:55:52 +00:00
Duncan P. N. Exon Smith 9d1f156418 Revert "code hoisting pass based on GVN"
This reverts commit r274305, since it breaks self-hosting:
  http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232

Note that the blamelist on lab.llvm.org:8011 is incorrect.  The previous
build was r274299, but somehow r274305 wasn't included in the blamelist:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules

llvm-svn: 274320
2016-07-01 01:51:40 +00:00
Matt Arsenault 241f34cde8 LoadStoreVectorizer: Use AA metadata
This was not passing the full instruction with metadata
to the alias query.

llvm-svn: 274318
2016-07-01 01:47:46 +00:00
Matt Arsenault d7e8898bdd LoadStoreVectorizer: if one element of a vector is integer, default to
integer.

Fixes issues on some architectures where we use arithmetic ops to build
vectors, which can cause bad things to happen for loads/stores of mixed
types.

Patch by Fiona Glaser

llvm-svn: 274307
2016-07-01 00:37:01 +00:00
Matt Arsenault 8a4ab5e19f LoadStoreVectorizer: Fix crashes on sub-byte types
llvm-svn: 274306
2016-07-01 00:36:54 +00:00
Sebastian Pop 5c5798c57c code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 274305
2016-07-01 00:24:31 +00:00
Matt Arsenault 079d0f19a2 LoadStoreVectorizer: Check skipFunction first.
Also add test I forgot to add to r274296.

llvm-svn: 274299
2016-06-30 23:50:18 +00:00
Matt Arsenault 2cbe52b990 LoadStoreVectorizer: Skip optnone functions
llvm-svn: 274296
2016-06-30 23:30:29 +00:00
Matt Arsenault 08debb0244 Add LoadStoreVectorizer pass
This was contributed by Apple, and I've been working on
minimal cleanups and generalizing it.

llvm-svn: 274293
2016-06-30 23:11:38 +00:00
Matt Arsenault 727e279ac4 SLPVectorizer: Move propagateMetadata to VectorUtils
This will be re-used by the LoadStoreVectorizer.

Fix handling of range metadata and testcase by Justin Lebar.

llvm-svn: 274281
2016-06-30 21:17:59 +00:00
Wei Mi 95685faeee Refine the set of UniformAfterVectorization instructions.
Except the seed uniform instructions (conditional branch and consecutive ptr
instructions), dependencies to be added into uniform set should only be used
by existing uniform instructions or intructions outside of current loop.

Differential Revision: http://reviews.llvm.org/D21755

llvm-svn: 274262
2016-06-30 18:42:56 +00:00
Jun Bum Lim 596a3bd9ec [DSE] Fix bug in partial overwrite tracking
Summary:
Found cases where DSE incorrectly add partially-overwritten intervals.
Please see the test case for details.

Reviewers: mcrosier, eeckstein, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D21859

llvm-svn: 274237
2016-06-30 15:32:20 +00:00
Sanjay Patel 7c6eab5777 [InstCombine] shrink switch conditions better (PR24766)
https://llvm.org/bugs/show_bug.cgi?id=24766#c2

This removes a hack that was added for the benefit of x86 codegen. 
It prevented shrinking the switch condition even to smaller legal (DataLayout) types.
We have a safety mechanism in CGP after:
http://reviews.llvm.org/rL251857
...so we're free to use the optimal (smallest) IR type now.

Differential Revision: http://reviews.llvm.org/D12965

llvm-svn: 274233
2016-06-30 14:51:21 +00:00
Sanjay Patel 7ad98babfa [InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types
If the incoming types are i1, then we don't have to pattern match any sext ops.

Differential Revision: http://reviews.llvm.org/D21740

llvm-svn: 274228
2016-06-30 14:18:18 +00:00
Sanjay Patel 348111f4b9 add vector tests to show missing transform
llvm-svn: 274192
2016-06-30 00:09:13 +00:00
Sanjay Patel c3701e8b92 regenerate checks
llvm-svn: 274188
2016-06-29 23:58:39 +00:00
Evgeniy Stepanov a5da256f92 StackColoring for SafeStack.
This is a fix for PR27842.

An IR-level implementation of stack coloring tailored to work with
SafeStack. It is a bit weaker than the MI implementation in that it
does not the "lifetime start at first access" logic. This can be
improved in the future.

This patch also replaces the naive implementation of stack frame
layout with a greedy algorithm that can split existing stack slots
and even fit small objects inside the alignment padding of other
objects.

llvm-svn: 274162
2016-06-29 20:37:43 +00:00
Tim Shen aec68b263d [InstCombine] Simplify and correct folding fcmps with the same children
Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp *, x, y) | (fcmp *, x, y) and (fcmp *, x, y) & (fcmp *, x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants.

Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that.

Reviewers: spatel

Subscribers: llvm-commits, iteratee, echristo

Differential Revision: http://reviews.llvm.org/D21775

llvm-svn: 274156
2016-06-29 20:10:17 +00:00
Tim Shen 860a67eb4c [InstCombine, NFC] Change the generated variable names by creating new instructions
This removes some noise for D21775's test changes.

llvm-svn: 274155
2016-06-29 20:10:13 +00:00
Tim Shen 4561e784f4 [InstCombine] Add full tests for FoldAndOfFCmps and FoldOrOfFCmps
Summary: This adds tests for covering all cases that FoldAndOfFCmps and FoldOrOfFCmps handle.

Reviewers: spatel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21844

llvm-svn: 274144
2016-06-29 17:55:11 +00:00
Elena Demikhovsky 5e21c94f25 Reverted patch 273864
llvm-svn: 274115
2016-06-29 10:01:06 +00:00
Craig Topper df7454f94b Revert "[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition."
This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything.

llvm-svn: 274102
2016-06-29 04:57:00 +00:00
Craig Topper 2cc199baff [ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition.
If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case.

I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at.

Differential revision: http://reviews.llvm.org/D21493

llvm-svn: 274098
2016-06-29 03:46:47 +00:00
Eric Christopher 0c58837b1f Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions"
Revert "[InstCombine] Combine A->B->A BitCast"

as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847

This reverts commits r270135 and r263734.

llvm-svn: 274094
2016-06-29 03:05:58 +00:00
Weiming Zhao 5410edddb1 [ARM] Fix 28282: cost computation for constant hoisting
Summary:
This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282

Currently the cost model of constant hoisting checks the bit width of the data type of the constants.
However, the actual immediate value is small enough and not need to be hoisted.
This patch checks for the actual bit width needed for the constant.

Reviewers: t.p.northover, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21668

llvm-svn: 274073
2016-06-28 22:30:45 +00:00
Sanjay Patel 3a0f2606ec minimize regression tests and update checks
llvm-svn: 274047
2016-06-28 18:40:08 +00:00
Sanjay Patel 8ce43c098b minimize regression tests and update checks
llvm-svn: 274046
2016-06-28 18:33:10 +00:00
Artur Pilipenko 7ad95ec22d Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270

llvm-svn: 274043
2016-06-28 18:27:25 +00:00
Adam Nemet bd861acf29 [LLE] Don't hoist conditionally executed loads
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional.  Thus we would
access a memory location that the original loop did not access.

llvm-svn: 273991
2016-06-28 04:02:47 +00:00
Easwaran Raman 22eb80a114 Fix size computation of array allocation in inline cost analysis
Differential revision: http://reviews.llvm.org/D21690

llvm-svn: 273952
2016-06-27 22:31:53 +00:00
Sanjay Patel 59ed2ffca3 [InstCombine] shrink type of sdiv if dividend is sexted and constant divisor is small enough (PR28153)
This should fix PR28153:
https://llvm.org/bugs/show_bug.cgi?id=28153

Differential Revision: http://reviews.llvm.org/D21769

llvm-svn: 273951
2016-06-27 22:27:11 +00:00
Sanjay Patel 5cdf699daa add tests for PR28153
llvm-svn: 273936
2016-06-27 20:28:59 +00:00
Elena Demikhovsky 6f2ec8104a Fixed crash of SLP Vectorizer on KNL
The bug is connected to vector GEPs.
https://llvm.org/bugs/show_bug.cgi?id=28313

llvm-svn: 273919
2016-06-27 20:07:00 +00:00
Sanjay Patel c6ada53be5 [InstCombine] use m_APInt for div --> ashr fold
The APInt matcher works with splat vectors, so we get this fold for vectors too.

llvm-svn: 273897
2016-06-27 17:25:57 +00:00
Artur Pilipenko 72f76b8805 Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
llvm-svn: 273895
2016-06-27 16:54:33 +00:00
Easwaran Raman 1832bf6aee [PM] Port PartialInlining to the new PM
Differential revision: http://reviews.llvm.org/D21699

llvm-svn: 273894
2016-06-27 16:50:18 +00:00
Artur Pilipenko a36aa41519 Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270

llvm-svn: 273892
2016-06-27 16:29:26 +00:00
Elena Demikhovsky f65e865e33 Removed extra test from the prev commit.
llvm-svn: 273865
2016-06-27 11:40:49 +00:00
Elena Demikhovsky 4c58b2761a Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)

  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Re-commit rL273257 - revision: http://reviews.llvm.org/D20789

llvm-svn: 273864
2016-06-27 11:19:23 +00:00
Igor Breger 7357849dca [ConstantFolding] Fix bitcast vector of i1.
Differential Revision: http://reviews.llvm.org/D21735

llvm-svn: 273845
2016-06-27 06:42:54 +00:00
Sanjay Patel 1d745384da add tests for potential select transforms
llvm-svn: 273833
2016-06-26 23:44:21 +00:00
Sanjoy Das a37bb4a65d [LoopUnswitch] Unswitch on conditions feeding into guards
Summary:
This is a straightforward extension of what LoopUnswitch does to
branches to guards.  That is, we unswitch

```
for (;;) {
  ...
  guard(loop_invariant_cond);
  ...
}
```

into

```
if (loop_invariant_cond) {
  for (;;) {
    ...
    // There is no need to emit guard(true)
    ...
  }
} else {
  for (;;) {
    ...
    guard(false);
    // SimplifyCFG will clean this up by adding an
    // unreachable after the guard(false)
    ...
  }
}
```

Reviewers: majnemer

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21725

llvm-svn: 273801
2016-06-26 05:10:45 +00:00
Sanjay Patel 51ff79fd82 update tests to use FileCheck
llvm-svn: 273784
2016-06-25 17:39:10 +00:00
David Majnemer e14e7bc4b8 Revert "[SimplifyCFG] Stop inserting calls to llvm.trap for UB"
This reverts commit r273778, it seems to break UBSan :/

llvm-svn: 273779
2016-06-25 08:19:55 +00:00
David Majnemer d346a37737 [SimplifyCFG] Stop inserting calls to llvm.trap for UB
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.

While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.

There are much better tools than llvm.trap: UBSan and ASan.

N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...

llvm-svn: 273778
2016-06-25 08:04:19 +00:00
David Majnemer bb53d23ef8 [InstSimplify] Replace calls to null with undef
Calling null is undefined behavior, we can simplify the resulting value
to undef.

llvm-svn: 273777
2016-06-25 07:37:30 +00:00
David Majnemer 1fea77c6fc [SimplifyCFG] Replace calls to null/undef with unreachable
Calling null is undefined behavior, a call to undef can be trivially
treated as a call to null.

llvm-svn: 273776
2016-06-25 07:37:27 +00:00
Sanjoy Das f63768cbfc [PlaceSafepoints] Don't call undef in test case; NFC
llvm-svn: 273764
2016-06-25 01:40:54 +00:00
Sanjoy Das d850068282 [LoopUnswitch] Avoid exponential behavior
Summary: (No semantic change intended).

Reviewers: majnemer, bogner, mzolotukhin

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21707

llvm-svn: 273763
2016-06-25 01:14:19 +00:00
David Majnemer 0f45572761 The absence of noreturn doesn't ensure mayReturn
There are two separate issues:
- LLVM doesn't consider infinite loops to be side effects: we happily
  hoist/sink above/below loops whose bounds are unknown.
- The absence of the noreturn attribute is insufficient for us to know
  if a function will definitely return.  Relying on noreturn in the
  middle-end for any property is an accident waiting to happen.

llvm-svn: 273762
2016-06-25 00:55:12 +00:00
Peter Collingbourne 0312f614b1 IR: Introduce llvm.type.checked.load intrinsic.
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.

This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.

Differential Revision: http://reviews.llvm.org/D21121

llvm-svn: 273756
2016-06-25 00:23:04 +00:00
David Majnemer b8da3a2bb2 Reinstate r273711
r273711 was reverted by r273743.  The inliner needs to know about any
call sites in the inlined function.  These were obscured if we replaced
a call to undef with an undef but kept the call around.

This fixes PR28298.

llvm-svn: 273753
2016-06-25 00:04:10 +00:00
Michael Kuperstein 83b753d430 [PM] Port float2int to the new pass manager
Differential Revision: http://reviews.llvm.org/D21704

llvm-svn: 273747
2016-06-24 23:32:02 +00:00
Dehao Chen c66a06ad0e Hookup ProfileSummary with SampleProfilerLoader
Summary: Set ProfileSummary in SampleProfilerLoader.

Reviewers: davidxl, eraman

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21702

llvm-svn: 273745
2016-06-24 22:57:06 +00:00
Nico Weber ae2ef4ccd4 Revert r273711, it caused PR28298.
llvm-svn: 273743
2016-06-24 22:52:39 +00:00
Peter Collingbourne 7efd750607 IR: New representation for CFI and virtual call optimization pass metadata.
The bitset metadata currently used in LLVM has a few problems:

1. It has the wrong name. The name "bitset" refers to an implementation
   detail of one use of the metadata (i.e. its original use case, CFI).
   This makes it harder to understand, as the name makes no sense in the
   context of virtual call optimization.

2. It is represented using a global named metadata node, rather than
   being directly associated with a global. This makes it harder to
   manipulate the metadata when rebuilding global variables, summarise it
   as part of ThinLTO and drop unused metadata when associated globals are
   dropped. For this reason, CFI does not currently work correctly when
   both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
   globals, and fails to associate metadata with the rebuilt globals. As I
   understand it, the same problem could also affect ASan, which rebuilds
   globals with a red zone.

This patch solves both of those problems in the following way:

1. Rename the metadata to "type metadata". This new name reflects how
   the metadata is currently being used (i.e. to represent type information
   for CFI and vtable opt). The new name is reflected in the name for the
   associated intrinsic (llvm.type.test) and pass (LowerTypeTests).

2. Attach metadata directly to the globals that it pertains to, rather
   than using the "llvm.bitsets" global metadata node as we are doing now.
   This is done using the newly introduced capability to attach
   metadata to global variables (r271348 and r271358).

See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html

Differential Revision: http://reviews.llvm.org/D21053

llvm-svn: 273729
2016-06-24 21:21:32 +00:00
Michael Kuperstein 82d5da5aac [PM] Port PreISelIntrinsicLowering to the new PM
llvm-svn: 273713
2016-06-24 20:13:42 +00:00
David Majnemer 3b3e954ea2 SimplifyInstruction does not imply DCE
We cannot remove an instruction with no uses just because
SimplifyInstruction succeeds.  It may have side effects.

llvm-svn: 273711
2016-06-24 19:34:46 +00:00
Reid Kleckner fbd5eef691 Revert "InstCombine rule to fold trunc when value available"
This reverts commit r273608.

Broke building code with sanitizers, where apparently these kinds of
loads, casts, and truncations are common:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502
http://crbug.com/623099

llvm-svn: 273703
2016-06-24 18:42:58 +00:00
Sanjay Patel f8b08f7179 [InstCombine] consolidate commutation variants of matchSelectFromAndOr() in one place; NFCI
By putting all the possible commutations together, we simplify the code.
Note that this is NFCI, but I'm adding tests that actually exercise each
commutation pattern because we don't have this anywhere else.

llvm-svn: 273702
2016-06-24 18:26:02 +00:00
Matthew Simpson e794678404 [LV] Preserve order of dependences in interleaved accesses analysis
The interleaved access analysis currently assumes that the inserted run-time
pointer aliasing checks ensure the absence of dependences that would prevent
its instruction reordering. However, this is not the case.

Issues can arise from how code generation is performed for interleaved groups.
For a load group, all loads in the group are essentially moved to the location
of the first load in program order, and for a store group, all stores in the
group are moved to the location of the last store. For groups having members
involved in a dependence relation with any other instruction in the loop, this
reordering can violate the dependence.

This patch teaches the interleaved access analysis how to avoid breaking such
dependences, and should fix PR27626.

An assumption of the original analysis was that the accesses had been collected
in "program order". The analysis was then simplified by visiting the accesses
bottom-up. However, this ordering was never guaranteed for anything other than
single basic block loops. Thus, this patch also enforces the desired ordering.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27626
Differential Revision: http://reviews.llvm.org/D19984

llvm-svn: 273687
2016-06-24 15:33:25 +00:00
Chuang-Yu Cheng 68f7f1cf00 Teaching SimplifyCFG to recognize the Or-Mask trick that InstCombine uses to
reduce the number of comparisons.

Specifically, InstCombine can turn:
  (i == 5334 || i == 5335)
into:
  ((i | 1) == 5335)

SimplifyCFG was already able to detect the pattern:
  (i == 5334 || i == 5335)
to:
  ((i & -2) == 5334)

This patch supersedes D21315 and resolves PR27555
(https://llvm.org/bugs/show_bug.cgi?id=27555).

Thanks to David and Chandler for the suggestions!

Author: Thomas Jablin (tjablin)
Reviewers: majnemer chandlerc halfdan cycheng

http://reviews.llvm.org/D21397

llvm-svn: 273639
2016-06-24 01:59:00 +00:00
Anna Thomas 31a0b2088f InstCombine rule to fold trunc when value available
Summary:
This instcombine rule folds away trunc operations that have value available from a prior load or store.
This kind of code can be generated as a result of GVN widening the load or from source code as well.

Reviewers: reames, majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21246

llvm-svn: 273608
2016-06-23 20:22:22 +00:00
Artur Pilipenko 80771b9ad9 Upgrade other old memset/memcpy signatures in tests causing buildbot failures with rL273568.
llvm-svn: 273580
2016-06-23 16:34:52 +00:00
Michael Zolotukhin 2d3592d481 [LoopUnrollAnalyzer] Fix a bug in UnrolledInstAnalyzer::visitLoad.
When simplifying a load we need to make sure that the type of the
simplified value matches the type of the instruction we're processing.
In theory, we can handle casts here as we deal with constant data, but
since it's not implemented at the moment, we at least need to bail out.

This fixes PR28262.

llvm-svn: 273562
2016-06-23 14:31:31 +00:00
Hal Finkel a1271036c5 Allow DeadStoreElimination to track combinations of partial later wrties
DeadStoreElimination can currently remove a small store rendered unnecessary by
a later larger one, but could not remove a larger store rendered unnecessary by
a series of later smaller ones. This adds that capability.

It works by keeping a map, which is used as an effective interval map, for each
store later overwritten only partially, and filling in that interval map as
more such stores are discovered. No additional walking or aliasing queries are
used. In the map forms an interval covering the the entire earlier store, then
it is dead and can be removed. The map is used as an interval map by storing a
mapping between the ending offset and the beginning offset of each interval.

I discovered this problem when investigating a performance issue with code like
this on PowerPC:

  #include <complex>
  using namespace std;

  complex<float> bar(complex<float> C);
  complex<float> foo(complex<float> C) {
    return bar(C)*C;
  }

which produces this:

  define void @_Z4testSt7complexIfE(%"struct.std::complex"* noalias nocapture sret %agg.result, i64 %c.coerce) {
  entry:
    %ref.tmp = alloca i64, align 8
    %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"*
    %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32
    %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32
    %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float
    %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32
    %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float
    call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce)
    %2 = bitcast %"struct.std::complex"* %agg.result to i64*
    %3 = load i64, i64* %ref.tmp, align 8
    store i64 %3, i64* %2, align 4 ; <--- ***** THIS SHOULD NOT BE HERE ****
    %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0
    %4 = lshr i64 %3, 32
    %5 = trunc i64 %4 to i32
    %6 = bitcast i32 %5 to float
    %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1
    %7 = trunc i64 %3 to i32
    %8 = bitcast i32 %7 to float
    %mul_ad.i.i = fmul fast float %6, %1
    %mul_bc.i.i = fmul fast float %8, %0
    %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i
    %mul_ac.i.i = fmul fast float %6, %0
    %mul_bd.i.i = fmul fast float %8, %1
    %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i
    store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4
    store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4
    ret void
  }

the problem here is not just that the i64 store is unnecessary, but also that
it blocks further backend optimizations of the other uses of that i64 value in
the backend.

In the future, we might want to add a special case for handling smaller
accesses (e.g. using a bit vector) if the map mechanism turns out to be
noticeably inefficient. A sorted vector is also a possible replacement for the
map for small numbers of tracked intervals.

Differential Revision: http://reviews.llvm.org/D18586

llvm-svn: 273559
2016-06-23 13:46:39 +00:00
David Majnemer d1fbf48566 [SCCP] Don't assume all Constants are ConstantInt
This fixes PR28269.

llvm-svn: 273521
2016-06-23 00:14:29 +00:00
Sanjay Patel a06d989552 [ValueTracking] improve ComputeNumSignBits for vector constants
This is similar to the computeKnownBits improvement in rL268479. 
There's probably more we can do for vector logic instructions, but 
this should let us see non-splat constant masking ops that can
become vector selects instead of and/andn/or sequences.

Differential Revision: http://reviews.llvm.org/D21610

llvm-svn: 273459
2016-06-22 19:20:59 +00:00
Artur Pilipenko 1cec4fdddf Upgrade old memset/memcpy signatures (without isVolatile argument) in tests
We no longer have corresponding code in autoupgrade and the vast majority of the tests were fixed long time ago. Fix the remaining few. One of the verifier test cases is marked as XFAIL because it was passing only because the signature was incorrect.

llvm-svn: 273428
2016-06-22 15:16:06 +00:00
Sanjay Patel c6cacd6067 [InstSimplify] add ashr tests including vector types
llvm-svn: 273421
2016-06-22 14:18:04 +00:00
Simon Pilgrim bc35f9f702 [SLPVectorizer][X86] Added ceil/floor/nearbyint/rint/trunc vectorization tests
llvm-svn: 273420
2016-06-22 14:07:46 +00:00
Sanjay Patel 21579bb39a [InstSimplify] regenerate checks
llvm-svn: 273419
2016-06-22 14:00:16 +00:00
Haicheng Wu a783bac50b [Kryo] Enable loop prefetcher.
Differential Revision: http://reviews.llvm.org/D21535

llvm-svn: 273329
2016-06-21 22:47:56 +00:00
Easwaran Raman 8bceb9d210 Fix PR28219: Use profile summary from reader and not compute it
Differentiaal revision: http://reviews.llvm.org/D21546

llvm-svn: 273301
2016-06-21 19:29:49 +00:00
Elena Demikhovsky a266cf0518 reverted the prev commit due to assertion failure
llvm-svn: 273258
2016-06-21 12:10:11 +00:00
Elena Demikhovsky 9823c995bc Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)
  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Differential revision: http://reviews.llvm.org/D20789

llvm-svn: 273257
2016-06-21 11:32:01 +00:00
Simon Pilgrim 356e823b51 [X86][SSE] Add cost model for BSWAP of vectors
The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization.

Differential Revision: http://reviews.llvm.org/D21521

llvm-svn: 273217
2016-06-20 23:08:21 +00:00
Sanjay Patel 9ad8fb68f7 [InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869)
By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question
raised by PR27869:
https://llvm.org/bugs/show_bug.cgi?id=27869
...where InstCombine turns an icmp+zext into a shift causing us to miss the fold.

Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp.

Differential Revision: http://reviews.llvm.org/D21512

llvm-svn: 273200
2016-06-20 20:59:59 +00:00
Dehao Chen 071bb9d7af Pass AssumptionCacheTracker from SampleProfileLoader to Inliner
Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader

Reviewers: davidxl

Subscribers: eraman, vsk, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D21205

llvm-svn: 273199
2016-06-20 20:53:40 +00:00
Matt Arsenault 802ebcb4bb InstCombine: Don't strip convergent from intrinsic callsites
Specific instances of intrinsic calls may want to be convergent, such
as certain register reads but the intrinsic declaration is not.

llvm-svn: 273188
2016-06-20 19:04:44 +00:00
Sanjay Patel 445d7ecf89 [InstCombine] consolidate some icmp+logic tests and improve checks
llvm-svn: 273186
2016-06-20 18:40:37 +00:00
Sanjay Patel 14dcb042bc [InstCombine] update to use FileCheck with autogenerated exact checking
llvm-svn: 273180
2016-06-20 18:23:40 +00:00
Sanjay Patel 06918ad79e [InstCombine] update to use FileCheck with autogenerated exact checking
llvm-svn: 273173
2016-06-20 17:56:13 +00:00
Sanjay Patel a038240660 [InstCombine] regenerate checks
llvm-svn: 273170
2016-06-20 17:48:48 +00:00
David Majnemer c5601df9fd Reapply "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273160, reapplying r273132.
RecursivelyDeleteTriviallyDeadInstructions cannot be called on a
parentless Instruction.

llvm-svn: 273162
2016-06-20 16:03:25 +00:00
Cong Liu 1c28b6d733 Revert "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273132.
Breaks multiple test under /llvm/test:Transforms (e.g.
llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan.

llvm-svn: 273160
2016-06-20 15:22:15 +00:00
Patrik Hagglund 7205215591 Fix for PR27940
After a store has been eliminated, when making sure that the
instruction iterator points to a valid instruction, dbg intrinsics are
now ignored as a new instruction.

Patch by Henric Karlsson.

Reviewed by Daniel Berlin.

Differential Revision: http://reviews.llvm.org/D21076

llvm-svn: 273141
2016-06-20 09:10:10 +00:00
David Majnemer a705843f23 [LoopIdiom] Don't remove dead operands manually
Removing dead instructions requires remembering which operands have
already been removed.  RecursivelyDeleteTriviallyDeadInstructions has
this logic, don't partially reimplement it in LoopIdiomRecognize.

This fixes PR28196.

llvm-svn: 273132
2016-06-20 02:33:29 +00:00
Sanjay Patel a4b052c7d1 [InstSimplify] add tests for PR27689; regenerate checks
llvm-svn: 273128
2016-06-19 21:40:12 +00:00
David Majnemer 3119599475 [LoadCombine] Combine Loads formed from GEPS with negative indexes
Change the underlying offset and comparisons to use int64_t instead of
uint64_t.

Patch by River Riddle!

Differential Revision: http://reviews.llvm.org/D21499

llvm-svn: 273105
2016-06-19 06:14:56 +00:00
Matt Arsenault a466a7cf62 Add looping testcase that broke in r272987
llvm-svn: 273081
2016-06-18 05:15:58 +00:00
Sanjoy Das e8fd9561cb [SCEV] Fix incorrect trip count computation
The way we elide max expressions when computing trip counts is incorrect
-- it breaks cases like this:

```
static int wrapping_add(int a, int b) {
  return (int)((unsigned)a + (unsigned)b);
}

void test() {
  volatile int end_buf = 2147483548; // INT_MIN - 100
  int end = end_buf;

  unsigned counter = 0;
  for (int start = wrapping_add(end,  200); start < end; start++)
    counter++;

  print(counter);
}
```

Note: the `NoWrap` variable that was being tested has little to do with
the values flowing into the max expression; it is a property of the
induction variable.

test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test
functionality I'm reverting in this change, so I've deleted the test
fully.

llvm-svn: 273079
2016-06-18 04:38:31 +00:00
Matt Arsenault 8fd5978811 Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."""
This seems to be causing an infinite loop / crash in instcombine
on some bots.

llvm-svn: 273069
2016-06-17 23:36:38 +00:00
Adam Nemet a9f09c6245 [LAA] Enable symbolic stride speculation for all LAA clients
This is a functional change for LLE and LDist.  The other clients (LV,
LVerLICM) already had this explicitly enabled.

The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides.  This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter.  This makes the
interface more friendly to the new Pass Manager.

The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.

llvm-svn: 273064
2016-06-17 22:35:41 +00:00
Matt Arsenault d76efc14b9 Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""
Reapply r272987. Condition should be in terms of the destination type,
and the flags should not be copied.

llvm-svn: 273045
2016-06-17 20:33:53 +00:00
Davide Italiano b49aa5c0c4 [PM] Port MergedLoadStoreMotion to the new pass manager, take two.
This is indeed a much cleaner approach (thanks to Daniel Berlin
for pointing out), and also David/Sean for review.

Differential Revision:  http://reviews.llvm.org/D21454

llvm-svn: 273032
2016-06-17 19:10:09 +00:00
James Y Knight 148a6469dc Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass.
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1
or 2-byte. For those, you need to mask and shift the 1 or 2 byte values
appropriately to use the 4-byte instruction.

This change adds support for cmpxchg-based instruction sets (only SPARC,
in LLVM). The support can be extended for LL/SC-based PPC and MIPS in
the future, supplanting the ISel expansions those architectures
currently use.

Tests added for the IR transform and SPARCv9.

Differential Revision: http://reviews.llvm.org/D21029

llvm-svn: 273025
2016-06-17 18:11:48 +00:00
Sanjay Patel 216d8cf720 [InstCombine] allow more than one use for vector bitcast folding with selects
The motivating example for this transform is similar to D20774 where bitcasts interfere
with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to 
produce min and max ops:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %bc1 = bitcast <4 x float> %a to <4 x i32>
  %bc2 = bitcast <4 x float> %b to <4 x i32>
  %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
  %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
  %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
  store <4 x i32> %sel1, <4 x i32>* %bc3
  %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
  store <4 x i32> %sel2, <4 x i32>* %bc4
  ret void
}

With this patch, we move the selects up to use the input args which allows getting rid of
all of the bitcasts:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
  %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
  store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
  store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
  ret void
}

The asm for x86 SSE then improves from:

movaps  %xmm0, %xmm2
cmpltps %xmm1, %xmm2
movaps  %xmm2, %xmm3
andnps  %xmm1, %xmm3
movaps  %xmm2, %xmm4
andnps  %xmm0, %xmm4
andps %xmm2, %xmm0
orps  %xmm3, %xmm0
andps %xmm1, %xmm2
orps  %xmm4, %xmm2
movaps  %xmm0, (%rdi)
movaps  %xmm2, (%rsi)

To:

movaps  %xmm0, %xmm2
minps %xmm1, %xmm2
maxps %xmm0, %xmm1
movaps  %xmm2, (%rdi)
movaps  %xmm1, (%rsi)

The TODO comments show that we're limiting this transform only to vectors and only to bitcasts
because we need to improve other transforms or risk creating worse codegen.

Differential Revision: http://reviews.llvm.org/D21190

llvm-svn: 273011
2016-06-17 16:46:50 +00:00
Adam Nemet e7709d92ba [LLE] Don't hard-code the name of the preheader in test
Turns out a didn't get this right because symbolic stride versioning
changes the name.  Relax the matching.

llvm-svn: 272992
2016-06-17 09:13:15 +00:00
Matt Arsenault ce56f7bbaa Revert "InstCombine: Reduce trunc (shl x, K) width."
This reverts commit r272987.

This might be causing crashes on some bots.

llvm-svn: 272990
2016-06-17 06:28:53 +00:00
Matt Arsenault 028fd50642 InstCombine: Reduce trunc (shl x, K) width.
llvm-svn: 272987
2016-06-17 04:43:22 +00:00
Evgeniy Stepanov 45fa0fd758 [safestack] Sink unsafe address computation to each use.
This is a fix for PR27844.
When replacing uses of unsafe allocas, emit the new location
immediately after each use. Without this, the pointer stays live from
the function entry to the last use, while it's usually cheaper to
recalculate.

llvm-svn: 272969
2016-06-16 22:34:04 +00:00
Evgeniy Stepanov 72d961a1da [safestack] Fixup llvm.dbg.value when rewriting unsafe allocas.
When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are
updated to refer to the new location.

This change does the same to dbg.value intrinsics.

llvm-svn: 272968
2016-06-16 22:34:00 +00:00
Sanjoy Das 07c6521aed [EarlyCSE] Fold invariant loads
Redundant invariant loads can be CSE'ed with very little extra effort
over what early-cse already tracks, so it looks reasonable to make
early-cse handle this case.

llvm-svn: 272954
2016-06-16 20:47:57 +00:00
Justin Lebar b0bd07aff7 Fix strip-dead-debug-info test if path contains "bar".
This test checks that the string 'bar' (no quotes) doesn't exist in the
output after running opt.  But opt embeds the absolute path to the
filename, and on my machine, the filename contains the string 'jlebar',
causing the test to fail.

This patch changes the test to look for the string '"bar"' instead.

llvm-svn: 272941
2016-06-16 19:39:55 +00:00
Davide Italiano 41315f7873 [PM] Revert the port of MergeLoadStoreMotion to the new pass manager.
Daniel Berlin expressed some real concerns about the port and proposed
and alternative approach. I'll revert this for now while working on a
new patch, which I hope to put up for review shortly. Sorry for the churn.

llvm-svn: 272925
2016-06-16 17:40:53 +00:00
Adam Nemet 776346848a [LLE] New test to check that no versioning for symbolic strides occurs. NFC
This is currently only performed in the Vectorizer.  I will change this
as symbolic stride collection is moved to LAA.

This test will track when the actual functional change occurs.

llvm-svn: 272918
2016-06-16 16:45:29 +00:00
Igor Laevsky 87f0d0e185 Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo"
It was causing failures in Profile-i386 and Profile-x86_64 tests.

llvm-svn: 272912
2016-06-16 16:25:53 +00:00
Igor Laevsky c9179fd2c2 [JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo
We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise 
we will get dangling pointer inside BranchProbabilityInfo cache.

Differential Revision: http://reviews.llvm.org/D20957

llvm-svn: 272891
2016-06-16 13:28:25 +00:00
Patrik Hagglund 0acaefaf9d PR27938: Don't remove valid DebugLoc in Scalarizer
Added checks to make sure the Scalarizer::transferMetadata() don't
remove valid debug locations from instructions. This is important as
the verifier pass require that e.g. inlinable callsites have a valid
debug location.

https://llvm.org/bugs/show_bug.cgi?id=27938

Patch by Karl-Johan Karlsson

Reviewers: dblaikie

Differential Revision: http://reviews.llvm.org/D20807

llvm-svn: 272884
2016-06-16 10:48:54 +00:00
Chuang-Yu Cheng dbe00d51b4 SimplifyCFG is able to detect the pattern:
(i == 5334 || i == 5335)
to:
    ((i & -2) == 5334)

This transformation has some incorrect side conditions. Specifically, the
transformation is only applied when the right-hand side constant (5334 in
the example) is a power of two not equal and not equal to the negated mask.
These side conditions were added in r258904 to fix PR26323. The correct side
condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334].

It's a little bit hard to see why these transformations are correct and what
the side conditions ought to be. Here is a CVC3 program to verify them for
64-bit values:
    ONE  : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63);
    x    : BITVECTOR(64);
    y    : BITVECTOR(64);
    z    : BITVECTOR(64);
    mask : BITVECTOR(64) = BVSHL(ONE, z);
    QUERY( (y & ~mask = y) =>
           ((x & ~mask = y) <=> (x = y OR x = (y |  mask)))
    );

Please note that each pattern must be a dual implication (<--> or iff). One
directional implication can create spurious matches. If the implication is
only one-way, an unsatisfiable condition on the left side can imply a
satisfiable condition on the right side. Dual implication ensures that
satisfiable conditions are transformed to other satisfiable conditions and
unsatisfiable conditions are transformed to other unsatisfiable conditions.

Here is a concrete example of a unsatisfiable condition on the left
implying a satisfiable condition on the right:
    mask = (1 << z)
    (x & ~mask) == y --> (x == y || x == (y | mask))

Substituting y = 3, z = 0 yields:
    (x & -2) == 3 --> (x == 3 || x == 2)

The version of this code before r258904 had no side-conditions and
incorrectly justified itself in comments through one-directional
implication.

Thanks to Chandler for the suggestion!

Author: Thomas Jablin (tjablin)
Reviewers: chandlerc majnemer hfinkel cycheng

http://reviews.llvm.org/D21417

llvm-svn: 272873
2016-06-16 04:44:25 +00:00
Eli Friedman bd254a6f45 [InstCombine] Don't widen metadata on store-to-load forwarding
The original check for load CSE or store-to-load forwarding is wrong
when the forwarded stored value happened to be a load.

Ref https://github.com/JuliaLang/julia/issues/16894

Differential Revision: http://reviews.llvm.org/D21271

Patch by Yichao Yu!

llvm-svn: 272868
2016-06-16 02:33:42 +00:00
Xinliang David Li 1eaecefaf9 [PM] Port Add discriminator pass to new PM
llvm-svn: 272847
2016-06-15 21:51:30 +00:00
David Majnemer b62692e2e0 [TargetLibraryInfo] Teach isValidProtoForLibFunc about tan
We would fail to validate the type of the tan function which would cause
downstream users of isValidProtoForLibFunc to assert.

This fixes PR28143.

llvm-svn: 272802
2016-06-15 16:47:23 +00:00
Sean Silva e0a9e66040 [PM] Port SLPVectorizer to the new PM
This uses the "runImpl" approach to share code with the old PM.

Porting to the new PM meant abandoning the anonymous namespace enclosing
most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal
compared to having to pull the pass class into a header which the new PM
requires since it calls the constructor directly).

llvm-svn: 272766
2016-06-15 08:43:40 +00:00
Sean Silva a4c2d150d0 [PM] Port AlignmentFromAssumptions to the new PM.
This uses the "runImpl" pattern to share code between the old and new PM.

llvm-svn: 272757
2016-06-15 06:18:01 +00:00
Michael Kuperstein 3277a05fcf Recommit [LV] Enable vectorization of loops where the IV has an external use
r272715 broke libcxx because it did not correctly handle cases where the
last iteration of one IV is the second-to-last iteration of another.

Original commit message:
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.

llvm-svn: 272742
2016-06-15 00:35:26 +00:00
David Majnemer 4a697c312f [LoopUnroll] Don't crash trying to unroll loop with EH pad exit
We do not support splitting cleanuppad or catchswitches.  This is
problematic for passes which assume that a loop is in loop simplify
form (the loop would have a dedicated exit block instead of sharing it).

While it isn't great that we don't support this for cleanups, we still
cannot make loop-simplify form an assertable precondition because
indirectbr will also disable these sorts of CFG cleanups.

This fixes PR28132.

llvm-svn: 272739
2016-06-15 00:19:56 +00:00
David Majnemer cbf614a93b Remove the ScalarReplAggregates pass
Nearly all the changes to this pass have been done while maintaining and
updating other parts of LLVM.  LLVM has had another pass, SROA, which
has superseded ScalarReplAggregates for quite some time.

Differential Revision: http://reviews.llvm.org/D21316

llvm-svn: 272737
2016-06-15 00:19:09 +00:00
Matt Arsenault f42c69206d AMDGPU: Run pointer optimization passes
llvm-svn: 272736
2016-06-15 00:11:01 +00:00
Michael Kuperstein d4bd3ab5fe Reverting r272715 since it broke libcxx.
llvm-svn: 272730
2016-06-14 22:30:41 +00:00
Davide Italiano d737dd2ec6 [PM] Port WholeProgramDevirt to the new pass manager.
llvm-svn: 272721
2016-06-14 21:44:19 +00:00
Michael Kuperstein 23b6d6adc9 [LV] Enable vectorization of loops where the IV has an external use
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.

Differential Revision: http://reviews.llvm.org/D21048

llvm-svn: 272715
2016-06-14 21:27:27 +00:00
Evgeniy Stepanov 0be3cf1d35 Add a missing test.
This is a test for r272421: Disable MSan-hostile loop unswitching.

llvm-svn: 272713
2016-06-14 21:24:13 +00:00
Peter Collingbourne 96efdd6107 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

llvm-svn: 272709
2016-06-14 21:01:22 +00:00
Sanjoy Das d7e8206b58 [ValueTracking] Calls to @llvm.assume always return
This change teaches llvm::isGuaranteedToTransferExecutionToSuccessor
that calls to @llvm.assume always terminate.  Most other relevant
intrinsics should be covered by the "CS.onlyReadsMemory() ||
CS.onlyAccessesArgMemory()" bit but we were missing @llvm.assumes
because we state that it clobbers memory.

Added an LICM test case, but this change is not specific to LICM.

llvm-svn: 272703
2016-06-14 20:23:16 +00:00
Adam Nemet 73a26957fc [LoopVer] Update all existing PHIs in the exit block
We only used to add the edge from the cloned loop to PHIs that
corresponded to values defined by the loop.  We need to do this for all
PHIs obviously since we need a PHI operand for each incoming edge.

This includes things like PHIs with a constant value or with values
defined before the original loop (see the testcases).

After the patch the PHIs are added to the exit block in two passes.

In the first pass we ensure there is a single-operand (LCSSA) PHI for
each value defined by the loop.

In the second pass we loop through each (single-operand) PHI and add the
value for the edge from the cloned loop.  If the value is defined in the
loop we'll use the cloned instruction from the cloned loop.

Fixes PR28037

llvm-svn: 272649
2016-06-14 09:38:54 +00:00
Davide Italiano cccf4f01ad [PM] Port Mem2Reg to the new pass manager.
llvm-svn: 272630
2016-06-14 03:22:22 +00:00
Sean Silva 6347df0f81 [PM] Port MemCpyOpt to the new PM.
The need for all these Lookup* functions is just because of calls to
getAnalysis inside methods (i.e. not at the top level) of the
runOnFunction method. They should be straightforward to clean up when
the old PM is gone.

llvm-svn: 272615
2016-06-14 02:44:55 +00:00
Davide Italiano 5669ef1efe Placate bots fixing a typo in AA-pipeline description. Sorry.
llvm-svn: 272608
2016-06-14 01:11:12 +00:00
Sean Silva 46590d556a Bring back "[PM] Port JumpThreading to the new PM" with a fix
This reverts commit r272603 and adds a fix.

Big thanks to Davide for pointing me at r216244 which gives some insight
into how to fix this VS2013 issue. VS2013 can't synthesize a move
constructor. So the fix here is to add one explicitly to the
JumpThreadingPass class.

llvm-svn: 272607
2016-06-14 00:51:09 +00:00
Davide Italiano 89ab89d6cd [PM] Port MergedLoadStoreMotion to the new pass manager.
llvm-svn: 272606
2016-06-14 00:49:23 +00:00