Commit Graph

88089 Commits

Author SHA1 Message Date
Haicheng Wu 64d9d7c3f7 Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()"
Not sure it handles undef properly.

llvm-svn: 263605
2016-03-15 23:38:47 +00:00
Adam Nemet c979c6e123 Turn LoopLoadElimination on again
The latent bug that LLE exposed in the LoopVectorizer was resolved
(PR26952).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263595
2016-03-15 22:26:12 +00:00
Mike Aizatsky 298516ffa9 [libfuzzer] speeding up corpus load
llvm-svn: 263591
2016-03-15 21:47:21 +00:00
Bjorn Steinbrink 37ca462508 Also handle the new Rust pers fn to isCatchAll()
llvm-svn: 263585
2016-03-15 20:57:07 +00:00
Bjorn Steinbrink 59fdec673d Add Rust's personality function to the list of known personality functions
Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18192

llvm-svn: 263581
2016-03-15 20:35:45 +00:00
Evgeniy Stepanov d6e91369d8 [msan] Don't put module constructors in comdats.
There is something strange going on with debug info (.eh_frame_hdr)
disappearing when msan.module_ctor are placed in comdat sections.

Moving this functionality under flag, disabled by default.

llvm-svn: 263579
2016-03-15 20:25:47 +00:00
Teresa Johnson 1396809b17 [ThinLTO] Record all global variable defs in the summary
Record all variable defs with a summary record to aid in building a
complete reference graph and locating constant variable defs to import.

llvm-svn: 263576
2016-03-15 19:35:45 +00:00
Chris Bieneman ef43d448d4 [CMake] Add PACKAGE_VENDOR for customizing version output
Summary: This change adds a PACKAGE_VENDOR variable. When set it makes the version output more closely resemble the clang version output.

Reviewers: aprantl, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18159

llvm-svn: 263566
2016-03-15 18:07:46 +00:00
Adam Nemet fdb20595a1 [LV] Preserve LoopInfo when store predication is used
This was a latent bug that got exposed by the change to add LoopSimplify
as a dependence to LoopLoadElimination.  Since LoopInfo was corrupted
after LV, LoopSimplify mis-compiled nbench in the test-suite (more
details in the PR).

The problem was that when we create the blocks for predicated stores we
didn't add those to any loops.

The original testcase for store predication provides coverage for this
assuming we verify LI on the way out of LV.

Fixes PR26952.

llvm-svn: 263565
2016-03-15 18:06:20 +00:00
Davide Italiano dfdf278ebf [MC] Rename TLSDESC as it's not ARM specific.
Similarly to what was done for TLSCALL in r263515.

llvm-svn: 263564
2016-03-15 17:29:52 +00:00
Changpeng Fang 01f6062227 AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDS
Summary:
  Static LDS size is saved in MachineFunctionInfo::LDSSize,
We define a pseudo instruction with usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter,
we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize.

Reviewers:
    arsenm
    tstellarAMD

Subscribers
    llvm-commits, arsenm

Differential Revision:
   http://reviews.llvm.org/D18064

llvm-svn: 263563
2016-03-15 17:28:44 +00:00
Douglas Katzman 708eeb0519 Myriad: Add new sparc CPU kinds.
llvm-svn: 263557
2016-03-15 16:41:47 +00:00
Benjamin Kramer 96f4b12880 [GlobalOpt] Don't look through aliases when sorting names of globals.
If both are different aliases to the same value the sorting becomes
non-deterministic as array_pod_sort is not stable.

llvm-svn: 263550
2016-03-15 14:18:26 +00:00
Chad Rosier ebe559019b [SLP] Update comment to reflect reality. NFC.
llvm-svn: 263548
2016-03-15 13:27:58 +00:00
Lang Hames abda4d2526 [MachO] Extend the alt_entry support for aliases added in r263521 to
expressions of the form 'a = .' and 'a = Ltmp'.

llvm-svn: 263528
2016-03-15 04:20:49 +00:00
Eric Christopher 257338ff0f Use some braces to format this a little better.
llvm-svn: 263527
2016-03-15 03:01:31 +00:00
Teresa Johnson 2794f71575 BitcodeWriter dyn_cast cleanup for r263275 (NFC)
Address review suggestions from dblaikie: change a few dyn_cast to cast
and fold a cast into if condition.

llvm-svn: 263526
2016-03-15 02:41:29 +00:00
Eric Christopher ee00abe5e6 Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest
parentheses around '&&' within '||' [-Werror=parentheses].

llvm-svn: 263525
2016-03-15 02:19:06 +00:00
Teresa Johnson b43027d1e0 Move global ID computation from Function to GlobalValue (NFC)
Since the static getGlobalIdentifier and getGUID methods are now called
for global values other than functions, reflect that by moving these
methods to the GlobalValue class.

llvm-svn: 263524
2016-03-15 02:13:19 +00:00
Lang Hames 1b640e05ba [MachO] Add MachO alt-entry directive support.
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:

safe_foo:
  // check preconditions for foo
.alt_entry fast_foo
fast_foo:
  // body of foo, can assume preconditions.

The .alt_entry flag is also implicitly set on assembly aliases of the form:

a = b + C

where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.

llvm-svn: 263521
2016-03-15 01:43:05 +00:00
Kostya Serebryany 0c5e3af862 [libFuzzer] use max_len exactly equal to the max size of input. Fix 32-bit build
llvm-svn: 263518
2016-03-15 01:28:00 +00:00
Sanjoy Das c11460e051 [StatepointLowering] Move an assertion; NFCI
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.

llvm-svn: 263516
2016-03-15 01:16:31 +00:00
Davide Italiano 249c45d92e [MC] Rename TLSCALL as it's not ARM specific.
`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with 
_ARM_ since this is how it was originally implemented.
The X86_64 version is exactly the same so there's no reason
to create a new variant, we can just rename the existing
one to be machine-independent.
This generalization is the first step to implement support
for GNU2 TLS dialect in MC.

Differential Revision:  http://reviews.llvm.org/D18160

llvm-svn: 263515
2016-03-15 00:25:22 +00:00
Teresa Johnson 26ab5772b0 [ThinLTO] Renaming of function index to module summary index (NFC)
(Resubmitting after fixing missing file issue)

With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263513
2016-03-15 00:04:37 +00:00
Eric Christopher da8b3f1914 Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.

This reverts commit r263303.

llvm-svn: 263512
2016-03-14 23:59:57 +00:00
Justin Lebar 6827de19b2 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Amaury Sechet bdb261b4c0 Imporove load to store => memcpy
Summary: This now try to reorder instructions in order to help create the optimizable pattern.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Differential Revision: http://reviews.llvm.org/D16523

llvm-svn: 263503
2016-03-14 22:52:27 +00:00
Manuel Jacob 6be355961e Re-add ConstantFoldInstOperands form taking opcode and return type.
Summary:
This form was replaced by a form taking an instruction instead of opcode and
return type in r258391.  After committing this change (and some depending,
follow-up changes) it turned out in the review thread to be controversial.  The
discussion didn't come to a conclusion yet.  I'm re-adding the old form to fix
the API regression and to provide a better base for discussion, possibly on
llvm-dev.

A difference to the original function is that it can't be called with GEPs
(similarly to how it was already the case for compares).  In order to support
opaque pointers in the future, folding GEPs needs to be passed the source
element type, which is not possible with the current API.

Reviewers: dberlin, reames

Subscribers: dblaikie, eddyb

Differential Revision: http://reviews.llvm.org/D17901

llvm-svn: 263501
2016-03-14 22:34:17 +00:00
Amaury Sechet eae09c2c2a Factor out MachineBlockPlacement::fillWorkLists. NFC
Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18077

llvm-svn: 263495
2016-03-14 21:24:11 +00:00
Teresa Johnson cec0cae313 Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"
This reverts commit r263490. Missed a file.

llvm-svn: 263493
2016-03-14 21:18:10 +00:00
Teresa Johnson 892920b358 [ThinLTO] Renaming of function index to module summary index (NFC)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263490
2016-03-14 21:05:56 +00:00
Adam Nemet bb45810e4f Revert "Turn LoopLoadElimination on again"
This reverts commit r263472.

There is an LNT failure on clang-ppc64be-linux-lnt.  Turn this off,
while I am investigating.

llvm-svn: 263485
2016-03-14 20:38:55 +00:00
Sanjay Patel ee52b6e77d allow branch weight metadata on select instructions (PR26636)
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636

This doesn't accomplish anything on its own. It's the first step towards preserving 
and using branch weights with selects.

The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to 
determine how to correctly propagate the weights.

Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.

The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.

Differential Revision: http://reviews.llvm.org/D18133

llvm-svn: 263482
2016-03-14 20:18:59 +00:00
Justin Lebar 9d94397859 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

llvm-svn: 263481
2016-03-14 20:18:54 +00:00
Ulrich Weigand 52aa7fba3f [SystemZ] Add missing isBranch flags to certain instruction
Some instructions were missing isBranch, isCall, or isTerminator
flags.  This didn't really affect code generation since most of
the affected patterns were used only for the AsmParser and/or
disassembler.

However, it could affect tools using the MC layer to disassemble
and parse binary code (e.g. via MCInstrDesc::mayAffectControlFlow).

llvm-svn: 263478
2016-03-14 20:16:30 +00:00
Keno Fischer a91ae8336b [SLPVectorizer] Fix dependency list
Summary:
DemandedBits was added to the requirements of SLPVectorizer in rL261212
(and various earlier version of it), but the appropriate initialization
statement was accidentally forgotten.

Ref [[ https://github.com/JuliaLang/julia/issues/14998 | JuliaLang/julia#14998 ]].

Patch by Yichao Yu.
Reviewers: mssimpso
Differential Revision: http://reviews.llvm.org/D18152

llvm-svn: 263476
2016-03-14 20:04:24 +00:00
Adam Nemet 5a19ae917b Turn LoopLoadElimination on again
The two issues that were discovered got fixed (r263058, r263173).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263472
2016-03-14 19:40:25 +00:00
Michael Kuperstein b7860fedd4 [AliasSetTracker] Do not strip pointer casts when processing MemSetInst
This fixes PR26843.

llvm-svn: 263462
2016-03-14 18:34:29 +00:00
Chad Rosier 27c352d26d [AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC.
http://reviews.llvm.org/D18125
Patch by Aditya Kumar.

llvm-svn: 263461
2016-03-14 18:24:34 +00:00
Quentin Colombet 40ce25b68b [SpillPlacement] Fix a quadratic behavior in spill placement.
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.

Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
  added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.

llvm-svn: 263460
2016-03-14 18:21:25 +00:00
Chad Rosier 6d98655070 [AArch64] Break the dependency between FP and SP when possible.
When the SP in not changed because of realignment/VLAs etc., we restore the SP
by using the previous value of SP and not the FP. Breaking the dependency will
help in cases when the epilog of a callee is close to the epilog of the caller;
for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of
the callee.

http://reviews.llvm.org/D18060
Patch by Aditya Kumar and Evandro Menezes.

llvm-svn: 263458
2016-03-14 18:17:41 +00:00
Chad Rosier 7a21bb196b [Mips] Fix -Wunused-private-field warning after r263444.
llvm-svn: 263454
2016-03-14 18:10:20 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Tom Stellard 331f981cc9 AMDGPU/SI: Handle wait states required for DPP instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17543

llvm-svn: 263447
2016-03-14 17:05:56 +00:00
Sanjay Patel 62d707c8d9 [x86, AVX] replace masked load with full vector load when possible
Converting masked vector loads to regular vector loads for x86 AVX should always be a win.
I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any
objections.

1. x86 already does this kind of optimization for multiple scalar loads -> vector load.
2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner.

Differential Revision: http://reviews.llvm.org/D18094

llvm-svn: 263446
2016-03-14 16:54:43 +00:00
Daniel Sanders e8efff373a [mips] MIPS32R6 compact branch support
Summary:
MIPSR6 introduces a class of branches called compact branches. Unlike the
traditional MIPS branches which have a delay slot, compact branches do not
have a delay slot. The instruction following the compact branch is only
executed if the branch is not taken and must not be a branch.

It works by generating compact branches for MIPS32R6 when the delay slot
filler cannot fill a delay slot. Then, inspecting the generated code for
forbidden slot hazards (a compact branch with an adjacent branch or other
CTI) and inserting nops to clear this hazard.

Patch by Simon Dardis.

Reviewers: vkalintiris, dsanders

Subscribers: MatzeB, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16353

llvm-svn: 263444
2016-03-14 16:24:05 +00:00
Marek Olsak ed2213e6ef AMDGPU/SI: Incomplete shader binaries need to finish execution at the end
Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D18058

llvm-svn: 263441
2016-03-14 15:57:14 +00:00
Nicolai Haehnle 74127fe8d7 AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence
Summary:
When multiple threads perform an atomic op with the same arguments, they
will usually see different return values.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18101

llvm-svn: 263440
2016-03-14 15:37:18 +00:00
Vasileios Kalintiris 42db3ff47f [mips] Use range-based for loops. NFC.
llvm-svn: 263438
2016-03-14 15:05:30 +00:00