Commit Graph

14709 Commits

Author SHA1 Message Date
Adam Nemet 1428d41f9a [LoopDataPrefetch] Centralize the tuning cl::opts under the pass
This is effectively NFC, minus the renaming of the options
(-cyclone-prefetch-distance -> -prefetch-distance).

The change was requested by Tim in D17943.

llvm-svn: 264806
2016-03-29 23:45:52 +00:00
Anna Zaks 1a470b6f7c [tsan] Do not instrument reads/writes to instruction profile counters.
We have known races on profile counters, which can be reproduced by enabling
-fsanitize=thread and -fprofile-instr-generate simultaneously on a
multi-threaded program. This patch avoids reporting those races by not
instrumenting the reads and writes coming from the instruction profiler.

llvm-svn: 264805
2016-03-29 23:19:40 +00:00
Duncan P. N. Exon Smith e8eb94a9a5 ADCE: Remove debug info intrinsics in dead scopes
During ADCE, track which debug info scopes still have live references
from the code, and delete debug info intrinsics for the dead ones.

These intrinsics describe the locations of variables (in registers or
stack slots).  If there's no code left corresponding to a variable's
scope, then there's no way to reference the variable in the debugger and
it doesn't matter what its value is.

I add a DEBUG printout when the described location in an SSA register,
in case it helps some trying to track down why locations get lost.
However, we still delete these; the scope itself isn't attached to any
real code, so the ship has already sailed.

llvm-svn: 264800
2016-03-29 22:57:12 +00:00
Adam Nemet 85fba39390 [LoopDataPrefetch] Make more member functions private, NFC.
llvm-svn: 264798
2016-03-29 22:40:02 +00:00
Teresa Johnson b703c77b03 [ThinLTO] Remove post-pass metadata linking support
Since we have moved to a model where functions are imported in bulk from
each source module after making summary-based importing decisions, there
is no longer a need to link metadata as a postpass, and all users have
been removed.

This essentially reverts r255909 and follow-on fixes.

llvm-svn: 264763
2016-03-29 18:24:19 +00:00
Justin Lebar 3db0b85fc8 Make InlineSimple's one-arg constructor explicit. NFC
llvm-svn: 264744
2016-03-29 16:26:06 +00:00
Justin Lebar bd145b3cb2 Reformat a comment in InlineSimple.cpp. NFC
llvm-svn: 264743
2016-03-29 16:26:03 +00:00
Teresa Johnson efeae0e210 [ThinLTO] Use new GlobalValue::getGUID helper (NFC)
This was already being used for functions and aliases, was missed when
handling global variables.

llvm-svn: 264734
2016-03-29 14:49:26 +00:00
Hyojin Sung 4673f10568 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
    
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.

llvm-svn: 264697
2016-03-29 04:08:57 +00:00
Evgeniy Stepanov f575b2687c Remove personality for declarations in CloneModule.
Personality is copied as part of copyFunctionAttributes, but it is
invalid on a declaration. Remove the personality attribute it the
function body is not cloned.

Also add a verifier run over output modules in the llvm-split tool.

llvm-svn: 264667
2016-03-28 21:37:02 +00:00
Ryan Govostes 653f9d0273 [asan] Support dead code stripping on Mach-O platforms
On OS X El Capitan and iOS 9, the linker supports a new section
attribute, live_support, which allows dead stripping to remove dead
globals along with the ASAN metadata about them.

With this change __asan_global structures are emitted in a new
__DATA,__asan_globals section on Darwin.

Additionally, there is a __DATA,__asan_liveness section with the
live_support attribute. Each entry in this section is simply a tuple
that binds together the liveness of a global variable and its ASAN
metadata structure. Thus the metadata structure will be alive if and
only if the global it references is also alive.

Review: http://reviews.llvm.org/D16737
llvm-svn: 264645
2016-03-28 20:28:57 +00:00
Reid Kleckner ba85781f58 Revert "[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops"
This reverts commit r264596.

It does not compile.

llvm-svn: 264604
2016-03-28 18:07:40 +00:00
Hyojin Sung 0ada5b0d14 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being 
applied (e.g., transforming nested loops into simplified forms and loop vectorization).

The patch augments the existing PHI node-based check by adding a pre-test if the BB actually 
belongs to a set of loop headers and not eliminating it if yes. 

llvm-svn: 264596
2016-03-28 17:22:25 +00:00
Rong Xu 6090afd744 [PGO] Don't set the function hotness attribute when populating counters
Don't set the function hotness attribute on the fly. This changes the CFG
branch probability of the caller function, which leads to inconsistent BB
ordering. This patch moves the attribute setting to a separated loop after
 the counts in all functions are populated.

Fixes PR27024 - PGO instrumentation profile data is not reflected in correct
basic blocks.

Differential Revision: http://reviews.llvm.org/D18491

llvm-svn: 264594
2016-03-28 17:08:56 +00:00
Davide Italiano 6db1dcbf6b [SimplifyLibCalls] Transform printf("%s", "a") -> putchar('a').
llvm-svn: 264588
2016-03-28 15:54:01 +00:00
Hal Finkel 5c83a090bc [SROA] Fix typo in comment
llvm-svn: 264573
2016-03-28 11:23:21 +00:00
Hal Finkel 29f5131daf C++11 is required, remove some preprocessor checks for it
We require C++11 to build, so remove a few remaining preprocessor checks for
'__cplusplus >= 201103L'. This should always be true.

llvm-svn: 264572
2016-03-28 11:13:03 +00:00
Teresa Johnson d29478f70e [ThinLTO] Add optional import message and statistics
Summary:
Add a statistic to count the number of imported functions. Also, add a
new -print-imports option to emit a trace of imported functions, that
works even for an NDEBUG build.

Note that emitOptimizationRemark does not work for the above printing as
it expects a Function object and DebugLoc, neither of which we have
with summary-based importing.

This is part 2 of D18487, the first part was committed separately as
r264536.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18487

llvm-svn: 264537
2016-03-27 15:27:30 +00:00
Teresa Johnson 9aae395fa8 [ThinLTO] Don't try to import alias unless aliasee can be imported
With r264503, aliases are now being added to the GlobalsToImport set
even when their aliasees can't be imported due to their linkage type.
While the importing worked correctly (the aliases imported as
declarations) due to the logic in doImportAsDefinition, there is no
point to adding them to the GlobalsToImport set.

Additionally, with D18487 it was resulting in incorrectly printing a
message indicating that the alias was imported.

To avoid this, delay adding aliases to the GlobalsToImport set until
after the linkage type of the aliasee is checked.

This patch is part of D18487.

llvm-svn: 264536
2016-03-27 15:01:11 +00:00
Sanjay Patel 796db35f62 [SimplifyCFG] propagate branch metadata when creating select (PR26636)
llvm-svn: 264527
2016-03-26 23:30:50 +00:00
Mehdi Amini 01e321306b ThinLTO: use the callgraph from the combined index to drive the FunctionImporter
Summary:
Now that the summary contains the full reference/call graph, we can
replace the existing function importer that loads and inspect the IR
to iteratively walk the call graph by a traversal based purely on the
summary information. Decouple the actual importing decision from any
IR manipulation.

Reviewers: tejohnson

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18343

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264503
2016-03-26 05:40:34 +00:00
Sanjoy Das d4c783335b [RS4GC] Lower calls to @llvm.experimental.deoptimize
This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize``
to gc.statepoints wrapping ``__llvm_deoptimize``, and changes
``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize``
as a non GC leaf function.

I've had to hard code the ``"__llvm_deoptimize"`` name in
RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only
during codegen.  This isn't without precedent in the codebase, so I'm
not overtly concerned.

llvm-svn: 264456
2016-03-25 20:12:13 +00:00
David L Kreitzer 8d441eb936 Enable non-power-of-2 #pragma unroll counts.
Patch by Evgeny Stupachenko.

Differential Revision: http://reviews.llvm.org/D18202

llvm-svn: 264407
2016-03-25 14:24:52 +00:00
David Majnemer e09d035dad [LoopStrengthReduce] Don't hoist into a catchswitch
We try to hoist the insertion point as high as possible to encourage
sharing.  However, we must be careful not to hoist into a catchswitch as
it is both an EHPad and a terminator.

llvm-svn: 264344
2016-03-24 21:40:22 +00:00
Adam Nemet 7aba60c853 [LLE] Check for mismatching types between the store and the load earlier
isDependenceDistanceOfOne asserts that the store and the load access
through the same type.  This function is also used by
removeDependencesFromMultipleStores so we need to make sure we filter
out mismatching types before reaching this point.

Now we do this when the initial candidates are gathered.

This is a refinement of the fix made in r262267.

Fixes PR27048.

llvm-svn: 264313
2016-03-24 17:59:26 +00:00
Mike Aizatsky 9987f43ffa [sancov] code readability improvement.
Summary: Reply to http://reviews.llvm.org/D18341

Differential Revision: http://reviews.llvm.org/D18406

llvm-svn: 264213
2016-03-23 23:15:03 +00:00
George Burgess IV 0e4898685f Fix bugs in the MemorySSA walker.
There are a few bugs in the walker that this patch addresses.
Primarily:
- Caching can break when we have multiple BBs without phis
- We weren't optimizing some phis properly
- Because of how the DFS iterator works, there were times where we
  wouldn't cache any results of our DFS

I left the test cases with FIXMEs in, because I'm not sure how much
effort it will take to get those to work (read: We'll probably
ultimately have to end up redoing the walker, or we'll have to come up
with some creative caching tricks), and more test coverage = better.

Differential Revision: http://reviews.llvm.org/D18065

llvm-svn: 264180
2016-03-23 18:31:55 +00:00
Junmo Park 820964e9c6 Minor code cleanup. NFC.
llvm-svn: 264124
2016-03-23 01:38:35 +00:00
Davide Italiano 1a911e204d [ModuleUtils] Use range-based loop. NFC.
llvm-svn: 264122
2016-03-23 00:43:35 +00:00
Adam Nemet 8b47e0d0ea [LoopVersioning] Relax an assert for LCSSA PHIs
When you have multiple LCSSA (single-operand) PHIs that are converted
into two-operand PHIs due to versioning, only assert that the PHI
currently being converted has a single operand.  I.e. we don't want to
check PHIs that were converted earlier in the loop.

Fixes PR27023.

Thanks to Karl-Johan Karlsson for the minimized testcase!

llvm-svn: 264081
2016-03-22 18:38:15 +00:00
Zinovy Nis 07ac2bd4d0 [PATCH] Force LoopReroll to reset the loop trip count value after reroll.
It's a bug fix. 
For rerolled loops SE trip count remains unchanged. It leads to incorrect work of the next passes.
My patch just resets SE info for rerolled loop forcing SE to re-evaluate it next time it requested.
I also added a verifier call in the exisitng test to be sure no invalid SE data remain. Without my fix this test would fail with -verify-scev.

Differential Revision: http://reviews.llvm.org/D18316

llvm-svn: 264051
2016-03-22 13:50:57 +00:00
Mike Aizatsky 602f79275d [sancov] do not instrument nodes that are full pre-dominators
Summary:
Without tree pruning clang has 2,667,552 points.
Wiht only dominators pruning: 1,515,586.
With both dominators & predominators pruning: 1,340,534.

Resubmit of r262103.

Differential Revision: http://reviews.llvm.org/D18341

llvm-svn: 264003
2016-03-21 23:08:16 +00:00
George Burgess IV 3887a41725 [MemorySSA] Consider def-only BBs for live-in calculations.
If we have a BB with only MemoryDefs, live-in calculations will ignore
it. This means we get results like this:

define void @foo(i8* %p) {
  ; 1 = MemoryDef(liveOnEntry)
  store i8 0, i8* %p
  br i1 undef, label %if.then, label %if.end

if.then:
  ; 2 = MemoryDef(1)
  store i8 1, i8* %p
  br label %if.end

if.end:
  ; 3 = MemoryDef(1)
  store i8 2, i8* %p
  ret void
}

...When there should be a MemoryPhi in the `if.end` BB.

This patch fixes that behavior.

llvm-svn: 263991
2016-03-21 21:25:39 +00:00
Chad Rosier 2e5c526bb1 [SLP] Remove unnecessary member variables by using container APIs.
This changes the debug output, but still retains its usefulness.
Differential Revision: http://reviews.llvm.org/D18324

llvm-svn: 263975
2016-03-21 19:47:44 +00:00
David Majnemer abae6b588b [SimplifyLibCalls] Only consider sinpi/cospi functions within the same function
The sinpi/cospi can be replaced with sincospi to remove unnecessary
computations.  However, we need to make sure that the calls are within
the same function!

This fixes PR26993.

llvm-svn: 263875
2016-03-19 04:53:02 +00:00
David Majnemer cdf2873e36 [InstCombine] Don't insert instructions before a catch switch
CatchSwitches are not splittable, we cannot insert casts, etc. before
them.

This fixes PR26992.

llvm-svn: 263874
2016-03-19 04:39:52 +00:00
Mehdi Amini 8d05185a26 Rework linkInModule(), making it oblivious to ThinLTO
Summary:
ThinLTO is relying on linkInModule to import selected function.
However a lot of "magic" was hidden in linkInModule and the IRMover,
who would rename and promote global variables on the fly.

This is moving to an approach where the steps are decoupled and the
client is reponsible to specify the list of globals to import.
As a consequence some test are changed because they were relying on
the previous behavior which was importing the definition of *every*
single global without control on the client side.
Now the burden is on the client to decide if a global has to be imported
or not.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18122

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263863
2016-03-19 00:40:31 +00:00
Mike Aizatsky 759aca01ce [sancov] clang-formatting SanitizerCoverage.cpp and fully pleasing clang-tidy.
Differential Revision: http://reviews.llvm.org/D18288

llvm-svn: 263852
2016-03-18 23:29:29 +00:00
Chandler Carruth 3006115cfe Revert "Revert "[sancov] specifying sanitizer coverage dependencies.""
This reverts commit r263825, re-instating r263797.

llvm-svn: 263847
2016-03-18 22:43:42 +00:00
Chandler Carruth e2b7021a91 [sancov] Fix the sancov pass to initialize itself inside its
constructor. This should fix the recent crashes on certain
architectures.

llvm-svn: 263845
2016-03-18 22:35:58 +00:00
Sanjoy Das 74af78e3b0 [IndVars] Make the fix for PR26973 more obvious; NFCI
llvm-svn: 263828
2016-03-18 20:37:11 +00:00
Sanjoy Das 60fb899f28 [IndVars] Pass the right loop to isLoopInvariantPredicate
The loop on IVOperand's incoming values assumes IVOperand to be an
induction variable on the loop over which `S Pred X` is invariant;
otherwise loop invariant incoming values to IVOperand are not guaranteed
to dominate the comparision.

This fixes PR26973.

llvm-svn: 263827
2016-03-18 20:37:07 +00:00
Mike Aizatsky 075ed3eec1 Revert "[sancov] specifying sanitizer coverage dependencies."
This fails on arm.

This reverts commit 52c8e0f7119d1ea1050c0708565a8c92b73386d2.

llvm-svn: 263825
2016-03-18 20:34:58 +00:00
Mike Aizatsky 4f7994c8cb [sancov] specifying sanitizer coverage dependencies.
Summary:
These dependencies would be used in the future to reduce the number
of instrumented blocks(http://reviews.llvm.org/rL262103)

This is submitted as a separate CL because of previous problems with
ARM.

Subscribers: aemerson

Differential Revision: http://reviews.llvm.org/D18227

llvm-svn: 263797
2016-03-18 17:33:21 +00:00
Adam Nemet 709e3046ee [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead
Summary:
It can hurt performance to prefetch ahead too much.  Be conservative for
now and don't prefetch ahead more than 3 iterations on Cyclone.

Reviewers: hfinkel

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17949

llvm-svn: 263772
2016-03-18 00:27:43 +00:00
Adam Nemet 6d8beeca53 [LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses
Summary:
And use this TTI for Cyclone.  As it was explained in the original RFC
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW
prefetcher work up to 2KB strides.

I am also adding tests for this and the previous change (D17943):

* Cyclone prefetching accesses with a large stride
* Cyclone not prefetching accesses with a small stride
* Generic Aarch64 subtarget not prefetching either

Reviewers: hfinkel

Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17945

llvm-svn: 263771
2016-03-18 00:27:38 +00:00
Adam Nemet b0c4eae073 [LoopVectorize] Annotate versioned loop with noalias metadata
Summary:
Use the new LoopVersioning facility (D16712) to add noalias metadata in
the vector loop if we versioned with memchecks.  This can enable some
optimization opportunities further down the pipeline (see the included
test or the benchmark improvement quoted in D16712).

The test also covers the bug I had in the initial version in D16712.

The vectorizer did not previously use LoopVersioning.  The reason is
that the vectorizer performs its transformations in single shot.  It
creates an empty single-block vector loop that it then populates with
the widened, if-converted instructions.  Thus creating an intermediate
versioned scalar loop seems wasteful.

So this patch (rather than bringing in LoopVersioning fully) adds a
special interface to LoopVersioning to allow the vectorizer to add
no-alias annotation while still performing its own versioning.

As the vectorizer propagates metadata from the instructions in the
original loop to the vector instructions we also check the pointer in
the original instruction and see if LoopVersioning can add no-alias
metadata based on the issued memchecks.

Reviewers: hfinkel, nadav, mzolotukhin

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17191

llvm-svn: 263744
2016-03-17 20:32:37 +00:00
Adam Nemet 5eccf07df3 [LoopVersioning] Annotate versioned loop with noalias metadata
Summary:
If we decide to version a loop to benefit a transformation, it makes
sense to record the now non-aliasing accesses in the newly versioned
loop.  This allows non-aliasing information to be used by subsequent
passes.

One example is 456.hmmer in SPECint2006 where after loop distribution,
we vectorize one of the newly distributed loops.  To vectorize we
version this loop to fully disambiguate may-aliasing accesses.  If we
add the noalias markers, we can use the same information in a later DSE
pass to eliminate some dead stores which amounts to ~25% of the
instructions of this hot memory-pipeline-bound loop.  The overall
performance improves by 18% on our ARM64.

The scoped noalias annotation is added in LoopVersioning.  The patch
then enables this for loop distribution.  A follow-on patch will enable
it for the vectorizer.  Eventually this should be run by default when
versioning the loop but first I'd like to get some feedback whether my
understanding and application of scoped noalias metadata is correct.

Essentially my approach was to have a separate alias domain for each
versioning of the loop.  For example, if we first version in loop
distribution and then in vectorization of the distributed loops, we have
a different set of memchecks for each versioning.  By keeping the scopes
in different domains they can conveniently be defined independently
since different alias domains don't affect each other.

As written, I also have a separate domain for each loop.  This is not
necessary and we could save some metadata here by using the same domain
across the different loops.  I don't think it's a big deal either way.

Probably the best is to review the tests first to see if I mapped this
problem correctly to scoped noalias markers.  I have plenty of comments
in the tests.

Note that the interface is prepared for the vectorizer which needs the
annotateInstWithNoAlias API.  The vectorizer does not use LoopVersioning
so we need a way to pass in the versioned instructions.  This is also
why the maps have to become part of the object state.

Also currently, we only have an AA-aware DSE after the vectorizer if we
also run the LTO pipeline.  Depending how widely this triggers we may
want to schedule a DSE toward the end of the regular pass pipeline.

Reviewers: hfinkel, nadav, ashutosh.nema

Subscribers: mssimpso, aemerson, llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16712

llvm-svn: 263743
2016-03-17 20:32:32 +00:00
Guozhi Wei 7b390ec4cd [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 263734
2016-03-17 18:47:20 +00:00
Sanjoy Das c9058ca9e0 [Statepoints] Export a magic constant into a header; NFC
llvm-svn: 263733
2016-03-17 18:42:17 +00:00
Sanjay Patel 9e23fedaf0 propagate 'unpredictable' metadata on select instructions
This is similar to D18133 where we allowed profile weights on select instructions. 
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.

A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648

Differential Revision: http://reviews.llvm.org/D18220

llvm-svn: 263716
2016-03-17 15:30:52 +00:00
Sanjoy Das 312038872d [Statepoints] Separate out logic for statepoint directives; NFC
This splits out the logic that maps the `"statepoint-id"` attribute into
the actual statepoint ID, and the `"statepoint-num-patch-bytes"`
attribute into the number of patchable bytes the statpeoint is lowered
into.  The new home of this logic is in IR/Statepoint.cpp, and this
refactoring will support similar functionality when lowering calls with
deopt operand bundles in the future.

llvm-svn: 263685
2016-03-17 01:56:10 +00:00
Chad Rosier fea398188c [SLP] Make DataLayout a member variable.
llvm-svn: 263656
2016-03-16 19:48:42 +00:00
Geoff Berry 56fabf9b55 Revert "[LSR] Create fewer redundant instructions."
This reverts commit r263644.  Investigating bootstrap failures.

llvm-svn: 263655
2016-03-16 19:21:47 +00:00
Evgeniy Stepanov 4b96ed693a [msan] Add a comment with a bug link.
llvm-svn: 263645
2016-03-16 17:39:17 +00:00
Geoff Berry 459b750871 [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Reviewers: atrick

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18001

llvm-svn: 263644
2016-03-16 17:29:49 +00:00
Haicheng Wu 7873857a88 [JumpThreading] See through Cast Instructions
To capture more jump-thread opportunity.

llvm-svn: 263618
2016-03-16 04:52:52 +00:00
Haicheng Wu 64d9d7c3f7 Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()"
Not sure it handles undef properly.

llvm-svn: 263605
2016-03-15 23:38:47 +00:00
Adam Nemet c979c6e123 Turn LoopLoadElimination on again
The latent bug that LLE exposed in the LoopVectorizer was resolved
(PR26952).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263595
2016-03-15 22:26:12 +00:00
Bjorn Steinbrink 37ca462508 Also handle the new Rust pers fn to isCatchAll()
llvm-svn: 263585
2016-03-15 20:57:07 +00:00
Evgeniy Stepanov d6e91369d8 [msan] Don't put module constructors in comdats.
There is something strange going on with debug info (.eh_frame_hdr)
disappearing when msan.module_ctor are placed in comdat sections.

Moving this functionality under flag, disabled by default.

llvm-svn: 263579
2016-03-15 20:25:47 +00:00
Adam Nemet fdb20595a1 [LV] Preserve LoopInfo when store predication is used
This was a latent bug that got exposed by the change to add LoopSimplify
as a dependence to LoopLoadElimination.  Since LoopInfo was corrupted
after LV, LoopSimplify mis-compiled nbench in the test-suite (more
details in the PR).

The problem was that when we create the blocks for predicated stores we
didn't add those to any loops.

The original testcase for store predication provides coverage for this
assuming we verify LI on the way out of LV.

Fixes PR26952.

llvm-svn: 263565
2016-03-15 18:06:20 +00:00
Benjamin Kramer 96f4b12880 [GlobalOpt] Don't look through aliases when sorting names of globals.
If both are different aliases to the same value the sorting becomes
non-deterministic as array_pod_sort is not stable.

llvm-svn: 263550
2016-03-15 14:18:26 +00:00
Chad Rosier ebe559019b [SLP] Update comment to reflect reality. NFC.
llvm-svn: 263548
2016-03-15 13:27:58 +00:00
Eric Christopher 257338ff0f Use some braces to format this a little better.
llvm-svn: 263527
2016-03-15 03:01:31 +00:00
Eric Christopher ee00abe5e6 Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest
parentheses around '&&' within '||' [-Werror=parentheses].

llvm-svn: 263525
2016-03-15 02:19:06 +00:00
Teresa Johnson b43027d1e0 Move global ID computation from Function to GlobalValue (NFC)
Since the static getGlobalIdentifier and getGUID methods are now called
for global values other than functions, reflect that by moving these
methods to the GlobalValue class.

llvm-svn: 263524
2016-03-15 02:13:19 +00:00
Teresa Johnson 26ab5772b0 [ThinLTO] Renaming of function index to module summary index (NFC)
(Resubmitting after fixing missing file issue)

With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263513
2016-03-15 00:04:37 +00:00
Justin Lebar 6827de19b2 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Amaury Sechet bdb261b4c0 Imporove load to store => memcpy
Summary: This now try to reorder instructions in order to help create the optimizable pattern.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Differential Revision: http://reviews.llvm.org/D16523

llvm-svn: 263503
2016-03-14 22:52:27 +00:00
Teresa Johnson cec0cae313 Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"
This reverts commit r263490. Missed a file.

llvm-svn: 263493
2016-03-14 21:18:10 +00:00
Teresa Johnson 892920b358 [ThinLTO] Renaming of function index to module summary index (NFC)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263490
2016-03-14 21:05:56 +00:00
Adam Nemet bb45810e4f Revert "Turn LoopLoadElimination on again"
This reverts commit r263472.

There is an LNT failure on clang-ppc64be-linux-lnt.  Turn this off,
while I am investigating.

llvm-svn: 263485
2016-03-14 20:38:55 +00:00
Sanjay Patel ee52b6e77d allow branch weight metadata on select instructions (PR26636)
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636

This doesn't accomplish anything on its own. It's the first step towards preserving 
and using branch weights with selects.

The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to 
determine how to correctly propagate the weights.

Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.

The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.

Differential Revision: http://reviews.llvm.org/D18133

llvm-svn: 263482
2016-03-14 20:18:59 +00:00
Justin Lebar 9d94397859 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

llvm-svn: 263481
2016-03-14 20:18:54 +00:00
Keno Fischer a91ae8336b [SLPVectorizer] Fix dependency list
Summary:
DemandedBits was added to the requirements of SLPVectorizer in rL261212
(and various earlier version of it), but the appropriate initialization
statement was accidentally forgotten.

Ref [[ https://github.com/JuliaLang/julia/issues/14998 | JuliaLang/julia#14998 ]].

Patch by Yichao Yu.
Reviewers: mssimpso
Differential Revision: http://reviews.llvm.org/D18152

llvm-svn: 263476
2016-03-14 20:04:24 +00:00
Adam Nemet 5a19ae917b Turn LoopLoadElimination on again
The two issues that were discovered got fixed (r263058, r263173).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263472
2016-03-14 19:40:25 +00:00
Chad Rosier 00bd82cade [CVP] Replace nonnegative with positive, per Philip's request. NFC.
llvm-svn: 263430
2016-03-14 13:48:00 +00:00
Haicheng Wu d60ae33d29 [CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative
The motivating example is this

for (j = n; j > 1; j = i) {
   i = j / 2;
}

The signed division is safely to be changed to an unsigned division (j is known
to be larger than 1 from the loop guard) and later turned into a single shift
without considering the sign bit.

llvm-svn: 263406
2016-03-14 03:24:28 +00:00
Mehdi Amini ba9fba81d6 Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Sanjay Patel 5658b58936 remove unnecessary cast; NFC
llvm-svn: 263343
2016-03-12 18:17:41 +00:00
Sanjay Patel 2e0027706a fix formatting; NFC
llvm-svn: 263342
2016-03-12 18:05:53 +00:00
Sanjay Patel 5781d840dd use range loops; NFCI
llvm-svn: 263341
2016-03-12 16:52:17 +00:00
Sanjay Patel c4acbae63f [x86, InstCombine] delete x86 SSE2 masked store with zero mask
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.

llvm-svn: 263340
2016-03-12 15:16:59 +00:00
Eric Christopher 35abd051c0 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
George Burgess IV b42b762bca [MemorySSA] Make a return type reflect reality. NFC.
llvm-svn: 263286
2016-03-11 19:34:03 +00:00
Sanjoy Das b51325dbdb Introduce @llvm.experimental.deoptimize
Summary:
This intrinsic, together with deoptimization operand bundles, allow
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality.  In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.

Note: this change does not address how `@llvm.experimental_deoptimize`
is lowered.  That will be done in a later change.

Reviewers: chandlerc, rnk, atrick, reames

Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet

Differential Revision: http://reviews.llvm.org/D17732

llvm-svn: 263281
2016-03-11 19:08:34 +00:00
Vedant Kumar e5a9a275d3 [PGO] Skip value profile instrumentation of inline asm
Value profile instrumentation treats inline asm calls like they are
indirect calls. This causes problems when the 'Callee' is passed to a
ptrtoint cast -- the verifier rightly claims that this is bogus and
crashes opt.

llvm-svn: 263278
2016-03-11 18:57:48 +00:00
Teresa Johnson 76a1c1d0ba [ThinLTO] Support for reference graph in per-module and combined summary.
Summary:
This patch adds support for including a full reference graph including
call graph edges and other GV references in the summary.

The reference graph edges can be used to make importing decisions
without materializing any source modules, can be used in the plugin
to make file staging decisions for distributed build systems, and is
expected to have other uses.

The call graph edges are recorded in each function summary in the
bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
there is PGO, where the ValueId can be mapped to the function GUID via
the ValueSymbolTable. In the function index in memory, the call graph
edges reference the target via the CalleeGUID instead of the
CalleeValueId.

The reference graph edges are recorded in each summary record with a
list of referenced value IDs, which can be mapped to value GUID via the
ValueSymbolTable.

Addtionally, a new summary record type is added to record references
from global variable initializers. A number of bitcode records and data
structures have been renamed to reflect the newly expanded scope of the
summary beyond functions. More cleanup will follow.

Reviewers: joker.eph, davidxl

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17212

llvm-svn: 263275
2016-03-11 18:52:24 +00:00
Mehdi Amini 99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Mehdi Amini 1e9c925182 Do not specialize IRBuilder to strip names in SROA
Summary:
Following r263086, we are replacing this by a runtime check.
More cleanup will follow on the IRBuilder itself, but I submitted
this patch separately as SROA has a fancy "prefixInserter" class
that needs extra-love.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18022

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263256
2016-03-11 17:15:34 +00:00
Chandler Carruth ace8c8f765 [PM] Sink the "Expression" type for GVN into the class as a private
member type.

Because of how this type is used by the ValueTable, it cannot actually
have hidden visibility. GCC actually nicely warns about this but Clang
just silently ... I don't even know. =/ We should do a better job either
way though.

This should resolve a bunch of the GCC warnings about visibility that
the port of GVN triggered and make the visibility story a bit more
correct.

llvm-svn: 263250
2016-03-11 16:25:19 +00:00
Chandler Carruth 3bc9c7fb45 [PM] The order of evaluation of these analyses is actually significant,
much to my horror, so use variables to fix it in place.

This terrifies me. Both basic-aa and memdep will provide more precise
information when the domtree and/or the loop info is available. Because
of this, if your pass (like GVN) requires domtree, and then queries
memdep or basic-aa, it will get more precise results. If it does this in
the other order, it gets less precise results.

All of the ideas I have for fixing this are, essentially, terrible. Here
I've just caused us to stop having unspecified behavior as different
implementations evaluate the order of these arguments differently. I'm
actually rather glad that they do, or the fragility of memdep and
basic-aa would have gone on unnoticed. I've left comments so we don't
immediately break this again. This should fix bots whose host compilers
evaluate the order of arguments differently from Clang.

llvm-svn: 263231
2016-03-11 13:26:47 +00:00
Chandler Carruth b47f8010a9 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Benjamin Kramer c126353473 [InstCombine] Use Twines to generate names.
Since the names are used in a loop this does more work in debug builds. In
release builds value names are generally discarded so we don't have to do
the concatenation at all. It's also simpler code, no functional change
intended.

llvm-svn: 263215
2016-03-11 10:20:56 +00:00
Chandler Carruth 89c45a162f [PM] Port GVN to the new pass manager, wire it up, and teach a couple of
tests to run GVN in both modes.

This is mostly the boring refactoring just like SROA and other complex
transformation passes. There is some trickiness in that GVN's
ValueNumber class requires hand holding to get to compile cleanly. I'm
open to suggestions about a better pattern there, but I tried several
before settling on this. I was trying to balance my desire to sink as
much implementation detail into the source file as possible without
introducing overly many layers of abstraction.

Much like with SROA, the design of this system is made somewhat more
cumbersome by the need to support both pass managers without duplicating
the significant state and logic of the pass. The same compromise is
struck here.

I've also left a FIXME in a doxygen comment as the GVN pass seems to
have pretty woeful documentation within it. I'd like to submit this with
the FIXME and let those more deeply familiar backfill the information
here now that we have a nice place in an interface to put that kind of
documentaiton.

Differential Revision: http://reviews.llvm.org/D18019

llvm-svn: 263208
2016-03-11 08:50:55 +00:00
Pete Cooper adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Adam Nemet efb234135c [LLE] Add missed LoopSimplify dependence
The code assumed that we always had a preheader without making the pass
dependent on LoopSimplify.

Thanks to Mattias Eriksson V for reporting this.

llvm-svn: 263173
2016-03-10 23:54:39 +00:00
Chandler Carruth 37f1f12226 [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out
of, and I misdiagnosed for months and months.

Andrea has had a patch for this forever, but I just couldn't see how
it was fixing the root cause of the problem. It didn't make sense to me,
even though the patch was perfectly good and the analysis of the actual
failure event was *fantastic*.

Well, I came back to it today because the patch has sat for *far* too
long and needs attention and decided I wouldn't let it go until I really
understood what was going on. After quite some time in the debugger,
I finally realized that in fact I had just missed an important case with
my previous attempt to fix PR22093 in r225149. Not only do we need to
handle loads that won't be split, but stores-of-loads that we won't
split. We *do* actually have enough logic in the presplitting to form
new slices for split stores.... *unless* we decided not to split them!

I'm so sorry that it took me this long to come to the realization that
this is the issue. It seems so obvious in hind sight (of course).
Anyways, the fix becomes *much* smaller and more focused. The fact that
we're left doing integer smashing is related to the FIXME in my original
commit: fundamentally, we're not aggressive about pre-splitting for
loads and stores to the same alloca. If we want to get aggressive about
this, it'll need both what Andrea had put into the proposed fix, but
also a *lot* more logic to essentially iteratively pre-split the alloca
until we can't do any more. As I said in that commit log, its really
unclear that this is the right call. Instead, the integer blending and
letting targets lower this to narrower stores seems slightly better. But
we definitely shouldn't really go down that path just to fix this bug.

Again, tons of thanks are owed to Andrea and others at Sony for working
on this bug. I really should have seen what was going on here and
re-directed them sooner. =////

llvm-svn: 263121
2016-03-10 15:31:17 +00:00
Chandler Carruth d94a5962cc [SROA] Clean up some really weird code, no functionality changed.
We already have the instruction extracted into 'I', just cast that to
a store the way we do for loads. Also, we don't enter the if unless SI
is non-null, so don't test it again for null.

I'm pretty sure the entire test there can be nuked, but this is just the
trivial cleanup.

llvm-svn: 263112
2016-03-10 14:16:18 +00:00