Commit Graph

16998 Commits

Author SHA1 Message Date
Davide Italiano ed67f1978e [NewGVN] clang-format this file after recent changes.
llvm-svn: 292026
2017-01-14 20:15:04 +00:00
Davide Italiano 7cf29dcca5 [NewGVN] Try to be consistent wit the style used in this file. NFCI.
llvm-svn: 292025
2017-01-14 20:13:18 +00:00
Eugene Zelenko 5fa43960f3 [Transforms/Utils] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291983
2017-01-14 00:32:38 +00:00
Daniel Berlin b66164ca34 NewGVN: Kill unneeded DFSDomMap, cleanup a few comments.
llvm-svn: 291981
2017-01-14 00:24:23 +00:00
Sanjay Patel 40f401776b [InstCombine] optimize unsigned icmp of increment
Allows LLVM to optimize sequences like the following:

%add = add nuw i32 %x, 1
%cmp = icmp ugt i32 %add, %y

Into:

%cmp = icmp uge i32 %x, %y

Previously, only signed comparisons were being handled.

Decrements could also be handled, but 'sub nuw %x, 1' is currently canonicalized to
'add %x, -1' in InstCombineAddSub, losing the nuw flag. Removing that canonicalization
seems like it might have far-reaching ramifications so I kept this simple for now.

Patch by Matti Niemenmaa!

Differential Revision: https://reviews.llvm.org/D24700

llvm-svn: 291975
2017-01-13 23:25:46 +00:00
Sanjay Patel 2d4b456427 [InstCombine] use m_APInt to allow lshr folds for vectors with splat constants
llvm-svn: 291972
2017-01-13 23:04:10 +00:00
Daniel Berlin c0431fd02d NewGVN: Move leaders around properly to ensure we have a canonical dominating leader. Fixes PR 31613.
Summary:
This is a testcase where phi node cycling happens, and because we do
not order the leaders by domination or anything similar, the leader
keeps changing.

Using std::set for the members is too expensive, and we actually don't
need them sorted all the time, only at leader changes.

We could keep both a set and a vector, and keep them mostly sorted and
resort as necessary, or use a set and a fibheap, but all of this seems
premature.

After running some statistics, we are able to avoid the vast majority
of sorting by keeping a "next leader" field.  Most congruence classes only have
leader changes once or twice during GVN.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28594

llvm-svn: 291968
2017-01-13 22:40:01 +00:00
David Majnemer bba17390c7 [LoopStrengthReduce] Don't bother rewriting PHIs in catchswitch blocks
The catchswitch instruction cannot be split, don't bother trying to
rewrite it.

This fixes PR31627.

llvm-svn: 291966
2017-01-13 22:24:27 +00:00
David L. Jones 41cecba8e9 "Use" lambda captures which are otherwise only used in asserts. NFC
Summary:
The LLVM coding standards recommend "using" values that are only
needed by asserts:
http://llvm.org/docs/CodingStandards.html#assert-liberally

Without this change, LLVM cannot bootstrap with -Werror as the second
stage fails with this new warning:
https://reviews.llvm.org/rL291905

See also the previous fixes:
https://reviews.llvm.org/rL291916
https://reviews.llvm.org/rL291939
https://reviews.llvm.org/rL291940
https://reviews.llvm.org/rL291941

Reviewers: rsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28695

llvm-svn: 291957
2017-01-13 21:02:41 +00:00
Sanjay Patel acd24c7b6a [InstCombine] use 'match' and other clean-up; NFCI
llvm-svn: 291937
2017-01-13 18:52:10 +00:00
Sanjay Patel b22f6c5f26 [InstCombine] use m_APInt to allow shl folds for vectors with splat constants
llvm-svn: 291934
2017-01-13 18:39:09 +00:00
Sanjay Patel cf08203105 [InstCombine] use Op0/Op1 local variables more consistently with shifts; NFC
llvm-svn: 291923
2017-01-13 18:08:25 +00:00
Malcolm Parsons 17d266bc96 Remove unused lambda captures. NFC
llvm-svn: 291916
2017-01-13 17:12:16 +00:00
Sanjay Patel 5178363687 [InstCombine] if the condition of a select may be known via assumes, eliminate the select
This is a limited solution for PR31512:
https://llvm.org/bugs/show_bug.cgi?id=31512

The motivation is that we will need to increase usage of llvm.assume and/or metadata to solve PR28430:
https://llvm.org/bugs/show_bug.cgi?id=28430

...and this kind of simplification is needed to take advantage of that extra information.

The 'not' test case would be handled by:
https://reviews.llvm.org/D28485

Differential Revision:
https://reviews.llvm.org/D28337

llvm-svn: 291915
2017-01-13 17:02:42 +00:00
Benjamin Kramer 061f4a5fe6 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Evgeniy Stepanov f01c70fec0 [asan] Don't overalign global metadata.
Other than on COFF with incremental linking, global metadata should
not need any extra alignment.

Differential Revision: https://reviews.llvm.org/D28628

llvm-svn: 291859
2017-01-12 23:26:20 +00:00
Evgeniy Stepanov 5d31d08a21 [asan] Refactor instrumentation of globals.
llvm-svn: 291858
2017-01-12 23:03:03 +00:00
Teresa Johnson 83aaf358fd [ThinLTO] Import static functions from the same module as caller
Summary:
We can sometimes end up with multiple copies of a local function that
have the same GUID in the index. This happens when there are local
functions with the same name that are in different source files with the
same name (but in different directories), and they were compiled in
their own directory so had the same path at compile time.

In this case make sure we import the copy in the caller's module. While
it isn't a correctness problem (the renamed reference which is based on the
module IR hash will be unique since the module must have had an
externally visible function that was imported), importing the wrong copy
will result in lost performance opportunity since it won't be referenced
and inlined.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28440

llvm-svn: 291841
2017-01-12 22:04:45 +00:00
Robert Lougher b0124c1eb8 [DebugInfo] Remove redundant check in SimplifyCFG; NFC.
llvm-svn: 291813
2017-01-12 21:11:09 +00:00
Robert Lougher 6717a6fe54 [DebugInfo] DILocation variable declaration should be const; NFC.
llvm-svn: 291787
2017-01-12 18:33:49 +00:00
Robert Lougher f5df7a18dd [DebugInfo] Add const to DILocation variable declaration; NFC.
llvm-svn: 291785
2017-01-12 18:29:28 +00:00
Davide Italiano eac05f6b88 [NewGVN] Fixup store count for the `initial` congruency class.
It was always zero. When we move a store from `initial` to its
own congruency class, we end up with a negative store count, which
is obviously wrong.
Also, while here, change StoreCount to be signed so that the assertions
actually fire.

Ack'ed by Daniel Berlin.

llvm-svn: 291725
2017-01-11 23:41:24 +00:00
Kuba Mracek 503162b4a1 [asan] Set alignment of __asan_global_* globals to sizeof(GlobalStruct)
When using profiling and ASan together (-fprofile-instr-generate -fcoverage-mapping -fsanitize=address), at least on Darwin, the section of globals that ASan emits (__asan_globals) is misaligned and starts at an odd offset. This really doesn't have anything to do with profiling, but it triggers the issue because profiling emits a string section, which can have arbitrary size.  This patch changes the alignment to sizeof(GlobalStruct).

Differential Revision: https://reviews.llvm.org/D28573

llvm-svn: 291715
2017-01-11 22:26:10 +00:00
Davide Italiano 0dc68bfa87 Revert "[NewGVN] Strengthen a couple of assertions."
It's breaking some bots. Will investigate and recommit.

llvm-svn: 291712
2017-01-11 22:00:29 +00:00
Davide Italiano ff69405213 [NewGVN] Parenthesise assertion condition (-Wparenthesis).
Format an assertion message while I'm here.

llvm-svn: 291710
2017-01-11 21:58:42 +00:00
Davide Italiano 6e919df2f5 [NewGVN] Strengthen a couple of assertions.
StoreCount >= 0 on `unsigned` is always true, otherwise.

llvm-svn: 291709
2017-01-11 21:49:00 +00:00
Peter Collingbourne 7636532c1b LowerTypeTests: Represent the memory region size with the constant size-1.
This means that we can use a shorter instruction sequence in the case where
the size is a power of two and on the boundary between two representations.

Differential Revision: https://reviews.llvm.org/D28421

llvm-svn: 291706
2017-01-11 21:32:10 +00:00
Peter Collingbourne 6bca5a0d82 Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.", with a fix for an off-by-one error.
llvm-svn: 291699
2017-01-11 20:28:46 +00:00
Daniel Berlin f6eba4be2c NewGVN: Fix PR31594, by tracking the store count of congruence
classes, and updating checking to allow for equivalence through
reachability.

(Sadly, the checking here is not perfect, and can't be made perfect,
so we'll have to disable it after we are satisfied with correctness.
Right now it is just "very unlikely" to happen.)

llvm-svn: 291698
2017-01-11 20:22:36 +00:00
Daniel Berlin 3a1bd0216a NewGVN: Refactor performCongruenceFinding and split out congruence class moving
llvm-svn: 291697
2017-01-11 20:22:05 +00:00
Rong Xu 20f5df1d70 Resubmit "[PGO] Turn off comdat renaming in IR PGO by default"
This patch resubmits the changes in r291588.

llvm-svn: 291696
2017-01-11 20:19:41 +00:00
Michael Kuperstein f69e64662b [SLP] Remove bogus assert.
The removed assert seems bogus - it's perfectly legal for the roots of the
vectorized subtrees to be equal even if the original scalar values aren't,
if the original scalars happen to be equivalent.

This fixes PR31599.

Differential Revision: https://reviews.llvm.org/D28539

llvm-svn: 291692
2017-01-11 19:23:57 +00:00
Ivan Krasin 42e6b4fd98 Revert rL291205 because it breaks Chrome tests under CFI.
Summary:
Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.

This change separates how type identifiers are resolved from how intrinsic
calls are lowered. All information required to lower an intrinsic call
is stored in a new TypeIdLowering data structure. The idea is that this
data structure can either be initialized using the module itself during
regular LTO, or using the module summary in ThinLTO backends.

Original URL: https://reviews.llvm.org/D28341

Reviewers: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D28532

llvm-svn: 291684
2017-01-11 16:54:04 +00:00
Hal Finkel 8a9a783f2c Make processing @llvm.assume more efficient - Add affected values to the assumption cache
Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.

As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).

Differential Revision: https://reviews.llvm.org/D28459

llvm-svn: 291671
2017-01-11 13:24:24 +00:00
Chandler Carruth 3bab7e1a79 [PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

llvm-svn: 291662
2017-01-11 09:43:56 +00:00
Mohammed Agabaria 2c96c43388 [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. 
In case if the real operands bitwidth <= 16.

Differential Revision: https://reviews.llvm.org/D28104 

llvm-svn: 291657
2017-01-11 08:23:37 +00:00
Chandler Carruth 410eaeb064 [PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

llvm-svn: 291651
2017-01-11 06:23:21 +00:00
Adam Nemet e2aaf3a35e [LICM] Report failing to hoist conditionally-executed loads
These are interesting again because the user may not be aware that this
is a common reason preventing LICM.

A const is removed from an instruction pointer declaration in order to
pass it to ORE.

Differential Revision: https://reviews.llvm.org/D27940

llvm-svn: 291649
2017-01-11 04:39:49 +00:00
Adam Nemet 81941b3195 [LICM] Report failing to hoist a load with an invariant address
These are interesting because lack of precision in alias information
could be standing in the way of this optimization.

An example is the case in the test suite that I showed in the DevMeeting
talk:

http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/MultiSource/Benchmarks/FreeBench/distray/CMakeFiles/distray.dir/html/_org_test-suite_MultiSource_Benchmarks_FreeBench_distray_distray.c.html#L236

canSinkOrHoistInst is also used from LoopSink, which does not use
opt-remarks so we need to take ORE as an optional argument.

Differential Revision: https://reviews.llvm.org/D27939

llvm-svn: 291648
2017-01-11 04:39:45 +00:00
Adam Nemet 358433ce1b [LICM] Report successful hoist/sink/promotion
Differential Revision: https://reviews.llvm.org/D27938

llvm-svn: 291646
2017-01-11 04:39:35 +00:00
Rong Xu acd6360251 Revert "[PGO] Turn off comdat renaming in IR PGO by default"
This patch reverts r291588: [PGO] Turn off comdat renaming in IR PGO by default,
as we are seeing some hash mismatches in our internal tests.

llvm-svn: 291621
2017-01-10 23:54:31 +00:00
Sanjay Patel db0938fd9a [InstCombine] add a wrapper for a common pair of transforms; NFCI
Some of the callers are artificially limiting this transform to integer types;
this should make it easier to incrementally remove that restriction.

llvm-svn: 291620
2017-01-10 23:49:07 +00:00
Florian Hahn 4f9d6d56c0 [loop-unroll] Properly populate LoopInfo for loops cloned in LoopUnrollRuntime.
Summary:
This fixes Transforms/LoopUnroll/runtime-loop3.ll which failed with
EXTENSIVE_DEBUG, because the cloned basic blocks were not added to the
correct sub-loops in LoopUnrollRuntime.cpp.


Reviewers: dexonsmith, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28482

llvm-svn: 291619
2017-01-10 23:43:35 +00:00
Florian Hahn fdea2e420c [loop-unroll] Factor out code to update LoopInfo (NFC).
Move the code to update LoopInfo for cloned basic blocks to
addClonedBlockToLoopInfo, as suggested in 
https://reviews.llvm.org/D28482.

llvm-svn: 291614
2017-01-10 23:24:54 +00:00
Matt Arsenault 3f509042b0 InstCombine: Set operands instead of creating new call
llvm-svn: 291612
2017-01-10 23:17:52 +00:00
Matt Arsenault fdb78f8bae InstCombine: fdiv -x, -y -> fdiv x, y
llvm-svn: 291611
2017-01-10 23:08:54 +00:00
Michael Kuperstein ee31cbe35f [LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation,
which can actually happen.

The reason the test uses the new PM is that the "bad" phi, incidentally, gets
cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve
loop simplify form, so the cleanup has no chance to run.

This fixes PR31190.
We may want to solve this in a less conservative manner, since this phi is
actually uniform within the inner loop (or we may want LICM to output a cleaner
promotion to begin with).

Differential Revision: https://reviews.llvm.org/D28490

llvm-svn: 291589
2017-01-10 19:32:30 +00:00
Rong Xu ef1adad938 [PGO] Turn off comdat renaming in IR PGO by default
Summary:
In IR PGO we append the function hash to comdat functions to avoid the
potential hash mismatch. This turns out not legal in some cases: if the comdat
function is address-taken and used in comparison. Renaming changes the semantic.

This patch turns off comdat renaming by default.

To alleviate the hash mismatch issue, we now rename the profile variable
for comdat functions. Profile allows co-existing multiple versions of profiles
with different hash value. The inlined copy will always has the correct profile
counter. The out-of-line copy might not have the correct count. But we will
not have the bogus mismatch warning.

Reviewers: davidxl

Subscribers: llvm-commits, xur

Differential Revision: https://reviews.llvm.org/D28416

llvm-svn: 291588
2017-01-10 19:30:20 +00:00
Davide Italiano f8711f093e [SimplifyLibCalls] Propagate fast math flags while optimizing pow().
llvm-svn: 291577
2017-01-10 18:02:05 +00:00
Xin Tong 02b1397ac3 Fix a typo and also test a new machine for commit. NFC.
llvm-svn: 291532
2017-01-10 03:13:52 +00:00
Serge Pavlov 0668cd2c95 [StructurizeCfg] Update dominator info.
In some cases StructurizeCfg updates root node, but dominator info
remains unchanges, it causes crash when expensive checks are enabled.
To cope with this problem a new method was added to DominatorTreeBase
that allows adding new root nodes, it is called in StructurizeCfg to
put dominator tree in sync.

This change fixes PR27488.

Differential Revision: https://reviews.llvm.org/D28114

llvm-svn: 291530
2017-01-10 02:50:47 +00:00
Xin Tong 12c8cb3745 Add an assert for hasLoopInvariantOperands
Summary: Add an assert for hasLoopInvariantOperands

Reviewers: danielcdh, sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28501

llvm-svn: 291516
2017-01-10 00:39:49 +00:00
Davide Italiano 472684eaf5 [SimplifyLibCalls] pow(x, -0.5) -> 1.0 / sqrt(x).
Differential Revision:  https://reviews.llvm.org/D28479

llvm-svn: 291486
2017-01-09 21:55:23 +00:00
Matthew Simpson cf796478e9 [LV] Fix-up external IV users after updating dominator tree
This patch delays the fix-up step for external induction variable users until
after the dominator tree has been properly updated. This should fix PR30742.
The SCEVExpander in InductionDescriptor::transform can generate code in the
wrong location if the dominator tree is not up-to-date. We should work towards
keeping the dominator tree up-to-date throughout the transformation.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30742
Differential Revision: https://reviews.llvm.org/D28168

llvm-svn: 291462
2017-01-09 19:05:29 +00:00
Sanjay Patel 940c06188e fix comment typos; NFC
llvm-svn: 291447
2017-01-09 16:27:56 +00:00
Jonas Paulsson cf7543c44b Remove unused method in LoopVectorize.cpp.
computeInterleaveCount() is not defined/used and is therefore removed.

Review: Davide Italiano
llvm-svn: 291423
2017-01-09 06:13:21 +00:00
Daniel Berlin b755aea8eb NewGVN: Fix PR 31573, a failure to verify memory congruency due to
not excluding ourselves when checking if any equivalent stores
exist.

llvm-svn: 291421
2017-01-09 05:34:29 +00:00
Daniel Berlin 2f1fbcc718 NewGVN: Change a std::vector to SmallVector and cleanup naming.
llvm-svn: 291420
2017-01-09 05:34:19 +00:00
Davide Italiano 1a12522e87 [SCCP] Unknown instructions are sent to overdefined anyway. NFCI.
llvm-svn: 291400
2017-01-08 21:19:05 +00:00
Matt Arsenault a7d2194168 SimplifyLibCalls: Remove incorrect optimization of fabs
fabs(x * x) is not generally safe to assume x is positive if x is a NaN.
This is also less general than it could be, so this will be replaced
with a transformation on the intrinsic.

llvm-svn: 291359
2017-01-07 19:55:12 +00:00
Daniel Berlin 32f8d560dd NewGVN: Make sure we properly lookup operand leaders while creating
congruence classes for stores, and then keep them up to date.  Add
testcases.

llvm-svn: 291351
2017-01-07 16:55:14 +00:00
Xin Tong ee5cb65ada Fix a typo. NFC
llvm-svn: 291335
2017-01-07 04:30:58 +00:00
Daniel Berlin 0444343326 NewGVN: Reformat and fix a few newlines
llvm-svn: 291334
2017-01-07 03:23:47 +00:00
Davide Italiano 1b97fc34a4 [NewGVN] Prefer auto over explicit type. NFCI.
llvm-svn: 291328
2017-01-07 02:05:50 +00:00
Peter Collingbourne d79e49d807 LowerTypeTests: Thread summary and action from the API and command line into the pass.
Also move command line handling out of the pass constructor and into
a separate function.

Differential Revision: https://reviews.llvm.org/D28422

llvm-svn: 291323
2017-01-07 01:17:24 +00:00
Daniel Berlin d92e7f9f74 NewGVN: Fix PR 31501.
Summary: LLVM's non-standard notion of phi nodes means we can't both try to substitute for undef in phi nodes *and* use phi nodes as leaders all the time. This changes NewGVN to use the same semantics as SimplifyPHINode to decide which phi nodes are equivalent.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28312

llvm-svn: 291308
2017-01-07 00:01:42 +00:00
Teresa Johnson 9006d52651 [ThinLTO] Handle conflicting local names gracefully
Summary:
r285871 introduced an assert that was overly aggressive in the case
of a same-named local in different same-named files (in different
directories), where the source name and therefore the GUID ended up
the same because the files were compiled in their own directory without
any leading path. Change the handling in the promotion logic to get
the summary for the version in that module.

This also exposed an issue where we are not always importing the
right copy, which is a performance not correctness issue (because
the renaming is based on the module hash which must be different,
see the bug report for details). I will fix that as a follow-on.

Fixes PR31561.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28411

llvm-svn: 291304
2017-01-06 23:38:41 +00:00
Kuba Mracek 316dc70f82 [asan] Change the visibility of ___asan_globals_registered to hidden
This flag is used to track global registration in Mach-O and it doesn't need to be exported and visible.

Differential Revision: https://reviews.llvm.org/D28250

llvm-svn: 291289
2017-01-06 22:02:58 +00:00
Xin Tong 3caaa36ac5 Fix use after free
Summary: Fix use after free in LoopUnswitch

Reviewers: chenli, atrick, hfinkel, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28412

llvm-svn: 291288
2017-01-06 21:49:08 +00:00
Mehdi Amini 27d224fbbb Fix LoopLoadElimination to keep original alignment on the inital hoisted store
This is fixing a bug where Loop Vectorization is widening a load but
with a lower alignment. Hoisting the load without propagating the alignment
will allow inst-combine to later deduce a higher alignment that what the pointer
actually is.

Differential Revision: https://reviews.llvm.org/D28408

llvm-svn: 291281
2017-01-06 21:06:51 +00:00
Wolfgang Pieb c17a279eda [DWARF] Null out the debug locs of (loop invariant) instructions hoisted by LICM in
order to avoid jumpy line tables. Calls are left alone because they may be inlined.

Differential Revision: https://reviews.llvm.org/D28390

llvm-svn: 291258
2017-01-06 18:38:57 +00:00
Filipe Cabecinhas 4647b74b51 [ASan] Make ASan instrument variable-masked loads and stores
Summary: Previously we only supported constant-masked loads and stores.

Reviewers: kcc, RKSimon, pgousseau, gbedwell, vitalybuka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28370

llvm-svn: 291238
2017-01-06 15:24:51 +00:00
Peter Collingbourne 81271b7bd2 LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.
This change separates how type identifiers are resolved from how intrinsic
calls are lowered. All information required to lower an intrinsic call
is stored in a new TypeIdLowering data structure. The idea is that this
data structure can either be initialized using the module itself during
regular LTO, or using the module summary in ThinLTO backends.

Differential Revision: https://reviews.llvm.org/D28341

llvm-svn: 291205
2017-01-06 02:22:47 +00:00
Xin Tong 8b8a600d92 Fix typo. NFC
llvm-svn: 291178
2017-01-05 21:40:08 +00:00
Teresa Johnson 6c475a7595 ThinLTO: add early "dead-stripping" on the Index
Summary:
Using the linker-supplied list of "preserved" symbols, we can compute
the list of "dead" symbols, i.e. the one that are not reachable from
a "preserved" symbol transitively on the reference graph.
Right now we are using this information to mark these functions as
non-eligible for import.

The impact is two folds:
- Reduction of compile time: we don't import these functions anywhere
  or import the function these symbols are calling.
- The limited number of import/export leads to better internalization.

Patch originally by Mehdi Amini.

Reviewers: mehdi_amini, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23488

llvm-svn: 291177
2017-01-05 21:34:18 +00:00
Michael Kuperstein c9acad12e9 [LICM] Allow promotion of some stores that are not guaranteed to execute.
Promotion is always legal when a store within the loop is guaranteed to execute.

However, this is not a necessary condition - for promotion to be memory model
semantics-preserving, it is enough to have a store that dominates every exit
block. This is because if the store dominates every exit block, the fact the
exit block was executed implies the original store was executed as well.

Differential Revision: https://reviews.llvm.org/D28147

llvm-svn: 291171
2017-01-05 20:42:06 +00:00
Andrew Kaylor 7353cf4623 [LICM] Small update to note changes made in hoistRegion
Differential Revision: https://reviews.llvm.org/D28363

llvm-svn: 291157
2017-01-05 18:53:24 +00:00
Xin Tong 9efb049fb3 Remove a unnecessary hasLoopInvariantOperands check in loop sink.
Summary:
Preheader instruction's operands will always be invariant w.r.t. the loop which its the preheader
for.

Memory aliases are handled in canSinkOrHoistInst.

Reviewers: danielcdh, davidxl

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28270

llvm-svn: 291132
2017-01-05 16:52:37 +00:00
Teresa Johnson 2b60384581 [ThinLTO] Add parenthesis as per build warning
Fixes a warning about "||" and "&&" due to r291108.

llvm-svn: 291119
2017-01-05 15:10:10 +00:00
Teresa Johnson 519465b993 [ThinLTO] Subsume all importing checks into a single flag
Summary:
This adds a new summary flag NotEligibleToImport that subsumes
several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal
and IsNotViableToInline). It also subsumes the checking of references
on the summary that was being done during the thin link by
eligibleForImport() for each candidate. It is much more efficient to
do that checking once during the per-module summary build and record
it in the summary.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28169

llvm-svn: 291108
2017-01-05 14:32:16 +00:00
Mohammed Agabaria 23599ba794 Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and need some extra cost for address computation handling.
This code seems to be target dependent which may not be the same for all targets.
Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'.

Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general.

Differential Revision: https://reviews.llvm.org/D27518

llvm-svn: 291106
2017-01-05 14:03:41 +00:00
Peter Collingbourne b2ce2b6805 IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests.
Set up basic YAML I/O support for module summaries, plumb the summary into
the pass and add a few command line flags to test YAML I/O support. Bitcode
support to come separately, as will the code in LowerTypeTests that actually
uses the summary. Also add a couple of tests that pass by virtue of the pass
doing nothing with the summary (which happens to be the correct thing to do
for those tests).

Differential Revision: https://reviews.llvm.org/D28041

llvm-svn: 291069
2017-01-05 03:39:00 +00:00
Wolfgang Pieb ce13e716c5 [DWARF] Null out the debug locs of load instructions that have been moved by GVN
performing partial redundancy elimination (PRE). Not doing so can cause jumpy line
tables and confusing (though correct) source attributions.

Differential Revision: https://reviews.llvm.org/D27857

llvm-svn: 291037
2017-01-04 23:58:26 +00:00
Mehdi Amini 19ef4fad91 Use lazy-loading of Metadata in MetadataLoader when importing is enabled (NFC)
Summary:
This is a relatively simple scheme: we use the index emitted in the
bitcode to avoid loading all the global metadata. Instead we load
the index with their position in the bitcode so that we can load each
of them individually. Materializing the global metadata block in this
condition only triggers loading the named metadata, and the ones
referenced from there (transitively). When materializing a function,
metadata from the global block are loaded lazily as they are
referenced.

Two main current limitations are:

1) Global values other than functions are not materialized on demand,
so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records
(and their transitive dependencies).
2) When we load a single metadata, we don't recurse on the operands,
instead we use a placeholder or a temporary metadata. Unfortunately
tepmorary nodes are very expensive. This is why we don't have it
always enabled and only for importing.

These two limitations can be lifted in a subsequent improvement if
needed.

With this change, the total link time of opt with ThinLTO and Debug
Info enabled is going down from 282s to 224s (~20%).

Reviewers: pcc, tejohnson, dexonsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28113

llvm-svn: 291027
2017-01-04 22:54:33 +00:00
Matt Arsenault 3bdd75d01e InstCombine: Fold cos(-x) -> cos(x)
Also cos(fabs(x)) -> cos(x)

llvm-svn: 291022
2017-01-04 22:49:03 +00:00
Daniel Berlin 6cc5e44068 NewGVN: Track the maximum number of iterations GVN takes on any function, so we can pinpoint performance issues.
llvm-svn: 291002
2017-01-04 21:01:02 +00:00
Robert Lougher 5bf0416f45 Reapply "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
This reapplies r289828 (reverted in r289833 as it broke the address sanitizer).  The
debugloc is now only set when the instruction is not a call, as this causes the
verifier to assert (the inliner requires an inlinable callsite to have a debug loc
if the caller and callee have debug info).

Original commit message:

Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Original review: https://reviews.llvm.org/D27590

llvm-svn: 290973
2017-01-04 17:40:32 +00:00
David Majnemer cb892e9066 [InstCombine] Move casts around shift operations
It is possible to perform a left shift before zero extending if the
shift would only shift out zeros.

llvm-svn: 290928
2017-01-04 02:21:34 +00:00
David Majnemer 022d2a563b [InstCombine] Combine adds across a zext
We can perform the following:
(add (zext (add nuw X, C1)), C2) -> (zext (add nuw X, C1+C2))

This is only possible if C2 is negative and C2 is greater than or equal to negative C1.

llvm-svn: 290927
2017-01-04 02:21:31 +00:00
Matt Arsenault 56ff4839ae InstCombine: Fold fabs on select of constants
llvm-svn: 290913
2017-01-03 22:40:34 +00:00
Sanjay Patel f0d1e77373 [InstCombine] use 'match' to reduce code bloat; NFCI
I wrote this patch before seeing the comment in:
https://reviews.llvm.org/D27114
...that suggests we should actually be canonicalizing the other way.

So just in case we decide this is the right way, we might as well
have a cleaner implementation.

llvm-svn: 290912
2017-01-03 22:25:31 +00:00
Matt Arsenault b264c94963 InstCombine: Add fma with constant transforms
DAGCombine already does these.

llvm-svn: 290860
2017-01-03 04:32:35 +00:00
Matt Arsenault 1cc294c85d InstCombine: Add fma + fabs/fneg transforms
fma (fneg x), (fneg y), z -> fma x, y, z
fma (fabs x), (fabs x), z -> fma x, x, z

llvm-svn: 290859
2017-01-03 04:32:31 +00:00
Sanjay Patel 1c9867d009 [EarlyCSE] less else, more auto; NFC
llvm-svn: 290848
2017-01-03 00:16:24 +00:00
Sanjay Patel b38ad88e9f [InstCombine] use combineMetadataForCSE instead of copying it; NFCI
llvm-svn: 290844
2017-01-02 23:25:28 +00:00
Xin Tong 2940231ff0 Make sure total loop body weight is preserved in loop peeling
Summary:
Regardless how the loop body weight is distributed, we should preserve
total loop body weight. i.e. we should have same weight reaching the body of the loop
or its duplicates in peeled and unpeeled case.

Reviewers: mkuper, davidxl, anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28179

llvm-svn: 290833
2017-01-02 20:27:23 +00:00
Daniel Berlin de43ef9601 NewGVN: Clean up after removing possibility of null expressions.
llvm-svn: 290828
2017-01-02 19:49:17 +00:00
Sanjay Patel 65d533ca42 fix typo; NFC
llvm-svn: 290827
2017-01-02 19:05:11 +00:00
Davide Italiano 67ada75d84 [NewGVN] Fold single-use variable inside the assertion.
It placates some bots which complain because they compile the
assertion out and think the variable is unused.

llvm-svn: 290825
2017-01-02 19:03:16 +00:00
Davide Italiano 841261624d [NewGVN] Restore old code to placate buildbots.
Apparently my suggestion of using ternary doesn't really work
as clang complains about incompatible types on LHS and RHS. Some
GCC versions happen to accept the code but clang behaviour is
correct here.

llvm-svn: 290822
2017-01-02 18:41:34 +00:00
Daniel Berlin 25f05b0ab7 NewGVN: Fix some formatting and comment issues
llvm-svn: 290820
2017-01-02 18:22:38 +00:00
Daniel Berlin 02c6b176e7 NewGVN: Add UnknownExpression and create them for things we can't symbolize. Kill fragile machinery for handling null expressions.
Summary:
This avoids the very fragile code for null expressions. We could also use a denseset that tracks which things have null expressions instead, but that seems pretty fragile and premature optimization.

This resolves a number of infinite loop cases, test reductions coming.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28193

llvm-svn: 290816
2017-01-02 18:00:53 +00:00
Daniel Berlin 589cecc6e9 NewGVN: Fix PR31480, PR31483, PR31499, by rewriting how memory congruence handling works.
Summary: Previously, we tried to fix up the equivalences during symbolic evaluation.  This does not work. Now, we change the equivalences during congruence finding, where it belongs.  We also initialize the equivalence table to give a maximal answer.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28192

llvm-svn: 290815
2017-01-02 18:00:46 +00:00
Davide Italiano b672537cbf [PMBuilder] Remove RunFloat2Int cl::opt.
The pass has been on by default for a long time without problems.

llvm-svn: 290814
2017-01-02 17:49:18 +00:00
Sanjay Patel aea60846c4 [Inliner] remove unnecessary null checks from AddAlignmentAssumptions(); NFCI
We bail out on the 1st line if the assumption cache is not set, so there's
no need to check it after that.

llvm-svn: 290787
2016-12-31 17:54:05 +00:00
Craig Topper d00db69227 [InstCombine][AVX-512] Teach InstCombine that llvm.x86.avx512.vcomi.sd and llvm.x86.avx512.vcomi.ss don't use the upper elements of their input.
This was already done for the SSE/SSE2 version of the intrinsics.

llvm-svn: 290776
2016-12-31 00:45:06 +00:00
Craig Topper 991636312b [InstCombine][AVX-512] When turning intrinsics with masking into native IR, don't emit a select if the mask is known to be all ones.
This saves InstCombine the burden of having to optimize the select later.

llvm-svn: 290774
2016-12-30 23:06:28 +00:00
Philip Reames fac031a178 Add a comment for a todo in LoopUnroll post cleanup
llvm-svn: 290769
2016-12-30 22:10:19 +00:00
Philip Reames a570a2303c [CVP] Adjust iteration order to reduce the amount of work required
CVP doesn't care about the order of blocks visited, but by using a pre-order traversal over the graph we can a) not visit unreachable blocks and b) optimize as we go so that analysis of later blocks produce slightly more precise results.

I noticed this via inspection and don't have a concrete example which points to the issue.  

llvm-svn: 290760
2016-12-30 18:00:55 +00:00
Davide Italiano 75e39f9790 [NewGVN] Remove unneeded newline from assertion message.
llvm-svn: 290755
2016-12-30 15:01:17 +00:00
David Majnemer 5ec5f278c9 [InstCombine] Address post-commit feedback
llvm-svn: 290741
2016-12-30 03:36:17 +00:00
Michael Kuperstein 76e06c8858 [LICM] When promoting scalars, allow inserting stores to thread-local allocas.
This is similar to the allocfn case - if an alloca is not captured, then it's
necessarily thread-local.

Differential Revision: https://reviews.llvm.org/D28170

llvm-svn: 290738
2016-12-30 01:03:17 +00:00
Dehao Chen cc76344ef5 Use continuous boosting factor for complete unroll.
Summary:
The current loop complete unroll algorithm checks if unrolling complete will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem for this approach is that the threshold abruptly. This patch makes the boosting factor a function of runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously.

The patch also simplified the code by reducing one parameter in UP.

The patch only affects code-gen of two speccpu2006 benchmark:

445.gobmk binary size decreases 0.08%, no performance change.
464.h264ref binary size increases 0.24%, no performance change.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26989

llvm-svn: 290737
2016-12-30 00:50:28 +00:00
Michael Kuperstein 4a86a1921a [LICM] Remove unneeded tracking of whether changes were made. NFC.
"Changed" doesn't actually change within the loop, so there's
no reason to keep track of it - we always return false during
analysis and true after the transformation is made.

llvm-svn: 290735
2016-12-30 00:43:22 +00:00
Michael Kuperstein 62b98c3977 [LICM] Make logic in promoteLoopAccessesToScalars easier to follow. NFC.
llvm-svn: 290734
2016-12-30 00:39:00 +00:00
David Majnemer a1cfd7c5f8 [InstCombine] More thoroughly canonicalize the position of zexts
We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y))
where possible.  However, we didn't perform the same canonicalization
for zexts or for muls.

llvm-svn: 290733
2016-12-30 00:28:58 +00:00
Michael Kuperstein ff36baefe7 [LICM] Compute exit blocks for promotion eagerly. NFC.
This moves the exit block and insertion point computation to be eager,
instead of after seeing the first scalar we can promote.

The cost is relatively small (the computation happens anyway, see discussion
on D28147), and the code is easier to follow, and can bail out earlier
if there's a catchswitch present.

llvm-svn: 290729
2016-12-29 23:11:19 +00:00
Michael Kuperstein 5566092963 [LICM] Don't try to promote in loops where we have no chance to promote. NFC.
We would check whether we have a prehader *or* dedicated exit blocks,
and go into the promotion loop. Then, for each alias set we'd check
if we have a preheader *and* dedicated exit blocks, and bail if not.

Instead, bail immediately if we don't have both.

llvm-svn: 290728
2016-12-29 22:51:22 +00:00
Michael Kuperstein b6da9cf3b7 [LICM] Only recompute LCSSA when we actually promoted something.
We want to recompute LCSSA only when we actually promoted a value.
This means we only need to look at changes made by promotion when
deciding whether to recompute it or not, not at regular sinking/hoisting.

(This was what the code was documented as doing, just not what it did)

Hopefully NFC.

llvm-svn: 290726
2016-12-29 22:37:13 +00:00
Daniel Berlin e0bd37e78f NewGVN: Fix PR 31491 by ensuring that we touch the right instructions. Change to one based numbering so we can assert we don't cause the same bug again.
llvm-svn: 290724
2016-12-29 22:15:12 +00:00
Craig Topper 17b5568bc7 [InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC
llvm-svn: 290707
2016-12-29 07:03:18 +00:00
Craig Topper 62f06e241b [InstCombine] Fix typo in comment. NFC
llvm-svn: 290706
2016-12-29 05:38:31 +00:00
Craig Topper 2e18bcfc60 [InstCombine] Use a 32-bits instead of 64-bits for storing the number of elements in VectorType for a ShuffleVector. While there getVectorNumElements to avoid an explicit cast. NFC
llvm-svn: 290705
2016-12-29 04:24:32 +00:00
Craig Topper 1a8a3377cc [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner.
We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since its now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded.

llvm-svn: 290704
2016-12-29 03:30:17 +00:00
Daniel Berlin 6658cc9ead NewGVN: Sort Dominator Tree in RPO order, and use that for generating order.
Summary:
The optimal iteration order for this problem is RPO order. We want to
process as many preds of a backedge as we can before we process the
backedge.

At the same time, as we add predicate handling, we want to be able to
touch instructions that are dominated by a given block by
ranges (because a change in value numbering a predicate possibly
affects all users we dominate that are using that predicate).
If we don't do it this way, we can't do value inference over
backedges (the paper covers this in depth).

The newgvn branch currently overshoots the last part, and guarantees
that it will touch *at least* the right set of instructions, but it
does touch more.  This is because the bitvector instruction ranges are
currently generated in RPO order (so we take the max and the min of
the ranges of dominated blocks, which means there are some in the
middle we didn't have to touch that we did).

We can do better by sorting the dominator tree, and then just using
dominator tree order.

As a preliminary, the dominator tree has some RPO guarantees, but not
enough. It guarantees that for a given node, your idom must come
before you in the RPO ordering. It guarantees no relative RPO ordering
for siblings.  We add siblings in whatever order they appear in the module.

So that is what we fix.

We sort the children array of the domtree into RPO order, and then use
the dominator tree for ordering, instead of RPO, since the dominator
tree is now a valid RPO ordering.

Note: This would help any other pass that iterates a forward problem
in dominator tree order.  Most of them are single pass.  It will still
maximize whatever result they compute.  We could also build the
dominator tree in this order, but our incremental updates would still
put it out of sort order, and recomputing the sort order is almost as
hard as general incremental updates of the domtree.

Also note that the sorting does not affect any tests, etc. Nothing
depends on domtree order, including the verifier, the equals
functions for domtree nodes, etc.

How much could this matter, you ask?
Here are the current numbers.
This is generated by running NewGVN over all files in LLVM.

Note that once we propagate equalities, the differences go up by an
order of magnitude or two (IE instead of 29, the max ends up in the
thousands, since the worst case we add a factor of N, where N is the
number of branch predicates).  So while it doesn't look that stark for
the default ordering, it gets *much much* worse.  There are also
programs in the wild where the difference is already pretty stark
(2 iterations vs hundreds).

RPO ordering:
759040 Number of iterations is 1
112908 Number of iterations is 2

Default dominator tree ordering:
755081 Number of iterations is 1
116234 Number of iterations is 2
   603 Number of iterations is 3
    27 Number of iterations is 4
     2 Number of iterations is 5
     1 Number of iterations is 7

Dominator tree sorted:
759040 Number of iterations is 1
112908 Number of iterations is 2
<yay!>

Really bad ordering (sort domtree siblings in postorder. not quite the
worst possible, but yeah):
754008 Number of iterations is 1
    21 Number of iterations is 10
     8 Number of iterations is 11
     6 Number of iterations is 12
     5 Number of iterations is 13
     2 Number of iterations is 14
     2 Number of iterations is 15
     3 Number of iterations is 16
     1 Number of iterations is 17
     2 Number of iterations is 18
 96642 Number of iterations is 2
     1 Number of iterations is 20
     2 Number of iterations is 21
     1 Number of iterations is 22
     1 Number of iterations is 29
 17266 Number of iterations is 3
  2598 Number of iterations is 4
   798 Number of iterations is 5
   273 Number of iterations is 6
   186 Number of iterations is 7
    80 Number of iterations is 8
    42 Number of iterations is 9

Reviewers: chandlerc, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28129

llvm-svn: 290699
2016-12-29 01:12:36 +00:00
Daniel Berlin 7ad1ea0984 Update equalsStoreHelper for the fact that only one branch can be true
llvm-svn: 290697
2016-12-29 00:49:32 +00:00
Piotr Padlewski 6c37d298d9 Revert "[NewGVN] replace emplace_back with push_back"
llvm-svn: 290692
2016-12-28 23:24:02 +00:00
Piotr Padlewski 629a7f2cc0 [NewGVN] replace emplace_back with push_back
emplace_back is not faster if it is equivalent to push_back. In this cases emplaced value had the
same type that the one stored in container. It is ugly and it might be even slower (see
Scott Meyers presentation about emplacement).

llvm-svn: 290685
2016-12-28 20:36:08 +00:00
Piotr Padlewski 26dada79ff [NewGVN] Simplyfy loop NFC
llvm-svn: 290683
2016-12-28 19:42:49 +00:00
Piotr Padlewski e4047b89ad [NewGVN] replace typedefs with usings
llvm-svn: 290680
2016-12-28 19:29:26 +00:00
Piotr Padlewski fc5727b2a2 [NewGVN] NFC fixes
llvm-svn: 290679
2016-12-28 19:17:17 +00:00
Davide Italiano 0e71480523 [NewGVN] Global sweep replacing NULL with nullptr. NFCI.
llvm-svn: 290670
2016-12-28 14:00:11 +00:00
Davide Italiano 0fb3c7cde5 [NewGVN] Remove redundant code. NFCI.
llvm-svn: 290669
2016-12-28 13:54:16 +00:00
Davide Italiano b111409015 [NewGVN] equals() for loads/stores is the same. Unify.
Differential Revision:  https://reviews.llvm.org/D28116

llvm-svn: 290667
2016-12-28 13:37:17 +00:00
Chandler Carruth 9900d18bab [PM] Teach the inliner's call graph update to handle inserting new edges
when they are call edges at the leaf but may (transitively) be reached
via ref edges.

It turns out there is a simple rule: insert everything as a ref edge
which is a safe conservative default. Then we let the existing update
logic handle promoting some of those to call edges.

Note that it would be fairly cheap to make these call edges right away
if that is desirable by testing whether there is some existing call path
from the source to the target. It just seemed like slightly more
complexity in this code path that isn't strictly necessary. If anyone
feels strongly about handling this differently I'm happy to change it.

llvm-svn: 290649
2016-12-28 03:13:12 +00:00
Craig Topper 28ec3460e4 [InstCombine] Remove a piece of a comment that said that InstCombiner contains pass infrastructure. That hasn't been true since r226618. NFC
llvm-svn: 290648
2016-12-28 03:12:42 +00:00
Michael Kuperstein cd7ad7130f [InstCombine] Canonicalize insert splat sequences into an insert + shuffle
This adds a combine that canonicalizes a chain of inserts which broadcasts
a value into a single insert + a splat shufflevector.

This fixes PR31286.

Differential Revision: https://reviews.llvm.org/D27992

llvm-svn: 290641
2016-12-28 00:18:08 +00:00
Kostya Serebryany f24e52c0c2 [sanitizer-coverage] sort the switch cases
llvm-svn: 290628
2016-12-27 21:20:06 +00:00
Davide Italiano b222549dc5 [NewGVN] Simplify a bit removing else after return. NFCI.
llvm-svn: 290615
2016-12-27 18:15:39 +00:00
Bryant Wong 7cb744621b [MemCpyOpt] Don't sink LoadInst below possible clobber.
Differential Revision: https://reviews.llvm.org/D26811

llvm-svn: 290611
2016-12-27 17:58:12 +00:00
Daniel Berlin 1f31fe529e Change a std::vector to SmallVector in NewGVN
llvm-svn: 290596
2016-12-27 09:20:36 +00:00
Chandler Carruth 141bf5d14d [PM] Add one of the features left out of the initial inliner patch:
skipping indirectly recursive inline chains.

To do this, we implicitly build an inline stack for each callsite and
check prior to inlining that doing so would not form a cycle. This uses
the exact same technique and even shares some code with the legacy PM
inliner.

This solution remains deeply unsatisfying to me because it means we
cannot actually iterate the inliner externally. Doing so would not be
able to easily detect and avoid such cycles. Some day I would very much
like to have a solution that works without this internal state to detect
cycles, but this is not that day.

llvm-svn: 290590
2016-12-27 06:46:20 +00:00
Craig Topper 72f2d4e8d6 [InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

This builds on r290554 which added supported for 128 and 256-bit.

llvm-svn: 290582
2016-12-27 05:30:09 +00:00
Chandler Carruth 03130d981c [PM] Teach the inliner in the new PM to merge attributes after inlining.
Also enable the new PM in the attributes test case which caught this
issue.

llvm-svn: 290572
2016-12-27 03:39:54 +00:00
Craig Topper 7f8540b5e7 [AVX-512][InstCombine] Teach InstCombine to turn masked scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
An earlier commit added support for unmasked scalar operations. At that time isel wouldn't generate an optimal sequence for masked operations, but that has now been fixed.

llvm-svn: 290566
2016-12-27 01:56:30 +00:00
Chandler Carruth 0ee8bb11c3 [PM] Move the collection of call sites to a more appropriate place
inside of `InlineFunction`. Prior to this, call instructions are
specifically being rewritten and replaced within the inlined region,
invalidating some of the call sites.

Several of these regions are using the same technique to walk the
inlined region so this seems clearly safe up to this point.

I've also added a short circuit to the scan for call sites based on what
other code is doing.

With this, the most common crash I've found in the new inliner code is
fixed. I've turned it on for another test case that covers this
scenario.

I'll make my way through most of the other inliner test cases
just to get some easy coverage next.

llvm-svn: 290562
2016-12-27 01:24:50 +00:00
Craig Topper 020b228155 [AVX-512][InstCombine] Teach InstCombine to turn packed add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
llvm-svn: 290559
2016-12-27 00:23:16 +00:00
Chandler Carruth 6e9bb7e064 [PM] Teach the always inliner in the new pass manager to support
removing fully-dead comdats without removing dead entries in comdats
with live members.

This factors the core logic out of the current inliner's internals to
a reusable utility and leverages that in both places. The factored out
code should also be (minorly) more efficient in cases where we have very
few dead functions or dead comdats to consider.

I've added a test case to cover this behavior of the always inliner.
This is the last significant bug in the new PM's always inliner I've
found (so far).

llvm-svn: 290557
2016-12-26 23:43:27 +00:00
Simon Pilgrim c9cf7fc7a4 [InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

Differential Revision: https://reviews.llvm.org/D28119

llvm-svn: 290554
2016-12-26 23:28:17 +00:00
Daniel Berlin 85f91b0ec3 clang-format NewGVN files
llvm-svn: 290551
2016-12-26 20:06:58 +00:00