Commit Graph

20907 Commits

Author SHA1 Message Date
Sameer Sahasrabuddhe 2de7653fd5 [AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll

Differential Revision: https://reviews.llvm.org/D52221

llvm-svn: 342722
2018-09-21 11:26:55 +00:00
JF Bastien 73d8e4e531 Merge clang's isRepeatedBytePattern with LLVM's isBytewiseValue
Summary:
his code was in CGDecl.cpp and really belongs in LLVM's isBytewiseValue. Teach isBytewiseValue the tricks clang's isRepeatedBytePattern had, including merging undef properly, and recursing on more types.

clang part of this patch: D51752

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51751

llvm-svn: 342709
2018-09-21 05:17:42 +00:00
Xin Tong 9b7e45159f [GlobalDCE] AvailableExternal linkage is checked in isDiscardableIfUnused [NFC].
Summary:
AvailableExternal was not handled in isDiscardableIfUnused when isDiscardableIfUnused
was added in r158476. Till it was handled in r247044. This is a NFC.

Reviewers: pcc, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52319

llvm-svn: 342684
2018-09-20 21:16:16 +00:00
Fedor Sergeev ee8d31c49e [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

  Made getName helper to return std::string (instead of StringRef initially) to fix
  asan builtbot failures on CGSCC tests.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342664
2018-09-20 17:08:45 +00:00
Jesper Antonsson 719fa055d0 [InstCombine] Handle vector compares in foldGEPIcmp()
Summary:
This is to fix PR38984 "InstCombine assertion at vector gep/icmp folding":
https://bugs.llvm.org/show_bug.cgi?id=38984

Reviewers: majnemer, spatel, lattner, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D52263

llvm-svn: 342647
2018-09-20 13:37:28 +00:00
Bjorn Pettersson cd53b7f54e [IPSCCP] Fix a problem with removing labels in a switch with undef condition
Summary:
Before removing basic blocks that ipsccp has considered as dead
all uses of the basic block label must be removed. That is done
by calling ConstantFoldTerminator on the users. An exception
is when the branch condition is an undef value. In such
scenarios ipsccp is using some internal assumptions regarding
which edge in the control flow that should remain, while
ConstantFoldTerminator don't know how to fold the terminator.

The problem addressed here is related to ConstantFoldTerminator's
ability to rewrite a 'switch' into a conditional 'br'. In such
situations ConstantFoldTerminator returns true indicating that
the terminator has been rewritten. However, ipsccp treated the
true value as if the edge to the dead basic block had been
removed. So the code for resolving an undef branch condition
did not trigger, and we ended up with assertion that there were
uses remaining when deleting the basic block.

The solution is to resolve indeterminate branches before the
call to ConstantFoldTerminator.

Reviewers: efriedma, fhahn, davide

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52232

llvm-svn: 342632
2018-09-20 09:00:17 +00:00
Calixte Denizet eb7f60201c [IR] Add a boolean field in DILocation to know if a line must covered or not
Summary:
Some lines have a hit counter where they should not have one.
For example, in C++, some cleanup is adding at the end of a scope represented by a '}'.
So such a line has a hit counter where a user expects to not have one.
The goal of the patch is to add this information in DILocation which is used to get the covered lines in GCOVProfiling.cpp.
A following patch in clang will add this information when generating IR (https://reviews.llvm.org/D49916).

Reviewers: marco-c, davidxl, vsk, javed.absar, rnk

Reviewed By: rnk

Subscribers: eraman, xur, danielcdh, aprantl, rnk, dblaikie, #debug-info, vsk, llvm-commits, sylvestre.ledru

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D49915

llvm-svn: 342631
2018-09-20 08:53:06 +00:00
Eric Christopher 019889374b Temporarily Revert "[New PM] Introducing PassInstrumentation framework"
as it was causing failures in the asan buildbot.

This reverts commit r342597.

llvm-svn: 342616
2018-09-20 05:16:29 +00:00
Fedor Sergeev a5f279ea89 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342597
2018-09-19 22:42:57 +00:00
Matt Morehouse e62fc3d0b6 [InstCombine] Disable strcmp->memcmp transform for MSan.
Summary:
The strcmp->memcmp transform can make the resulting memcmp read
uninitialized data, which MSan doesn't like.

Resolves https://github.com/google/sanitizers/issues/993.

Reviewers: eugenis, xbolva00

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D52272

llvm-svn: 342582
2018-09-19 19:37:24 +00:00
Fedor Sergeev 25de3f83be Revert rL342544: [New PM] Introducing PassInstrumentation framework
A bunch of bots fail to compile unittests. Reverting.

llvm-svn: 342552
2018-09-19 14:54:48 +00:00
Roman Lebedev f50023d37c [InstCombine] foldICmpWithLowBitMaskedVal(): handle uncanonical ((-1 << y) >> y) mask
Summary:
The last low-bit-mask-pattern-producing-pattern i can think of.

https://rise4fun.com/Alive/UGzE <- non-canonical
But we can not canonicalize it because of extra uses.

https://bugs.llvm.org/show_bug.cgi?id=38123

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52148

llvm-svn: 342548
2018-09-19 13:35:46 +00:00
Roman Lebedev ca2bdb03d6 [InstCombine] foldICmpWithLowBitMaskedVal(): handle uncanonical ((1 << y)+(-1)) mask
Summary:
Same as to D52146.
`((1 << y)+(-1))` is simply non-canoniacal version of `~(-1 << y)`: https://rise4fun.com/Alive/0vl
We can not canonicalize it due to the extra uses. But we can handle it here.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52147

llvm-svn: 342547
2018-09-19 13:35:40 +00:00
Roman Lebedev 183a465dc6 [InstCombine] foldICmpWithLowBitMaskedVal(): handle ~(-1 << y) mask
Summary:
Two folds are happening here:
1. https://rise4fun.com/Alive/oaFX
2. And then `foldICmpWithHighBitMask()` (D52001): https://rise4fun.com/Alive/wsP4

This change doesn't just add the handling for eq/ne predicates,
it actually builds upon the previous `foldICmpWithLowBitMaskedVal()` work,
so **all** the 16 fold variants* are immediately supported.

I'm indeed only testing these two predicates.
I do not feel like re-proving all 16 folds*, because they were already proven
for the general case of constant with all-ones in low bits. So as long as
the mask produces all-ones in low bits, i'm pretty sure the fold is valid.

But required, i can re-prove, let me know.

* eq/ne are commutative - 4 folds; ult/ule/ugt/uge - are not commutative (the commuted variant is InstSimplified), 4 folds; slt/sle/sgt/sge are not commutative - 4 folds. 12 folds in total.

https://bugs.llvm.org/show_bug.cgi?id=38123
https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52146

llvm-svn: 342546
2018-09-19 13:35:27 +00:00
Fedor Sergeev 875c938fec [New PM] Introducing PassInstrumentation framework
Summary:
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342544
2018-09-19 12:25:52 +00:00
Benjamin Kramer e5e1ea79fd [InstCombine] Don't transform sin/cos -> tanl if for half types
This is still unsafe for long double, we will transform things into tanl
even if tanl is for another type. But that's for someone else to fix.

llvm-svn: 342542
2018-09-19 12:01:38 +00:00
Carlos Alberto Enciso ba4e437c6a [DebugInfo][Dexter] Speculated BB presents illegal variable value to debugger.
When SimplifyCFG changes the PHI node into a select instruction, the debug information becomes ambiguous. It causes the debugger to display wrong variable value. 

Differential Revision: https://reviews.llvm.org/D51976

llvm-svn: 342527
2018-09-19 08:16:56 +00:00
Christy Lee c85da8bd9a Do not optimize atomic load to non-atomic memcmp
Differential Revision: https://reviews.llvm.org/D51998

llvm-svn: 342498
2018-09-18 17:02:42 +00:00
Hiroshi Yamauchi fd2c699dd6 [PGO][CHR] Add opt remarks.
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52056

llvm-svn: 342495
2018-09-18 16:50:10 +00:00
Teresa Johnson 5e1c0e7610 [LTO] Make detection of WPD remark enablement more robust
Summary:
Currently only the first function in the module is checked to
see if it has remarks enabled. If that first function is a declaration,
remarks will be incorrectly skipped. Change to look for the first
non-empty function.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51556

llvm-svn: 342477
2018-09-18 13:42:24 +00:00
whitequark e7e14f464c [LLVM-C][OCaml] Add UnifyFunctionExitNodes pass to C and OCaml APIs
Summary:
Adds LLVMAddUnifyFunctionExitNodesPass to expose
createUnifyFunctionExitNodesPass to the C and OCaml APIs.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52212

llvm-svn: 342476
2018-09-18 13:36:03 +00:00
whitequark 1f50560e56 [LLVM-C][OCaml] Add LowerAtomic pass to C and OCaml APIs
Summary:
Adds LLVMAddLowerAtomicPass to expose createLowerAtomicPass in the C
and OCaml APIs.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52211

llvm-svn: 342475
2018-09-18 13:35:50 +00:00
Max Kazantsev 0994abda3a [IndVars] Remove unreasonable checks in rewriteLoopExitValues
A piece of logic in rewriteLoopExitValues has a weird check on number of
users which allowed an unprofitable transform in case if an instruction has
more than 6 users.

Differential Revision: https://reviews.llvm.org/D51404
Reviewed By: etherzhhb

llvm-svn: 342444
2018-09-18 04:57:18 +00:00
Matt Arsenault c640798597 LSV: Fix adjust alloca alignment trick for AMDGPU
This was checking the hardcoded address space 0 for the stack.
Additionally, this should be checking for legality with
the adjusted alignment, so defer the alignment check.

Also try to split if the unaligned access isn't allowed.

llvm-svn: 342442
2018-09-18 02:05:44 +00:00
Alina Sbirlea a782a70ad9 [EarlyCSEwMemorySSA] Add MSSA verification and tests to make EarlyCSE failures easier to track.
Summary:
EarlyCSE can make IR changes that will leave MemorySSA with accesses claiming to be optimized, but for which a subsequent MemorySSA run will yield a different optimized result.
Due to relying on AA queries, we can't fix this in general, unless we recompute MemorySSA.
Adding some tests to track this and a basic verify for future potential failures.

Reviewers: george.burgess.iv, gberry

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D51960

llvm-svn: 342422
2018-09-17 22:35:21 +00:00
Xin Tong 8a505c64b0 [CVP] Handle instructions with no user. No need to create CVPLattice state. This handles terminator instructions and more.
Summary:
I tested this patch by compiling sqlite3.ll (clang -O3 -mllvm -disable-llvm-optzns sqlite3.c.)

 opt -called-value-propagation sqlite3.ll -time-passes -f -o out.ll

 I get 10+% speedup for the pass. I expect some of the gain come from skipping terminator instructions.

    === BEFORE THE PATCH ===
    ===-------------------------------------------------------------------------===
                          ... Pass execution timing report ...
    ===-------------------------------------------------------------------------===
      Total Execution Time: 0.5562 seconds (0.5582 wall clock)

       ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
       0.2485 ( 46.4%)   0.0120 ( 57.7%)   0.2605 ( 46.8%)   0.2615 ( 46.8%)  Bitcode Writer
       0.1607 ( 30.0%)   0.0079 ( 37.7%)   0.1685 ( 30.3%)   0.1693 ( 30.3%)  Called Value Propagation
       0.1262 ( 23.6%)   0.0009 (  4.5%)   0.1271 ( 22.9%)   0.1275 ( 22.8%)  Module Verifier
       0.5353 (100.0%)   0.0209 (100.0%)   0.5562 (100.0%)   0.5582 (100.0%)  Total

    === AFTER THE PATCH ===
    ===-------------------------------------------------------------------------===
                          ... Pass execution timing report ...
    ===-------------------------------------------------------------------------===
      Total Execution Time: 0.5338 seconds (0.5355 wall clock)

       ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
       0.2498 ( 48.6%)   0.0118 ( 59.3%)   0.2615 ( 49.0%)   0.2629 ( 49.1%)  Bitcode Writer
       0.1377 ( 26.8%)   0.0075 ( 37.8%)   0.1452 ( 27.2%)   0.1455 ( 27.2%)  Called Value Propagation
       0.1264 ( 24.6%)   0.0006 (  3.0%)   0.1270 ( 23.8%)   0.1271 ( 23.7%)  Module Verifier
       0.5139 (100.0%)   0.0199 (100.0%)   0.5338 (100.0%)   0.5355 (100.0%)  Total

Reviewers: davide, mssimpso

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49108

llvm-svn: 342398
2018-09-17 15:28:01 +00:00
Alexandros Lamprineas 8a1c374b2e [GVNHoist] Re-enable GVNHoist by default
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.

Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
  compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.

llvm-svn: 342387
2018-09-17 12:24:55 +00:00
Max Kazantsev 5fe3620261 [NFC] Turn unsigned counters into boolean flags
llvm-svn: 342360
2018-09-17 06:33:29 +00:00
Craig Topper 2da7381678 [InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and (sub (zext x), (zext y)) --> (zext (sub x, y))
Summary:
If the sub doesn't overflow in the original type we can move it above the sext/zext.

This is similar to what we do for add. The overflow checking for sub is currently weaker than add, so the test cases are constructed for what is supported.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52075

llvm-svn: 342335
2018-09-15 18:54:10 +00:00
Sanjay Patel 296d35a5e9 [InstCombine][x86] try harder to convert blendv intrinsic to generic IR (PR38814)
Missing optimizations with blendv are shown in:
https://bugs.llvm.org/show_bug.cgi?id=38814

If this works, it's an easier and more powerful solution than adding pattern matching 
for a few special cases in the backend. The potential danger with this transform in IR
is that the condition value can get separated from the select, and the backend might 
not be able to make a blendv out of it again. I don't think that's too likely, but 
I've kept this patch minimal with a 'TODO', so we can test that theory in the wild 
before expanding the transform.

Differential Revision: https://reviews.llvm.org/D52059

llvm-svn: 342324
2018-09-15 14:25:44 +00:00
Roman Lebedev 1b7fc87020 [InstCombine] Inefficient pattern for high-bits checking 3 (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

The last (as far i know?) pattern, non-canonical due to the extra use.
https://godbolt.org/z/aCMsPk
https://rise4fun.com/Alive/I6f

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52062

llvm-svn: 342321
2018-09-15 12:04:13 +00:00
Sanjay Patel 90a36346bc [InstCombine] refactor mul narrowing folds; NFCI
Similar to rL342278:
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

D52075 should be able to use this code too rather than
duplicating all of the logic.

llvm-svn: 342292
2018-09-14 22:23:35 +00:00
Sanjay Patel 46945b9e9d [InstCombine] add/use overflowing math helper functions; NFC
The mul case can already be refactored to use this similar to
rL342278.
The sub case is proposed in D52075.

llvm-svn: 342289
2018-09-14 21:30:07 +00:00
Wei Mi 6a14325dff [SampleFDO] Add FunctionOffsetTable in compact binary format profile.
The patch saves a function offset table which maps function name index to the
offset of its function profile to the start of the binary profile. By using
the function offset table, for those function profiles which will not be used
when compiling a module, the profile reader does't have to read them. For
profile size around 10~20M, it saves ~10% compile time.

Differential Revision: https://reviews.llvm.org/D51863

llvm-svn: 342283
2018-09-14 20:52:59 +00:00
Sanjay Patel 2426eb46dd [InstCombine] refactor add narrowing folds; NFCI
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

llvm-svn: 342278
2018-09-14 20:40:46 +00:00
Sebastian Pop 0f30f08b02 HotColdSplit: fix invalid SSA due to outlining
The test used to fail with an invalid phi node: the two predecessors were outlined
and the SSA representation was left invalid. The patch adds the exit block to the
cold region.

llvm-svn: 342277
2018-09-14 20:36:19 +00:00
Sebastian Pop 3abcf69074 HotColdSplit: fix isSingleEntrySingleExit
remove duplicate entries from isSingleEntrySingleExit: the Entry block is
already added by the loop over the dominance frontier.

Remove the heuristic from isOutlineCandidate that a region is too small when it
only contains a basic block. With this change we now grow regions starting from
a block and we continue adding to the ValidColdRegion. Check the heuristic just
before code generation.

llvm-svn: 342276
2018-09-14 20:36:14 +00:00
Sebastian Pop 1217160bb3 HotColdSplit: add back propagation to extend cold regions
Also fix a problem in forward propagation:
  const TerminatorInst *TI = It->getTerminator();
was set outside the while loop that iterates over It.

llvm-svn: 342275
2018-09-14 20:36:10 +00:00
Florian Hahn 3afb974aa5 [LoopInterchange] Preserve ScalarEvolution, by forgetting about interchanged loops.
As preparation for LoopInterchange becoming a loop pass, it needs to
preserve ScalarEvolution. Even though interchanging should not change
the trip count of the loop, it modifies loop entry, latch and exit
blocks.

I added -verify-scev to some loop interchange tests, but the verification does
not catch problems caused by missing invalidation of SE in loop interchange, as
the trip counts themselves do not change. So there might be potential to
make the SE verification covering more stuff in the future.

Reviewers: mkazantsev, efriedma, karthikthecool

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52026

llvm-svn: 342209
2018-09-14 07:50:20 +00:00
Max Kazantsev e9765ac275 [NFC] Remove meaningless code from GVN
llvm-svn: 342202
2018-09-14 04:50:38 +00:00
Hideki Saito d19851ac7e Fix for the buildbot failure http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/23635
from the commit (r342197) of https://reviews.llvm.org/D50820.

llvm-svn: 342201
2018-09-14 02:02:57 +00:00
Hideki Saito ea7f3035a0 [VPlan] Implement initial vector code generation support for simple outer loops.
Summary:
[VPlan] Implement vector code generation support for simple outer loops.

Context: Patch Series #1 for outer loop vectorization support in LV  using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
                                                          
This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following:

  - force vector code generation using explicit vectorize_width

  - add conservative early returns in cost model and other places for VPlanNativePath

  - add code for setting up outer loop inductions 

  - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches

  - support for generating uniform inner branches

We plan to add a handful C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. 

Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal

Reviewed By: fhahn, hsaito

Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D50820

llvm-svn: 342197
2018-09-14 00:36:00 +00:00
Matt Morehouse 3bea25e554 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 342186
2018-09-13 21:45:55 +00:00
Roman Lebedev 6dc87004fa [InstCombine] Inefficient pattern for high-bits checking 2 (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA

We do need to have two `switch()`'es like this,
to not mismatch the swappable predicates.

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52001

llvm-svn: 342173
2018-09-13 20:33:12 +00:00
George Burgess IV d565b0f017 [PartiallyInlineLibCalls] Add DebugCounter support
This adds DebugCounter support to the PartiallyInlineLibCalls pass,
which should make debugging/automated bisection easier in the future.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50093

llvm-svn: 342172
2018-09-13 20:33:04 +00:00
George Burgess IV 7a8dc3dafa [DCE] Add DebugCounter support
Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50092

llvm-svn: 342170
2018-09-13 20:29:50 +00:00
Craig Topper 8fc05ce340 [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible.
This allows the xor to be removed completely.

This might help with recomitting r341674, but seems good regardless.

Coincidentally fixes PR38915.

Differential Revision: https://reviews.llvm.org/D51964

llvm-svn: 342163
2018-09-13 18:52:58 +00:00
Sanjay Patel 37e464876b [InstCombine] remove checks for IsFreeToInvert()
I accidentally committed this diff with rL342147 because
I had applied D51964. We probably do need those checks,
but D51964 has tests and more discussion/motivation,
so they should be re-added with that patch.

llvm-svn: 342149
2018-09-13 16:18:12 +00:00
Sanjay Patel 6f00fc3317 [InstCombine] reorder folds to reduce chance of infinite loops
I don't have a test case for this, but it's motivated by
the discussion in D51964, and I've added TODO comments for
the better fix - move simplifications into instsimplify
because that's more efficient and reduces risk of infinite
loops in instcombine caused by transforms trying to do the
opposite folds.

In this case, we know that the transform that tries to move
'not' through min/max can be fooled by the multiple uses
of a value in another min/max, so try to squash the 
foldSPFofSPF() patterns first.

llvm-svn: 342147
2018-09-13 16:04:06 +00:00
Sanjay Patel d341988c86 revert r341288 - [Reassociate] swap binop operands to increase factoring potential
This causes or exposes indeterminism that is visible in the output of -reassociate.

llvm-svn: 342083
2018-09-12 21:29:11 +00:00
Roman Lebedev 75404fb9f8 [InstCombine] Inefficient pattern for high-bits checking (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only `n` bits wide (where `n` is a variable.)
There are **many** ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

Let's handle the second variant first, since it is much simpler.
https://rise4fun.com/Alive/LYjY

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51985

llvm-svn: 342067
2018-09-12 18:19:43 +00:00
Alexandros Lamprineas 54d56f274c [GVNHoist] computeInsertionPoints() miscalculates IDF
Fix for https://bugs.llvm.org/show_bug.cgi?id=38912.

In GVNHoist::computeInsertionPoints() we iterate over the Value
Numbers and calculate the Iterated Dominance Frontiers without
clearing the IDFBlocks vector first. IDFBlocks ends up accumulating
an insane number of basic blocks, which bloats the compilation time
of SemaChecking.cpp with ubsan enabled.

Differential Revision: https://reviews.llvm.org/D51980

llvm-svn: 342055
2018-09-12 14:28:23 +00:00
David Green 2352b30c96 [SimplifyCFG] Put an alignment on generated switch tables
Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51800

llvm-svn: 342039
2018-09-12 09:54:17 +00:00
Florian Hahn 1086ce2397 [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)
Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use
them for VPlan outside LoopVectorize.cpp

Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00

Reviewed By: rengolin, xbolva00

Differential Revision: https://reviews.llvm.org/D49488

llvm-svn: 342027
2018-09-12 08:01:57 +00:00
Vikram TV 7e98d69847 Break LoopUtils into an Analysis file.
Summary:
The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory.

The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html).

Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h|cpp to Analysis/IVDescriptors.h|cpp.

Reviewers: dmgreen, llvm-commits, hfinkel

Reviewed By: dmgreen

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D51153

llvm-svn: 342016
2018-09-12 01:59:43 +00:00
Sanjay Patel 1cf0734b2f [InstCombine] add folds for unsigned-overflow compares
Name: op_ugt_sum
  %a = add i8 %x, %y
  %r = icmp ugt i8 %x, %a
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

Name: sum_ult_op
  %a = add i8 %x, %y
  %r = icmp ult i8 %a, %x
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

https://rise4fun.com/Alive/ZRxI

AFAICT, this doesn't interfere with any add-saturation patterns
because those have >1 use for the 'add'. But this should be
better for IR analysis and codegen in the basic cases.

This is another fold inspired by PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

llvm-svn: 342004
2018-09-11 22:40:20 +00:00
Alexandros Lamprineas fe0512d575 Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts rL341954.

The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:

[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill

llvm-svn: 342001
2018-09-11 22:10:57 +00:00
Sanjay Patel 26725bdc50 [InstCombine] add folds for icmp with xor mask constant
These are the folds in Alive;
Name: xor_ult
Pre: isPowerOf2(-C1)
%xor = xor i8 %x, C1
%r = icmp ult i8 %xor, C1
=>
%r = icmp ugt i8 %x, ~C1

Name: xor_ugt
Pre: isPowerOf2(C1+1)
%xor = xor i8 %x, C1
%r = icmp ugt i8 %xor, C1
=>
%r = icmp ugt i8 %x, C1

https://rise4fun.com/Alive/Vty

The ugt case in its simplest form was already handled by DemandedBits,
but that's not ideal as shown in the multi-use test.

I'm not sure if these are all of the symmetrical folds, but I adjusted 
the existing code for one of the folds to try to show the similarities.

There's no obvious connection, but this is another preliminary step 
for PR14613...
https://bugs.llvm.org/show_bug.cgi?id=14613

llvm-svn: 341997
2018-09-11 22:00:15 +00:00
Matt Morehouse f0d7daa972 Revert "[SanitizerCoverage] Create comdat for global arrays."
This reverts r341987 since it will cause trouble when there's a module
ID collision.

llvm-svn: 341995
2018-09-11 21:15:41 +00:00
Matt Morehouse 7ce6032432 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 341987
2018-09-11 20:10:40 +00:00
Alina Sbirlea a496143c9e Update MemorySSA in LoopUnswitch.
Summary:
Update MemorySSA in old LoopUnswitch pass.
Actual dependency and update is disabled by default.

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45301

llvm-svn: 341984
2018-09-11 19:19:21 +00:00
Sanjay Patel 342c3bcf11 [InstCombine] enhance vector demanded elements to look at a vector select condition operand
I noticed that we were not back-propagating undef lanes to shuffle masks when we have a 
shuffle that reduces the vector width. This is part of investigating/solving PR38691:
https://bugs.llvm.org/show_bug.cgi?id=38691

The DAG equivalent was proposed with:
D51696

Differential Revision: https://reviews.llvm.org/D51433

llvm-svn: 341981
2018-09-11 18:49:00 +00:00
Vedant Kumar 727d89526e [gcov] Fix branch counters with switch statements (fix PR38821)
Right now, the counters are added in regards of the number of successors
for a given BasicBlock: it's good when we've only 1 or 2 successors (at
least with BranchInstr). But in the case of a switch statement, the
BasicBlock after switch has several predecessors and we need know from
which BB we're coming from.

So the idea is to revert what we're doing: add a PHINode in each block
which will select the counter according to the incoming BB.  They're
several pros for doing that:

- we fix the "switch" bug
- we remove the function call to "__llvm_gcov_indirect_counter_increment"
  and the lookup table stuff
- we replace by PHINodes, so the optimizer will probably makes a better
  job.

Patch by calixte!

Differential Revision: https://reviews.llvm.org/D51619

llvm-svn: 341977
2018-09-11 18:38:34 +00:00
Craig Topper 4e63db8387 [InstCombine] Fix incorrect usage of getPrimitiveSizeInBits when we should be using the element size for vectors
For vectors, getPrimitiveSizeInBits returns the full vector width. This code should using the element size for vectors. This could be fixed by calling getScalarSizeInBits, but its even easier to just get it from the APInt we're checking.

Differential Revision: https://reviews.llvm.org/D51938

llvm-svn: 341971
2018-09-11 17:57:20 +00:00
Florian Hahn 5b7e21a6b7 [CallSiteSplitting] Add debug location to created PHI nodes.
There are 2 cases when we create PHI nodes:
 * For the result of the call that was duplicated in the split blocks.
   Those PHI nodes should have the debug location of the call.

 * For values produced before the call. Those instructions need to be
   duplicated in the split blocks and the PHI nodes should have the
   debug locations of those instructions.

Fixes PR37962.

Reviewers: junbuml, gbedwell, vsk

Reviewed By: junbuml

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51919

llvm-svn: 341970
2018-09-11 17:55:58 +00:00
Matt Morehouse 40fbdd0c4f Revert "[SanitizerCoverage] Create comdat for global arrays."
This reverts r341951 due to bot breakage.

llvm-svn: 341965
2018-09-11 17:20:14 +00:00
Craig Topper 12fd6bd4ad [InstCombine] Use dyn_cast instead of match(m_Constant). NFC
llvm-svn: 341962
2018-09-11 16:51:26 +00:00
Craig Topper a57bb61a3e [InstCombine] Support (mul (sext x), cst) --> (sext (mul x, cst')) and (mul (zext x), cst) --> (zext (mul x, cst')) for vectors constants.
Similar to D51236, but for mul instead of add.

Differential Revision: https://reviews.llvm.org/D51900

llvm-svn: 341961
2018-09-11 16:51:24 +00:00
Alexandros Lamprineas db18e972d7 [GVNHoist] Re-enable GVNHoist by default
Rebase rL340922 since https://bugs.llvm.org/show_bug.cgi?id=38807
has been fixed by rL341947.

llvm-svn: 341954
2018-09-11 15:55:45 +00:00
Matt Morehouse eac270caf4 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 341951
2018-09-11 15:23:14 +00:00
Johannes Doerfert ae3cfeb3ad [FuncAttrs] Remove "access range attributes" for read-none functions
The presence of readnone and an access range attribute (argmemonly,
inaccessiblememonly, inaccessiblemem_or_argmemonly) is considered an
error by the verifier. This seems strict but also not wrong. This
patch makes sure function attribute detection will remove all access
range attributes for readnone functions.

llvm-svn: 341927
2018-09-11 11:51:29 +00:00
Serguei Katkov 5f4a9e9ea0 [LICM] Avoid duplicate work during building AliasSetTracker
Currently we re-use cached info from sub loops or traverse them
to populate AliasSetTracker. But after that we traverse all basic blocks
from the current loop. This is redundant work.

All what we need is traversing the all basic blocks from the loop except
those which are used to get the data from the cache.

This should improve compile time only.

Reviewers: mkazantsev, reames, kariddi, anna
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51715

llvm-svn: 341896
2018-09-11 04:07:36 +00:00
Max Kazantsev e6413919ce [IndVars][NFC] Refactor to make modifications of Changed transparent
IndVarSimplify's design is somewhat odd in the way how it reports that
some transform has made a change. It has a `Changed` field which can
be set from within any function, which makes it hard to track whether or
not it was set properly after a transform was made. It leads to oversights
in setting this flag where needed, see example in PR38855.

This patch removes the `Changed` field, turns it into a local and unifies
the signatures of all relevant transform functions to return boolean value
which designates whether or not this transform has made a change.

Differential Revision: https://reviews.llvm.org/D51850
Reviewed By: skatkov

llvm-svn: 341893
2018-09-11 03:57:22 +00:00
Philip Reames 1f52e38e8e [LICM] (re-)simplify code using MemoryLocation API [NFC]
I'd made exactly this same change before, but it appears to have been accidentally reverted in another change.  (I'm assuming accidental since it was without comment or test case, and in an unrelated change.)

llvm-svn: 341892
2018-09-11 03:28:28 +00:00
Alina Sbirlea 116caa2920 [InstCombine] Partially revert rL341674 due to PR38897.
Summary:
Revert min/max changes in rL341674 dues to high compile times causing timeouts (PR38897).
Checking in to unblock failing builds. Patch available for post-commit review and re-revert once resolved.
Working on a smaller reproducer for PR38897.

Reviewers: craig.topper, spatel

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D51897

llvm-svn: 341883
2018-09-10 23:47:21 +00:00
Sanjay Patel 691d1a40e2 [InstCombine] use SelectInst operand names to make code clearer; NFC
Cleanup step for D51433.

llvm-svn: 341850
2018-09-10 18:37:59 +00:00
Sebastian Pop a1f20fcc96 HotColdSplitting: check that target supports cold calling convention
Before tagging a function with coldcc make sure the target supports cold calling
convention. Without this patch HotColdSplitting pass fails on aarch64 with:

  fatal error: error in backend: Unsupported calling convention.

llvm-svn: 341838
2018-09-10 15:08:02 +00:00
Sebastian Pop 6284b54ccd add flag instead of using a constant [NFC]
llvm-svn: 341837
2018-09-10 15:07:59 +00:00
Sebastian Pop ea0a91298d make flag name more specific to gvn [NFC]
llvm-svn: 341836
2018-09-10 15:07:56 +00:00
Tim Northover 12c1f7675f InstCombine: move hasOneUse check to the top of foldICmpAddConstant
There were two combines not covered by the check before now, neither of which
actually differed from normal in the benefit analysis.

The most recent seems to be because it was just added at the top of the
function (naturally). The older is from way back in 2008 (r46687) when we just
didn't put those checks in so routinely, and has been diligently maintained
since.

llvm-svn: 341831
2018-09-10 14:26:44 +00:00
Benjamin Kramer 28559a2605 Don't create a temporary vector of loop blocks just to iterate over them.
Loop's getBlocks returns an ArrayRef.

llvm-svn: 341821
2018-09-10 12:32:06 +00:00
John Brawn 8967e18c4a [GVN] Invalidate cached info for values replaced by equality propagation
When GVN propagates an equality by replacing one value with another it also
needs to invalidate the cached information for the value being replaced.

Differential Revision: https://reviews.llvm.org/D51218

llvm-svn: 341820
2018-09-10 12:23:05 +00:00
Max Kazantsev fde88578e5 [IndVars] Set Changed if rewriteFirstIterationLoopExitValues changes IR. PR38863
Currently, `rewriteFirstIterationLoopExitValues` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.

Differential Revision: https://reviews.llvm.org/D51779
Reviewed By: skatkov

llvm-svn: 341779
2018-09-10 06:50:16 +00:00
Max Kazantsev 4d10ba37b9 [IndVars] Set Changed if sinkUnusedInvariants changes IR. PR38863
Currently, `sinkUnusedInvariants` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.

Differential Revision: https://reviews.llvm.org/D51777
Reviewed By: skatkov

llvm-svn: 341777
2018-09-10 06:32:00 +00:00
Vikram TV 09be521d4d Move a transformation routine from LoopUtils to LoopVectorize.
Summary:
Move InductionDescriptor::transform() routine from LoopUtils to its only uses in LoopVectorize.cpp.
Specifically, the function is renamed as InnerLoopVectorizer::emitTransformedIndex().

This is a child to D51153.

Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51837

llvm-svn: 341776
2018-09-10 06:16:44 +00:00
Vikram TV 6594dc377d Move createMinMaxOp() out of RecurrenceDescriptor.
Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51838

llvm-svn: 341773
2018-09-10 05:05:08 +00:00
Abderrazek Zaafrani c30dfb2dfc [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Laod operand.
Differential Revision: https://reviews.llvm.org/D49151

llvm-svn: 341726
2018-09-07 22:41:57 +00:00
Alina Sbirlea f98c2c5e6d [MemorySSA] Update MemoryPhi wiring for block splitting to consider if identical edges were merged.
Summary:
Block splitting is done with either identical edges being merged, or not.
Only critical edges can be split without merging identical edges based on an option.
Teach the memoryssa updater to take this into account: for the same edge between two blocks only move one entry from the Phi in Old to the new Phi in New.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D51563

llvm-svn: 341709
2018-09-07 21:14:48 +00:00
Sanjay Patel c1416b60f2 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.

Differential Revision: https://reviews.llvm.org/D51496

llvm-svn: 341708
2018-09-07 21:03:34 +00:00
Fangrui Song b3b61de09a [PGO] Fix some style issue of ControlHeightReduction
Reviewers: yamauchi

Reviewed By: yamauchi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51811

llvm-svn: 341702
2018-09-07 20:23:15 +00:00
Hiroshi Yamauchi 06650941a2 [PGO][CHR] Build/warning fix
llvm-svn: 341692
2018-09-07 18:44:53 +00:00
JF Bastien 7e2dd2d2fa NFC: remove magic bool in LoopIdiomRecognize
Use an enum class instead.

llvm-svn: 341684
2018-09-07 18:17:59 +00:00
Hiroshi Yamauchi 5fb509b763 [PGO][CHR] Small cleanup.
Summary:
Do away with demangling. It wasn't really necessary.
Declared some local functions to be static.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51740

llvm-svn: 341681
2018-09-07 18:00:58 +00:00
Craig Topper 040c2b0acf [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.

I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.

Differential Revision: https://reviews.llvm.org/D51398

llvm-svn: 341674
2018-09-07 16:19:50 +00:00
Anna Thomas 110df11a1a [LV] Fix code gen for conditionally executed loads and stores
Fix a latent bug in loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed in review,
this bug definitely affects uniform loads and may affect conditional
stores that should have turned into scatters as well).

The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.

Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.

This patch does the following:
1. Uniform conditional loads on architectures with gather support will
   have correct code generated. In particular, the cost model
   (setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
   we use the isScalarWithPredication(I, VF) form which is added in the
   patch.

3. Fix the vectorization cost model for scalarization
   (getMemInstScalarizationCost): implement and use isPredicatedInst to
   identify *all* predicated instructions, not just scalar+predicated. So,
   now the cost for scalarization will be increased for maskedloads/stores
   and gather/scatter operations. In short, we should be choosing the
   gather/scatter in place of scalarization on archs where it is
   profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.

Reviewers: Ayal, hsaito, mkuper

Differential Revision: https://reviews.llvm.org/D51313

llvm-svn: 341673
2018-09-07 15:53:48 +00:00
Aditya Kumar 801394a3d7 Hot cold splitting pass
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.

Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658

llvm-svn: 341669
2018-09-07 15:03:49 +00:00
Florian Hahn e32ff4b28a [InstCombine] Do not fold scalar ops over select with vector condition.
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.

Reviewers: spatel, john.brawn, mssimpso, craig.topper

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51781

llvm-svn: 341666
2018-09-07 14:40:06 +00:00
Florian Hahn b30f7aeeeb [NewGVN] Mark function as changed if we erase instructions.
Currently eliminateInstructions only returns true if any instruction got
replaced. In the test case for this patch, we eliminate the trivially
dead calls, for which eliminateInstructions not do a replacement and the
function is not marked as changed, which is why the inliner crashes
while traversing the call graph.

Alternatively we could also change eliminateInstructions to return true
in case we mark instructions for deletion, but that's slightly more code
and doing it at the place where the replacement happens seems safer.

Fixes PR37517.

Reviewers: davide, mcrosier, efriedma, bjope

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D51169

llvm-svn: 341651
2018-09-07 11:41:34 +00:00
Alexander Potapenko 6301574cb4 [MSan] don't access MsanCtorFunction when using KMSAN
MSan has found a use of uninitialized memory in MSan, fix it.

llvm-svn: 341646
2018-09-07 09:56:36 +00:00
Alexander Potapenko 8fe99a0ef2 [MSan] Add KMSAN instrumentation to MSan pass
Introduce the -msan-kernel flag, which enables the kernel instrumentation.

The main differences between KMSAN and MSan instrumentations are:

- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
  Shadow and origin values for a particular X-byte memory location are
  read and written via pointers returned by
  __msan_metadata_ptr_for_load_X(u8 *addr) and
  __msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
  to a function returning that struct is inserted into every instrumented
  function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
  entry and unpoisoned with __msan_unpoison_alloca() before leaving the
  function;
- the pass doesn't declare any global variables or add global constructors
  to the translation unit.

llvm-svn: 341637
2018-09-07 09:10:30 +00:00
Max Kazantsev 9e6845d8e1 [IndVars] Set Changed when we delete dead instructions. PR38855
IndVars does not set `Changed` flag when it eliminates dead instructions. As result,
it may make IR modifications and report that it has done nothing. It leads to inconsistent
preserved analyzes results.

Differential Revision: https://reviews.llvm.org/D51770
Reviewed By: skatkov

llvm-svn: 341633
2018-09-07 07:23:39 +00:00
Wei Mi 94d44c97bc [SampleFDO] Make sample profile loader unaware of compact format change.
The patch tries to make sample profile loader independent of profile format
change. It moves compact format related code into FunctionSamples and
SampleProfileReader classes, and sample profile loader only has to interact
with those two classes and will be unaware of profile format changes.

The cleanup also contain some fixes to further remove the difference between
compactbinary format and binary format. After the cleanup using different
formats originated from the same profile will generate the same binaries,
which we verified by compiling two large server benchmarks w/wo thinlto.

Differential Revision: https://reviews.llvm.org/D51643

llvm-svn: 341591
2018-09-06 22:03:37 +00:00
Sanjay Patel 93bd15a005 [InstCombine] add xor+not folds
This fold is needed to avoid a regression when we try
to recommit rL300977. 
We can't see the most basic win currently because 
demanded bits changes the patterns:
https://rise4fun.com/Alive/plpp

llvm-svn: 341559
2018-09-06 16:23:40 +00:00
Alexander Potapenko 7f270fcf0a [MSan] store origins for variadic function parameters in __msan_va_arg_origin_tls
Add the __msan_va_arg_origin_tls TLS array to keep the origins for variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.

This is a reland of r341528.

test/msan/vararg.cc doesn't work on Mips, PPC and AArch64 (because this
patch doesn't touch them), XFAIL these arches.
Also turned out Clang crashed on i80 vararg arguments because of
incorrect origin type returned by getOriginPtrForVAArgument() - fixed it
and added a test.

llvm-svn: 341554
2018-09-06 15:14:36 +00:00
Sanjay Patel 1a00ffd656 [InstCombine] fix formatting in SimplifyDemandedVectorElts->Select; NFCI
I'm preparing to add the same functionality both here and to the DAG 
version of this code in D51696 / D51433, so try to make those cases 
as similar as possible to avoid bugs.

llvm-svn: 341545
2018-09-06 13:19:22 +00:00
Alexander Potapenko ac6595bd53 [MSan] revert r341528 to unbreak the bots
llvm-svn: 341541
2018-09-06 12:19:27 +00:00
Florian Hahn c51d08825f [LoopInterchange] Cleanup unused variables.
llvm-svn: 341537
2018-09-06 10:41:01 +00:00
Florian Hahn 236f6feb8f [LoopInterchange] Move preheader creation to transform stage and simplify.
There is no need to create preheaders in the analysis stage, we only
need them when adjusting the branches. Also, the only cases we need to
create our own preheaders is when they have more than 1 predecessors or
PHI nodes (even with only 1 predecessor, we could have an LCSSA phi
node). I have simplified the conditions and added some assertions to be
sure. Because we know the inner and outer loop need to be tightly
nested, it is sufficient to check if the inner loop preheader is the
outer loop header to check if we need to create a new preheader.

Reviewers: efriedma, mcrosier, karthikthecool

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D51703

llvm-svn: 341533
2018-09-06 09:57:27 +00:00
Alexander Potapenko 1a10ae0def [MSan] store origins for variadic function parameters in __msan_va_arg_origin_tls
Add the __msan_va_arg_origin_tls TLS array to keep the origins for
variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.

llvm-svn: 341528
2018-09-06 08:50:11 +00:00
Alexander Potapenko d518c5fc87 [MSan] Make sure variadic function arguments do not overflow __msan_va_arg_tls
Turns out that calling a variadic function with too many (e.g. >100 i64's)
arguments overflows __msan_va_arg_tls, which leads to smashing other TLS
data with function argument shadow values.

getShadow() already checks for kParamTLSSize and returns clean shadow if
the argument does not fit, so just skip storing argument shadow for such
arguments.

llvm-svn: 341525
2018-09-06 08:21:54 +00:00
Max Kazantsev f90154069c Revert "[IndVars] Turn isValidRewrite into an assertion" because it seems wrong
llvm-svn: 341517
2018-09-06 05:52:47 +00:00
Max Kazantsev 51690c4f52 [IndVars] Turn isValidRewrite into an assertion
Function rewriteLoopExitValues contains a check on isValidRewrite which
is needed to make sure that SCEV does not convert the pattern
`gep Base, (&p[n] - &p[0])` into `gep &p[n], Base - &p[0]`. This problem
has been fixed in SCEV long ago, so this check is just obsolete.

This patch converts it into an assertion to make sure that the SCEV will
not mess up this case in the future.

Differential Revision: https://reviews.llvm.org/D51582
Reviewed By: atrick

llvm-svn: 341516
2018-09-06 05:21:25 +00:00
Benjamin Kramer 9abad4814d [ControlHeightReduction] Remove unused includes
Also clang-format them.

llvm-svn: 341468
2018-09-05 13:51:05 +00:00
Benjamin Kramer 837f93af7d [Aggressive InstCombine] Move C bindings to their own header file.
llvm-svn: 341461
2018-09-05 11:41:12 +00:00
Richard Trieu 47c2bc58b3 Prevent unsigned overflow.
The sum of the weights is caculated in an APInt, which has a width smaller than
64.  In certain cases, the sum of the widths would overflow when calculations
are done inside an APInt, but would not if done with uint64_t.  Since the
values will be passed as uint64_t in the function call anyways, do all the math
in 64 bits.  Also added an assert in case the probabilities overflow 64 bits.

llvm-svn: 341444
2018-09-05 04:19:15 +00:00
Fangrui Song c8f348cba7 Fix -Wunused-function in release build after rL341386
llvm-svn: 341443
2018-09-05 03:10:20 +00:00
Sanjay Patel 63cf26cf01 [InstCombine] fix xor-or-xor fold to check uses and handle commutes
I'm probably missing some way to use m_Deferred to remove the code
duplication, but that can be a follow-up.

The improvement in demand_shrink_nsw.ll is an example of missing
the fold because the pattern matching was deficient. I didn't try
to follow the bits in that test, but Alive says it's correct:
https://rise4fun.com/Alive/ugc

llvm-svn: 341426
2018-09-04 23:22:13 +00:00
Zhaoshi Zheng a0aa41d793 Revert "Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions"
Reland r341269. Use std::stable_sort when sorting constant condidates.

Reverting commit, r341365:

  Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions

  One of the tests is failing 50% of the time when expensive checks are
  enabled. Not sure how deep the problem is so just reverting while the
  author can investigate so that the bots stop repeatedly failing and
  blaming things incorrectly. Will respond with details on the original
  commit.

Original commit, r341269:

  [Constant Hoisting] Hoisting Constant GEP Expressions

  Leverage existing logic in constant hoisting pass to transform constant GEP
  expressions sharing the same base global variable. Multi-dimensional GEPs are
  rewritten into single-dimensional GEPs.

  https://reviews.llvm.org/D51396

Differential Revision: https://reviews.llvm.org/D51654

llvm-svn: 341417
2018-09-04 22:17:03 +00:00
Anna Thomas dbacea188b [LV] First order recurrence phis should not be treated as uniform
This is fix for PR38786.
First order recurrence phis were incorrectly treated as uniform,
which caused them to be vectorized as uniform instructions.

Patch by Ayal Zaks and Orivej Desh!

Reviewed by: Anna

Differential Revision: https://reviews.llvm.org/D51639

llvm-svn: 341416
2018-09-04 22:12:23 +00:00
Hiroshi Yamauchi bd897a02a0 Fix a memory leak after rL341386.
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51658

llvm-svn: 341412
2018-09-04 21:28:22 +00:00
Sanjay Patel 0f70f86ce0 [InstCombine] make ((X & C) ^ C) form consistent for vectors
It would be better to create a 'not' here, but that's not possible yet.

llvm-svn: 341410
2018-09-04 21:17:14 +00:00
Sanjay Patel a89f183253 [InstCombine] simplify code for xor folds; NFCI
This is just a cleanup step. The TODO comments show
what is wrong with the 'and' version of the fold.
Fixing this should be part of recommitting:
rL300977

llvm-svn: 341405
2018-09-04 21:00:13 +00:00
Reid Kleckner 792a4f8a21 Fix unused variable warning
llvm-svn: 341400
2018-09-04 20:34:47 +00:00
Fedor Sergeev 8b6effd969 [SimpleLoopUnswitch] remove a chain of dead blocks at once
Recent change to deleteDeadBlocksFromLoop was not enough to
fix all the problems related to dead blocks after nontrivial
unswitching of switches.

We need to delete all the dead blocks that were created during
unswitching, otherwise we will keep having problems with phi's
or dead blocks.

This change removes all the dead blocks that are reachable from the loop,
not trying to track whether these blocks are newly created by unswitching
or not. While not completely correct, we are unlikely to get loose but
reachable dead blocks that do not belong to our loop nest.

It does fix all the failures currently known, in particular PR38778.

Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D51519

llvm-svn: 341398
2018-09-04 20:19:41 +00:00
Hiroshi Yamauchi 72ee6d6000 Fix build failures after rL341386.
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51647

llvm-svn: 341391
2018-09-04 18:10:54 +00:00
Hiroshi Yamauchi 9775a620b0 [PGO] Control Height Reduction
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.

if (hot_cond1) { // Likely true.
  do_stg_hot1();
}
if (hot_cond2) { // Likely true.
  do_stg_hot2();
}

->

if (hot_cond1 && hot_cond2) { // Hot path.
  do_stg_hot1();
  do_stg_hot2();
} else { // Cold path.
  if (hot_cond1) {
    do_stg_hot1();
  }
  if (hot_cond2) {
    do_stg_hot2();
  }
}

This speeds up some internal benchmarks up to ~30%.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D50591

llvm-svn: 341386
2018-09-04 17:19:13 +00:00
Chandler Carruth 6cb12444cc Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions
One of the tests is failing 50% of the time when expensive checks are
enabled. Not sure how deep the problem is so just reverting while the
author can investigate so that the bots stop repeatedly failing and
blaming things incorrectly. Will respond with details on the original
commit.

llvm-svn: 341365
2018-09-04 13:36:44 +00:00
Chandler Carruth 664aa868f5 [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative
Load Hardening.

Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.

Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.

While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.

This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.

Differential Revision: https://reviews.llvm.org/D51157

llvm-svn: 341363
2018-09-04 12:38:00 +00:00
Nicola Zaghen 9588ad9611 [InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)
Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares.

This is the Alive proof: https://rise4fun.com/Alive/nyY

Differential Revision: https://reviews.llvm.org/D50972

llvm-svn: 341353
2018-09-04 10:29:48 +00:00
Max Kazantsev f34115c627 [NFC] Add assert to detect LCSSA breaches early
llvm-svn: 341347
2018-09-04 06:34:40 +00:00
Max Kazantsev 2cbba56337 [IndVars] Fix usage of SCEVExpander to not mess with SCEVConstant. PR38674
This patch removes the function `expandSCEVIfNeeded` which behaves not as
it was intended. This function tries to make a lookup for exact existing expansion
and only goes to normal expansion via `expandCodeFor` if this lookup hasn't found
anything. As a result of this, if some instruction above the loop has a `SCEVConstant`
SCEV, this logic will return this instruction when asked for this `SCEVConstant` rather
than return a constant value. This is both non-profitable and in some cases leads to
breach of LCSSA form (as in PR38674).

Whether or not it is possible to break LCSSA with this algorithm and with some
non-constant SCEVs is still in question, this is still being investigated. I wasn't
able to construct such a test so far, so maybe this situation is impossible. If it is,
it will go as a separate fix.

Rather than do it, it is always correct to just invoke `expandCodeFor` unconditionally:
it behaves smarter about insertion points, and as side effect of this it will choose a
constant value for SCEVConstants. For other SCEVs it may end up finding a better insertion
point. So it should not be worse in any case.

NOTE: So far the only known case for which this transform may break LCSSA is mapping
of SCEVConstant to an instruction. However there is a suspicion that the entire algorithm
can compromise LCSSA form for other cases as well (yet not proved).

Differential Revision: https://reviews.llvm.org/D51286
Reviewed By: etherzhhb

llvm-svn: 341345
2018-09-04 05:01:35 +00:00
Sanjay Patel 2fe1f62c88 [InstCombine] simplify xor/not folds; NFCI
llvm-svn: 341336
2018-09-03 18:40:56 +00:00
Sanjay Patel d75064e6d5 [InstCombine] allow add+not --> sub for arbitrary vector constants.
llvm-svn: 341335
2018-09-03 18:21:59 +00:00
Florian Hahn cc9dc599ba [SLC] Support expanding pow(x, n+0.5) to x * x * ... * sqrt(x)
Reviewers: evandro, efriedma, spatel

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51435

llvm-svn: 341330
2018-09-03 17:37:39 +00:00
Argyrios Kyrtzidis c30340b207 Add header guards to some headers that are missing them
Also adjust some of dsymutil's headers to put the header guards at the top,
otherwise the compiler will not recognize them as header guards.

llvm-svn: 341323
2018-09-03 16:22:05 +00:00
Sanjay Patel 17e709b66a [InstCombine] allow not+sub fold for arbitrary vector constants
The fold was implemented for the general case but use-limitation,
but the later constant version which didn't check uses was only
matching splat constants.

llvm-svn: 341292
2018-09-02 19:31:45 +00:00
Sanjay Patel ca36eb4e33 [Reassociate] swap binop operands to increase factoring potential
If we have a pair of binops feeding another pair of binops, rearrange the operands so 
the matching pair are together because that allows easy factorization folds to happen 
in instcombine:
((X << S) & Y) & (Z << S) --> ((X << S) & (Z << S)) & Y (reassociation)

--> ((X & Z) << S) & Y (factorize shift from 'and' ops optimization)

This is part of solving PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

Note that there's an instcombine version of this patch attached there, but we're trying
to make instcombine have less responsibility to improve compile-time efficiency.

For reasons I still don't completely understand, reassociate does this kind of transform
sometimes, but misses everything in my motivating cases.

This patch on its own is gluing an independent cleanup chunk to the end of the existing 
RewriteExprTree() loop. We can build on it and do something stronger to better order the 
full expression tree like D40049. That might be an alternative to the proposal to add a 
separate reassociation pass like D41574.

Differential Revision: https://reviews.llvm.org/D45842

llvm-svn: 341288
2018-09-02 14:22:54 +00:00
Sanjay Patel 099b1a4b0c [InstCombine] simplify code for 'or' fold
This is no-outwardly-visible-change intended, so no test.
But the code is smaller and more efficient. The check for
a 'not' op is intended to avoid the expensive value tracking
call when it should not be necessary, and it might prevent
infinite looping when we resurrect:
rL300977

llvm-svn: 341280
2018-09-01 15:08:59 +00:00
Zhaoshi Zheng f5297fb24b [Constant Hoisting] Hoisting Constant GEP Expressions
Leverage existing logic in constant hoisting pass to transform constant GEP
expressions sharing the same base global variable. Multi-dimensional GEPs are
rewritten into single-dimensional GEPs.

Differential Revision: https://reviews.llvm.org/D51396

llvm-svn: 341269
2018-09-01 00:04:56 +00:00
Matt Arsenault c807ce0ee4 SLPVectorizer: Fix assert with different sized address spaces
llvm-svn: 341215
2018-08-31 14:34:53 +00:00
Evandro Menezes 2123ea7d5c [InstCombine] Expand the simplification of pow() into exp2()
Generalize the simplification of `pow(2.0, y)` to `pow(2.0 ** n, y)` for all
scalar and vector types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as
252.eon, 447.dealII, 453.povray.  Otherwise, no significant regressions on
x86-64 or A64.

Differential revision: https://reviews.llvm.org/D49273

llvm-svn: 341095
2018-08-30 19:04:51 +00:00
Eli Friedman 94d3e4dd77 [SROA] Fix alignment for uses of PHI nodes.
Splitting an alloca can decrease the alignment of GEPs into the
partition.  Normally, rewriting accounts for this, but the code was
missing for uses of PHI nodes and select instructions.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38707 .

Differential Revision: https://reviews.llvm.org/D51335

llvm-svn: 341094
2018-08-30 18:59:24 +00:00
Matt Morehouse 7e042bb1d1 [libFuzzer] Port to Windows
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.

Patch By: metzman

Reviewers: morehouse, rnk

Reviewed By: morehouse, rnk

Subscribers: #sanitizers, delcypher, morehouse, kcc, eraman

Differential Revision: https://reviews.llvm.org/D51022

llvm-svn: 341082
2018-08-30 15:54:44 +00:00
Nicolai Haehnle 35617ed4cb [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.

Patch by: Simon Moll

Reviewed By: nhaehnle

Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50434

Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
2018-08-30 14:21:36 +00:00
Martin Storsjo 22dcddf651 Revert "[SimplifyCFG] Common debug handling [NFC]"
This reverts commit r340997.

This change turned out not to be NFC after all, but e.g. causes
clang to crash when building the linux kernel for aarch64.

llvm-svn: 341031
2018-08-30 08:06:50 +00:00
Max Kazantsev d3a4cbe153 [NFC] Move OrderedInstructions and InstructionPrecedenceTracking to Analysis
These classes don't make any changes to IR and have no reason to be in
Transform/Utils. This patch moves them to Analysis folder. This will allow
us reusing these classes in some analyzes, like MustExecute.

llvm-svn: 341015
2018-08-30 04:49:03 +00:00
Max Kazantsev 3c284bde3f Re-enable "[NFC] Unify guards detection"
rL340921 has been reverted by rL340923 due to linkage dependency
from Transform/Utils to Analysis which is not allowed. In this patch
this has been fixed, a new utility function moved to Analysis.

Differential Revision: https://reviews.llvm.org/D51152

llvm-svn: 341014
2018-08-30 03:39:16 +00:00
Philip Reames bed556133d [SimplifyCFG] Rename a variable for readibility of a future change [NFC]
llvm-svn: 341004
2018-08-30 00:12:29 +00:00
Philip Reames 6bd16b5850 [SimplifyCFG] Fix a cost modeling oversight in branch commoning
The cost modeling was not accounting for the fact we were duplicating the instruction once per predecessor.  With a default threshold of 1, this meant we were actually creating #pred copies.

Adding to the fun, there is *absolutely no* test coverage for this.  Simply bailing for more than one predecessor passes all checked in tests.

llvm-svn: 341001
2018-08-30 00:03:02 +00:00
Philip Reames 7c57dac955 [SimplifyCFG] Common debug handling [NFC]
llvm-svn: 340997
2018-08-29 23:22:07 +00:00
Reid Kleckner 9397c2a23b Revert r340947 "[InstCombine] Expand the simplification of pow() into exp2()"
It broke the clang-cl self-host.

llvm-svn: 340991
2018-08-29 22:58:33 +00:00
Philip Reames 1887c40b22 Add a todo and tests to Address a review commnt from D50925 [NFC]
llvm-svn: 340978
2018-08-29 22:09:21 +00:00
Philip Reames f562fc8dbf [LICM] Hoist stores of invariant values to invariant addresses out of loops
Teach LICM to hoist stores out of loops when the store writes to a location otherwise unused in the loop, writes a value which is invariant, and is guaranteed to execute if the loop is entered.

Worth noting is that this transformation is partially overlapping with the existing promotion transformation. Reasons this is worthwhile anyway include:
 * For multi-exit loops, this doesn't require duplication of the store.
 * It kicks in for case where we can't prove we exit through a normal exit (i.e. we may throw), but can prove the store executes before that possible side exit.

Differential Revision: https://reviews.llvm.org/D50925

llvm-svn: 340974
2018-08-29 21:49:30 +00:00
Fedor Sergeev 7b49aa03af [SimpleLoopUnswitch] After unswitch delete dead blocks in parent loops
Summary:
Assert from PR38737 happens on the dead block inside the parent loop
after unswitching nontrivial switch in the inner loop.

deleteDeadBlocksFromLoop now takes extra care to detect/remove dead
blocks in all the parent loops in addition to the blocks from original
loop being unswitched.

Reviewers: asbirlea, chandlerc

Reviewed By: asbirlea

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51415

llvm-svn: 340955
2018-08-29 19:10:44 +00:00
Matt Morehouse cf311cfc20 Revert "[libFuzzer] Port to Windows"
This reverts r340949 due to bot breakage again.

llvm-svn: 340954
2018-08-29 18:40:41 +00:00
Sanjay Patel 0f29e953b7 [InstCombine] canonicalize fneg with llvm.sin
This is a follow-up to rL339604 which did the same transform
for a sin libcall. The handling of intrinsics vs. libcalls
is unfortunately scattered, so I'm just adding this next to
the existing transform for llvm.cos for now.

This should resolve PR38458:
https://bugs.llvm.org/show_bug.cgi?id=38458
If the call was already negated, the negates will cancel
each other out.

llvm-svn: 340952
2018-08-29 18:27:49 +00:00
Matt Morehouse 245ebd71ef [libFuzzer] Port to Windows
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.

Reviewers: morehouse, rnk

Reviewed By: morehouse, rnk

Subscribers: #sanitizers, delcypher, morehouse, kcc, eraman

Differential Revision: https://reviews.llvm.org/D51022

llvm-svn: 340949
2018-08-29 18:08:34 +00:00
Evandro Menezes 22e0bdf4ed [InstCombine] Expand the simplification of pow() with nested exp{,2}()
Expand the simplification of `pow(exp{,2}(x), y)` to all FP types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as
252.eon, 447.dealII, 453.povray.  Otherwise, no significant regressions on
x86-64 or A64.

Differential revision: https://reviews.llvm.org/D51195

llvm-svn: 340948
2018-08-29 17:59:48 +00:00
Evandro Menezes a3a7b53571 [InstCombine] Expand the simplification of pow() into exp2()
Generalize the simplification of `pow(2.0, y)` to `pow(2.0 ** n, y)` for all
scalar and vector types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as
252.eon, 447.dealII, 453.povray.  Otherwise, no significant regressions on
x86-64 or A64.

Differential revision: https://reviews.llvm.org/D49273

llvm-svn: 340947
2018-08-29 17:59:34 +00:00
Craig Topper 2bcb1eeee1 [InstCombine] Replace two calls to getNumUses() with !hasNUsesOrMore
We were calling getNumUses to check for 1 or 2 uses. But getNumUses is linear in the number of uses. We can instead use !hasNUsesOrMore(3) which will stop the linear scan as soon as it determines there are at least 3 uses even if there are more.

llvm-svn: 340939
2018-08-29 17:09:21 +00:00
Sanjay Patel d4e19d272a [InstCombine] move declarations closer to uses; NFC
llvm-svn: 340930
2018-08-29 14:42:12 +00:00
Sanjay Patel 7a05641fa8 [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.

llvm-svn: 340926
2018-08-29 13:24:34 +00:00
Alexandros Lamprineas f6db5bcd38 Revert r340922 "[GVNHoist] Re-enable GVNHoist by default"
Another sanitizer buildbot failed this time at bootstrap when
compiling SemaTemplateInstantiate.cpp with this assertion:
`dominates(MD, U) && "Memory Def does not dominate it's uses"'.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/15047

llvm-svn: 340925
2018-08-29 13:00:55 +00:00
Hans Wennborg 2c390c54f6 Revert r340921 "[NFC] Unify guards detection"
This broke the build, see e.g.

http://lab.llvm.org:8011/builders/clang-cmake-armv8-lnt/builds/4626/
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/18647/
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/5856/
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/22800/

> We have multiple places in code where we try to identify whether or not
> some instruction is a guard. This patch factors out this logic into a separate
> utility function which works uniformly in all places.
>
> Differential Revision: https://reviews.llvm.org/D51152
> Reviewed By: fedor.sergeev

llvm-svn: 340923
2018-08-29 12:21:32 +00:00
Alexandros Lamprineas c03b9b8854 [GVNHoist] Re-enable GVNHoist by default
Rebase rL338240 since the excessive memory usage observed when using
GVNHoist with UBSan has been fixed by rL340818.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 340922
2018-08-29 11:58:34 +00:00
Max Kazantsev 1dafaa87d9 [NFC] Unify guards detection
We have multiple places in code where we try to identify whether or not
some instruction is a guard. This patch factors out this logic into a separate
utility function which works uniformly in all places.

Differential Revision: https://reviews.llvm.org/D51152
Reviewed By: fedor.sergeev

llvm-svn: 340921
2018-08-29 11:37:34 +00:00
Max Kazantsev 8b4ffe66d6 [NFC] Factor out guard utility methods into a separate file
This patch creates file GuardUtils which will contain logic for work with guards
that can be shared across different passes.

Differential Revision: https://reviews.llvm.org/D51151
Reviewed By: fedor.sergeev

llvm-svn: 340914
2018-08-29 10:51:59 +00:00
Hans Wennborg e0f3e9283f LoopSink: Don't sink into blocks without an insertion point (PR38462)
In the PR, LoopSink was trying to sink into a catchswitch block, which
doesn't have a valid insertion point.

Differential Revision: https://reviews.llvm.org/D51307

llvm-svn: 340900
2018-08-29 06:55:27 +00:00
Zhaoshi Zheng 35818e2789 [QTOOL-37352] Consider isLegalAddressingImm in Constant Hoisting
In Thumb1, legal imm range is [0, 255] for ADD/SUB instructions. However, the
legal imm range for LD/ST in (R+Imm) addressing mode is [0, 127]. Imms in
[128, 255] are materialized by mov R, #imm, and LD/STs use them in (R+R)
addressing mode.

This patch checks if a constant is used as offset in (R+Imm), if so, it checks
isLegalAddressingMode passing the constant value as BaseOffset.

Differential Revision: https://reviews.llvm.org/D50931

llvm-svn: 340882
2018-08-28 23:00:59 +00:00
Alina Sbirlea 52e97a28d4 [SimpleLoopUnswitch] Form dedicated exits after trivial unswitches.
Summary:
Form dedicated exits after trivial unswitches.
Fixes PR38737, PR38283.

Reviewers: chandlerc, fedor.sergeev

Subscribers: sanjoy, jlebar, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D51375

llvm-svn: 340871
2018-08-28 20:41:05 +00:00
Matt Morehouse bab8556f01 Revert "[libFuzzer] Port to Windows"
This reverts commit r340860 due to failing tests.

llvm-svn: 340867
2018-08-28 19:07:24 +00:00
Matt Morehouse c6fff3b6f5 [libFuzzer] Port to Windows
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.

Patch By: metzman

Reviewers: morehouse, rnk

Reviewed By: morehouse, rnk

Subscribers: morehouse, kcc, eraman

Differential Revision: https://reviews.llvm.org/D51022

llvm-svn: 340860
2018-08-28 18:34:32 +00:00
Matt Arsenault 10de2775bd AMDGPU: Remove nan tests in class if src is nnan
llvm-svn: 340850
2018-08-28 18:10:02 +00:00
David Bolvansky c1b27b562b [Inliner] Attribute callsites with inline remarks
Summary:
Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.

An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.

For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons:
  remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
  remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive

It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
  define void @test1() {
    call void @bar(i1 true) #0
    call void @bar(i1 false) #2
    ret void
  }
  attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
  ..
  attributes #2 = { "inline-remark"="(cost=never): recursive" }

Patch by: yrouban (Yevgeny Rouban)

Reviewers: xbolva00, tejohnson, apilipenko

Reviewed By: xbolva00, tejohnson

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D50435

llvm-svn: 340834
2018-08-28 15:27:25 +00:00
Mikael Holmen 4d652c4ce7 [CloneFunction] Constant fold terminators before checking single predecessor
Summary:
This fixes PR31105.

There is code trying to delete dead code that does so by e.g. checking if
the single predecessor of a block is the block itself.

That check fails on a block like this
 bb:
   br i1 undef, label %bb, label %bb
since that has two (identical) predecessors.

However, after the check for dead blocks there is a call to
ConstantFoldTerminator on the basic block, and that call simplifies the
block to
 bb:
   br label %bb

Therefore we now do the call to ConstantFoldTerminator before the check if
the block is dead, so it can realize that it really is.

The original behavior lead to the block not being removed, but it was
simplified as above, and then we did a call to
    Dest->replaceAllUsesWith(&*I);
with old and new being equal, and an assertion triggered.

Reviewers: chandlerc, fhahn

Reviewed By: fhahn

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D51280

llvm-svn: 340820
2018-08-28 12:40:11 +00:00
Alexandros Lamprineas 484bd13e2d [GVNHoist] Prune out useless CHI insertions
Fix for the out-of-memory error when compiling SemaChecking.cpp
with GVNHoist and ubsan enabled. I've used a cache for inserted
CHIs to avoid excessive memory usage.

Differential Revision: https://reviews.llvm.org/D50323

llvm-svn: 340818
2018-08-28 11:07:54 +00:00
Max Kazantsev 0c4b84e2df [NFC] A loop can never contain Ret instruction
llvm-svn: 340808
2018-08-28 09:26:28 +00:00
Craig Topper a6cd4b9bce [InstCombine] Extend (add (sext x), cst) --> (sext (add x, cst')) and (add (zext x), cst) --> (zext (add x, cst')) to work for vectors
Differential Revision: https://reviews.llvm.org/D51236

llvm-svn: 340796
2018-08-28 02:02:29 +00:00
Sanjay Patel c615910be5 [InstCombine] fix formatting; NFC
llvm-svn: 340790
2018-08-27 23:01:10 +00:00
Sanjay Patel 42d31c20a8 [InstCombine] allow shuffle+binop canonicalization with widening shuffles
This lines up with the behavior of an existing transform where if both 
operands of the binop are shuffled, we allow moving the binop before the 
shuffle regardless of whether the shuffle changes the size of the vector.

llvm-svn: 340787
2018-08-27 22:41:44 +00:00
Evandro Menezes 253991cfaf [PATCH] [InstCombine] Fix issue in the simplification of pow() with nested exp{,2}()
Fix the issue of duplicating the call to `exp{,2}()` when it's nested in
`pow()`, as exposed by rL340462.

Differential revision: https://reviews.llvm.org/D51194

llvm-svn: 340784
2018-08-27 22:11:15 +00:00
George Burgess IV aa09a82b4b s/std::set/DenseSet/; NFC
We only use this set for `insert` and `count`, so a hashing container
seems better here.

llvm-svn: 340783
2018-08-27 22:10:59 +00:00
Nico Weber e75fd1b184 fix comment typo
llvm-svn: 340744
2018-08-27 14:25:22 +00:00
Max Kazantsev b5dd092051 [NFC] Split logic of ImplicitControlFlowTracking to allow generalization
We have a class `ImplicitControlFlowTracking` which allows us to keep track of
instructions that can abnormally exit and answer queries like "whether or not
there is side-exiting instruction above this instruction in its block".

We may want to have the similar tracking for other types of "special" instructions,
for example instructions that write memory.

This patch separates ImplicitControlFlowTracking into two classes, isolating all
general logic not related to implicit control flow into its parent class. We can
later make another child of this class to keep track of instructions that write
memory.

The motivation for that is that we want to make these checks efficiently in the
patch https://reviews.llvm.org/D50891.

NOTE: The naming of the parent class is not super cool, but the other options we
have are hardly better. Please feel free to rename it as NFC if you think you've
found a more informative name for it.

Differential Revision: https://reviews.llvm.org/D50954
Reviewed By: fedor.sergeev

llvm-svn: 340728
2018-08-27 09:43:16 +00:00
Chandler Carruth 9ae926b973 [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.

All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.

llvm-svn: 340701
2018-08-26 09:51:22 +00:00
Chandler Carruth 698fbe7b59 [IR] Sink `isExceptional` predicate to `Instruction`, rename it to
`isExceptionalTermiantor` and implement it for opcodes as well following
the common pattern in `Instruction`.

Part of removing `TerminatorInst` from the `Instruction` type hierarchy
to make it easier to share logic and interfaces between instructions
that are both terminators and not terminators.

llvm-svn: 340699
2018-08-26 08:56:42 +00:00
Chandler Carruth 96fc1de77d [IR] Begin removal of TerminatorInst by removing successor manipulation.
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.

The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.

Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.

This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html

There will be a series of much more mechanical changes following this
one to complete this move.

Differential Revision: https://reviews.llvm.org/D47467

llvm-svn: 340698
2018-08-26 08:41:15 +00:00
Xinliang David Li bcf726a32d [PGO] add target md5sum in warning message for icall
Differential revision: http://reviews.llvm.org/D51193

llvm-svn: 340657
2018-08-24 21:38:24 +00:00
Philip Reames 1c0fde61a6 [AST] Simplify code minorly using pattern match [NFC]
llvm-svn: 340638
2018-08-24 19:13:39 +00:00
David Bolvansky 1ccbddca2c Revert [Inliner] Attribute callsites with inline remarks
llvm-svn: 340619
2018-08-24 16:39:41 +00:00
David Bolvansky 7c0537a3ac [Inliner] Attribute callsites with inline remarks
Summary:
Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.

An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.

For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons:
  remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
  remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive

It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
  define void @test1() {
    call void @bar(i1 true) #0
    call void @bar(i1 false) #2
    ret void
  }
  attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
  ..
  attributes #2 = { "inline-remark"="(cost=never): recursive" }

Patch by: yrouban (Yevgeny Rouban)

Reviewers: xbolva00, tejohnson, apilipenko

Reviewed By: xbolva00, tejohnson

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D50435

llvm-svn: 340618
2018-08-24 16:28:36 +00:00
Philip Reames 9ec15faf20 [LICM] Hoist an invariant_start out of loops if there are no stores executed before it
Once the invariant_start is reached, we know that no instruction *after* it can modify the memory. So, if we can prove the location isn't read *between entry into the loop and the execution of the invariant_start*, we can execute the invariant_start before entering the loop.

Differential Revision: https://reviews.llvm.org/D51181

llvm-svn: 340617
2018-08-24 16:24:48 +00:00
Florian Hahn 406f1ff1cd [Local] Make DoesKMove required for combineMetadata.
This patch makes the DoesKMove argument non-optional, to force people
to think about it. Most cases where it is false are either code hoisting
or code sinking, where we pick one instruction from a set of
equal instructions among different code paths.

Reviewers: dberlin, nlopes, efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D47475

llvm-svn: 340606
2018-08-24 11:40:04 +00:00
David Bolvansky 589bb484f6 [LoopVectorize][NFCI] Use find instead of count
Summary:
Avoid "count" if possible -> use "find" to check for the existence of keys.

Passed llvm test suite.

Reviewers: fhahn, dcaballe, mkuper, rengolin

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51054

llvm-svn: 340563
2018-08-23 18:34:58 +00:00
David Bolvansky 43b0e25847 [InstCombine] Fold Select with binary op - FP opcodes
Summary:
Follow up for https://reviews.llvm.org/rL339520 and https://reviews.llvm.org/rL338300

Alive:

```
%A = fcmp oeq float %x, 0.0
%B = fadd nsz float %x, %z
%C = select i1 %A, float %B, float %y
=>
%C = select i1 %A, float %z, float %y
----------                                                                      
  %A = fcmp oeq float %x, 0.0
  %B = fadd nsz float %x, %z
  %C = select %A, float %B, float %y
=>
  %C = select %A, float %z, float %y

Done: 1                                                                         
Optimization is correct

%A = fcmp une float %x, -0.0
%B = fadd nsz float %x, %z
%C = select i1 %A, float %y, float %B
=>
%C = select i1 %A, float %y, float %z
----------                                                                      
  %A = fcmp une float %x, -0.0
  %B = fadd nsz float %x, %z
  %C = select %A, float %y, float %B
=>
  %C = select %A, float %y, float %z

Done: 1                                                                         
Optimization is correct
```


Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50714

llvm-svn: 340538
2018-08-23 15:22:15 +00:00
Brian Homerding 3ecabd709f [FunctionAttrs] Infer WriteOnly Function Attribute
These changes expand the FunctionAttr logic in order to mark functions as
WriteOnly when appropriate. This is done through an additional bool variable
and extended logic.

Reviewers: hfinkel, jdoerfert

Differential Revision: https://reviews.llvm.org/D48387

llvm-svn: 340537
2018-08-23 15:05:22 +00:00
John Brawn 23cbf09fad [GVN] Invalidate cached info for phis when setting dead predecessors to undef
When GVN sets the incoming value for a phi to undef because the incoming block
is unreachable it needs to also invalidate the cached info for that phi in
MemoryDependenceAnalysis, otherwise later queries will return stale information.

Differential Revision: https://reviews.llvm.org/D51099

llvm-svn: 340529
2018-08-23 12:48:17 +00:00
Florian Hahn 17e7ace5e9 [SCCP] Remove unused variable added in r340525.
llvm-svn: 340526
2018-08-23 11:17:59 +00:00
Florian Hahn 3052290dc0 Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
This version of the patch fixes cleaning up ssa_copy intrinsics, so it does not
crash for instructions in blocks that have been marked unreachable.

This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330

llvm-svn: 340525
2018-08-23 11:04:00 +00:00
Alexander Richardson 6bcf2ba2f0 Allow creating llvm::Function in non-zero address spaces
Most users won't have to worry about this as all of the
'getOrInsertFunction' functions on Module will default to the program
address space.

An overload has been added to Function::Create to abstract away the
details for most callers.

This is based on https://reviews.llvm.org/D37054 but without the changes to
make passing a Module to Function::Create() mandatory. I have also added
some more tests and fixed the LLParser to accept call instructions for
types in the program address space.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D47541

llvm-svn: 340519
2018-08-23 09:25:17 +00:00
David Bolvansky 8715e03477 [LibCalls] Added returned attribute to libcalls
Reviewers: efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51092

llvm-svn: 340512
2018-08-23 05:18:23 +00:00
Evandro Menezes 6acbe30ee1 [NFC] Refactor simplification of pow()
llvm-svn: 340476
2018-08-22 23:18:02 +00:00
Alina Sbirlea 8b83d68544 Update MemorySSA in LoopSimplifyCFG.
Summary:
Add MemorySSA as a dependency to LoopSimplifyCFG and preserve it.
Disabled by default until all passes preserve MemorySSA.

Reviewers: bogner, chandlerc

Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D50911

llvm-svn: 340445
2018-08-22 20:10:21 +00:00
Alina Sbirlea c1a216b251 Update MemorySSA in LoopInstSimplify.
Summary:
Add MemorySSA as a depency to LoopInstInstSimplify and preserve it.
Disabled by default until all passes preserve MemorySSA.

Reviewers: chandlerc

Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D50906

llvm-svn: 340444
2018-08-22 20:05:21 +00:00
Max Kazantsev 611d645a08 [GuardWidening] Ignore guards with trivial conditions
Guard widening should not spend efforts on dealing with guards with trivial true/false conditions.
Such guards can easily be eliminated by any further cleanup pass like instcombine. However we
should not unconditionally delete them because it may be profitable to widen other conditions
into such guards.

Differential Revision: https://reviews.llvm.org/D50247
Reviewed By: fedor.sergeev

llvm-svn: 340381
2018-08-22 02:40:49 +00:00
Alina Sbirlea ab6f84f763 Update MemorySSA in BasicBlockUtils.
Summary:
Extend BasicBlocksUtils to update MemorySSA.

Subscribers: sanjoy, arsenm, nhaehnle, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45300

llvm-svn: 340365
2018-08-21 23:32:03 +00:00
Marcello Maggioni 883fe455f1 [LICM] Refactor some AliasSetTracker code to get rid of new/deletes. NFC
Differential Revision: https://reviews.llvm.org/D51024

llvm-svn: 340333
2018-08-21 20:30:14 +00:00
Florian Hahn 7cdf52e425 [CodeExtractor] Use 'normal destination' BB as insert point to store invoke results.
Currently CodeExtractor tries to use the next node after an invoke to
place the store for the result of the invoke, if it is an out parameter
of the region. This fails, as the invoke terminates the current BB.
In that case, we can place the store in the 'normal destination' BB, as
the result will only be available in that case.


Reviewers: davidxl, davide, efriedma

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D51037

llvm-svn: 340331
2018-08-21 20:07:46 +00:00
Craig Topper 3d8fe39ca7 [InstCombine] Pull simple checks above a more complicated one. NFCI
I'm assuming its easier to make sure the RHS of an XOR is all ones than it is to check for the many select patterns we have. So lets check that first. Same with the one use check.

llvm-svn: 340321
2018-08-21 19:17:00 +00:00
Florian Hahn 9583d4fa03 [GVN] Assign new value number to calls reading memory, if there is no MemDep info.
Currently we assign the same value number to two calls reading the same
memory location if we do not have MemoryDependence info. Without MemDep
Info we cannot guarantee that there is no store between the two calls, so we
have to assign a new number to the second call.

It also adds a new option EnableMemDep to enable/disable running
MemoryDependenceAnalysis and also renamed NoLoads to NoMemDepAnalysis to
be more explicit what it does. As it also impacts calls that read memory,
NoLoads is a bit confusing.

Reviewers: efriedma, sebpop, john.brawn, wmi

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D50893

llvm-svn: 340319
2018-08-21 19:11:27 +00:00
Philip Reames c3c23e8cf2 [AST] Remove notion of volatile from alias sets [NFCI]
Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.

It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.)

There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for..

llvm-svn: 340312
2018-08-21 17:59:11 +00:00
Craig Topper b172b8884a [BypassSlowDivision] Teach bypass slow division not to interfere with div by constant where constants have been constant hoisted, but not moved from their basic block
DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block. it will be hoisted, but not leave the block.

Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all.

This fixes the case from PR38649.

Differential Revision: https://reviews.llvm.org/D51000

llvm-svn: 340303
2018-08-21 17:15:33 +00:00
Anna Thomas b02b0ad8c7 [LV] Vectorize loops where non-phi instructions used outside loop
Summary:
Follow up change to rL339703, where we now vectorize loops with non-phi
instructions used outside the loop. Note that the cyclic dependency
identification occurs when identifying reduction/induction vars.

We also need to identify that we do not allow users where the PSCEV information
within and outside the loop are different. This was the fix added in rL307837
for PR33706.

Reviewers: Ayal, mkuper, fhahn

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50778

llvm-svn: 340278
2018-08-21 14:40:27 +00:00
Max Kazantsev 097ef69182 [LICM] Hoist guards with invariant conditions
This patch teaches LICM to hoist guards from the loop if they are guaranteed to execute and
if there are no side effects that could prevent that.

Differential Revision: https://reviews.llvm.org/D50501
Reviewed By: reames

llvm-svn: 340256
2018-08-21 08:11:31 +00:00
Reid Kleckner 85a8c12db8 Re-land r334313 "[asan] Instrument comdat globals on COFF targets"
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.

This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.

I tested this change with ASan+PGO, and I fixed an issue with the
__llvm_profile_raw_version symbol. With the old version of my patch, we
would attempt to instrument that symbol on ELF because it had a comdat
with external linkage. If we had been using the linker GC-friendly
metadata scheme, everything would have worked, but clang does not enable
it by default.

llvm-svn: 340232
2018-08-20 23:35:45 +00:00
Craig Topper bee74793a3 [InstCombine] Add splat vector constant support to foldICmpAddOpConst.
Differential Revision: https://reviews.llvm.org/D50946

llvm-svn: 340231
2018-08-20 23:04:25 +00:00
Michael Berg 0b838deddc extend binop folds for selects to include true and false binops flag intersection
Summary: This change address bug 38641

Reviewers: spatel, wristow

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D50996

llvm-svn: 340222
2018-08-20 22:26:58 +00:00
Justin Bogner 6f1740d52f [SimplifyCFG] Replace some uses of bitwise or with logical or
It's clearer to use logical or for boolean values. Thanks to Steven
Zhang for noticing!

llvm-svn: 340153
2018-08-20 06:37:11 +00:00
Craig Topper 24674ca773 [InstCombine] Move some variable declarations into a more appropriate scope. NFC
llvm-svn: 340150
2018-08-20 05:35:12 +00:00
whitequark c438ac2352 [LLVM-C] Add coroutine passes
Differential Revision: https://reviews.llvm.org/D50950

llvm-svn: 340147
2018-08-19 23:39:57 +00:00
Jun Lim da5864c73c Test commit
I just removed a blank space.

llvm-svn: 340069
2018-08-17 18:40:41 +00:00
Evandro Menezes 4b39010afb [InstCombine] Refactor the simplification of pow() (NFC)
Refactor all cases dealing with `exp{,2,10}()` into one function in
preparation for D49273.  Otherwise, NFC.

llvm-svn: 340061
2018-08-17 17:59:53 +00:00
Teresa Johnson cb9a82fc7b [ThinLTO] Add option for printing import failure reasons
Summary:
Adds the option for the printing of summary information about functions
considered but rejected for importing during the thin link.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D50881

llvm-svn: 340047
2018-08-17 16:53:47 +00:00
Florian Hahn 19f9e32f07 [InstrSimplify,NewGVN] Add option to ignore additional instr info when simplifying.
NewGVN uses InstructionSimplify for simplifications of leaders of
congruence classes. It is not guaranteed that the metadata or other
flags/keywords (like nsw or exact) of the leader is available for all members
in a congruence class, so we cannot use it for simplification.

This patch adds a InstrInfoQuery struct with a boolean field
UseInstrInfo (which defaults to true to keep the current behavior as
default) and a set of helper methods to get metadata/keywords for a
given instruction, if UseInstrInfo is true. The whole thing might need a
better name, to avoid confusion with TargetInstrInfo but I am not sure
what a better name would be.

The current patch threads through InstrInfoQuery to the required
places, which is messier then it would need to be, if
InstructionSimplify and ValueTracking would share the same Query struct.

The reason I added it as a separate struct is that it can be shared
between InstructionSimplify and ValueTracking's query objects. Also,
some places do not need a full query object, just the InstrInfoQuery.

It also updates some interfaces that do not take a Query object, but a
set of optional parameters to take an additional boolean UseInstrInfo.

See https://bugs.llvm.org/show_bug.cgi?id=37540.

Reviewers: dberlin, davide, efriedma, sebpop, hiraditya

Reviewed By: hiraditya

Differential Revision: https://reviews.llvm.org/D47143

llvm-svn: 340031
2018-08-17 14:39:04 +00:00
Anna Thomas 1962621a7e [LICM] Add a diagnostic analysis for identifying alias information
Summary:
Currently, in LICM, we use the alias set tracker to identify if the
instruction (we're interested in hoisting) aliases with instruction that
modifies that memory location.

This patch adds an LICM alias analysis diagnostic tool that checks the
mod ref info of the instruction we are interested in hoisting/sinking,
with every instruction in the loop.  Because of O(N^2) complexity this
is now only a diagnostic tool to show the limitation we have with the
alias set tracker and is OFF by default.

Test cases show the difference with the diagnostic analysis tool, where
we're able to hoist out loads and readonly + argmemonly calls from the
loop, where the alias set tracker analysis is not able to hoist these
instructions out.

Reviewers: reames, mkazantsev, fedor.sergeev, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50854

llvm-svn: 340026
2018-08-17 13:44:00 +00:00
Andrea Di Biagio f874607f32 [InstCombine] Remove unused method FAddCombine::createFDiv(). NFC
This commit fixes a (gcc 7.3.0) [-Wunused-function] warning caused by the
presence of unused method FaddCombine::createFDiv().
The last use of that method was removed at r339519.

llvm-svn: 340014
2018-08-17 11:33:48 +00:00
Chen Zheng e2d47dd1bb [MISC]Fix wrong usage of std::equal()
Differential Revision: https://reviews.llvm.org/D49958

llvm-svn: 340000
2018-08-17 07:51:01 +00:00
Sanjay Patel 8ba631d9c8 [InstCombine] add reflection fold for tan(-x)
This is a follow-up suggested with rL339604.
For tan(), we don't have a corresponding LLVM 
intrinsic -- unlike sin/cos -- so this is the 
only way/place that we can do this fold currently.

llvm-svn: 339958
2018-08-16 22:46:20 +00:00
Vedant Kumar ee6c233ae0 [InstrProf] Use atomic profile counter updates for TSan
Thread sanitizer instrumentation fails to skip all loads and stores to
profile counters. This can happen if profile counter updates are merged:

  %.sink = phi i64* ...
  %pgocount5 = load i64, i64* %.sink
  %27 = add i64 %pgocount5, 1
  %28 = bitcast i64* %.sink to i8*
  call void @__tsan_write8(i8* %28)
  store i64 %27, i64* %.sink

To suppress TSan diagnostics about racy counter updates, make the
counter updates atomic when TSan is enabled. If there's general interest
in this mode it can be surfaced as a clang/swift driver option.

Testing: check-{llvm,clang,profile}

rdar://40477803

Differential Revision: https://reviews.llvm.org/D50867

llvm-svn: 339955
2018-08-16 22:24:47 +00:00
Alina Sbirlea 2ab544bcf5 Update MemorySSA in Local utils removing blocks.
Summary: Extend Local utils to update MemorySSA.

Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D48790

llvm-svn: 339951
2018-08-16 21:58:44 +00:00
Michael Berg ed89d069f4 add a missed case for binary op FMF propagation under select folds
llvm-svn: 339938
2018-08-16 20:59:45 +00:00
Philip Reames 0e2f9b9e30 [LICM][NFC] Restructure pointer invalidation API in terms of MemoryLocation
Main value is just simplifying code.  I'll further simply the argument handling case in a bit, but that involved a slightly orthogonal change so I went with the mildy ugly intermediate for this patch.

Note that the isSized check in the old LICM code was not carried across.  It turns out that check was dead.  a) no test exercised it, and b) langref and verifier had been updated to disallow unsized types used in loads.

llvm-svn: 339930
2018-08-16 20:11:15 +00:00
Evandro Menezes c05c7e11bb [InstCombine] Expand the simplification of pow(x, 0.5) to sqrt(x)
Expand the number of cases when `pow(x, 0.5)` is simplified into `sqrt(x)`
by considering the math semantics with more granularity.

Differential revision: https://reviews.llvm.org/D50036

llvm-svn: 339887
2018-08-16 15:58:08 +00:00
Sanjay Patel 039f556f44 [InstCombine] move vector compare before same-shuffled ops
This is a step towards fixing PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 339875
2018-08-16 12:52:17 +00:00
Max Kazantsev 72d7d649e3 [NFC] Remove const modifier to allow further development in LICM
llvm-svn: 339846
2018-08-16 08:30:15 +00:00
Craig Topper 9c1d9fdeaa [X86] Remove masking from the 512-bit padds and psubs intrinsics. Use select in IR instead.
llvm-svn: 339842
2018-08-16 06:20:24 +00:00
Matt Arsenault 9a389fbd79 AMDGPU: Stop producing icmp/fcmp intrinsics with invalid types
llvm-svn: 339815
2018-08-15 21:14:25 +00:00
Amara Emerson 070ac768ff [InstCombine] Fix IC trying to create a xor of pointer types.
rdar://42473741

Differential Revision: https://reviews.llvm.org/D50775

llvm-svn: 339796
2018-08-15 17:46:22 +00:00
Marcello Maggioni e98aaf1d91 [GVN] Fix typo in IsValueFullyAvailableInBlock. NFC.
DenseMap insert() method return a pair<iterator, bool>
not pair<iterator, char>
Noticed it and thought I might just fix it ...

llvm-svn: 339777
2018-08-15 15:06:53 +00:00
Chijun Sima e8263f33d9 [SimplifyCFG] Remove pointer from SmallPtrSet before deletion
Summary:
Previously, `eraseFromParent()` calls `delete` which invalidates the value of the pointer. Copying the value of the pointer later is undefined behavior in C++11 and implementation-defined (which may cause a segfault on implementations having strict pointer safety) in C++14.

This patch removes the BasicBlock pointer from related SmallPtrSet before `delete` invalidates it in the SimplifyCFG pass.

Reviewers: kuhar, dmgreen, davide, trentxintong

Reviewed By: kuhar, dmgreen

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50717

llvm-svn: 339773
2018-08-15 13:56:21 +00:00
David Green 6cb6478739 [UnJ] Rename hasInvariantIterationCount to hasIterationCountInvariantInParent NFC
This hopefully describes the API of the function more precisely.

llvm-svn: 339762
2018-08-15 10:59:41 +00:00
Max Kazantsev 530b8d1c3d [NFC] Refactoring of LoopSafetyInfo, step 1
Turn structure into class, encapsulate methods, add clarifying comments.

Differential Revision: https://reviews.llvm.org/D50693
Reviewed By: reames

llvm-svn: 339752
2018-08-15 05:55:43 +00:00
Max Kazantsev df58dd8418 [NFC] Add sanitizing assertion to ICF tracker
llvm-svn: 339751
2018-08-15 05:50:38 +00:00
Max Kazantsev 68290f838a [NFC][LICM] Make hoist method void
Method hoist always returns true. This patch makes it void.

Differential Revision: https://reviews.llvm.org/D50696
Reviewed By: hiraditya

llvm-svn: 339750
2018-08-15 02:49:12 +00:00
Evgeniy Stepanov a265a13bbe [hwasan] Add a basic API.
Summary:
Add user tag manipulation functions:
  __hwasan_tag_memory
  __hwasan_tag_pointer
  __hwasan_print_shadow (very simple and ugly, for now)

Reviewers: vitalybuka, kcc

Subscribers: kubamracek, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50746

llvm-svn: 339746
2018-08-15 00:39:35 +00:00
Matt Morehouse 0f22fac274 [SanitizerCoverage] Add associated metadata to PC guards.
Summary:
Without this metadata LLD strips unused PC table entries
but won't strip unused guards.  This metadata also seems
to influence the linker to change the ordering in the PC
guard section to match that of the PC table section.

The libFuzzer runtime library depends on the ordering
of the PC table and PC guard sections being the same.  This
is not generally guaranteed, so we may need to redesign
PC tables/guards/counters in the future.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: kcc, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50483

llvm-svn: 339733
2018-08-14 22:04:34 +00:00
Anna Thomas 6a1dd77f5d NFC: Clarify comment in loop vectorization legality
Clarifying the comment about PSCEV and external IV users by referencing
the bug in question.

llvm-svn: 339722
2018-08-14 20:25:13 +00:00
Anna Thomas 60a1e4dddc [LV] Teach about non header phis that have uses outside the loop
Summary:
This patch teaches the loop vectorizer to vectorize loops with non
header phis that have have outside uses.  This is because the iteration
dependence distance for these phis can be widened upto VF (similar to
how we do for induction/reduction) if they do not have a cyclic
dependence with header phis. When identifying reduction/induction/first
order recurrence header phis, we already identify if there are any cyclic
dependencies that prevents vectorization.

The vectorizer is taught to extract the last element from the vectorized
phi and update the scalar loop exit block phi to contain this extracted
element from the vector loop.

This patch can be extended to vectorize loops where instructions other
than phis have outside uses.

Reviewers: Ayal, mkuper, mssimpso, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50579

llvm-svn: 339703
2018-08-14 18:22:19 +00:00
Fedor Sergeev b55705f6e9 [Inliner] add inliner stats to new pm version of inliner
Increment existing NumInlined and NumDeleted stats in InlinerPass::run.

llvm-svn: 339682
2018-08-14 15:19:14 +00:00
Tomasz Krupa e766e5f636 [X86] Constant folding of adds/subs intrinsics
Summary: This adds constant folding of signed add/sub with saturation intrinsics.

Reviewers: craig.topper, spatel, RKSimon, chandlerc, efriedma

Reviewed By: craig.topper

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50499

llvm-svn: 339659
2018-08-14 09:04:01 +00:00
Teresa Johnson b0a1d3bdf1 [ThinLTO] Fix printing of WPD remarks
Summary:
When WPD is performed in a ThinLTO backend, the function may be created
if it isn't already in that module. Module::getOrInsertFunction may
add a bitcast, in which case the returned Constant is not a Function and
doesn't have a name. Invoke stripPointerCasts() on the returned value
where we access its name.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49959

llvm-svn: 339640
2018-08-14 03:00:16 +00:00
Roman Lebedev 3534874fbf [InstCombine] Re-land: Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

The transform itself ended up being rather horrible, even though i omitted some cases.
Surely there is some infrastructure that can help clean this up that i missed?

https://rise4fun.com/Alive/3Ou

The initial commit (rL339610)
was reverted, since the first assert was being triggered.
The @positive_with_extra_and test now has coverage for that case.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339621
2018-08-13 21:54:37 +00:00
Sanjay Patel 15bff18c6f [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds (retry r339608)
Even though this code is below a function called optimizeFloatingPointLibCall(),
we apparently can't guarantee that we're dealing with FPMathOperators, so bail
out immediately if that's not true.

llvm-svn: 339618
2018-08-13 21:49:19 +00:00
Roman Lebedev 28a42c7706 Revert "[InstCombine] Optimize redundant 'signed truncation check pattern'."
At least one buildbot was able to actually trigger that assert
on the top of the function. Will investigate.

This reverts commit r339610.

llvm-svn: 339612
2018-08-13 20:46:22 +00:00
Roman Lebedev 4c4750771f [InstCombine] Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

https://rise4fun.com/Alive/3Ou

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339610
2018-08-13 20:33:08 +00:00
Sanjay Patel 66c6fe6534 revert r339608 - [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
Can't set the builder flags without knowing this is an FPMathOperator. I'll add a test
for that and try again.

llvm-svn: 339609
2018-08-13 20:20:38 +00:00
Sanjay Patel 981f50919e [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
llvm-svn: 339608
2018-08-13 20:14:27 +00:00
Sanjay Patel e45a83d447 [SimplifyLibCalls] add reflection fold for -sin(-x) (PR38458)
This is a very partial fix for the reported problem. I suspect
we do not get this fold in most motivating cases because most of
the time, the libcall would have been replaced by an intrinsic,
and that optimization is handled elsewhere...but maybe it should
be handled here?

llvm-svn: 339604
2018-08-13 19:24:41 +00:00
Sanjay Patel ce4ddbe960 [SimplifyLibCalls] reduce code for optimizeCos; NFCI
llvm-svn: 339588
2018-08-13 17:40:49 +00:00
Simon Pilgrim 82edf8d329 [InstCombine] Limit simplifyAllocaArraySize constant folding to values that fit into a uint64_t
Fixes OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223

llvm-svn: 339584
2018-08-13 16:50:20 +00:00
Evandro Menezes 5ecd6c1a46 [SLC] Expand simplification of pow() for vector types
Also consider vector constants when simplifying `pow()`.

Differential revision: https://reviews.llvm.org/D50035

llvm-svn: 339578
2018-08-13 16:12:37 +00:00
Max Kazantsev 5c490b49c3 [GuardWidening] Widen very likely non-taken br instructions
This is a second part of D49974 that handles widening of conditional branches that
have very likely `false` branch.

Differential Revision: https://reviews.llvm.org/D50040
Reviewed By: reames

llvm-svn: 339537
2018-08-13 07:58:19 +00:00
Craig Topper 8caccc32b5 [InstCombine] Fix typo in comment. NFC
llvm-svn: 339532
2018-08-13 00:54:23 +00:00
Craig Topper 8bb49218bc [InstCombine] Replace call to haveNoCommonBitsSet in visitXor with just the special case that doesn't use computeKnownBits.
Summary: computeKnownBits is expensive. The cases that would be detected by the computeKnownBits portion of haveNoCommonBitsSet were already handled by the earlier call to SimplifyDemandedInstructionBits.

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50604

llvm-svn: 339531
2018-08-13 00:38:27 +00:00
David Bolvansky 01d98cc03f [InstCombine] Fold Select with binary op - non-commutative opcodes
Summary:
Basic version was merged - https://reviews.llvm.org/D49954

This adds support for FP & non-commutative opcodes

Precommited tests: https://reviews.llvm.org/rL338727

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: jfb

Differential Revision: https://reviews.llvm.org/D50190

llvm-svn: 339520
2018-08-12 17:30:07 +00:00
Sanjay Patel dc185ee275 [InstCombine] fix/enhance fadd/fsub factorization
(X * Z) + (Y * Z) --> (X + Y) * Z
  (X * Z) - (Y * Z) --> (X - Y) * Z
  (X / Z) + (Y / Z) --> (X + Y) / Z
  (X / Z) - (Y / Z) --> (X - Y) / Z

The existing code that implemented these folds failed to 
optimize vectors, and it transformed code with multiple 
uses when it should not have.

llvm-svn: 339519
2018-08-12 15:48:26 +00:00
David Green f7111d1ece [UnJ] Improve explicit loop count checks
Try to improve the computed counts when it has been explicitly set by a pragma
or command line option. This moves the code around, so that first call to
computeUnrollCount to get a sensible count and override that if explicit unroll
and jam counts are specified.

Also added some extra debug messages for when unroll and jamming is disabled.

Differential Revision: https://reviews.llvm.org/D50075

llvm-svn: 339501
2018-08-11 07:37:31 +00:00
David Green 395b80cd3c [UnJ] Create a hasInvariantIterationCount function. NFC
Pulled out a separate function for some code that calculates
if an inner loop iteration count is invariant to it's outer
loop.

Differential Revision: https://reviews.llvm.org/D50063

llvm-svn: 339500
2018-08-11 06:57:28 +00:00
JF Bastien fe258d9776 Re-commit "[NFC] More ConstantMerge refactoring"
My previous change moved some code upwards which caused an assert in debug mode
because the global value didn't necessarily have an initializer. Don't do that.

llvm-svn: 339485
2018-08-10 22:41:09 +00:00
Philip Reames 85afd1a9a0 [LICM] Hoist assumes out of loops
If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes.

Differential Revision: https://reviews.llvm.org/D50364

llvm-svn: 339481
2018-08-10 22:21:56 +00:00
JF Bastien b99f131ffd Revert "[NFC] More ConstantMerge refactoring"
Sanitizers seem unhappy.

llvm-svn: 339480
2018-08-10 22:10:20 +00:00
JF Bastien 62fb8ea4e0 [NFC] More ConstantMerge refactoring
This makes my upcoming patch much easier to read.

llvm-svn: 339478
2018-08-10 21:58:00 +00:00
Sanjay Patel 85e17bb195 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This is a retry of rL339439 with a fix for the problem that
caused the original commit to be reverted at rL339446. 

That problem was that the compare can be integer while
the binop is FP or vice-versa, so we need to use the binop 
type when we ask for the identity constant.

A test to guard against the problem was added at rL339453.

llvm-svn: 339469
2018-08-10 20:30:35 +00:00
Matt Arsenault d35f46caf1 AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0
The library does use this for some reason.

llvm-svn: 339461
2018-08-10 18:58:49 +00:00
Evgeniy Stepanov 453e7ac785 [hwasan] Add -hwasan-with-ifunc flag.
Summary: Similar to asan's flag, it can be used to disable the use of ifunc to access hwasan shadow address.

Reviewers: vitalybuka, kcc

Subscribers: srhines, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50544

llvm-svn: 339447
2018-08-10 16:21:37 +00:00
Sanjay Patel c9cc86a5b3 [InstCombine] revert r339439 - rearrange code for foldSelectBinOpIdentity
That was supposed to be NFC, but it exposed a logic hole somewhere that
caused bots to fail.

llvm-svn: 339446
2018-08-10 16:12:19 +00:00
Sanjay Patel 3b92a17526 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This should make it easier to folow and to add the planned enhancements
such as D50190.

llvm-svn: 339439
2018-08-10 15:11:26 +00:00
Alexander Potapenko 75a954330b [MSan] Shrink the register save area for non-SSE builds
If code is compiled for X86 without SSE support, the register save area
doesn't contain FPU registers, so `AMD64FpEndOffset` should be equal to
`AMD64GpEndOffset`.

llvm-svn: 339414
2018-08-10 08:06:43 +00:00
David Bolvansky 909889b2cb [InstCombine] Transform str(n)cmp to memcmp
Summary:
Motivation examples:
int strcmp_memcmp() {
    char buf[12];
    return strcmp(buf, "key") == 0;
}

int strcmp_memcmp2() {
    char buf[12];
    return strcmp(buf, "key") != 0;
}

int strncmp_memcmp() {
    char buf[12];
    return strncmp(buf, "key", 3) == 0;
}

can be turned to memcmp.

See test file for more cases.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D50233

llvm-svn: 339410
2018-08-10 04:32:54 +00:00
Matt Arsenault d54b7f0592 ValueTracking: Start enhancing isKnownNeverNaN
llvm-svn: 339399
2018-08-09 22:40:08 +00:00
Sanjay Patel c6944f795d [InstSimplify] move minnum/maxnum with Inf folds from instcombine
llvm-svn: 339396
2018-08-09 22:20:44 +00:00
JF Bastien 42ca9ccb70 [NFC] ConstantMerge: factor out some functions
This makes the code easier to read and will make an upcoming patch I have easier to review because that patch needed this refactoring to reuse some of the functions.

llvm-svn: 339391
2018-08-09 21:56:09 +00:00
JF Bastien ebcaa31768 ConstantMerge: update MadeChange when change is made
It was always false, which is obviously wrong.

llvm-svn: 339390
2018-08-09 21:36:57 +00:00
Philip Reames 7d79433136 [LICM] Suppress a compiler warning noticed by one of the bots
llvm-svn: 339388
2018-08-09 21:15:33 +00:00
Philip Reames ca256d93fb [LICM] hoist fences out of loops w/o memory operations
The motivating case is an otherwise dead loop with a fence in it. At the moment, this goes all the way through the optimizer and we end up emitting an entirely pointless loop on x86. This case may seem a bit contrived, but we've seen it in real code as the result of otherwise reasonable lowering strategies combined w/thread local memory optimizations (such as escape analysis).

To handle this simple case, we can teach LICM to hoist must execute fences when there is no other memory operation within the loop.

Differential Revision: https://reviews.llvm.org/D50489

llvm-svn: 339378
2018-08-09 20:18:42 +00:00
Sanjay Patel 55accd7dd3 [InstCombine] allow fsub+fmul FMF folds for vectors
llvm-svn: 339368
2018-08-09 18:42:12 +00:00
Alina Sbirlea bf9fe79397 SCEV should forget all loops containing a deleted block.
Summary:
LoopSimplifyCFG should update ScEv for all loops after a block is deleted.
If the deleted block "Succ" is part of L, then it is part of all parent loops, so forget topmost loop.

Reviewers: greened, mkazantsev, sanjoy

Subscribers: jlebar, javed.absar, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D50422

llvm-svn: 339363
2018-08-09 17:53:26 +00:00
Reid Kleckner 80c6ec11d9 [GlobalOpt] Don't apply fastcc if it would break inalloca invariants
The inalloca parameter has to be the only parameter passed in memory.
Changing the convention to fastcc can break that.

At some point we should teach global opt how to optimize ABI attributes
like inalloca and maybe byval. These attributes are mainly used to match
C ABIs. They are harder for LLVM to optimize and they don't always
generate the best code.

Fixes PR38487

llvm-svn: 339360
2018-08-09 17:29:26 +00:00
Sanjay Patel ebec4204da [InstCombine] reduce code duplication; NFC
llvm-svn: 339349
2018-08-09 15:07:13 +00:00
JF Bastien 3f270336e1 [NFC] ConstantMerge: don't insert when find should be used
Summary: DenseMap's operator[] performs an insertion if the entry isn't found. The second phase of ConstantMerge isn't trying to insert anything: it's just looking to see if the first phased performed an insertion. Use find instead, avoiding insertion of every single global initializer in the map of constants. This has the side-effect of making all entries in CMap non-null (because only global declarations would have null initializers, and that would be a bug).

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D50476

llvm-svn: 339309
2018-08-09 04:17:48 +00:00
Philip Reames 22b20a09a0 [LICM] Add an assert to ensure all instruction types needing aliasing are handled [NFC]
llvm-svn: 339308
2018-08-09 03:44:28 +00:00
Sanjay Patel fe839695a8 [InstCombine] fold fadd+fsub with common operand
This is a sibling to the simplify from:
https://reviews.llvm.org/rL339174

llvm-svn: 339267
2018-08-08 16:19:22 +00:00
Sanjay Patel 2054dd79c2 [InstCombine] fold fsub+fsub with common operand
This is a sibling to the simplify from:
rL339171

llvm-svn: 339266
2018-08-08 16:04:48 +00:00
Sanjay Patel a194b2d2ff [InstCombine] fold fneg into constant operand of fmul/fdiv
This accounts for the missing IR fold noted in D50195. We don't need any fast-math to enable the negation transform. 
FP negation can always be folded into an fmul/fdiv constant to eliminate the fneg.

I've limited this to one-use to ensure that we are eliminating an instruction rather than replacing fneg by a 
potentially expensive fdiv or fmul.

Differential Revision: https://reviews.llvm.org/D50417

llvm-svn: 339248
2018-08-08 14:29:08 +00:00
Roman Lebedev a677651a5a [InstCombine] De Morgan: sink 'not' into 'xor' (PR38446)
Summary:
https://rise4fun.com/Alive/IT3

Comes up in the [most ugliest]  `signed int` -> `signed char`  case of
`-fsanitize=implicit-conversion` (https://reviews.llvm.org/D50250)
Previously, we were stuck with `not`: {F6867736}
But now we are able to completely get rid of it: {F6867737}
(FIXME: why are we loosing the metadata? that seems wrong/strange.)

Here, we only want to do that it we will be able to completely
get rid of that 'not'.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: vsk, erichkeane, llvm-commits

Differential Revision: https://reviews.llvm.org/D50301

llvm-svn: 339243
2018-08-08 13:31:19 +00:00
Anastasis Grammenos 52d5283483 [Local] Add dbg location on unreachable inst in changeToUnreachable
As show in https://bugs.llvm.org/show_bug.cgi?id=37960
it would be desirable to have debug location in the unreachable
instruction.

Also adds a unti test for this function.

Differential Revision: https://reviews.llvm.org/D50340

llvm-svn: 339173
2018-08-07 20:21:56 +00:00
Alexey Bataev 0edcd0278d [SLP] Fix insert point for reused extract instructions.
Summary:
Reworked the previously committed patch to insert shuffles for reused
extract element instructions in the correct position. Previous logic was
incorrect, and might lead to the crash with PHIs and EH instructions.

Reviewers: efriedma, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50143

llvm-svn: 339166
2018-08-07 19:21:05 +00:00
Florian Hahn 950576bdf8 [GVN,NewGVN] Keep nonnull if K does not move.
In combineMetadata, we should be able to preserve K's nonnull metadata,
if K does not move. This condition should hold for all replacements by
NewGVN/GVN, but I added a bunch of assertions to verify that.

Fixes PR35038.

There probably are additional kinds of metadata that could be preserved
using similar reasoning. This is follow-up work.

Reviewers: dberlin, davide, efriedma, nlopes

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D47339

llvm-svn: 339149
2018-08-07 15:36:11 +00:00
Sanjay Patel 948ff87d7d [InstSimplify] move minnum/maxnum with common op fold from instcombine
llvm-svn: 339144
2018-08-07 14:36:27 +00:00
Florian Hahn 39bbe179aa [GVN,NewGVN] Move patchReplacementInstruction to Utils/Local.h
This function is shared between both implementations. I am not sure if
Utils/Local.h is the best place though.

Reviewers: davide, dberlin, efriedma, xbolva00

Reviewed By: efriedma, xbolva00

Differential Revision: https://reviews.llvm.org/D47337

llvm-svn: 339138
2018-08-07 13:27:33 +00:00